Right now your agent is a blank slate. It forgets everything after each message. Let's fix that — give it a system prompt, conversation memory, and the ability to hold a real conversation.
In this part you'll learn how to:

- Shape agent behavior and personality with a system prompt
- Persist context across multiple messages
- Build natural back-and-forth conversations
Make sure you've completed the main workshop and have a working agent system running in Docker. You should be able to ask your agent about the weather and get a response.
```bash
# Make sure everything is running
docker compose ps

# You should see mcp-server, travel-agent, and agent-web as "healthy"
# If not, run:
docker compose up -d
```
Open services/agent/app.py and look at how the agent handles queries.
Right now, every request starts fresh — the agent has no memory of previous messages.
Try this experiment in the web UI: ask "What is the weather in Oslo?", then follow up with "What about Bergen?" The agent won't understand the follow-up, because it never saw the first message.

Your goal: make the second message work naturally, because the agent remembers the conversation.
A system prompt tells the AI how to behave. It's like giving your agent a personality and a job description. Find where the agent sends messages to OpenAI and add a system message at the beginning of the conversation.
What a system prompt looks like:
```python
SYSTEM_PROMPT = """You are a helpful travel assistant called Ingrid.
You help people plan trips by checking weather, finding news about
destinations, and sharing interesting facts.

You speak in a friendly, conversational tone. When someone asks about
a city, proactively offer to check the weather there.

Always respond in the same language the user writes in."""

# Then in your messages list sent to OpenAI, add this as the first message:
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    # ... user messages go here
]
```
Try it out: Change the personality completely! Make the agent a pirate, a poet, or a sarcastic weather reporter. The system prompt is where all the fun happens.
Tip: look for where the agent calls `client.chat.completions.create()`; that's where you add the system message to the messages list.
OpenAI's API is stateless — it doesn't remember previous messages unless you send them. To have a conversation, you need to store the message history and include it in every request.
The simplest approach — an in-memory list:
```python
from typing import Dict

# Store conversations by session ID
conversations: Dict[str, list] = {}

def get_conversation(session_id: str) -> list:
    """Get or create a conversation history."""
    if session_id not in conversations:
        conversations[session_id] = [
            {"role": "system", "content": SYSTEM_PROMPT}
        ]
    return conversations[session_id]

# When handling a query:
def handle_query(query: str, session_id: str = "default"):
    history = get_conversation(session_id)

    # Add the user's new message
    history.append({"role": "user", "content": query})

    # Send the FULL history to OpenAI
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=history,  # <-- all previous messages included!
        tools=tool_definitions,
    )

    # Add the assistant's response to history
    assistant_message = response.choices[0].message
    history.append({
        "role": "assistant",
        "content": assistant_message.content,
    })

    return assistant_message.content
```
How it works: Each time the user sends a message, you append it to the history, send the entire history to OpenAI, and then append the response. OpenAI sees the full conversation and can reference earlier messages naturally.
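The append-send-append loop is easy to see with a stubbed model call. In this sketch, `fake_completion` is a stand-in for `client.chat.completions.create()` and simply reports how many messages it received, so you can watch the history grow turn by turn:

```python
def fake_completion(messages):
    """Stand-in for the OpenAI call; just counts what it was sent."""
    return f"(model saw {len(messages)} messages)"

history = [{"role": "system", "content": "You are a helpful travel assistant."}]

def send(user_text):
    history.append({"role": "user", "content": user_text})
    reply = fake_completion(history)  # the FULL history goes out every turn
    history.append({"role": "assistant", "content": reply})
    return reply

print(send("What is the weather in Oslo?"))  # (model saw 2 messages)
print(send("What about Bergen?"))            # (model saw 4 messages)
```

Because the second call sends four messages, not one, the model can resolve "What about Bergen?" against the earlier Oslo question.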
Right now the `/query` endpoint doesn't know which conversation a message belongs to. You need to accept a session ID so different users (or browser tabs) get separate conversations.
Update the query endpoint:
```python
class QueryRequest(BaseModel):
    query: str
    session_id: str = "default"  # Add this field

@app.post("/query")
async def handle_query(request: QueryRequest):
    # Pass session_id to your conversation handler
    result = await process_query(request.query, request.session_id)
    return {"response": result}
```
The web frontend can generate a random session ID when the page loads and include it in every request. This way, each browser tab gets its own conversation thread.
OpenAI models have a maximum context window (how many tokens they can process at once). If you keep appending messages forever, you'll eventually hit the limit and get an error. A simple fix: only keep the last N messages.
```python
MAX_HISTORY = 20  # Keep last 20 messages (10 exchanges)

def get_trimmed_history(session_id: str) -> list:
    """Get conversation history, trimmed to fit context window."""
    history = get_conversation(session_id)
    # Always keep the system prompt (first message)
    if len(history) > MAX_HISTORY + 1:  # System prompt + last N messages
        return [history[0]] + history[-MAX_HISTORY:]
    return history
```
Why +1? The system prompt always stays; it's index 0. You trim the oldest user/assistant messages, keeping the most recent ones so the conversation still makes sense.
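You can convince yourself the trimming keeps the right messages with a toy history and a small limit so the effect is visible. The `trim` helper below mirrors the trimming logic without the session lookup:

```python
MAX_HISTORY = 4  # small value so trimming kicks in quickly

def trim(history: list) -> list:
    """Keep the system prompt plus the last MAX_HISTORY messages."""
    if len(history) > MAX_HISTORY + 1:
        return [history[0]] + history[-MAX_HISTORY:]
    return history

history = [{"role": "system", "content": "system prompt"}]
for i in range(1, 7):  # six user/assistant messages
    role = "user" if i % 2 else "assistant"
    history.append({"role": role, "content": f"message {i}"})

trimmed = trim(history)
print(len(trimmed))           # 5: the system prompt plus the last 4 messages
print(trimmed[0]["content"])  # "system prompt" survives at index 0
print(trimmed[1]["content"])  # "message 3": messages 1 and 2 were dropped
```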
Rebuild and test that your agent now holds a real conversation:
```bash
# Rebuild the agent
docker compose build travel-agent && docker compose up -d

# Test multi-turn conversation
curl -X POST "http://localhost:8001/query" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the weather in Oslo?", "session_id": "test-1"}'

# Now ask a follow-up — it should understand context!
curl -X POST "http://localhost:8001/query" \
  -H "Content-Type: application/json" \
  -d '{"query": "What about Bergen?", "session_id": "test-1"}'

# Different session = fresh conversation
curl -X POST "http://localhost:8001/query" \
  -H "Content-Type: application/json" \
  -d '{"query": "What about Bergen?", "session_id": "test-2"}'
```
In-memory conversations disappear when the container restarts. For a production agent, you'd store them in a database. The workshop repo already has a `conversation_memory.py` file in the agent service: take a look at how it uses SQLite and try integrating it!
Hints:
- `services/agent/conversation_memory.py` for the existing implementation
- The `/data` volume (already mounted in `docker-compose.yml`)
- `http://localhost:8090` using Datasette to browse the database
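If you want a feel for the SQLite approach before reading the repo's version, here is a minimal sketch. The actual `conversation_memory.py` may be organized differently; `SqliteMemory` is an illustrative name, and the core idea is simply to persist each message as a row keyed by its session ID:

```python
import sqlite3

class SqliteMemory:
    """Minimal SQLite-backed conversation store (illustrative sketch)."""

    def __init__(self, path: str = ":memory:"):
        # In the real service you'd point this at a file under /data
        self.db = sqlite3.connect(path)
        self.db.execute(
            """CREATE TABLE IF NOT EXISTS messages (
                   id INTEGER PRIMARY KEY AUTOINCREMENT,
                   session_id TEXT NOT NULL,
                   role TEXT NOT NULL,
                   content TEXT NOT NULL
               )"""
        )

    def append(self, session_id: str, role: str, content: str) -> None:
        self.db.execute(
            "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
            (session_id, role, content),
        )
        self.db.commit()

    def history(self, session_id: str) -> list:
        """Load one session's messages back in insertion order."""
        rows = self.db.execute(
            "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
            (session_id,),
        ).fetchall()
        return [{"role": r, "content": c} for r, c in rows]

memory = SqliteMemory()
memory.append("test-1", "user", "What is the weather in Oslo?")
memory.append("test-1", "assistant", "Sunny and mild today.")
memory.append("test-2", "user", "What about Bergen?")
```

Because every message lands in the database the moment it's appended, a container restart loses nothing: `history(session_id)` rebuilds the exact messages list you'd otherwise have kept in the in-memory dict.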