Zep + LangChain: Give Your AI Assistant Persistent Memory
Your chatbot has amnesia.
Every time a user starts a new session, your AI assistant forgets everything—their name, preferences, previous conversations, all of it. Gone. You're essentially building a goldfish with a PhD.
This isn't a minor inconvenience. It's why your customer support bot asks "How can I help you today?" to someone who's been troubleshooting the same issue for three days.
Zep fixes this. It's a memory layer that gives your LLM applications persistent, long-term memory. And integrating it with LangChain takes about 15 minutes.
What Zep Actually Does
Zep sits between your application and your LLM. When a conversation happens, Zep:
- Stores the full conversation history
- Automatically summarizes long conversations
- Extracts facts and entities from chats
- Retrieves relevant context when you need it
The magic is in that last part. When a user returns three weeks later and mentions "that project we discussed," Zep knows exactly what they're talking about.
Prerequisites
- Python 3.9+
- A running Zep instance (we'll use Elestio for deployment)
- Basic familiarity with LangChain
Setting Up Zep
First, deploy Zep on Elestio. Select Zep from the marketplace, pick your provider (2 CPU / 4GB RAM minimum), and click deploy. You'll have a running instance in about 3 minutes.
Once deployed, grab your Zep URL from the Elestio dashboard. It'll look like https://zep-xxxxx.vm.elestio.app.
Installing Dependencies
pip install zep-python langchain langchain-openai
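The examples below also need your OpenAI API key in the environment (langchain-openai reads OPENAI_API_KEY automatically). The Zep variable name here is just a convenience placeholder, not something either library reads on its own:

```shell
# OPENAI_API_KEY is picked up automatically by langchain-openai.
# ZEP_API_URL is a placeholder for your own Elestio instance URL.
export OPENAI_API_KEY="sk-..."
export ZEP_API_URL="https://your-zep-instance.vm.elestio.app"
```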
The Basic Integration
Here's a minimal example that gives your chatbot memory:
from zep_python import ZepClient
from zep_python.memory import Memory, Message
from langchain_openai import ChatOpenAI
from langchain.schema import HumanMessage, AIMessage, SystemMessage

# Initialize clients
zep = ZepClient(base_url="https://your-zep-instance.vm.elestio.app")
llm = ChatOpenAI(model="gpt-4")

SESSION_ID = "user-123-session"

async def chat_with_memory(user_input: str) -> str:
    # Get existing memory for this session
    try:
        memory = await zep.memory.aget(SESSION_ID)
        context = memory.summary.content if memory.summary else ""
        recent_messages = memory.messages[-10:]  # Last 10 messages
    except Exception:
        # Brand-new session: no memory stored yet
        context = ""
        recent_messages = []

    # Build message history
    messages = [
        SystemMessage(content=f"Context from previous conversations: {context}"),
    ]
    for msg in recent_messages:
        if msg.role == "human":
            messages.append(HumanMessage(content=msg.content))
        else:
            messages.append(AIMessage(content=msg.content))
    messages.append(HumanMessage(content=user_input))

    # Get response from LLM
    response = await llm.ainvoke(messages)

    # Store the new exchange in Zep
    await zep.memory.aadd(
        SESSION_ID,
        Memory(messages=[
            Message(role="human", content=user_input),
            Message(role="ai", content=response.content),
        ]),
    )

    return response.content
That's it. Your chatbot now remembers conversations across sessions.
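The prompt-assembly step in the middle of chat_with_memory is worth understanding in isolation: the summary becomes a system message, recent messages keep their roles, and the new input goes last. This stdlib-only sketch mimics that logic with (role, content) tuples; the Msg class is a stand-in for Zep's Message, not part of its API:

```python
from dataclasses import dataclass

@dataclass
class Msg:
    role: str
    content: str

def build_history(summary: str, recent: list[Msg], user_input: str) -> list[tuple[str, str]]:
    """Assemble (role, content) pairs the way chat_with_memory builds its prompt."""
    history = [("system", f"Context from previous conversations: {summary}")]
    for m in recent:
        # Anything not explicitly from the human is treated as the assistant
        history.append(("human" if m.role == "human" else "ai", m.content))
    history.append(("human", user_input))
    return history

history = build_history(
    "User's name is Sarah.",
    [Msg("human", "Hi again"), Msg("ai", "Welcome back, Sarah!")],
    "What's my name?",
)
```

In the real function, each tuple becomes the corresponding SystemMessage, HumanMessage, or AIMessage.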
Using LangChain's Built-in Zep Integration
LangChain has native Zep support, which makes things even cleaner:
from langchain_community.memory import ZepMemory
from langchain.chains import ConversationChain
from langchain_openai import ChatOpenAI

memory = ZepMemory(
    session_id="user-456",
    url="https://your-zep-instance.vm.elestio.app",
    memory_key="chat_history",
    return_messages=True,
)

chain = ConversationChain(
    llm=ChatOpenAI(model="gpt-4"),
    memory=memory,
    verbose=True,
)

# This conversation is automatically stored and retrieved
response = chain.predict(input="My name is Sarah and I work at Acme Corp")

# Later...
response = chain.predict(input="What company do I work at?")
# Returns: "You work at Acme Corp, Sarah."
Fact Extraction: The Hidden Power
Zep doesn't just store messages—it extracts structured facts. This is incredibly useful for customer support:
# After some conversations, retrieve extracted facts
memory = await zep.memory.aget("user-456")
for fact in memory.facts:
    print(f"- {fact.fact}")

# Output:
# - User's name is Sarah
# - User works at Acme Corp
# - User is interested in enterprise pricing
# - User had an issue with API rate limits on Nov 15
You can use these facts to personalize responses, trigger workflows, or populate CRM records automatically.
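The simplest of those uses is folding the facts into a system prompt for the next turn. Here's a sketch; facts_to_prompt is a hypothetical helper, not part of Zep or LangChain:

```python
def facts_to_prompt(facts: list[str]) -> str:
    """Render extracted facts as a bulleted system-prompt section."""
    if not facts:
        return "No known facts about this user yet."
    lines = "\n".join(f"- {fact}" for fact in facts)
    return f"Known facts about this user:\n{lines}"

prompt = facts_to_prompt([
    "User's name is Sarah",
    "User works at Acme Corp",
])
```

Pass the result as the content of a SystemMessage, and the model can greet Sarah by name on turn one of a new session.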
Handling Multiple Users
In production, you'll want to create sessions per user:
def get_session_id(user_id: str, conversation_id: str | None = None) -> str:
    if conversation_id:
        return f"{user_id}-{conversation_id}"
    return f"{user_id}-default"

# Each user gets their own memory space
sarah_session = get_session_id("sarah@acme.com")
john_session = get_session_id("john@example.com")
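If you'd rather not embed raw email addresses in session IDs (they tend to end up in logs and URLs), a hashed variant works the same way. This is a sketch of one option, not something Zep requires:

```python
import hashlib

def get_hashed_session_id(user_id: str, conversation_id: str = "default") -> str:
    """Derive a stable, opaque session ID from a user identifier."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()[:16]
    return f"{digest}-{conversation_id}"

session = get_hashed_session_id("sarah@acme.com")
```

The same user always maps to the same session, but the ID no longer reveals who they are.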
Troubleshooting
Memory not persisting? Check that your Zep instance is running and accessible. Test with:
curl https://your-zep-instance.vm.elestio.app/healthz
Context window too large? Zep automatically summarizes older conversations, but you also control how many recent messages you include. Shrink the memory.messages[-10:] slice in the first example to a smaller window.
Slow retrieval? Make sure your Elestio instance has adequate resources. For production workloads with many concurrent users, consider upgrading to 4 CPU / 8GB RAM.
Why Self-Host Your AI Memory?
Here's the thing about conversation history: it's incredibly sensitive data. Every chat contains context about your users, their problems, their business details.
Self-hosting Zep means that data stays on your infrastructure. No third party has access to your users' conversations. For enterprise customers, this isn't optional—it's a requirement.
With Zep on Elestio, you get managed deployment without the data sovereignty concerns. Automatic backups, SSL, and updates—but your data stays yours.
What's Next
Once you have basic memory working, explore:
- Custom metadata: Tag sessions with user segments or conversation types
- Search: Query past conversations semantically
- Webhooks: Trigger actions when specific facts are extracted
Your AI assistant just got a lot smarter. More importantly, it now remembers that it got smarter.
Thanks for reading! If you're building AI applications that need memory, deploy Zep on Elestio and have it running in minutes.