MemoryOS gives your agents queryable memory across sessions, with temporal awareness, a knowledge graph, and sub-100ms retrieval. Self-hostable. Production ready.
100 questions across 6 categories from the ICLR 2025 benchmark, each with ~50 conversation sessions in the haystack.
A temporal knowledge graph layered on hybrid vector retrieval. Designed for the problems standard RAG pipelines cannot solve.
Facts stored with timestamps, never deleted, only superseded. Ask "where did Alice live in 2022?" and get the historically correct answer.
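A minimal sketch of what an as-of query over append-only, timestamped facts can look like (the Fact shape and as_of helper are illustrative, not MemoryOS's actual API):

```python
# Append-only facts with timestamps, queried "as of" a date.
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Fact:
    subject: str
    predicate: str
    value: str
    valid_from: datetime
    superseded_at: Optional[datetime] = None  # set when a newer fact replaces this one

def as_of(facts: list[Fact], subject: str, predicate: str, when: datetime) -> Optional[Fact]:
    """Return the fact that was true at `when`: started before it, not yet superseded."""
    candidates = [
        f for f in facts
        if f.subject == subject and f.predicate == predicate
        and f.valid_from <= when
        and (f.superseded_at is None or f.superseded_at > when)
    ]
    return max(candidates, key=lambda f: f.valid_from, default=None)

facts = [
    Fact("Alice", "lives_in", "Portland", datetime(2020, 3, 1), superseded_at=datetime(2023, 6, 1)),
    Fact("Alice", "lives_in", "Seattle", datetime(2023, 6, 1)),
]
print(as_of(facts, "Alice", "lives_in", datetime(2022, 7, 1)).value)  # -> "Portland"
```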
Four signals combined: raw similarity, enriched similarity, BM25 keywords, and graph proximity. Each tunable per tenant.
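For illustration, combining four normalized signals under per-tenant weights might look like this (the weight names and default values are assumptions, not MemoryOS's configuration keys):

```python
# Per-tenant weights over the four retrieval signals; each signal is pre-normalized to [0, 1].
DEFAULT_WEIGHTS = {"raw": 0.35, "enriched": 0.35, "bm25": 0.15, "graph": 0.15}

def hybrid_score(signals: dict[str, float], weights: dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of raw similarity, enriched similarity, BM25, and graph proximity."""
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)

print(hybrid_score({"raw": 0.62, "enriched": 0.81, "bm25": 0.40, "graph": 1.0}))
```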
Pronouns resolved before embedding. "I moved" becomes "Alice Chen moved to Seattle." Each chunk gets its own enriched vector.
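A sketch of that enrichment step, assuming generic call_llm and embed clients; the prompt wording and function names are placeholders, not the shipped pipeline:

```python
# Contextual enrichment before embedding: resolve pronouns against known entities,
# then embed the rewritten text so each chunk gets its own enriched vector.
from typing import Callable

ENRICH_PROMPT = (
    "Rewrite the message so every pronoun is replaced with the entity it refers to.\n"
    "Speaker: {speaker}\nKnown entities: {entities}\nMessage: {text}\nRewritten:"
)

def enrich_chunk(text: str, speaker: str, entities: list[str],
                 call_llm: Callable[[str], str],
                 embed: Callable[[str], list[float]]) -> tuple[str, list[float]]:
    """Return the enriched text ("I moved" -> "Alice Chen moved to Seattle") and its vector."""
    enriched = call_llm(ENRICH_PROMPT.format(
        speaker=speaker, entities=", ".join(entities), text=text))
    return enriched, embed(enriched)
```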
Stale memories fade naturally. Retrieval reinforces stability. Superseded facts decay immediately. Nothing is ever deleted.
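One standard way to model this is exponential decay with retrieval reinforcement; the exact curve MemoryOS uses may differ, so treat this as a sketch:

```python
# Stability grows with each retrieval, relevance decays with time since last access,
# and superseded facts drop to zero immediately (but are never deleted).
import math

def decay_score(days_since_access: float, stability: float, superseded: bool) -> float:
    if superseded:
        return 0.0  # superseded facts decay immediately
    return math.exp(-days_since_access / max(stability, 1e-6))

def reinforce(stability: float, boost: float = 1.5) -> float:
    """Each retrieval makes the memory more stable, i.e. slower to fade."""
    return stability * boost

print(decay_score(days_since_access=30, stability=10, superseded=False))
```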
Every retrieval returns a score breakdown. See which signal fired, what the decay score was, why each memory was retrieved.
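As an illustration, a per-memory breakdown could look roughly like this (field names are assumptions, not the actual response schema):

```python
# Illustrative shape of a score breakdown returned alongside each retrieved memory.
breakdown = {
    "memory_id": "mem_123",
    "final_score": 0.78,
    "signals": {"raw": 0.62, "enriched": 0.81, "bm25": 0.40, "graph": 1.0},
    "weights": {"raw": 0.35, "enriched": 0.35, "bm25": 0.15, "graph": 0.15},
    "decay": 0.91,
    "matched_entities": ["Alice Chen", "Seattle"],
}
```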
PostgreSQL handles relational metadata, ANN search via pgvector HNSW indexes, and graph edges, all written in one transaction. One system to operate.
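A sketch of that single-transaction write using psycopg 3; the table and column names are illustrative, not MemoryOS's actual schema:

```python
# Metadata row, pgvector embedding, and graph edges committed atomically.
# ANN search assumes an index like: CREATE INDEX ON memories USING hnsw (embedding vector_cosine_ops)
import psycopg

def store_memory(dsn: str, text: str, embedding: list[float],
                 edges: list[tuple[str, str, str]]) -> None:
    vec = "[" + ",".join(str(x) for x in embedding) + "]"  # pgvector literal
    with psycopg.connect(dsn) as conn:          # one transaction for all three writes
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO memories (content, embedding) VALUES (%s, %s::vector) RETURNING id",
                (text, vec),
            )
            memory_id = cur.fetchone()[0]
            for subject, predicate, obj in edges:
                cur.execute(
                    "INSERT INTO graph_edges (memory_id, subject, predicate, object) "
                    "VALUES (%s, %s, %s, %s)",
                    (memory_id, subject, predicate, obj),
                )
        # psycopg's connection context manager commits on success, rolls back on error
```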
MemoryOS sits between your agent and the LLM. It retrieves the right context. The LLM does the reasoning.
Raw text goes in. spaCy extracts entities. LLM enriches each chunk. Triples become graph edges.
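A rough ingestion sketch, assuming spaCy for entity extraction and placeholder call_llm / extract_triples helpers standing in for the LLM steps:

```python
# Raw text in; entities, an enriched restatement, and graph-ready triples out.
import spacy

nlp = spacy.load("en_core_web_sm")

def ingest(text: str, call_llm, extract_triples) -> dict:
    doc = nlp(text)
    entities = [(ent.text, ent.label_) for ent in doc.ents]     # e.g. ("Seattle", "GPE")
    enriched = call_llm(f"Resolve pronouns and restate with full names: {text}")
    triples = extract_triples(enriched)                          # [(subject, predicate, object), ...]
    return {"entities": entities, "enriched": enriched, "triples": triples}
```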
Query is embedded and scored. Graph traversal finds entity-linked memories. Top candidates reranked.
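A simplified view of that retrieval flow, with every search and scoring function passed in as a placeholder rather than a real MemoryOS API:

```python
# Vector and BM25 candidates plus 1-hop graph neighbors of the query's entities,
# merged and reranked by the combined score.
def retrieve(query: str, embed, vector_search, keyword_search, graph_neighbors,
             score, k: int = 8) -> list[dict]:
    qvec = embed(query)
    candidates = {m["id"]: m for m in vector_search(qvec) + keyword_search(query)}
    for memory in graph_neighbors(query):        # memories linked to entities in the query
        candidates.setdefault(memory["id"], memory)
    ranked = sorted(candidates.values(), key=score, reverse=True)
    return ranked[:k]
```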
Top-k memories injected into your LLM prompt as context. Agent now knows the right facts.
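One possible prompt assembly; the template is an assumption, not the shipped format:

```python
# Inject the top-k memories as timestamped context ahead of the user's question.
def build_prompt(question: str, memories: list[dict]) -> str:
    context = "\n".join(f"- [{m['timestamp']}] {m['content']}" for m in memories)
    return f"Relevant memories:\n{context}\n\nUser question: {question}"
```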
Each retrieval reinforces memory stability. New facts supersede old ones via append-only graph.
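An append-only supersession sketch: a new node plus a SUPERSEDES edge, with nothing mutated or deleted (the node and edge shapes are illustrative):

```python
# Historical queries still see the old node; only the edge records which fact won.
from typing import Optional

graph = {"nodes": [], "edges": []}

def add_fact(graph: dict, fact_id: str, content: str, supersedes: Optional[str] = None) -> None:
    graph["nodes"].append({"id": fact_id, "content": content})
    if supersedes:
        graph["edges"].append({"from": fact_id, "type": "SUPERSEDES", "to": supersedes})

add_fact(graph, "f1", "Alice lives in Portland")
add_fact(graph, "f2", "Alice lives in Seattle", supersedes="f1")
```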
A technical deep dive into the architecture decisions and performance journey.
Why vector databases alone fail for conversational memory, how the temporal knowledge graph works, and the optimization journey from 28-second queries to 79ms warm-path retrieval.
Read the article
One docker compose up and you're running.