Read the engineering deep dive on building MemoryOS
Open Source · MIT License

Persistent memory
for AI agents

MemoryOS gives your agents queryable memory across sessions, with temporal awareness, a knowledge graph, and sub-100ms retrieval. Self-hostable. Production ready.

79 ms — query latency (warm p50)
82% — LongMemEval accuracy
$0 — self-hosted, no per-query cost
[App preview at memoryos.oxide.fun/app — dashboard showing 142 entities, 389 relations, 1.2K memories, and 12 sessions, with Upload / Graph / Query / Metrics tabs and an entity graph (Person 12, Org 8, Place 5).]

Tested on LongMemEval-s

100 questions across 6 categories from the ICLR 2025 benchmark, each with ~50 conversation sessions in the haystack.

Single-session (User)  100%   Direct recall of user statements
Preference              96%   Persistent style & preferences
Knowledge Update        94%   Tracking facts that change
Temporal Reasoning      88%   "Where did Alice live in 2022?"
Multi-session           72%   Synthesis across sessions
Overall                 82%   Across all six categories

Built for real agent workloads

A temporal knowledge graph layered on hybrid vector retrieval. Designed for the problems standard RAG pipelines cannot solve.

Temporal knowledge graph

Facts stored with timestamps, never deleted, only superseded. Ask "where did Alice live in 2022?" and get the historically correct answer.
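The supersede-don't-delete pattern can be sketched in a few lines. This is an illustrative model, not the MemoryOS implementation: each fact carries a validity window, a new fact closes the old one's window, and a point-in-time query walks the history.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Fact:
    """A timestamped fact. Superseded facts keep their validity window."""
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None = still current

def supersede(facts: list[Fact], new: Fact) -> None:
    """Close the currently open fact for (subject, predicate), then append the new one."""
    for f in facts:
        if (f.subject, f.predicate) == (new.subject, new.predicate) and f.valid_to is None:
            f.valid_to = new.valid_from
    facts.append(new)

def as_of(facts: list[Fact], subject: str, predicate: str, when: datetime) -> Optional[str]:
    """Point-in-time lookup: which value was valid at `when`?"""
    for f in facts:
        if (f.subject, f.predicate) == (subject, predicate) \
           and f.valid_from <= when and (f.valid_to is None or when < f.valid_to):
            return f.obj
    return None

facts: list[Fact] = []
supersede(facts, Fact("Alice Chen", "lives_in", "Boston", datetime(2020, 1, 1)))
supersede(facts, Fact("Alice Chen", "lives_in", "Seattle", datetime(2023, 6, 1)))

print(as_of(facts, "Alice Chen", "lives_in", datetime(2022, 3, 1)))  # Boston
print(as_of(facts, "Alice Chen", "lives_in", datetime(2024, 1, 1)))  # Seattle
```

Because nothing is deleted, "where did Alice live in 2022?" is just an `as_of` query with a 2022 timestamp.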

Hybrid retrieval

Four signals combined: raw similarity, enriched similarity, BM25 keywords, and graph proximity. Each tunable per tenant.
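A minimal sketch of combining the four signals; the weight values and signal names here are illustrative, not the shipped defaults. Each signal is assumed normalized to [0, 1], and the per-tenant tuning is just a different weights dict.

```python
# Hypothetical per-tenant weights for the four signals (names are illustrative).
WEIGHTS = {"raw_sim": 0.35, "enriched_sim": 0.35, "bm25": 0.15, "graph": 0.15}

def hybrid_score(signals: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of normalized signal scores, each assumed in [0, 1]."""
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)

candidates = {
    "mem-1": {"raw_sim": 0.82, "enriched_sim": 0.91, "bm25": 0.40, "graph": 1.0},
    "mem-2": {"raw_sim": 0.88, "enriched_sim": 0.70, "bm25": 0.65, "graph": 0.0},
}
ranked = sorted(candidates, key=lambda m: hybrid_score(candidates[m]), reverse=True)
print(ranked)  # mem-1 wins on enriched similarity plus graph proximity
```

A memory strongly linked in the graph can outrank one with slightly better raw similarity, which is the point of blending signals rather than relying on vectors alone.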

Sliding window enrichment

Pronouns resolved before embedding. "I moved" becomes "Alice Chen moved to Seattle." Each chunk gets its own enriched vector.
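The enrichment step can be sketched like this. In MemoryOS the resolver would be an LLM call over a sliding window of preceding chunks; the `toy_resolver` below is a stand-in stub for illustration only.

```python
def enrich(chunk: str, window: list[str], resolve) -> str:
    """Rewrite `chunk` with pronouns resolved against the preceding chunks.
    `resolve` stands in for an LLM call; here it is a stub."""
    context = " ".join(window[-3:])  # sliding window: last 3 chunks
    return resolve(context, chunk)

# Stub resolver for illustration only; the real system would prompt an LLM.
def toy_resolver(context: str, chunk: str) -> str:
    if "Alice Chen" in context:
        return chunk.replace("I ", "Alice Chen ")
    return chunk

window = ["Alice Chen joined the call.", "She discussed the relocation."]
enriched = enrich("I moved to Seattle.", window, toy_resolver)
print(enriched)  # "Alice Chen moved to Seattle."
```

The enriched text, not the raw chunk, is what gets embedded, so a later query for "Alice Chen Seattle" can land on a sentence that originally said only "I moved."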

Ebbinghaus memory decay

Stale memories fade naturally. Retrieval reinforces stability. Superseded facts decay immediately. Nothing is ever deleted.
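An Ebbinghaus-style forgetting curve is retention = e^(−t/S), where t is time since last access and S is a per-memory stability. A sketch under those assumptions (the constants are illustrative, not MemoryOS's tuned values):

```python
import math
import time

def decay_score(last_access: float, stability: float, now: float,
                superseded: bool = False) -> float:
    """Ebbinghaus-style retention exp(-t / S). Superseded facts score ~0
    immediately, but the row itself is never deleted."""
    if superseded:
        return 0.0
    elapsed_days = (now - last_access) / 86400
    return math.exp(-elapsed_days / stability)

def reinforce(stability: float, boost: float = 1.5) -> float:
    """Each retrieval multiplies stability, so well-used memories fade slower."""
    return stability * boost

now = time.time()
week_ago = now - 7 * 86400
print(round(decay_score(week_ago, stability=5.0, now=now), 3))                    # ~0.247
print(round(decay_score(week_ago, stability=reinforce(5.0), now=now), 3))         # higher after reinforcement
print(decay_score(week_ago, stability=5.0, now=now, superseded=True))             # 0.0
```

Reinforcement raises S, flattening the curve for memories that keep getting retrieved, while superseding a fact drops its score to zero without destroying the historical record.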

Full observability

Every retrieval returns a score breakdown. See which signal fired, what the decay score was, why each memory was retrieved.
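A score breakdown might look like the payload below. The field names are hypothetical, chosen to mirror the four retrieval signals plus decay, and are not the actual API shape.

```python
# Hypothetical per-memory breakdown returned alongside each retrieval result.
result = {
    "memory_id": "mem-1",
    "text": "Alice Chen moved to Seattle.",
    "final_score": 0.81,
    "breakdown": {
        "raw_sim": 0.82,
        "enriched_sim": 0.91,
        "bm25": 0.40,
        "graph": 1.0,     # entity-linked to the query via the knowledge graph
        "decay": 0.95,    # recently reinforced, so barely faded
    },
    "matched_entities": ["Alice Chen", "Seattle"],
}

top_signal = max(result["breakdown"], key=result["breakdown"].get)
print(top_signal)  # which signal fired hardest -> "graph"
```

Having the breakdown on every result makes "why was this memory retrieved?" a lookup rather than a debugging session.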

Single-database architecture

PostgreSQL handles relational metadata, pgvector HNSW for ANN search, and graph edges in one transaction. One system to operate.
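A schema sketch of what "one transaction" means here. This DDL is illustrative, not the actual MemoryOS schema; the vector dimension and column names are assumptions. The point is that metadata rows, the HNSW-indexed embedding, and graph edges all commit together.

```sql
-- Illustrative sketch, not the shipped MemoryOS DDL.
BEGIN;

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE memories (
    id         bigserial PRIMARY KEY,
    session_id bigint NOT NULL,
    body       text NOT NULL,
    embedding  vector(1536),            -- dimension is an assumption
    created_at timestamptz DEFAULT now()
);

-- pgvector HNSW index for approximate nearest-neighbor search.
CREATE INDEX ON memories USING hnsw (embedding vector_cosine_ops);

CREATE TABLE edges (
    src_entity bigint NOT NULL,
    dst_entity bigint NOT NULL,
    predicate  text   NOT NULL,
    valid_from timestamptz NOT NULL,
    valid_to   timestamptz              -- NULL = current; superseded, never deleted
);

COMMIT;
```

Keeping all three in Postgres means an ingest either fully lands or fully rolls back, and there is no separate vector store or graph database to back up, scale, or keep consistent.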

How memory works

MemoryOS sits between your agent and the LLM. It retrieves the right context. The LLM does the reasoning.

01

Ingest

Raw text goes in. spaCy extracts entities. LLM enriches each chunk. Triples become graph edges.

02

Retrieve

Query is embedded and scored. Graph traversal finds entity-linked memories. Top candidates reranked.

03

Inject

Top-k memories injected into your LLM prompt as context. Agent now knows the right facts.

04

Persist

Each retrieval reinforces memory stability. New facts supersede old ones via append-only graph.
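The four steps above can be sketched as one loop. Function names and internals here are illustrative stand-ins, not the MemoryOS API: retrieval is reduced to naive keyword overlap where the real system embeds, traverses the graph, and reranks.

```python
# Minimal sketch of the four-step loop; names are illustrative, not the real API.

def ingest(store: list[dict], text: str) -> None:
    """01 Ingest: MemoryOS would run spaCy + LLM enrichment and write graph
    edges here; this stub just records the raw text."""
    store.append({"text": text, "stability": 1.0})

def retrieve(store: list[dict], query: str, k: int = 2) -> list[dict]:
    """02 Retrieve: stand-in for embedding + graph traversal + rerank,
    reduced to keyword overlap for the sketch."""
    overlap = lambda m: sum(w in m["text"].lower() for w in query.lower().split())
    return sorted(store, key=overlap, reverse=True)[:k]

def inject(prompt: str, memories: list[dict]) -> str:
    """03 Inject: prepend the top-k memories to the LLM prompt."""
    context = "\n".join(f"- {m['text']}" for m in memories)
    return f"Relevant memories:\n{context}\n\nUser: {prompt}"

def persist(memories: list[dict]) -> None:
    """04 Persist: retrieval reinforces each retrieved memory's stability."""
    for m in memories:
        m["stability"] *= 1.5

store: list[dict] = []
ingest(store, "Alice Chen moved to Seattle in 2023.")
ingest(store, "Bob Martinez works at Stripe.")

hits = retrieve(store, "Where does Alice live?")
prompt = inject("Where does Alice live?", hits)
persist(hits)
print(prompt)
```

The agent never stores or ranks anything itself: it hands text to `ingest`, and at question time receives a prompt already carrying the right facts.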

Self-hostable in minutes

One docker compose up and you're running.