md-first memory extraction framework for AI agents. Markdown is the single source of truth; SQLite holds state and LanceDB provides the rebuildable vector + BM25 + scalar index. The codebase follows a single-direction DDD layering (entrypoints -> service -> memory -> infra, with component / core / config cross-cutting) enforced by import-linter. Engineering surface: - Coding conventions in .claude/rules/ (path-scoped) and workflows in .claude/skills/ (/commit, /new-branch, /pr). - GitHub Actions CI runs make lint + test + integration; pre-commit mirrors the gates locally (ruff, hygiene hooks, gitlint commit-msg). - Commit messages follow Conventional Commits, enforced by gitlint. - make lint also enforces datetime two-zone discipline and OpenAPI drift.
295 lines
15 KiB
Markdown
295 lines
15 KiB
Markdown
# How Memory Works
|
||
|
||
How EverOS turns a stream of messages into durable, searchable memory —
|
||
the storage stack, the path layout on disk, the write→index→read
|
||
pipeline, and the consistency guarantees.
|
||
|
||
This is the narrative companion to the reference docs: see
|
||
[storage_layout.md](storage_layout.md) for the exact file encoding,
|
||
[architecture.md](architecture.md) for the layer boundaries, and
|
||
[api.md](api.md) for the HTTP contract.
|
||
|
||
## Table of contents
|
||
|
||
- [The storage stack](#the-storage-stack)
|
||
- [Storage paths](#storage-paths)
|
||
- [How a memory is born](#how-a-memory-is-born)
|
||
- [Memory types & storage strategies](#memory-types--storage-strategies)
|
||
- [The cascade daemon](#the-cascade-daemon)
|
||
- [The Offline Memory Engine (OME)](#the-offline-memory-engine-ome)
|
||
- [Consistency model](#consistency-model)
|
||
- [Zero external services](#zero-external-services)
|
||
- [Operating it](#operating-it)
|
||
|
||
## The storage stack
|
||
|
||
Three embedded pieces, each owning what it is best at. Markdown is the
|
||
**source of truth**; the other two are **derived and rebuildable**.
|
||
|
||
| Layer | Backed by | Holds | Rebuildable? |
|
||
|---|---|---|---|
|
||
| **Markdown + YAML frontmatter** | plain `.md` files | the memory content itself — the only portable, human-editable asset | — (it *is* the truth) |
|
||
| **SQLite** (`aiosqlite`) | `.index/sqlite/*.db` | system state, audit log, the cascade queue, the boundary buffer, OME engine state | ✅ from markdown |
|
||
| **LanceDB** (Arrow) | `.index/lancedb/*.lance` | vector + BM25 + scalar columns for retrieval | ✅ from markdown |
|
||
|
||
!!! note "The one rule that follows from this"
|
||
Delete the entire `.index/` directory and **no memory is lost** — it
|
||
rebuilds from the `.md` tree. There is no separate "export"; the
|
||
markdown *is* the export. (How to trigger a rebuild:
|
||
[Operating it](#operating-it).)
|
||
|
||
## Storage paths
|
||
|
||
The default memory root is **`~/.everos/`** (override with
|
||
`EVEROS_MEMORY__ROOT` or `[memory] root` in TOML). Configuration (the
|
||
`.env` file) is separate from data (the memory root): the server searches
|
||
`./.env` → `$XDG_CONFIG_HOME/everos/.env` → `~/.everos/.env`.
|
||
|
||
Memory is partitioned by **`<app_id>/<project_id>`** *before* the
|
||
user-visible directories, so different `(app, project)` spaces never share
|
||
a directory or cross in search. The reserved id `"default"` materialises as
|
||
`default_app` / `default_project` on disk (so a default space stays
|
||
visually distinct from a user-named one).
|
||
|
||
```
|
||
~/.everos/ ← memory root (EVEROS_MEMORY__ROOT)
|
||
├── default_app/ ← <app_id> ("default" → default_app)
|
||
│ └── default_project/ ← <project_id> ("default" → default_project)
|
||
│ ├── users/ ← user-visible (source of truth)
|
||
│ │ └── <user_id>/
|
||
│ │ ├── user.md single-file (profile)
|
||
│ │ ├── episodes/
|
||
│ │ │ └── episode-<YYYY-MM-DD>.md daily-log append
|
||
│ │ ├── .atomic_facts/ daily-log (hidden)
|
||
│ │ │ └── atomic_fact-<YYYY-MM-DD>.md
|
||
│ │ └── .foresights/ daily-log (hidden)
|
||
│ │ └── foresight-<YYYY-MM-DD>.md
|
||
│ ├── agents/
|
||
│ │ └── <agent_id>/
|
||
│ │ ├── .cases/ daily-log (hidden)
|
||
│ │ │ └── agent_case-<YYYY-MM-DD>.md
|
||
│ │ └── skills/ skill-named dir
|
||
│ │ └── skill_<name>/SKILL.md (+ references/ scripts/)
|
||
│ └── knowledge/ ← shared / global (reserved)
|
||
│
|
||
├── .index/ ← system-managed, rebuildable (gitignore)
|
||
│ ├── sqlite/
|
||
│ │ ├── system.db state / audit / cascade queue (md_change_state) / buffer / LSN
|
||
│ │ ├── ome.db Offline Memory Engine state
|
||
│ │ ├── ome.aps.db APScheduler jobstore (split to avoid lock contention)
|
||
│ │ └── ome.db.lock OME single-engine guard (portalocker)
|
||
│ └── lancedb/
|
||
│ └── <kind>.lance/ one Arrow table per kind
|
||
│
|
||
├── ome.toml ← user-editable OME strategy overrides (hot-reloaded)
|
||
└── .tmp/ atomic-write staging
|
||
```
|
||
|
||
!!! warning "Differences from older PRD-era docs"
|
||
The index dir is **`.index/`** (dot-prefixed), not `_index/`. The
|
||
cascade queue and LSN/audit state live in **SQLite** (`system.db`,
|
||
table `md_change_state`) — there is no `.cascade.log` / `.manifest.json`
|
||
file in the current implementation. The `<app>/<project>` nesting is
|
||
real and always present (`default_app/default_project` for the default
|
||
scope). There is **no `everos reindex` command** (see
|
||
[Operating it](#operating-it)).
|
||
|
||
The path manager is
|
||
[`MemoryRoot`](../src/everos/core/persistence/memory_root.py); every path
|
||
above is a property on it. `MemoryRoot.ensure()` creates the runtime dirs
|
||
(`.index/{sqlite,lancedb}/`, `.tmp/`) and copies the OME template to
|
||
`ome.toml`; user-visible dirs appear on first write.
|
||
|
||
## How a memory is born
|
||
|
||
A message does not become memory immediately — it accumulates, a boundary
|
||
is detected, an LLM extracts a cell, writers persist markdown, and the
|
||
index catches up asynchronously.
|
||
|
||
```
|
||
POST /add ──▶ unprocessed_buffer (SQLite) ← messages accumulate per (session, app, project)
|
||
│
|
||
├─ boundary detector trips ─┐
|
||
POST /flush ─────────┤ (or you force it) │ one LLM call
|
||
│ ▼
|
||
│ extract MemCell ──▶ memcell row (SQLite)
|
||
│ │
|
||
│ ┌──────────────┴───────────────┐
|
||
│ ▼ ▼
|
||
│ UserMemoryPipeline (sync) AgentMemoryPipeline (fire-and-forget)
|
||
│ writes episode .md NOW emits AgentPipelineStarted
|
||
▼ │ │
|
||
(response returns once md is on disk) │
|
||
▼ ▼
|
||
┌─────────────────── Offline Memory Engine (OME) ───────────────────┐
|
||
│ async strategies write derived .md: │
|
||
│ atomic_facts · foresight · user profile · agent cases · agent skills │
|
||
└───────────────────────────────┬──────────────────────────────────────┘
|
||
▼
|
||
cascade daemon watches the .md tree
|
||
▼
|
||
md_change_state queue (SQLite, durable)
|
||
▼
|
||
rebuild LanceDB rows ──▶ searchable
|
||
```
|
||
|
||
- **`/add`** appends messages to a per-`(session_id, app_id, project_id)`
|
||
buffer and returns `accumulated` (or `extracted` if the boundary tripped
|
||
on this call). See [api.md](api.md).
|
||
- **`/flush`** forces the boundary now (one extraction LLM call), used at
|
||
the end of a chat/agent run.
|
||
- Episode markdown is written **synchronously** — when `/flush` returns
|
||
`extracted`, the episode file is already on disk.
|
||
- Everything else (atomic facts, foresight, profile, agent cases/skills)
|
||
is produced **asynchronously** by the OME — see
|
||
[the OME section](#the-offline-memory-engine-ome).
|
||
- The **cascade daemon** turns every `.md` write into LanceDB rows so the
|
||
content becomes searchable.
|
||
|
||
## Memory types & storage strategies
|
||
|
||
Six business memory kinds today, each user- or agent-owned, each picking
|
||
one of three on-disk patterns:
|
||
|
||
| Kind | Owner | Dir / file | Strategy | Produced by |
|
||
|---|---|---|---|---|
|
||
| **episode** | user | `episodes/episode-<date>.md` | daily-log | extraction (sync) |
|
||
| **atomic_fact** | user | `.atomic_facts/atomic_fact-<date>.md` (hidden) | daily-log | OME |
|
||
| **foresight** | user | `.foresights/foresight-<date>.md` (hidden) | daily-log | OME |
|
||
| **profile** | user | `user.md` | single-file rewrite | OME |
|
||
| **agent_case** | agent | `.cases/agent_case-<date>.md` (hidden) | daily-log | OME |
|
||
| **agent_skill** | agent | `skills/skill_<name>/SKILL.md` | skill-named dir | OME (clustering) |
|
||
|
||
The three strategies:
|
||
|
||
| Strategy | Shape | Why |
|
||
|---|---|---|
|
||
| **Daily-log append** | `<prefix>-<YYYY-MM-DD>.md`, one entry appended per memory | collapses thousands of per-entry files into one file per day |
|
||
| **Single-file rewrite** | a fixed filename overwritten in place | for a single evolving document (a user/agent profile) |
|
||
| **Skill-named dir** | one directory per skill | a skill is a richer unit (body + optional `references/` `scripts/`) |
|
||
|
||
!!! note
|
||
The single-file writer also supports `agent.md` / `soul.md` /
|
||
`tools.md` / `behaviors.md`, but no shipped OME strategy produces those
|
||
yet — today only `user.md` is written. Detailed frontmatter and
|
||
entry-id encoding live in [storage_layout.md](storage_layout.md).
|
||
|
||
## The cascade daemon
|
||
|
||
The cascade subsystem keeps LanceDB in sync with the markdown tree. It runs
|
||
**in-process** with the server (a coroutine started by the app lifespan),
|
||
not as a separate OS daemon.
|
||
|
||
1. A native filesystem watcher (`watchdog`: FSEvents on macOS, inotify on
|
||
Linux) sees a `.md` create/modify.
|
||
2. The change is enqueued in the **`md_change_state`** table (SQLite) —
|
||
durable, so a crash mid-sync replays on restart.
|
||
3. A worker drains the queue at **entry-level** granularity: it diffs the
|
||
file, re-embeds only changed entries (keyed by `content_sha256`), and
|
||
upserts the LanceDB rows.
|
||
|
||
Because markdown is the source of truth, **editing a file directly is
|
||
fully supported** — open an episode in VSCode / Obsidian / Vim, change an
|
||
entry, save, and the daemon re-indexes just that entry. Operate the queue
|
||
with `everos cascade` ([Operating it](#operating-it)); deeper runbook in
|
||
[cascade_runbook.md](cascade_runbook.md).
|
||
|
||
## The Offline Memory Engine (OME)
|
||
|
||
Most memory kinds are **not** extracted on the request path — they are
|
||
derived later by the OME, an in-process async strategy engine. When
|
||
extraction carves a MemCell, it emits an event; OME strategies pick it up
|
||
and write their markdown when ready:
|
||
|
||
- `extract_atomic_facts` — single-sentence facts from an episode
|
||
- `extract_foresight` — anticipatory notes
|
||
- `extract_user_profile` — the aggregated `user.md`
|
||
- `extract_agent_case` — a reusable agent trajectory (only when the cell is
|
||
substantive enough; thin trajectories are skipped by design)
|
||
- `extract_agent_skill` — clusters related cases into a named skill
|
||
|
||
Strategies are configurable without a code change via **`ome.toml`** at the
|
||
memory root (hot-reloaded within ~2 s). Example — turn two off:
|
||
|
||
```toml
|
||
[strategies.extract_foresight]
|
||
enabled = false
|
||
|
||
[strategies.extract_user_profile]
|
||
enabled = false
|
||
```
|
||
|
||
OME keeps its own state in `.index/sqlite/ome.db` (run records, counters)
|
||
and its scheduler jobstore in `.index/sqlite/ome.aps.db` (split so the sync
|
||
APScheduler writer and the async OME writer never contend for one file
|
||
lock).
|
||
|
||
!!! tip "Implication for clients"
|
||
After `/flush` returns `extracted`, the **episode** is queryable soon
|
||
(once cascade indexes it), but **atomic facts / profile / agent cases**
|
||
appear only after their OME strategy runs — typically seconds later.
|
||
Poll / retry if you need them immediately.
|
||
|
||
## Consistency model
|
||
|
||
Two paths, two guarantees:
|
||
|
||
| Path | Guarantee | Detail |
|
||
|---|---|---|
|
||
| **Write** (`/add`, `/flush`) | **strong** | the episode `.md` is on disk before the call returns `extracted`; never blocks on LanceDB |
|
||
| **Read** (`/search`, `/get`) | **eventual** | reads LanceDB, which lags md by the cascade processing time — sub-second typically, up to ~10–15 s under load |
|
||
|
||
So a `/search` immediately after the `/flush` that produced a record may
|
||
miss it. The markdown is durable regardless; index lag never loses data. If
|
||
you need read-your-write, retry with backoff, or force the queue with
|
||
`everos cascade sync`.
|
||
|
||
Integrity is anchored by a few invariants (details in
|
||
[storage_layout.md](storage_layout.md)): the frontmatter `id` /
|
||
`entry_id` is the immutable join key; `content_sha256` decides whether an
|
||
entry needs re-embedding; an LSN watermark (in `system.db`) orders
|
||
rebuilds; the durable `md_change_state` queue is the replayable audit
|
||
trail.
|
||
|
||
## Zero external services
|
||
|
||
No database server, message broker, or vector service to run. Vector ANN,
|
||
full-text BM25, and scalar filtering all execute inside the **embedded
|
||
LanceDB** engine in one query; SQLite is a local file. The whole stack is a
|
||
single directory you can copy, back up, or check the user-visible parts of
|
||
into git.
|
||
|
||
!!! note
|
||
There is no automatic "grep over markdown" search fallback today — if
|
||
the LanceDB index is unavailable, rebuild it from markdown (it is
|
||
derived and disposable) rather than relying on a degraded search path.
|
||
|
||
## Operating it
|
||
|
||
The CLI ([cli.md](cli.md)) is intentionally small:
|
||
|
||
| Command | What it does |
|
||
|---|---|
|
||
| `everos init` | write a starter `.env` |
|
||
| `everos server start` | run the HTTP API (cascade + OME start with it) |
|
||
| `everos cascade status` | queue / LSN summary |
|
||
| `everos cascade sync` | drain the cascade queue now (force md → LanceDB) |
|
||
| `everos cascade fix` | list failed rows / re-enqueue retryable ones |
|
||
|
||
!!! warning "There is no `everos reindex` or `everos flush`"
|
||
- **Reindex** = the index is rebuildable: stop the server,
|
||
`rm -rf <memory-root>/.index/lancedb`, restart — the cascade
|
||
rebuilds from markdown. For an incremental catch-up, use
|
||
`everos cascade sync`.
|
||
- **Flush** is an HTTP endpoint (`POST /api/v1/memory/flush`), not a
|
||
CLI command — it forces *extraction* of the session buffer, which is
|
||
a different thing from forcing *index sync* (`cascade sync`).
|
||
|
||
## References
|
||
|
||
- [storage_layout.md](storage_layout.md) — exact file encoding, frontmatter
|
||
chassis, entry-id format, atomic-write semantics
|
||
- [architecture.md](architecture.md) — DDD layers and dependency rules
|
||
- [api.md](api.md) — the HTTP contract (`/add` `/flush` `/search` `/get`)
|
||
- [cascade_runbook.md](cascade_runbook.md) — operating the sync queue
|