EverOS/docs/how-memory-works.md
Elliot Chen 518b8eca85 chore: initialize EverOS 1.0.0
md-first memory extraction framework for AI agents.

Markdown is the single source of truth; SQLite holds state and LanceDB
provides the rebuildable vector + BM25 + scalar index. The codebase follows
a single-direction DDD layering (entrypoints -> service -> memory -> infra,
with component / core / config cross-cutting) enforced by import-linter.

Engineering surface:
- Coding conventions in .claude/rules/ (path-scoped) and workflows in
  .claude/skills/ (/commit, /new-branch, /pr).
- GitHub Actions CI runs make lint + test + integration; pre-commit mirrors
  the gates locally (ruff, hygiene hooks, gitlint commit-msg).
- Commit messages follow Conventional Commits, enforced by gitlint.
- make lint also enforces datetime two-zone discipline and OpenAPI drift.
2026-06-06 07:33:17 +08:00


# How Memory Works
How EverOS turns a stream of messages into durable, searchable memory —
the storage stack, the path layout on disk, the write→index→read
pipeline, and the consistency guarantees.
This is the narrative companion to the reference docs: see
[storage_layout.md](storage_layout.md) for the exact file encoding,
[architecture.md](architecture.md) for the layer boundaries, and
[api.md](api.md) for the HTTP contract.
## Table of contents
- [The storage stack](#the-storage-stack)
- [Storage paths](#storage-paths)
- [How a memory is born](#how-a-memory-is-born)
- [Memory types & storage strategies](#memory-types--storage-strategies)
- [The cascade daemon](#the-cascade-daemon)
- [The Offline Memory Engine (OME)](#the-offline-memory-engine-ome)
- [Consistency model](#consistency-model)
- [Zero external services](#zero-external-services)
- [Operating it](#operating-it)
## The storage stack
Three embedded pieces, each owning what it is best at. Markdown is the
**source of truth**; the other two are **derived and rebuildable**.
| Layer | Backed by | Holds | Rebuildable? |
|---|---|---|---|
| **Markdown + YAML frontmatter** | plain `.md` files | the memory content itself — the only portable, human-editable asset | — (it *is* the truth) |
| **SQLite** (`aiosqlite`) | `.index/sqlite/*.db` | system state, audit log, the cascade queue, the boundary buffer, OME engine state | ✅ from markdown |
| **LanceDB** (Arrow) | `.index/lancedb/*.lance` | vector + BM25 + scalar columns for retrieval | ✅ from markdown |
!!! note "The one rule that follows from this"
    Delete the entire `.index/` directory and **no memory is lost**: it
    rebuilds from the `.md` tree. There is no separate "export"; the
    markdown *is* the export. (How to trigger a rebuild:
    [Operating it](#operating-it).)
## Storage paths
The default memory root is **`~/.everos/`** (override with
`EVEROS_MEMORY__ROOT` or `[memory] root` in TOML). Configuration (the
`.env` file) is separate from data (the memory root): the server searches,
in order, `./.env`, then `$XDG_CONFIG_HOME/everos/.env`, then `~/.everos/.env`.
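
The search order above can be sketched as a small lookup. This is a minimal sketch, not the server's actual loader: `find_env_file` is a hypothetical helper, and both "the first existing file wins" and "skip the XDG location when `XDG_CONFIG_HOME` is unset" are assumptions.

```python
from pathlib import Path
from typing import List, Mapping, Optional


def find_env_file(cwd: Path, home: Path, environ: Mapping[str, str]) -> Optional[Path]:
    """Return the first existing .env file in the documented search order.

    Hypothetical helper for illustration; not part of the EverOS API.
    """
    candidates: List[Path] = [cwd / ".env"]
    xdg = environ.get("XDG_CONFIG_HOME")
    if xdg:  # assumption: this location is skipped when the variable is unset
        candidates.append(Path(xdg) / "everos" / ".env")
    candidates.append(home / ".everos" / ".env")
    for candidate in candidates:
        if candidate.is_file():  # assumption: first match wins
            return candidate
    return None
```

A caller would typically pass `Path.cwd()`, `Path.home()`, and `os.environ`.
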
Memory is partitioned by **`<app_id>/<project_id>`** *before* the
user-visible directories, so different `(app, project)` spaces never share
a directory or cross in search. The reserved id `"default"` materialises as
`default_app` / `default_project` on disk (so a default space stays
visually distinct from a user-named one).
```
~/.everos/                              ← memory root (EVEROS_MEMORY__ROOT)
├── default_app/                        ← <app_id> ("default" → default_app)
│   └── default_project/                ← <project_id> ("default" → default_project)
│       ├── users/                      ← user-visible (source of truth)
│       │   └── <user_id>/
│       │       ├── user.md             single-file (profile)
│       │       ├── episodes/
│       │       │   └── episode-<YYYY-MM-DD>.md       daily-log append
│       │       ├── .atomic_facts/      daily-log (hidden)
│       │       │   └── atomic_fact-<YYYY-MM-DD>.md
│       │       └── .foresights/        daily-log (hidden)
│       │           └── foresight-<YYYY-MM-DD>.md
│       ├── agents/
│       │   └── <agent_id>/
│       │       ├── .cases/             daily-log (hidden)
│       │       │   └── agent_case-<YYYY-MM-DD>.md
│       │       └── skills/             skill-named dir
│       │           └── skill_<name>/SKILL.md (+ references/ scripts/)
│       └── knowledge/                  ← shared / global (reserved)
├── .index/                             ← system-managed, rebuildable (gitignore)
│   ├── sqlite/
│   │   ├── system.db                   state / audit / cascade queue (md_change_state) / buffer / LSN
│   │   ├── ome.db                      Offline Memory Engine state
│   │   ├── ome.aps.db                  APScheduler jobstore (split to avoid lock contention)
│   │   └── ome.db.lock                 OME single-engine guard (portalocker)
│   └── lancedb/
│       └── <kind>.lance/               one Arrow table per kind
├── ome.toml                            ← user-editable OME strategy overrides (hot-reloaded)
└── .tmp/                               atomic-write staging
```
!!! warning "Differences from older PRD-era docs"
    The index dir is **`.index/`** (dot-prefixed), not `_index/`. The
    cascade queue and LSN/audit state live in **SQLite** (`system.db`,
    table `md_change_state`); there is no `.cascade.log` / `.manifest.json`
    file in the current implementation. The `<app>/<project>` nesting is
    real and always present (`default_app/default_project` for the default
    scope). There is **no `everos reindex` command** (see
    [Operating it](#operating-it)).
The path manager is
[`MemoryRoot`](../src/everos/core/persistence/memory_root.py); every path
above is a property on it. `MemoryRoot.ensure()` creates the runtime dirs
(`.index/{sqlite,lancedb}/`, `.tmp/`) and copies the OME template to
`ome.toml`; user-visible dirs appear on first write.
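
To make the layout concrete, here is a minimal sketch of how a path in this tree could be resolved. `space_dir` and `episode_path` are hypothetical helpers written for illustration; the real logic lives in `MemoryRoot`, whose property names are not reproduced here.

```python
from datetime import date
from pathlib import Path


def _segment(value: str, default_name: str) -> str:
    """Map the reserved id "default" to its on-disk directory name."""
    return default_name if value == "default" else value


def space_dir(root: Path, app_id: str = "default", project_id: str = "default") -> Path:
    """Directory holding one (app, project) space (hypothetical helper)."""
    return root / _segment(app_id, "default_app") / _segment(project_id, "default_project")


def episode_path(
    root: Path,
    user_id: str,
    day: date,
    app_id: str = "default",
    project_id: str = "default",
) -> Path:
    """Daily episode file for a user, following the tree shown above."""
    filename = f"episode-{day.isoformat()}.md"
    return space_dir(root, app_id, project_id) / "users" / user_id / "episodes" / filename
```

Because the space directory is resolved first, two different `(app_id, project_id)` pairs can never produce the same path for the same user.
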
## How a memory is born
A message does not become memory immediately — it accumulates, a boundary
is detected, an LLM extracts a cell, writers persist markdown, and the
index catches up asynchronously.
```
POST /add ──▶ unprocessed_buffer (SQLite)   ← messages accumulate per (session, app, project)
                     ├─ boundary detector trips ─┐
POST /flush ─────────┤  (or you force it)        │ one LLM call
                     │                           ▼
                     │                    extract MemCell ──▶ memcell row (SQLite)
                     │                           │
                     │            ┌──────────────┴───────────────┐
                     │            ▼                              ▼
                     │ UserMemoryPipeline (sync)    AgentMemoryPipeline (fire-and-forget)
                     │ writes episode .md NOW       emits AgentPipelineStarted
                     ▼            │                              │
   (response returns once md is on disk)                         │
                                  ▼                              ▼
  ┌──────────────────── Offline Memory Engine (OME) ─────────────────────┐
  │ async strategies write derived .md:                                  │
  │ atomic_facts · foresight · user profile · agent cases · agent skills │
  └───────────────────────────────┬──────────────────────────────────────┘
                 cascade daemon watches the .md tree
               md_change_state queue (SQLite, durable)
                 rebuild LanceDB rows ──▶ searchable
```
- **`/add`** appends messages to a per-`(session_id, app_id, project_id)`
buffer and returns `accumulated` (or `extracted` if the boundary tripped
on this call). See [api.md](api.md).
- **`/flush`** forces the boundary now (one extraction LLM call), used at
the end of a chat/agent run.
- Episode markdown is written **synchronously** — when `/flush` returns
`extracted`, the episode file is already on disk.
- Everything else (atomic facts, foresight, profile, agent cases/skills)
is produced **asynchronously** by the OME — see
[the OME section](#the-offline-memory-engine-ome).
- The **cascade daemon** turns every `.md` write into LanceDB rows so the
content becomes searchable.
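
The accumulate-then-extract behaviour of `/add` and `/flush` can be illustrated with a toy in-memory model. Everything here is an assumption made for illustration: `ToyBuffer` is hypothetical, the boundary is a simple message count rather than the real detector, no LLM is called, and the return value for flushing an empty buffer is a placeholder.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

BufferKey = Tuple[str, str, str]  # (session_id, app_id, project_id)


class ToyBuffer:
    """Toy model of the per-session buffer; not the EverOS implementation."""

    def __init__(self, boundary_after: int = 3) -> None:
        # Assumption: the boundary "trips" after a fixed number of messages.
        self._boundary_after = boundary_after
        self._pending: DefaultDict[BufferKey, List[str]] = defaultdict(list)
        self.cells: List[Tuple[BufferKey, List[str]]] = []

    def add(self, key: BufferKey, message: str) -> str:
        """Append a message; extract if the boundary trips on this call."""
        self._pending[key].append(message)
        if len(self._pending[key]) >= self._boundary_after:
            return self.flush(key)
        return "accumulated"

    def flush(self, key: BufferKey) -> str:
        """Force the boundary now and carve one cell from the buffer."""
        messages = self._pending.pop(key, [])
        if not messages:
            return "accumulated"  # placeholder: empty-buffer response is not specified here
        self.cells.append((key, messages))
        return "extracted"
```

Buffers are keyed by the full `(session_id, app_id, project_id)` tuple, so two sessions never share pending messages.
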
## Memory types & storage strategies
Six business memory kinds today, each user- or agent-owned, each picking
one of three on-disk patterns:
| Kind | Owner | Dir / file | Strategy | Produced by |
|---|---|---|---|---|
| **episode** | user | `episodes/episode-<date>.md` | daily-log | extraction (sync) |
| **atomic_fact** | user | `.atomic_facts/atomic_fact-<date>.md` (hidden) | daily-log | OME |
| **foresight** | user | `.foresights/foresight-<date>.md` (hidden) | daily-log | OME |
| **profile** | user | `user.md` | single-file rewrite | OME |
| **agent_case** | agent | `.cases/agent_case-<date>.md` (hidden) | daily-log | OME |
| **agent_skill** | agent | `skills/skill_<name>/SKILL.md` | skill-named dir | OME (clustering) |
The three strategies:
| Strategy | Shape | Why |
|---|---|---|
| **Daily-log append** | `<prefix>-<YYYY-MM-DD>.md`, one entry appended per memory | collapses thousands of per-entry files into one file per day |
| **Single-file rewrite** | a fixed filename overwritten in place | for a single evolving document (a user/agent profile) |
| **Skill-named dir** | one directory per skill | a skill is a richer unit (body + optional `references/` `scripts/`) |
!!! note
    The single-file writer also supports `agent.md` / `soul.md` /
    `tools.md` / `behaviors.md`, but no shipped OME strategy produces those
    yet; today only `user.md` is written. Detailed frontmatter and
    entry-id encoding live in [storage_layout.md](storage_layout.md).
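
The daily-log append strategy can be sketched in a few lines. This is an illustration under stated assumptions, not the shipped writer: `daily_log_path` and `append_entry` are hypothetical names, entries are separated by a blank line, and the sketch appends directly instead of staging through `.tmp/` as an atomic write would.

```python
from datetime import date
from pathlib import Path


def daily_log_path(directory: Path, prefix: str, day: date) -> Path:
    """Build `<prefix>-<YYYY-MM-DD>.md` inside the given directory."""
    return directory / f"{prefix}-{day.isoformat()}.md"


def append_entry(directory: Path, prefix: str, day: date, entry_markdown: str) -> Path:
    """Append one entry to the day's log, creating the file on first write."""
    path = daily_log_path(directory, prefix, day)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        # Assumption: entries are separated by a single blank line.
        handle.write(entry_markdown.rstrip("\n") + "\n\n")
    return path
```

Two entries written on the same day land in the same file, which is what keeps the file count at one per day per kind.
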
## The cascade daemon
The cascade subsystem keeps LanceDB in sync with the markdown tree. It runs
**in-process** with the server (a coroutine started by the app lifespan),
not as a separate OS daemon.
1. A native filesystem watcher (`watchdog`: FSEvents on macOS, inotify on
Linux) sees a `.md` create/modify.
2. The change is enqueued in the **`md_change_state`** table (SQLite) —
durable, so a crash mid-sync replays on restart.
3. A worker drains the queue at **entry-level** granularity: it diffs the
file, re-embeds only changed entries (keyed by `content_sha256`), and
upserts the LanceDB rows.
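
Step 2's durability can be illustrated with the standard-library `sqlite3` module. Only the table name `md_change_state` comes from this document; the columns, the status values, and the helper functions are hypothetical, and the real server uses `aiosqlite` rather than the synchronous driver shown here.

```python
import sqlite3
from typing import List


def open_queue(db_path: str) -> sqlite3.Connection:
    """Open (or create) a toy change queue. The schema is an assumption."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS md_change_state (
            path   TEXT PRIMARY KEY,  -- hypothetical column
            status TEXT NOT NULL      -- hypothetical: 'pending' or 'done'
        )
        """
    )
    conn.commit()
    return conn


def enqueue(conn: sqlite3.Connection, path: str) -> None:
    """Record a changed file; repeated changes collapse into one pending row."""
    conn.execute(
        "INSERT INTO md_change_state (path, status) VALUES (?, 'pending') "
        "ON CONFLICT(path) DO UPDATE SET status = 'pending'",
        (path,),
    )
    conn.commit()  # committed to disk before returning, so it survives a crash


def pending(conn: sqlite3.Connection) -> List[str]:
    """Rows a worker would still need to process, e.g. after a restart."""
    rows = conn.execute(
        "SELECT path FROM md_change_state WHERE status = 'pending' ORDER BY path"
    )
    return [row[0] for row in rows]


def mark_done(conn: sqlite3.Connection, path: str) -> None:
    """Mark a file as indexed once its rows have been upserted."""
    conn.execute("UPDATE md_change_state SET status = 'done' WHERE path = ?", (path,))
    conn.commit()
```

Closing the connection and reopening the same file returns the same pending rows, which is the replay-on-restart property the queue relies on.
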
Because markdown is the source of truth, **editing a file directly is
fully supported** — open an episode in VSCode / Obsidian / Vim, change an
entry, save, and the daemon re-indexes just that entry. Operate the queue
with `everos cascade` ([Operating it](#operating-it)); deeper runbook in
[cascade_runbook.md](cascade_runbook.md).
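
The entry-level diff in step 3 can be sketched as a comparison of content hashes. This is a minimal sketch: `plan_reindex` is a hypothetical function, hashing the raw UTF-8 entry text is an assumption about how `content_sha256` is computed, and removing index rows for deleted entries is also an assumption.

```python
import hashlib
from typing import Dict, List, Tuple


def content_sha256(text: str) -> str:
    """Hash an entry's text (assumption: raw UTF-8, no normalisation)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(
    indexed: Dict[str, str], entries: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Decide which entries need work after a file change.

    indexed: entry_id -> hash currently stored in the index
    entries: entry_id -> entry text currently in the markdown file
    Returns (ids to re-embed, ids to delete from the index).
    """
    to_embed = sorted(
        entry_id
        for entry_id, text in entries.items()
        if indexed.get(entry_id) != content_sha256(text)
    )
    to_delete = sorted(entry_id for entry_id in indexed if entry_id not in entries)
    return to_embed, to_delete
```

Unchanged entries hash to the value already stored, so editing one entry in a large daily log costs one embedding call rather than one per entry.
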
## The Offline Memory Engine (OME)
Most memory kinds are **not** extracted on the request path — they are
derived later by the OME, an in-process async strategy engine. When
extraction carves a MemCell, it emits an event; OME strategies pick it up
and write their markdown when ready:
- `extract_atomic_facts` — single-sentence facts from an episode
- `extract_foresight` — anticipatory notes
- `extract_user_profile` — the aggregated `user.md`
- `extract_agent_case` — a reusable agent trajectory (only when the cell is
substantive enough; thin trajectories are skipped by design)
- `extract_agent_skill` — clusters related cases into a named skill
Strategies are configurable without a code change via **`ome.toml`** at the
memory root (hot-reloaded within ~2 s). Example — turn two off:
```toml
[strategies.extract_foresight]
enabled = false
[strategies.extract_user_profile]
enabled = false
```
OME keeps its own state in `.index/sqlite/ome.db` (run records, counters)
and its scheduler jobstore in `.index/sqlite/ome.aps.db` (split so the sync
APScheduler writer and the async OME writer never contend for one file
lock).
!!! tip "Implication for clients"
    After `/flush` returns `extracted`, the **episode** is queryable soon
    (once cascade indexes it), but **atomic facts / profile / agent cases**
    appear only after their OME strategy runs, typically seconds later.
    Poll / retry if you need them immediately.
## Consistency model
Two paths, two guarantees:
| Path | Guarantee | Detail |
|---|---|---|
| **Write** (`/add`, `/flush`) | **strong** | the episode `.md` is on disk before the call returns `extracted`; never blocks on LanceDB |
| **Read** (`/search`, `/get`) | **eventual** | reads LanceDB, which lags the markdown by the cascade processing time: typically sub-second, up to ~10-15 s under load |
So a `/search` immediately after the `/flush` that produced a record may
miss it. The markdown is durable regardless; index lag never loses data. If
you need read-your-write, retry with backoff, or force the queue with
`everos cascade sync`.
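
The retry-with-backoff option can be sketched as a small client-side helper. `retry_until` is a hypothetical function, and the attempt count and delays are illustrative defaults rather than recommended values; `fetch` stands in for whatever call the client makes to `/search`.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_until(
    fetch: Callable[[], Optional[T]],
    attempts: int = 5,
    first_delay: float = 0.2,
    factor: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call `fetch` until it returns a truthy result or attempts run out.

    The delay between attempts grows geometrically (exponential backoff).
    """
    delay = first_delay
    for attempt in range(attempts):
        result = fetch()
        if result:
            return result
        if attempt < attempts - 1:  # no point sleeping after the last try
            sleep(delay)
            delay *= factor
    return None
```

With the defaults, the helper waits 0.2 s, 0.4 s, 0.8 s, and 1.6 s between its five attempts; a client that needs to cover the worst-case lag quoted above would raise `attempts` or `first_delay`.
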
Integrity is anchored by a few invariants (details in
[storage_layout.md](storage_layout.md)): the frontmatter `id` /
`entry_id` is the immutable join key; `content_sha256` decides whether an
entry needs re-embedding; an LSN watermark (in `system.db`) orders
rebuilds; the durable `md_change_state` queue is the replayable audit
trail.
## Zero external services
No database server, message broker, or vector service to run. Vector ANN,
full-text BM25, and scalar filtering all execute inside the **embedded
LanceDB** engine in one query; SQLite is a local file. The whole stack is a
single directory that you can copy or back up, and whose user-visible parts
you can check into git.
!!! note
    There is no automatic "grep over markdown" search fallback today. If
    the LanceDB index is unavailable, rebuild it from markdown (it is
    derived and disposable) rather than relying on a degraded search path.
## Operating it
The CLI ([cli.md](cli.md)) is intentionally small:
| Command | What it does |
|---|---|
| `everos init` | write a starter `.env` |
| `everos server start` | run the HTTP API (cascade + OME start with it) |
| `everos cascade status` | queue / LSN summary |
| `everos cascade sync` | drain the cascade queue now (force md → LanceDB) |
| `everos cascade fix` | list failed rows / re-enqueue retryable ones |
!!! warning "There is no `everos reindex` or `everos flush`"
    - **Reindex** = the index is rebuildable: stop the server,
      `rm -rf <memory-root>/.index/lancedb`, restart, and the cascade
      rebuilds from markdown. For an incremental catch-up, use
      `everos cascade sync`.
    - **Flush** is an HTTP endpoint (`POST /api/v1/memory/flush`), not a
      CLI command. It forces *extraction* of the session buffer, which is
      a different thing from forcing *index sync* (`cascade sync`).
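
If the manual reindex step needs to be scripted, the deletion can be done from Python instead of `rm -rf`. This is a sketch under assumptions: `drop_lancedb_index` is a hypothetical helper, it does not stop or restart the server for you, and it should only be run while the server is stopped.

```python
import shutil
from pathlib import Path


def drop_lancedb_index(memory_root: Path) -> bool:
    """Delete `<memory-root>/.index/lancedb` so it is rebuilt on next start.

    Returns True if a directory was removed, False if there was nothing to do.
    Only the derived index is touched; the markdown tree is left alone.
    """
    target = memory_root / ".index" / "lancedb"
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

The function deliberately hard-codes the `.index/lancedb` suffix so that a wrong `memory_root` cannot cause it to delete user-visible markdown.
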
## References
- [storage_layout.md](storage_layout.md) — exact file encoding, frontmatter
chassis, entry-id format, atomic-write semantics
- [architecture.md](architecture.md) — DDD layers and dependency rules
- [api.md](api.md) — the HTTP contract (`/add` `/flush` `/search` `/get`)
- [cascade_runbook.md](cascade_runbook.md) — operating the sync queue