chore: initialize EverOS 1.0.0

md-first memory extraction framework for AI agents.

Markdown is the single source of truth; SQLite holds state and LanceDB
provides the rebuildable vector + BM25 + scalar index. The codebase follows
a single-direction DDD layering (entrypoints -> service -> memory -> infra,
with component / core / config cross-cutting) enforced by import-linter.

Engineering surface:
- Coding conventions in .claude/rules/ (path-scoped) and workflows in
  .claude/skills/ (/commit, /new-branch, /pr).
- GitHub Actions CI runs make lint + test + integration; pre-commit mirrors
  the gates locally (ruff, hygiene hooks, gitlint commit-msg).
- Commit messages follow Conventional Commits, enforced by gitlint.
- make lint also enforces datetime two-zone discipline and OpenAPI drift.
Elliot Chen
2026-06-05 22:35:51 +08:00
commit 518b8eca85
636 changed files with 160553 additions and 0 deletions

docs/how-memory-works.md Normal file

@ -0,0 +1,294 @@
# How Memory Works
How EverOS turns a stream of messages into durable, searchable memory —
the storage stack, the path layout on disk, the write→index→read
pipeline, and the consistency guarantees.
This is the narrative companion to the reference docs: see
[storage_layout.md](storage_layout.md) for the exact file encoding,
[architecture.md](architecture.md) for the layer boundaries, and
[api.md](api.md) for the HTTP contract.
## Table of contents
- [The storage stack](#the-storage-stack)
- [Storage paths](#storage-paths)
- [How a memory is born](#how-a-memory-is-born)
- [Memory types & storage strategies](#memory-types--storage-strategies)
- [The cascade daemon](#the-cascade-daemon)
- [The Offline Memory Engine (OME)](#the-offline-memory-engine-ome)
- [Consistency model](#consistency-model)
- [Zero external services](#zero-external-services)
- [Operating it](#operating-it)
## The storage stack
Three embedded pieces, each owning what it is best at. Markdown is the
**source of truth**; the other two are **derived and rebuildable**.
| Layer | Backed by | Holds | Rebuildable? |
|---|---|---|---|
| **Markdown + YAML frontmatter** | plain `.md` files | the memory content itself — the only portable, human-editable asset | — (it *is* the truth) |
| **SQLite** (`aiosqlite`) | `.index/sqlite/*.db` | system state, audit log, the cascade queue, the boundary buffer, OME engine state | ✅ from markdown |
| **LanceDB** (Arrow) | `.index/lancedb/*.lance` | vector + BM25 + scalar columns for retrieval | ✅ from markdown |
!!! note "The one rule that follows from this"
    Delete the entire `.index/` directory and **no memory is lost**: it
    rebuilds from the `.md` tree. There is no separate "export"; the
    markdown *is* the export. (How to trigger a rebuild:
    [Operating it](#operating-it).)
## Storage paths
The default memory root is **`~/.everos/`** (override with
`EVEROS_MEMORY__ROOT` or `[memory] root` in TOML). Configuration (the
`.env` file) is separate from data (the memory root): the server searches
`./.env` → `$XDG_CONFIG_HOME/everos/.env` → `~/.everos/.env`.
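The search order above can be sketched as a first-match lookup. This is a minimal illustration, not the server's actual loader: the function names are hypothetical, and skipping the XDG candidate when `$XDG_CONFIG_HOME` is unset is an assumption.

```python
from pathlib import Path
from typing import Optional


def candidate_env_files(
    cwd: Path, home: Path, xdg_config_home: Optional[str]
) -> list[Path]:
    """Return the .env candidates in the documented search order."""
    candidates = [cwd / ".env"]
    if xdg_config_home:
        # Assumption: the XDG candidate is skipped when the variable is unset.
        candidates.append(Path(xdg_config_home) / "everos" / ".env")
    candidates.append(home / ".everos" / ".env")
    return candidates


def find_env_file(
    cwd: Path, home: Path, xdg_config_home: Optional[str]
) -> Optional[Path]:
    """Return the first candidate that exists as a file, or None."""
    for path in candidate_env_files(cwd, home, xdg_config_home):
        if path.is_file():
            return path
    return None
```

A caller would typically pass `Path.cwd()`, `Path.home()`, and `os.environ.get("XDG_CONFIG_HOME")`.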
Memory is partitioned by **`<app_id>/<project_id>`** *before* the
user-visible directories, so different `(app, project)` spaces never share
a directory or cross in search. The reserved id `"default"` materialises as
`default_app` / `default_project` on disk (so a default space stays
visually distinct from a user-named one).
```
~/.everos/                                  ← memory root (EVEROS_MEMORY__ROOT)
├── default_app/                            ← <app_id> ("default" → default_app)
│   └── default_project/                    ← <project_id> ("default" → default_project)
│       ├── users/                          ← user-visible (source of truth)
│       │   └── <user_id>/
│       │       ├── user.md                         single-file (profile)
│       │       ├── episodes/
│       │       │   └── episode-<YYYY-MM-DD>.md     daily-log append
│       │       ├── .atomic_facts/                  daily-log (hidden)
│       │       │   └── atomic_fact-<YYYY-MM-DD>.md
│       │       └── .foresights/                    daily-log (hidden)
│       │           └── foresight-<YYYY-MM-DD>.md
│       ├── agents/
│       │   └── <agent_id>/
│       │       ├── .cases/                         daily-log (hidden)
│       │       │   └── agent_case-<YYYY-MM-DD>.md
│       │       └── skills/                         skill-named dir
│       │           └── skill_<name>/SKILL.md       (+ references/ scripts/)
│       └── knowledge/                      ← shared / global (reserved)
├── .index/                                 ← system-managed, rebuildable (gitignore)
│   ├── sqlite/
│   │   ├── system.db       state / audit / cascade queue (md_change_state) / buffer / LSN
│   │   ├── ome.db          Offline Memory Engine state
│   │   ├── ome.aps.db      APScheduler jobstore (split to avoid lock contention)
│   │   └── ome.db.lock     OME single-engine guard (portalocker)
│   └── lancedb/
│       └── <kind>.lance/   one Arrow table per kind
├── ome.toml                                ← user-editable OME strategy overrides (hot-reloaded)
└── .tmp/                                     atomic-write staging
```
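To make the `<app_id>/<project_id>` partitioning concrete, here is a small sketch of how a scoped episode path could be derived, including the `"default"` → `default_app` / `default_project` mapping. The helper names are hypothetical and are not the real `MemoryRoot` properties.

```python
from pathlib import Path

RESERVED_DEFAULT = "default"


def scope_dir_name(scope_id: str, kind: str) -> str:
    """Map an id to its directory name; `kind` is "app" or "project"."""
    if scope_id == RESERVED_DEFAULT:
        # The reserved id materialises as default_app / default_project.
        return f"default_{kind}"
    return scope_id


def episode_path(
    root: Path, app_id: str, project_id: str, user_id: str, day: str
) -> Path:
    """Build the daily episode file path for one user in one (app, project)."""
    return (
        root
        / scope_dir_name(app_id, "app")
        / scope_dir_name(project_id, "project")
        / "users"
        / user_id
        / "episodes"
        / f"episode-{day}.md"
    )
```

Because the scope segments come first, two `(app, project)` pairs can never resolve to the same user directory.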
!!! warning "Differences from older PRD-era docs"
    The index dir is **`.index/`** (dot-prefixed), not `_index/`. The
    cascade queue and LSN/audit state live in **SQLite** (`system.db`,
    table `md_change_state`); there is no `.cascade.log` / `.manifest.json`
    file in the current implementation. The `<app>/<project>` nesting is
    real and always present (`default_app/default_project` for the default
    scope). There is **no `everos reindex` command** (see
    [Operating it](#operating-it)).
The path manager is
[`MemoryRoot`](../src/everos/core/persistence/memory_root.py); every path
above is a property on it. `MemoryRoot.ensure()` creates the runtime dirs
(`.index/{sqlite,lancedb}/`, `.tmp/`) and copies the OME template to
`ome.toml`; user-visible dirs appear on first write.
## How a memory is born
A message does not become memory immediately — it accumulates, a boundary
is detected, an LLM extracts a cell, writers persist markdown, and the
index catches up asynchronously.
```
POST /add ──▶ unprocessed_buffer (SQLite)   ← messages accumulate per (session, app, project)
                     ├─ boundary detector trips ─┐
POST /flush ─────────┤   (or you force it)       │ one LLM call
                     │                           ▼
                     │                    extract MemCell ──▶ memcell row (SQLite)
                     │                           │
                     │            ┌──────────────┴───────────────┐
                     │            ▼                              ▼
                     │  UserMemoryPipeline (sync)   AgentMemoryPipeline (fire-and-forget)
                     │  writes episode .md NOW      emits AgentPipelineStarted
                     ▼            │                              │
   (response returns once md is on disk)                         │
                                  ▼                              ▼
   ┌──────────────────── Offline Memory Engine (OME) ─────────────────────┐
   │ async strategies write derived .md:                                  │
   │ atomic_facts · foresight · user profile · agent cases · agent skills │
   └───────────────────────────────┬──────────────────────────────────────┘
                  cascade daemon watches the .md tree
                md_change_state queue (SQLite, durable)
                  rebuild LanceDB rows ──▶ searchable
```
- **`/add`** appends messages to a per-`(session_id, app_id, project_id)`
buffer and returns `accumulated` (or `extracted` if the boundary tripped
on this call). See [api.md](api.md).
- **`/flush`** forces the boundary now (one extraction LLM call), used at
the end of a chat/agent run.
- Episode markdown is written **synchronously** — when `/flush` returns
`extracted`, the episode file is already on disk.
- Everything else (atomic facts, foresight, profile, agent cases/skills)
is produced **asynchronously** by the OME — see
[the OME section](#the-offline-memory-engine-ome).
- The **cascade daemon** turns every `.md` write into LanceDB rows so the
content becomes searchable.
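The accumulate-then-extract behaviour of `/add` and `/flush` can be modelled with a toy in-memory buffer. This is only a sketch under stated assumptions: the real buffer lives in SQLite and extraction is an LLM call, whereas here the boundary detector is an injected predicate, "extraction" just moves the messages into a list, and the `"empty"` status for flushing an empty buffer is invented for the toy.

```python
from collections import defaultdict
from typing import Callable

BufferKey = tuple[str, str, str]  # (session_id, app_id, project_id)


class ToyBuffer:
    """In-memory stand-in for the per-(session, app, project) buffer."""

    def __init__(self, boundary: Callable[[list[str]], bool]) -> None:
        self._boundary = boundary  # hypothetical boundary detector
        self._pending: dict[BufferKey, list[str]] = defaultdict(list)
        self.extracted_cells: list[list[str]] = []

    def add(self, key: BufferKey, message: str) -> str:
        """Mimic /add: accumulate, and extract if the boundary trips."""
        self._pending[key].append(message)
        if self._boundary(self._pending[key]):
            return self._extract(key)
        return "accumulated"

    def flush(self, key: BufferKey) -> str:
        """Mimic /flush: force the boundary for this buffer now."""
        if not self._pending.get(key):
            return "empty"  # toy-only status, not part of the documented API
        return self._extract(key)

    def _extract(self, key: BufferKey) -> str:
        # Stand-in for the single extraction LLM call.
        self.extracted_cells.append(self._pending.pop(key))
        return "extracted"
```

With `ToyBuffer(boundary=lambda msgs: len(msgs) >= 3)`, the third `add` on a key returns `extracted`, while a `flush` after a single message extracts early.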
## Memory types & storage strategies
Six business memory kinds today, each user- or agent-owned, each picking
one of three on-disk patterns:
| Kind | Owner | Dir / file | Strategy | Produced by |
|---|---|---|---|---|
| **episode** | user | `episodes/episode-<date>.md` | daily-log | extraction (sync) |
| **atomic_fact** | user | `.atomic_facts/atomic_fact-<date>.md` (hidden) | daily-log | OME |
| **foresight** | user | `.foresights/foresight-<date>.md` (hidden) | daily-log | OME |
| **profile** | user | `user.md` | single-file rewrite | OME |
| **agent_case** | agent | `.cases/agent_case-<date>.md` (hidden) | daily-log | OME |
| **agent_skill** | agent | `skills/skill_<name>/SKILL.md` | skill-named dir | OME (clustering) |
The three strategies:
| Strategy | Shape | Why |
|---|---|---|
| **Daily-log append** | `<prefix>-<YYYY-MM-DD>.md`, one entry appended per memory | collapses thousands of per-entry files into one file per day |
| **Single-file rewrite** | a fixed filename overwritten in place | for a single evolving document (a user/agent profile) |
| **Skill-named dir** | one directory per skill | a skill is a richer unit (body + optional `references/` `scripts/`) |
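As an illustration of the daily-log append pattern, the sketch below appends entries to one `<prefix>-<YYYY-MM-DD>.md` file per day. It is deliberately simplified: the real writers add YAML frontmatter and entry ids and stage writes atomically through `.tmp/`, none of which is modelled here, and the blank-line entry separator is an assumption.

```python
from datetime import date
from pathlib import Path


def daily_log_path(directory: Path, prefix: str, day: date) -> Path:
    """Return <directory>/<prefix>-<YYYY-MM-DD>.md for the given day."""
    return directory / f"{prefix}-{day.isoformat()}.md"


def append_entry(directory: Path, prefix: str, day: date, entry: str) -> Path:
    """Append one entry to the day's log file, creating it on first write."""
    directory.mkdir(parents=True, exist_ok=True)
    path = daily_log_path(directory, prefix, day)
    with path.open("a", encoding="utf-8") as fh:
        # Assumption: entries are separated by a single blank line.
        fh.write(entry.rstrip("\n") + "\n\n")
    return path
```

However many entries arrive on a given day, the directory gains only one file for that day.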
!!! note
    The single-file writer also supports `agent.md` / `soul.md` /
    `tools.md` / `behaviors.md`, but no shipped OME strategy produces those
    yet; today only `user.md` is written. Detailed frontmatter and
    entry-id encoding live in [storage_layout.md](storage_layout.md).
## The cascade daemon
The cascade subsystem keeps LanceDB in sync with the markdown tree. It runs
**in-process** with the server (a coroutine started by the app lifespan),
not as a separate OS daemon.
1. A native filesystem watcher (`watchdog`: FSEvents on macOS, inotify on
   Linux) sees a `.md` create/modify.
2. The change is enqueued in the **`md_change_state`** table (SQLite). The
   queue is durable, so a crash mid-sync replays on restart.
3. A worker drains the queue at **entry-level** granularity: it diffs the
   file, re-embeds only changed entries (keyed by `content_sha256`), and
   upserts the LanceDB rows.
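The entry-level diff in step 3 can be sketched as a hash comparison. This assumes the file has already been parsed into an `entry_id → text` mapping and that the index exposes the stored hash per entry; both mappings and the function names are hypothetical, not the daemon's real interfaces.

```python
import hashlib


def content_sha256(text: str) -> str:
    """Hex SHA-256 of an entry's text, used as the change detector."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def entries_to_reembed(entries: dict[str, str], indexed: dict[str, str]) -> list[str]:
    """Return ids of entries that are new or whose content hash changed.

    entries: entry_id -> current text parsed from the markdown file
    indexed: entry_id -> content_sha256 currently stored in the index
    """
    return sorted(
        entry_id
        for entry_id, text in entries.items()
        if indexed.get(entry_id) != content_sha256(text)
    )
```

Unchanged entries hash to the value already stored, so editing one entry in a large daily log triggers one re-embedding rather than a re-index of the whole file.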
Because markdown is the source of truth, **editing a file directly is
fully supported** — open an episode in VSCode / Obsidian / Vim, change an
entry, save, and the daemon re-indexes just that entry. Operate the queue
with `everos cascade` ([Operating it](#operating-it)); deeper runbook in
[cascade_runbook.md](cascade_runbook.md).
## The Offline Memory Engine (OME)
Most memory kinds are **not** extracted on the request path — they are
derived later by the OME, an in-process async strategy engine. When
extraction carves a MemCell, it emits an event; OME strategies pick it up
and write their markdown when ready:
- `extract_atomic_facts` — single-sentence facts from an episode
- `extract_foresight` — anticipatory notes
- `extract_user_profile` — the aggregated `user.md`
- `extract_agent_case` — a reusable agent trajectory (only when the cell is
substantive enough; thin trajectories are skipped by design)
- `extract_agent_skill` — clusters related cases into a named skill
Strategies are configurable without a code change via **`ome.toml`** at the
memory root (hot-reloaded within ~2 s). Example — turn two off:
```toml
[strategies.extract_foresight]
enabled = false
[strategies.extract_user_profile]
enabled = false
```
OME keeps its own state in `.index/sqlite/ome.db` (run records, counters)
and its scheduler jobstore in `.index/sqlite/ome.aps.db` (split so the sync
APScheduler writer and the async OME writer never contend for one file
lock).
!!! tip "Implication for clients"
    After `/flush` returns `extracted`, the **episode** is queryable soon
    (once cascade indexes it), but **atomic facts / profile / agent cases**
    appear only after their OME strategy runs, typically seconds later.
    Poll / retry if you need them immediately.
## Consistency model
Two paths, two guarantees:
| Path | Guarantee | Detail |
|---|---|---|
| **Write** (`/add`, `/flush`) | **strong** | the episode `.md` is on disk before the call returns `extracted`; never blocks on LanceDB |
| **Read** (`/search`, `/get`) | **eventual** | reads LanceDB, which lags the markdown by the cascade processing time: typically sub-second, up to ~10-15 s under load |
So a `/search` immediately after the `/flush` that produced a record may
miss it. The markdown is durable regardless; index lag never loses data. If
you need read-your-write, retry with backoff, or force the queue with
`everos cascade sync`.
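A client-side retry with exponential backoff might look like the sketch below. It is transport-agnostic on purpose: `fetch` is a hypothetical callable that wraps whatever `/search` or `/get` request the client makes and returns `None` while the record is not yet indexed. The default delays are illustrative, not tuned values.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_until_found(
    fetch: Callable[[], Optional[T]],
    attempts: int = 5,
    base_delay: float = 0.2,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call `fetch` until it returns a non-None value or attempts run out."""
    delay = base_delay
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        if attempt < attempts - 1:
            sleep(delay)  # wait for the cascade to catch up
            delay = min(delay * 2, max_delay)  # exponential backoff, capped
    return None
```

Injecting `sleep` keeps the helper testable without real waiting. With the defaults, the total wait before giving up is 0.2 + 0.4 + 0.8 + 1.6 = 3.0 s, so raise `attempts` if the index is expected to lag further under load.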
Integrity is anchored by a few invariants (details in
[storage_layout.md](storage_layout.md)): the frontmatter `id` /
`entry_id` is the immutable join key; `content_sha256` decides whether an
entry needs re-embedding; an LSN watermark (in `system.db`) orders
rebuilds; the durable `md_change_state` queue is the replayable audit
trail.
## Zero external services
No database server, message broker, or vector service to run. Vector ANN,
full-text BM25, and scalar filtering all execute inside the **embedded
LanceDB** engine in one query; SQLite is a local file. The whole stack is a
single directory you can copy or back up; its user-visible parts can also be
checked into git.
!!! note
    There is no automatic "grep over markdown" search fallback today. If
    the LanceDB index is unavailable, rebuild it from markdown (it is
    derived and disposable) rather than relying on a degraded search path.
## Operating it
The CLI ([cli.md](cli.md)) is intentionally small:
| Command | What it does |
|---|---|
| `everos init` | write a starter `.env` |
| `everos server start` | run the HTTP API (cascade + OME start with it) |
| `everos cascade status` | queue / LSN summary |
| `everos cascade sync` | drain the cascade queue now (force md → LanceDB) |
| `everos cascade fix` | list failed rows / re-enqueue retryable ones |
!!! warning "There is no `everos reindex` or `everos flush`"
    - **Reindex**: the index is rebuildable. Stop the server, run
      `rm -rf <memory-root>/.index/lancedb`, and restart; the cascade
      rebuilds from markdown. For an incremental catch-up, use
      `everos cascade sync`.
    - **Flush** is an HTTP endpoint (`POST /api/v1/memory/flush`), not a
      CLI command. It forces *extraction* of the session buffer, which is
      a different thing from forcing *index sync* (`cascade sync`).
## References
- [storage_layout.md](storage_layout.md) — exact file encoding, frontmatter
chassis, entry-id format, atomic-write semantics
- [architecture.md](architecture.md) — DDD layers and dependency rules
- [api.md](api.md) — the HTTP contract (`/add` `/flush` `/search` `/get`)
- [cascade_runbook.md](cascade_runbook.md) — operating the sync queue