md-first memory extraction framework for AI agents. Markdown is the single source of truth; SQLite holds state and LanceDB provides the rebuildable vector + BM25 + scalar index. The codebase follows a single-direction DDD layering (entrypoints -> service -> memory -> infra, with component / core / config cross-cutting) enforced by import-linter. Engineering surface: - Coding conventions in .claude/rules/ (path-scoped) and workflows in .claude/skills/ (/commit, /new-branch, /pr). - GitHub Actions CI runs make lint + test + integration; pre-commit mirrors the gates locally (ruff, hygiene hooks, gitlint commit-msg). - Commit messages follow Conventional Commits, enforced by gitlint. - make lint also enforces datetime two-zone discipline and OpenAPI drift.
223 lines
9.9 KiB
Markdown
223 lines
9.9 KiB
Markdown
# Storage Layout
|
|
|
|
How `everos` lays out a memory-root on disk: directory tree, file
|
|
naming, frontmatter chassis, and entry-id encoding.
|
|
|
|
The contents are the **source of truth**; SQLite and LanceDB are
|
|
derived indexes that can be rebuilt from markdown alone.
|
|
|
|
## 1. Memory-root tree
|
|
|
|
A memory-root is a single directory holding all persisted memory. The
|
|
default location is `~/.everos/`; override via `EVEROS_MEMORY__ROOT`
|
|
env var or `[memory] root` in the TOML config.
|
|
|
|
Memory is partitioned by **`<app_id>/<project_id>`** *before* the
|
|
user-visible scope dirs, so different `(app, project)` spaces never share
|
|
a directory. The reserved id `"default"` materialises as `default_app` /
|
|
`default_project` on disk. The scope is encoded **in the path**, not in
|
|
the frontmatter (see [§3](#3-frontmatter-chassis-yaml)).
|
|
|
|
```
|
|
<memory-root>/ default ~/.everos
|
|
│
|
|
├── <app_id>/ user-visible; "default" → default_app
|
|
│ └── <project_id>/ "default" → default_project
|
|
│ ├── users/
|
|
│ │ └── <user_id>/
|
|
│ │ ├── user.md single-file rewrite (profile)
|
|
│ │ ├── episodes/ daily-log append
|
|
│ │ │ └── episode-<YYYY-MM-DD>.md
|
|
│ │ ├── .atomic_facts/ daily-log append (hidden)
|
|
│ │ │ └── atomic_fact-<YYYY-MM-DD>.md
|
|
│ │ └── .foresights/ daily-log append (hidden)
|
|
│ │ └── foresight-<YYYY-MM-DD>.md
|
|
│ ├── agents/
|
|
│ │ └── <agent_id>/
|
|
│ │ ├── .cases/ daily-log append (hidden)
|
|
│ │ │ └── agent_case-<YYYY-MM-DD>.md
|
|
│ │ └── skills/ skill-named dir
|
|
│ │ └── skill_<name>/
|
|
│ │ ├── SKILL.md
|
|
│ │ ├── references/ (optional)
|
|
│ │ └── scripts/ (optional)
|
|
│ └── knowledge/ user-visible (shared / global, reserved)
|
|
│
|
|
├── .index/ system-managed, rebuildable (gitignore)
|
|
│ ├── sqlite/
|
|
│ │ ├── system.db state / cascade queue (md_change_state) / buffer / audit / LSN (+ -wal / -shm)
|
|
│ │ ├── ome.db Offline Memory Engine state
|
|
│ │ ├── ome.aps.db APScheduler jobstore (split to avoid lock contention)
|
|
│ │ └── ome.db.lock OME single-engine guard (portalocker)
|
|
│ └── lancedb/
|
|
│ └── <kind>.lance/ one directory per LanceDB table
|
|
│
|
|
├── ome.toml user-editable OME strategy overrides (hot-reloaded)
|
|
└── .tmp/ staging dir for batch / multi-step writes
|
|
```
|
|
|
|
> Cascade queue state, the LSN watermark, and the change audit all live in
|
|
> SQLite (`system.db`, table `md_change_state`) — crash-recovery replays
|
|
> from that durable queue, not a log file. (`MemoryRoot` also exposes a
|
|
> `.lock` anchor for the `memory_root_lock` primitive; there is no
|
|
> `.cascade.log` / `.manifest.json`.)
|
|
|
|
The path manager is [`MemoryRoot`](../src/everos/core/persistence/memory_root.py),
|
|
exposing every path as a property. `MemoryRoot.ensure()` creates the
|
|
runtime-required dirs (`.index/{sqlite,lancedb}/`, `.tmp/`) and copies the
|
|
OME template to `ome.toml`; the user-visible dirs are *not* pre-created —
|
|
they appear on first write.
|
|
|
|
> The single-file writer also supports `agent.md` / `soul.md` / `tools.md`
|
|
> / `behaviors.md`, but no shipped strategy produces those today — only
|
|
> `user.md` is written. `memcell` is a SQLite-only kind (the boundary
|
|
> ledger); it has no markdown file.
|
|
|
|
## 2. Three storage strategies
|
|
|
|
Each business memory kind picks one of three on-disk patterns:
|
|
|
|
| Strategy | Filename | Mutation | Examples |
|
|
|---|---|---|---|
|
|
| **Daily-log append** | `<FILE_PREFIX>-<YYYY-MM-DD>.md` under `<DIR_NAME>/` | append entries | episode / atomic_fact / foresight / agent_case |
|
|
| **Skill-named dir** | `skills/skill_<name>/SKILL.md` (+ `references/` `scripts/`) | overwrite the file | agent skills (procedural memory) |
|
|
| **Single-file rewrite** | `user.md` (writer also supports `agent.md` / `soul.md` / `tools.md` / `behaviors.md`, not yet produced) | overwrite the file | user profile |
|
|
|
|
Markdown IO primitives live in
|
|
[`core/persistence/markdown/`](../src/everos/core/persistence/markdown/);
|
|
business-aware writers live in
|
|
[`infra/persistence/markdown/writers/`](../src/everos/infra/persistence/markdown/writers/)
|
|
and pick the right strategy via a base class.
|
|
|
|
To add a new memory kind, define its per-kind frontmatter schema under
|
|
[`infra/persistence/markdown/mds/`](../src/everos/infra/persistence/markdown/mds/)
|
|
and add a matching writer/reader pair under
|
|
[`writers/`](../src/everos/infra/persistence/markdown/writers/) and
|
|
[`readers/`](../src/everos/infra/persistence/markdown/readers/).
|
|
|
|
## 3. Frontmatter chassis (YAML)
|
|
|
|
Every markdown file carries a YAML frontmatter block at the top:
|
|
|
|
```markdown
|
|
---
|
|
id: episode_log_alice_2026-06-01
|
|
type: episode_daily
|
|
file_type: episode_daily
|
|
schema_version: 1
|
|
user_id: alice
|
|
track: user
|
|
date: '2026-06-01'
|
|
entry_count: 11
|
|
last_appended_at: '2026-06-01T09:12:13+00:00'
|
|
---
|
|
<!-- entry:ep_20260601_00000001 -->
|
|
...content...
|
|
<!-- /entry:ep_20260601_00000001 -->
|
|
```
|
|
|
|
Scope (`app_id` / `project_id`) is **not** a frontmatter field — it is
|
|
carried by the `<app>/<project>` path segments and recovered by the
|
|
cascade path parser. The frontmatter only holds the file-level owner
|
|
(`user_id` / `agent_id`) and `track`.
|
|
|
|
The chassis lives in [`core/persistence/markdown/frontmatter.py`](../src/everos/core/persistence/markdown/frontmatter.py)
|
|
(Pydantic v2):
|
|
|
|
```
|
|
BaseFrontmatter id / type / schema_version + SCOPE_DIR ClassVar
|
|
├─ UserScopedFrontmatter + user_id / track="user" + SCOPE_DIR="users"
|
|
└─ AgentScopedFrontmatter + agent_id / track="agent" + SCOPE_DIR="agents"
|
|
```
|
|
|
|
Concrete business schemas subclass one of the scope mixins and add
|
|
per-kind fields plus three more ClassVars that drive path resolution
|
|
+ entry-id assembly:
|
|
|
|
```python
|
|
class EpisodeDailyFrontmatter(DailyLogPathMixin, UserScopedFrontmatter):
|
|
ENTRY_ID_PREFIX: ClassVar[str] = "ep"
|
|
DIR_NAME: ClassVar[str] = "episodes"
|
|
FILE_PREFIX: ClassVar[str] = "episode"
|
|
type: Literal["episode_daily"] = "episode_daily"
|
|
date: dt.date
|
|
entry_count: int = 0
|
|
last_appended_at: dt.datetime | None = None
|
|
```
|
|
|
|
## 4. Entry-id encoding
|
|
|
|
Inside daily-log files each entry is bracketed by HTML-comment markers
|
|
so the raw markdown stays clean for human readers:
|
|
|
|
```
|
|
<!-- entry:<entry_id> -->
|
|
...content...
|
|
<!-- /entry:<entry_id> -->
|
|
```
|
|
|
|
`<entry_id>` is `<prefix>_<YYYYMMDD>_<NNNNNNNN>` (8-digit sequence),
|
|
e.g. `ep_20260601_00000001`:
|
|
|
|
| Segment | Source |
|
|
|---|---|
|
|
| `prefix` | `Frontmatter.ENTRY_ID_PREFIX` (declared by the schema subclass) |
|
|
| `<YYYYMMDD>` | The daily-log file's date bucket |
|
|
| `NNNNNNNN` | Per-file sequence, 8-digit zero-padded, restarts at `00000001` each day per scope |
|
|
|
|
Implementation: [`core/persistence/markdown/entries.py`](../src/everos/core/persistence/markdown/entries.py)
|
|
(`EntryId.parse / format / next_for`).
|
|
|
|
> **File-level seq, not global**: the same `ep_20260601_00000001` may
|
|
> appear across two different `user_id`s (each user has its own daily file).
|
|
> Cross-table joins must therefore key on **`(scope_id, entry_id)`**
|
|
> rather than `entry_id` alone — see SQLite/LanceDB tables that follow.
|
|
|
|
## 5. SQLite + LanceDB derived indexes
|
|
|
|
```
|
|
.index/
|
|
├── sqlite/
|
|
│ └── system.db state / audit log / task queue / LSN watermark
|
|
│ + per-kind business state tables (composite key)
|
|
└── lancedb/
|
|
└── <kind>.lance/ one Arrow-based table per kind
|
|
stores text / vector / tags / metadata
|
|
```
|
|
|
|
- **SQLite** schema lives in
|
|
[`infra/persistence/sqlite/tables/`](../src/everos/infra/persistence/sqlite/tables/);
|
|
every business table that joins back to markdown declares a
|
|
`UniqueConstraint("user_id", "entry_id")` (or `agent_id` symmetric).
|
|
- **LanceDB** schemas live in
|
|
[`infra/persistence/lancedb/tables/`](../src/everos/infra/persistence/lancedb/tables/);
|
|
`Vector(N)` dimension matches the embedding model output.
|
|
|
|
Both layers are **fully derivable from markdown** — wipe `.index/`
|
|
and the in-process cascade subsystem re-builds everything by scanning the
|
|
user-visible tree (the durable `md_change_state` SQLite queue covers
|
|
crash-recovery replay).
|
|
|
|
## 6. Atomic write semantics
|
|
|
|
`MarkdownWriter` uses a same-directory temp file
|
|
(`.<name>.tmp.<uuid>`) + `os.replace` for atomicity. Keeping the temp
|
|
file in the same directory guarantees `os.replace` is atomic on POSIX
|
|
(the rename is only atomic within a single filesystem).
|
|
|
|
`MarkdownWriter.append_entry` reads → merges frontmatter →
|
|
appends an entry block → atomic write back. The caller passes a full
|
|
`EntryId` (built via `EntryId.next_for(prefix, date, current_count)`);
|
|
this primitive is **schema-agnostic** — field-level semantics
|
|
(`entry_count` / `last_appended_at`) are a business writer's job
|
|
(see `BaseDailyAppender._frontmatter_updates` in
|
|
[`infra/persistence/markdown/writers/base.py`](../src/everos/infra/persistence/markdown/writers/base.py)).
|
|
|
|
## 7. References
|
|
|
|
- Code:
|
|
- [`core/persistence/memory_root.py`](../src/everos/core/persistence/memory_root.py) — memory-root resolution
|
|
- [`core/persistence/markdown/`](../src/everos/core/persistence/markdown/) — schema-agnostic read/write chassis
|
|
- [`infra/persistence/markdown/mds/`](../src/everos/infra/persistence/markdown/mds/) — per-kind frontmatter schemas
|
|
- [`infra/persistence/{markdown,sqlite,lancedb}/`](../src/everos/infra/persistence/) — business-aware adapters
|