chore: initialize EverOS 1.0.0

md-first memory extraction framework for AI agents.

Markdown is the single source of truth; SQLite holds state and LanceDB
provides the rebuildable vector + BM25 + scalar index. The codebase follows
a single-direction DDD layering (entrypoints -> service -> memory -> infra,
with component / core / config cross-cutting) enforced by import-linter.

Engineering surface:
- Coding conventions in .claude/rules/ (path-scoped) and workflows in
  .claude/skills/ (/commit, /new-branch, /pr).
- GitHub Actions CI runs make lint + test + integration; pre-commit mirrors
  the gates locally (ruff, hygiene hooks, gitlint commit-msg).
- Commit messages follow Conventional Commits, enforced by gitlint.
- make lint also enforces datetime two-zone discipline and OpenAPI drift.
This commit is contained in:
Elliot Chen
2026-06-05 22:35:51 +08:00
commit 518b8eca85
636 changed files with 160553 additions and 0 deletions

1049
docs/api.md Normal file

File diff suppressed because it is too large

213
docs/architecture.md Normal file

@ -0,0 +1,213 @@
# Architecture
> Companion: [.claude/rules/architecture.md](../.claude/rules/architecture.md) (auto-loaded coding rules)
## DDD layered architecture
```
┌──────────────────────────────────────────────────────┐
│ entrypoints/ (Presentation) │
│ cli + api │
├──────────────────────────────────────────────────────┤
│ service/ (Application — Use Case orchestration) │
│ memorize / retrieve / evolve / manage │
├──────────────────────────────────────────────────────┤
│ memory/ (Domain — Business core) │
│ models + extract + search + cascade + prompt_slots │
├──────────────────────────────────────────────────────┤
│ infra/persistence (Storage adapters; infra/ may host other adapter types) │
│ markdown + sqlite + lancedb │
└──────────────────────────────────────────────────────┘
Cross-cutting (used by all layers, depends on none):
component/ ← Injectable providers (LLM / Embedding / config / utils)
core/ ← Runtime base (observability / lifespan / context)
config/ ← Configuration data (Settings schema + default.toml)
```
## Dependency direction (single-direction, enforced)
```
entrypoints → service → memory → infra
```
| from → to | Allowed? |
|---|---|
| entrypoints → service | ✅ |
| entrypoints → memory / infra | ❌ (must go through service) |
| service → memory | ✅ |
| memory → infra | ✅ |
| memory → service | ❌ |
| infra → memory | ❌ |
| infra cross-subpackage (e.g. lancedb → markdown within persistence/) | ❌ (use service to orchestrate) |
| any → component / core / config | ✅ (cross-cutting) |
Enforced via `import-linter` in CI:
```toml
[tool.importlinter]
root_packages = ["everos"]
[[tool.importlinter.contracts]]
name = "Layered architecture"
type = "layers"
layers = [
"everos.entrypoints",
"everos.service",
"everos.memory",
"everos.infra",
]
```
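The allow / deny table above can be read as a plain lookup. The following is a minimal Python sketch, not how import-linter itself works: `is_import_allowed` and the `None` result for pairs the table does not mention (for example service → infra) are illustrative assumptions.

```python
from typing import Optional

# Cross-cutting packages: importable from every layer.
CROSS_CUTTING = {"component", "core", "config"}

# Verdicts copied from the table; pairs the table is silent on are left out.
VERDICTS = {
    ("entrypoints", "service"): True,
    ("entrypoints", "memory"): False,  # must go through service
    ("entrypoints", "infra"): False,   # must go through service
    ("service", "memory"): True,
    ("memory", "infra"): True,
    ("memory", "service"): False,
    ("infra", "memory"): False,
}


def is_import_allowed(src: str, dst: str) -> Optional[bool]:
    """Return the table's verdict, or None when the table does not say."""
    if dst in CROSS_CUTTING:
        return True  # any -> component / core / config
    return VERDICTS.get((src, dst))
```

Returning `None` instead of guessing keeps the sketch honest about what the table actually specifies.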
## Storage three-piece set
```
┌────────────────────────────────────────────────────────────────┐
│ md-first storage stack │
└────────────────────────────────────────────────────────────────┘
┌──────────────┐ ┌──────────────┐ ┌─────────────────┐
│ Markdown │ │ SQLite │ │ LanceDB │
│ (truth) │ │ (state) │ │ (index) │
├──────────────┤ ├──────────────┤ ├─────────────────┤
│ entries + │ │ change queue │ │ vector ANN │
│ frontmatter │ │ + state/LSN │ │ BM25 (Tantivy) │
│ Git friendly │ │ buffer / │ │ scalar filter │
│ Obsidian OK │ │ audit │ │ multi-modal │
└──────────────┘ └──────────────┘ └─────────────────┘
│ │ │
▼ ▼ ▼
memory-root/ .index/sqlite/ .index/lancedb/
(truth source) (system data) (rebuildable)
```
## Write path
```
External message
1. service.memorize (entrypoint of write path)
2. memory.extract.pipeline (calls everalgo)
3. infra.persistence.markdown.write (atomic: tmp + fsync + rename)
│ ✅ md write success → return immediately
┌───┴────┐
│ │
▼ ▼
4a. SQLite 4b. memory.cascade (async daemon)
audit watches md → diff entries → LanceDB sync
```
**Key guarantee**: md write is strongly consistent (fsync). LanceDB is eventually consistent. LanceDB unavailability does not block response — changes buffer in the SQLite `md_change_state` queue, replayed on recovery.
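Step 3 (tmp + fsync + rename) is the standard atomic-replace pattern. A minimal sketch of it in Python, assuming the temp file is created next to the target; `atomic_write_text` is a hypothetical name, and the real writer may stage files under `.tmp/` instead:

```python
import contextlib
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Replace `path` so readers see the old or the new content, never a partial file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must be on the same filesystem for the rename to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # force the bytes to disk before the rename
        os.replace(tmp_name, path)  # atomic rename over the destination
    except BaseException:
        # Do not leave a stray temp file behind on failure.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
```

A stricter variant would also fsync the parent directory after the rename so the new directory entry survives a power loss; that step is omitted here for brevity.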
## Read path
```
User query
1. service.retrieve
2. memory.search.hybrid single LanceDB query =
BM25 + vector ANN + scalar filter
3. (optional) read md original markdown for context
Return
```
## Key components
### `memory/extract/`
```
extract/
├── ingest/ Standardized message intake + multi-modal parser dispatch
├── pipeline/ Main extraction pipeline (calls everalgo + dual-track split + writes store)
└── evolution/ Async memory evolution (event/counter/cron triggers)
```
### `memory/cascade/`
Daemon that watches markdown changes and syncs to LanceDB:
- inotify / FSEvents file watcher (cross-platform via `watchdog`)
- 500 ms debounce
- Entry-level diff (added / changed / removed)
- LanceDB single-transaction update (text + vector columns atomic)
- LSN-based crash recovery via the SQLite `md_change_state` queue
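The entry-level diff in the list above can be sketched as a comparison of two snapshots of one markdown file. This assumes entries are keyed by a stable entry id and compared by content hash; `diff_entries` and the hashing choice are illustrative, not the daemon's actual code.

```python
import hashlib


def _digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def diff_entries(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {entry_id: body} snapshots and classify each entry id."""
    old_hashes = {key: _digest(body) for key, body in old.items()}
    new_hashes = {key: _digest(body) for key, body in new.items()}
    return {
        "added": sorted(new_hashes.keys() - old_hashes.keys()),
        "removed": sorted(old_hashes.keys() - new_hashes.keys()),
        "changed": sorted(
            key
            for key in old_hashes.keys() & new_hashes.keys()
            if old_hashes[key] != new_hashes[key]
        ),
    }
```

Only the `added` and `changed` ids would need re-embedding; `removed` ids map to index deletes.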
### `memory/prompt_slots/`
Three-layer prompt overlay:
```
config/prompt_slots/*.yaml (Layer 1: defaults, ships with package)
~/.everos/prompt_slots/*.yaml (Layer 2: app-level override)
runtime override (Layer 3: per-call override)
```
everalgo receives PromptSlot as parameter — no hardcoded prompts in algorithm code.
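The overlay order can be illustrated with a flat key merge in which later layers win. This is a sketch under the assumption that each layer is a simple mapping of slot name to prompt text; the real `PromptSlot` schema and loader are not shown in this document, and `resolve_prompt_slots` is a hypothetical name.

```python
from typing import Mapping, Optional


def resolve_prompt_slots(
    defaults: Mapping[str, str],
    app_override: Optional[Mapping[str, str]] = None,
    runtime_override: Optional[Mapping[str, str]] = None,
) -> dict[str, str]:
    """Later layers win: defaults < app-level override < per-call override."""
    resolved = dict(defaults)
    for layer in (app_override, runtime_override):
        if layer:
            resolved.update(layer)
    return resolved
```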
### `core/observability/`
Three-piece observability:
- `metrics/` — Prometheus counter / gauge / histogram + global registry
- `logging/` — structlog with context processor (trace_id propagation)
- `tracing/` — OpenTelemetry tracer + span helpers
## Markdown layout
```
~/.everos/ # memory root (default; EVEROS_MEMORY__ROOT)
└── <app_id>/<project_id>/ # scope ("default" → default_app/default_project)
├── users/<user_id>/
│ ├── user.md # profile (single-file rewrite)
│ ├── episodes/episode-<YYYY-MM-DD>.md # daily-log append
│ ├── .atomic_facts/atomic_fact-<YYYY-MM-DD>.md # hidden, framework-derived
│ └── .foresights/foresight-<YYYY-MM-DD>.md # hidden, framework-derived
├── agents/<agent_id>/
│ ├── .cases/agent_case-<YYYY-MM-DD>.md # hidden, framework-derived
│ └── skills/skill_<name>/SKILL.md # named-dir
└── knowledge/ # global shared knowledge
```
System-managed entries (`.index/`, `.tmp/`) and `ome.toml` live directly
under the memory root.
Full tree + frontmatter chassis: [storage_layout.md](storage_layout.md) and
[how-memory-works.md](how-memory-works.md). Frontmatter has 4-tier field
protection (L1 read-only / L2 system / L3 business / L4 user).
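As an illustration of the layout, the daily-log path for a user's episodes can be assembled from the scope and a date. `episode_path` is a hypothetical helper; in EverOS the date would presumably come from `today_with_timezone()`, since daily-log buckets follow the display timezone.

```python
from datetime import date
from pathlib import Path


def episode_path(root: Path, app_id: str, project_id: str, user_id: str, day: date) -> Path:
    """Build <root>/<app_id>/<project_id>/users/<user_id>/episodes/episode-<YYYY-MM-DD>.md."""
    filename = f"episode-{day.isoformat()}.md"
    return root / app_id / project_id / "users" / user_id / "episodes" / filename
```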
## everalgo boundary
[`everalgo`](https://github.com/EverMind-AI/EverAlgo) is a separate Python library (published as the `everalgo-*` PyPI packages) holding **only memory extraction algorithms**:
- `everalgo.parser` — multi-modal parsing
- `everalgo.user_memory` — ConvMemCell / Episode / Foresight / AtomicFact / Profile extractors
- `everalgo.agent_memory` — AgentMemCell / Case / Skill extractors
- `everalgo.knowledge` — file-to-knowledge
everalgo is:
- **Stateless** — pure functions, no class hierarchy
- **No I/O** — does not touch md files / LanceDB / SQLite
- **No prompts inline** — receives `PromptSlot` parameter, project supplies defaults
This boundary lets everalgo be reused across product forms (this open-source build, EverOS Cloud, OpenClaw plugins, etc.).
## Further reading
- [docs/overview.md](overview.md) — vision and scope
- [docs/engineering.md](engineering.md) — engineering tooling and CI / CD
- [.claude/rules/architecture.md](../.claude/rules/architecture.md) — short-form rules for Claude Code

271
docs/cascade_runbook.md Normal file

@ -0,0 +1,271 @@
# Cascade Runbook
The cascade daemon keeps LanceDB in sync with the markdown files under
the memory root. Service / entry points only ever write markdown; the
daemon is the **sole** writer of the LanceDB index. This runbook covers
the recurring operational questions.
## What runs where
When `everos server start` boots, the FastAPI lifespan wires four
providers in order:
1. **Metrics** — Prometheus collector.
2. **SQLite** — system DB + schema (`SQLModel.metadata.create_all`).
3. **LanceDB** — async connection + schema verification + FTS indexes.
4. **Cascade** — watcher + scanner + worker, all in-process tasks.
The cascade subsystem itself is three independent loops:
| Loop | Source signal | Effect |
|---|---|---|
| Watcher | `watchdog` filesystem events (sync thread) | `md_change_state.upsert` per registered kind |
| Scanner | Periodic walk (`scan_interval_seconds`, default 30 s) | Same — catches changes the watcher missed |
| Worker | `claim_pending_batch` polling (default 1 s when idle) | Handler dispatch → LanceDB upsert / delete |
Every loop talks to the same `md_change_state` SQLite table. The
worker's claim mode (`pending → processing → done/failed`) keeps
concurrent workers honest.
## Health: `everos cascade status`
```
queue:
pending: 3
done: 1247
failed (retryable=TRUE): 1 (eligible for `cascade fix --apply`)
failed (retryable=FALSE): 1 (fix md and re-save to recover)
lsn:
max: 1252
last_processed: 1250
lag: 2
```
- `lag > 0` means the worker is behind. Steady state should hover near
zero; sustained lag points at a slow handler or a stuck retry.
- `failed (retryable=FALSE)` is always user-actionable. Cascade will
never auto-clear these — they represent malformed md the user must
edit.
## Recovering from failures: `everos cascade fix`
`cascade fix` (no flag) lists every failed row. With `--apply`:
1. `UPDATE md_change_state SET status='pending', retry_count=0 WHERE status='failed' AND retryable=TRUE`.
The partial index `idx_md_change_retryable` keeps the cost proportional
to the number of retryable rows rather than the size of the table.
2. Drain the worker once so the retry runs synchronously.
Retryable failures cover transient embedding / HTTP errors (5xx, 429,
network resets) after the inline `MAX_RETRY=3` was exhausted. The
fix command resets the counter so a working backend gets a clean
start.
`retryable=FALSE` rows require the user to edit the md (typically a
YAML frontmatter issue) and re-save; the watcher picks the change up
naturally.
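The retryable / non-retryable split described above can be sketched as two small predicates. Both function names are hypothetical, and the mapping of "network resets" to `ConnectionResetError` is an assumption made for this illustration.

```python
def is_retryable_http_status(status: int) -> bool:
    """Transient per the runbook: any 5xx, plus 429 (rate limited)."""
    return status == 429 or 500 <= status <= 599


def is_retryable_error(exc: BaseException) -> bool:
    """Treat a network reset as transient; anything else needs a human."""
    return isinstance(exc, ConnectionResetError)
```

A malformed frontmatter error would fall through both checks and end up as `retryable=FALSE`.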
## One-shot replay: `everos cascade sync [PATH]`
Use this when the watcher missed an event (WSL mount, network share,
external editor with no inotify) or when you want a deterministic
flush before, say, a smoke test:
```bash
everos cascade sync # drain everything pending
everos cascade sync users/u1/episodes/X.md # re-enqueue + drain
```
The CLI builds the same `CascadeOrchestrator` as the daemon but only
calls `sync_once` / `drain_once` — no watcher / scanner background
task. So it's safe to run in parallel with a live `everos server`.
## Recovery paths
### LanceDB schema drift on startup
`LanceDBLifespanProvider.startup` calls `verify_business_schemas`. If
an on-disk table has columns the current Pydantic schema does not
declare (or vice versa), the boot fails with:
```
LanceDB table 'episode' schema drift: missing=[...], extra=[...].
The index is rebuildable from md — recover with
`rm -rf ~/.everos/.index/lancedb` and restart.
```
This is the documented recovery: delete the index, restart the
server, the scanner will pick up every md file on its first sweep and
the worker repopulates LanceDB. Markdown is the source of truth, so
no data is lost.
### inotify watch-limit exhaustion (Linux)
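The per-user watch limit has traditionally defaulted to 8192 (recent
kernels may scale it with available memory). On a sizeable memory
root the watcher may silently miss events. Symptoms:
- Scanner catches the file changes but the watcher never logs an
event for the same path.
- `cat /proc/sys/fs/inotify/max_user_watches` reports a limit close to
or below the number of directories being watched (inotify needs one
watch per directory).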
Fix by bumping the kernel parameter:
```bash
echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
```
### WSL2 / network mounts
Filesystem events do not propagate from the Windows host into WSL2
(or across most SMB / NFS shares). The watcher will start without
error and silently see nothing.
Workarounds:
- Rely on the scanner: at the default 30 s interval, changes are
picked up with a bounded delay, so the index stays eventually consistent.
- Drop the scan interval to ~5 s if the memory root is small.
- Run `everos cascade sync` explicitly after batch edits.
### Daemon process crash mid-batch
`claim_pending_batch` flips rows to `processing` *atomically*. If the
process dies before `mark_done` / `mark_failed`, those rows stay in
`processing` until the next boot. **The orchestrator auto-recovers**
on startup: `CascadeOrchestrator.start` calls
`md_change_state_repo.recover_orphan_processing()` before launching
the watcher / scanner / worker, which resets every `processing` row
back to `pending`. Single-process cascade means no race — at boot
time no other worker could legitimately own a `processing` row.
No operator action required; the structured log line
`cascade_recovered_orphan_processing` reports the count when it
fires.
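The claim / recover cycle can be reproduced with the standard-library `sqlite3` module. This is a sketch under stated assumptions: the table below is a reduced stand-in for `md_change_state` (the real one also carries retry and LSN columns), and the function bodies are illustrative even though their names follow the runbook.

```python
import sqlite3

SCHEMA = """
CREATE TABLE md_change_state (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending'
)
"""


def open_queue() -> sqlite3.Connection:
    # isolation_level=None means autocommit, so the explicit BEGIN / COMMIT
    # statements below are the only transaction boundaries.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(SCHEMA)
    return conn


def claim_pending_batch(conn: sqlite3.Connection, limit: int) -> list[int]:
    """Flip up to `limit` pending rows to processing inside one transaction."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM md_change_state WHERE status = 'pending' ORDER BY id LIMIT ?",
            (limit,),
        )]
        conn.executemany(
            "UPDATE md_change_state SET status = 'processing' "
            "WHERE id = ? AND status = 'pending'",
            [(row_id,) for row_id in ids],
        )
        conn.execute("COMMIT")
    except BaseException:
        conn.execute("ROLLBACK")
        raise
    return ids


def recover_orphan_processing(conn: sqlite3.Connection) -> int:
    """Reset rows a crashed run left in processing; returns how many were reset."""
    cursor = conn.execute(
        "UPDATE md_change_state SET status = 'pending' WHERE status = 'processing'"
    )
    return cursor.rowcount
```

Because the select and the update share one write transaction, a second claimer cannot pick up the same rows in between.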
### FD exhaustion (`os error 24` / EMFILE)
Symptoms (any of these on a long-running daemon):
- LanceDB query / index build fails with `lance error: ... Too many
open files (os error 24)`.
- `lsof -p <pid> | wc -l` grows monotonically over hours / days.
- Health log lines like `cascade_lancedb_optimize_failed` /
`cascade_lancedb_rebuild_failed` carrying `OSError: [Errno 24]`.
Cause (verified against `lance crate 4.0`): the LanceDB *index* cache
(`GlobalIndexCache`) holds one reader object per opened FTS / vector
/ scalar index, and each reader pins the file descriptors of its
`_indices/<uuid>/...` files. With a long-running daemon and steady-
state cascade ingest, every `optimize()` call adds new readers; with
LanceDB's own default (`index_cache_size_bytes=None`, unbounded), they
**are never evicted** and the FDs leak monotonically.
`drop_index` does **not** help — it is a manifest-only operation and
leaves the on-disk UUID directories untouched. Even an explicit
`optimize(cleanup_older_than=0)` `unlink()`-ing the files does not
release FDs: POSIX keeps the inode alive as long as a process holds
an open FD on it (the entries show as `(deleted)` in `lsof`). Only an
LRU eviction inside the cache (or a connection close) actually closes
the FDs.
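The POSIX behaviour behind this (an unlinked file stays alive while a descriptor is open) is easy to reproduce. A small demonstration, assuming a POSIX system; it does not touch LanceDB, and `read_after_unlink` is a name made up for this example.

```python
import os
import tempfile


def read_after_unlink(payload: bytes) -> bytes:
    """Write a file, unlink it, then read it back through the still-open FD."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)  # the directory entry is gone ...
        assert not os.path.exists(path)
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, len(payload))  # ... but the inode is still readable
    finally:
        os.close(fd)  # only now can the kernel reclaim the inode
```

This is why deleting index files on disk does not lower the FD count of a running daemon: the descriptor, not the path, keeps the resource alive.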
Fix (already wired in `LanceDBSettings.index_cache_size_bytes` —
default 16 MB, ~290 FD ceiling): see
[Tuning knobs § LanceDB index cache](#lancedb-index-cache-index_cache_size_bytes)
for the sizing table and the env-var override path.
If you have already hit EMFILE in a running process, the cleanest
recovery is a daemon restart — the open connection closes, every FD
is released, and the next start comes up with the capped Session in
place.
## Tuning knobs
### Cascade scheduler knobs
All defaults live in `everos.memory.cascade.orchestrator.CascadeConfig`
and `everos.memory.cascade.worker.CascadeWorker`:
| Knob | Default | Effect |
|---|---|---|
| `scan_interval_seconds` | 30 | Scanner sweep cadence |
| `worker_batch_size` | 50 | Rows claimed per worker cycle |
| `worker_max_retry` | 3 | Inline retries before `mark_failed(retryable=TRUE)` |
| `worker_poll_interval_seconds` | 1 | Idle wait between empty drain attempts |
| `worker_retry_backoff_seconds` | 2 | Backoff seed; the delay doubles per attempt (exponential) |
Tuning surface is intentionally not in `Settings` yet — once we have
wall-clock numbers from real workloads, the values that need
operator override will surface there.
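To make the retry timing concrete, the backoff row is read here as "wait the seed before the first retry, then double on every further attempt". That reading, and the helper name `backoff_delays`, are assumptions; the worker's exact schedule is not shown in this document.

```python
def backoff_delays(seed_seconds: float, max_retry: int) -> list[float]:
    """Delay before each inline retry: the seed, doubled once per attempt."""
    return [seed_seconds * (2 ** attempt) for attempt in range(max_retry)]
```

With the defaults (`worker_retry_backoff_seconds=2`, `worker_max_retry=3`) this gives waits of 2 s, 4 s and 8 s, so a row spends about 14 s in inline retries before it is marked failed.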
### LanceDB index cache (`index_cache_size_bytes`)
Lives in `LanceDBSettings`; overridable via the
`EVEROS_LANCEDB__INDEX_CACHE_SIZE_BYTES` environment variable. This
is the only knob that bounds the steady-state file-descriptor count
of a long-running EverOS daemon — see
[Recovery paths § FD exhaustion](#fd-exhaustion-os-error-24-emfile)
for why nothing else (prune, rebuild, `drop_index`) helps.
Measured cap → FD ceiling (30 add+optimize cycles + 100-query stress
on the real `Episode` schema):
| Cap | FD ceiling | Query latency (p50) | Safe under `ulimit -n` |
|---|---|---|---|
| `2 MB` | ~45 | ~5 ms | macOS default 256 (5× headroom) |
| `4 MB` | ~52 | ~3 ms | macOS default 256 |
| `8 MB` | ~140 | ~2.4 ms | macOS default 256 (1.8× headroom) |
| **`16 MB`** (default) | **~290** | **~2.3 ms** | **Linux default 1024 (3.5× headroom); macOS needs `ulimit -n 1024`** |
| `32 MB` | ~630 | ~1.4 ms | Linux default 1024 (1.6× headroom) |
| `unbounded` | grows forever | ~1.3 ms | NEVER use in a daemon |
EverOS's measured steady-state working set after a `rebuild_indexes`
cycle is roughly **50-100 readers / 3-6 MB resident** (5 tables × ~7
BM25 columns × ~10 `part_N` reader entries each), so the 16 MB default
provides ~3× headroom for burst traffic and stale-but-not-yet-evicted
readers.
When to override:
- **Tight `ulimit -n` environments** (containers; macOS dev boxes
that haven't bumped the default 256) → drop to `4 MB` or `8 MB`.
Per the sizing table, p50 query latency rises by less than 1 ms; correctness is unaffected.
- **Larger working sets** (many more tables or much wider FTS
indexes than the default schema set) → bump to `32-64 MB`. Verify
your platform's `ulimit -n` covers the corresponding FD ceiling
with at least 2× headroom.
- **Diagnostic-only**: set to a tiny value (e.g. `1 MB`) to
*force* LRU thrashing and reproduce cache-miss latency in tests.
Do **not** set `metadata_cache_size_bytes` — it is intentionally left
at LanceDB's default (unbounded) because the metadata cache holds
parsed manifests / fragment stats and has zero effect on FD count;
capping it just thrashes parsing work without solving anything.
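The headroom figures in the sizing table are simply the `ulimit -n` soft limit divided by the measured FD ceiling. A small sketch for checking a candidate cap against a limit; the helper names are hypothetical, and `current_soft_limit` relies on the POSIX-only `resource` module.

```python
def fd_headroom(fd_ceiling: int, soft_limit: int) -> float:
    """How many times the measured FD ceiling fits under the soft limit."""
    return soft_limit / fd_ceiling


def meets_headroom(fd_ceiling: int, soft_limit: int, factor: float = 2.0) -> bool:
    """True when the soft limit leaves at least `factor` times headroom."""
    return fd_headroom(fd_ceiling, soft_limit) >= factor


def current_soft_limit() -> int:
    """Soft RLIMIT_NOFILE of this process (POSIX only)."""
    import resource  # imported lazily: not available on Windows

    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft
```

For example, `meets_headroom(290, current_soft_limit())` checks the 16 MB default against the limit of the running process.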
## Concurrency
The worker is async, not multi-process. Inside one drain cycle,
`asyncio.gather(*[_process_one(row) for row in batch])` runs every
claimed row concurrently — cascade is IO-bound (embedding HTTP calls
dominate wall time) so single-process coroutine concurrency saturates
the bottleneck. The `worker_batch_size` knob (default 50) caps
in-flight rows.
Multi-process workers are a scaling axis we'd reach for only if a
single process becomes CPU-bound, which the current design does not
anticipate. `claim_pending_batch` is already race-safe (the
`WHERE status='pending'` filter ensures each row lands in exactly
one batch even if multiple workers raced), so adding processes later
is a deployment-side change with no schema work.
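A drain cycle of this shape can be sketched with `asyncio.gather`. The names `drain_batch` and `handler` are placeholders, and the use of `return_exceptions=True` to keep one failing row from cancelling its siblings is an assumption of this sketch, not a statement about the worker's code.

```python
import asyncio
from typing import Awaitable, Callable, Sequence


async def drain_batch(
    batch: Sequence[str],
    handler: Callable[[str], Awaitable[None]],
) -> tuple[list[str], list[str]]:
    """Run every claimed row concurrently and split the batch into done / failed."""
    results = await asyncio.gather(
        *(handler(row) for row in batch), return_exceptions=True
    )
    done = [row for row, res in zip(batch, results) if not isinstance(res, BaseException)]
    failed = [row for row, res in zip(batch, results) if isinstance(res, BaseException)]
    return done, failed
```

Since `gather` preserves input order, zipping the results back onto the batch is enough to tell which rows failed.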
## What cascade does NOT do (yet)
- **Schema migration**: LanceDB column changes require deleting the index (`rm -rf ~/.everos/.index/lancedb`) and restarting so it is rebuilt from md.
- **Parent-id back-link**: Episode rows currently carry
`parent_id=None`; the writer doesn't preserve the source memcell id
in the entry inline. Tracked separately.
- **Reference-file change detection (agent_skill)**: edits to
`references/*.md` siblings won't trigger a re-index — only changes
to `SKILL.md` itself fire the watcher. Workaround: run
`everos cascade sync agents/<a>/skills/skill_<n>/SKILL.md` after
editing references.

116
docs/cli.md Normal file

@ -0,0 +1,116 @@
# CLI
The `everos` command-line entry point covers **setup and operations**:
generate a starter `.env` (`init`), run the HTTP API server (`server
start`), and operate the md → LanceDB index queue (`cascade`). Hot-path
business (`/add` `/flush` `/search` `/get`) is the **HTTP API**, not the
CLI.
CLI commands run **in-process** — they call into the `service/` /
infrastructure layers directly rather than the HTTP loopback.
## Installation
The script is exposed via `pyproject.toml`:
```toml
[project.scripts]
everos = "everos.entrypoints.cli.main:app"
```
After `uv sync` (or `pip install -e .`) the `everos` command resolves
to [`src/everos/entrypoints/cli/main.py`](../src/everos/entrypoints/cli/main.py),
a [Typer](https://typer.tiangolo.com/) app.
## Subcommand layout
```
everos
├── init Generate a starter .env from the packaged template
├── server
│ └── start Start the HTTP API server (uvicorn)
└── cascade Inspect / operate the md → LanceDB sync queue
├── status Queue / LSN summary
├── sync Drain the queue now (force md → LanceDB)
└── fix List failed rows / re-enqueue retryable ones
```
Each subcommand lives in its own module under
[`entrypoints/cli/commands/`](../src/everos/entrypoints/cli/commands/) and is
registered in `cli/main.py`. The CLI is intentionally small — hot-path
business (`/add` `/flush` `/search` `/get`) is the **HTTP API**, not the
CLI; the CLI covers setup (`init`), running the server, and index ops
(`cascade`). There is no `reindex` command — rebuild by deleting
`<root>/.index/lancedb` and restarting, or run `everos cascade sync`.
## `everos server start`
Wraps `uvicorn` to launch the FastAPI app from
[`entrypoints/api/app.py`](../src/everos/entrypoints/api/app.py)
in *factory* mode.
```bash
everos server start \
--host 127.0.0.1 \
--port 8000 \
--log-level info \
--env-file .env
```
| Flag | Env var | Default |
|---|---|---|
| `--host` | `EVEROS_API__HOST` | `127.0.0.1` (loopback only; binding `0.0.0.0` logs a warning — EverOS ships no auth) |
| `--port` | `EVEROS_API__PORT` | `8000` |
| `--log-level` | `EVEROS_LOG_LEVEL` | `INFO` |
| `--env-file` | — | searched: `./.env`, then `$XDG_CONFIG_HOME/everos/.env`, then `~/.everos/.env` |
| `--reload` | — | off (use in development) |
Lifespan startup wires the storage backends (SQLite engine + LanceDB
connection) on app boot; see
[`entrypoints/api/lifespans/`](../src/everos/entrypoints/api/lifespans/).
## Configuration via env vars
Both CLI and HTTP server read configuration from `pydantic-settings`:
| Env var | Settings field |
|---|---|
| `EVEROS_MEMORY__ROOT` | `Settings.memory.root` (memory-root path) |
| `EVEROS_MEMORY__TIMEZONE` | `Settings.memory.timezone` (e.g. `Asia/Shanghai`) |
| `EVEROS_SQLITE__BUSY_TIMEOUT_MS` | `Settings.sqlite.busy_timeout_ms` |
| `EVEROS_LANCEDB__READ_CONSISTENCY_SECONDS` | `Settings.lancedb.read_consistency_seconds` |
Pattern: `EVEROS_<SECTION>__<KEY>` (double underscore = nesting). See
[`config/settings.py`](../src/everos/config/settings.py).
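The naming pattern can be illustrated without `pydantic-settings`. The sketch below mimics the double-underscore nesting in plain Python; `nest_env` is a hypothetical helper, and it skips the type coercion and validation the real `Settings` class performs.

```python
from typing import Any, Mapping


def nest_env(environ: Mapping[str, str], prefix: str = "EVEROS_") -> dict[str, Any]:
    """Turn EVEROS_<SECTION>__<KEY> variables into a nested dict of strings."""
    nested: dict[str, Any] = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # unrelated variable
        parts = name[len(prefix):].lower().split("__")
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})  # descend into (or create) the section
        node[parts[-1]] = value
    return nested
```

A variable with a single underscore only, such as `EVEROS_LOG_LEVEL`, stays a top-level key (`log_level`).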
## Logging
`configure_logging` runs at CLI startup and configures `structlog` with
the resolved log level. All in-process logs (CLI command bodies +
service / infra layers) flow through the same handler.
```bash
everos server start --log-level debug # see all sql / lance traffic
```
## API ↔ CLI division of labour
| Responsibility | API | CLI |
|---|---|---|
| Hot-path business (`/add` `/flush` `/search` `/get`) | ✅ | — (HTTP only) |
| Setup (generate `.env`) | — | `everos init` |
| Run the server | — | `everos server start` |
| Index ops (drain / inspect / fix the cascade queue) | — | `everos cascade {status,sync,fix}` |
| Health probe | `GET /health` | (use HTTP) |
| Metrics scrape | `GET /metrics` | (use HTTP) |
The CLI is the **shell-friendly** surface for ops + scripting; the
HTTP API is the **process-friendly** surface for clients (web UIs,
agents, automation).
## See also
- [architecture.md](architecture.md) — DDD layering between
entrypoints / service / memory / infra
- [`entrypoints/cli/main.py`](../src/everos/entrypoints/cli/main.py)
- [`entrypoints/cli/commands/server.py`](../src/everos/entrypoints/cli/commands/server.py)

263
docs/datetime.md Normal file

@ -0,0 +1,263 @@
# Datetime & Timezones
> Audience: contributors. Read this once before touching any code that
> records a moment in time.
## Table of contents
- [The two-zone discipline](#the-two-zone-discipline)
- [Why two zones](#why-two-zones)
- [Helper reference](#helper-reference)
- [Field-type rules](#field-type-rules)
- [End-to-end data flow](#end-to-end-data-flow)
- [Common pitfalls](#common-pitfalls)
- [Testing guidance](#testing-guidance)
## The two-zone discipline
EverOS treats datetimes on **two separate rails**:
| Rail | Where it lives | Helper |
|---|---|---|
| **UTC** (storage) | SQLite, LanceDB, OME events — anything persisted to disk | `get_utc_now`, `ensure_utc`, `UtcDatetime` |
| **Display tz** | Markdown frontmatter, HTTP API responses, daily-log filename buckets, fallback zone for naive caller input | `get_now_with_timezone`, `today_with_timezone`, `to_display_tz` |
The display timezone is set by the `EVEROS_MEMORY__TIMEZONE`
environment variable (or `[memory] timezone` in TOML). The default is `UTC`.
**Inviolable rule**: the display tz must **never** reach storage. If
the user later changes `EVEROS_MEMORY__TIMEZONE`, existing on-disk rows
must still resolve to the same instants.
## Why two zones
### What goes wrong with a single "configured" zone
The naive design — "use one configured timezone everywhere" — has two
failure modes, both subtle:
1. **Configuration drift.** Day 1 the user configures
`EVEROS_MEMORY__TIMEZONE=Asia/Shanghai`. Everything stores
Shanghai-local datetimes. On Day 30 they switch to
`UTC`. SQLite (which strips tz on write and returns naive on read)
silently reinterprets the old Shanghai values as UTC — every old
row jumps eight hours into the future.
2. **Cross-region replication.** If two deployments share storage
but configure different display zones, both interpret the same
naive bytes against their own local zone and diverge by the
offset delta. There is no "true" reading.
UTC-only storage forecloses both: bytes on disk are zone-independent.
### Why not UTC everywhere then?
Users want to read timestamps in their wall-clock zone. Markdown
frontmatter that says `2026-05-29T06:00:00Z` for a meeting that
happened locally at 14:00 is jarring. The display rail solves this
without polluting storage: render UTC bytes through `to_display_tz`
at the boundary.
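The configuration-drift failure can be reproduced with the standard library alone. The sketch below uses a fixed +08:00 offset as a stand-in for `Asia/Shanghai` so that it needs no tz database; both function names are made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Fixed +08:00 offset standing in for Asia/Shanghai.
SHANGHAI = timezone(timedelta(hours=8))


def drift_after_zone_switch(true_instant: datetime) -> timedelta:
    """Store local wall time without a zone, then re-read it as UTC."""
    stored_naive = true_instant.astimezone(SHANGHAI).replace(tzinfo=None)
    reread = stored_naive.replace(tzinfo=timezone.utc)
    return reread - true_instant


def drift_with_utc_storage(true_instant: datetime) -> timedelta:
    """Store UTC wall time without a zone, then re-read it as UTC."""
    stored_naive = true_instant.astimezone(timezone.utc).replace(tzinfo=None)
    reread = stored_naive.replace(tzinfo=timezone.utc)
    return reread - true_instant
```

For a meeting at 14:00 +08:00, the first function reports an eight-hour jump into the future and the second reports none, which is the argument for UTC-only storage.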
## Helper reference
All helpers live in [`everos.component.utils.datetime`](../src/everos/component/utils/datetime.py).
### Storage rail
| Helper | Behaviour |
|---|---|
| `get_utc_now() -> datetime` | Current UTC instant, `tzinfo=UTC`. Independent of any setting. Use as `default_factory` on any storage field. |
| `ensure_utc(d) -> datetime` | Naive → attach display tz → convert to UTC. Aware → `astimezone(UTC)`. Use at the storage boundary if you receive a datetime you didn't construct. |
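| `UtcDatetime` | `Annotated[datetime, AfterValidator(ensure_utc)]`. Apply to any SQLite field. Pydantic runs the validator when a row object is constructed; rows read back through the ORM bypass it and are covered by the `load` listener on `BaseTable` (see Field-type rules). |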
### Display rail
| Helper | Behaviour |
|---|---|
| `get_now_with_timezone() -> datetime` | Current instant in the configured display tz. `.isoformat()` produces e.g. `2026-05-29T14:00:00+08:00`. |
| `today_with_timezone() -> date` | Today's date in the display tz. Use for daily-log filename buckets. |
| `to_display_tz(d) -> datetime` | Convert any datetime to the display tz. Naive input is treated as already display-tz local. |
### Parsing & rendering
| Helper | Behaviour |
|---|---|
| `from_iso_format(value)` | Parse an ISO string / datetime / epoch. Naive input attaches **display tz** (the "if you didn't say a zone, assume your zone" rule). |
| `from_timestamp(ts)` | Parse epoch seconds / milliseconds (auto-detects). Returns display-tz aware. |
| `to_iso_format(d)` | `.isoformat()` after light validation. |
| `to_timestamp_ms(d)` | Milliseconds epoch (`int`). |
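The boundary semantics in these tables can be sketched in a few lines. This is an illustration of the documented behaviour, not the code in `everos.component.utils.datetime`: the real helpers read the display tz from settings, whereas the sketch takes it as an explicit `display_tz` parameter.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def ensure_utc(value: datetime, display_tz: tzinfo) -> datetime:
    """Naive input is read as display-tz local; aware input is converted."""
    if value.tzinfo is None:
        value = value.replace(tzinfo=display_tz)
    return value.astimezone(timezone.utc)


def to_display_tz(value: datetime, display_tz: tzinfo) -> datetime:
    """Naive input is treated as already display-tz local."""
    if value.tzinfo is None:
        return value.replace(tzinfo=display_tz)
    return value.astimezone(display_tz)
```

Round-tripping a naive 14:00 through `ensure_utc` and back through `to_display_tz` yields the same wall-clock time with an explicit offset attached.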
## Field-type rules
### SQLite tables
```python
from everos.component.utils.datetime import UtcDatetime, get_utc_now
from everos.core.persistence.sqlite import BaseTable, Field


class MyRow(BaseTable, table=True):
    # Stored as UTC: the validator normalises whatever the caller passes in.
    happened_at: UtcDatetime = Field(default_factory=get_utc_now)
```
Why `UtcDatetime` and not plain `datetime`? SQLAlchemy silently strips
tz on SQLite writes. `UtcDatetime`'s `AfterValidator` runs on
**construction** to make sure whatever the caller hands in gets
normalised to UTC before persistence.
SQLModel's ORM hydrate path (rows from `select(...)`) **bypasses**
the Pydantic validator — SQLAlchemy assigns column values straight
to instance attributes. To close that gap,
[core/persistence/sqlite/base.py](../src/everos/core/persistence/sqlite/base.py)
registers a SQLAlchemy `load` event listener that re-attaches
`tzinfo=UTC` to every `UtcDatetime` column after hydrate. Net effect:
**callers never see a naive datetime from a SQLite repo**, whatever
the code path.
`BaseTable.created_at` / `updated_at` already use `UtcDatetime` and
`get_utc_now` — any subclass inherits both the construction-time
validator **and** the load-time hook for free.
### LanceDB tables — zero configuration
```python
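import datetime as _dt
from everos.core.persistence.lancedb.base import BaseLanceTable  # module per core/persistence/lancedb/base.py

class MyLanceRow(BaseLanceTable):
    ts: _dt.datetime  # automatically tz=UTC in the Arrow schema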
```
LanceDB's Pydantic → PyArrow converter does not understand
`typing.Annotated` metadata; using `UtcDatetime` as the annotation
would raise `TypeError: Converting Pydantic type to Arrow Type`.
Instead, `BaseLanceTable.to_arrow_schema()` walks the inferred schema
and rewrites **every** naive `timestamp[us]` column to
`timestamp[us, tz=UTC]`. PyArrow then:
* **on write** — `astimezone(UTC)` any aware input automatically.
* **on read** — returns aware UTC datetimes (not naive).
No caller-side coercion needed, no per-table declaration. The
response shapers only run `to_display_tz(...)` to convert UTC to the
configured display zone.
If a future schema genuinely needs a naive datetime column (project
convention says storage is always UTC, so this would be unusual),
override `to_arrow_schema` on that subclass and skip the patch for
that one column.
### OME events / in-memory state
OME events are persisted-adjacent (the `run_record` / `counter` stores
serialise them). Use `get_utc_now()` for any `default_factory` on the
event payload.
## Two centralised defenses
| Backend | Defense | Where |
|---|---|---|
| **SQLite** | SQLAlchemy `load` event listener on `BaseTable` re-attaches `tzinfo=UTC` after every ORM hydrate | [core/persistence/sqlite/base.py](../src/everos/core/persistence/sqlite/base.py) |
| **LanceDB** | `BaseLanceTable.to_arrow_schema()` rewrites every naive `timestamp[us]` column to `timestamp[us, tz=UTC]`; PyArrow handles UTC end-to-end | [core/persistence/lancedb/base.py](../src/everos/core/persistence/lancedb/base.py) |
| **CI gate** | `scripts/check_datetime_discipline.py` fails the build on any code that bypasses `component/utils/datetime` | wired into `make lint` |
These defenses replace what used to be an "every consumer must call
`ensure_utc()`" shotgun discipline. With the two backend defenses in
place, callers never observe a naive datetime from either backend, and
the CI gate keeps new code from bypassing the helpers.
## End-to-end data flow
```
User input (any zone)
from_iso_format ← naive → attach display tz
ensure_utc ← storage boundary: → UTC
┌────────────────┬────────────────┐
│ SQLite │ LanceDB │
│ (UtcDatetime │ (Arrow │
│ re-attaches │ stripped to │
│ UTC on read) │ UTC bytes) │
└────────────────┴────────────────┘
from_iso_format ← read path normalises naive → display tz
to_display_tz ← response boundary: → display tz
Pydantic .isoformat() → "2026-05-29T14:00:00+08:00"
HTTP API response / markdown frontmatter
```
The storage boundary and response boundary are the two points where
the zone discipline is enforced. Everything in between just passes
datetimes through.
## Common pitfalls
> [!WARNING]
> **`datetime.now()` without `tz=`.** Forbidden. Always use
> `get_utc_now()` (storage) or `get_now_with_timezone()` (display).
> Documented in `.claude/rules/datetime-handling.md` and enforced in CI by `scripts/check_datetime_discipline.py`.
> [!WARNING]
> **Calling `astimezone()` on a value just read from SQLite.** If the
> field isn't typed `UtcDatetime`, SQLite returns naive — and
> `astimezone()` on a naive datetime silently interprets it as
> **local process time**, not UTC. Always use `UtcDatetime` on SQLite
> fields.
> [!WARNING]
> **Storing `get_now_with_timezone()` directly.** That returns
> display-tz time. If the display tz later changes, your stored values
> are stranded. Use `get_utc_now()` for any persisted field.
> [!NOTE]
> **Migrating existing rows.** The two-zone discipline (Q2) was rolled out on a clean codebase
> with no production data. If you operate an instance where SQLite
> values were written with display-tz-aware values (pre-Q2), you must
> either drop the database or write a one-time migration that
> reinterprets each row's naive value against the old display tz
> before re-writing as UTC. The project does not ship such a
> migration.
## Testing guidance
For unit tests that depend on display-tz behaviour, both caches
(`load_settings` and `_display_tz`) must be cleared:
```python
import pytest

from everos.component.utils import datetime as dt_module
from everos.config import load_settings


@pytest.fixture(autouse=True)
def _isolate_tz(monkeypatch: pytest.MonkeyPatch) -> None:
    monkeypatch.delenv("EVEROS_MEMORY__TIMEZONE", raising=False)
    load_settings.cache_clear()
    dt_module._display_tz.cache_clear()
```
The autouse fixture in [tests/conftest.py](../tests/conftest.py) does
exactly this — it runs for every test by default. If you write a
locally-scoped test that needs a non-default zone, monkeypatch the env
var **and** clear both caches:
```python
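import pytest
from everos.component.utils import datetime as dt_module
from everos.config import load_settings


def test_my_thing(monkeypatch: pytest.MonkeyPatch) -> None:
    monkeypatch.setenv("EVEROS_MEMORY__TIMEZONE", "Asia/Shanghai")
    load_settings.cache_clear()  # drop settings cached from the old env
    dt_module._display_tz.cache_clear()  # drop the cached display tz
    ...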
```
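Why both caches matter can be seen in a stripped-down model. This sketch assumes, as the `cache_clear()` calls suggest, that the settings loader and the display-tz helper are each memoised with something like `functools.lru_cache`; `load_tz_name`, `display_tz_name`, and `DEMO_TIMEZONE` are made-up names for illustration only.

```python
import os
from functools import lru_cache


@lru_cache(maxsize=1)
def load_tz_name() -> str:
    """Stand-in for a cached settings loader that reads the environment once."""
    return os.environ.get("DEMO_TIMEZONE", "UTC")


@lru_cache(maxsize=1)
def display_tz_name() -> str:
    """Stand-in for a cached display-tz helper built on top of the settings."""
    return load_tz_name()


os.environ.pop("DEMO_TIMEZONE", None)
first = display_tz_name()  # both caches are now warm

os.environ["DEMO_TIMEZONE"] = "Asia/Shanghai"
stale = display_tz_name()  # the env change is invisible to the cached value

load_tz_name.cache_clear()
half_cleared = display_tz_name()  # the outer cache still holds the old value

display_tz_name.cache_clear()
fresh = display_tz_name()  # only now is the new zone picked up

os.environ.pop("DEMO_TIMEZONE", None)  # leave the environment as we found it
print(first, stale, half_cleared, fresh)  # UTC UTC UTC Asia/Shanghai
```

Clearing only one of the two layers leaves the other serving the old zone, which is the failure mode the fixture guards against.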
The full invariant set is covered in
[tests/unit/test_component/test_utils/test_datetime.py](../tests/unit/test_component/test_utils/test_datetime.py)
under the "Q2 two-zone discipline invariants" section. If you change
the storage / display contract, those tests are the first line of
defense — update them in lockstep.

553
docs/engineering.md Normal file

@ -0,0 +1,553 @@
# Engineering & Dev-Efficiency Infrastructure
> Companions: business architecture lives in [architecture.md](architecture.md);
> hard coding constraints live in [../.claude/rules/](../.claude/rules/).
> This document covers the surrounding tooling, configuration, and processes
> — what we adopted, what role each piece plays, and how they fit together.
> CI runs on GitHub Actions; all checks are invoked through the `Makefile`.
---
## 1. Scope
Engineering / dev-efficiency infrastructure does not solve business problems —
it solves **team + code + time** problems:
```
┌──────────────────────────────────────────────────────────┐
│ │
│ Business architecture (docs/architecture.md) │
│ — answers "how to build the system" │
│ │
│ Engineering rules (.claude/rules/) │
│ — answers "how to write the code" │
│ │
│ Engineering / dev-efficiency infrastructure (this doc) │
│ — answers "how the team collaborates, │
│ how code is auto-checked, │
│ how releases are automated, │
│ how tools land in the project" │
│ │
└──────────────────────────────────────────────────────────┘
```
Reasons this is documented separately:
- **Cross-project reusable** — `CLAUDE.md` / rules / `pyproject.toml` are
patterns, not content. The next project can adopt them as-is.
- **Decoupled from business** — business architecture changes do not affect
these; upgrading these does not affect business.
- **Onboarding-oriented** — new contributors read this first to understand
what the tooling looks like.
---
## 2. Infrastructure overview
```
┌─────────────────────────────────────────────────────────────────────┐
│ Team collaboration / Code quality / CI/CD │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─ Claude Code engineering layer ────────────────────────────┐ │
│ │ │ │
│ │ CLAUDE.md ← team-shared context (auto loaded into │ │
│ │ system prompt) │ │
│ │ .claude/ │ │
│ │ ├── CLAUDE.md subdir context (optional) │ │
│ │ ├── rules/ (10) path-scoped hard coding rules │ │
│ │ ├── skills/ (3) slash command workflows │ │
│ │ └── settings.json permissions allowlist │ │
│ │ │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─ Code quality gates ───────────────────────────────────────┐ │
│ │ │ │
│ │ pre-commit runs locally before commit │ │
│ │ ├ ruff (lint+fmt) │ │
│ │ ├ trailing-whitespace / end-of-file-fixer │ │
│ │ ├ check-yaml / check-toml │ │
│ │ ├ check-added-large-files (≥1MB warn) │ │
│ │ ├ detect-private-key │ │
│ │ └ gitlint (commit-msg stage) │ │
│ │ │ │
│ │ ruff lint + format │ │
│ │ (replaces black / isort / flake8) │ │
│ │ import-linter DDD layer-direction enforcement │ │
│ │ pytest unit / integration │ │
│ │ │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─ Dependencies & build ─────────────────────────────────────┐ │
│ │ │ │
│ │ uv sole package manager │ │
│ │ (no `pip install`) │ │
│ │ pyproject.toml src layout + extras + groups │ │
│ │ uv.lock checked in; CI uses --frozen │ │
│ │ hatchling wheel build backend │ │
│ │ Makefile unified entry; CI calls it │ │
│ │ src/everos/templates/env.template │ │
│ │ environment variable template │ │
│ │ │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─ CI/CD (GitHub Actions) ───────────────────────────────────┐ │
│ │ │ │
│ │ CI: .github/workflows/ci.yml lint / test / integ │ │
│ │ Docs: .github/workflows/docs.yml Markdown link check │ │
│ │ Both invoke Makefile targets; the Makefile is the │ │
│ │ single source of truth for commands. │ │
│ │ │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─ Collaboration workflow ───────────────────────────────────┐ │
│ │ │ │
│ │ Branch model: dev / master (GitFlow Lite) │ │
│ │ PR template: .github/PULL_REQUEST_TEMPLATE.md │ │
│ │ ISSUE_TEMPLATE: bug / feature / use-case / docs / config │ │
│ │ CONTRIBUTING.md: contributor onboarding │ │
│ │ │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────┘
```
---
## 3. Claude Code engineering layer
### 3.1 Loading mechanism
Claude Code automatically loads the following into the system prompt at
session start (no manual import):
```
┌────────────────────────┬──────────────────────────────────────────┐
│ File                   │ Purpose                                  │
├────────────────────────┼──────────────────────────────────────────┤
│ CLAUDE.md (repo root)  │ Team-shared context: architecture        │
│                        │ overview, commands, convention index     │
│ .claude/rules/*.md     │ Hard coding constraints                  │
│                        │ (path-scoped on-demand load)             │
│ .claude/settings.json  │ Permissions allowlist (not in prompt)    │
│ ~/.claude/CLAUDE.md    │ User-level (personal preferences)        │
│ CLAUDE.local.md        │ Project-local personal (gitignored)      │
└────────────────────────┴──────────────────────────────────────────┘
```
### 3.2 Rules (10 files, path-scoped)
| File | Paths (auto-load condition) |
|---|---|
| architecture.md | always loaded (no paths) |
| code-style.md | always loaded (no paths) |
| language-policy.md | always loaded (no paths) |
| imports.md | `src/**/*.py`, `tests/**/*.py` |
| init-py-and-reexport.md | `src/**/__init__.py`, `src/**/*.py` |
| module-docstring.md | `src/{infra,memory,service,component,core}/**/*.py` |
| async-programming.md | `src/**/*.py`, `tests/**/*.py` |
| datetime-handling.md | `src/**/*.py`, `tests/**/*.py` |
| logging-observability.md | `src/**/*.py` |
| testing.md | `tests/**/*.py` |
**Why path-scoped**: loading all 1000+ lines of rules in every session
would cost ~5–8K tokens. At startup only architecture + code-style +
language-policy load (~1.5–2K tokens); the rest load on demand when
Claude Code reads a matching `.py` file.
### 3.3 Skills (3 slash commands)
| Command | Purpose | When to use |
|---|---|---|
| `/commit` | Generate a Conventional Commits message | After a focused change, ready to commit |
| `/new-branch` | Create branch under dev/master strategy | Starting a new feat / fix / hotfix |
| `/pr` | Open a GitHub PR with the repo template | Ready to merge |
Skills and rules use **independent loading mechanisms**: rules auto-load
into the system prompt, whereas skills trigger only when the user types `/<name>`.
### 3.4 settings.json
```json
{
  "permissions": {
    "allow": ["Bash(uv sync*)", "Bash(make*)", "Bash(uv run pytest*)", ...]
  }
}
```
**Purpose**: reduce permission prompts. Team-shared config goes into
`settings.json` (in git); personal preferences go into `settings.local.json`
(gitignored).
---
## 4. Code quality gates
```
┌──────────────────────────────────────────────────────┐
│ Each stage can independently fail the change │
└──────────────────────────────────────────────────────┘
[Local editor]

Stage 1: editor real-time feedback
  ├ ruff (lint + format) on save
  └ path-relevant .claude/rules guide Claude Code

Stage 2: pre-commit (triggered by `git commit`)
  ├ ruff fix + format
  ├ trailing-whitespace, end-of-file-fixer
  ├ check-yaml, check-toml
  ├ check-added-large-files (≥1MB)
  ├ detect-private-key
  └ gitlint (commit-msg stage; rejects malformed messages)

Stage 3: local `make ci` (manual, before push)
  ├ make lint         (ruff check + ruff format --check + import-linter
  │                    + datetime discipline + openapi drift)
  ├ make test         (pytest tests/unit)
  └ make integration  (pytest tests/integration)

Stage 4: CI (GitHub Actions, push + PR triggered)
  └ re-runs the same `make lint / test / integration` targets

Stage 5: PR review
  ├ ≥ 1 approval
  └ all threads resolved + all CI green
```
**Key design**: if any stage fails, the change is **never merged**. By
policy there is no `--no-verify` / `--allow-failure` escape hatch.
---
## 5. Dependencies & build
### 5.1 pyproject.toml overview
```toml
[project]
name = "everos"
requires-python = ">=3.12"
dependencies = [...] # runtime deps (minimal set)
[project.optional-dependencies]
multimodal = [...] # extras (install on demand)
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[tool.hatch.build.targets.wheel]
packages = ["src/everos"] # src layout
[project.scripts]
everos = "everos.entrypoints.cli.main:app" # exposes CLI command
[tool.ruff] # code style
[tool.pytest.ini_options] # tests
[tool.coverage.run] # coverage config (gate lives in `make cov`)
[tool.importlinter] # dependency direction
[dependency-groups]
dev = ["ruff", "pytest", "pytest-asyncio", "pytest-cov",
"import-linter", "pre-commit", "ipdb"]
```
**Single-file principle**: configuration that used to live in `pylintrc`,
`pytest.ini`, `.isort.cfg` is **all consolidated into `pyproject.toml`**.
### 5.2 Makefile commands
```
make help          list all targets
make install       uv sync --frozen
make format        ruff fix + format
make lint          ruff + import-linter + datetime discipline + openapi drift
make test          pytest tests/unit
make integration   pytest tests/integration
make cov           pytest unit + integration, coverage gate (fail under 80%)
make ci            lint + test + integration   ← CI invokes these targets
make clean         clear caches
```
**Single source of truth**: CI only invokes `make <target>`, so local and CI
run identical commands and cannot drift.
### 5.3 env.template (slimmed down)
The template lives at `src/everos/templates/env.template` (bundled
inside the wheel as package data, copied to `./.env` via `everos init`).
It groups settings by provider, each block sharing the OpenAI-protocol
`MODEL` / `API_KEY` / `BASE_URL` triple:
```
EVEROS_LLM__* # text model (model / api_key / base_url)
EVEROS_MULTIMODAL__* # vision model for image/office inputs
EVEROS_EMBEDDING__* # embedding model (vector index)
EVEROS_RERANK__* # cross-encoder reranker
EVEROS_MEMORY__ROOT # memory-root (md files + .index/{sqlite,lancedb}/)
EVEROS_LOG_LEVEL # DEBUG | INFO | WARNING | ERROR
EVEROS_LOG_FORMAT # json | text
TZ # display timezone (storage is always UTC)
```
Every key has a sensible default except the `API_KEY` fields, which you fill in.
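The double underscore in names such as `EVEROS_LLM__MODEL` reads as a section/key separator. The sketch below shows one way such variables could be folded into a nested mapping. It is an illustration built on that assumption, with a hypothetical `nest_env` helper and made-up values, and does not reproduce how EverOS actually loads its settings.

```python
def nest_env(environ: dict[str, str], prefix: str = "EVEROS_") -> dict:
    """Group PREFIX_SECTION__KEY variables into {"section": {"key": value}}."""
    settings: dict = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # unprefixed variables such as TZ are skipped by this helper
        section, sep, key = name[len(prefix):].lower().partition("__")
        if sep:
            settings.setdefault(section, {})[key] = value
        else:
            settings[section] = value  # flat key, e.g. EVEROS_LOG_LEVEL
    return settings


example = {
    "EVEROS_LLM__MODEL": "my-model",
    "EVEROS_LLM__BASE_URL": "https://example.invalid/v1",
    "EVEROS_MEMORY__ROOT": "~/.everos",
    "EVEROS_LOG_LEVEL": "INFO",
    "TZ": "Asia/Shanghai",
}
print(nest_env(example))
```

Grouping by section keeps the `MODEL` / `API_KEY` / `BASE_URL` triple of each provider together, which mirrors how the template is laid out.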
---
## 6. CI/CD (GitHub Actions)
### 6.1 Strategy
```
┌────────────────────────────────────────────────────────────┐
│                                                            │
│  GitHub Actions (.github/workflows/)                       │
│    ci.yml    push (main/dev/master) + PR                   │
│      ├ make install      (uv sync --frozen)                │
│      ├ make lint         (ruff + import-linter +           │
│      │                    datetime + openapi drift)        │
│      ├ make test         (pytest tests/unit)               │
│      └ make integration  (pytest tests/integration)        │
│    docs.yml  Markdown link check + issue-template YAML     │
│                                                            │
│  Consistency:                                              │
│    ├ astral-sh/setup-uv (cache keyed by uv.lock)           │
│    ├ Makefile is the single source of CI commands          │
│    └ pre-commit runs locally first to reduce CI churn      │
│                                                            │
└────────────────────────────────────────────────────────────┘
```
### 6.2 CI checklist
| Check | Tool | Failure condition |
|---|---|---|
| Lint | `make lint` (ruff check + ruff format --check) | any error |
| Layer direction | `make lint` (lint-imports inside) | layer violation |
| Datetime discipline | `make lint` (check_datetime_discipline.py) | bypasses helper module |
| OpenAPI drift | `make lint` (dump_openapi.py --check) | schema ≠ committed openapi.json |
| Unit | `make test` (pytest tests/unit) | any failure |
| Integration | `make integration` (pytest tests/integration) | any failure |
Integration tests run with a `FakeLLMClient` — no live credentials are needed in CI.
Commit message format is enforced **locally** via `gitlint` in the `commit-msg`
pre-commit stage; it does not run in CI.
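For a sense of what a datetime-discipline check can look like, the following sketch scans source text for `now()` / `utcnow()` calls that carry no timezone argument. It is a hypothetical and deliberately coarse stand-in written for this document; the project's `check_datetime_discipline.py` may apply different and stricter rules.

```python
import ast


def find_naive_now_calls(source: str) -> list[int]:
    """Return line numbers of `.now()` calls without arguments and of `.utcnow()` calls."""
    offenders: list[int] = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Attribute):
            continue
        if node.func.attr == "utcnow":
            offenders.append(node.lineno)  # utcnow() always returns a naive value
        elif node.func.attr == "now" and not node.args and not node.keywords:
            offenders.append(node.lineno)  # now() with no tz argument
    return sorted(offenders)


SAMPLE = """\
from datetime import datetime, timezone
a = datetime.now()
b = datetime.now(tz=timezone.utc)
c = datetime.now(timezone.utc)
d = datetime.utcnow()
"""
print(find_naive_now_calls(SAMPLE))  # [2, 5]
```

Because it matches on attribute names alone, this version would also flag an unrelated `foo.now()`; a real gate would need to resolve what the attribute belongs to.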
### 6.3 Branch protection
| Branch | Rule |
|---|---|
| **master** | branch protection: PR + 1 review + green CI; no direct push |
| **dev** | same as above |
| feat / fix / hotfix | free push; rebase onto the parent branch before merging |
---
## 7. Collaboration workflow
### 7.1 Branch model (GitFlow Lite)
```
                     v0.1                              v0.2                  v1.0
                      ▲                                 ▲                     ▲
                      │ release PR                      │ release PR          │ release PR
                      │ (dev→master+tag)                │ (dev→master+tag)    │ (dev→master+tag)
master ●──────────────●───────────────●─────────────────●─────────────────────●────► stable / released
                      │               ▲                 │                     │
                      │               │ merge hotfix    │                     │
                      │               │                 │                     │
                      │         ●──●──┘                 │                     │
                      │         │ hotfix branch         │                     │
                      │         │ (cut from master)     │                     │
                      │         │                       │                     │
                      │         ▼ sync to dev           │                     │
                      │                                 │                     │
dev    ●──●──●──●──●──●──●──●─●─●──●──●──●──●──●──●──●──●──●──●──●──●──●──●─●─●──●──●─────► integration
             ▲        ↑                                 ↑                     ↑
             │    release point                    release point         release point
           feat/A (dev HEAD →                      (dev HEAD →           (dev HEAD →
          ●──●──● master + v0.1)                   master + v0.2)        master + v1.0)

feat/*   : cut from dev → PR → merge into dev
hotfix/* : cut from master → merge into master + sync into dev (double merge)
release  : dev → master + tag on master (no separate release branch)

Vertical │ in the diagram = "dev HEAD merged into master via release PR + v0.x tag"
```
Details in [../.claude/skills/new-branch/SKILL.md](../.claude/skills/new-branch/SKILL.md).
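The legend can also be written down as data. The helper below is a hypothetical summary of those rules (it is not the `/new-branch` skill itself), mapping a branch name to the branch it is cut from and the branches it merges into. Treating `fix/*` like `feat/*` is an assumption; the legend only spells out `feat/*` and `hotfix/*`.

```python
def branch_plan(branch: str) -> tuple[str, list[str]]:
    """Return (base branch, merge targets) for a working branch name."""
    prefix = branch.split("/", 1)[0]
    if prefix == "hotfix":
        # Cut from master, merged into master, then synced into dev.
        return "master", ["master", "dev"]
    if prefix in {"feat", "fix"}:
        # Regular work is cut from dev and merged back through a PR.
        # Assumption: fix/* follows the same path as feat/*.
        return "dev", ["dev"]
    raise ValueError(f"unrecognised branch prefix: {prefix!r}")


print(branch_plan("feat/A"))           # ('dev', ['dev'])
print(branch_plan("hotfix/null-tz"))   # ('master', ['master', 'dev'])
```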
### 7.2 PR template
A single PR template at [`.github/PULL_REQUEST_TEMPLATE.md`](../.github/PULL_REQUEST_TEMPLATE.md)
with five sections: **Summary / Area / Verification / Checklist / Notes for
Reviewers**. The `/pr` skill fills it in (see
[../.claude/skills/pr/SKILL.md](../.claude/skills/pr/SKILL.md)).
### 7.3 Commit convention (Conventional Commits)
Format: `<type>[(scope)][!]: <description>` per
[Conventional Commits](https://www.conventionalcommits.org).
```
feat: new feature
fix: bug fix
refactor: restructuring (no behavior change)
test: add / update tests
docs: documentation
style: formatting
perf: performance optimization
chore: configuration / build / tooling
build: build system or dependencies
ci: CI configuration
revert: revert a previous commit
```
`gitlint` enforces the format **locally** via its `contrib-title-conventional-commits`
rule in the commit-msg pre-commit stage. See
[../.claude/skills/commit/SKILL.md](../.claude/skills/commit/SKILL.md).
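The title format above can be approximated with a regular expression. This is a sketch for illustration, not gitlint's implementation: `TITLE_RE` and `is_conventional` are names invented here, and the accepted types are simply the eleven listed in this section.

```python
import re

TYPES = ("feat", "fix", "refactor", "test", "docs", "style",
         "perf", "chore", "build", "ci", "revert")

# <type>[(scope)][!]: <description>
TITLE_RE = re.compile(
    rf"^(?P<type>{'|'.join(TYPES)})"   # one of the listed types
    r"(?:\((?P<scope>[^()\s]+)\))?"    # optional (scope)
    r"(?P<breaking>!)?"                # optional breaking-change marker
    r": (?P<description>\S.*)$"        # colon, space, non-empty description
)


def is_conventional(title: str) -> bool:
    """True when the commit title follows the type/scope/description shape."""
    return TITLE_RE.match(title) is not None


print(is_conventional("chore: initialize EverOS 1.0.0"))   # True
print(is_conventional("feat(api)!: add flush endpoint"))   # True
print(is_conventional("update stuff"))                      # False
```

Only the first line of a commit message is checked here; body and footer conventions are out of scope for this sketch.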
---
## 8. Issue templates / user support
```
.github/ISSUE_TEMPLATE/
├── bug_report.yml structured bug report (form)
├── feature_request.yml feature proposal (form)
├── use_case.yml share a use case / integration
├── docs.yml documentation issue
└── config.yml disable blank issues + community links
CONTRIBUTING.md contributor onboarding: setup / code style /
branch / commit / PR / testing
```
---
## 9. Infrastructure summary table
```
┌─────────────────────┬──────────────────────────────────────┬─────────────┐
│ Facility │ Location / file │ Failure │
│ │ │ impact │
├─────────────────────┼──────────────────────────────────────┼─────────────┤
│ CLAUDE.md │ /CLAUDE.md │ cc loses │
│ │ │ context │
│ Team rules │ /.claude/rules/ (10) │ cc unaware │
│ │ │ of conv. │
│ Team skills │ /.claude/skills/ (3) │ no slash │
│ │ │ workflows │
│ Permissions │ /.claude/settings.json │ cc prompts │
│ │ │ on each op │
├─────────────────────┼──────────────────────────────────────┼─────────────┤
│ pyproject │ /pyproject.toml │ build fail │
│ Lock file │ /uv.lock │ dep drift │
│ Makefile │ /Makefile │ no unified │
│ │ │ entry │
│ pre-commit │ /.pre-commit-config.yaml │ no local │
│ │ │ gate │
│ env template │ /src/everos/templates/env.template │ newcomers │
│ │ │ lost on env│
├─────────────────────┼──────────────────────────────────────┼─────────────┤
│ CI │ /.github/workflows/ci.yml │ PR cannot │
│ │ │ merge │
│ Docs CI │ /.github/workflows/docs.yml │ broken │
│ │ │ doc links │
│ PR template │ /.github/PULL_REQUEST_TEMPLATE.md │ no PR temp │
│ Issue templates │ /.github/ISSUE_TEMPLATE/ (5) │ scattered │
│ CONTRIBUTING │ /CONTRIBUTING.md │ contrib. │
│ │ │ confused │
└─────────────────────┴──────────────────────────────────────┴─────────────┘
```
---
## 10. Future extensions
```
Near-term
□ /new-module skill: scaffold a subpackage that complies with rules
□ ruff rule sets: add D (docstring), ANN (annotations)
□ Static type checking (pyright or mypy) once hot paths stabilize
Mid-term
□ release-please / Conventional Commits → automated changelog
□ Automated PyPI wheel upload on tag
□ Multi-Python version matrix (3.12 / 3.13)
□ Performance benchmark CI with historical comparison
Long-term
□ Mutation testing (mutmut)
□ Coverage ratchet (raise the 80% gate as the suite matures)
```
---
## 11. On investing in engineering infrastructure
```
┌──────────────────────────────────────────────────────────┐
│ │
│ Plain business code ≠ an engineering project │
│ │
│ Engineering project = business code + │
│ coding rules + │
│ quality gates (pre-commit + CI) + │
│ automation (Makefile + skills) + │
│ collaboration (branch + PR) + │
│ knowledge base (CLAUDE.md + │
│ rules + docs) │
│ │
│ The earlier this infrastructure lands, the faster and │
│ farther the team can run. │
│ │
└──────────────────────────────────────────────────────────┘
```
Old project vs. new project after this rewrite:
| Dimension | Old project | New project |
|---|---|---|
| Lint tools | black + isort + pylint | ruff (single tool) |
| Config files | pyproject + pylintrc + pyrightconfig + pytest.ini | unified pyproject.toml |
| pre-commit | basic | adds gitlint commit-msg + import / yaml / private-key checks |
| Layer direction | not enforced | import-linter enforced in CI |
| Commit format | freeform | gitlint pre-commit hook (Conventional Commits) |
| Claude Code integration | partial rules | rules + skills + settings (full) |
| CI platform | ad hoc | GitHub Actions calling Makefile targets |
| Tests | basic | unit + integration + e2e + coverage report |
These are not perfectionism — they are baseline requirements for
**multi-person collaboration, long-term maintenance, and sustainable
evolution**.
---
## 12. References
- Hard coding rules: [../.claude/rules/](../.claude/rules/) (auto-loaded by Claude Code)
- Slash command workflows: [../.claude/skills/](../.claude/skills/)
- Contributor onboarding: [../CONTRIBUTING.md](../CONTRIBUTING.md)
- Architecture: [architecture.md](architecture.md)
- Claude Code memory mechanism: [code.claude.com/docs/en/memory.md](https://code.claude.com/docs/en/memory.md)
- Claude Code skills: [code.claude.com/docs/en/skills.md](https://code.claude.com/docs/en/skills.md)
- ruff: [docs.astral.sh/ruff](https://docs.astral.sh/ruff/)
- import-linter: [import-linter.readthedocs.io](https://import-linter.readthedocs.io/)
- gitlint: [jorisroovers.com/gitlint](https://jorisroovers.com/gitlint/)
- uv: [docs.astral.sh/uv](https://docs.astral.sh/uv/)
- pre-commit: [pre-commit.com](https://pre-commit.com/)
- Conventional Commits: [conventionalcommits.org](https://www.conventionalcommits.org/)
- GitHub Actions: [docs.github.com/en/actions](https://docs.github.com/en/actions)

294
docs/how-memory-works.md Normal file

@ -0,0 +1,294 @@
# How Memory Works
How EverOS turns a stream of messages into durable, searchable memory —
the storage stack, the path layout on disk, the write→index→read
pipeline, and the consistency guarantees.
This is the narrative companion to the reference docs: see
[storage_layout.md](storage_layout.md) for the exact file encoding,
[architecture.md](architecture.md) for the layer boundaries, and
[api.md](api.md) for the HTTP contract.
## Table of contents
- [The storage stack](#the-storage-stack)
- [Storage paths](#storage-paths)
- [How a memory is born](#how-a-memory-is-born)
- [Memory types & storage strategies](#memory-types--storage-strategies)
- [The cascade daemon](#the-cascade-daemon)
- [The Offline Memory Engine (OME)](#the-offline-memory-engine-ome)
- [Consistency model](#consistency-model)
- [Zero external services](#zero-external-services)
- [Operating it](#operating-it)
## The storage stack
Three embedded pieces, each owning what it is best at. Markdown is the
**source of truth**; the other two are **derived and rebuildable**.
| Layer | Backed by | Holds | Rebuildable? |
|---|---|---|---|
| **Markdown + YAML frontmatter** | plain `.md` files | the memory content itself — the only portable, human-editable asset | — (it *is* the truth) |
| **SQLite** (`aiosqlite`) | `.index/sqlite/*.db` | system state, audit log, the cascade queue, the boundary buffer, OME engine state | ✅ from markdown |
| **LanceDB** (Arrow) | `.index/lancedb/*.lance` | vector + BM25 + scalar columns for retrieval | ✅ from markdown |
!!! note "The one rule that follows from this"
    Delete the entire `.index/` directory and **no memory is lost**: it
    rebuilds from the `.md` tree. There is no separate "export"; the
    markdown *is* the export. (How to trigger a rebuild:
    [Operating it](#operating-it).)
## Storage paths
The default memory root is **`~/.everos/`** (override with
`EVEROS_MEMORY__ROOT` or `[memory] root` in TOML). Configuration (the
`.env` file) is separate from data (the memory root): the server searches
`./.env` → `$XDG_CONFIG_HOME/everos/.env` → `~/.everos/.env`.
Memory is partitioned by **`<app_id>/<project_id>`** *before* the
user-visible directories, so different `(app, project)` spaces never share
a directory or cross in search. The reserved id `"default"` materialises as
`default_app` / `default_project` on disk (so a default space stays
visually distinct from a user-named one).
```
~/.everos/                                        ← memory root (EVEROS_MEMORY__ROOT)
├── default_app/                                  ← <app_id> ("default" → default_app)
│   └── default_project/                          ← <project_id> ("default" → default_project)
│       ├── users/                                ← user-visible (source of truth)
│       │   └── <user_id>/
│       │       ├── user.md                       single-file (profile)
│       │       ├── episodes/
│       │       │   └── episode-<YYYY-MM-DD>.md   daily-log append
│       │       ├── .atomic_facts/                daily-log (hidden)
│       │       │   └── atomic_fact-<YYYY-MM-DD>.md
│       │       └── .foresights/                  daily-log (hidden)
│       │           └── foresight-<YYYY-MM-DD>.md
│       ├── agents/
│       │   └── <agent_id>/
│       │       ├── .cases/                       daily-log (hidden)
│       │       │   └── agent_case-<YYYY-MM-DD>.md
│       │       └── skills/                       skill-named dir
│       │           └── skill_<name>/SKILL.md     (+ references/ scripts/)
│       └── knowledge/                            ← shared / global (reserved)
├── .index/                                       ← system-managed, rebuildable (gitignore)
│   ├── sqlite/
│   │   ├── system.db                             state / audit / cascade queue (md_change_state) / buffer / LSN
│   │   ├── ome.db                                Offline Memory Engine state
│   │   ├── ome.aps.db                            APScheduler jobstore (split to avoid lock contention)
│   │   └── ome.db.lock                           OME single-engine guard (portalocker)
│   └── lancedb/
│       └── <kind>.lance/                         one Arrow table per kind
├── ome.toml                                      ← user-editable OME strategy overrides (hot-reloaded)
└── .tmp/                                         atomic-write staging
```
!!! warning "Differences from older PRD-era docs"
    The index dir is **`.index/`** (dot-prefixed), not `_index/`. The
    cascade queue and LSN/audit state live in **SQLite** (`system.db`,
    table `md_change_state`); there is no `.cascade.log` / `.manifest.json`
    file in the current implementation. The `<app>/<project>` nesting is
    real and always present (`default_app/default_project` for the default
    scope). There is **no `everos reindex` command** (see
    [Operating it](#operating-it)).
The path manager is
[`MemoryRoot`](../src/everos/core/persistence/memory_root.py); every path
above is a property on it. `MemoryRoot.ensure()` creates the runtime dirs
(`.index/{sqlite,lancedb}/`, `.tmp/`) and copies the OME template to
`ome.toml`; user-visible dirs appear on first write.
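The partitioning rule can be illustrated with plain `pathlib`. The functions below are hypothetical and exist only for this example; the real `MemoryRoot` exposes these locations as properties and may differ in detail.

```python
from datetime import date
from pathlib import Path


def space_dir(root: Path, app_id: str = "default", project_id: str = "default") -> Path:
    """Resolve <root>/<app_id>/<project_id>, mapping the reserved id "default"."""
    app_dir = "default_app" if app_id == "default" else app_id
    project_dir = "default_project" if project_id == "default" else project_id
    return root / app_dir / project_dir


def episode_file(root: Path, user_id: str, day: date,
                 app_id: str = "default", project_id: str = "default") -> Path:
    """Daily-log episode file for one user on one day."""
    return (space_dir(root, app_id, project_id)
            / "users" / user_id / "episodes" / f"episode-{day.isoformat()}.md")


root = Path("/tmp/everos-demo")
print(episode_file(root, "alice", date(2026, 5, 29)))
print(episode_file(root, "alice", date(2026, 5, 29), app_id="crm", project_id="emea"))
```

The two calls resolve to different directories for the same user, which is what keeps `(app, project)` spaces from sharing files.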
## How a memory is born
A message does not become memory immediately — it accumulates, a boundary
is detected, an LLM extracts a cell, writers persist markdown, and the
index catches up asynchronously.
```
POST /add ──▶ unprocessed_buffer (SQLite)   ← messages accumulate per (session, app, project)
                     ├─ boundary detector trips ─┐
POST /flush ─────────┤  (or you force it)        │  one LLM call
                     │                           ▼
                     │                    extract MemCell ──▶ memcell row (SQLite)
                     │                           │
                     │            ┌──────────────┴───────────────┐
                     │            ▼                              ▼
                     │  UserMemoryPipeline (sync)   AgentMemoryPipeline (fire-and-forget)
                     │  writes episode .md NOW      emits AgentPipelineStarted
                     ▼            │                              │
   (response returns once md is on disk)                         │
                                  ▼                              ▼
   ┌──────────────────── Offline Memory Engine (OME) ─────────────────────┐
   │ async strategies write derived .md:                                  │
   │ atomic_facts · foresight · user profile · agent cases · agent skills │
   └──────────────────────────────────┬───────────────────────────────────┘
                     cascade daemon watches the .md tree
                   md_change_state queue (SQLite, durable)
                     rebuild LanceDB rows ──▶ searchable
```
- **`/add`** appends messages to a per-`(session_id, app_id, project_id)`
buffer and returns `accumulated` (or `extracted` if the boundary tripped
on this call). See [api.md](api.md).
- **`/flush`** forces the boundary now (one extraction LLM call), used at
the end of a chat/agent run.
- Episode markdown is written **synchronously** — when `/flush` returns
`extracted`, the episode file is already on disk.
- Everything else (atomic facts, foresight, profile, agent cases/skills)
is produced **asynchronously** by the OME — see
[the OME section](#the-offline-memory-engine-ome).
- The **cascade daemon** turns every `.md` write into LanceDB rows so the
content becomes searchable.
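The accumulate, boundary, extract sequence in the bullets above can be modelled with a toy in-memory buffer. Everything here is an assumption made for illustration: `ToyBuffer` is not an EverOS class, a simple message count stands in for the real boundary detector, and the `"empty"` status is this toy's own convention for a flush with nothing pending.

```python
from collections import defaultdict

BufferKey = tuple[str, str, str]  # (session_id, app_id, project_id)


class ToyBuffer:
    """In-memory model of the accumulate -> boundary -> extract flow."""

    def __init__(self, boundary_after: int = 3) -> None:
        # Assumption: a message count stands in for the real boundary detector.
        self.boundary_after = boundary_after
        self.pending: dict[BufferKey, list[str]] = defaultdict(list)
        self.cells: list[tuple[BufferKey, list[str]]] = []

    def add(self, key: BufferKey, message: str) -> str:
        self.pending[key].append(message)
        if len(self.pending[key]) >= self.boundary_after:
            return self._extract(key)
        return "accumulated"

    def flush(self, key: BufferKey) -> str:
        # Force the boundary now; "empty" is this toy's own status for a no-op.
        return self._extract(key) if self.pending[key] else "empty"

    def _extract(self, key: BufferKey) -> str:
        # Move the buffered messages into one extracted cell.
        self.cells.append((key, self.pending.pop(key)))
        return "extracted"


buf = ToyBuffer(boundary_after=3)
key = ("session-1", "default", "default")
statuses = [buf.add(key, f"message {i}") for i in range(4)]
statuses.append(buf.flush(key))
print(statuses)
# ['accumulated', 'accumulated', 'extracted', 'accumulated', 'extracted']
```

The fourth message starts a new buffer after the boundary tripped, and the final flush extracts it without waiting for the threshold.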
## Memory types & storage strategies
Six business memory kinds today, each user- or agent-owned, each picking
one of three on-disk patterns:
| Kind | Owner | Dir / file | Strategy | Produced by |
|---|---|---|---|---|
| **episode** | user | `episodes/episode-<date>.md` | daily-log | extraction (sync) |
| **atomic_fact** | user | `.atomic_facts/atomic_fact-<date>.md` (hidden) | daily-log | OME |
| **foresight** | user | `.foresights/foresight-<date>.md` (hidden) | daily-log | OME |
| **profile** | user | `user.md` | single-file rewrite | OME |
| **agent_case** | agent | `.cases/agent_case-<date>.md` (hidden) | daily-log | OME |
| **agent_skill** | agent | `skills/skill_<name>/SKILL.md` | skill-named dir | OME (clustering) |
The three strategies:
| Strategy | Shape | Why |
|---|---|---|
| **Daily-log append** | `<prefix>-<YYYY-MM-DD>.md`, one entry appended per memory | collapses thousands of per-entry files into one file per day |
| **Single-file rewrite** | a fixed filename overwritten in place | for a single evolving document (a user/agent profile) |
| **Skill-named dir** | one directory per skill | a skill is a richer unit (body + optional `references/` `scripts/`) |
!!! note
    The single-file writer also supports `agent.md` / `soul.md` /
    `tools.md` / `behaviors.md`, but no shipped OME strategy produces those
    yet; today only `user.md` is written. Detailed frontmatter and
    entry-id encoding live in [storage_layout.md](storage_layout.md).
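Of the three strategies, daily-log append is the simplest to sketch. The example below is a simplified stand-in with a hypothetical `append_daily_log` helper: it omits the frontmatter, entry ids, and atomic-write staging that the real writers use, and the entry text is invented.

```python
import tempfile
from datetime import date
from pathlib import Path


def append_daily_log(directory: Path, prefix: str, day: date, entry: str) -> Path:
    """Append one entry to <prefix>-<YYYY-MM-DD>.md, creating the file on first use."""
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / f"{prefix}-{day.isoformat()}.md"
    with target.open("a", encoding="utf-8") as handle:
        handle.write(entry.rstrip("\n") + "\n\n")
    return target


with tempfile.TemporaryDirectory() as tmp:
    episodes = Path(tmp) / "episodes"
    day = date(2026, 5, 29)
    first = append_daily_log(episodes, "episode", day, "## Entry one\nMet with the design team.")
    second = append_daily_log(episodes, "episode", day, "## Entry two\nShipped the fix.")
    files = sorted(p.name for p in episodes.iterdir())
    text = first.read_text(encoding="utf-8")

print(files)                    # ['episode-2026-05-29.md']
print(text.count("## Entry"))   # 2
```

Both entries land in the same file for that day, so the number of files grows with days rather than with memories.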
## The cascade daemon
The cascade subsystem keeps LanceDB in sync with the markdown tree. It runs
**in-process** with the server (a coroutine started by the app lifespan),
not as a separate OS daemon.
1. A native filesystem watcher (`watchdog`: FSEvents on macOS, inotify on
Linux) sees a `.md` create/modify.
2. The change is enqueued in the **`md_change_state`** table (SQLite) —
durable, so a crash mid-sync replays on restart.
3. A worker drains the queue at **entry-level** granularity: it diffs the
file, re-embeds only changed entries (keyed by `content_sha256`), and
upserts the LanceDB rows.
Because markdown is the source of truth, **editing a file directly is
fully supported** — open an episode in VSCode / Obsidian / Vim, change an
entry, save, and the daemon re-indexes just that entry. Operate the queue
with `everos cascade` ([Operating it](#operating-it)); deeper runbook in
[cascade_runbook.md](cascade_runbook.md).
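The entry-level diff in step 3 can be pictured as a hash comparison. This is a minimal sketch under stated assumptions: `entries_to_reembed` is a hypothetical helper, entries are modelled as a plain `entry_id -> text` mapping, and deleted entries are not handled.

```python
import hashlib


def content_sha256(text: str) -> str:
    """Hex digest used here as the change-detection key for one entry."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def entries_to_reembed(indexed: dict[str, str], current: dict[str, str]) -> list[str]:
    """Compare stored hashes (entry_id -> sha256) with the file's entries (entry_id -> text)."""
    return [
        entry_id
        for entry_id, text in current.items()
        if indexed.get(entry_id) != content_sha256(text)
    ]


before = {"e1": "Alice prefers tea.", "e2": "Bob lives in Oslo."}
indexed = {entry_id: content_sha256(text) for entry_id, text in before.items()}

after = {"e1": "Alice prefers tea.", "e2": "Bob lives in Bergen.", "e3": "Carol joined."}
print(entries_to_reembed(indexed, after))  # ['e2', 'e3']
```

Only the edited entry and the new one are returned, so an edit to one line of a daily log does not trigger re-embedding of the whole file.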
## The Offline Memory Engine (OME)
Most memory kinds are **not** extracted on the request path — they are
derived later by the OME, an in-process async strategy engine. When
extraction carves a MemCell, it emits an event; OME strategies pick it up
and write their markdown when ready:
- `extract_atomic_facts` — single-sentence facts from an episode
- `extract_foresight` — anticipatory notes
- `extract_user_profile` — the aggregated `user.md`
- `extract_agent_case` — a reusable agent trajectory (only when the cell is
substantive enough; thin trajectories are skipped by design)
- `extract_agent_skill` — clusters related cases into a named skill
Strategies are configurable without a code change via **`ome.toml`** at the
memory root (hot-reloaded within ~2 s). Example — turn two off:
```toml
[strategies.extract_foresight]
enabled = false
[strategies.extract_user_profile]
enabled = false
```
OME keeps its own state in `.index/sqlite/ome.db` (run records, counters)
and its scheduler jobstore in `.index/sqlite/ome.aps.db` (split so the sync
APScheduler writer and the async OME writer never contend for one file
lock).
!!! tip "Implication for clients"
    After `/flush` returns `extracted`, the **episode** is queryable soon
    (once cascade indexes it), but **atomic facts / profile / agent cases**
    appear only after their OME strategy runs, typically seconds later.
    Poll / retry if you need them immediately.
## Consistency model
Two paths, two guarantees:
| Path | Guarantee | Detail |
|---|---|---|
| **Write** (`/add`, `/flush`) | **strong** | the episode `.md` is on disk before the call returns `extracted`; never blocks on LanceDB |
| **Read** (`/search`, `/get`) | **eventual** | reads LanceDB, which lags the markdown by the cascade processing time: typically sub-second, up to ~10–15 s under load |
So a `/search` immediately after the `/flush` that produced a record may
miss it. The markdown is durable regardless; index lag never loses data. If
you need read-your-write, retry with backoff, or force the queue with
`everos cascade sync`.
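A client-side retry with backoff might look like the sketch below. It is generic and makes no assumption about the HTTP API: `search` is any zero-argument callable supplied by the caller, and `search_with_retry` and its parameters are names invented for this example.

```python
import time
from collections.abc import Callable


def search_with_retry(
    search: Callable[[], list],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Call `search` until it returns hits, doubling the wait after each miss."""
    for attempt in range(attempts):
        hits = search()
        if hits:
            return hits
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return []


# Demo with a fake search that only "sees" the record on the third call.
calls = {"n": 0}


def fake_search() -> list:
    calls["n"] += 1
    return ["episode-123"] if calls["n"] >= 3 else []


waits: list[float] = []
result = search_with_retry(fake_search, sleep=waits.append)
print(result, waits)  # ['episode-123'] [0.5, 1.0]
```

Passing `sleep` as a parameter keeps the demo instantaneous; in real use the default `time.sleep` applies and the total wait should be sized against the index lag described above.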
Integrity is anchored by a few invariants (details in
[storage_layout.md](storage_layout.md)): the frontmatter `id` /
`entry_id` is the immutable join key; `content_sha256` decides whether an
entry needs re-embedding; an LSN watermark (in `system.db`) orders
rebuilds; the durable `md_change_state` queue is the replayable audit
trail.
## Zero external services
No database server, message broker, or vector service to run. Vector ANN,
full-text BM25, and scalar filtering all execute inside the **embedded
LanceDB** engine in one query; SQLite is a local file. The whole stack is a
single directory that you can copy or back up, and whose user-visible parts
you can check into git.
!!! note
    There is no automatic "grep over markdown" search fallback today. If
    the LanceDB index is unavailable, rebuild it from markdown (it is
    derived and disposable) rather than relying on a degraded search path.
## Operating it
The CLI ([cli.md](cli.md)) is intentionally small:
| Command | What it does |
|---|---|
| `everos init` | write a starter `.env` |
| `everos server start` | run the HTTP API (cascade + OME start with it) |
| `everos cascade status` | queue / LSN summary |
| `everos cascade sync` | drain the cascade queue now (force md → LanceDB) |
| `everos cascade fix` | list failed rows / re-enqueue retryable ones |
!!! warning "There is no `everos reindex` or `everos flush`"
    - **Reindex** = the index is rebuildable: stop the server,
      `rm -rf <memory-root>/.index/lancedb`, restart, and the cascade
      rebuilds from markdown. For an incremental catch-up, use
      `everos cascade sync`.
    - **Flush** is an HTTP endpoint (`POST /api/v1/memory/flush`), not a
      CLI command. It forces *extraction* of the session buffer, which is
      a different thing from forcing *index sync* (`cascade sync`).
## References
- [storage_layout.md](storage_layout.md) — exact file encoding, frontmatter
chassis, entry-id format, atomic-write semantics
- [architecture.md](architecture.md) — DDD layers and dependency rules
- [api.md](api.md) — the HTTP contract (`/add` `/flush` `/search` `/get`)
- [cascade_runbook.md](cascade_runbook.md) — operating the sync queue

63
docs/index.md Normal file

@ -0,0 +1,63 @@
# EverOS Documentation
Documentation for [EverOS](../README.md), an md-first memory extraction
framework. It is organised by [Diátaxis](https://diataxis.fr/): the kind of
question you have determines which section to read.
## Reference
Technical reference: contracts, commands, schemas — read these when you
already know what you want to do and need to know exactly how.
| Doc | Purpose |
|---|---|
| [api.md](api.md) | HTTP API v1 reference — endpoints, request / response, error contracts |
| [cli.md](cli.md) | `everos` CLI subcommands + env var conventions |
| [storage_layout.md](storage_layout.md) | Memory-root tree + frontmatter chassis + EntryId encoding |
| [prompt_slots.md](prompt_slots.md) | YamlConfigLoader + three-layer prompt override |
## Explanation
Design decisions and architectural concepts — read these to understand
why the system is shaped the way it is.
| Doc | Purpose |
|---|---|
| [overview.md](overview.md) | Project vision, scope, design philosophy |
| [how-memory-works.md](how-memory-works.md) | Storage stack + on-disk paths + write→index→read pipeline + consistency |
| [architecture.md](architecture.md) | DDD layered architecture + dependency rules |
| [datetime.md](datetime.md) | Two-zone discipline — UTC at storage, display tz at boundaries |
## How-to
Task-driven operational guides — read these when you need to do a
specific thing (drain a queue, recover from a stuck row, etc.).
| Doc | Purpose |
|---|---|
| [cascade_runbook.md](cascade_runbook.md) | Cascade subsystem ops — drain queue, recover stuck rows |
## Engineering / Internal
For maintainers and contributors working on the framework itself,
not for using it.
| Doc | Purpose |
|---|---|
| [engineering.md](engineering.md) | Engineering & dev-efficiency infrastructure (CI / tooling / Claude Code) |
## See also
Top-level project files live at the repo root:
- [README.md](../README.md) — quick start & feature overview
- [QUICKSTART.md](../QUICKSTART.md) — 5-minute walkthrough (install → service → search)
- [CONTRIBUTING.md](../CONTRIBUTING.md) — how to contribute (issue-only model)
- [CHANGELOG.md](../CHANGELOG.md) — release notes
- [SECURITY.md](../SECURITY.md) — security policy & private vulnerability reporting
- [CITATION.md](../CITATION.md) — academic citation info
- [ACKNOWLEDGMENTS.md](../ACKNOWLEDGMENTS.md) — third-party acknowledgments
Coding conventions and slash command workflows are auto-loaded by
Claude Code from [.claude/rules/](../.claude/rules/) and
[.claude/skills/](../.claude/skills/).

126
docs/locomo_benchmark.md Normal file

@ -0,0 +1,126 @@
# Running the LoCoMo Benchmark
This guide walks through reproducing EverOS's LoCoMo retrieval scores
locally using the `hybrid` and `agentic` search methods.
## Contents
- [Prerequisites](#prerequisites)
- [1. Prepare the dataset](#1-prepare-the-dataset)
- [2. Start the server](#2-start-the-server)
- [3. Run `hybrid`](#3-run-hybrid)
- [4. Run `agentic`](#4-run-agentic)
- [5. Where the results land](#5-where-the-results-land)
- [Notes](#notes)
---
## Prerequisites
- Python **3.12**, [uv](https://docs.astral.sh/uv/)
- A `.env` at the repo root with the LLM / embedding credentials EverOS
needs:
- `EVEROS_LLM__MODEL`, `EVEROS_LLM__API_KEY`, `EVEROS_LLM__BASE_URL`
- `EVEROS_EMBEDDING__*`
- `EVEROS_RERANK__*`
- The benchmark driver also reads `LLM_API_KEY` / `ANSWER_MODEL` /
`JUDGE_MODEL` for the answer + judge passes.
Install the project:
```bash
uv sync
```
## 1. Prepare the dataset
Place the LoCoMo file at `data/locomo10.json` (the dataset is
distributed by the LoCoMo authors, not this repo). Override the path
later with `--data-path` if you keep it elsewhere.
## 2. Start the server
```bash
EVEROS_MEMORY__ROOT=~/.everos \
uv run python -m everos.entrypoints.cli.main server start --port 8000
```
`EVEROS_MEMORY__ROOT` isolates one benchmark's corpus from another —
change it (or `rm -rf` it) whenever you want a clean run.
Leave the server running in one terminal; run the benchmark from
another.
## 3. Run `hybrid`
Single conversation:
```bash
bash tests/run_locomo_batch.sh \
--conv-indices 0 \
--methods hybrid \
--base-url http://localhost:8000 \
--top-k 10
```
All 10 conversations, 2-way parallel:
```bash
bash tests/run_locomo_batch.sh \
--conv-indices 0-9 \
--methods hybrid \
--base-url http://localhost:8000 \
--top-k 10 \
--concurrency 2
```
The wrapper picks up `EVEROS_MEMORY__ROOT` from the environment so the
cascade poll path matches the server's data root. If you set them
differently, pass `--corpus-path` explicitly.
## 4. Run `agentic`
Same wrapper, swap `--methods`:
```bash
bash tests/run_locomo_batch.sh \
--conv-indices 0-9 \
--methods agentic \
--base-url http://localhost:8000 \
--top-k 10 \
--concurrency 2
```
You can also benchmark multiple methods in one go — they share the
same ingested corpus:
```bash
bash tests/run_locomo_batch.sh \
--conv-indices 0-9 \
--methods hybrid,agentic \
--base-url http://localhost:8000 \
--top-k 10 \
--concurrency 2
```
## 5. Where the results land
Default output root is `benchmark_results/run_<timestamp>/`. Override
with `--output-root`:
```
<output_root>/
├── conv0.json … conv9.json # per-conv summary + per-question details
├── conv0.log … conv9.log # per-conv stdout (only in --concurrency >1 mode)
└── conv0_checkpoints/ … # incremental search/answer/eval JSON
```
An aggregate accuracy table prints at the end of the wrapper run.
## Notes
- **Re-running on the same corpus**: add `--skip-add` to skip ingest and
reuse what's already in `~/.everos`. Useful when comparing methods
side by side.
- **Judge variance**: `--judge-runs 3` runs the judge three times per
question and majority-votes; slower but reduces LLM-judge noise.
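Majority voting over judge runs can be sketched as follows. This guide does not show how the benchmark driver aggregates verdicts, so `majority_vote` and its tie rule are assumptions made for illustration only.

```python
from collections import Counter


def majority_vote(verdicts: list[bool]) -> bool:
    """Return True only if strictly more than half of the judge runs said "correct".

    Assumption: a tie (possible only with an even number of runs) counts as
    incorrect. With an odd count such as --judge-runs 3, ties cannot occur.
    """
    if not verdicts:
        raise ValueError("need at least one judge verdict")
    counts = Counter(verdicts)
    return counts[True] > counts[False]
```

An odd number of runs avoids ties entirely, which is one reason three runs is a convenient choice.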

1664
docs/openapi.json Normal file

File diff suppressed because it is too large Load Diff

85
docs/overview.md Normal file

@ -0,0 +1,85 @@
# EverOS — Project Overview
## Vision
Build an open-source Python memory framework where **AI agents' long-term memory is plain Markdown files on the user's disk**, not opaque rows in a hosted database.
## Scope
**In scope (v1)**:
- Local deployment for personal agents or small teams
- Conversation, workflow, agent-trace, file-knowledge → structured memory
- Hybrid retrieval (BM25 + vector + scalar filter)
- Cascade index sync (md edit → LanceDB sub-second)
- Dual-track memory (user-track / agent-track)
- Offline memory evolution (Foresight / AtomicFact / Profile / Skill)
- CLI + HTTP API
**Out of scope (v1, future v2)**:
- Multi-tenant / group / community deployment (10K+ users)
- Edge-to-cloud sync (planned for v2)
- Distributed deployment / sharding
## Design philosophy
### 1. Markdown as Source of Truth
```
delete all LanceDB / SQLite files → can rebuild from md
delete any md file → memory is gone
```
User trust comes from physical visibility — the user can `cat` / `vim` / `grep` their own memory at any time.
### 2. Three-piece storage with clear job boundaries
| Component | Role | Does NOT do |
|---|---|---|
| Markdown files | Truth source: entries, frontmatter | Search (no automatic grep fallback; rebuild the index from markdown instead) |
| SQLite | Queue, cascade audit log, sensitive data isolation | Vector / full-text |
| LanceDB | Vector ANN + BM25 + scalar filter, single-query hybrid | Be the source of truth (loss = rebuild from md) |
### 3. Algorithm-orchestration separation
[`everalgo`](https://github.com/EverMind-AI/EverAlgo) (a separate library, published as the `everalgo-*` PyPI packages) holds the extraction algorithms (MemCell extraction, Episode generation, Profile evolution). EverOS calls everalgo via the PromptSlot interface; everalgo knows nothing about storage.
This boundary lets the same algorithm power both this open-source lightweight version and other product forms.
### 4. DDD layered architecture
```
entrypoints → service → memory → infra
component / core / config
```
Strict single-direction dependency, enforced by `import-linter` in CI.
## Why src layout (`src/everos/`)
- Standard PyPA project structure used when shipping to PyPI
- Avoid namespace collision with system packages named `memory`, `infra`, etc.
- Avoid accidental import of working-tree code in dev (PyPA recommendation)
## Comparable projects (where EverOS differs)
| Project | Position | Difference |
|---|---|---|
| [mem0](https://github.com/mem0ai/mem0) | API-first memory service | mem0 stores in vector DB; we store in md files |
| [Letta](https://github.com/letta-ai/letta) | Agent OS w/ Core/Recall/Archival | Letta uses Postgres; we use markdown filesystem |
| [MemOS](https://github.com/MemTensor/MemOS) | Multi-classification memory | MemOS targets enterprise; we target lightweight (single-user / small team) |
| [memsearch](https://github.com/zilliztech/memsearch) | md-first search engine | Closest to us; we add memory extraction (not just search) |
## Roadmap
- **v0.1 (MVP)** — Phase 1 core loop: markdown + lancedb + cascade + episode extraction
- **v0.2** — Full extraction pipeline (workspace / agent / knowledge), evolution framework
- **v0.3** — Production hardening, full CLI, HTTP API, Obsidian demo
- **v1.0** — Stable API, PyPI release, comprehensive docs
- **v2** (future) — Edge-to-cloud sync via EverMe (separate project)
## Status
**Alpha — v0.1.0 in active development**. Core API may change before v1.0.

111
docs/prompt_slots.md Normal file

@ -0,0 +1,111 @@
# PromptSlot
PromptSlot is the layer between the algorithm code (`everalgo`) and
the prompts it sends to LLMs. Algorithm code receives a `PromptSlot`
parameter; the *project* (EverOS) supplies defaults and lets operators
override.
> **Status (2026-05-07)**: the YAML loader is implemented; the higher-
> level `PromptSlot` model + sandbox dry-run + three-layer overlay
> resolution arrive when the memory layer ships (see Stage 2).
## Three-layer overlay
```
config/prompt_slots/<name>.yaml (Layer 1: defaults shipped with the package)
~/.everos/prompt_slots/<name>.yaml (Layer 2: app-level override; per-deployment)
runtime override (Layer 3: per-call override; e.g. "force model X")
```
Precedence: layer 3 overrides layer 2, which overrides layer 1. Layer 1 is
loaded eagerly at startup; layer 2 is loaded lazily on first reference;
layer 3 is supplied at the call site.
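The precedence rule can be illustrated with a small merge function. This is only a sketch: the resolution logic has not shipped yet, so `merge_layers` is a hypothetical name, and merging nested keys (rather than replacing a whole slot file) is an assumption.

```python
from typing import Any, Mapping


def merge_layers(*layers: Mapping[str, Any]) -> dict[str, Any]:
    """Deep-merge mappings left to right; later layers win over earlier ones.

    Hypothetical helper, not the EverOS resolver. Nested mappings are merged
    key by key; any other value is replaced outright. Inputs are not mutated.
    """
    merged: dict[str, Any] = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, Mapping):
                current = merged.get(key)
                base = current if isinstance(current, dict) else {}
                merged[key] = merge_layers(base, value)
            else:
                merged[key] = value
    return merged
```

Calling `merge_layers(defaults, deployment, runtime)` passes the layers in ascending priority, so a per-call `llm.model` override would replace the packaged default while untouched keys such as `llm.max_tokens` carry through.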
## Loader
The category loader lives at
[`src/everos/component/config/loader.py`](../src/everos/component/config/loader.py)
as `YamlConfigLoader`:
```python
from pathlib import Path
from everos.component.config import YamlConfigLoader
loader = YamlConfigLoader(
    root=Path("src/everos/config"),
    categories={"prompt_slots": None},  # subdir == category name
)
# Reads <root>/prompt_slots/episode_extract.yaml → dict
slot = loader.find("prompt_slots", "episode_extract")
# Refresh after on-disk edits.
loader.refresh() # drop the entire cache
loader.refresh("prompt_slots") # drop one category
loader.refresh("prompt_slots", "episode_extract") # drop one entry
```
Top-level YAML is required to be a mapping; a list / scalar root
raises `TypeError` to fail-fast (loud, not silent).
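The fail-fast rule amounts to a type check on the parsed document. A minimal sketch, assuming the YAML has already been parsed into a Python object; `require_mapping_root` is a hypothetical name and not the loader's actual internals.

```python
from typing import Any


def require_mapping_root(parsed: Any, source: str) -> dict[str, Any]:
    """Return the parsed YAML root if it is a mapping, otherwise raise TypeError.

    Hypothetical helper illustrating the rule; `source` is only used to make
    the error message point at the offending file.
    """
    if not isinstance(parsed, dict):
        kind = type(parsed).__name__
        raise TypeError(f"{source}: top-level YAML must be a mapping, got {kind}")
    return parsed
```

Raising immediately means that a slot file whose root is a list or a scalar is reported at load time instead of producing a confusing failure later in prompt rendering.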
## YAML format (proposed; subject to change)
```yaml
# config/prompt_slots/episode_extract.yaml
template: |
  Extract a single episode from this conversation:
  {{ memcell.text }}
variables:
  memcell: input memcell
output_schema:
  type: object
  properties:
    summary: { type: string }
    participants: { type: array }
llm:
  model: gpt-4o-mini
  temperature: 0.3
  max_tokens: 2000
validation:
  test_cases:
    - input: { memcell: { text: "Hi" } }
      expected: { summary: "...", participants: [] }
```
When layer 2 supplies an override, the loader will be re-pointed at
`~/.everos/prompt_slots/`; the runtime resolution logic (currently TBD)
will sandbox-run the merged slot before returning it.
## Why YAML (not TOML)
Two reasons:
1. **Multiline templates** — TOML's basic-string grammar fights
prompt content (no easy `{{ jinja }}` variables, awkward escaping).
YAML's literal block scalar (`|`) preserves prompts as-is.
2. **Comment + reference ergonomics** — operators frequently inherit
slots, tweak a few keys, and leave inline notes. YAML is more
forgiving for hand-editing.
The Pydantic Settings file (`config/default.toml`) stays TOML — it's
machine-managed and type-validated; YAML's flexibility costs more
than it pays for that case.
## Why a separate loader (not Pydantic Settings)
Settings = **one** structured tree, validated at load time, tied to a
single source of truth. PromptSlots = **many** separate templates
discovered by name, layered per-deployment. They're different shapes;
forcing one model on the other gets clunky.
## See also
- [`src/everos/component/config/loader.py`](../src/everos/component/config/loader.py)
- [`tests/unit/test_component/test_config/test_loader.py`](../tests/unit/test_component/test_config/test_loader.py)
- [`docs/architecture.md`](architecture.md) — layer placement

222
docs/storage_layout.md Normal file

@ -0,0 +1,222 @@
# Storage Layout
How `everos` lays out a memory-root on disk: directory tree, file
naming, frontmatter chassis, and entry-id encoding.
The markdown files are the **source of truth**; SQLite and LanceDB are
derived indexes that can be rebuilt from markdown alone.
## 1. Memory-root tree
A memory-root is a single directory holding all persisted memory. The
default location is `~/.everos/`; override via `EVEROS_MEMORY__ROOT`
env var or `[memory] root` in the TOML config.
Memory is partitioned by **`<app_id>/<project_id>`** *before* the
user-visible scope dirs, so different `(app, project)` spaces never share
a directory. The reserved id `"default"` materialises as `default_app` /
`default_project` on disk. The scope is encoded **in the path**, not in
the frontmatter (see [§3](#3-frontmatter-chassis-yaml)).
```
<memory-root>/                          default ~/.everos
├── <app_id>/                           user-visible; "default" → default_app
│   └── <project_id>/                   "default" → default_project
│       ├── users/
│       │   └── <user_id>/
│       │       ├── user.md             single-file rewrite (profile)
│       │       ├── episodes/           daily-log append
│       │       │   └── episode-<YYYY-MM-DD>.md
│       │       ├── .atomic_facts/      daily-log append (hidden)
│       │       │   └── atomic_fact-<YYYY-MM-DD>.md
│       │       └── .foresights/        daily-log append (hidden)
│       │           └── foresight-<YYYY-MM-DD>.md
│       ├── agents/
│       │   └── <agent_id>/
│       │       ├── .cases/             daily-log append (hidden)
│       │       │   └── agent_case-<YYYY-MM-DD>.md
│       │       └── skills/             skill-named dir
│       │           └── skill_<name>/
│       │               ├── SKILL.md
│       │               ├── references/ (optional)
│       │               └── scripts/    (optional)
│       └── knowledge/                  user-visible (shared / global, reserved)
├── .index/                             system-managed, rebuildable (gitignore)
│   ├── sqlite/
│   │   ├── system.db                   state / cascade queue (md_change_state) / buffer / audit / LSN (+ -wal / -shm)
│   │   ├── ome.db                      Offline Memory Engine state
│   │   ├── ome.aps.db                  APScheduler jobstore (split to avoid lock contention)
│   │   └── ome.db.lock                 OME single-engine guard (portalocker)
│   └── lancedb/
│       └── <kind>.lance/               one directory per LanceDB table
├── ome.toml                            user-editable OME strategy overrides (hot-reloaded)
└── .tmp/                               staging dir for batch / multi-step writes
```
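To make the tree concrete, here is a sketch that builds the path of one user's daily episode file. It is not the `MemoryRoot` API: `scope_dir` and `episode_file` are hypothetical helpers, and the only rules they encode are the ones shown in the tree (the reserved id `"default"` and the `episode-<YYYY-MM-DD>.md` naming).

```python
import datetime as dt
from pathlib import Path


def scope_dir(identifier: str, kind: str) -> str:
    """Map the reserved id "default" to default_app / default_project."""
    return f"default_{kind}" if identifier == "default" else identifier


def episode_file(
    root: Path, app_id: str, project_id: str, user_id: str, day: dt.date
) -> Path:
    """Path of a user's daily episode log under the memory-root (illustrative)."""
    return (
        root
        / scope_dir(app_id, "app")
        / scope_dir(project_id, "project")
        / "users"
        / user_id
        / "episodes"
        / f"episode-{day.isoformat()}.md"
    )
```

For example, `episode_file(Path("~/.everos").expanduser(), "default", "default", "alice", dt.date(2026, 6, 1))` resolves to `default_app/default_project/users/alice/episodes/episode-2026-06-01.md` under the memory-root.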
> Cascade queue state, the LSN watermark, and the change audit all live in
> SQLite (`system.db`, table `md_change_state`) — crash-recovery replays
> from that durable queue, not a log file. (`MemoryRoot` also exposes a
> `.lock` anchor for the `memory_root_lock` primitive; there is no
> `.cascade.log` / `.manifest.json`.)
The path manager is [`MemoryRoot`](../src/everos/core/persistence/memory_root.py),
exposing every path as a property. `MemoryRoot.ensure()` creates the
runtime-required dirs (`.index/{sqlite,lancedb}/`, `.tmp/`) and copies the
OME template to `ome.toml`; the user-visible dirs are *not* pre-created —
they appear on first write.
> The single-file writer also supports `agent.md` / `soul.md` / `tools.md`
> / `behaviors.md`, but no shipped strategy produces those today — only
> `user.md` is written. `memcell` is a SQLite-only kind (the boundary
> ledger); it has no markdown file.
## 2. Three storage strategies
Each business memory kind picks one of three on-disk patterns:
| Strategy | Filename | Mutation | Examples |
|---|---|---|---|
| **Daily-log append** | `<FILE_PREFIX>-<YYYY-MM-DD>.md` under `<DIR_NAME>/` | append entries | episode / atomic_fact / foresight / agent_case |
| **Skill-named dir** | `skills/skill_<name>/SKILL.md` (+ `references/` `scripts/`) | overwrite the file | agent skills (procedural memory) |
| **Single-file rewrite** | `user.md` (writer also supports `agent.md` / `soul.md` / `tools.md` / `behaviors.md`, not yet produced) | overwrite the file | user profile |
Markdown IO primitives live in
[`core/persistence/markdown/`](../src/everos/core/persistence/markdown/);
business-aware writers live in
[`infra/persistence/markdown/writers/`](../src/everos/infra/persistence/markdown/writers/)
and pick the right strategy via a base class.
To add a new memory kind, define its per-kind frontmatter schema under
[`infra/persistence/markdown/mds/`](../src/everos/infra/persistence/markdown/mds/)
and add a matching writer/reader pair under
[`writers/`](../src/everos/infra/persistence/markdown/writers/) and
[`readers/`](../src/everos/infra/persistence/markdown/readers/).
## 3. Frontmatter chassis (YAML)
Every markdown file carries a YAML frontmatter block at the top:
```markdown
---
id: episode_log_alice_2026-06-01
type: episode_daily
file_type: episode_daily
schema_version: 1
user_id: alice
track: user
date: '2026-06-01'
entry_count: 11
last_appended_at: '2026-06-01T09:12:13+00:00'
---
<!-- entry:ep_20260601_00000001 -->
...content...
<!-- /entry:ep_20260601_00000001 -->
```
Scope (`app_id` / `project_id`) is **not** a frontmatter field — it is
carried by the `<app>/<project>` path segments and recovered by the
cascade path parser. The frontmatter only holds the file-level owner
(`user_id` / `agent_id`) and `track`.
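Recovering the scope from a path could look like the sketch below. It is not the cascade path parser itself: `Scope` and `parse_scope` are hypothetical names, the input is assumed to be a POSIX-style path relative to the memory-root, and mapping `default_app` / `default_project` back to `"default"` is an assumption inferred from the rule in §1.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class Scope(NamedTuple):
    app_id: str
    project_id: str
    track: str  # "user" or "agent"
    owner_id: str  # user_id or agent_id


_TRACK_BY_DIR = {"users": "user", "agents": "agent"}


def parse_scope(relative_path: str) -> Scope:
    """Read <app>/<project>/<users|agents>/<owner> from a memory-root-relative path."""
    parts = PurePosixPath(relative_path).parts
    if len(parts) < 4 or parts[2] not in _TRACK_BY_DIR:
        raise ValueError(f"not a user/agent memory path: {relative_path!r}")
    app_dir, project_dir, scope_dir, owner_id = parts[:4]
    # Assumed inverse of the "default" -> default_app / default_project rule.
    app_id = "default" if app_dir == "default_app" else app_dir
    project_id = "default" if project_dir == "default_project" else project_dir
    return Scope(app_id, project_id, _TRACK_BY_DIR[scope_dir], owner_id)
```

Because the scope lives only in the path, moving a file between `<app>/<project>` directories changes its scope without touching its frontmatter.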
The chassis lives in [`core/persistence/markdown/frontmatter.py`](../src/everos/core/persistence/markdown/frontmatter.py)
(Pydantic v2):
```
BaseFrontmatter id / type / schema_version + SCOPE_DIR ClassVar
├─ UserScopedFrontmatter + user_id / track="user" + SCOPE_DIR="users"
└─ AgentScopedFrontmatter + agent_id / track="agent" + SCOPE_DIR="agents"
```
Concrete business schemas subclass one of the scope mixins and add
per-kind fields plus three more ClassVars that drive path resolution
+ entry-id assembly:
```python
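import datetime as dt
from typing import ClassVar, Literal

# DailyLogPathMixin and UserScopedFrontmatter are EverOS chassis classes
# (their project imports are omitted here).
class EpisodeDailyFrontmatter(DailyLogPathMixin, UserScopedFrontmatter):
    # ClassVars that drive path resolution and entry-id assembly.
    ENTRY_ID_PREFIX: ClassVar[str] = "ep"
    DIR_NAME: ClassVar[str] = "episodes"
    FILE_PREFIX: ClassVar[str] = "episode"
    # Per-kind frontmatter fields.
    type: Literal["episode_daily"] = "episode_daily"
    date: dt.date
    entry_count: int = 0
    last_appended_at: dt.datetime | None = None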
```
## 4. Entry-id encoding
Inside daily-log files each entry is bracketed by HTML-comment markers
so the raw markdown stays clean for human readers:
```
<!-- entry:<entry_id> -->
...content...
<!-- /entry:<entry_id> -->
```
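Because the opening and closing markers repeat the same id, a back-referencing regular expression is enough to pull entries out of a daily-log file. The sketch below is illustrative only: `ENTRY_BLOCK` and `iter_entries` are hypothetical names, and the exact whitespace around the markers is assumed to match the example above.

```python
import re
from collections.abc import Iterator

# The closing marker must carry the same id as the opening one (?P=id).
ENTRY_BLOCK = re.compile(
    r"<!-- entry:(?P<id>\S+) -->\n(?P<body>.*?)\n<!-- /entry:(?P=id) -->",
    re.DOTALL,
)


def iter_entries(markdown: str) -> Iterator[tuple[str, str]]:
    """Yield (entry_id, body) pairs in file order; frontmatter is skipped."""
    for match in ENTRY_BLOCK.finditer(markdown):
        yield match.group("id"), match.group("body")
```

HTML comments are invisible in rendered markdown, so this delimiting scheme costs human readers nothing while keeping entries machine-addressable.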
`<entry_id>` is `<prefix>_<YYYYMMDD>_<NNNNNNNN>` (8-digit sequence),
e.g. `ep_20260601_00000001`:
| Segment | Source |
|---|---|
| `<prefix>` | `Frontmatter.ENTRY_ID_PREFIX` (declared by the schema subclass) |
| `<YYYYMMDD>` | The daily-log file's date bucket |
| `<NNNNNNNN>` | Per-file sequence, 8-digit zero-padded, restarts at `00000001` each day per scope |
Implementation: [`core/persistence/markdown/entries.py`](../src/everos/core/persistence/markdown/entries.py)
(`EntryId.parse / format / next_for`).
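The encoding is simple enough to sketch in a few lines. `EntryIdSketch` below is a hypothetical stand-in, not the real `EntryId` class: the method names mirror the ones listed above, but the prefix character set and the rule that `next_for` returns `current_count + 1` are assumptions.

```python
import datetime as dt
import re
from dataclasses import dataclass

# Assumed shape: lowercase prefix, 8-digit date, 8-digit sequence.
_ENTRY_ID = re.compile(r"^(?P<prefix>[a-z][a-z0-9_]*)_(?P<date>\d{8})_(?P<seq>\d{8})$")


@dataclass(frozen=True)
class EntryIdSketch:
    prefix: str
    date: dt.date
    seq: int

    def format(self) -> str:
        """Render as <prefix>_<YYYYMMDD>_<NNNNNNNN>."""
        return f"{self.prefix}_{self.date:%Y%m%d}_{self.seq:08d}"

    @classmethod
    def parse(cls, text: str) -> "EntryIdSketch":
        match = _ENTRY_ID.match(text)
        if match is None:
            raise ValueError(f"malformed entry id: {text!r}")
        day = dt.datetime.strptime(match["date"], "%Y%m%d").date()
        return cls(match["prefix"], day, int(match["seq"]))

    @classmethod
    def next_for(cls, prefix: str, date: dt.date, current_count: int) -> "EntryIdSketch":
        # Assumption: the next sequence number is the file's entry count plus one.
        return cls(prefix, date, current_count + 1)
```

Zero-padding to a fixed width keeps ids lexicographically sortable within a file, which is convenient when scanning a daily log in order.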
> **File-level seq, not global**: the same `ep_20260601_00000001` may
> appear across two different `user_id`s (each user has its own daily file).
> Cross-table joins must therefore key on **`(scope_id, entry_id)`**
> rather than `entry_id` alone — see SQLite/LanceDB tables that follow.
## 5. SQLite + LanceDB derived indexes
```
.index/
├── sqlite/
│   └── system.db        state / audit log / task queue / LSN watermark
│                        + per-kind business state tables (composite key)
└── lancedb/
    └── <kind>.lance/    one Arrow-based table per kind
                         stores text / vector / tags / metadata
```
- **SQLite** schema lives in
[`infra/persistence/sqlite/tables/`](../src/everos/infra/persistence/sqlite/tables/);
every business table that joins back to markdown declares a
`UniqueConstraint("user_id", "entry_id")` (or the symmetric `agent_id` variant for agent-scoped tables).
- **LanceDB** schemas live in
[`infra/persistence/lancedb/tables/`](../src/everos/infra/persistence/lancedb/tables/);
`Vector(N)` dimension matches the embedding model output.
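The effect of the composite key can be demonstrated with the standard-library `sqlite3` module. This is a sketch with plain SQL rather than the project's table definitions; the table name `episode_state` and both helper functions are hypothetical.

```python
import sqlite3


def open_demo_db() -> sqlite3.Connection:
    """In-memory database with a (user_id, entry_id) uniqueness rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE episode_state (
            user_id  TEXT NOT NULL,
            entry_id TEXT NOT NULL,
            UNIQUE (user_id, entry_id)
        )
        """
    )
    return conn


def record_entry(conn: sqlite3.Connection, user_id: str, entry_id: str) -> None:
    """Insert one row; raises sqlite3.IntegrityError on a duplicate pair."""
    conn.execute(
        "INSERT INTO episode_state (user_id, entry_id) VALUES (?, ?)",
        (user_id, entry_id),
    )
```

Two users may each own `ep_20260601_00000001`, because uniqueness is enforced on the pair; inserting the same pair twice is rejected.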
Both layers are **fully derivable from markdown** — wipe `.index/`
and the in-process cascade subsystem re-builds everything by scanning the
user-visible tree (the durable `md_change_state` SQLite queue covers
crash-recovery replay).
## 6. Atomic write semantics
`MarkdownWriter` uses a same-directory temp file
(`.<name>.tmp.<uuid>`) + `os.replace` for atomicity. Keeping the temp
file in the same directory guarantees `os.replace` is atomic on POSIX
(the rename is only atomic within a single filesystem).
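The pattern described here can be sketched with the standard library alone. This is not `MarkdownWriter` itself: `atomic_write_text` is a hypothetical function, and the `os.fsync` call is an added assumption that the source does not mention.

```python
import os
import uuid
from pathlib import Path


def atomic_write_text(target: Path, text: str) -> None:
    """Write text so readers see either the old file or the new one, never a mix."""
    # The temp file sits next to the target so the rename stays on one filesystem.
    tmp = target.with_name(f".{target.name}.tmp.{uuid.uuid4().hex}")
    try:
        with open(tmp, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # assumption: flush to disk before the rename
        os.replace(tmp, target)
    except BaseException:
        tmp.unlink(missing_ok=True)  # do not leave stray temp files behind
        raise
```

A temp file in `/tmp` would not give the same guarantee when the memory-root lives on a different filesystem, which is why the temp name is derived from the target's own directory.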
`MarkdownWriter.append_entry` reads → merges frontmatter →
appends an entry block → atomic write back. The caller passes a full
`EntryId` (built via `EntryId.next_for(prefix, date, current_count)`);
this primitive is **schema-agnostic** — field-level semantics
(`entry_count` / `last_appended_at`) are a business writer's job
(see `BaseDailyAppender._frontmatter_updates` in
[`infra/persistence/markdown/writers/base.py`](../src/everos/infra/persistence/markdown/writers/base.py)).
## 7. References
- Code:
- [`core/persistence/memory_root.py`](../src/everos/core/persistence/memory_root.py) — memory-root resolution
- [`core/persistence/markdown/`](../src/everos/core/persistence/markdown/) — schema-agnostic read/write chassis
- [`infra/persistence/markdown/mds/`](../src/everos/infra/persistence/markdown/mds/) — per-kind frontmatter schemas
- [`infra/persistence/{markdown,sqlite,lancedb}/`](../src/everos/infra/persistence/) — business-aware adapters