chore: initialize EverOS 1.0.0

md-first memory extraction framework for AI agents.

Markdown is the single source of truth; SQLite holds state and LanceDB
provides the rebuildable vector + BM25 + scalar index. The codebase follows
a single-direction DDD layering (entrypoints -> service -> memory -> infra,
with component / core / config cross-cutting) enforced by import-linter.

Engineering surface:
- Coding conventions in .claude/rules/ (path-scoped) and workflows in
  .claude/skills/ (/commit, /new-branch, /pr).
- GitHub Actions CI runs make lint + test + integration; pre-commit mirrors
  the gates locally (ruff, hygiene hooks, gitlint commit-msg).
- Commit messages follow Conventional Commits, enforced by gitlint.
- make lint also enforces datetime two-zone discipline and OpenAPI drift.
This commit is contained in:
Elliot Chen
2026-06-05 22:35:51 +08:00
commit 518b8eca85
636 changed files with 160553 additions and 0 deletions

View File

@ -0,0 +1,123 @@
# End-to-end memorize test
In-process driver that pushes a realistic fixture through `service.memorize`,
batching by 6 messages per `/add` call and then `/flush` at the end.
## What's here
| File | Purpose |
|---|---|
| `fixtures/chat_session.json` | 22 messages · 3 topic shifts · multi-user (Alice → Bob) — chat-mode fixture |
| `fixtures/agent_session.json` | 21 items · 2 task threads · interleaved `tool_calls` / `tool` results — agent-mode fixture |
| `run.py` | In-process runner (no HTTP) |
## Prereqs
1. **LLM client configured** in `.env`:
- `EVEROS_LLM__API_KEY=...`
- `EVEROS_LLM__BASE_URL=...` (OpenAI-compatible)
- `EVEROS_LLM__MODEL=...` (defaults to `gpt-4o-mini`)
- Without these, the boundary stage logs `memorize_no_llm_client` and skips the run.
2. **Memory root**: defaults to `~/.everos`; override with `EVEROS_MEMORY__ROOT=...`.
3. **Mode** is read from `settings.memorize.mode` (toml/env) before the first `memorize()` call.
## Run
```bash
# Chat mode — boundary uses everalgo.boundary.detect_boundaries
EVEROS_MEMORIZE__MODE=chat uv run python scripts/e2e_memorize/run.py \
--fixture scripts/e2e_memorize/fixtures/chat_session.json \
--expected-mode chat
# Agent mode — boundary uses everalgo.agent_memory.AgentBoundaryDetector
# (filter→detect→remap; tool items preserved in cells)
EVEROS_MEMORIZE__MODE=agent uv run python scripts/e2e_memorize/run.py \
--fixture scripts/e2e_memorize/fixtures/agent_session.json \
--expected-mode agent
# Dry run (print batch plan, no LLM calls)
uv run python scripts/e2e_memorize/run.py \
--fixture scripts/e2e_memorize/fixtures/chat_session.json --dry-run
```
## What to verify after a run
### 1. Console output
Each batch prints `status=` (`accumulated` while buffering, `extracted` when
cells got cut). Final `flush` should be `extracted` if any cell remained
in the tail. The trailing file walker lists md / sqlite files modified
in the last 10 minutes.
### 2. Episode md (sync — 4A)
```
~/.everos/users/<owner_id>/episodes/episode-YYYY-MM-DD.md
```
- Chat fixture: 2 owners (`u_alice`, `u_bob`) — expect Episodes split into
~3-4 cells aligned with topic shifts (Python bug → weekend ramen → Q3
review → SRE handoff/ramen wrap).
- Agent fixture: 1 user (`u_alice`) — expect ~2 Episodes aligned with the
two task threads (latency rollback → DB index fix).
### 3. SQLite memcell rows
```bash
sqlite3 ~/.everos/.index/sqlite/system.db \
"select memcell_id, track, owner_id, owner_type, json_array_length(sender_ids_json) as senders
from memcell order by timestamp"
```
- Chat run: rows with `track=user_memory`, `owner_type=user`.
- Agent run: parallel rows for both tracks (`user_memory` **and**
`agent_memory`) since agent mode dispatches both pipelines.
### 4. Unprocessed buffer
```bash
sqlite3 ~/.everos/.index/sqlite/system.db \
"select session_id, count(*) from unprocessed_buffer
where track='memorize' group by session_id"
```
After `flush` the buffer should be empty for the test session.
### 5. OME async output (only if subscribers exist)
- `users/<owner>/atomic_facts/atomic_fact-YYYY-MM-DD.md` (always; `extract_atomic_facts` is registered)
- `users/<owner>/foresights/foresight-YYYY-MM-DD.md` (always; `extract_foresight` is registered)
- `agents/<agent>/agent_cases/agent_case-YYYY-MM-DD.md` (**only after `extract_agent_cases` strategy is written + registered** — currently absent, the emit is a no-op)
### 6. Reset between runs
The fixture's session_id is randomised per invocation, so previous runs
don't pollute the new one. To wipe everything:
```bash
rm -rf ~/.everos/users ~/.everos/agents ~/.everos/.index/sqlite/system.db
```
## Boundary expectations cheat sheet
### Chat fixture topic shifts (timestamps ms)
| Range | Topic |
|---|---|
| msgs 1-6 (`17473968001747397010`) | Python KeyError debugging |
| msgs 7-12 (`17474004001747400610`) | Weekend ramen plans |
| msgs 13-16 (`17474076001747407720`) | Q3 revenue review meeting prep |
| msgs 17-22 (`17474112001747411410`) | Bob joins, SRE handoff + ramen + Q3 deck deadline |
Boundary detector should cut on topic gaps; 3 cuts → 4 cells is the most likely outcome.
### Agent fixture task threads
| Range | Task |
|---|---|
| items 1-13 (`17473968001747397140`) | API latency spike → identify keepalive pool regression → rollback |
| items 14-21 (`17474004001747400720`) | DB connection pool exhaustion → find unindexed query → CREATE INDEX CONCURRENTLY |
Boundary detector should cut between item 13 and item 14 (timestamp jump
~55 minutes, topic flip). Tool items inside each cell stay attached to
their initiating chat turn.