Files
EverOS/docs/api.md
tomtan b243018aff
Some checks failed
CI / lint (push) Has been cancelled
CI / unit tests (push) Has been cancelled
CI / integration tests (push) Has been cancelled
CI / package build (push) Has been cancelled
Commit lint / pull request title (push) Has been cancelled
Commit lint / commit messages (push) Has been cancelled
Docs / links (push) Has been cancelled
fix(service): enhance message filtering to drop empty chat messages while retaining tool requests
2026-06-16 16:18:24 +08:00

1054 lines
43 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# EverOS HTTP API (v1)
Human-readable reference for the EverOS HTTP API. Schema names, types
and validation constraints mirror the OpenAPI spec served at
`GET /openapi.json` (when the server runs with `ENV=DEV`; see
[OpenAPI spec source](#openapi-spec-source)). This document adds the
business semantics the raw spec does not carry.
## Table of contents
- [Overview](#overview)
- [Base URL and versioning](#base-url-and-versioning)
- [Content type](#content-type)
- [Authentication](#authentication)
- [Response envelope](#response-envelope)
- [Eventual consistency](#eventual-consistency)
- [Conventions](#conventions)
- [ScopeId: `app_id` and `project_id`](#scopeid-app_id-and-project_id)
- [Errors](#errors)
- [Common types](#common-types)
- [MessageItem](#messageitem)
- [ContentItem](#contentitem)
- [ToolCall](#toolcall)
- [ToolFunction](#toolfunction)
- [FilterNode (filter DSL)](#filternode-filter-dsl)
- [SearchMethod](#searchmethod)
- [GetMemoryType](#getmemorytype)
- [Endpoints](#endpoints)
- [POST /api/v1/memory/add](#post-apiv1memoryadd)
- [POST /api/v1/memory/flush](#post-apiv1memoryflush)
- [POST /api/v1/memory/search](#post-apiv1memorysearch)
- [POST /api/v1/memory/get](#post-apiv1memoryget)
- [OpenAPI spec source](#openapi-spec-source)
## Overview
### Base URL and versioning
| Setting | Default | Override |
|---|---|---|
| Host | `127.0.0.1` (loopback only) | `EVEROS_API__HOST` env var or `--host` flag |
| Port | `8000` | `EVEROS_API__PORT` env var or `--port` flag |
| Version prefix | `/api/v1` | — |
All business endpoints documented here live under `/api/v1/memory/`.
The operational endpoints `GET /health` and `GET /metrics` exist but
are intentionally outside this reference — they are runtime probes for
deployment, not part of the application contract.
### Content type
All `POST` endpoints require `Content-Type: application/json`. Request
and response bodies are UTF-8 JSON.
### Authentication
EverOS ships **no built-in authentication**. The server binds to
`127.0.0.1` by default; place your own gateway or auth layer in front
before exposing the API on any other interface. See
[../SECURITY.md](../SECURITY.md) for the threat model.
### Response envelope
Successful (`200 OK`) responses always wrap the payload in:
```json
{
"request_id": "<32-char hex>",
"data": { /* endpoint-specific payload */ }
}
```
`request_id` is generated server-side (32 lowercase hex chars) and is
echoed by structured logs / tracing / metrics, so a single request can
be correlated end-to-end. Error responses carry their own envelope —
`request_id` at the top level alongside a nested `error` object (not a
bare FastAPI `detail`); see [Errors](#errors).
### Eventual consistency
`/add` and `/flush` write the markdown file (the source of truth)
**synchronously** — when the call returns with `status: "extracted"`,
the new entry exists on disk. The LanceDB vector / BM25 / scalar index
is rebuilt by the in-process **cascade coroutine asynchronously**.
That means `/search` and `/get` may not see a record immediately after
the `/flush` that produced it. Typical sync latency is sub-second, but
under load it can reach ~1015 seconds. If you need read-your-write
semantics, retry with backoff. The markdown file is durable
regardless; index lag never loses data.
### Conventions
#### Timestamps
EverOS runs a **two-zone discipline**: every stored byte is UTC; every
rendered value carries the configured **display timezone**. The
display timezone is set by `EVEROS_MEMORY__TIMEZONE` (env var) or
`[memory] timezone` (TOML); default `"UTC"`.
| Direction | Format | Notes |
|---|---|---|
| **Request**`messages[].timestamp` (`/add`) | Integer, **Unix epoch milliseconds (ms)** | Server auto-detects `>= 10^12` as ms for backward compat, but the contract is ms |
| **Request**`filters.timestamp.{gte,lt,…}` (`/search` / `/get`) | Integer, **Unix epoch ms** *or* ISO-8601 string | Same auto-detect; pick whichever you have |
| **Response** — every `timestamp` field | ISO-8601 string with **explicit timezone offset** (`+HH:MM` or `Z`) | Always rendered in the configured display tz; never naive |
With the default `EVEROS_MEMORY__TIMEZONE=UTC` an episode produced
at 11:30 UTC renders as `"2026-05-28T11:30:36Z"` (Pydantic
canonicalises `timezone.utc` to `Z`). Switch to
`EVEROS_MEMORY__TIMEZONE=Asia/Shanghai` and the same UTC instant
renders as `"2026-05-28T19:30:36+08:00"` — bytes on disk are
unchanged.
**Fallback for naive input.** If you submit a `filters.timestamp` ISO
string without an offset (e.g. `"2026-05-28T11:30:00"`), the server
treats it as already display-tz-local before comparing against
storage. This is the same rule users see when reading rendered output:
"if you didn't say a zone, we assume your zone."
**For internal architecture.** The storage / display split lives in
`everos.component.utils.datetime`
(`get_utc_now` / `ensure_utc` / `UtcDatetime` for storage;
`get_now_with_timezone` / `to_display_tz` for display). See
[datetime.md](datetime.md) for the design rationale.
#### Other conventions
- **Server-generated IDs** follow `<owner>_<kind>_<YYYYMMDD>_<NNN>`,
e.g. `alice_ep_20260528_00000001` for an episode, `alice_af_...`
for an atomic fact. See
[storage_layout.md §4](storage_layout.md) for the encoding.
- **All endpoints are POST** for `/api/v1/memory/*` even when the
semantics look like a read (`/search`, `/get`) — the request bodies
are too rich (filters, methods, paging) to encode in a query string.
### ScopeId: `app_id` and `project_id`
`app_id` and `project_id` partition memory at the disk layer. Every
write lands under
`~/.everos/<app>/<project>/users/<user_id>/...` (or `.../agents/...`
for the agent track). The default scope materialises on disk as
`default_app` / `default_project` (the `_app` / `_project` suffix is
added only for the literal id `"default"` so the default space stays
visually distinct from user-named scopes).
A `/search` or `/get` query never crosses scopes — different
`(app_id, project_id)` pairs are isolated.
Both fields share the same validation:
| Setting | Value |
|---|---|
| Type | `string` |
| Default | `"default"` |
| Length | 1128 chars |
| Charset | `^[a-zA-Z0-9_.-]+$` |
| Rejected literals | `"."` and `".."` (path-traversal guard) |
## Errors
Every non-2xx response uses a uniform error envelope — `request_id` at
the top level (mirroring the success envelope) alongside a nested
`error` object. It is **not** a bare FastAPI `detail`:
```json
{
"request_id": "<32-char hex>",
"error": {
"code": "HTTP_ERROR",
"message": "Value error, exactly one of user_id / agent_id must be provided",
"timestamp": "2026-06-01T12:24:46+00:00",
"path": "/api/v1/memory/search"
}
}
```
| HTTP | `error.code` | `error.message` | When |
|---|---|---|---|
| `415 Unsupported Media Type` | `HTTP_ERROR` | the parse-failure reason | `/add` only — a `ContentItem` could not be parsed (unsupported modality for the configured multimodal LLM, or a payload that cannot be fetched / dispatched) |
| `422 Unprocessable Entity` | `HTTP_ERROR` | the **first** validation error (see below) | Request-body validation failure. Also covers `/search` / `/get` filter-DSL compile errors — the compile reason rides in `message` |
| `500 Internal Server Error` | `SYSTEM_ERROR` | `"Internal server error"` (fixed; internal details are logged, never leaked) | Unhandled exception caught by the global handler |
### error object
| Field | Type | Description |
|---|---|---|
| `code` | `string` | `"HTTP_ERROR"` for 4xx (validation / business / `HTTPException`); `"SYSTEM_ERROR"` for 5xx |
| `message` | `string` | Human-readable reason. For `422`, **only the first** validation error is surfaced, formatted `"<msg>: <dotted-loc>"` with the leading `body` segment stripped (e.g. `"Field required: messages"`); a model-level validator with no field location surfaces just `"<msg>"` (e.g. the XOR example above) |
| `timestamp` | `string` | ISO-8601 with timezone offset (display tz) |
| `path` | `string` | Request path, e.g. `/api/v1/memory/add` |
> Unlike FastAPI's default, the full per-field validation array is **not**
> returned — only the first error's message. A client that needs the
> offending field can read the `<loc>` suffix in `message`.
## Common types
### MessageItem
One turn in a `/add` batch. Shape mirrors the OpenAI Chat Completions
message structure plus a stable `sender_id` for indexing.
| Field | Type | Required | Default | Constraints |
|---|---|---|---|---|
| `sender_id` | `string` | yes | — | `minLength=1` |
| `sender_name` | `string \| null` | no | `null` | — |
| `role` | `"user" \| "assistant" \| "tool"` | yes | — | — |
| `timestamp` | `integer` | yes | — | `> 0`**Unix epoch milliseconds (ms)** per v1 contract |
| `content` | `string \| array<ContentItem>` | yes | — | — |
| `tool_calls` | `array<ToolCall> \| null` | no | `null` | — |
| `tool_call_id` | `string \| null` | no | `null` | — |
**`sender_id`** — Stable identifier of the entity producing this turn.
For `role: "user"`, this is the `user_id` the server will index the
extracted memory under: markdown lands at
`users/<sender_id>/episodes/...`, and `/search` / `/get` queries with
`user_id: "<sender_id>"` reach it. For `role: "assistant"` or
`"tool"`, the field is informational (the LLM sees it during
extraction but it is not used as an indexing key).
**`sender_name`** — Optional human-readable display name. Lets the LLM
use a real name during extraction without changing the indexing key.
For example, set `sender_id: "u_42"` (stable internal id) and
`sender_name: "Alice"` (what the LLM sees).
**`role`** — Speaker role; one of:
- `"user"` — content originates from a human (or human-acting client).
- `"assistant"` — content from the AI assistant.
- `"tool"` — output of a tool call; pair with `tool_call_id`.
**`timestamp`** — Wall-clock anchor in **Unix epoch milliseconds (ms)**
per the v1 API contract. Used by extraction to anchor the resulting
episode and by the daily-log writer to bucket the entry into the right
file (`episode-<YYYY-MM-DD>.md` etc.).
> Implementation note: the algo layer auto-detects seconds vs ms (values
> `>= 10^12` are treated as ms, smaller as seconds) for backward compat,
> but **clients should send ms** to honour the contract.
**`content`** — The message body.
- A bare **string** is shorthand for a single text content item.
- An **array of `ContentItem`** is for mixed input (`text` / `md` +
image / pdf / audio / ...). `md` items are read as UTF-8 text;
media/document items are parsed by the multimodal LLM configured via
`EVEROS_MULTIMODAL__*` env vars. See [ContentItem](#contentitem).
**`tool_calls`** — When `role: "assistant"`, the tool calls the
assistant emitted in this turn (OpenAI Chat Completions shape).
**`tool_call_id`** — When `role: "tool"`, the `id` of the call this
message is the response to.
### ContentItem
Mixed message-body element. Carry the payload in exactly one of `text` /
`uri` / `base64`; the others must be `null`. For `type: "text"` use
`text`. For `type: "md"` use `text`, a server-local `file://` URI, or
`base64` UTF-8 bytes. For every **non-text, non-md** type use `uri`
(`http(s)://`) or `base64` (with `ext`). Non-text, non-md items are
routed through the multimodal parser, which needs a fetchable or
decodable payload — passing only `text` returns `415`.
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `type` | `"text" \| "md" \| "image" \| "audio" \| "doc" \| "pdf" \| "html" \| "email"` | yes | — | — |
| `text` | `string \| null` | no | `null` | Required when `type: "text"`; optional inline Markdown when `type: "md"` |
| `uri` | `string \| null` | no | `null` | `http(s)://` (fetched server-side) or `file://` (read from the server's local fs, guardrailed) pointer |
| `base64` | `string \| null` | no | `null` | Inline payload, plain base64 (no `data:` prefix) |
| `ext` | `string \| null` | no | `null` | File-extension hint when `uri` lacks one |
| `name` | `string \| null` | no | `null` | Display filename, used in logs |
| `extras` | `object \| null` | no | `null` | Provider-specific metadata, opaque to EverOS |
**`type`** — The content kind. `text` and `md` are treated as text.
Each other type is dispatched to the multimodal LLM. If the multimodal
endpoint cannot handle the supplied payload, `/add` returns
`415 Unsupported Media Type`.
**`text`** — The literal text payload; valid for `type: "text"` and
inline `type: "md"`. A non-text, non-md type (including `"html"`) is
always routed to the parser and must carry `uri` or `base64`; passing
only `text` on those items returns `415`. To inline HTML as plain text,
send it as `type: "text"`.
**`uri`** — `http(s)://` or `file://` pointer to the asset. For
`type: "md"`, only `file://` is supported and the file is decoded as
UTF-8 text. For parser-backed content, an
`http(s)` uri is fetched by the server and dispatched by the response
Content-Type (use it for assets hosted elsewhere — S3 / OSS presigned
URL, http server). A `file://` uri is read from the **server's** local
filesystem (the path must be reachable by the server process), subject
to these guardrails:
- the resolved path (symlinks followed) must be an existing regular file;
- its size must be ≤ `EVEROS_MULTIMODAL__FILE_URI_MAX_BYTES` (default 50 MiB);
- when `EVEROS_MULTIMODAL__FILE_URI_ALLOW_DIRS` is set, the path must lie
within one of the listed roots (unset = any readable file, the
local-first default).
A guardrail violation surfaces as `415`. Mutually exclusive with `base64`.
**`base64`** — Inline binary payload, base64-encoded (no `data:`
prefix). Mutually exclusive with `uri`.
> **Size caution.** base64 inflates the payload ~4/3× on the wire, and
> the encoded blob is held **verbatim in the server's staging buffer
> (SQLite) from `/add` until the session is flushed** — a multi-MB PDF
> becomes multi-MB of SQLite text for the buffer's lifetime. It is *not*
> persisted past extraction (the memory cell and episode store only the
> parsed text, never the raw bytes), but the transient footprint is
> real, and a large inline blob also slows request parsing. **Prefer
> `uri` (`http(s)://`) for large assets**: the buffer then holds only
> the URL plus the parsed text, and the bytes are fetched transiently at
> parse time rather than stored. Reserve `base64` for small assets or
> when no reachable URL exists.
**`ext`** — File-extension hint (`"pdf"`, `"png"`, `"html"`, ...). For
`base64` payloads this drives modality dispatch and is effectively
**required** (without it the server falls back to `mime`, then `415`s
if neither resolves). For `uri` payloads it is optional — the fetched
Content-Type usually suffices.
**`name`** — Filename / human label for logging and traceability.
Does not affect parsing.
**`extras`** — Free-form bag of provider-specific metadata (e.g.
caption, page hints). Opaque to EverOS; passed through to the
multimodal LLM context.
### ToolCall
OpenAI-shaped tool invocation attached to an assistant turn.
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `id` | `string` | yes | — | Stable id; echoed by the paired `role: "tool"` turn via `tool_call_id` |
| `type` | `string` | no | `"function"` | Only `"function"` is meaningful today |
| `function` | [ToolFunction](#toolfunction) | yes | — | The function being called |
### ToolFunction
| Field | Type | Required | Notes |
|---|---|---|---|
| `name` | `string` | yes | Function name as registered with the agent |
| `arguments` | `string` | yes | **JSON-encoded string** (OpenAI convention — not an object) |
### FilterNode (filter DSL)
A recursive boolean tree of predicates. Used by `/search.filters` and
`/get.filters`. The Pydantic envelope only checks the recursive
combinator shape; field-level validity (which scalar fields are
filterable, which operators apply, value coercion) runs when the
node is compiled to a LanceDB `where` clause server-side. Compile
errors surface as `422` with the offending field / operator in
`error.message`.
#### Shape
A node is a JSON object whose keys are one of:
| Key | Value | Notes |
|---|---|---|
| `AND` | `array<FilterNode>` | All child nodes must match. Omit if not needed |
| `OR` | `array<FilterNode>` | At least one child node must match. Omit if not needed |
| *<allowed field>* | scalar or operator map | Predicate on that field — see [Allowed fields](#filter-allowed-fields) and [Operators](#filter-operators) |
`AND`, `OR`, and scalar predicates **mix freely at the same level**;
they are implicitly joined with `AND`. A node with only scalar keys is
fine — `AND` / `OR` arrays are not required.
Example — `session_id == "demo-001"` AND (`sender_id` contains `"alice"` OR `"bob"`):
```json
{
"session_id": "demo-001",
"OR": [
{ "sender_id": "alice" },
{ "sender_id": "bob" }
]
}
```
<a id="filter-allowed-fields"></a>
#### Allowed fields
These are the only field names that may appear as a predicate key.
Anything else returns `422`.
| Field | Column kind | Value type | Notes |
|---|---|---|---|
| `session_id` | string | `string` | Session the row was extracted from |
| `parent_type` | string | `string` | Parent kind, e.g. `"memcell"` |
| `parent_id` | string | `string` | Parent id |
| `timestamp` | timestamp | `integer` (Unix epoch **ms** per v1 contract) or `string` (ISO-8601) | Implementation auto-detects `>= 10^12` as ms / smaller as sec for backward compat, but the contract is ms |
| `sender_id` | array of strings | `string` | Matches via `array_has(sender_ids, <value>)` — checks whether the row's `sender_ids` list contains the given id |
The following names are **reserved** and rejected inside `filters`.
They must be set at the top of the request (`/search` or `/get`),
not inside the DSL:
```
owner_id owner_type app_id project_id
```
<a id="filter-operators"></a>
#### Operators
A predicate value may be a **scalar shorthand** (equality) or an
**operator map** (multiple operators in one dict are AND-joined):
| Operator | SQL | Applies to |
|---|---|---|
| `eq` | `=` | string, timestamp; on `sender_id` becomes `array_has(...)` |
| `ne` | `!=` | string, timestamp |
| `gt` | `>` | timestamp |
| `gte` | `>=` | timestamp |
| `lt` | `<` | timestamp |
| `lte` | `<=` | timestamp |
| `in` | `IN (...)` | string (list of values); on `sender_id` becomes `OR` of `array_has` |
Examples:
```json
// Equality shorthand
{ "session_id": "demo-001" }
// Operator map (operators inside one dict are AND-joined)
{ "timestamp": { "gte": 1748390400000, "lt": 1748400000000 } }
// IN list
{ "session_id": { "in": ["demo-001", "demo-002"] } }
// Array-membership on sender_ids
{ "sender_id": "alice" } // array_has(sender_ids, 'alice')
{ "sender_id": { "in": ["alice", "bob"] } } // OR of two array_has
```
### SearchMethod
| Value | Behaviour |
|---|---|
| `"keyword"` | BM25 only — pure lexical match, no embedding cost |
| `"vector"` | Dense vector ANN only — semantic recall, no lexical |
| `"hybrid"` *(default)* | Reciprocal-rank fuse of BM25 + vector + optional scalar filter in a single LanceDB query |
| `"agentic"` | Iterative cluster-path retrieval driven by a cross-encoder rerank loop; higher quality at higher latency / cost |
`"hybrid"` is the default because it balances recall and precision
with one LanceDB roundtrip. `"agentic"` calls the LLM in a loop and
should be reserved for offline or background workflows.
### GetMemoryType
| Value | Track | Returned in `data.<plural>` |
|---|---|---|
| `"episode"` | user | `data.episodes` — [GetEpisodeItem](#getepisodeitem) |
| `"profile"` | user | `data.profiles` — [GetProfileItem](#getprofileitem) |
| `"agent_case"` | agent | `data.agent_cases` — [GetAgentCaseItem](#getagentcaseitem) |
| `"agent_skill"` | agent | `data.agent_skills` — [GetAgentSkillItem](#getagentskillitem) |
`memory_type` must match the requested owner kind: `"episode"` /
`"profile"` require `user_id`; `"agent_case"` / `"agent_skill"`
require `agent_id`. The mismatching combinations are rejected with
`422`.
## Endpoints
### POST /api/v1/memory/add
Append a batch of messages to a session buffer. The server
accumulates messages until the boundary detector decides the session
is complete (or you force it with `/flush`); on boundary the LLM
extracts a memory cell and the writers persist markdown + index rows.
The response distinguishes "buffered" from "extracted this call" via
the `status` field — see [Response body](#response-body) below.
#### Request body
| Field | Type | Required | Default | Constraints |
|---|---|---|---|---|
| `session_id` | `string` | yes | — | length 1128 |
| `app_id` | `string` *(ScopeId)* | no | `"default"` | see [ScopeId](#scopeid-app_id-and-project_id) |
| `project_id` | `string` *(ScopeId)* | no | `"default"` | see [ScopeId](#scopeid-app_id-and-project_id) |
| `messages` | `array<MessageItem>` | yes | — | 1500 items |
**`session_id`** — Identifies the conversation buffer on the server.
Messages POSTed with the same `(session_id, app_id, project_id)`
accumulate into one extraction batch. Different `session_id`s never
share a buffer, even within the same scope.
**`app_id` / `project_id`** — Scope identifiers; see
[ScopeId](#scopeid-app_id-and-project_id).
**`messages`** — Ordered list of [MessageItem](#messageitem) to
append. Order matters (it is preserved when the buffer is later
extracted). The 1500 cap is a per-request safety bound, not a
session lifetime cap — you can call `/add` many times for the same
`session_id`.
#### Response body
`200 OK` returns a SuccessEnvelope wrapping:
| Field | Type | Notes |
|---|---|---|
| `message_count` | `integer` | Number of messages accepted in this call |
| `status` | `"accumulated" \| "extracted"` | What the server did with the batch |
**`message_count`** — Always equal to `len(messages)` on a `200`
there is no partial accept. The field is present mainly for log
correlation.
**`status`** — Two-state outcome:
- `"accumulated"` — Messages were added to the buffer; the boundary
detector did **not** trigger this call. The buffer remains open.
- `"extracted"` — During this call the boundary detector tripped; the
LLM ran extraction and the writers persisted at least one memory
cell. The buffer is empty after the call returns.
#### cURL example
```bash
TS=$(date +%s)
curl -X POST http://127.0.0.1:8000/api/v1/memory/add \
-H 'Content-Type: application/json' \
-d "{
\"session_id\": \"demo-002\",
\"app_id\": \"default\",
\"project_id\": \"default\",
\"messages\": [
{\"sender_id\": \"alice\", \"role\": \"user\", \"timestamp\": $TS, \"content\": \"I love climbing in Yosemite every spring.\"},
{\"sender_id\": \"alice\", \"role\": \"user\", \"timestamp\": $((TS+10)), \"content\": \"My favorite coffee shop is Blue Bottle in SOMA.\"},
{\"sender_id\": \"alice\", \"role\": \"user\", \"timestamp\": $((TS+20)), \"content\": \"I bike to work most days.\"}
]
}"
```
Response (real capture):
```json
{
"request_id": "ae78d3f689c941eea135893e702fd171",
"data": {
"message_count": 3,
"status": "accumulated"
}
}
```
### POST /api/v1/memory/flush
Force the boundary detector to decide **now** for the given session
buffer. The LLM runs extraction (one call) regardless of whether the
heuristic would have tripped on its own. Useful at the end of a chat
or agent run to make sure pending context becomes durable memory.
#### Request body
| Field | Type | Required | Default | Constraints |
|---|---|---|---|---|
| `session_id` | `string` | yes | — | length 1128 |
| `app_id` | `string` *(ScopeId)* | no | `"default"` | — |
| `project_id` | `string` *(ScopeId)* | no | `"default"` | — |
**`session_id`** — Identifies which buffer to flush. Must match the
`session_id` of prior `/add` calls in the same `(app_id, project_id)`
scope.
**`app_id` / `project_id`** — Scope identifiers; see
[ScopeId](#scopeid-app_id-and-project_id).
#### Response body
| Field | Type | Notes |
|---|---|---|
| `status` | `"extracted" \| "no_extraction"` | What happened |
**`status`**:
- `"extracted"` — The buffer had at least one message; extraction ran
and produced at least one memory cell.
- `"no_extraction"` — The buffer was empty (no prior `/add` calls for
this `(session_id, app_id, project_id)`, or it was already flushed).
`/flush` is synchronous with respect to markdown persistence: by the
time the response returns, the new entry is on disk. LanceDB index
sync is still asynchronous — see
[Eventual consistency](#eventual-consistency).
#### cURL example
```bash
curl -X POST http://127.0.0.1:8000/api/v1/memory/flush \
-H 'Content-Type: application/json' \
-d '{"session_id":"demo-002","app_id":"default","project_id":"default"}'
```
Response (real capture; this call took several seconds because of the
extraction LLM call):
```json
{
"request_id": "e65bcf4f56c042e39cdf50866810672c",
"data": {
"status": "extracted"
}
}
```
### POST /api/v1/memory/search
Hybrid retrieval over the memory store. Combines BM25, dense vector
ANN, optional scalar filtering, optional cross-encoder rerank, and
optional final LLM rerank. Returns ranked items grouped by kind.
#### Request body
| Field | Type | Required | Default | Constraints |
|---|---|---|---|---|
| `user_id` | `string \| null` | XOR with `agent_id` | `null` | `minLength=1` if set |
| `agent_id` | `string \| null` | XOR with `user_id` | `null` | `minLength=1` if set |
| `app_id` | `string` *(ScopeId)* | no | `"default"` | — |
| `project_id` | `string` *(ScopeId)* | no | `"default"` | — |
| `query` | `string` | yes | — | `minLength=1` |
| `method` | [SearchMethod](#searchmethod) | no | `"hybrid"` | — |
| `top_k` | `integer` | no | `-1` | `-1` or `1..100` |
| `radius` | `number \| null` | no | `null` | `0.0 ≤ x ≤ 1.0` if set |
| `include_profile` | `boolean` | no | `false` | — |
| `enable_llm_rerank` | `boolean` | no | `false` | — |
| `filters` | [FilterNode](#filternode-filter-dsl) `\| null` | no | `null` | — |
**`user_id` / `agent_id`** — **Exactly one** must be set. Determines
which track is searched: `user_id` → user-memory (episodes /
profiles); `agent_id` → agent-memory (cases / skills).
**`app_id` / `project_id`** — Scope identifiers; results never cross
scopes.
**`query`** — The retrieval query. The same string is fed to BM25
tokenization and to the embedding model, then the two recall lists
are fused.
**`method`** — Retrieval method; see [SearchMethod](#searchmethod).
Default `"hybrid"` is the recommended starting point.
**`top_k`** — Maximum number of items to return per kind. Two modes:
- `-1` (default) — "Use server defaults." Recall falls back to a
fixed internal cap, and a server-side `radius` default kicks in if
the caller did not pass one (see `radius` below). Use for "give me
whatever you find worth returning."
- `1..100` — Explicit cap. Recall is sized as `top_k × multiplier`
and **no** default `radius` is applied unless you set it yourself.
Values of `0` or outside `-1` / `1..100` are rejected with `422`.
**`radius`** — Optional **cosine-similarity threshold** in `[0.0, 1.0]`.
Candidates whose score is below this value are dropped — use it to cut
a long tail of weak matches. Three-level fallback for the effective
radius:
1. Caller-supplied `radius` (including the literal `0.0`) always wins.
2. With `top_k=-1` and no caller-supplied `radius`, a server-side
default radius kicks in.
3. With `top_k>0` and no caller-supplied `radius`, no threshold is
applied (`null`).
**`include_profile`** — When `user_id` is set, also fetch the user's
profile and include it in `data.profiles`. The profile is not
ranked; `score` is `null`. Ignored when `agent_id` is set.
**`enable_llm_rerank`** — Opt-in LLM rerank pass for
`method: "hybrid"`. Applies to `agent_case` and `agent_skill` fusion
only; the episode hierarchy path has built-in fact eviction and
ignores this flag. Adds one LLM call per request. Ignored by
`keyword` / `vector` (no fusion to rerank) and `agentic` (uses its
own cross-encoder loop).
**`filters`** — Optional filter-DSL node; see
[FilterNode](#filternode-filter-dsl). Applied **before** ranking, so
it does not perturb the ranker.
#### Response body
`200 OK` returns a SuccessEnvelope wrapping `SearchData`. All five
arrays are always present so client code can iterate without
branching on owner type; arrays that do not apply to the requested
owner kind stay as `[]`.
| Field | Type | Notes |
|---|---|---|
| `episodes` | `array<SearchEpisodeItem>` | Populated when `user_id` is set |
| `profiles` | `array<SearchProfileItem>` | Populated when `user_id` is set **and** `include_profile=true` |
| `agent_cases` | `array<SearchAgentCaseItem>` | Populated when `agent_id` is set |
| `agent_skills` | `array<SearchAgentSkillItem>` | Populated when `agent_id` is set |
| `unprocessed_messages` | `array<UnprocessedMessageDTO>` | Populated **only** when `filters.session_id` is a top-level eq scalar; otherwise stays `[]`. Independent of `user_id` / `agent_id` (buffer rows have no owner attribution — boundary detection runs before owner inference) |
#### SearchEpisodeItem
User-track conversation episode hit. `score` is the fused retrieval
score; `atomic_facts` lists single-sentence facts that matched the
query within this episode (already nested, no separate call needed).
| Field | Type | Notes |
|---|---|---|
| `id` | `string` | `<user_id>_ep_<YYYYMMDD>_<NNN>` |
| `user_id` | `string \| null` | Owner of this episode |
| `app_id` | `string` | Scope where the episode lives |
| `project_id` | `string` | Scope where the episode lives |
| `session_id` | `string` | The `session_id` whose messages produced this episode |
| `timestamp` | `string` | ISO-8601 with timezone offset; default `UTC` renders as `Z`, non-UTC defaults render as `±HH:MM` — see [Conventions](#conventions) |
| `sender_ids` | `array<string>` | Distinct `sender_id`s in the underlying messages |
| `summary` | `string` | Short summary (~200 chars), suitable for list-view rendering |
| `subject` | `string` | One-line subject / title |
| `episode` | `string` | Full extracted narrative |
| `type` | `"Conversation"` | Reserved; today only conversation-derived episodes ship |
| `score` | `number` | Fused retrieval score for this episode |
| `atomic_facts` | `array<SearchAtomicFactItem>` | Sub-facts extracted from the same episode that matched the query |
#### SearchAtomicFactItem
A single-sentence fact pulled out of an episode during extraction.
| Field | Type | Notes |
|---|---|---|
| `id` | `string` | `<user_id>_af_<YYYYMMDD>_<NNN>` |
| `content` | `string` | The fact as a single sentence |
| `score` | `number` | Same score scale as the parent episode |
#### SearchProfileItem
User profile. Only populated when `include_profile=true` and
`user_id` is set.
| Field | Type | Notes |
|---|---|---|
| `id` | `string` | Profile id |
| `user_id` | `string \| null` | Owner |
| `app_id` | `string` | Scope |
| `project_id` | `string` | Scope |
| `profile_data` | `object` | Free-form structured profile fields produced by extraction (schema is profile-specific) |
| `score` | `number \| null` | `null` for direct fetches (no query-aware profile ranking yet) |
#### SearchAgentCaseItem
Agent-track case hit. Returned only when the request uses `agent_id`.
| Field | Type | Notes |
|---|---|---|
| `id` | `string` | `<agent_id>_ac_<YYYYMMDD>_<NNN>` |
| `agent_id` | `string` | Owner |
| `app_id` | `string` | Scope |
| `project_id` | `string` | Scope |
| `session_id` | `string` | The agent run this case came from |
| `task_intent` | `string` | What the agent was trying to do (one-sentence framing) |
| `approach` | `string` | How the agent went about it |
| `quality_score` | `number` | Quality assessed at extraction time, in `[0, 1]` |
| `key_insight` | `string \| null` | One-line takeaway; `null` if extraction did not produce one |
| `timestamp` | `string` | ISO-8601 UTC of when the run happened |
| `score` | `number` | Fused retrieval score for this case |
#### SearchAgentSkillItem
Agent-track skill hit — a named procedural memory produced by
clustering related cases.
| Field | Type | Notes |
|---|---|---|
| `id` | `string` | `<agent_id>_sk_<YYYYMMDD>_<NNN>` |
| `agent_id` | `string` | Owner |
| `app_id` | `string` | Scope |
| `project_id` | `string` | Scope |
| `name` | `string` | Skill name (a stable, human-meaningful slug) |
| `description` | `string` | One-line description |
| `content` | `string` | Full skill body (procedure / playbook text) |
| `confidence` | `number` | Extraction confidence in `[0, 1]` |
| `maturity_score` | `number` | Maturity assessed at clustering time, in `[0, 1]` |
| `source_case_ids` | `array<string>` | `agent_case` ids that contributed to this skill |
| `score` | `number` | Fused retrieval score for this skill |
#### UnprocessedMessageDTO
A raw message still sitting in the boundary-detection buffer — sent to
`/add` but not yet carved into an episode / case. Returned **only**
when the request's `filters` contains `session_id` as a top-level eq
scalar (`{"session_id": "<sid>"}`); compound shapes (`AND` / `OR`
combinators, operator maps such as `{"eq": ...}` / `{"in": ...}`) do
not trigger the lookup because there is no defensible buffer-scope
mapping for them. Buffer rows have no `user_id` / `agent_id`
attribution, so `session_id` is the only meaningful query dimension.
| Field | Type | Notes |
|---|---|---|
| `id` | `string` | Original `message_id` from `/add` |
| `app_id` | `string` | Scope where the buffered message lives |
| `project_id` | `string` | Scope where the buffered message lives |
| `session_id` | `string` | Matches the requested `filters.session_id` |
| `sender_id` | `string` | Original sender id from `/add` |
| `sender_name` | `string \| null` | Original sender name; `null` if not provided |
| `role` | `"user" \| "assistant" \| "tool"` | Original role |
| `content` | `string \| array<object>` | `string` for the single-text shorthand, `array` of opaque content items for the original multimodal payload (mirrors [MessageItem.content](#messageitem)) |
| `timestamp` | `string` | ISO-8601 with timezone offset — see [Conventions](#conventions) |
| `tool_calls` | `array<object> \| null` | Original tool_calls payload if any |
| `tool_call_id` | `string \| null` | Original tool_call_id if any |
#### cURL example
```bash
curl -X POST http://127.0.0.1:8000/api/v1/memory/search \
-H 'Content-Type: application/json' \
-d '{
"user_id": "alice",
"app_id": "default",
"project_id": "default",
"query": "Where do I like to climb?",
"top_k": 5
}'
```
Response (real capture):
```json
{
"request_id": "3a8ad3dcd2484fe4a05ccbd89d438704",
"data": {
"episodes": [
{
"id": "alice_ep_20260528_00000001",
"user_id": "alice",
"app_id": "default",
"project_id": "default",
"session_id": "demo-002",
"timestamp": "2026-05-28T11:30:36Z",
"sender_ids": ["alice"],
"summary": "On May 28, 2026 at 11:30 AM UTC, Alice shared that she loves climbing in Yosemite every spring...",
"subject": "Alice's Outdoor Activities and Daily Routine on May 28, 2026: Climbing, Coffee, and Biking",
"episode": "On May 28, 2026 at 11:30 AM UTC, Alice shared that she loves climbing in Yosemite every spring, highlighting a recurring seasonal activity. She also mentioned that her favorite coffee shop is Blue Bottle located in SOMA, indicating a preferred local spot. Additionally, Alice stated that she bikes to work most days, describing a habitual mode of transportation.",
"type": "Conversation",
"score": 0.6298899054527283,
"atomic_facts": [
{
"id": "alice_af_20260528_00000001",
"content": "Alice said she loves climbing in Yosemite every spring.",
"score": 0.6298899054527283
}
]
}
],
"profiles": [],
"agent_cases": [],
"agent_skills": [],
"unprocessed_messages": []
}
}
```
### POST /api/v1/memory/get
Paginated listing of memory records of a given kind for a single
owner. No ranking — ordering is `sort_by` × `sort_order` only. Used
for UI browsing, exports, or filtered scans.
#### Request body
| Field | Type | Required | Default | Constraints |
|---|---|---|---|---|
| `user_id` | `string \| null` | XOR with `agent_id` | `null` | `minLength=1` if set |
| `agent_id` | `string \| null` | XOR with `user_id` | `null` | `minLength=1` if set |
| `app_id` | `string` *(ScopeId)* | no | `"default"` | — |
| `project_id` | `string` *(ScopeId)* | no | `"default"` | — |
| `memory_type` | [GetMemoryType](#getmemorytype) | yes | — | — |
| `page` | `integer` | no | `1` | `≥ 1` |
| `page_size` | `integer` | no | `20` | `1..100` |
| `sort_by` | `"timestamp" \| "updated_at"` | no | `"timestamp"` | — |
| `sort_order` | `"asc" \| "desc"` | no | `"desc"` | — |
| `filters` | [FilterNode](#filternode-filter-dsl) `\| null` | no | `null` | — |
**`user_id` / `agent_id`** — **Exactly one** must be set, and it must
match the track implied by `memory_type` (`"episode"` / `"profile"`
require `user_id`; `"agent_case"` / `"agent_skill"` require
`agent_id`).
**`app_id` / `project_id`** — Scope identifiers.
**`memory_type`** — Which item kind to list; see
[GetMemoryType](#getmemorytype). The route populates exactly one of
the four arrays in `data` based on this value.
**`page`** — 1-indexed page number. Together with `page_size`
determines the window. The response's `total_count` reports how many
items match the request before paging.
**`page_size`** — Items per page, in `1..100`. Default 20.
**`sort_by`** — Column to sort by. Note: for `memory_type` values
where `"timestamp"` does not apply (e.g. `"profile"` has no
timestamp, `"agent_skill"` is a named entity), the server silently
falls back to `"updated_at"`.
**`sort_order`** — `"desc"` (newest first, default) or `"asc"`.
**`filters`** — Optional [FilterNode](#filternode-filter-dsl) for
predicate-based filtering before pagination.
#### Response body
`200 OK` returns a SuccessEnvelope wrapping `GetData`. The four
arrays are always present so client code can iterate without
branching on `memory_type`; exactly one is populated.
| Field | Type | Notes |
|---|---|---|
| `episodes` | `array<GetEpisodeItem>` | Populated when `memory_type="episode"` |
| `profiles` | `array<GetProfileItem>` | Populated when `memory_type="profile"` |
| `agent_cases` | `array<GetAgentCaseItem>` | Populated when `memory_type="agent_case"` |
| `agent_skills` | `array<GetAgentSkillItem>` | Populated when `memory_type="agent_skill"` |
| `total_count` | `integer` | Total matching records **before** paging |
| `count` | `integer` | Number of items in **this page** (`len(items)`) |
#### GetEpisodeItem
Same shape as [SearchEpisodeItem](#searchepisodeitem) **minus**
`score` and `atomic_facts` (listing is unranked and does not nest
sub-facts).
| Field | Type | Notes |
|---|---|---|
| `id` | `string` | `<user_id>_ep_<YYYYMMDD>_<NNN>` |
| `user_id` | `string \| null` | Owner |
| `app_id` | `string` | Scope |
| `project_id` | `string` | Scope |
| `session_id` | `string` | Originating session |
| `timestamp` | `string` | ISO-8601 with timezone offset, identical to `/search` (both go through the same `from_iso_format` path) — see [Conventions](#conventions) |
| `sender_ids` | `array<string>` | Distinct `sender_id`s in the underlying messages |
| `summary` | `string` | Short summary |
| `subject` | `string` | One-line subject |
| `episode` | `string` | Full extracted narrative |
| `type` | `"Conversation"` | — |
#### GetProfileItem
| Field | Type | Notes |
|---|---|---|
| `id` | `string` | Profile id |
| `user_id` | `string \| null` | Owner |
| `app_id` | `string` | Scope |
| `project_id` | `string` | Scope |
| `profile_data` | `object` | Free-form structured profile fields |
#### GetAgentCaseItem
Same shape as [SearchAgentCaseItem](#searchagentcaseitem) **minus**
`score`.
| Field | Type | Notes |
|---|---|---|
| `id` | `string` | `<agent_id>_ac_<YYYYMMDD>_<NNN>` |
| `agent_id` | `string` | Owner |
| `app_id` | `string` | Scope |
| `project_id` | `string` | Scope |
| `session_id` | `string` | Originating run |
| `task_intent` | `string` | What the agent was trying to do |
| `approach` | `string` | How it went about it |
| `quality_score` | `number` | Quality assessed at extraction time |
| `key_insight` | `string \| null` | One-line takeaway, if any |
| `timestamp` | `string` | ISO-8601 UTC |
#### GetAgentSkillItem
Same shape as [SearchAgentSkillItem](#searchagentskillitem) **minus**
`score`.
| Field | Type | Notes |
|---|---|---|
| `id` | `string` | `<agent_id>_sk_<YYYYMMDD>_<NNN>` |
| `agent_id` | `string` | Owner |
| `app_id` | `string` | Scope |
| `project_id` | `string` | Scope |
| `name` | `string` | Skill name |
| `description` | `string` | One-line description |
| `content` | `string` | Full skill body |
| `confidence` | `number` | Extraction confidence in `[0, 1]` |
| `maturity_score` | `number` | Maturity assessed at clustering, in `[0, 1]` |
| `source_case_ids` | `array<string>` | `agent_case` ids that contributed to this skill |
#### cURL example
```bash
curl -X POST http://127.0.0.1:8000/api/v1/memory/get \
-H 'Content-Type: application/json' \
-d '{
"user_id": "alice",
"app_id": "default",
"project_id": "default",
"memory_type": "episode"
}'
```
Response (real capture):
```json
{
"request_id": "b8ba1255440147a684885f2084f6a25d",
"data": {
"episodes": [
{
"id": "alice_ep_20260528_00000001",
"user_id": "alice",
"app_id": "default",
"project_id": "default",
"session_id": "demo-002",
"timestamp": "2026-05-28T11:30:36Z",
"sender_ids": ["alice"],
"summary": "On May 28, 2026 at 11:30 AM UTC, Alice shared that she loves climbing in Yosemite every spring...",
"subject": "Alice's Outdoor Activities and Daily Routine on May 28, 2026: Climbing, Coffee, and Biking",
"episode": "On May 28, 2026 at 11:30 AM UTC, Alice shared that she loves climbing in Yosemite every spring, highlighting a recurring seasonal activity. She also mentioned that her favorite coffee shop is Blue Bottle located in SOMA, indicating a preferred local spot. Additionally, Alice stated that she bikes to work most days, describing a habitual mode of transportation.",
"type": "Conversation"
}
],
"profiles": [],
"agent_cases": [],
"agent_skills": [],
"total_count": 1,
"count": 1
}
}
```
## OpenAPI spec source
This document mirrors the OpenAPI 3.x spec that FastAPI auto-generates
from the Pydantic DTOs. **The spec is not exposed by default** — the
`/openapi.json`, `/docs` (Swagger UI), and `/redoc` endpoints only
mount when the server starts with `ENV=DEV` set.
```bash
ENV=DEV everos server start
# In another terminal:
curl http://127.0.0.1:8000/openapi.json | python -m json.tool
# Or interactively in a browser:
open http://127.0.0.1:8000/docs
```
This document and the spec are kept in sync by hand. Where they
disagree, the **spec is the structural ground truth** (field names,
required flags, value constraints). This document carries the
**business semantics** the spec cannot — when to use which method,
the eventual-consistency caveat, why a field is shaped the way it is.
---
For higher-level context (cascade design, DDD layering, on-disk
layout), see [architecture.md](architecture.md) and
[storage_layout.md](storage_layout.md).