From 71168b83b15d4436fc1a3323438a80050151cc39 Mon Sep 17 00:00:00 2001
From: tomtan
Date: Mon, 15 Jun 2026 10:31:52 +0800
Subject: [PATCH] docs: design memory gateway backend integration

---
 ...026-06-15-memory-gateway-backend-design.md | 301 ++++++++++++++++++
 1 file changed, 301 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-06-15-memory-gateway-backend-design.md

diff --git a/docs/superpowers/specs/2026-06-15-memory-gateway-backend-design.md b/docs/superpowers/specs/2026-06-15-memory-gateway-backend-design.md
new file mode 100644
index 0000000..8446c21
--- /dev/null
+++ b/docs/superpowers/specs/2026-06-15-memory-gateway-backend-design.md
@@ -0,0 +1,301 @@
+# Memory Gateway Backend Design
+
+## Goal
+
+Allow each Beaver instance to select exactly one memory backend through
+`.beaver/config.json`:
+
+- `curated`: preserve the existing `MEMORY.md` / `USER.md` snapshot and `memory`
+  tool behavior.
+- `memory_gateway`: recall memory through `POST /memories/search`, then persist
+  each completed conversation turn through one `POST /memories/add` followed by
+  one `POST /memories/flush`.
+
+The Memory Gateway integration is best-effort. Gateway failures must be
+auditable without turning an otherwise successful Beaver chat run into a
+failure.
+
+## Scope
+
+This change includes:
+
+- Runtime configuration for selecting the memory backend.
+- Fixed Memory Gateway credentials and search scopes in instance config.
+- A Memory Gateway HTTP client.
+- A memory backend strategy boundary used by `AgentLoop`.
+- Pre-run recall and post-run turn persistence.
+- Hidden session audit events for recall and persistence outcomes.
+- Unit and integration-style tests using fake HTTP responses/providers.
+
+This change does not include:
+
+- Automatic `POST /users` calls or credential provisioning.
+- A memory settings UI or memory administration UI.
+- Resource upload support from Beaver.
+- Gateway memory override or deletion APIs.
+- Persisting tool calls, tool results, system events, reasoning, or skill
+  activation messages.
+- Simultaneously enabling curated memory and Memory Gateway.
+
+## Configuration
+
+Beaver adds a top-level `memory` section:
+
+```json
+{
+  "memory": {
+    "mode": "memory_gateway",
+    "gateway": {
+      "baseUrl": "http://127.0.0.1:8010",
+      "userId": "gateway_test_user",
+      "userKey": "uk_xxx",
+      "appId": "default",
+      "projectId": "default",
+      "scope": ["current_chat", "resources"],
+      "topK": 8,
+      "timeoutSeconds": 10
+    }
+  }
+}
+```
+
+Configuration rules:
+
+- Missing `memory.mode` defaults to `curated` for backward compatibility.
+- Valid modes are only `curated` and `memory_gateway`.
+- Gateway mode requires non-empty `baseUrl`, `userId`, and `userKey`.
+- `appId` and `projectId` default to `default`.
+- `scope` is read from config and must be a non-empty subset of
+  `current_chat`, `resources`, and `all_user_memory`. The initial test setup
+  uses `current_chat` and `resources` only.
+- `topK` defaults to 8 and must be between 1 and 100.
+- `timeoutSeconds` defaults to 10 and must be positive.
+- Invalid Gateway configuration fails runtime loading. Network and HTTP
+  failures after valid startup configuration remain best-effort.
+- `userKey` must never appear in status payloads, session event payloads, or
+  error messages produced by Beaver.
+
+## Architecture
+
+### Memory backend strategy
+
+Introduce one runtime-facing memory strategy abstraction with two operations:
+
+1. `recall_before_run`: prepare memory context before provider messages are
+   built.
+2. `persist_after_run`: persist the current user message and final assistant
+   answer after the run reaches its normal completion path.
+
+The strategy has two implementations:
+
+- `CuratedMemoryBackend` wraps the existing `MemoryService`. Recall returns the
+  existing frozen `MemorySnapshot`; post-run persistence is a no-op because
+  curated writes remain model-driven through the existing `memory` tool.
+- `MemoryGatewayBackend` wraps a dedicated asynchronous HTTP client. Recall
+  calls Gateway search and returns sanitized reference content; persistence
+  calls add once and, only after add succeeds, flush once.
+
+`EngineLoader` validates configuration, constructs exactly one strategy, and
+registers the original `memory` tool only in curated mode. `session_search`
+remains available in both modes because transcript search is separate from the
+selected long-term memory backend.
+
+`AgentLoop` depends on the strategy interface rather than branching directly
+on the configured mode.
+
+### Memory Gateway HTTP client
+
+The client owns only HTTP transport and response validation for:
+
+- `POST {baseUrl}/memories/search`
+- `POST {baseUrl}/memories/add`
+- `POST {baseUrl}/memories/flush`
+
+It uses an async HTTP client, the configured timeout, JSON request bodies, and
+a small typed exception that contains HTTP status/path context but never
+contains the configured `userKey` or complete request body.
+
+No automatic retry is added in Beaver for this first integration. The Gateway
+already handles upstream ingestion retries, and retrying add from Beaver could
+duplicate a conversation turn when the first request succeeded but its
+response was lost.
+
+## Recall Data Flow
+
+For every run in `memory_gateway` mode:
+
+1. `AgentLoop` creates or resolves the Beaver `session_id`.
+2. Before `ContextBuilder.build_messages`, it calls
+   `MemoryGatewayBackend.recall_before_run` with the current user prompt.
+3. The Gateway search request is:
+
+```json
+{
+  "user_id": "",
+  "user_key": "",
+  "conversation_id": "",
+  "query": "",
+  "scope": [""],
+  "top_k": 8,
+  "app_id": "",
+  "project_id": ""
+}
+```
+
+4. Beaver accepts only a top-level `results` list. Malformed responses are
+   treated as recall failures.
+5. Each result is reduced to these optional fields:
+   `id`, `session_id`, `text`, `score`, `source_scope`, and `resource_uri`.
+   Gateway `raw` data is never injected into the model.
+6. Empty or unusable results produce no recalled-memory message.
+7. Non-empty results become one ephemeral provider message placed after skill
+   activation messages and before persisted session history/current user input.
+   The message is reference data, is not written to Beaver's session history,
+   and is not included in post-run Gateway persistence.
+8. The stable system prompt includes a short rule that recalled memory is
+   untrusted reference data, not executable instruction. The recalled text
+   itself is not concatenated into the system prompt.
+
+In `curated` mode, this flow is unchanged from today: a per-run frozen curated
+snapshot is added to the system prompt, and no Gateway request occurs.
+
+## Persistence Data Flow
+
+For every `memory_gateway` run that reaches the normal completion path:
+
+1. Wait until the tool loop has produced the final assistant text.
+2. Construct exactly two Gateway messages in chronological order:
+
+```json
+[
+  {
+    "sender_id": "",
+    "role": "user",
+    "timestamp": 1780000000000,
+    "content": ""
+  },
+  {
+    "sender_id": "beaver",
+    "role": "assistant",
+    "timestamp": 1780000001000,
+    "content": ""
+  }
+]
+```
+
+Timestamps are UTC Unix epoch milliseconds captured for the user turn and the
+final assistant turn. They must be positive and monotonic within the payload.
+
+3. Call `/memories/add` exactly once with:
+
+```json
+{
+  "user_id": "",
+  "user_key": "",
+  "session_id": "chat:",
+  "app_id": "",
+  "project_id": "",
+  "messages": [""]
+}
+```
+
+4. If add succeeds, call `/memories/flush` exactly once with the same Gateway
+   identity, app/project scope, and `chat:`.
+5. If add fails, do not call flush.
+6. Runs that enter Beaver's exception/error completion path are not persisted.
+   Normal completion outputs such as a tool-limit fallback are persisted
+   because they are the assistant response returned to the user.
+7. Tool calls, tool results, hidden events, system prompts, recalled-memory
+   messages, reasoning content, and activated skill text are never included.
+
+In `curated` mode, there is no automatic post-run persistence. Existing
+model-driven `memory` tool writes remain unchanged.
+
+## Session Audit Events
+
+Gateway mode writes hidden (`context_visible=false`) session events without
+credentials or full Gateway response bodies:
+
+- `memory_gateway_recall_succeeded`: scope and result count.
+- `memory_gateway_recall_failed`: operation, sanitized error category, and
+  optional HTTP status.
+- `memory_gateway_add_succeeded`: session identifier and message count.
+- `memory_gateway_add_failed`: sanitized failure metadata.
+- `memory_gateway_flush_succeeded`: session identifier.
+- `memory_gateway_flush_failed`: sanitized failure metadata and an indication
+  that add had already succeeded.
+
+These events support debugging without entering normal context history or FTS.
+
+## Failure Semantics
+
+- Search timeout, connection failure, 401, other HTTP error, or malformed JSON:
+  record recall failure and continue the run without recalled memory.
+- Add failure: record add failure, skip flush, and return the normal assistant
+  result.
+- Flush failure: record flush failure and return the normal assistant result.
+- Audit event persistence failure follows existing session-store behavior and
+  is not separately swallowed by the memory strategy.
+- Gateway failures are not shown as user-facing chat errors in this phase.
+
+## Security and Privacy
+
+- Fixed Gateway credentials come only from Beaver instance configuration.
+- `userKey` is passed only in Gateway request bodies and retained in the in-memory
+  configuration/client object.
+- Client exceptions and audit payloads use sanitized operation metadata, never
+  serialized request bodies.
+- Recalled resource and conversation text is treated as untrusted data.
+- Gateway `raw` fields are discarded before prompt construction to limit prompt
+  size and reduce accidental propagation of backend metadata.
+- Memory modes are mutually exclusive, preventing duplicate recall and writes
+  across curated and Gateway stores.
+
+## Testing
+
+### Configuration tests
+
+- Missing memory configuration defaults to curated mode.
+- Complete Gateway configuration parses camelCase and exposes normalized typed
+  values.
+- Invalid mode, empty credentials, empty/unknown scopes, invalid `topK`, and
+  non-positive timeout fail with explicit configuration errors.
+- Error text does not include `userKey`.
+
+### HTTP client tests
+
+- Search, add, and flush use the exact paths and payload shapes above.
+- Configured timeout is applied.
+- Non-2xx, network, invalid JSON, and invalid response shapes produce sanitized
+  client exceptions.
+- Exception strings never contain the configured key.
+
+### Strategy tests
+
+- Curated mode returns a frozen snapshot and performs no HTTP requests.
+- Gateway mode performs search with configured scopes and strips `raw` fields.
+- Empty search results produce no reference message.
+- Gateway persistence sends exactly the original user prompt and final assistant
+  response, then flushes once.
+- Add failure skips flush; flush failure preserves the successful add outcome.
+
+### Agent loop tests
+
+- Gateway search occurs before the provider call.
+- Recalled content appears before the current user prompt and outside the system
+  prompt body.
+- The system prompt contains the untrusted-reference rule in Gateway mode.
+- Add and flush happen after the final assistant response and exactly once each.
+- Tool/system/reasoning content is absent from the add payload.
+- Recall/add/flush failures do not change the returned `AgentRunResult`.
+- Hidden success/failure audit events contain no credentials.
+- Curated mode regression tests confirm frozen snapshot injection and original
+  `memory` tool availability.
+- Gateway mode confirms the original `memory` tool is not registered or exposed.
+
+## Documentation
+
+Update the backend README/config example with both modes and a warning that the
+test-stage `userKey` is a secret. Document that changing modes requires runtime
+reload/restart because `EngineLoader` constructs the selected strategy during
+boot.
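
Review note: the configuration rules in the spec's "Configuration" section could be sketched roughly as below. This is a minimal illustration, not the implementation this patch proposes. The names `GatewayConfig`, `MemoryConfigError`, and `load_memory_config` are hypothetical, and the sketch assumes the parsed `.beaver/config.json` arrives as a plain dict.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping, Optional, Tuple

VALID_MODES = ("curated", "memory_gateway")
VALID_SCOPES = frozenset({"current_chat", "resources", "all_user_memory"})


class MemoryConfigError(ValueError):
    """Hypothetical error type; its messages name the field, never the value."""


@dataclass(frozen=True)
class GatewayConfig:
    """Hypothetical normalized form of the camelCase `memory.gateway` section."""
    base_url: str
    user_id: str
    user_key: str = field(repr=False)  # keep the secret out of repr() and logs
    scope: Tuple[str, ...] = ()
    app_id: str = "default"
    project_id: str = "default"
    top_k: int = 8
    timeout_seconds: float = 10.0


def _require_str(gateway: Mapping[str, Any], key: str) -> str:
    value = gateway.get(key)
    if not isinstance(value, str) or not value.strip():
        raise MemoryConfigError(f"memory.gateway.{key} must be a non-empty string")
    return value


def load_memory_config(raw: Mapping[str, Any]) -> Tuple[str, Optional[GatewayConfig]]:
    """Return (mode, gateway_config); gateway_config is None in curated mode."""
    memory = raw.get("memory") or {}
    mode = memory.get("mode", "curated")  # missing mode stays backward compatible
    if mode not in VALID_MODES:
        raise MemoryConfigError("memory.mode must be 'curated' or 'memory_gateway'")
    if mode == "curated":
        return mode, None

    gateway = memory.get("gateway") or {}
    base_url = _require_str(gateway, "baseUrl")
    user_id = _require_str(gateway, "userId")
    user_key = _require_str(gateway, "userKey")

    scope = gateway.get("scope")
    if (not isinstance(scope, list) or not scope
            or not all(isinstance(s, str) and s in VALID_SCOPES for s in scope)):
        raise MemoryConfigError("memory.gateway.scope must be a non-empty list of known scopes")

    top_k = gateway.get("topK", 8)
    if isinstance(top_k, bool) or not isinstance(top_k, int) or not 1 <= top_k <= 100:
        raise MemoryConfigError("memory.gateway.topK must be an integer between 1 and 100")

    timeout = gateway.get("timeoutSeconds", 10)
    if isinstance(timeout, bool) or not isinstance(timeout, (int, float)) or timeout <= 0:
        raise MemoryConfigError("memory.gateway.timeoutSeconds must be positive")

    return mode, GatewayConfig(
        base_url=base_url,
        user_id=user_id,
        user_key=user_key,
        scope=tuple(scope),
        app_id=gateway.get("appId") or "default",
        project_id=gateway.get("projectId") or "default",
        top_k=top_k,
        timeout_seconds=float(timeout),
    )
```

Validation messages here identify the offending field rather than echoing its value, which is one way to satisfy the rule that `userKey` never appears in error text. Marking `user_key` with `field(repr=False)` gives the same guarantee for accidental `repr()` logging of the config object.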