# Memory Gateway Backend Design ## Goal Allow each Beaver instance to select exactly one memory backend through `.beaver/config.json`: - `curated`: preserve the existing `MEMORY.md` / `USER.md` snapshot and `memory` tool behavior. - `memory_gateway`: recall memory through `POST /memories/search`, then persist each completed conversation turn through one `POST /memories/add` followed by one `POST /memories/flush`. The Memory Gateway integration is best-effort. Gateway failures must be auditable without turning an otherwise successful Beaver chat run into a failure. ## Scope This change includes: - Runtime configuration for selecting the memory backend. - Fixed Memory Gateway credentials and search scopes in instance config. - A Memory Gateway HTTP client. - A memory backend strategy boundary used by `AgentLoop`. - Pre-run recall and post-run turn persistence. - Hidden session audit events for recall and persistence outcomes. - Unit and integration-style tests using fake HTTP responses/providers. This change does not include: - Automatic `POST /users` calls or credential provisioning. - A memory settings UI or memory administration UI. - Resource upload support from Beaver. - Gateway memory override or deletion APIs. - Persisting tool calls, tool results, system events, reasoning, or skill activation messages. - Simultaneously enabling curated memory and Memory Gateway. ## Configuration Beaver adds a top-level `memory` section: ```json { "memory": { "mode": "memory_gateway", "gateway": { "baseUrl": "http://127.0.0.1:8010", "userId": "gateway_test_user", "userKey": "uk_xxx", "appId": "default", "projectId": "default", "scope": ["current_chat", "resources"], "topK": 8, "timeoutSeconds": 10 } } } ``` Configuration rules: - Missing `memory.mode` defaults to `curated` for backward compatibility. - Valid modes are only `curated` and `memory_gateway`. - Gateway mode requires non-empty `baseUrl`, `userId`, and `userKey`. - `appId` and `projectId` default to `default`. - `scope` is read from config and must be a non-empty subset of `current_chat`, `resources`, and `all_user_memory`. The initial test setup uses `current_chat` and `resources` only. - `topK` defaults to 8 and must be between 1 and 100. - `timeoutSeconds` defaults to 10 and must be positive. - Invalid Gateway configuration fails runtime loading. Network and HTTP failures after valid startup configuration remain best-effort. - `userKey` must never appear in status payloads, session event payloads, or error messages produced by Beaver. ## Architecture ### Memory backend strategy Introduce one runtime-facing memory strategy abstraction with two operations: 1. `recall_before_run`: prepare memory context before provider messages are built. 2. `persist_after_run`: persist the current user message and final assistant answer after the run reaches its normal completion path. The strategy has two implementations: - `CuratedMemoryBackend` wraps the existing `MemoryService`. Recall returns the existing frozen `MemorySnapshot`; post-run persistence is a no-op because curated writes remain model-driven through the existing `memory` tool. - `MemoryGatewayBackend` wraps a dedicated asynchronous HTTP client. Recall calls Gateway search and returns sanitized reference content; persistence calls add once and, only after add succeeds, flush once. `EngineLoader` validates configuration, constructs exactly one strategy, and registers the original `memory` tool only in curated mode. `session_search` remains available in both modes because transcript search is separate from the selected long-term memory backend. `AgentLoop` depends on the strategy interface rather than branching directly on the configured mode. ### Memory Gateway HTTP client The client owns only HTTP transport and response validation for: - `POST {baseUrl}/memories/search` - `POST {baseUrl}/memories/add` - `POST {baseUrl}/memories/flush` It uses an async HTTP client, the configured timeout, JSON request bodies, and a small typed exception that contains HTTP status/path context but never contains the configured `userKey` or complete request body. No automatic retry is added in Beaver for this first integration. The Gateway already handles upstream ingestion retries, and retrying add from Beaver could duplicate a conversation turn when the first request succeeded but its response was lost. ## Recall Data Flow For every run in `memory_gateway` mode: 1. `AgentLoop` creates or resolves the Beaver `session_id`. 2. Before `ContextBuilder.build_messages`, it calls `MemoryGatewayBackend.recall_before_run` with the current user prompt. 3. The Gateway search request is: ```json { "user_id": "", "user_key": "", "conversation_id": "", "query": "", "scope": [""], "top_k": 8, "app_id": "", "project_id": "" } ``` 4. Beaver accepts only a top-level `results` list. Malformed responses are treated as recall failures. 5. Each result is reduced to these optional fields: `id`, `session_id`, `text`, `score`, `source_scope`, and `resource_uri`. Gateway `raw` data is never injected into the model. 6. Empty or unusable results produce no recalled-memory message. 7. Non-empty results become one ephemeral provider message placed after skill activation messages and before persisted session history/current user input. The message is reference data, is not written to Beaver's session history, and is not included in post-run Gateway persistence. 8. The stable system prompt includes a short rule that recalled memory is untrusted reference data, not executable instruction. The recalled text itself is not concatenated into the system prompt. In `curated` mode, this flow is unchanged from today: a per-run frozen curated snapshot is added to the system prompt, and no Gateway request occurs. ## Persistence Data Flow For every `memory_gateway` run that reaches the normal completion path: 1. Wait until the tool loop has produced the final assistant text. 2. Construct exactly two Gateway messages in chronological order: ```json [ { "sender_id": "", "role": "user", "timestamp": 1780000000000, "content": "" }, { "sender_id": "beaver", "role": "assistant", "timestamp": 1780000001000, "content": "" } ] ``` Timestamps are UTC Unix epoch milliseconds captured for the user turn and the final assistant turn. They must be positive and monotonic within the payload. 3. Call `/memories/add` exactly once with: ```json { "user_id": "", "user_key": "", "session_id": "chat:", "app_id": "", "project_id": "", "messages": [""] } ``` 4. If add succeeds, call `/memories/flush` exactly once with the same Gateway identity, app/project scope, and `chat:`. 5. If add fails, do not call flush. 6. Runs that enter Beaver's exception/error completion path are not persisted. Normal completion outputs such as a tool-limit fallback are persisted because they are the assistant response returned to the user. 7. Tool calls, tool results, hidden events, system prompts, recalled-memory messages, reasoning content, and activated skill text are never included. In `curated` mode, there is no automatic post-run persistence. Existing model-driven `memory` tool writes remain unchanged. ## Session Audit Events Gateway mode writes hidden (`context_visible=false`) session events without credentials or full Gateway response bodies: - `memory_gateway_recall_succeeded`: scope and result count. - `memory_gateway_recall_failed`: operation, sanitized error category, and optional HTTP status. - `memory_gateway_add_succeeded`: session identifier and message count. - `memory_gateway_add_failed`: sanitized failure metadata. - `memory_gateway_flush_succeeded`: session identifier. - `memory_gateway_flush_failed`: sanitized failure metadata and an indication that add had already succeeded. These events support debugging without entering normal context history or FTS. ## Failure Semantics - Search timeout, connection failure, 401, other HTTP error, or malformed JSON: record recall failure and continue the run without recalled memory. - Add failure: record add failure, skip flush, and return the normal assistant result. - Flush failure: record flush failure and return the normal assistant result. - Audit event persistence failure follows existing session-store behavior and is not separately swallowed by the memory strategy. - Gateway failures are not shown as user-facing chat errors in this phase. ## Security and Privacy - Fixed Gateway credentials come only from Beaver instance configuration. - `userKey` is passed only in Gateway request bodies and retained in the in-memory configuration/client object. - Client exceptions and audit payloads use sanitized operation metadata, never serialized request bodies. - Recalled resource and conversation text is treated as untrusted data. - Gateway `raw` fields are discarded before prompt construction to limit prompt size and reduce accidental propagation of backend metadata. - Memory modes are mutually exclusive, preventing duplicate recall and writes across curated and Gateway stores. ## Testing ### Configuration tests - Missing memory configuration defaults to curated mode. - Complete Gateway configuration parses camelCase and exposes normalized typed values. - Invalid mode, empty credentials, empty/unknown scopes, invalid `topK`, and non-positive timeout fail with explicit configuration errors. - Error text does not include `userKey`. ### HTTP client tests - Search, add, and flush use the exact paths and payload shapes above. - Configured timeout is applied. - Non-2xx, network, invalid JSON, and invalid response shapes produce sanitized client exceptions. - Exception strings never contain the configured key. ### Strategy tests - Curated mode returns a frozen snapshot and performs no HTTP requests. - Gateway mode performs search with configured scopes and strips `raw` fields. - Empty search results produce no reference message. - Gateway persistence sends exactly the original user prompt and final assistant response, then flushes once. - Add failure skips flush; flush failure preserves the successful add outcome. ### Agent loop tests - Gateway search occurs before the provider call. - Recalled content appears before the current user prompt and outside the system prompt body. - The system prompt contains the untrusted-reference rule in Gateway mode. - Add and flush happen after the final assistant response and exactly once each. - Tool/system/reasoning content is absent from the add payload. - Recall/add/flush failures do not change the returned `AgentRunResult`. - Hidden success/failure audit events contain no credentials. - Curated mode regression tests confirm frozen snapshot injection and original `memory` tool availability. - Gateway mode confirms the original `memory` tool is not registered or exposed. ## Documentation Update the backend README/config example with both modes and a warning that the test-stage `userKey` is a secret. Document that changing modes requires runtime reload/restart because `EngineLoader` constructs the selected strategy during boot.