docs: design memory gateway backend integration

2026-06-15 10:31:52 +08:00
parent 8aeb97a5fc
commit 71168b83b1
1 changed files with 301 additions and 0 deletions
--- a/docs/superpowers/specs/2026-06-15-memory-gateway-backend-design.md
+++ b/docs/superpowers/specs/2026-06-15-memory-gateway-backend-design.md
@@ -0,0 +1,301 @@
+# Memory Gateway Backend Design
+
+## Goal
+
+Allow each Beaver instance to select exactly one memory backend through
+`.beaver/config.json`:
+
+- `curated`: preserve the existing `MEMORY.md` / `USER.md` snapshot and `memory`
+  tool behavior.
+- `memory_gateway`: recall memory through `POST /memories/search`, then persist
+  each completed conversation turn through one `POST /memories/add` followed by
+  one `POST /memories/flush`.
+
+The Memory Gateway integration is best-effort. Gateway failures must be
+auditable without turning an otherwise successful Beaver chat run into a
+failure.
+
+## Scope
+
+This change includes:
+
+- Runtime configuration for selecting the memory backend.
+- Fixed Memory Gateway credentials and search scopes in instance config.
+- A Memory Gateway HTTP client.
+- A memory backend strategy boundary used by `AgentLoop`.
+- Pre-run recall and post-run turn persistence.
+- Hidden session audit events for recall and persistence outcomes.
+- Unit and integration-style tests using fake HTTP responses/providers.
+
+This change does not include:
+
+- Automatic `POST /users` calls or credential provisioning.
+- A memory settings UI or memory administration UI.
+- Resource upload support from Beaver.
+- Gateway memory override or deletion APIs.
+- Persisting tool calls, tool results, system events, reasoning, or skill
+  activation messages.
+- Simultaneously enabling curated memory and Memory Gateway.
+
+## Configuration
+
+Beaver adds a top-level `memory` section:
+
+```json
+{
+  "memory": {
+    "mode": "memory_gateway",
+    "gateway": {
+      "baseUrl": "http://127.0.0.1:8010",
+      "userId": "gateway_test_user",
+      "userKey": "uk_xxx",
+      "appId": "default",
+      "projectId": "default",
+      "scope": ["current_chat", "resources"],
+      "topK": 8,
+      "timeoutSeconds": 10
+    }
+  }
+}
+```
+
+Configuration rules:
+
+- Missing `memory.mode` defaults to `curated` for backward compatibility.
+- Valid modes are only `curated` and `memory_gateway`.
+- Gateway mode requires non-empty `baseUrl`, `userId`, and `userKey`.
+- `appId` and `projectId` default to `default`.
+- `scope` is read from config and must be a non-empty subset of
+  `current_chat`, `resources`, and `all_user_memory`. The initial test setup
+  uses `current_chat` and `resources` only.
+- `topK` defaults to 8 and must be between 1 and 100.
+- `timeoutSeconds` defaults to 10 and must be positive.
+- Invalid Gateway configuration fails runtime loading. Network and HTTP
+  failures that occur after a valid configuration has loaded are handled
+  best-effort and do not fail the run.
+- `userKey` must never appear in status payloads, session event payloads, or
+  error messages produced by Beaver.
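+
+As an illustration of these rules, a minimal Gateway configuration might set
+only the required fields plus `scope` and rely on the documented defaults for
+`appId`, `projectId`, `topK`, and `timeoutSeconds`. The values are the same
+placeholders used in the example above, not real credentials:
+
+```json
+{
+  "memory": {
+    "mode": "memory_gateway",
+    "gateway": {
+      "baseUrl": "http://127.0.0.1:8010",
+      "userId": "gateway_test_user",
+      "userKey": "uk_xxx",
+      "scope": ["current_chat"]
+    }
+  }
+}
+```
+
+Under the rules above, this should load with `appId` and `projectId` equal to
+`default`, `topK` equal to 8, and `timeoutSeconds` equal to 10.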
+
+## Architecture
+
+### Memory backend strategy
+
+Introduce one runtime-facing memory strategy abstraction with two operations:
+
+1. `recall_before_run`: prepare memory context before provider messages are
+   built.
+2. `persist_after_run`: persist the current user message and final assistant
+   answer after the run reaches its normal completion path.
+
+The strategy has two implementations:
+
+- `CuratedMemoryBackend` wraps the existing `MemoryService`. Recall returns the
+  existing frozen `MemorySnapshot`; post-run persistence is a no-op because
+  curated writes remain model-driven through the existing `memory` tool.
+- `MemoryGatewayBackend` wraps a dedicated asynchronous HTTP client. Recall
+  calls Gateway search and returns sanitized reference content; persistence
+  calls add once and, only after add succeeds, flush once.
+
+`EngineLoader` validates configuration, constructs exactly one strategy, and
+registers the original `memory` tool only in curated mode. `session_search`
+remains available in both modes because transcript search is separate from the
+selected long-term memory backend.
+
+`AgentLoop` depends on the strategy interface rather than branching directly
+on the configured mode.
+
+### Memory Gateway HTTP client
+
+The client owns only HTTP transport and response validation for:
+
+- `POST {baseUrl}/memories/search`
+- `POST {baseUrl}/memories/add`
+- `POST {baseUrl}/memories/flush`
+
+It uses an async HTTP client, the configured timeout, JSON request bodies, and
+a small typed exception that contains HTTP status/path context but never
+contains the configured `userKey` or complete request body.
+
+No automatic retry is added in Beaver for this first integration. The Gateway
+already handles upstream ingestion retries, and retrying add from Beaver could
+duplicate a conversation turn when the first request succeeded but its
+response was lost.
+
+## Recall Data Flow
+
+For every run in `memory_gateway` mode:
+
+1. `AgentLoop` creates or resolves the Beaver `session_id`.
+2. Before `ContextBuilder.build_messages`, it calls
+   `MemoryGatewayBackend.recall_before_run` with the current user prompt.
+3. The Gateway search request is:
+
+```json
+{
+  "user_id": "<configured userId>",
+  "user_key": "<configured userKey>",
+  "conversation_id": "<Beaver resolved_session_id>",
+  "query": "<current user prompt>",
+  "scope": ["<configured scopes>"],
+  "top_k": 8,
+  "app_id": "<configured appId>",
+  "project_id": "<configured projectId>"
+}
+```
+
+4. Beaver accepts only a top-level `results` list. Malformed responses are
+   treated as recall failures.
+5. Each result is reduced to these optional fields:
+   `id`, `session_id`, `text`, `score`, `source_scope`, and `resource_uri`.
+   Gateway `raw` data is never injected into the model.
+6. Empty or unusable results produce no recalled-memory message.
+7. Non-empty results become one ephemeral provider message placed after skill
+   activation messages and before persisted session history/current user input.
+   The message is reference data, is not written to Beaver's session history,
+   and is not included in post-run Gateway persistence.
+8. The stable system prompt includes a short rule that recalled memory is
+   untrusted reference data, not executable instruction. The recalled text
+   itself is not concatenated into the system prompt.
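+
+As a sketch of steps 4 to 6, a search response might look like the following.
+Only the top-level `results` list, the six retained field names, and the
+existence of `raw` come from this design; every value, and the contents of
+`raw`, are invented for illustration:
+
+```json
+{
+  "results": [
+    {
+      "id": "mem_001",
+      "session_id": "chat:example-session",
+      "text": "User prefers concise answers.",
+      "score": 0.87,
+      "source_scope": "current_chat",
+      "resource_uri": null,
+      "raw": {"backend_field": "dropped before prompt construction"}
+    }
+  ]
+}
+```
+
+From this entry Beaver would keep `id`, `session_id`, `text`, `score`,
+`source_scope`, and `resource_uri`, and discard `raw`.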
+
+In `curated` mode, recall is unchanged from today's behavior: a per-run frozen
+curated snapshot is added to the system prompt, and no Gateway request occurs.
+
+## Persistence Data Flow
+
+For every `memory_gateway` run that reaches the normal completion path:
+
+1. Wait until the tool loop has produced the final assistant text.
+2. Construct exactly two Gateway messages in chronological order:
+
+```json
+[
+  {
+    "sender_id": "<configured userId>",
+    "role": "user",
+    "timestamp": 1780000000000,
+    "content": "<original current user prompt>"
+  },
+  {
+    "sender_id": "beaver",
+    "role": "assistant",
+    "timestamp": 1780000001000,
+    "content": "<final assistant text>"
+  }
+]
+```
+
+Timestamps are UTC Unix epoch milliseconds captured for the user turn and the
+final assistant turn. They must be positive and monotonic within the payload.
+
+3. Call `/memories/add` exactly once with:
+
+```json
+{
+  "user_id": "<configured userId>",
+  "user_key": "<configured userKey>",
+  "session_id": "chat:<Beaver resolved_session_id>",
+  "app_id": "<configured appId>",
+  "project_id": "<configured projectId>",
+  "messages": ["<the two messages above>"]
+}
+```
+
+4. If add succeeds, call `/memories/flush` exactly once with the same Gateway
+   identity, app/project scope, and `chat:<resolved_session_id>`.
+5. If add fails, do not call flush.
+6. Runs that enter Beaver's exception/error completion path are not persisted.
+   Normal completion outputs such as a tool-limit fallback are persisted
+   because they are the assistant response returned to the user.
+7. Tool calls, tool results, hidden events, system prompts, recalled-memory
+   messages, reasoning content, and activated skill text are never included.
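+
+The flush request body in step 4 is not spelled out in this design. A
+plausible shape, assuming it reuses the field names of the add request and
+simply omits `messages`, would be:
+
+```json
+{
+  "user_id": "<configured userId>",
+  "user_key": "<configured userKey>",
+  "session_id": "chat:<Beaver resolved_session_id>",
+  "app_id": "<configured appId>",
+  "project_id": "<configured projectId>"
+}
+```
+
+The field names here are an assumption and should be checked against the
+Gateway's own API contract before implementation.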
+
+In `curated` mode, there is no automatic post-run persistence. Existing
+model-driven `memory` tool writes remain unchanged.
+
+## Session Audit Events
+
+Gateway mode writes hidden (`context_visible=false`) session events without
+credentials or full Gateway response bodies:
+
+- `memory_gateway_recall_succeeded`: scope and result count.
+- `memory_gateway_recall_failed`: operation, sanitized error category, and
+  optional HTTP status.
+- `memory_gateway_add_succeeded`: session identifier and message count.
+- `memory_gateway_add_failed`: sanitized failure metadata.
+- `memory_gateway_flush_succeeded`: session identifier.
+- `memory_gateway_flush_failed`: sanitized failure metadata and an indication
+  that add had already succeeded.
+
+These events support debugging without entering normal context history or the
+full-text search (FTS) index.
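+
+For illustration only, a hidden flush-failure event might be stored in a shape
+like the one below. The design fixes the event names, `context_visible=false`,
+and the kinds of metadata carried; the key names `event_type`, `payload`,
+`operation`, `error_category`, `http_status`, and `add_succeeded` are
+hypothetical and not taken from Beaver's session store:
+
+```json
+{
+  "event_type": "memory_gateway_flush_failed",
+  "context_visible": false,
+  "payload": {
+    "operation": "flush",
+    "error_category": "timeout",
+    "http_status": null,
+    "add_succeeded": true
+  }
+}
+```
+
+Whatever shape is chosen, the payload should carry neither `userKey` nor the
+Gateway response body.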
+
+## Failure Semantics
+
+- Search timeout, connection failure, HTTP 401, any other HTTP error, or
+  malformed JSON or response shape: record recall failure and continue the run
+  without recalled memory.
+- Add failure: record add failure, skip flush, and return the normal assistant
+  result.
+- Flush failure: record flush failure and return the normal assistant result.
+- Audit event persistence failure follows existing session-store behavior and
+  is not separately swallowed by the memory strategy.
+- Gateway failures are not shown as user-facing chat errors in this phase.
+
+## Security and Privacy
+
+- Fixed Gateway credentials come only from Beaver instance configuration.
+- `userKey` is transmitted only in Gateway request bodies; at runtime it is
+  retained only in the in-memory configuration/client object.
+- Client exceptions and audit payloads use sanitized operation metadata, never
+  serialized request bodies.
+- Recalled resource and conversation text is treated as untrusted data.
+- Gateway `raw` fields are discarded before prompt construction to limit prompt
+  size and reduce accidental propagation of backend metadata.
+- Memory modes are mutually exclusive, preventing duplicate recall and writes
+  across curated and Gateway stores.
+
+## Testing
+
+### Configuration tests
+
+- Missing memory configuration defaults to curated mode.
+- Complete Gateway configuration parses camelCase and exposes normalized typed
+  values.
+- Invalid mode, empty credentials, empty/unknown scopes, invalid `topK`, and
+  non-positive timeout fail with explicit configuration errors.
+- Error text does not include `userKey`.
+
+### HTTP client tests
+
+- Search, add, and flush use the exact paths and payload shapes above.
+- Configured timeout is applied.
+- Non-2xx responses, network errors, invalid JSON, and invalid response shapes
+  produce sanitized client exceptions.
+- Exception strings never contain the configured key.
+
+### Strategy tests
+
+- Curated mode returns a frozen snapshot and performs no HTTP requests.
+- Gateway mode performs search with configured scopes and strips `raw` fields.
+- Empty search results produce no reference message.
+- Gateway persistence sends exactly the original user prompt and final assistant
+  response, then flushes once.
+- Add failure skips flush; flush failure preserves the successful add outcome.
+
+### Agent loop tests
+
+- Gateway search occurs before the provider call.
+- Recalled content appears before the current user prompt and outside the system
+  prompt body.
+- The system prompt contains the untrusted-reference rule in Gateway mode.
+- Add and flush happen after the final assistant response and exactly once each.
+- Tool/system/reasoning content is absent from the add payload.
+- Recall/add/flush failures do not change the returned `AgentRunResult`.
+- Hidden success/failure audit events contain no credentials.
+- Curated mode regression tests confirm frozen snapshot injection and original
+  `memory` tool availability.
+- Gateway mode tests confirm the original `memory` tool is not registered or
+  exposed.
+
+## Documentation
+
+Update the backend README/config example with both modes and a warning that the
+test-stage `userKey` is a secret. Document that changing modes requires runtime
+reload/restart because `EngineLoader` constructs the selected strategy during
+boot.