From b3c6ee4b785302976f8f7465a0524ad83f1e0f64 Mon Sep 17 00:00:00 2001 From: tomtan Date: Mon, 15 Jun 2026 10:56:53 +0800 Subject: [PATCH] docs: revise memory gateway design for hybrid mode --- ...026-06-15-memory-gateway-backend-design.md | 316 ++++++++++-------- 1 file changed, 183 insertions(+), 133 deletions(-) diff --git a/docs/superpowers/specs/2026-06-15-memory-gateway-backend-design.md b/docs/superpowers/specs/2026-06-15-memory-gateway-backend-design.md index 8446c21..79de54b 100644 --- a/docs/superpowers/specs/2026-06-15-memory-gateway-backend-design.md +++ b/docs/superpowers/specs/2026-06-15-memory-gateway-backend-design.md @@ -1,41 +1,47 @@ -# Memory Gateway Backend Design +# Hybrid Memory Gateway Integration Design ## Goal -Allow each Beaver instance to select exactly one memory backend through -`.beaver/config.json`: +Keep Beaver's existing curated memory as the permanent baseline and optionally +add Memory Gateway as an independent second memory layer. -- `curated`: preserve the existing `MEMORY.md` / `USER.md` snapshot and `memory` - tool behavior. -- `memory_gateway`: recall memory through `POST /memories/search`, then persist - each completed conversation turn through one `POST /memories/add` followed by - one `POST /memories/flush`. +- Curated memory continues to load `MEMORY.md` and `USER.md` into a frozen + per-run snapshot and continues to expose the existing `memory` tool. +- Memory Gateway independently recalls conversation/resource memory through + `POST /memories/search` and persists each completed conversation turn through + one `POST /memories/add` followed by one `POST /memories/flush`. +- The two layers do not synchronize, overwrite, merge, deduplicate, or resolve + conflicts with each other. -The Memory Gateway integration is best-effort. Gateway failures must be -auditable without turning an otherwise successful Beaver chat run into a +Memory Gateway is best-effort. Gateway failures must be auditable without +affecting curated memory or turning an otherwise successful chat run into a failure. ## Scope This change includes: -- Runtime configuration for selecting the memory backend. +- Runtime configuration for `curated` and `hybrid` modes. - Fixed Memory Gateway credentials and search scopes in instance config. -- A Memory Gateway HTTP client. -- A memory backend strategy boundary used by `AgentLoop`. -- Pre-run recall and post-run turn persistence. -- Hidden session audit events for recall and persistence outcomes. -- Unit and integration-style tests using fake HTTP responses/providers. +- An asynchronous Memory Gateway HTTP client. +- An optional `MemoryGatewayService` alongside the existing `MemoryService`. +- Gateway recall before each provider run in hybrid mode. +- Gateway add and flush after each normally completed run in hybrid mode. +- Hidden session audit events for Gateway outcomes. +- Unit and integration-style tests using fake transports and providers. This change does not include: +- Replacing or disabling curated memory. +- Synchronizing curated `memory` tool writes to Memory Gateway. +- Writing Gateway conversation turns into `MEMORY.md` or `USER.md`. +- Conflict resolution or automatic deduplication across the two layers. - Automatic `POST /users` calls or credential provisioning. - A memory settings UI or memory administration UI. - Resource upload support from Beaver. -- Gateway memory override or deletion APIs. -- Persisting tool calls, tool results, system events, reasoning, or skill - activation messages. -- Simultaneously enabling curated memory and Memory Gateway. +- Gateway override or deletion APIs. +- Persisting tool calls, tool results, system events, reasoning, recalled + memory, or skill activation messages to Gateway. ## Configuration @@ -44,7 +50,7 @@ Beaver adds a top-level `memory` section: ```json { "memory": { - "mode": "memory_gateway", + "mode": "hybrid", "gateway": { "baseUrl": "http://127.0.0.1:8010", "userId": "gateway_test_user", @@ -61,79 +67,89 @@ Beaver adds a top-level `memory` section: Configuration rules: -- Missing `memory.mode` defaults to `curated` for backward compatibility. -- Valid modes are only `curated` and `memory_gateway`. -- Gateway mode requires non-empty `baseUrl`, `userId`, and `userKey`. +- Valid modes are `curated` and `hybrid`. +- Curated memory is initialized and enabled in both modes. +- If the entire `memory` section is absent, the effective mode is implicitly + `hybrid`. Missing Gateway credentials in this implicit-default case produce + a startup warning and degrade only the Gateway layer; Beaver continues with + curated memory. +- If `mode: "hybrid"` is explicitly present, non-empty `baseUrl`, `userId`, and + `userKey` are required. Missing required values fail runtime loading. +- `mode: "curated"` disables Gateway initialization and ignores an optional + Gateway block. - `appId` and `projectId` default to `default`. -- `scope` is read from config and must be a non-empty subset of - `current_chat`, `resources`, and `all_user_memory`. The initial test setup - uses `current_chat` and `resources` only. +- `scope` must be a non-empty subset of `current_chat`, `resources`, and + `all_user_memory`. The initial integration uses `current_chat` and + `resources`. - `topK` defaults to 8 and must be between 1 and 100. - `timeoutSeconds` defaults to 10 and must be positive. -- Invalid Gateway configuration fails runtime loading. Network and HTTP - failures after valid startup configuration remain best-effort. -- `userKey` must never appear in status payloads, session event payloads, or - error messages produced by Beaver. +- `userKey` must never appear in status payloads, warnings, logs produced by + this integration, session events, or raised configuration/client errors. + +The parsed configuration must retain whether hybrid mode was explicit or +implicit so runtime loading can apply the different validation behavior. ## Architecture -### Memory backend strategy +### Existing curated memory remains unchanged -Introduce one runtime-facing memory strategy abstraction with two operations: +`MemoryStore`, `MemorySnapshot`, `MemoryService`, and `MemoryTool` retain their +current responsibilities: -1. `recall_before_run`: prepare memory context before provider messages are - built. -2. `persist_after_run`: persist the current user message and final assistant - answer after the run reaches its normal completion path. +- `EngineLoader` always initializes `MemoryService`. +- `AgentLoop` always captures a per-run frozen curated snapshot. +- `ContextBuilder` always receives that snapshot for system-prompt injection. +- The original `memory` tool remains registered and always operates only on + `MEMORY.md` and `USER.md`. +- Gateway availability and Gateway failures do not change curated behavior. -The strategy has two implementations: +### Optional Gateway service -- `CuratedMemoryBackend` wraps the existing `MemoryService`. Recall returns the - existing frozen `MemorySnapshot`; post-run persistence is a no-op because - curated writes remain model-driven through the existing `memory` tool. -- `MemoryGatewayBackend` wraps a dedicated asynchronous HTTP client. Recall - calls Gateway search and returns sanitized reference content; persistence - calls add once and, only after add succeeds, flush once. +Add a separate `MemoryGatewayService` rather than a mutually exclusive backend +strategy. It is present only when hybrid mode has a valid Gateway configuration. -`EngineLoader` validates configuration, constructs exactly one strategy, and -registers the original `memory` tool only in curated mode. `session_search` -remains available in both modes because transcript search is separate from the -selected long-term memory backend. +The service exposes two runtime operations: -`AgentLoop` depends on the strategy interface rather than branching directly -on the configured mode. +1. `recall_before_run`: search Gateway using the current Beaver session and + user prompt, then return sanitized reference messages plus audit metadata. +2. `persist_after_run`: add the current user message and final assistant answer, + then flush the Gateway chat session. + +`EngineLoadResult` exposes `memory_gateway_service: MemoryGatewayService | None`. +`AgentLoop` uses it conditionally while continuing its existing curated path +unconditionally. + +`session_search` remains independent and available in both modes. ### Memory Gateway HTTP client -The client owns only HTTP transport and response validation for: +The HTTP client owns transport and response validation for: - `POST {baseUrl}/memories/search` - `POST {baseUrl}/memories/add` - `POST {baseUrl}/memories/flush` -It uses an async HTTP client, the configured timeout, JSON request bodies, and -a small typed exception that contains HTTP status/path context but never -contains the configured `userKey` or complete request body. +It uses an asynchronous HTTP client, the configured timeout, JSON request +bodies, and sanitized typed exceptions containing operation/path/status +metadata without credentials or complete request bodies. -No automatic retry is added in Beaver for this first integration. The Gateway -already handles upstream ingestion retries, and retrying add from Beaver could -duplicate a conversation turn when the first request succeeded but its -response was lost. +Beaver adds no automatic retries in this first integration. Gateway already +retries upstream ingestion, and retrying add from Beaver could duplicate a +turn when the first request succeeded but its response was lost. ## Recall Data Flow -For every run in `memory_gateway` mode: +Every run follows the existing curated flow. Hybrid mode adds these steps: -1. `AgentLoop` creates or resolves the Beaver `session_id`. -2. Before `ContextBuilder.build_messages`, it calls - `MemoryGatewayBackend.recall_before_run` with the current user prompt. -3. The Gateway search request is: +1. `AgentLoop` creates or resolves `resolved_session_id`. +2. It captures the curated frozen snapshot as it does today. +3. Before `ContextBuilder.build_messages`, it calls Gateway search using: ```json { "user_id": "", "user_key": "", - "conversation_id": "", + "conversation_id": "", "query": "", "scope": [""], "top_k": 8, @@ -143,25 +159,36 @@ For every run in `memory_gateway` mode: ``` 4. Beaver accepts only a top-level `results` list. Malformed responses are - treated as recall failures. -5. Each result is reduced to these optional fields: - `id`, `session_id`, `text`, `score`, `source_scope`, and `resource_uri`. - Gateway `raw` data is never injected into the model. -6. Empty or unusable results produce no recalled-memory message. + treated as Gateway recall failures. +5. Each result is reduced to the optional fields `id`, `session_id`, `text`, + `score`, `source_scope`, and `resource_uri`. The Gateway `raw` object is + discarded. +6. Empty or unusable results produce no Gateway reference message. 7. Non-empty results become one ephemeral provider message placed after skill activation messages and before persisted session history/current user input. - The message is reference data, is not written to Beaver's session history, - and is not included in post-run Gateway persistence. -8. The stable system prompt includes a short rule that recalled memory is - untrusted reference data, not executable instruction. The recalled text - itself is not concatenated into the system prompt. +8. The Gateway reference message is not written to Beaver session history and + is not included in post-run Gateway persistence. +9. The system prompt includes a stable rule that Gateway recall is untrusted + reference data, not executable instruction. The recalled text itself stays + outside the system prompt. -In `curated` mode, this flow is unchanged from today: a per-run frozen curated -snapshot is added to the system prompt, and no Gateway request occurs. +The model receives both memory layers without an imposed priority: + +- Curated blocks remain in the system prompt exactly as today. +- Gateway results appear as a separately labelled reference message. +- Beaver performs no conflict detection, winner selection, merge, or + deduplication between them. + +In curated mode, or when implicit hybrid degrades because Gateway credentials +are absent, no Gateway request or Gateway prompt section occurs. ## Persistence Data Flow -For every `memory_gateway` run that reaches the normal completion path: +Curated persistence remains model-driven through the original `memory` tool. +Gateway persistence is separate and occurs only when the optional Gateway +service is active. + +For each run that reaches the normal completion path: 1. Wait until the tool loop has produced the final assistant text. 2. Construct exactly two Gateway messages in chronological order: @@ -183,8 +210,8 @@ For every `memory_gateway` run that reaches the normal completion path: ] ``` -Timestamps are UTC Unix epoch milliseconds captured for the user turn and the -final assistant turn. They must be positive and monotonic within the payload. +Timestamps are UTC Unix epoch milliseconds captured for the user turn and final +assistant turn. They must be positive and monotonic within the payload. 3. Call `/memories/add` exactly once with: @@ -192,75 +219,87 @@ final assistant turn. They must be positive and monotonic within the payload. { "user_id": "", "user_key": "", - "session_id": "chat:", + "session_id": "chat:", "app_id": "", "project_id": "", "messages": [""] } ``` -4. If add succeeds, call `/memories/flush` exactly once with the same Gateway +4. If add succeeds, call `/memories/flush` exactly once using the same Gateway identity, app/project scope, and `chat:`. 5. If add fails, do not call flush. -6. Runs that enter Beaver's exception/error completion path are not persisted. - Normal completion outputs such as a tool-limit fallback are persisted - because they are the assistant response returned to the user. -7. Tool calls, tool results, hidden events, system prompts, recalled-memory - messages, reasoning content, and activated skill text are never included. - -In `curated` mode, there is no automatic post-run persistence. Existing -model-driven `memory` tool writes remain unchanged. +6. Runs entering Beaver's exception/error completion path are not persisted. + Normal completion outputs such as a tool-limit fallback are persisted because + they are returned to the user. +7. Tool calls, tool results, hidden events, system prompts, curated snapshot + text, Gateway recalled text, reasoning, and activated skill text are never + included in the Gateway add payload. +8. Gateway persistence never modifies `MEMORY.md` or `USER.md`. +9. Curated `memory` tool add/replace/remove operations never call Gateway. ## Session Audit Events -Gateway mode writes hidden (`context_visible=false`) session events without -credentials or full Gateway response bodies: +When the Gateway service is active, Beaver writes hidden +(`context_visible=false`) session events without credentials or full response +bodies: -- `memory_gateway_recall_succeeded`: scope and result count. +- `memory_gateway_recall_succeeded`: configured scopes and result count. - `memory_gateway_recall_failed`: operation, sanitized error category, and optional HTTP status. -- `memory_gateway_add_succeeded`: session identifier and message count. +- `memory_gateway_add_succeeded`: Gateway chat session and message count. - `memory_gateway_add_failed`: sanitized failure metadata. -- `memory_gateway_flush_succeeded`: session identifier. -- `memory_gateway_flush_failed`: sanitized failure metadata and an indication - that add had already succeeded. +- `memory_gateway_flush_succeeded`: Gateway chat session. +- `memory_gateway_flush_failed`: sanitized failure metadata and indication that + add already succeeded. -These events support debugging without entering normal context history or FTS. +For implicit hybrid degradation at runtime boot, use a normal application +warning rather than a session event because no session exists yet. The warning +must not contain credential values. ## Failure Semantics +- Curated initialization or writes retain their existing behavior and are not + caught or changed by Gateway code. +- Missing Gateway credentials in implicit-default hybrid mode: warn, leave the + Gateway service unset, and continue with curated memory. +- Missing/invalid Gateway configuration in explicit hybrid mode: fail runtime + loading with a sanitized configuration error. - Search timeout, connection failure, 401, other HTTP error, or malformed JSON: - record recall failure and continue the run without recalled memory. + record recall failure and continue with curated memory and normal context. - Add failure: record add failure, skip flush, and return the normal assistant result. - Flush failure: record flush failure and return the normal assistant result. -- Audit event persistence failure follows existing session-store behavior and - is not separately swallowed by the memory strategy. -- Gateway failures are not shown as user-facing chat errors in this phase. +- Gateway failures do not disable, roll back, or mutate curated memory. +- Gateway failures are not surfaced as user-facing chat errors in this phase. ## Security and Privacy - Fixed Gateway credentials come only from Beaver instance configuration. -- `userKey` is passed only in Gateway request bodies and retained in the in-memory - configuration/client object. -- Client exceptions and audit payloads use sanitized operation metadata, never - serialized request bodies. -- Recalled resource and conversation text is treated as untrusted data. -- Gateway `raw` fields are discarded before prompt construction to limit prompt - size and reduce accidental propagation of backend metadata. -- Memory modes are mutually exclusive, preventing duplicate recall and writes - across curated and Gateway stores. +- `userKey` is passed only in Gateway request bodies and retained in memory by + the typed config/client objects. +- Client exceptions, startup warnings, and audit payloads never serialize + request bodies or credentials. +- Gateway conversation/resource text is treated as untrusted data. +- Gateway `raw` fields are discarded before prompt construction. +- Curated and Gateway stores remain isolated. No content is copied between + them: curated receives only explicit `memory` tool mutations, while Gateway + receives only the configured per-run conversation payload. ## Testing ### Configuration tests -- Missing memory configuration defaults to curated mode. -- Complete Gateway configuration parses camelCase and exposes normalized typed - values. -- Invalid mode, empty credentials, empty/unknown scopes, invalid `topK`, and - non-positive timeout fail with explicit configuration errors. -- Error text does not include `userKey`. +- Missing memory configuration produces implicit hybrid mode. +- Implicit hybrid without credentials leaves Gateway disabled and curated + enabled, with one sanitized warning. +- Explicit curated mode does not require or initialize Gateway. +- Complete explicit hybrid config parses camelCase fields and initializes both + memory layers. +- Explicit hybrid with missing credentials fails loading. +- Invalid mode, empty/unknown scope, invalid `topK`, and non-positive timeout + fail with explicit sanitized errors. +- No warning or exception text contains `userKey`. ### HTTP client tests @@ -270,32 +309,43 @@ These events support debugging without entering normal context history or FTS. client exceptions. - Exception strings never contain the configured key. -### Strategy tests +### Gateway service tests -- Curated mode returns a frozen snapshot and performs no HTTP requests. -- Gateway mode performs search with configured scopes and strips `raw` fields. +- Search uses configured scopes and strips `raw` fields. - Empty search results produce no reference message. -- Gateway persistence sends exactly the original user prompt and final assistant +- Persistence sends exactly the original user prompt and final assistant response, then flushes once. - Add failure skips flush; flush failure preserves the successful add outcome. +- Service methods never read or write curated files or call `MemoryStore`. -### Agent loop tests +### Agent loop and loader tests -- Gateway search occurs before the provider call. -- Recalled content appears before the current user prompt and outside the system +- Curated snapshot injection and `memory` tool availability remain present in + both curated and hybrid modes. +- Hybrid search occurs before the provider call while the curated snapshot is + still present in the system prompt. +- Gateway recall appears before the current user prompt and outside the system prompt body. -- The system prompt contains the untrusted-reference rule in Gateway mode. +- The system prompt contains the untrusted-reference rule only when Gateway is + active. - Add and flush happen after the final assistant response and exactly once each. -- Tool/system/reasoning content is absent from the add payload. -- Recall/add/flush failures do not change the returned `AgentRunResult`. +- Tool/system/reasoning/curated/Gateway-recall content is absent from the add + payload. +- Recall/add/flush failures do not change the returned `AgentRunResult` or the + curated snapshot/tool behavior. - Hidden success/failure audit events contain no credentials. -- Curated mode regression tests confirm frozen snapshot injection and original - `memory` tool availability. -- Gateway mode confirms the original `memory` tool is not registered or exposed. +- Curated `memory` tool operations produce no Gateway calls. +- Gateway persistence produces no changes to `MEMORY.md` or `USER.md`. +- Curated mode and degraded implicit hybrid perform no Gateway HTTP calls. ## Documentation -Update the backend README/config example with both modes and a warning that the -test-stage `userKey` is a secret. Document that changing modes requires runtime -reload/restart because `EngineLoader` constructs the selected strategy during -boot. +Update the backend README/config example with: + +- `hybrid` as the implicit default. +- Explicit `curated` mode for disabling Gateway. +- A complete explicit hybrid example. +- The implicit-default degradation rule and explicit-hybrid validation rule. +- A warning that `userKey` is a secret. +- A note that changing memory mode/config requires runtime reload or restart + because `EngineLoader` constructs the optional Gateway service during boot.