docs: revise memory gateway design for hybrid mode

This commit is contained in:
2026-06-15 10:56:53 +08:00
parent 71168b83b1
commit b3c6ee4b78

View File

@ -1,41 +1,47 @@
# Memory Gateway Backend Design # Hybrid Memory Gateway Integration Design
## Goal ## Goal
Allow each Beaver instance to select exactly one memory backend through Keep Beaver's existing curated memory as the permanent baseline and optionally
`.beaver/config.json`: add Memory Gateway as an independent second memory layer.
- `curated`: preserve the existing `MEMORY.md` / `USER.md` snapshot and `memory` - Curated memory continues to load `MEMORY.md` and `USER.md` into a frozen
tool behavior. per-run snapshot and continues to expose the existing `memory` tool.
- `memory_gateway`: recall memory through `POST /memories/search`, then persist - Memory Gateway independently recalls conversation/resource memory through
each completed conversation turn through one `POST /memories/add` followed by `POST /memories/search` and persists each completed conversation turn through
one `POST /memories/flush`. one `POST /memories/add` followed by one `POST /memories/flush`.
- The two layers do not synchronize, overwrite, merge, deduplicate, or resolve
conflicts with each other.
The Memory Gateway integration is best-effort. Gateway failures must be Memory Gateway is best-effort. Gateway failures must be auditable without
auditable without turning an otherwise successful Beaver chat run into a affecting curated memory or turning an otherwise successful chat run into a
failure. failure.
## Scope ## Scope
This change includes: This change includes:
- Runtime configuration for selecting the memory backend. - Runtime configuration for `curated` and `hybrid` modes.
- Fixed Memory Gateway credentials and search scopes in instance config. - Fixed Memory Gateway credentials and search scopes in instance config.
- A Memory Gateway HTTP client. - An asynchronous Memory Gateway HTTP client.
- A memory backend strategy boundary used by `AgentLoop`. - An optional `MemoryGatewayService` alongside the existing `MemoryService`.
- Pre-run recall and post-run turn persistence. - Gateway recall before each provider run in hybrid mode.
- Hidden session audit events for recall and persistence outcomes. - Gateway add and flush after each normally completed run in hybrid mode.
- Unit and integration-style tests using fake HTTP responses/providers. - Hidden session audit events for Gateway outcomes.
- Unit and integration-style tests using fake transports and providers.
This change does not include: This change does not include:
- Replacing or disabling curated memory.
- Synchronizing curated `memory` tool writes to Memory Gateway.
- Writing Gateway conversation turns into `MEMORY.md` or `USER.md`.
- Conflict resolution or automatic deduplication across the two layers.
- Automatic `POST /users` calls or credential provisioning. - Automatic `POST /users` calls or credential provisioning.
- A memory settings UI or memory administration UI. - A memory settings UI or memory administration UI.
- Resource upload support from Beaver. - Resource upload support from Beaver.
- Gateway memory override or deletion APIs. - Gateway override or deletion APIs.
- Persisting tool calls, tool results, system events, reasoning, or skill - Persisting tool calls, tool results, system events, reasoning, recalled
activation messages. memory, or skill activation messages to Gateway.
- Simultaneously enabling curated memory and Memory Gateway.
## Configuration ## Configuration
@ -44,7 +50,7 @@ Beaver adds a top-level `memory` section:
```json ```json
{ {
"memory": { "memory": {
"mode": "memory_gateway", "mode": "hybrid",
"gateway": { "gateway": {
"baseUrl": "http://127.0.0.1:8010", "baseUrl": "http://127.0.0.1:8010",
"userId": "gateway_test_user", "userId": "gateway_test_user",
@ -61,79 +67,89 @@ Beaver adds a top-level `memory` section:
Configuration rules: Configuration rules:
- Missing `memory.mode` defaults to `curated` for backward compatibility. - Valid modes are `curated` and `hybrid`.
- Valid modes are only `curated` and `memory_gateway`. - Curated memory is initialized and enabled in both modes.
- Gateway mode requires non-empty `baseUrl`, `userId`, and `userKey`. - If the entire `memory` section is absent, the effective mode is implicitly
`hybrid`. Missing Gateway credentials in this implicit-default case produce
a startup warning and degrade only the Gateway layer; Beaver continues with
curated memory.
- If `mode: "hybrid"` is explicitly present, non-empty `baseUrl`, `userId`, and
`userKey` are required. Missing required values fail runtime loading.
- `mode: "curated"` disables Gateway initialization and ignores an optional
Gateway block.
- `appId` and `projectId` default to `default`. - `appId` and `projectId` default to `default`.
- `scope` is read from config and must be a non-empty subset of - `scope` must be a non-empty subset of `current_chat`, `resources`, and
`current_chat`, `resources`, and `all_user_memory`. The initial test setup `all_user_memory`. The initial integration uses `current_chat` and
uses `current_chat` and `resources` only. `resources`.
- `topK` defaults to 8 and must be between 1 and 100. - `topK` defaults to 8 and must be between 1 and 100.
- `timeoutSeconds` defaults to 10 and must be positive. - `timeoutSeconds` defaults to 10 and must be positive.
- Invalid Gateway configuration fails runtime loading. Network and HTTP - `userKey` must never appear in status payloads, warnings, logs produced by
failures after valid startup configuration remain best-effort. this integration, session events, or raised configuration/client errors.
- `userKey` must never appear in status payloads, session event payloads, or
error messages produced by Beaver. The parsed configuration must retain whether hybrid mode was explicit or
implicit so runtime loading can apply the different validation behavior.
## Architecture ## Architecture
### Memory backend strategy ### Existing curated memory remains unchanged
Introduce one runtime-facing memory strategy abstraction with two operations: `MemoryStore`, `MemorySnapshot`, `MemoryService`, and `MemoryTool` retain their
current responsibilities:
1. `recall_before_run`: prepare memory context before provider messages are - `EngineLoader` always initializes `MemoryService`.
built. - `AgentLoop` always captures a per-run frozen curated snapshot.
2. `persist_after_run`: persist the current user message and final assistant - `ContextBuilder` always receives that snapshot for system-prompt injection.
answer after the run reaches its normal completion path. - The original `memory` tool remains registered and always operates only on
`MEMORY.md` and `USER.md`.
- Gateway availability and Gateway failures do not change curated behavior.
The strategy has two implementations: ### Optional Gateway service
- `CuratedMemoryBackend` wraps the existing `MemoryService`. Recall returns the Add a separate `MemoryGatewayService` rather than a mutually exclusive backend
existing frozen `MemorySnapshot`; post-run persistence is a no-op because strategy. It is present only when hybrid mode has a valid Gateway configuration.
curated writes remain model-driven through the existing `memory` tool.
- `MemoryGatewayBackend` wraps a dedicated asynchronous HTTP client. Recall
calls Gateway search and returns sanitized reference content; persistence
calls add once and, only after add succeeds, flush once.
`EngineLoader` validates configuration, constructs exactly one strategy, and The service exposes two runtime operations:
registers the original `memory` tool only in curated mode. `session_search`
remains available in both modes because transcript search is separate from the
selected long-term memory backend.
`AgentLoop` depends on the strategy interface rather than branching directly 1. `recall_before_run`: search Gateway using the current Beaver session and
on the configured mode. user prompt, then return sanitized reference messages plus audit metadata.
2. `persist_after_run`: add the current user message and final assistant answer,
then flush the Gateway chat session.
`EngineLoadResult` exposes `memory_gateway_service: MemoryGatewayService | None`.
`AgentLoop` uses it conditionally while continuing its existing curated path
unconditionally.
`session_search` remains independent and available in both modes.
### Memory Gateway HTTP client ### Memory Gateway HTTP client
The client owns only HTTP transport and response validation for: The HTTP client owns transport and response validation for:
- `POST {baseUrl}/memories/search` - `POST {baseUrl}/memories/search`
- `POST {baseUrl}/memories/add` - `POST {baseUrl}/memories/add`
- `POST {baseUrl}/memories/flush` - `POST {baseUrl}/memories/flush`
It uses an async HTTP client, the configured timeout, JSON request bodies, and It uses an asynchronous HTTP client, the configured timeout, JSON request
a small typed exception that contains HTTP status/path context but never bodies, and sanitized typed exceptions containing operation/path/status
contains the configured `userKey` or complete request body. metadata without credentials or complete request bodies.
No automatic retry is added in Beaver for this first integration. The Gateway Beaver adds no automatic retries in this first integration. Gateway already
already handles upstream ingestion retries, and retrying add from Beaver could retries upstream ingestion, and retrying add from Beaver could duplicate a
duplicate a conversation turn when the first request succeeded but its turn when the first request succeeded but its response was lost.
response was lost.
## Recall Data Flow ## Recall Data Flow
For every run in `memory_gateway` mode: Every run follows the existing curated flow. Hybrid mode adds these steps:
1. `AgentLoop` creates or resolves the Beaver `session_id`. 1. `AgentLoop` creates or resolves `resolved_session_id`.
2. Before `ContextBuilder.build_messages`, it calls 2. It captures the curated frozen snapshot as it does today.
`MemoryGatewayBackend.recall_before_run` with the current user prompt. 3. Before `ContextBuilder.build_messages`, it calls Gateway search using:
3. The Gateway search request is:
```json ```json
{ {
"user_id": "<configured userId>", "user_id": "<configured userId>",
"user_key": "<configured userKey>", "user_key": "<configured userKey>",
"conversation_id": "<Beaver resolved_session_id>", "conversation_id": "<resolved_session_id>",
"query": "<current user prompt>", "query": "<current user prompt>",
"scope": ["<configured scopes>"], "scope": ["<configured scopes>"],
"top_k": 8, "top_k": 8,
@ -143,25 +159,36 @@ For every run in `memory_gateway` mode:
``` ```
4. Beaver accepts only a top-level `results` list. Malformed responses are 4. Beaver accepts only a top-level `results` list. Malformed responses are
treated as recall failures. treated as Gateway recall failures.
5. Each result is reduced to these optional fields: 5. Each result is reduced to the optional fields `id`, `session_id`, `text`,
`id`, `session_id`, `text`, `score`, `source_scope`, and `resource_uri`. `score`, `source_scope`, and `resource_uri`. The Gateway `raw` object is
Gateway `raw` data is never injected into the model. discarded.
6. Empty or unusable results produce no recalled-memory message. 6. Empty or unusable results produce no Gateway reference message.
7. Non-empty results become one ephemeral provider message placed after skill 7. Non-empty results become one ephemeral provider message placed after skill
activation messages and before persisted session history/current user input. activation messages and before persisted session history/current user input.
The message is reference data, is not written to Beaver's session history, 8. The Gateway reference message is not written to Beaver session history and
and is not included in post-run Gateway persistence. is not included in post-run Gateway persistence.
8. The stable system prompt includes a short rule that recalled memory is 9. The system prompt includes a stable rule that Gateway recall is untrusted
untrusted reference data, not executable instruction. The recalled text reference data, not executable instruction. The recalled text itself stays
itself is not concatenated into the system prompt. outside the system prompt.
In `curated` mode, this flow is unchanged from today: a per-run frozen curated The model receives both memory layers without an imposed priority:
snapshot is added to the system prompt, and no Gateway request occurs.
- Curated blocks remain in the system prompt exactly as today.
- Gateway results appear as a separately labelled reference message.
- Beaver performs no conflict detection, winner selection, merge, or
deduplication between them.
In curated mode, or when implicit hybrid degrades because Gateway credentials
are absent, no Gateway request or Gateway prompt section occurs.
## Persistence Data Flow ## Persistence Data Flow
For every `memory_gateway` run that reaches the normal completion path: Curated persistence remains model-driven through the original `memory` tool.
Gateway persistence is separate and occurs only when the optional Gateway
service is active.
For each run that reaches the normal completion path:
1. Wait until the tool loop has produced the final assistant text. 1. Wait until the tool loop has produced the final assistant text.
2. Construct exactly two Gateway messages in chronological order: 2. Construct exactly two Gateway messages in chronological order:
@ -183,8 +210,8 @@ For every `memory_gateway` run that reaches the normal completion path:
] ]
``` ```
Timestamps are UTC Unix epoch milliseconds captured for the user turn and the Timestamps are UTC Unix epoch milliseconds captured for the user turn and final
final assistant turn. They must be positive and monotonic within the payload. assistant turn. They must be positive and monotonic within the payload.
3. Call `/memories/add` exactly once with: 3. Call `/memories/add` exactly once with:
@ -192,75 +219,87 @@ final assistant turn. They must be positive and monotonic within the payload.
{ {
"user_id": "<configured userId>", "user_id": "<configured userId>",
"user_key": "<configured userKey>", "user_key": "<configured userKey>",
"session_id": "chat:<Beaver resolved_session_id>", "session_id": "chat:<resolved_session_id>",
"app_id": "<configured appId>", "app_id": "<configured appId>",
"project_id": "<configured projectId>", "project_id": "<configured projectId>",
"messages": ["<the two messages above>"] "messages": ["<the two messages above>"]
} }
``` ```
4. If add succeeds, call `/memories/flush` exactly once with the same Gateway 4. If add succeeds, call `/memories/flush` exactly once using the same Gateway
identity, app/project scope, and `chat:<resolved_session_id>`. identity, app/project scope, and `chat:<resolved_session_id>`.
5. If add fails, do not call flush. 5. If add fails, do not call flush.
6. Runs that enter Beaver's exception/error completion path are not persisted. 6. Runs entering Beaver's exception/error completion path are not persisted.
Normal completion outputs such as a tool-limit fallback are persisted Normal completion outputs such as a tool-limit fallback are persisted because
because they are the assistant response returned to the user. they are returned to the user.
7. Tool calls, tool results, hidden events, system prompts, recalled-memory 7. Tool calls, tool results, hidden events, system prompts, curated snapshot
messages, reasoning content, and activated skill text are never included. text, Gateway recalled text, reasoning, and activated skill text are never
included in the Gateway add payload.
In `curated` mode, there is no automatic post-run persistence. Existing 8. Gateway persistence never modifies `MEMORY.md` or `USER.md`.
model-driven `memory` tool writes remain unchanged. 9. Curated `memory` tool add/replace/remove operations never call Gateway.
## Session Audit Events ## Session Audit Events
Gateway mode writes hidden (`context_visible=false`) session events without When the Gateway service is active, Beaver writes hidden
credentials or full Gateway response bodies: (`context_visible=false`) session events without credentials or full response
bodies:
- `memory_gateway_recall_succeeded`: scope and result count. - `memory_gateway_recall_succeeded`: configured scopes and result count.
- `memory_gateway_recall_failed`: operation, sanitized error category, and - `memory_gateway_recall_failed`: operation, sanitized error category, and
optional HTTP status. optional HTTP status.
- `memory_gateway_add_succeeded`: session identifier and message count. - `memory_gateway_add_succeeded`: Gateway chat session and message count.
- `memory_gateway_add_failed`: sanitized failure metadata. - `memory_gateway_add_failed`: sanitized failure metadata.
- `memory_gateway_flush_succeeded`: session identifier. - `memory_gateway_flush_succeeded`: Gateway chat session.
- `memory_gateway_flush_failed`: sanitized failure metadata and an indication - `memory_gateway_flush_failed`: sanitized failure metadata and indication that
that add had already succeeded. add already succeeded.
These events support debugging without entering normal context history or FTS. For implicit hybrid degradation at runtime boot, use a normal application
warning rather than a session event because no session exists yet. The warning
must not contain credential values.
## Failure Semantics ## Failure Semantics
- Curated initialization or writes retain their existing behavior and are not
caught or changed by Gateway code.
- Missing Gateway credentials in implicit-default hybrid mode: warn, leave the
Gateway service unset, and continue with curated memory.
- Missing/invalid Gateway configuration in explicit hybrid mode: fail runtime
loading with a sanitized configuration error.
- Search timeout, connection failure, 401, other HTTP error, or malformed JSON: - Search timeout, connection failure, 401, other HTTP error, or malformed JSON:
record recall failure and continue the run without recalled memory. record recall failure and continue with curated memory and normal context.
- Add failure: record add failure, skip flush, and return the normal assistant - Add failure: record add failure, skip flush, and return the normal assistant
result. result.
- Flush failure: record flush failure and return the normal assistant result. - Flush failure: record flush failure and return the normal assistant result.
- Audit event persistence failure follows existing session-store behavior and - Gateway failures do not disable, roll back, or mutate curated memory.
is not separately swallowed by the memory strategy. - Gateway failures are not surfaced as user-facing chat errors in this phase.
- Gateway failures are not shown as user-facing chat errors in this phase.
## Security and Privacy ## Security and Privacy
- Fixed Gateway credentials come only from Beaver instance configuration. - Fixed Gateway credentials come only from Beaver instance configuration.
- `userKey` is passed only in Gateway request bodies and retained in the in-memory - `userKey` is passed only in Gateway request bodies and retained in memory by
configuration/client object. the typed config/client objects.
- Client exceptions and audit payloads use sanitized operation metadata, never - Client exceptions, startup warnings, and audit payloads never serialize
serialized request bodies. request bodies or credentials.
- Recalled resource and conversation text is treated as untrusted data. - Gateway conversation/resource text is treated as untrusted data.
- Gateway `raw` fields are discarded before prompt construction to limit prompt - Gateway `raw` fields are discarded before prompt construction.
size and reduce accidental propagation of backend metadata. - Curated and Gateway stores remain isolated. No content is copied between
- Memory modes are mutually exclusive, preventing duplicate recall and writes them: curated receives only explicit `memory` tool mutations, while Gateway
across curated and Gateway stores. receives only the configured per-run conversation payload.
## Testing ## Testing
### Configuration tests ### Configuration tests
- Missing memory configuration defaults to curated mode. - Missing memory configuration produces implicit hybrid mode.
- Complete Gateway configuration parses camelCase and exposes normalized typed - Implicit hybrid without credentials leaves Gateway disabled and curated
values. enabled, with one sanitized warning.
- Invalid mode, empty credentials, empty/unknown scopes, invalid `topK`, and - Explicit curated mode does not require or initialize Gateway.
non-positive timeout fail with explicit configuration errors. - Complete explicit hybrid config parses camelCase fields and initializes both
- Error text does not include `userKey`. memory layers.
- Explicit hybrid with missing credentials fails loading.
- Invalid mode, empty/unknown scope, invalid `topK`, and non-positive timeout
fail with explicit sanitized errors.
- No warning or exception text contains `userKey`.
### HTTP client tests ### HTTP client tests
@ -270,32 +309,43 @@ These events support debugging without entering normal context history or FTS.
client exceptions. client exceptions.
- Exception strings never contain the configured key. - Exception strings never contain the configured key.
### Strategy tests ### Gateway service tests
- Curated mode returns a frozen snapshot and performs no HTTP requests. - Search uses configured scopes and strips `raw` fields.
- Gateway mode performs search with configured scopes and strips `raw` fields.
- Empty search results produce no reference message. - Empty search results produce no reference message.
- Gateway persistence sends exactly the original user prompt and final assistant - Persistence sends exactly the original user prompt and final assistant
response, then flushes once. response, then flushes once.
- Add failure skips flush; flush failure preserves the successful add outcome. - Add failure skips flush; flush failure preserves the successful add outcome.
- Service methods never read or write curated files or call `MemoryStore`.
### Agent loop tests ### Agent loop and loader tests
- Gateway search occurs before the provider call. - Curated snapshot injection and `memory` tool availability remain present in
- Recalled content appears before the current user prompt and outside the system both curated and hybrid modes.
- Hybrid search occurs before the provider call while the curated snapshot is
still present in the system prompt.
- Gateway recall appears before the current user prompt and outside the system
prompt body. prompt body.
- The system prompt contains the untrusted-reference rule in Gateway mode. - The system prompt contains the untrusted-reference rule only when Gateway is
active.
- Add and flush happen after the final assistant response and exactly once each. - Add and flush happen after the final assistant response and exactly once each.
- Tool/system/reasoning content is absent from the add payload. - Tool/system/reasoning/curated/Gateway-recall content is absent from the add
- Recall/add/flush failures do not change the returned `AgentRunResult`. payload.
- Recall/add/flush failures do not change the returned `AgentRunResult` or the
curated snapshot/tool behavior.
- Hidden success/failure audit events contain no credentials. - Hidden success/failure audit events contain no credentials.
- Curated mode regression tests confirm frozen snapshot injection and original - Curated `memory` tool operations produce no Gateway calls.
`memory` tool availability. - Gateway persistence produces no changes to `MEMORY.md` or `USER.md`.
- Gateway mode confirms the original `memory` tool is not registered or exposed. - Curated mode and degraded implicit hybrid perform no Gateway HTTP calls.
## Documentation ## Documentation
Update the backend README/config example with both modes and a warning that the Update the backend README/config example with:
test-stage `userKey` is a secret. Document that changing modes requires runtime
reload/restart because `EngineLoader` constructs the selected strategy during - `hybrid` as the implicit default.
boot. - Explicit `curated` mode for disabling Gateway.
- A complete explicit hybrid example.
- The implicit-default degradation rule and explicit-hybrid validation rule.
- A warning that `userKey` is a secret.
- A note that changing memory mode/config requires runtime reload or restart
because `EngineLoader` constructs the optional Gateway service during boot.