docs: design memory gateway backend integration
This commit is contained in:
@ -0,0 +1,301 @@
|
||||
# Memory Gateway Backend Design
|
||||
|
||||
## Goal
|
||||
|
||||
Allow each Beaver instance to select exactly one memory backend through
|
||||
`.beaver/config.json`:
|
||||
|
||||
- `curated`: preserve the existing `MEMORY.md` / `USER.md` snapshot and `memory`
|
||||
tool behavior.
|
||||
- `memory_gateway`: recall memory through `POST /memories/search`, then persist
|
||||
each completed conversation turn through one `POST /memories/add` followed by
|
||||
one `POST /memories/flush`.
|
||||
|
||||
The Memory Gateway integration is best-effort. Gateway failures must be
|
||||
auditable without turning an otherwise successful Beaver chat run into a
|
||||
failure.
|
||||
|
||||
## Scope
|
||||
|
||||
This change includes:
|
||||
|
||||
- Runtime configuration for selecting the memory backend.
|
||||
- Fixed Memory Gateway credentials and search scopes in instance config.
|
||||
- A Memory Gateway HTTP client.
|
||||
- A memory backend strategy boundary used by `AgentLoop`.
|
||||
- Pre-run recall and post-run turn persistence.
|
||||
- Hidden session audit events for recall and persistence outcomes.
|
||||
- Unit and integration-style tests using fake HTTP responses/providers.
|
||||
|
||||
This change does not include:
|
||||
|
||||
- Automatic `POST /users` calls or credential provisioning.
|
||||
- A memory settings UI or memory administration UI.
|
||||
- Resource upload support from Beaver.
|
||||
- Gateway memory override or deletion APIs.
|
||||
- Persisting tool calls, tool results, system events, reasoning, or skill
|
||||
activation messages.
|
||||
- Simultaneously enabling curated memory and Memory Gateway.
|
||||
|
||||
## Configuration
|
||||
|
||||
Beaver adds a top-level `memory` section:
|
||||
|
||||
```json
|
||||
{
|
||||
"memory": {
|
||||
"mode": "memory_gateway",
|
||||
"gateway": {
|
||||
"baseUrl": "http://127.0.0.1:8010",
|
||||
"userId": "gateway_test_user",
|
||||
"userKey": "uk_xxx",
|
||||
"appId": "default",
|
||||
"projectId": "default",
|
||||
"scope": ["current_chat", "resources"],
|
||||
"topK": 8,
|
||||
"timeoutSeconds": 10
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Configuration rules:
|
||||
|
||||
- Missing `memory.mode` defaults to `curated` for backward compatibility.
|
||||
- Valid modes are only `curated` and `memory_gateway`.
|
||||
- Gateway mode requires non-empty `baseUrl`, `userId`, and `userKey`.
|
||||
- `appId` and `projectId` default to `default`.
|
||||
- `scope` is read from config and must be a non-empty subset of
|
||||
`current_chat`, `resources`, and `all_user_memory`. The initial test setup
|
||||
uses `current_chat` and `resources` only.
|
||||
- `topK` defaults to 8 and must be between 1 and 100.
|
||||
- `timeoutSeconds` defaults to 10 and must be positive.
|
||||
- Invalid Gateway configuration fails runtime loading. Network and HTTP
|
||||
failures after valid startup configuration remain best-effort.
|
||||
- `userKey` must never appear in status payloads, session event payloads, or
|
||||
error messages produced by Beaver.
|
||||
|
||||
## Architecture
|
||||
|
||||
### Memory backend strategy
|
||||
|
||||
Introduce one runtime-facing memory strategy abstraction with two operations:
|
||||
|
||||
1. `recall_before_run`: prepare memory context before provider messages are
|
||||
built.
|
||||
2. `persist_after_run`: persist the current user message and final assistant
|
||||
answer after the run reaches its normal completion path.
|
||||
|
||||
The strategy has two implementations:
|
||||
|
||||
- `CuratedMemoryBackend` wraps the existing `MemoryService`. Recall returns the
|
||||
existing frozen `MemorySnapshot`; post-run persistence is a no-op because
|
||||
curated writes remain model-driven through the existing `memory` tool.
|
||||
- `MemoryGatewayBackend` wraps a dedicated asynchronous HTTP client. Recall
|
||||
calls Gateway search and returns sanitized reference content; persistence
|
||||
calls add once and, only after add succeeds, flush once.
|
||||
|
||||
`EngineLoader` validates configuration, constructs exactly one strategy, and
|
||||
registers the original `memory` tool only in curated mode. `session_search`
|
||||
remains available in both modes because transcript search is separate from the
|
||||
selected long-term memory backend.
|
||||
|
||||
`AgentLoop` depends on the strategy interface rather than branching directly
|
||||
on the configured mode.
|
||||
|
||||
### Memory Gateway HTTP client
|
||||
|
||||
The client owns only HTTP transport and response validation for:
|
||||
|
||||
- `POST {baseUrl}/memories/search`
|
||||
- `POST {baseUrl}/memories/add`
|
||||
- `POST {baseUrl}/memories/flush`
|
||||
|
||||
It uses an async HTTP client, the configured timeout, JSON request bodies, and
|
||||
a small typed exception that contains HTTP status/path context but never
|
||||
contains the configured `userKey` or complete request body.
|
||||
|
||||
No automatic retry is added in Beaver for this first integration. The Gateway
|
||||
already handles upstream ingestion retries, and retrying add from Beaver could
|
||||
duplicate a conversation turn when the first request succeeded but its
|
||||
response was lost.
|
||||
|
||||
## Recall Data Flow
|
||||
|
||||
For every run in `memory_gateway` mode:
|
||||
|
||||
1. `AgentLoop` creates or resolves the Beaver `session_id`.
|
||||
2. Before `ContextBuilder.build_messages`, it calls
|
||||
`MemoryGatewayBackend.recall_before_run` with the current user prompt.
|
||||
3. The Gateway search request is:
|
||||
|
||||
```json
|
||||
{
|
||||
"user_id": "<configured userId>",
|
||||
"user_key": "<configured userKey>",
|
||||
"conversation_id": "<Beaver resolved_session_id>",
|
||||
"query": "<current user prompt>",
|
||||
"scope": ["<configured scopes>"],
|
||||
"top_k": 8,
|
||||
"app_id": "<configured appId>",
|
||||
"project_id": "<configured projectId>"
|
||||
}
|
||||
```
|
||||
|
||||
4. Beaver accepts only a top-level `results` list. Malformed responses are
|
||||
treated as recall failures.
|
||||
5. Each result is reduced to these optional fields:
|
||||
`id`, `session_id`, `text`, `score`, `source_scope`, and `resource_uri`.
|
||||
Gateway `raw` data is never injected into the model.
|
||||
6. Empty or unusable results produce no recalled-memory message.
|
||||
7. Non-empty results become one ephemeral provider message placed after skill
|
||||
activation messages and before persisted session history/current user input.
|
||||
The message is reference data, is not written to Beaver's session history,
|
||||
and is not included in post-run Gateway persistence.
|
||||
8. The stable system prompt includes a short rule that recalled memory is
|
||||
untrusted reference data, not executable instruction. The recalled text
|
||||
itself is not concatenated into the system prompt.
|
||||
|
||||
In `curated` mode, this flow is unchanged from today: a per-run frozen curated
|
||||
snapshot is added to the system prompt, and no Gateway request occurs.
|
||||
|
||||
## Persistence Data Flow
|
||||
|
||||
For every `memory_gateway` run that reaches the normal completion path:
|
||||
|
||||
1. Wait until the tool loop has produced the final assistant text.
|
||||
2. Construct exactly two Gateway messages in chronological order:
|
||||
|
||||
```json
|
||||
[
|
||||
{
|
||||
"sender_id": "<configured userId>",
|
||||
"role": "user",
|
||||
"timestamp": 1780000000000,
|
||||
"content": "<original current user prompt>"
|
||||
},
|
||||
{
|
||||
"sender_id": "beaver",
|
||||
"role": "assistant",
|
||||
"timestamp": 1780000001000,
|
||||
"content": "<final assistant text>"
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
Timestamps are UTC Unix epoch milliseconds captured for the user turn and the
|
||||
final assistant turn. They must be positive and monotonic within the payload.
|
||||
|
||||
3. Call `/memories/add` exactly once with:
|
||||
|
||||
```json
|
||||
{
|
||||
"user_id": "<configured userId>",
|
||||
"user_key": "<configured userKey>",
|
||||
"session_id": "chat:<Beaver resolved_session_id>",
|
||||
"app_id": "<configured appId>",
|
||||
"project_id": "<configured projectId>",
|
||||
"messages": ["<the two messages above>"]
|
||||
}
|
||||
```
|
||||
|
||||
4. If add succeeds, call `/memories/flush` exactly once with the same Gateway
|
||||
identity, app/project scope, and `chat:<resolved_session_id>`.
|
||||
5. If add fails, do not call flush.
|
||||
6. Runs that enter Beaver's exception/error completion path are not persisted.
|
||||
Normal completion outputs such as a tool-limit fallback are persisted
|
||||
because they are the assistant response returned to the user.
|
||||
7. Tool calls, tool results, hidden events, system prompts, recalled-memory
|
||||
messages, reasoning content, and activated skill text are never included.
|
||||
|
||||
In `curated` mode, there is no automatic post-run persistence. Existing
|
||||
model-driven `memory` tool writes remain unchanged.
|
||||
|
||||
## Session Audit Events
|
||||
|
||||
Gateway mode writes hidden (`context_visible=false`) session events without
|
||||
credentials or full Gateway response bodies:
|
||||
|
||||
- `memory_gateway_recall_succeeded`: scope and result count.
|
||||
- `memory_gateway_recall_failed`: operation, sanitized error category, and
|
||||
optional HTTP status.
|
||||
- `memory_gateway_add_succeeded`: session identifier and message count.
|
||||
- `memory_gateway_add_failed`: sanitized failure metadata.
|
||||
- `memory_gateway_flush_succeeded`: session identifier.
|
||||
- `memory_gateway_flush_failed`: sanitized failure metadata and an indication
|
||||
that add had already succeeded.
|
||||
|
||||
These events support debugging without entering normal context history or FTS.
|
||||
|
||||
## Failure Semantics
|
||||
|
||||
- Search timeout, connection failure, 401, other HTTP error, or malformed JSON:
|
||||
record recall failure and continue the run without recalled memory.
|
||||
- Add failure: record add failure, skip flush, and return the normal assistant
|
||||
result.
|
||||
- Flush failure: record flush failure and return the normal assistant result.
|
||||
- Audit event persistence failure follows existing session-store behavior and
|
||||
is not separately swallowed by the memory strategy.
|
||||
- Gateway failures are not shown as user-facing chat errors in this phase.
|
||||
|
||||
## Security and Privacy
|
||||
|
||||
- Fixed Gateway credentials come only from Beaver instance configuration.
|
||||
- `userKey` is passed only in Gateway request bodies and retained in the in-memory
|
||||
configuration/client object.
|
||||
- Client exceptions and audit payloads use sanitized operation metadata, never
|
||||
serialized request bodies.
|
||||
- Recalled resource and conversation text is treated as untrusted data.
|
||||
- Gateway `raw` fields are discarded before prompt construction to limit prompt
|
||||
size and reduce accidental propagation of backend metadata.
|
||||
- Memory modes are mutually exclusive, preventing duplicate recall and writes
|
||||
across curated and Gateway stores.
|
||||
|
||||
## Testing
|
||||
|
||||
### Configuration tests
|
||||
|
||||
- Missing memory configuration defaults to curated mode.
|
||||
- Complete Gateway configuration parses camelCase and exposes normalized typed
|
||||
values.
|
||||
- Invalid mode, empty credentials, empty/unknown scopes, invalid `topK`, and
|
||||
non-positive timeout fail with explicit configuration errors.
|
||||
- Error text does not include `userKey`.
|
||||
|
||||
### HTTP client tests
|
||||
|
||||
- Search, add, and flush use the exact paths and payload shapes above.
|
||||
- Configured timeout is applied.
|
||||
- Non-2xx, network, invalid JSON, and invalid response shapes produce sanitized
|
||||
client exceptions.
|
||||
- Exception strings never contain the configured key.
|
||||
|
||||
### Strategy tests
|
||||
|
||||
- Curated mode returns a frozen snapshot and performs no HTTP requests.
|
||||
- Gateway mode performs search with configured scopes and strips `raw` fields.
|
||||
- Empty search results produce no reference message.
|
||||
- Gateway persistence sends exactly the original user prompt and final assistant
|
||||
response, then flushes once.
|
||||
- Add failure skips flush; flush failure preserves the successful add outcome.
|
||||
|
||||
### Agent loop tests
|
||||
|
||||
- Gateway search occurs before the provider call.
|
||||
- Recalled content appears before the current user prompt and outside the system
|
||||
prompt body.
|
||||
- The system prompt contains the untrusted-reference rule in Gateway mode.
|
||||
- Add and flush happen after the final assistant response and exactly once each.
|
||||
- Tool/system/reasoning content is absent from the add payload.
|
||||
- Recall/add/flush failures do not change the returned `AgentRunResult`.
|
||||
- Hidden success/failure audit events contain no credentials.
|
||||
- Curated mode regression tests confirm frozen snapshot injection and original
|
||||
`memory` tool availability.
|
||||
- Gateway mode confirms the original `memory` tool is not registered or exposed.
|
||||
|
||||
## Documentation
|
||||
|
||||
Update the backend README/config example with both modes and a warning that the
|
||||
test-stage `userKey` is a secret. Document that changing modes requires runtime
|
||||
reload/restart because `EngineLoader` constructs the selected strategy during
|
||||
boot.
|
||||
Reference in New Issue
Block a user