docs: design terminal websocket channel
# Terminal WebSocket Channel Design

Date: 2026-06-01

## Goal

Add a text-only WebSocket channel adapter so a small terminal device can connect to Beaver and exchange messages through the channel runtime.

This is a first-stage acceptance path for proving Beaver can talk to the terminal device. The terminal must enter through `ChannelRuntime` and `MessageBus`; it must not use the existing Web UI `/ws/{session_id}` direct-chat path.

## Non-Goals

- Do not implement audio, camera, screen, image, or multimodal payloads.
- Do not stream token deltas to the terminal in this phase.
- Do not add AuthZ or device registration in this phase.
- Do not implement the Hermes LiveKit LLM adapter in this phase.
- Do not route terminal messages directly to `AgentService`.

## Recommended Architecture

Add a channel-native WebSocket adapter named `TerminalWebSocketAdapter`.

The Web backend exposes:

```text
/api/channels/{channel_id}/ws
```

The route resolves the configured channel adapter from `ChannelRuntime` and delegates the accepted WebSocket to the adapter. The adapter owns terminal connection state, normalizes incoming frames into `InboundMessage`, and receives `OutboundMessage` objects through `ChannelManager.dispatch_outbound()`.

The path remains bus-first:

```text
terminal websocket
  -> TerminalWebSocketAdapter
  -> ChannelRuntime.accept_inbound()
  -> MessageBus.inbound
  -> ChannelRuntime bridge
  -> AgentService.handle_inbound_message()
  -> MessageBus.outbound
  -> ChannelManager.dispatch_outbound()
  -> TerminalWebSocketAdapter.send()
  -> terminal websocket
```

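To make the bus-first ordering concrete, the following is a minimal, self-contained sketch of the flow using two `asyncio.Queue` objects as stand-ins for `MessageBus.inbound` and `MessageBus.outbound`. Every function name and message shape here is hypothetical and chosen for illustration; the real `ChannelRuntime`, `MessageBus`, and `AgentService` interfaces are not shown in this document, and the echo reply simply replaces the agent call.

```python
import asyncio


async def run_flow(text: str) -> list:
    """Simulate one message travelling the bus-first path (illustrative only)."""
    inbound: asyncio.Queue = asyncio.Queue()   # stand-in for MessageBus.inbound
    outbound: asyncio.Queue = asyncio.Queue()  # stand-in for MessageBus.outbound
    delivered = []                             # what the terminal socket would receive

    async def accept_inbound(message: dict) -> None:
        # Adapter side: hand the normalized frame to the runtime, never to the agent.
        await inbound.put(message)

    async def bridge() -> None:
        # Runtime bridge: the only place that invokes the (fake) agent.
        message = await inbound.get()
        reply = {
            "session_id": message["session_id"],
            "text": "echo: " + message["text"],
        }
        await outbound.put(reply)

    async def dispatch_outbound() -> None:
        # Dispatcher side: route the outbound message back to the adapter's send().
        reply = await outbound.get()
        delivered.append(reply["text"])

    await accept_inbound(
        {"session_id": "terminal-dev:local:device-001", "text": text}
    )
    await bridge()
    await dispatch_outbound()
    return delivered
```

The point of the sketch is that the adapter only ever touches the two queues; it has no reference to the agent, which is what keeps the terminal off the direct-chat path.
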
## Channel Configuration

The terminal channel uses the existing `BeaverConfig.channels` map.

Example:

```json
{
  "channels": {
    "terminal-dev": {
      "enabled": true,
      "kind": "terminal",
      "mode": "websocket",
      "accountId": "local",
      "displayName": "Terminal Dev",
      "config": {
        "heartbeatSeconds": 30,
        "maxMessageChars": 20000
      }
    }
  }
}
```

`kind` is the platform family. `mode` is the transport mode. The adapter factory must instantiate `TerminalWebSocketAdapter` when `kind == "terminal"` and `mode == "websocket"`.

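One way to express the factory rule above is a lookup keyed on `(kind, mode)`. This is a sketch under assumptions: `ChannelConfig`, `ADAPTERS`, and `build_adapter` are hypothetical names, the adapter class is a stub, and the fallback values `30` and `20000` are taken from the example config rather than from any documented default.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple


@dataclass
class ChannelConfig:
    """Hypothetical, simplified view of one entry in BeaverConfig.channels."""
    channel_id: str
    kind: str
    mode: str
    account_id: str = "local"
    enabled: bool = True
    config: Dict[str, Any] = field(default_factory=dict)


class TerminalWebSocketAdapter:
    """Stub adapter; only reads the two settings shown in the example config."""

    def __init__(self, cfg: ChannelConfig) -> None:
        self.cfg = cfg
        # Fallback values mirror the example config and are assumptions.
        self.heartbeat_seconds = int(cfg.config.get("heartbeatSeconds", 30))
        self.max_message_chars = int(cfg.config.get("maxMessageChars", 20000))


# Registry keyed on (kind, mode); other adapters would be added here.
ADAPTERS: Dict[Tuple[str, str], Callable[[ChannelConfig], Any]] = {
    ("terminal", "websocket"): TerminalWebSocketAdapter,
}


def build_adapter(cfg: ChannelConfig) -> Any:
    """Instantiate the adapter registered for this kind/mode pair."""
    try:
        factory = ADAPTERS[(cfg.kind, cfg.mode)]
    except KeyError:
        raise ValueError(
            f"No adapter for kind={cfg.kind!r} mode={cfg.mode!r}"
        ) from None
    return factory(cfg)
```

Keying on the pair rather than on `kind` alone leaves room for a later non-WebSocket terminal transport without touching the existing entry.
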
## Protocol

The protocol is JSON over WebSocket. All payloads are text-only.

The terminal starts with a connect frame:

```json
{
  "type": "connect",
  "peer_id": "device-001",
  "device_name": "desk-terminal",
  "capabilities": ["text"]
}
```

Beaver replies:

```json
{
  "type": "connected",
  "channel_id": "terminal-dev",
  "session_id": "terminal-dev:local:device-001"
}
```

The terminal sends user text:

```json
{
  "type": "message",
  "message_id": "m-001",
  "text": "Hello"
}
```

Beaver acknowledges the accepted inbound message:

```json
{
  "type": "ack",
  "message_id": "m-001",
  "session_id": "terminal-dev:local:device-001",
  "accepted": true
}
```

Beaver sends the final assistant response:

```json
{
  "type": "message",
  "role": "assistant",
  "message_id": "m-001",
  "run_id": "run-id",
  "text": "Hello, I'm here.",
  "finish_reason": "stop"
}
```

Ping/pong frames are supported:

```json
{"type": "ping"}
{"type": "pong"}
```

Unsupported frame types return an error frame and keep the connection open:

```json
{"type": "error", "error": "Unsupported websocket frame type: example"}
```

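The frame types above suggest a small first-pass dispatcher that answers control frames immediately and passes `connect` and `message` on for full handling. The sketch below is illustrative: `reply_for_control_frame` is a hypothetical name, answering `ping` with `pong` is an assumption about direction (the design only says both frames are supported), and the two JSON-parsing error strings are invented for the example. Only the "Unsupported websocket frame type" text comes from the design.

```python
import json
from typing import Optional


def reply_for_control_frame(raw: str) -> Optional[dict]:
    """Return an immediate reply frame, or None if the frame needs full handling.

    The caller keeps the connection open in every case.
    """
    try:
        frame = json.loads(raw)
    except json.JSONDecodeError:
        # Assumed behaviour: malformed JSON is reported, not fatal.
        return {"type": "error", "error": "Invalid JSON frame"}
    if not isinstance(frame, dict):
        return {"type": "error", "error": "Frame must be a JSON object"}

    frame_type = frame.get("type")
    if frame_type == "ping":
        # Assumption: the terminal pings and Beaver answers with pong.
        return {"type": "pong"}
    if frame_type in ("connect", "message"):
        # Handled by the connect / message logic elsewhere in the adapter.
        return None
    return {
        "type": "error",
        "error": f"Unsupported websocket frame type: {frame_type}",
    }
```
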
## Identity And Session Mapping

The adapter builds a `ChannelIdentity` from the connect and message frames:

- `channel_id`: path/config channel id, such as `terminal-dev`
- `kind`: `terminal`
- `account_id`: channel config account id, such as `local`
- `peer_id`: terminal `peer_id`
- `peer_type`: `terminal`
- `message_id`: message frame `message_id`
- `thread_id`: optional message or connect frame field
- `user_id`: optional message or connect frame field

The session id stays aligned with channel runtime v1:

```text
<channel_id>:<account_id>:<peer_id>[:<thread_id>]
```

For the first terminal rollout, a terminal connection is treated as one active peer. A reconnect with the same `peer_id` reuses the same session id.

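The session id format above can be sketched as a pure function of the identity fields, which is also why a reconnect with the same `peer_id` lands on the same session. `build_session_id` is a hypothetical helper name, not an existing runtime function.

```python
from typing import Optional


def build_session_id(
    channel_id: str,
    account_id: str,
    peer_id: str,
    thread_id: Optional[str] = None,
) -> str:
    """Join identity fields as <channel_id>:<account_id>:<peer_id>[:<thread_id>]."""
    parts = [channel_id, account_id, peer_id]
    if thread_id:
        # The thread segment is appended only when a thread id is present.
        parts.append(thread_id)
    return ":".join(parts)
```

For example, `build_session_id("terminal-dev", "local", "device-001")` yields `terminal-dev:local:device-001`, matching the `connected` frame in the protocol section. Note that the joined form is ambiguous if any field itself contains a colon; whether such ids are rejected is not specified here.
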
## Delivery Semantics

Inbound messages are accepted through `ChannelRuntime.accept_inbound()`.

If deduplication detects a duplicate `message_id`:

- return an ack with `duplicate: true`
- include cached `reply` when the prior run is done
- include `pending: true` when the prior run is still processing
- do not publish a second inbound message

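The duplicate rules above could be sketched as a single ack builder. This is an assumption-heavy illustration: `build_ack` is a hypothetical name, and the shape of the prior-run record (`done`, `reply`) is invented for the example because the dedupe store's real structure is not described here. The design also does not say what `accepted` should be on a duplicate ack, so the sketch leaves that field out for duplicates.

```python
from typing import Optional


def build_ack(message_id: str, session_id: str, prior: Optional[dict]) -> dict:
    """Build the ack frame for a new or duplicate inbound message.

    `prior` is None for a first-time message id; otherwise it is a hypothetical
    record of the earlier run, e.g. {"done": True, "reply": "..."}.
    """
    ack = {"type": "ack", "message_id": message_id, "session_id": session_id}
    if prior is None:
        # First delivery: the caller publishes the inbound message to the bus.
        ack["accepted"] = True
        return ack

    # Duplicate: the caller must NOT publish a second inbound message.
    ack["duplicate"] = True
    if prior.get("done"):
        ack["reply"] = prior.get("reply")
    else:
        ack["pending"] = True
    return ack
```
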
Outbound delivery is connection-bound. `TerminalWebSocketAdapter.send()` looks up the active connection for the outbound session or peer. If a connection is found, it sends the final assistant message. If no connection is available, it marks the outbound message as unclaimed so that the runtime records `outbound_unclaimed`.

No retry queue is required in this phase.

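A connection-bound `send()` might look roughly like the sketch below. `ConnectionRegistry`, its methods, and the dict-shaped outbound message are all hypothetical; the real `OutboundMessage` type and the way the runtime is told about an unclaimed message are not specified in this document, so the sketch just returns the event name as a string.

```python
import asyncio
from typing import Awaitable, Callable, Dict

SendJson = Callable[[dict], Awaitable[None]]


class ConnectionRegistry:
    """Tracks one active sender per session id (illustrative only)."""

    def __init__(self) -> None:
        self._by_session: Dict[str, SendJson] = {}

    def register(self, session_id: str, send_json: SendJson) -> None:
        # A reconnect with the same session id replaces the old sender.
        self._by_session[session_id] = send_json

    def unregister(self, session_id: str) -> None:
        self._by_session.pop(session_id, None)

    async def send(self, outbound: dict) -> str:
        """Deliver the final assistant message, or report it as unclaimed."""
        sender = self._by_session.get(outbound["session_id"])
        if sender is None:
            # No retry queue in this phase: the message is simply unclaimed.
            return "outbound_unclaimed"
        await sender({
            "type": "message",
            "role": "assistant",
            "message_id": outbound["message_id"],
            "run_id": outbound["run_id"],
            "text": outbound["text"],
            "finish_reason": outbound.get("finish_reason", "stop"),
        })
        return "outbound_delivered"
```

Calling `unregister()` on disconnect is what turns a late-finishing run into `outbound_unclaimed` rather than an exception on a closed socket.
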
## Runtime Status And Events

`/api/status` and `/api/channels` include terminal channels with:

- `channel_id`
- `kind`
- `mode`
- `display_name`
- `enabled`
- `state`
- `account_id`
- `last_event_at`
- `websocket_url`
- `capabilities`, including `receive_text`, `send_text`, and `persistent_connection`
- `connected_peers`

Channel events should record:

- `adapter_started`
- `terminal_connected`
- `terminal_disconnected`
- `inbound_accepted`
- `inbound_duplicate`
- `direct_run_started`
- `direct_run_finished`
- `outbound_delivered`
- `outbound_unclaimed`
- `adapter_stopped`

Do not store raw terminal payloads or full message text in the event log. Existing text preview behavior is enough.

## Nginx And Deployment

The existing `/api/channels/` nginx location must support WebSocket upgrade because terminal WebSockets live under that prefix.

The location should include:

```nginx
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $connection_upgrade;
proxy_read_timeout 3600;
proxy_send_timeout 3600;
```

`$connection_upgrade` is not a built-in nginx variable. It is conventionally defined by a `map` on `$http_upgrade` in the `http` block, so that `map` must be present in the deployed configuration.

The 1800 second timeout used by synchronous webhooks can stay, but WebSocket upgrade headers are required for terminal devices.

## Error Handling

Before connect:

- only `connect` and `ping` are accepted
- `message` returns an error requiring connect first

On connect:

- a missing `peer_id` is rejected with an error frame, or the connection is closed
- unsupported capabilities are ignored for now as long as text is available

On message:

- missing `message_id` returns an error
- missing or blank `text` returns an error
- oversized text returns an error based on the configured `maxMessageChars` limit

On disconnect:

- remove the active connection
- record `terminal_disconnected`
- do not cancel an already running Beaver direct run

If the run completes after disconnect, outbound is recorded as `outbound_unclaimed`.

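The pre-connect gate and the per-frame checks above can be sketched as one validation function that returns an error string or `None`. `validate_frame` is a hypothetical name, the error messages are invented for illustration, and the default limit of `20000` is taken from the example config rather than from a documented default.

```python
from typing import Optional


def validate_frame(
    frame: dict,
    connected: bool,
    max_message_chars: int = 20000,  # assumption: mirrors the example config
) -> Optional[str]:
    """Return an error message for an invalid frame, or None if it is acceptable."""
    frame_type = frame.get("type")

    # Before connect, only `connect` and `ping` are accepted.
    if not connected and frame_type not in ("connect", "ping"):
        return "Send a connect frame first"

    if frame_type == "connect":
        if not str(frame.get("peer_id") or "").strip():
            return "connect frame requires peer_id"
        # Unsupported capabilities are ignored in this phase.
        return None

    if frame_type == "message":
        if not str(frame.get("message_id") or "").strip():
            return "message frame requires message_id"
        text = frame.get("text")
        if not isinstance(text, str) or not text.strip():
            return "message frame requires non-blank text"
        if len(text) > max_message_chars:
            return f"text exceeds {max_message_chars} characters"
    return None
```

Returning a string rather than raising keeps the caller free to choose between sending an error frame and closing the socket, which the design leaves open for a missing `peer_id`.
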
## Testing

Add focused backend tests:

- WebSocket connect returns `connected` with stable session id.
- Message frame publishes through runtime and returns ack plus assistant message.
- Duplicate message id does not publish a second inbound and returns duplicate status.
- Disconnect before outbound records `outbound_unclaimed`.
- Unknown frame type returns an error and keeps the connection alive.
- Channel status exposes `websocket_url` and connected peer count.
- Config loader accepts `kind=terminal`, `mode=websocket` through existing channel config.

Run the existing backend unit suite and frontend type/test checks after implementation.

## Acceptance Criteria

The first-stage acceptance is complete when a small terminal can:

1. Connect to `/api/channels/terminal-dev/ws`.
2. Send a `connect` frame with a stable `peer_id`.
3. Send a text `message` frame.
4. Receive an ack.
5. Receive the final assistant text response from Beaver.
6. Reconnect with the same `peer_id` and keep the same Beaver session id.
7. Show connection and message events in Beaver channel status/events.

This validates the Beaver-to-terminal path through the new channel runtime without introducing AuthZ, multimodal payloads, or Hermes LiveKit LLM work.