Terminal WebSocket Channel Design

Date: 2026-06-01

Goal

Add a text-only WebSocket channel adapter so a small terminal device can connect to Beaver and exchange messages through the channel runtime.

This is a first-stage acceptance path for proving Beaver can talk to the terminal device. The terminal must enter through ChannelRuntime and MessageBus; it must not use the existing Web UI /ws/{session_id} direct-chat path.

Non-Goals

  • Do not implement audio, camera, screen, image, or multimodal payloads.
  • Do not stream token deltas to the terminal in this phase.
  • Do not add AuthZ or device registration in this phase.
  • Do not implement the Hermes LiveKit LLM adapter in this phase.
  • Do not route terminal messages directly to AgentService.

The design adds a channel-native WebSocket adapter named TerminalWebSocketAdapter.

The Web backend exposes:

/api/channels/{channel_id}/ws

The route resolves the configured channel adapter from ChannelRuntime and delegates the accepted WebSocket to the adapter. The adapter owns terminal connection state, normalizes incoming frames into InboundMessage, and receives OutboundMessage objects through ChannelManager.dispatch_outbound().

The path remains bus-first:

terminal websocket
-> TerminalWebSocketAdapter
-> ChannelRuntime.accept_inbound()
-> MessageBus.inbound
-> ChannelRuntime bridge
-> AgentService.handle_inbound_message()
-> MessageBus.outbound
-> ChannelManager.dispatch_outbound()
-> TerminalWebSocketAdapter.send()
-> terminal websocket
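The bus-first shape above can be illustrated with a toy round trip. This is only a sketch under assumptions: the two asyncio queues stand in for MessageBus.inbound and MessageBus.outbound, the bridge coroutine stands in for the ChannelRuntime bridge plus AgentService, and the echo reply and the run_round_trip name are hypothetical.

```python
import asyncio


async def run_round_trip(text: str) -> str:
    """Push one message through a toy inbound/outbound bus and return the reply."""
    inbound: asyncio.Queue = asyncio.Queue()   # stands in for MessageBus.inbound
    outbound: asyncio.Queue = asyncio.Queue()  # stands in for MessageBus.outbound

    async def bridge() -> None:
        # Stands in for the ChannelRuntime bridge and the agent (assumed echo).
        msg = await inbound.get()
        await outbound.put(
            {"session_id": msg["session_id"], "text": f"echo: {msg['text']}"}
        )

    task = asyncio.create_task(bridge())
    # The adapter never calls the agent directly; it only publishes inbound.
    await inbound.put(
        {"session_id": "terminal-dev:local:device-001", "text": text}
    )
    # In the real path, dispatch_outbound() would hand this to the adapter's send().
    reply = await outbound.get()
    await task
    return reply["text"]
```

The point of the sketch is the separation: the adapter side only touches the inbound and outbound queues, so the agent can be swapped or delayed without the terminal connection knowing.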

Channel Configuration

The terminal channel uses the existing BeaverConfig.channels map.

Example:

{
  "channels": {
    "terminal-dev": {
      "enabled": true,
      "kind": "terminal",
      "mode": "websocket",
      "accountId": "local",
      "displayName": "Terminal Dev",
      "config": {
        "heartbeatSeconds": 30,
        "maxMessageChars": 20000
      }
    }
  }
}

kind is the platform family. mode is the transport mode. The adapter factory must instantiate TerminalWebSocketAdapter when kind == "terminal" and mode == "websocket".
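One way the factory rule could look is a lookup keyed on (kind, mode). This is a minimal sketch, not Beaver's actual factory: ChannelConfig, ADAPTERS, and create_adapter are hypothetical names, the adapter class is a stub, and the fallback defaults simply reuse the values from the example config.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple


@dataclass
class ChannelConfig:
    # Hypothetical mirror of one entry in BeaverConfig.channels.
    kind: str
    mode: str
    account_id: str = "local"
    config: Dict[str, Any] = field(default_factory=dict)


class TerminalWebSocketAdapter:
    # Stub adapter; only reads the two config keys shown in the example.
    def __init__(self, channel_id: str, cfg: ChannelConfig) -> None:
        self.channel_id = channel_id
        self.account_id = cfg.account_id
        # Defaults are assumptions taken from the example config values.
        self.heartbeat_seconds = int(cfg.config.get("heartbeatSeconds", 30))
        self.max_message_chars = int(cfg.config.get("maxMessageChars", 20000))


# Registry keyed on (kind, mode); other adapters would be added here.
ADAPTERS: Dict[Tuple[str, str], Callable[[str, ChannelConfig], Any]] = {
    ("terminal", "websocket"): TerminalWebSocketAdapter,
}


def create_adapter(channel_id: str, cfg: ChannelConfig) -> Any:
    """Instantiate the adapter registered for this kind/mode pair."""
    try:
        factory = ADAPTERS[(cfg.kind, cfg.mode)]
    except KeyError:
        raise ValueError(
            f"No adapter for kind={cfg.kind!r} mode={cfg.mode!r}"
        ) from None
    return factory(channel_id, cfg)
```

Keying on the pair rather than on kind alone leaves room for another terminal transport later without touching this adapter.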

Protocol

The protocol is JSON over WebSocket. All payloads are text-only.

The terminal starts with a connect frame:

{
  "type": "connect",
  "peer_id": "device-001",
  "device_name": "desk-terminal",
  "capabilities": ["text"]
}

Beaver replies:

{
  "type": "connected",
  "channel_id": "terminal-dev",
  "session_id": "terminal-dev:local:device-001"
}

The terminal sends user text:

{
  "type": "message",
  "message_id": "m-001",
  "text": "你好"
}

Beaver acknowledges accepted inbound:

{
  "type": "ack",
  "message_id": "m-001",
  "session_id": "terminal-dev:local:device-001",
  "accepted": true
}

Beaver sends the final assistant response:

{
  "type": "message",
  "role": "assistant",
  "message_id": "m-001",
  "run_id": "run-id",
  "text": "你好,我在。",
  "finish_reason": "stop"
}

Ping/pong frames are supported:

{"type": "ping"}
{"type": "pong"}

Unsupported frame types return an error frame and keep the connection open:

{"type": "error", "error": "Unsupported websocket frame type: example"}
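From the terminal's side, the frames above could be built and classified with a few helpers. This is a sketch using only the standard library; the function names are hypothetical, and whether Beaver itself ever sends ping (rather than only pong) is not stated in this spec, so both are tolerated here as an assumption.

```python
import json
from typing import Any, Dict, Tuple


def connect_frame(peer_id: str, device_name: str) -> str:
    """Serialize the initial connect frame for a text-only terminal."""
    return json.dumps(
        {
            "type": "connect",
            "peer_id": peer_id,
            "device_name": device_name,
            "capabilities": ["text"],
        }
    )


def message_frame(message_id: str, text: str) -> str:
    """Serialize a user text frame; ensure_ascii=False keeps non-ASCII readable."""
    return json.dumps(
        {"type": "message", "message_id": message_id, "text": text},
        ensure_ascii=False,
    )


def classify_server_frame(raw: str) -> Tuple[str, Dict[str, Any]]:
    """Return (label, frame) for a frame received from Beaver."""
    frame = json.loads(raw)
    kind = frame.get("type")
    if kind == "message" and frame.get("role") == "assistant":
        return "assistant", frame
    if kind in {"connected", "ack", "ping", "pong", "error"}:
        return kind, frame
    # Anything else is surfaced to the caller instead of being dropped silently.
    return "unknown", frame
```

Because the ack and the final assistant message both carry message_id in the examples above, a terminal can use that field to correlate the two with the frame it sent.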

Identity And Session Mapping

The adapter builds a ChannelIdentity from the connect and message frames:

  • channel_id: path/config channel id, such as terminal-dev
  • kind: terminal
  • account_id: channel config account id, such as local
  • peer_id: terminal peer_id
  • peer_type: terminal
  • message_id: message frame message_id
  • thread_id: optional message or connect frame field
  • user_id: optional message or connect frame field

The session id stays aligned with channel runtime v1:

<channel_id>:<account_id>:<peer_id>[:<thread_id>]

For the first terminal rollout, a terminal connection is treated as one active peer. A reconnect with the same peer_id reuses the same session id.
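The session id format can be sketched as a small pure function. The name build_session_id is hypothetical; the sketch assumes that an empty or missing thread_id is simply omitted.

```python
from typing import Optional


def build_session_id(
    channel_id: str,
    account_id: str,
    peer_id: str,
    thread_id: Optional[str] = None,
) -> str:
    """Join identity parts as <channel_id>:<account_id>:<peer_id>[:<thread_id>]."""
    parts = [channel_id, account_id, peer_id]
    if thread_id:
        # The thread segment is optional and only appended when present.
        parts.append(thread_id)
    return ":".join(parts)
```

Since the id depends only on the configured channel, the account, and the terminal's peer_id, a reconnect with the same peer_id yields the same value without any stored state. This sketch does not guard against a peer_id that itself contains a colon.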

Delivery Semantics

Inbound messages are accepted through ChannelRuntime.accept_inbound().

If dedupe sees a duplicate message id:

  • return an ack with duplicate: true
  • include cached reply when the prior run is done
  • include pending: true when the prior run is still processing
  • do not publish a second inbound message
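The duplicate rules above could translate into an ack builder along these lines. This is a sketch under assumptions: the shape of the prior dedupe record (done, reply), the reply field name in the ack, and setting accepted to false for duplicates are all hypothetical choices that this spec does not fix.

```python
from typing import Any, Dict, Optional


def build_ack(
    message_id: str,
    session_id: str,
    prior: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Build an ack frame; prior is None for a new message id."""
    ack: Dict[str, Any] = {
        "type": "ack",
        "message_id": message_id,
        "session_id": session_id,
        # Assumption: duplicates are not reported as newly accepted.
        "accepted": prior is None,
    }
    if prior is None:
        return ack
    ack["duplicate"] = True
    if prior.get("done"):
        # Prior run finished: attach the cached reply (field name assumed).
        ack["reply"] = prior.get("reply")
    else:
        # Prior run still processing.
        ack["pending"] = True
    return ack
```

The caller would publish to the inbound bus only when prior is None, which keeps the "no second inbound message" rule outside the ack builder.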

Outbound delivery is connection-bound. TerminalWebSocketAdapter.send() looks up the active connection for the outbound session or peer. If a connection is found, it sends the final assistant message. If no connection is available, it marks the outbound message as unclaimed so that the runtime records an outbound_unclaimed event.

No retry queue is required in this phase.
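Connection-bound delivery without a retry queue can be sketched with an in-memory registry. ConnectionRegistry and its methods are hypothetical names, and a plain callable stands in for the WebSocket send call; the returned strings reuse the event names from this spec.

```python
from typing import Any, Callable, Dict

Frame = Dict[str, Any]


class ConnectionRegistry:
    """Tracks at most one active sink per session id."""

    def __init__(self) -> None:
        self._by_session: Dict[str, Callable[[Frame], None]] = {}

    def register(self, session_id: str, sink: Callable[[Frame], None]) -> None:
        # A reconnect with the same session id replaces the old sink.
        self._by_session[session_id] = sink

    def unregister(self, session_id: str) -> None:
        self._by_session.pop(session_id, None)

    def connected_peers(self) -> int:
        return len(self._by_session)

    def deliver(self, session_id: str, frame: Frame) -> str:
        """Send if a connection exists; otherwise report the frame as unclaimed."""
        sink = self._by_session.get(session_id)
        if sink is None:
            # No queueing and no retry in this phase.
            return "outbound_unclaimed"
        sink(frame)
        return "outbound_delivered"
```

Returning the outcome instead of raising lets the caller record the matching channel event in one place.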

Runtime Status And Events

/api/status and /api/channels include terminal channels with:

  • channel_id
  • kind
  • mode
  • display_name
  • enabled
  • state
  • account_id
  • last_event_at
  • websocket_url
  • capabilities, including receive_text, send_text, and persistent_connection
  • connected_peers

Channel events should record:

  • adapter_started
  • terminal_connected
  • terminal_disconnected
  • inbound_accepted
  • inbound_duplicate
  • direct_run_started
  • direct_run_finished
  • outbound_delivered
  • outbound_unclaimed
  • adapter_stopped

Do not store raw terminal payloads or full message text in the event log. Existing text preview behavior is enough.

Nginx And Deployment

The existing /api/channels/ nginx location must support WebSocket upgrade because terminal WebSockets live under that prefix.

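The location should include the following directives. Note that $connection_upgrade is not a built-in nginx variable; it is assumed to be defined by a map on $http_upgrade in the http context: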

proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $connection_upgrade;
proxy_read_timeout 3600;
proxy_send_timeout 3600;

The 1800-second timeout used by synchronous webhooks can stay, but the WebSocket upgrade headers are required for terminal devices.

Error Handling

Before connect:

  • only connect and ping are accepted
  • message returns an error requiring connect first

On connect:

  • missing peer_id closes or rejects with an error frame
  • unsupported capabilities are ignored for now as long as text is available

On message:

  • missing message_id returns an error
  • missing or blank text returns an error
  • oversized text returns an error based on max_message_chars (maxMessageChars in the channel config)

On disconnect:

  • remove the active connection
  • record terminal_disconnected
  • do not cancel an already running Beaver direct run

If the run completes after disconnect, outbound is recorded as outbound_unclaimed.
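The per-frame rules in this section could be collected into one validation function. This is a sketch, not the adapter's actual code: validate_frame is a hypothetical name, the error strings other than the unsupported-type message are invented for illustration, and accepting pong only after connect is an assumption.

```python
from typing import Any, Dict, Optional


def validate_frame(
    frame: Dict[str, Any],
    connected: bool,
    max_message_chars: int = 20000,
) -> Optional[str]:
    """Return an error string for a rejected frame, or None if it is acceptable."""
    kind = frame.get("type")
    if kind == "ping":
        # Ping is allowed both before and after connect.
        return None
    if kind == "pong":
        return None if connected else "connect required before pong"
    if kind == "connect":
        if not str(frame.get("peer_id") or "").strip():
            return "connect requires peer_id"
        # Extra capabilities are ignored in this phase.
        return None
    if kind == "message":
        if not connected:
            return "connect required before message"
        if not frame.get("message_id"):
            return "message requires message_id"
        text = frame.get("text")
        if not isinstance(text, str) or not text.strip():
            return "message requires non-empty text"
        if len(text) > max_message_chars:
            return f"text exceeds {max_message_chars} characters"
        return None
    # Unknown types produce an error frame; the connection stays open.
    return f"Unsupported websocket frame type: {kind}"
```

A caller would wrap a non-None result as {"type": "error", "error": ...} and keep reading from the socket, closing only where the rules above call for it.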

Testing

Add focused backend tests:

  • WebSocket connect returns connected with stable session id.
  • Message frame publishes through runtime and returns ack plus assistant message.
  • Duplicate message id does not publish a second inbound and returns duplicate status.
  • Disconnect before outbound records outbound_unclaimed.
  • Unknown frame type returns an error and keeps the connection alive.
  • Channel status exposes websocket_url and connected peer count.
  • Config loader accepts kind=terminal, mode=websocket through existing channel config.

Run the existing backend unit suite and frontend type/test checks after implementation.

Acceptance Criteria

The first-stage acceptance is complete when a small terminal can:

  1. Connect to /api/channels/terminal-dev/ws.
  2. Send a connect frame with a stable peer_id.
  3. Send a text message frame.
  4. Receive an ack.
  5. Receive the final assistant text response from Beaver.
  6. Reconnect with the same peer_id and keep the same Beaver session id.
  7. Show connection and message events in Beaver channel status/events.

This validates the Beaver-to-terminal path through the new channel runtime without introducing AuthZ, multimodal payloads, or Hermes LiveKit LLM work.