From 834d4e1e2f06f5f21cecb3414ed68dea34faee98 Mon Sep 17 00:00:00 2001
From: steven_li
Date: Tue, 2 Jun 2026 15:17:46 +0800
Subject: [PATCH] docs: add channel connector pairing design

---
 ...2-channel-connectors-and-pairing-design.md | 384 ++++++++++++++++++
 1 file changed, 384 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-06-02-channel-connectors-and-pairing-design.md

diff --git a/docs/superpowers/specs/2026-06-02-channel-connectors-and-pairing-design.md b/docs/superpowers/specs/2026-06-02-channel-connectors-and-pairing-design.md
new file mode 100644
index 0000000..5bcef62
--- /dev/null
+++ b/docs/superpowers/specs/2026-06-02-channel-connectors-and-pairing-design.md
@@ -0,0 +1,384 @@
+# Channel Connectors And Pairing Design
+
+Date: 2026-06-02
+
+## Goal
+
+Add a first-class connection layer above Beaver's channel runtime so users can connect messaging platforms through plugin, QR, OAuth, token, or app-credential flows instead of editing static channel JSON by hand.
+
+This design reframes platform channels as two cooperating layers:
+
+```text
+ChannelConnector
+-> install / auth / QR / OAuth / credential validation / login state
+-> ChannelConnectionStore
+-> ChannelRuntime
+-> ChannelAdapter
+-> MessageBus
+-> AgentService
+```
+
+The existing `ChannelRuntime`, `MessageBus`, `ChannelManager`, and `ChannelAdapter` contracts remain the message routing core. The new connector layer owns user-visible setup and connection lifecycle.
+
+## Why This Is Required
+
+The current channel design assumes a channel is already configured before the backend starts. That is enough for local development and simple webhook/token channels, but it does not match real platform onboarding:
+
+- Feishu/Lark now has a Channel SDK pattern that packages bot channel setup, WebSocket or webhook transport, event handling, and replies around an installed app identity.
+- Weixin/OpenClaw-style setup uses a local plugin installer plus QR login and persistent login state.
+- Terminal devices need pairing or device registration; a raw `peer_id` connect frame is not enough for a real deployment.
+- Even simple token platforms such as Telegram need a UI flow for token entry, validation, status, revoke, and restart.
+
+So Beaver needs a connection lifecycle layer. Adapters should not be responsible for prompting the user, installing packages, storing long-lived credentials, or deciding whether an unknown device is allowed to bind.
+
+## Non-Goals
+
+- Do not replace `ChannelRuntime`, `MessageBus`, `ChannelManager`, or `AgentService`.
+- Do not make every connector a Node sidecar. Node sidecars are allowed when the official or practical SDK path requires them.
+- Do not implement every channel in this phase.
+- Do not build a plugin marketplace in this phase.
+- Do not store platform secrets in plain channel config when a credential store is available.
+- Do not let external connector code call `AgentService` directly.
+
+## Core Terms
+
+`ChannelConnection` is the user-visible connection instance. Examples: "Weixin personal account", "Lark workspace bot", "Telegram main bot", "Desk terminal".
+
+`ChannelConnector` is the setup and lifecycle controller for one platform family. It starts pairing sessions, validates credentials, launches connector processes, handles reconnects, and emits runtime channel config.
+
+`ChannelAdapter` is the message transport adapter used by `ChannelRuntime`. It receives normalized inbound messages and sends outbound replies. It does not own onboarding.
+
+`ExternalConnectorProcess` is an optional local process for platforms whose SDK or login behavior is better isolated outside the Python backend. It talks to Beaver through a narrow control and message protocol.
+
+## Data Model
+
+Add a durable connection store under the backend workspace:
+
+```python
+@dataclass
+class ChannelConnection:
+    connection_id: str
+    owner_user_id: str | None
+    channel_id: str
+    kind: str
+    mode: str
+    display_name: str
+    account_id: str
+    status: str
+    auth_type: str
+    credentials_ref: str | None
+    connector_ref: str | None
+    pairing_session_id: str | None
+    runtime_config: dict[str, Any]
+    capabilities: list[str]
+    created_at: str
+    updated_at: str
+    last_seen_at: str | None
+    last_error: str | None
+```
+
+`status` values:
+
+- `draft`: setup has started but no credentials are usable.
+- `pairing`: waiting for QR scan, OAuth callback, device approval, or token validation.
+- `connected`: credentials are valid and the runtime channel can start.
+- `running`: the runtime adapter or external connector is active.
+- `degraded`: partially working, for example inbound works but media upload failed.
+- `error`: connection cannot start or authenticate.
+- `revoked`: user or platform revoked the connection.
+
+Credential material should live behind `credentials_ref`, not inline in `ChannelConnection`. For the first local implementation, the reference may point to an encrypted file or a restricted JSON store. The interface should still look like a credential vault so AuthZ or a real secret backend can replace it later.
+
+## Connector Contract
+
+Every connector implements a setup contract:
+
+```python
+class ChannelConnector(Protocol):
+    kind: str
+
+    async def start_pairing(self, request: StartPairingRequest) -> PairingSession: ...
+    async def complete_pairing(self, event: PairingEvent) -> ChannelConnection: ...
+    async def validate(self, connection_id: str) -> ValidationResult: ...
+    async def materialize_runtime(self, connection_id: str) -> ChannelRuntimeSpec: ...
+    async def revoke(self, connection_id: str) -> None: ...
+```
+
+`materialize_runtime()` returns the adapter-ready config:
+
+```python
+@dataclass
+class ChannelRuntimeSpec:
+    channel_id: str
+    kind: str
+    mode: str
+    account_id: str
+    display_name: str
+    config: dict[str, Any]
+    secrets_ref: str | None
+    external_endpoint: str | None
+```
+
+The runtime may still use `ChannelConfig` internally, but the source of truth becomes `ChannelConnectionStore` rather than static `BeaverConfig.channels` alone.
+
+## Control APIs
+
+Add backend APIs for the connection UI:
+
+```text
+GET /api/channel-connectors
+GET /api/channel-connections
+POST /api/channel-connections
+GET /api/channel-connections/{connection_id}
+POST /api/channel-connections/{connection_id}/pairing/start
+POST /api/channel-connections/{connection_id}/pairing/complete
+POST /api/channel-connections/{connection_id}/validate
+POST /api/channel-connections/{connection_id}/start
+POST /api/channel-connections/{connection_id}/stop
+POST /api/channel-connections/{connection_id}/revoke
+GET /api/channel-connections/{connection_id}/events
+```
+
+The existing `/api/channels` status endpoint can keep reporting runtime adapter status, but the UI should prefer `/api/channel-connections` for setup state.
+
+## UI Flow
+
+The status page becomes a channel connection page:
+
+```text
+Add Channel
+-> choose platform
+-> connector-specific setup form
+-> QR/OAuth/token/app credential validation
+-> connection status
+-> start runtime channel
+-> test message or platform health check
+```
+
+The UI must distinguish:
+
+- setup state: pairing, credential validation, revoked.
+- runtime state: adapter running, disconnected, outbound failed.
+- platform state: QR expired, app not installed, permission missing, token invalid.
+
+This avoids the current problem where all failures collapse into adapter startup errors.
+
+## External Connector Process
+
+Some channels should run through an external process:
+
+```text
+ExternalConnectorProcess
+-> Beaver connector control API
+-> local Unix/TCP/WebSocket bridge
+-> ChannelRuntime external adapter
+```
+
+The external process must not receive permanent backend admin credentials through QR codes or copied commands. It should receive a short-lived pairing token with a narrow scope:
+
+```text
+scope: channel:pair
+kind: weixin
+expires_in: 10 minutes
+one_time: true
+```
+
+After pairing, Beaver stores the resulting connection credentials and gives the connector a renewable connection token scoped to that connection only.
+
+## Per-Channel Assessment
+
+### Feishu / Lark
+
+Feishu/Lark should be a first-class connector, not only a static adapter.
+
+Recommended first implementation:
+
+- connector kind: `feishu`
+- setup fields: domain, app id, app secret, connection mode.
+- default mode: WebSocket long connection.
+- optional mode: webhook.
+- runtime adapter: may be Python if coverage is sufficient, or an external Node connector when using official Channel SDK behavior.
+
+Required setup checks:
+
+- app credentials are present.
+- bot/event permissions are configured.
+- event subscription mode is valid.
+- bot identity can be resolved.
+- a test direct message or event subscription health check can run when available.
+
+The connector should expose both "manual app credential setup" and future "install from app template" paths. The manual path is enough for the first Beaver release.
+
+### Weixin
+
+Weixin should use an external connector process.
+
+Recommended first implementation:
+
+- connector kind: `weixin`
+- setup mode: local plugin command plus QR login.
+- external process: required.
+- runtime adapter: external bridge adapter that receives normalized events from the connector.
+
+Required setup checks:
+
+- local connector installed.
+- connector version is compatible with Beaver.
+- QR session is pending, scanned, confirmed, expired, or failed.
+- login state is stored behind `credentials_ref`.
+- connector heartbeat is visible.
+
+Group delivery remains best-effort. The connector must surface group capability separately from direct message capability.
+
+### Telegram
+
+Telegram can be implemented as an internal connector plus internal adapter.
+
+Recommended first implementation:
+
+- setup mode: bot token entry.
+- validation: call Telegram `getMe`.
+- runtime mode: polling by default, webhook optional.
+- no external process required.
+
+The UI still treats it as a connector so users can add, validate, revoke, and restart it without editing JSON.
+
+### QQBot
+
+QQBot should start as an internal connector with official gateway credentials.
+
+Recommended first implementation:
+
+- setup fields: app id, client secret, intents or permission hints.
+- runtime mode: WebSocket gateway.
+- validation: token exchange or gateway auth dry run when available.
+
+If SDK/runtime behavior later becomes easier outside Python, this connector can move to an external process without changing the runtime message contract.
+
+### Terminal
+
+Terminal should move from raw `peer_id` to pairing.
+
+Recommended first implementation:
+
+- UI creates a terminal pairing session.
+- Beaver displays a command or QR/setup code.
+- device connects with one-time pairing token.
+- Beaver binds a stable device identity to a `ChannelConnection`.
+- subsequent WebSocket `connect` frames authenticate as the bound device.
+
+The message protocol can keep `connect`, `connected`, `message`, `ack`, and assistant `message`, but production connections must include an authenticated device token.
+
+## Message Flow After Pairing
+
+Once a connection is paired, the message path stays unchanged:
+
+```text
+platform or device
+-> connector transport
+-> ChannelAdapter
+-> ChannelRuntime.accept_inbound()
+-> MessageBus.inbound
+-> AgentService.handle_inbound_message()
+-> MessageBus.outbound
+-> ChannelManager.dispatch_outbound()
+-> ChannelAdapter.send()
+-> connector transport
+-> platform or device
+```
+
+This is intentionally conservative. Pairing changes how a channel becomes trusted and running; it does not change the agent loop.
+
+## Access Control
+
+Connection setup requires a Beaver user or backend owner identity. The connector layer decides who may create, view, revoke, or start a connection.
+
+Inbound platform messages still use adapter-level policy:
+
+- `open`: accept platform scope.
+- `allowlist`: accept only known users/groups.
+- `disabled`: ignore that scope.
+
+The important change is that allowlists belong to the connection settings instead of living only in ad hoc adapter config.
+
+## Error Handling
+
+Pairing errors:
+
+- expired pairing token.
+- QR not scanned before timeout.
+- OAuth callback state mismatch.
+- platform permission missing.
+- credential validation failed.
+
+Runtime errors:
+
+- adapter startup failed.
+- connector process unavailable.
+- heartbeat missed.
+- inbound normalization failed.
+- outbound delivery failed.
+
+Each event should be recorded against `connection_id` and, when available, `channel_id` and `session_id`.
+
+## Security Requirements
+
+- Pairing tokens are short-lived, one-time, and scoped to one connector kind.
+- QR codes never embed permanent backend credentials.
+- External connector processes do not receive broad backend admin tokens.
+- Revoking a connection invalidates connector tokens and stops the runtime channel.
+- Stored platform credentials are referenced by `credentials_ref`.
+- Event logs must not include raw secrets, tokens, QR payloads, or full platform credential responses.
+
+## Relationship To Existing Channel Specs
+
+The terminal WebSocket spec remains valid as a development transport spec, but production terminal setup must add pairing.
+
+The chat platform adapter spec remains valid as a runtime adapter spec, but these statements should be revised before implementation:
+
+- "Do not introduce a Node sidecar as the default channel architecture" should become "Use internal adapters by default, but allow external connector processes where platform SDK or login state requires them."
+- "Pairing is out of scope for this phase" should become "Pairing is owned by the connector layer; adapters assume a materialized connection."
+- Static `BeaverConfig.channels` should become a development override and backward-compatible import path, not the only source of runtime channels.
+
+## Rollout
+
+Implement in this order:
+
+1. `ChannelConnectionStore`, connector registry, and connection status APIs.
+2. Telegram connector as the simplest token-based setup path.
+3. Terminal pairing to remove raw unauthenticated `peer_id` usage.
+4. Feishu/Lark connector with WebSocket long-connection mode and credential validation.
+5. Weixin external connector bridge with QR pairing.
+6. QQBot connector after the common credential and gateway patterns are stable.
+
+This order proves the common connector lifecycle with a low-risk token channel before adding QR and external process complexity.
+
+## Testing
+
+Add unit tests for:
+
+- connection store create/update/revoke.
+- pairing token expiry and one-time use.
+- connector registry dispatch by kind.
+- materializing runtime specs from connections.
+- secret redaction in events.
+- adapter runtime still receiving normalized `InboundMessage`.
+
+Add integration-style tests with fake connectors for:
+
+- successful token setup.
+- QR expired and QR completed.
+- external connector heartbeat loss.
+- revoke stops runtime dispatch.
+
+Live platform tests remain manual or gated behind explicit environment variables.
+
+## Acceptance Criteria
+
+- A user can add a channel connection without editing backend JSON.
+- Beaver can show setup state separately from runtime adapter state.
+- Telegram can validate a bot token and materialize a runtime channel.
+- Terminal can bind through a one-time pairing flow.
+- Feishu/Lark design allows official SDK or Node connector use when needed.
+- Weixin design requires an external connector and QR login state.
+- Existing channel runtime message flow remains bus-first and adapter-mediated.
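
The pairing-token requirements stated in the design (short-lived, one-time, scoped to one connector kind) can be illustrated with a minimal in-memory sketch. `PairingToken`, `PairingTokenStore`, `issue`, and `redeem` are hypothetical names that the design does not define, and the 600-second default only mirrors the `expires_in: 10 minutes` example. A real implementation would presumably persist tokens and avoid storing them in plain form.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass
class PairingToken:
    """Hypothetical record; fields mirror the scope example in the design."""
    token: str
    kind: str          # connector kind the token is scoped to, e.g. "weixin"
    expires_at: float  # absolute expiry time in seconds
    used: bool = False


class PairingTokenStore:
    """In-memory sketch only; not the store the design will actually ship."""

    def __init__(self, ttl_seconds: float = 600.0, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._tokens: dict[str, PairingToken] = {}

    def issue(self, kind: str) -> str:
        """Create a random token scoped to one connector kind."""
        token = secrets.token_urlsafe(32)
        self._tokens[token] = PairingToken(token, kind, self._clock() + self._ttl)
        return token

    def redeem(self, token: str, kind: str) -> bool:
        """Return True at most once per token, and only before expiry."""
        record = self._tokens.get(token)
        if record is None or record.used:
            return False  # unknown token, or already redeemed
        if record.kind != kind:
            return False  # scoped to a different connector kind
        if self._clock() >= record.expires_at:
            return False  # expired
        record.used = True
        return True
```

Injecting the clock keeps the "pairing token expiry and one-time use" unit tests deterministic, since a test can advance time without sleeping.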