docs: add channel connector pairing design

2026-06-02 15:17:46 +08:00
parent 6a6ddc21c0
commit 834d4e1e2f

@ -0,0 +1,384 @@
# Channel Connectors And Pairing Design
Date: 2026-06-02
## Goal
Add a first-class connection layer above Beaver's channel runtime so users can connect messaging platforms through plugin, QR, OAuth, token, or app-credential flows instead of editing static channel JSON by hand.
This design reframes platform channels as two cooperating layers:
```text
ChannelConnector
-> install / auth / QR / OAuth / credential validation / login state
-> ChannelConnectionStore
-> ChannelRuntime
-> ChannelAdapter
-> MessageBus
-> AgentService
```
The existing `ChannelRuntime`, `MessageBus`, `ChannelManager`, and `ChannelAdapter` contracts remain the message routing core. The new connector layer owns user-visible setup and connection lifecycle.
## Why This Is Required
The current channel design assumes a channel is already configured before the backend starts. That is enough for local development and simple webhook/token channels, but it does not match real platform onboarding:
- Feishu/Lark now has a Channel SDK pattern that packages bot channel setup, WebSocket or webhook transport, event handling, and replies around an installed app identity.
- Weixin/OpenClaw-style setup uses a local plugin installer plus QR login and persistent login state.
- Terminal devices need pairing or device registration; a raw `peer_id` connect frame is not enough for a real deployment.
- Even simple token platforms such as Telegram need a UI flow for token entry, validation, status, revoke, and restart.
So Beaver needs a connection lifecycle layer. Adapters should not be responsible for prompting the user, installing packages, storing long-lived credentials, or deciding whether an unknown device is allowed to bind.
## Non-Goals
- Do not replace `ChannelRuntime`, `MessageBus`, `ChannelManager`, or `AgentService`.
- Do not make every connector a Node sidecar. Node sidecars are allowed when the official or practical SDK path requires them.
- Do not implement every channel in this phase.
- Do not build a plugin marketplace in this phase.
- Do not store platform secrets in plain channel config when a credential store is available.
- Do not let external connector code call `AgentService` directly.
## Core Terms
`ChannelConnection` is the user-visible connection instance. Examples: "Weixin personal account", "Lark workspace bot", "Telegram main bot", "Desk terminal".
`ChannelConnector` is the setup and lifecycle controller for one platform family. It starts pairing sessions, validates credentials, launches connector processes, handles reconnects, and emits runtime channel config.
`ChannelAdapter` is the message transport adapter used by `ChannelRuntime`. It receives normalized inbound messages and sends outbound replies. It does not own onboarding.
`ExternalConnectorProcess` is an optional local process for platforms whose SDK or login behavior is better isolated outside the Python backend. It talks to Beaver through a narrow control and message protocol.
## Data Model
Add a durable connection store under the backend workspace:
```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ChannelConnection:
    connection_id: str
    owner_user_id: str | None
    channel_id: str
    kind: str
    mode: str
    display_name: str
    account_id: str
    status: str  # one of the `status` values listed below
    auth_type: str
    credentials_ref: str | None  # opaque reference; secrets are never stored inline
    connector_ref: str | None
    pairing_session_id: str | None
    runtime_config: dict[str, Any]
    capabilities: list[str]
    created_at: str
    updated_at: str
    last_seen_at: str | None
    last_error: str | None
```
`status` values:
- `draft`: setup has started but no credentials are usable.
- `pairing`: waiting for QR scan, OAuth callback, device approval, or token validation.
- `connected`: credentials are valid and the runtime channel can start.
- `running`: the runtime adapter or external connector is active.
- `degraded`: partially working, for example inbound works but media upload failed.
- `error`: connection cannot start or authenticate.
- `revoked`: user or platform revoked the connection.
Credential material should live behind `credentials_ref`, not inline in `ChannelConnection`. For the first local implementation, the reference may point to an encrypted file or a restricted JSON store. The interface should still look like a credential vault so AuthZ or a real secret backend can replace it later.
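A minimal sketch of what a vault-shaped interface could look like for the first local implementation. The names `CredentialVault`, `InMemoryVault`, and the `cred:` reference prefix are hypothetical and not part of the existing codebase; a real store would encrypt at rest or delegate to a secret backend.

```python
from __future__ import annotations

import secrets
from typing import Protocol


class CredentialVault(Protocol):
    """Hypothetical interface: callers only ever hold an opaque reference."""

    def put(self, secret: dict[str, str]) -> str: ...
    def get(self, ref: str) -> dict[str, str]: ...
    def delete(self, ref: str) -> None: ...


class InMemoryVault:
    """Stand-in for tests only; it does not persist or encrypt anything."""

    def __init__(self) -> None:
        self._items: dict[str, dict[str, str]] = {}

    def put(self, secret: dict[str, str]) -> str:
        # The reference is random, so it reveals nothing about the secret.
        ref = f"cred:{secrets.token_hex(8)}"
        self._items[ref] = dict(secret)
        return ref

    def get(self, ref: str) -> dict[str, str]:
        # Return a copy so callers cannot mutate stored material.
        return dict(self._items[ref])

    def delete(self, ref: str) -> None:
        self._items.pop(ref, None)
```

Under this sketch, `ChannelConnection.credentials_ref` would hold only the string returned by `put()`, so swapping the backend later does not change the connection record.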
## Connector Contract
Every connector implements a setup contract:
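```python
from __future__ import annotations

from typing import Protocol


class ChannelConnector(Protocol):
    # Request and result types (StartPairingRequest, PairingSession, ...) are defined elsewhere.
    kind: str

    async def start_pairing(self, request: StartPairingRequest) -> PairingSession: ...
    async def complete_pairing(self, event: PairingEvent) -> ChannelConnection: ...
    async def validate(self, connection_id: str) -> ValidationResult: ...
    async def materialize_runtime(self, connection_id: str) -> ChannelRuntimeSpec: ...
    async def revoke(self, connection_id: str) -> None: ...
```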
`materialize_runtime()` returns the adapter-ready config:
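```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ChannelRuntimeSpec:
    channel_id: str
    kind: str
    mode: str
    account_id: str
    display_name: str
    config: dict[str, Any]
    secrets_ref: str | None  # reference into the credential store, not the secret itself
    external_endpoint: str | None  # set only when an external connector process is used
```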
The runtime may still use `ChannelConfig` internally, but the source of truth becomes `ChannelConnectionStore` rather than static `BeaverConfig.channels` alone.
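To illustrate how the contract and a registry might fit together, here is a sketch with a fake token connector. `ConnectorRegistry`, `FakeTokenConnector`, the trimmed `RuntimeSpec`, and the simplified `complete_pairing` signature are all assumptions made for the example, not the final API.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field
from typing import Any


@dataclass
class RuntimeSpec:
    """Trimmed stand-in for ChannelRuntimeSpec, kept small for the example."""
    channel_id: str
    kind: str
    config: dict[str, Any] = field(default_factory=dict)
    secrets_ref: str | None = None


class FakeTokenConnector:
    """Hypothetical connector that pairs by recording a credentials reference."""
    kind = "telegram"

    def __init__(self) -> None:
        self._refs: dict[str, str] = {}  # connection_id -> credentials_ref

    async def complete_pairing(self, connection_id: str, credentials_ref: str) -> None:
        self._refs[connection_id] = credentials_ref

    async def materialize_runtime(self, connection_id: str) -> RuntimeSpec:
        if connection_id not in self._refs:
            raise LookupError(f"connection {connection_id} is not paired")
        return RuntimeSpec(
            channel_id=f"{self.kind}:{connection_id}",
            kind=self.kind,
            config={"mode": "polling"},
            secrets_ref=self._refs[connection_id],
        )


class ConnectorRegistry:
    """Dispatches to a connector by its `kind` string."""

    def __init__(self) -> None:
        self._by_kind: dict[str, Any] = {}

    def register(self, connector: Any) -> None:
        self._by_kind[connector.kind] = connector

    def get(self, kind: str) -> Any:
        try:
            return self._by_kind[kind]
        except KeyError:
            raise LookupError(f"no connector registered for kind {kind!r}") from None


async def demo() -> RuntimeSpec:
    registry = ConnectorRegistry()
    registry.register(FakeTokenConnector())
    connector = registry.get("telegram")
    await connector.complete_pairing("conn-1", "cred:abc")
    return await connector.materialize_runtime("conn-1")
```

Running `asyncio.run(demo())` yields a spec whose `secrets_ref` points at the stored credential, which is the only secret-related value the runtime needs to see.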
## Control APIs
Add backend APIs for the connection UI:
```text
GET /api/channel-connectors
GET /api/channel-connections
POST /api/channel-connections
GET /api/channel-connections/{connection_id}
POST /api/channel-connections/{connection_id}/pairing/start
POST /api/channel-connections/{connection_id}/pairing/complete
POST /api/channel-connections/{connection_id}/validate
POST /api/channel-connections/{connection_id}/start
POST /api/channel-connections/{connection_id}/stop
POST /api/channel-connections/{connection_id}/revoke
GET /api/channel-connections/{connection_id}/events
```
The existing `/api/channels` status endpoint can keep reporting runtime adapter status, but the UI should prefer `/api/channel-connections` for setup state.
## UI Flow
The status page becomes a channel connection page:
```text
Add Channel
-> choose platform
-> connector-specific setup form
-> QR/OAuth/token/app credential validation
-> connection status
-> start runtime channel
-> test message or platform health check
```
The UI must distinguish:
- setup state: pairing, credential validation, revoked.
- runtime state: adapter running, disconnected, outbound failed.
- platform state: QR expired, app not installed, permission missing, token invalid.
This avoids the current problem where all failures collapse into adapter startup errors.
## External Connector Process
Some channels should run through an external process:
```text
ExternalConnectorProcess
-> Beaver connector control API
-> local Unix/TCP/WebSocket bridge
-> ChannelRuntime external adapter
```
The external process must not receive permanent backend admin credentials through QR codes or copied commands. It should receive a short-lived pairing token with a narrow scope:
```text
scope: channel:pair
kind: weixin
expires_in: 10 minutes
one_time: true
```
After pairing, Beaver stores the resulting connection credentials and gives the connector a renewable connection token scoped to that connection only.
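The pairing token rules above (short-lived, one-time, scoped to one kind) can be sketched as a small in-memory store. `PairingTokenStore`, `PairingGrant`, and the injectable `clock` parameter are hypothetical names for illustration; a durable implementation would persist the grants.

```python
from __future__ import annotations

import hashlib
import secrets
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class PairingGrant:
    scope: str
    kind: str
    expires_at: float
    used: bool = False


class PairingTokenStore:
    def __init__(self, ttl_seconds: float = 600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._grants: dict[str, PairingGrant] = {}  # sha256(token) -> grant

    @staticmethod
    def _digest(token: str) -> str:
        # Only a digest is kept, so a leaked store does not leak usable tokens.
        return hashlib.sha256(token.encode()).hexdigest()

    def issue(self, kind: str) -> str:
        token = secrets.token_urlsafe(32)
        self._grants[self._digest(token)] = PairingGrant(
            scope="channel:pair", kind=kind, expires_at=self._clock() + self._ttl
        )
        return token

    def redeem(self, token: str, kind: str) -> bool:
        grant = self._grants.get(self._digest(token))
        if grant is None or grant.used or grant.kind != kind:
            return False
        if self._clock() >= grant.expires_at:
            return False
        grant.used = True  # one-time: a second redeem of the same token fails
        return True
```

A redeem attempt with the wrong connector kind is rejected without consuming the token, so a misrouted request does not burn a valid pairing session.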
## Per-Channel Assessment
### Feishu / Lark
Feishu/Lark should be a first-class connector, not only a static adapter.
Recommended first implementation:
- connector kind: `feishu`
- setup fields: domain, app id, app secret, connection mode.
- default mode: WebSocket long connection.
- optional mode: webhook.
- runtime adapter: may be Python if coverage is sufficient, or an external Node connector when using official Channel SDK behavior.
Required setup checks:
- app credentials are present.
- bot/event permissions are configured.
- event subscription mode is valid.
- bot identity can be resolved.
- a test direct message or event subscription health check can run when available.
The connector should expose both "manual app credential setup" and future "install from app template" paths. The manual path is enough for the first Beaver release.
### Weixin
Weixin should use an external connector process.
Recommended first implementation:
- connector kind: `weixin`
- setup mode: local plugin command plus QR login.
- external process: required.
- runtime adapter: external bridge adapter that receives normalized events from the connector.
Required setup checks:
- local connector installed.
- connector version is compatible with Beaver.
- QR session is pending, scanned, confirmed, expired, or failed.
- login state is stored behind `credentials_ref`.
- connector heartbeat is visible.
Group delivery remains best-effort. The connector must surface group capability separately from direct message capability.
### Telegram
Telegram can be implemented as an internal connector plus internal adapter.
Recommended first implementation:
- setup mode: bot token entry.
- validation: call Telegram `getMe`.
- runtime mode: polling by default, webhook optional.
- no external process required.
The UI still treats it as a connector so users can add, validate, revoke, and restart it without editing JSON.
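As a sketch of the validation step, the Telegram Bot API exposes `getMe` at `https://api.telegram.org/bot<token>/getMe` and answers with an `ok` flag plus a `result` object describing the bot. The function names, the injectable `fetch` parameter, and the shape of the returned dict are assumptions for illustration.

```python
from __future__ import annotations

import json
import urllib.error
import urllib.request
from typing import Callable

TELEGRAM_API = "https://api.telegram.org"


def fetch_json(url: str) -> dict:
    """Default fetcher. Never log `url`: it embeds the bot token."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as exc:
        # Telegram sends a JSON error body alongside non-2xx status codes.
        try:
            return json.loads(exc.read())
        except ValueError:
            return {"ok": False, "description": f"HTTP {exc.code}"}


def validate_bot_token(token: str, fetch: Callable[[str], dict] = fetch_json) -> dict:
    payload = fetch(f"{TELEGRAM_API}/bot{token}/getMe")
    if not payload.get("ok"):
        return {"valid": False, "error": payload.get("description", "unknown error")}
    bot = payload["result"]
    return {
        "valid": True,
        "account_id": str(bot["id"]),
        "display_name": bot.get("username") or bot.get("first_name", ""),
    }
```

Passing a fake `fetch` keeps unit tests offline, in line with keeping live platform tests manual or gated.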
### QQBot
QQBot should start as an internal connector with official gateway credentials.
Recommended first implementation:
- setup fields: app id, client secret, intents or permission hints.
- runtime mode: WebSocket gateway.
- validation: token exchange or gateway auth dry run when available.
If SDK/runtime behavior later becomes easier outside Python, this connector can move to an external process without changing the runtime message contract.
### Terminal
Terminal should move from raw `peer_id` to pairing.
Recommended first implementation:
- UI creates a terminal pairing session.
- Beaver displays a command or QR/setup code.
- device connects with one-time pairing token.
- Beaver binds a stable device identity to a `ChannelConnection`.
- subsequent WebSocket `connect` frames authenticate as the bound device.
The message protocol can keep `connect`, `connected`, `message`, `ack`, and assistant `message`, but production connections must include an authenticated device token.
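One way the authenticated `connect` frame could be checked is sketched below. The frame fields `device_id` and `device_token` and the `DeviceBindings` class are hypothetical; the existing terminal spec does not define them.

```python
from __future__ import annotations

import hashlib
import hmac
import secrets


class DeviceBindings:
    """Maps a stable device identity to a hashed device token."""

    def __init__(self) -> None:
        self._token_hashes: dict[str, str] = {}  # device_id -> sha256(token)

    def bind(self, device_id: str) -> str:
        # Called once, after the one-time pairing token has been redeemed.
        token = secrets.token_urlsafe(32)
        self._token_hashes[device_id] = hashlib.sha256(token.encode()).hexdigest()
        return token

    def authenticate(self, frame: dict) -> bool:
        if frame.get("type") != "connect":
            return False
        expected = self._token_hashes.get(str(frame.get("device_id", "")))
        presented = hashlib.sha256(str(frame.get("device_token", "")).encode()).hexdigest()
        # Constant-time comparison avoids leaking how much of the digest matched.
        return expected is not None and hmac.compare_digest(expected, presented)

    def revoke(self, device_id: str) -> None:
        self._token_hashes.pop(device_id, None)
```

A frame that carries only a raw `peer_id` fails this check, which is the behavior the pairing change is meant to enforce for production connections.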
## Message Flow After Pairing
Once a connection is paired, the message path stays unchanged:
```text
platform or device
-> connector transport
-> ChannelAdapter
-> ChannelRuntime.accept_inbound()
-> MessageBus.inbound
-> AgentService.handle_inbound_message()
-> MessageBus.outbound
-> ChannelManager.dispatch_outbound()
-> ChannelAdapter.send()
-> connector transport
-> platform or device
```
This is intentionally conservative. Pairing changes how a channel becomes trusted and running; it does not change the agent loop.
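The bus-first path above can be sketched with two `asyncio.Queue` objects standing in for `MessageBus.inbound` and `MessageBus.outbound`. The echo reply and the message dict shape are placeholders; the real `AgentService` and adapter contracts are not shown here.

```python
from __future__ import annotations

import asyncio


async def run_once(text: str) -> list[dict]:
    inbound: asyncio.Queue = asyncio.Queue()   # stands in for MessageBus.inbound
    outbound: asyncio.Queue = asyncio.Queue()  # stands in for MessageBus.outbound
    delivered: list[dict] = []                 # what ChannelAdapter.send() would transmit

    async def accept_inbound(message: dict) -> None:
        # ChannelRuntime.accept_inbound(): the adapter hands off to the bus.
        await inbound.put(message)

    async def handle_inbound_message() -> None:
        # AgentService reads from the bus and publishes a reply (an echo here).
        message = await inbound.get()
        reply = {"channel_id": message["channel_id"], "text": f"echo: {message['text']}"}
        await outbound.put(reply)

    async def dispatch_outbound() -> None:
        # ChannelManager.dispatch_outbound() routes the reply back to the adapter.
        delivered.append(await outbound.get())

    await accept_inbound({"channel_id": "telegram:conn-1", "text": text})
    await asyncio.gather(handle_inbound_message(), dispatch_outbound())
    return delivered
```

Nothing in this path refers to pairing state, which reflects the point that connectors only decide whether a channel exists and is trusted.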
## Access Control
Connection setup requires a Beaver user or backend owner identity. The connector layer decides who may create, view, revoke, or start a connection.
Inbound platform messages still use adapter-level policy:
- `open`: accept all senders in that platform scope.
- `allowlist`: accept only known users/groups.
- `disabled`: ignore that scope.
The important change is that allowlists belong to the connection settings rather than living only in ad hoc adapter config.
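A sketch of how the three policies might be evaluated per scope. The scope names `direct` and `group`, the `ScopePolicy` type, and the default-to-disabled behavior for unknown scopes are assumptions for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ScopePolicy:
    mode: str = "disabled"                  # "open", "allowlist", or "disabled"
    allowed_ids: frozenset = frozenset()    # consulted only in "allowlist" mode


def accepts(policies: dict[str, ScopePolicy], scope: str, sender_id: str) -> bool:
    # Scopes without an explicit policy are treated as disabled (assumption).
    policy = policies.get(scope, ScopePolicy())
    if policy.mode == "open":
        return True
    if policy.mode == "allowlist":
        return sender_id in policy.allowed_ids
    if policy.mode == "disabled":
        return False
    raise ValueError(f"unknown policy mode: {policy.mode!r}")
```

Storing the `policies` mapping with the connection settings lets the UI edit allowlists without touching adapter config.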
## Error Handling
Pairing errors:
- expired pairing token.
- QR not scanned before timeout.
- OAuth callback state mismatch.
- platform permission missing.
- credentials validation failed.
Runtime errors:
- adapter startup failed.
- connector process unavailable.
- heartbeat missed.
- inbound normalization failed.
- outbound delivery failed.
Each error should be recorded as an event against `connection_id` and, when available, `channel_id` and `session_id`.
## Security Requirements
- Pairing tokens are short-lived, one-time, and scoped to one connector kind.
- QR codes never embed permanent backend credentials.
- External connector processes do not receive broad backend admin tokens.
- Revoking a connection invalidates connector tokens and stops the runtime channel.
- Stored platform credentials are referenced by `credentials_ref`.
- Event logs must not include raw secrets, tokens, QR payloads, or full platform credential responses.
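The redaction requirement could be enforced by a helper applied to every event payload before it is written. The marker list and the rule that keys ending in `_ref` are safe to log are assumptions for this sketch.

```python
from __future__ import annotations

from typing import Any

# Substrings that mark a key as sensitive (assumed list, extend as needed).
SENSITIVE_MARKERS = ("secret", "token", "password", "credential", "qr_payload")
REDACTED = "[REDACTED]"


def _is_sensitive(key: str) -> bool:
    lowered = key.lower()
    # Opaque references such as `credentials_ref` are safe to log.
    if lowered.endswith("_ref"):
        return False
    return any(marker in lowered for marker in SENSITIVE_MARKERS)


def redact(value: Any) -> Any:
    """Return a copy of `value` with sensitive dict entries masked."""
    if isinstance(value, dict):
        return {
            key: REDACTED if _is_sensitive(str(key)) else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value
```

Key-based masking does not catch secrets embedded in free-text fields, so full platform responses should still be summarized rather than logged verbatim.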
## Relationship To Existing Channel Specs
The terminal WebSocket spec remains valid as a development transport spec, but production terminal setup must add pairing.
The chat platform adapter spec remains valid as a runtime adapter spec, but these statements should be revised before implementation:
- "Do not introduce a Node sidecar as the default channel architecture" should become "Use internal adapters by default, but allow external connector processes where platform SDK or login state requires them."
- "Pairing is out of scope for this phase" should become "Pairing is owned by the connector layer; adapters assume a materialized connection."
- Static `BeaverConfig.channels` should become a development override and backward-compatible import path, not the only source of runtime channels.
## Rollout
Implement in this order:
1. `ChannelConnectionStore`, connector registry, and connection status APIs.
2. Telegram connector as the simplest token-based setup path.
3. Terminal pairing to remove raw unauthenticated `peer_id` usage.
4. Feishu/Lark connector with WebSocket long-connection mode and credential validation.
5. Weixin external connector bridge with QR pairing.
6. QQBot connector after the common credential and gateway patterns are stable.
This order proves the common connector lifecycle with a low-risk token channel before adding QR and external process complexity.
## Testing
Add unit tests for:
- connection store create/update/revoke.
- pairing token expiry and one-time use.
- connector registry dispatch by kind.
- materializing runtime specs from connections.
- secret redaction in events.
- adapter runtime still receiving normalized `InboundMessage`.
Add integration-style tests with fake connectors for:
- successful token setup.
- QR expired and QR completed.
- external connector heartbeat loss.
- revoke stops runtime dispatch.
Live platform tests remain manual or gated behind explicit environment variables.
## Acceptance Criteria
- A user can add a channel connection without editing backend JSON.
- Beaver can show setup state separately from runtime adapter state.
- Telegram can validate a bot token and materialize a runtime channel.
- Terminal can bind through a one-time pairing flow.
- Feishu/Lark design allows official SDK or Node connector use when needed.
- Weixin design requires an external connector and QR login state.
- Existing channel runtime message flow remains bus-first and adapter-mediated.