diff --git a/docs/superpowers/specs/2026-06-02-external-sidecar-connectors-design.md b/docs/superpowers/specs/2026-06-02-external-sidecar-connectors-design.md
index 32ce091..b6c851e 100644
--- a/docs/superpowers/specs/2026-06-02-external-sidecar-connectors-design.md
+++ b/docs/superpowers/specs/2026-06-02-external-sidecar-connectors-design.md
@@ -15,6 +15,8 @@ This design intentionally fixes four architecture constraints before implementat
 - The sidecar is generic. Beaver depends on a connector HTTP contract, not on one vendor runtime.
 - Pairing is modeled as a broader `ConnectorSession`, because Feishu/Lark install/link flows are not only QR pairing.
 - Bridge events include `eventId`, `timestamp`, and `deliveryAttempt`, and Beaver dedupes bridge events before they can trigger duplicate agent replies.
+- Bridge authentication is service-level in the first version. The shared connector token lives in environment variables, not per-connection credentials.
+- Outbound sidecar sends include a required `requestId` so sidecar retries are idempotent.
 - Connected sessions dynamically register runtime channels. A successful Weixin or Feishu/Lark connection must not require a Beaver restart.
 
 ## Scope
@@ -60,7 +62,7 @@ Beaver owns:
 - connection state in `ChannelConnectionStore`
 - credential references in `CredentialStore`
 - connector session state exposed to the web UI
-- bridge endpoint authentication
+- service-level connector authentication
 - bridge event dedupe
 - normalized runtime message admission
 - runtime channel lifecycle
@@ -161,12 +163,14 @@ ChannelConfig(
         "connectionId": "conn_...",
         "sidecarBaseUrl": "http://external-connector:8787",
     },
-    secrets={"bridgeToken": "..."},
+    secrets={},
 )
 ```
 
 The original `ChannelConnection.kind` remains `weixin` or `feishu`; only the runtime transport kind is generic.
 
+`ExternalConnectorChannel` authenticates outbound calls with the service-level connector token configured in Beaver's process environment, not with a per-channel secret. The same first-version deployment may use one shared token value for both directions, exposed as `EXTERNAL_CONNECTOR_TOKEN` to Beaver and `BEAVER_BRIDGE_TOKEN` to the sidecar.
+
 ## Dynamic Runtime Activation
 
 A connected connector session must activate without restarting Beaver.
@@ -181,6 +185,15 @@ async def remove_channel(self, channel_id: str) -> None:
     ...
 ```
 
+`add_channel()` must run under a runtime lifecycle lock and has deterministic duplicate semantics:
+
+- Same `channel_id` and same effective `ChannelConfig`: no-op.
+- Same `channel_id` and changed effective `ChannelConfig`: build and start the replacement adapter before swapping it into the manager; after the swap succeeds, stop the old adapter.
+- Replacement start failure: keep the old adapter registered and running, and return the failure to the caller.
+- First registration after runtime start: build the adapter, register it, then start only that adapter.
+
+`remove_channel()` must also run under the lifecycle lock. Missing channel ids are no-op; existing channels are stopped and unregistered.
+
 When a connector session reaches `connected`:
 
 ```text
@@ -216,6 +229,7 @@ services:
     environment:
       BEAVER_BRIDGE_BASE_URL: http://app-instance:8080
       BEAVER_BRIDGE_TOKEN: ${BEAVER_BRIDGE_TOKEN}
+      CONNECTOR_API_TOKEN: ${EXTERNAL_CONNECTOR_TOKEN}
       CONNECTOR_HOME: /var/lib/external-connector
       CONNECTOR_PROVIDER: vendor_cli
     volumes:
@@ -227,6 +241,7 @@ For the current `create-instance.sh`-style deployment, the implementation adds:
 - `docker-compose.external-connectors.yml` for local/development sidecar tests.
 - documentation for attaching `external-connector` to the same Docker network as the target app instance.
 - instance environment `EXTERNAL_CONNECTOR_BASE_URL=http://external-connector:8787`.
+- instance environment `EXTERNAL_CONNECTOR_TOKEN=`.
 
 The implementation must not depend on Beaver mounting `/var/run/docker.sock`.
 
@@ -274,11 +289,12 @@ POST /send
   "channelId": "weixin-main",
   "displayName": "Weixin Main",
   "callbackBaseUrl": "http://app-instance:8080",
-  "bridgeToken": "...",
   "options": {}
 }
 ```
 
+The sidecar authenticates the connector-session request with `Authorization: Bearer `. It already has `BEAVER_BRIDGE_TOKEN` from its environment, so Beaver does not send bridge tokens in connector-session bodies.
+
 For Feishu/Lark, `kind` is `feishu` and `options` may include `domain`, `mode`, and optional app credentials when linking an existing bot. If using the official plugin installer to create a bot, the sidecar starts that installer flow and reports QR, instruction, or action status back to Beaver.
 
 `GET /connector-sessions/{session_id}` response:
@@ -315,6 +331,7 @@ Allowed connector session statuses:
 
 ```json
 {
+  "requestId": "out_...",
   "connectionId": "conn_...",
   "channelId": "weixin-main",
   "kind": "weixin",
@@ -330,6 +347,8 @@ Allowed connector session statuses:
 }
 ```
 
+`requestId` is required. Beaver must generate a stable request id for each outbound delivery attempt from the outbound message identity, and must reuse the same `requestId` if the same outbound delivery is retried. The sidecar dedupes `connectionId + requestId`; duplicate requests return the original send result and must not send a second platform message.
+
 ## Beaver Bridge API
 
 Add a backend bridge endpoint for sidecar inbound messages:
@@ -338,7 +357,7 @@ Add a backend bridge endpoint for sidecar inbound messages:
 POST /api/channel-connector-bridge/events
 ```
 
-The sidecar must authenticate every bridge request using a bearer token scoped to the connector service. Beaver rejects missing or invalid bridge tokens.
+The sidecar must authenticate every bridge request using the service-level bearer token from `BEAVER_BRIDGE_TOKEN`. Beaver rejects missing or invalid bridge tokens. Bridge tokens are deployment secrets, not connection records.
 
 Bridge event body:
 
@@ -405,7 +424,12 @@ class ConnectorMessageDedupeRecord:
 - `completed`
 - `failed`
 
-If a duplicate bridge event arrives while the record is `processing` or `completed`, Beaver returns an idempotent success response and does not call `ChannelRuntime.accept_inbound()` again.
+Duplicate handling:
+
+- `completed`: return idempotent success and do not call `ChannelRuntime.accept_inbound()` again.
+- `processing` updated less than 60 seconds ago: return `409 Conflict` with `{"retryAfterSeconds": 5}` so the sidecar retries later.
+- `processing` updated 60 seconds or more ago: treat the record as stale, increment `delivery_attempts`, update `updated_at`, and reprocess the event.
+- `failed`: allow reprocessing on the next delivery attempt, increment `delivery_attempts`, and clear `last_error` before calling runtime.
 
 This store is separate from runtime session dedupe. Runtime dedupe still protects platform message identity, while bridge dedupe protects connector retries.
 
@@ -419,7 +443,7 @@ Responsibilities:
 - start Weixin connector session through sidecar `/connector-sessions`
 - poll sidecar connector session status
 - create or update `ChannelConnection`
-- store bridge token and sidecar connection state reference in `CredentialStore`
+- store sidecar connection state reference in `CredentialStore` when the provider returns one
 - validate by checking sidecar connection status
 - materialize runtime config for `ExternalConnectorChannel`
 - activate runtime via `ChannelRuntime.add_channel()` when connected
@@ -470,15 +494,18 @@ The old `/api/channels` static config editor may remain for advanced runtime con
 - QR expired: status `expired`, user can start a new connector session.
 - Bridge token invalid: reject with `401`, record event without platform secret values.
 - Unknown connection id in bridge event: reject with `404`.
-- Duplicate bridge event: return idempotent success and do not call runtime again.
+- Duplicate completed bridge event: return idempotent success and do not call runtime again.
+- Duplicate in-flight bridge event: return `409 Conflict` until the 60-second processing TTL expires, then allow one reprocess.
 - Outbound send failure: mark outbound delivery failed and record connector error.
+- Duplicate outbound send `requestId`: sidecar returns the original send result and does not send a second platform message.
 - Sidecar restart: persisted provider state should survive through sidecar volume.
 
 ## Security
 
 - Beaver never logs raw tokens, app secrets, bridge tokens, or sidecar connection tokens.
-- Bridge token is generated by Beaver and stored behind `credentials_ref`.
-- Sidecar can only call bridge endpoints with its bridge token.
+- Bridge authentication uses a service-level token from environment variables. It is not stored per connection and is never returned by APIs.
+- Sidecar can only call bridge endpoints with the service-level bridge token.
+- Beaver can only call sidecar control and send endpoints with the service-level connector token.
 - Sidecar state volume contains login state and must be treated as sensitive.
 - Feishu user-identity mode has stronger privacy risk than bot-identity mode; UI must label it clearly if exposed.
 
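Reviewer sketch, not part of the patch: the duplicate-handling rules this change adds for bridge events (`completed`, fresh `processing`, stale `processing`, `failed`) can be read as one small decision function. The sketch below is a minimal illustration under stated assumptions. `DedupeRecord`, `Decision`, `decide_duplicate`, the `status` field name, the action labels, and the `200` status and response body for the idempotent-success case are all hypothetical; only `delivery_attempts`, `updated_at`, `last_error`, the 60-second TTL, and the `409` body come from the spec text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

PROCESSING_TTL = timedelta(seconds=60)  # spec: "processing" is stale at 60 seconds or more
RETRY_AFTER_SECONDS = 5                 # spec: 409 body is {"retryAfterSeconds": 5}


@dataclass
class DedupeRecord:
    """Hypothetical stand-in for ConnectorMessageDedupeRecord."""
    status: str                      # "processing" | "completed" | "failed" (field name assumed)
    updated_at: datetime
    delivery_attempts: int = 1
    last_error: Optional[str] = None


@dataclass
class Decision:
    """Hypothetical result type: what the bridge endpoint should do next."""
    action: str                      # "ack_duplicate" | "retry_later" | "reprocess"
    http_status: Optional[int] = None
    body: Optional[dict] = None


def decide_duplicate(record: DedupeRecord, now: datetime) -> Decision:
    """Apply the duplicate-handling rules to an existing dedupe record."""
    if record.status == "completed":
        # Idempotent success; the runtime must not be called again.
        # The 200 status and body shape are assumptions, not from the spec.
        return Decision("ack_duplicate", 200, {"duplicate": True})

    if record.status == "processing":
        if now - record.updated_at < PROCESSING_TTL:
            # Still in flight: ask the sidecar to retry later.
            return Decision("retry_later", 409, {"retryAfterSeconds": RETRY_AFTER_SECONDS})
        # Stale in-flight record: take it over and reprocess.
        record.delivery_attempts += 1
        record.updated_at = now
        return Decision("reprocess")

    if record.status == "failed":
        # Allow reprocessing and clear the previous error before calling runtime.
        record.delivery_attempts += 1
        record.last_error = None
        record.status = "processing"
        record.updated_at = now
        return Decision("reprocess")

    raise ValueError(f"unknown dedupe status: {record.status!r}")


if __name__ == "__main__":
    now = datetime(2026, 6, 2, 12, 0, tzinfo=timezone.utc)
    fresh = DedupeRecord("processing", now - timedelta(seconds=10))
    print(decide_duplicate(fresh, now).action)
```

In a real store the read-and-update would need to happen atomically (for example inside one transaction), otherwise two concurrent retries could both observe a stale record and both reprocess.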
@@ -489,11 +516,14 @@ Backend unit tests:
 - sidecar client fake for Weixin connector session start/status/logout/send
 - sidecar client fake for Feishu connector session start/status/logout/send
 - `ExternalConnectorChannel.send()` target mapping
+- `ExternalConnectorChannel.send()` includes stable `requestId` and connector bearer auth
 - `ChannelRuntime.add_channel()` dynamically starts and registers a channel
+- `ChannelRuntime.add_channel()` no-ops for identical config, replaces changed config, and keeps the old channel if replacement start fails
 - `ChannelRuntime.remove_channel()` stops and unregisters a channel
 - bridge endpoint accepts valid events
 - bridge endpoint rejects invalid token and unknown connection id
 - bridge endpoint dedupes repeated `eventId` and calls runtime once
+- bridge endpoint returns `409 Conflict` for non-stale `processing` duplicates and reprocesses stale records
 - registry lists `telegram`, `weixin`, and `feishu`
 - materialized sidecar connections produce `ChannelConfig(kind="external_connector", mode="http")` compatible with runtime factory
 
@@ -502,6 +532,7 @@ Sidecar tests:
 - HTTP API shape for health/connectors/connector-sessions/send
 - fake provider status transitions
 - provider command runner error redaction
+- send idempotency for duplicate `connectionId + requestId`
 
 Frontend tests:
 