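```
feat(engine): optimize assistant message handling in the agent loop
- Add the assistant message to the context only when there are no tool calls
- Ensure tool-call responses are correctly added to the message context
- Fix the conditional logic for message construction

fix(cron): improve time parsing for scheduled-task scheduling
- Add a regex import for parsing time display text
- Extract millisecond intervals from display text
- Make integer conversion safer to avoid type errors
- Optimize parsing of scheduled-task configuration

feat(outlook): improve Outlook integration features and stability
- Increase the default timeout from 10 seconds to 180 seconds
- Add an optional verification parameter to the status-check function
- Fetch mail overviews serially instead of in parallel
- Improve connection status verification logic

feat(channel): add an option to use the device name as the session identifier
- Add a new configuration option to the terminal WebSocket adapter
- Generate the session peer ID from the device name
- Record the original peer ID and device name as metadata
- Support creating a session peer ID from the device name

feat(skills): complete the skill learning evaluation system and progress tracking
- Automatically schedule pending skill drafts for evaluation at application startup
- Create a dedicated loop factory for skill evaluation work
- Implement cancellation and cleanup for asynchronous skill evaluation tasks
- Add skill evaluation progress reporting and status tracking
- Extend the session list API to include more detail
- Prevent operations on nonexistent sessions
- Optimize the business logic for skill draft submission and evaluation

perf(skills): improve the concurrency performance of skill evaluation
- Evaluate skill cases in parallel for higher throughput
- Add an environment variable that controls the maximum number of parallel cases
- Implement real-time evaluation progress updates and callbacks
- Optimize resource management and synchronization during evaluation

refactor(services): create isolated agent loop instances
- Add a factory method for creating independent agent loops
- Ensure new loops inherit the runtime service configuration
- Support scenarios that need an isolated environment, such as skill evaluation
```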
2026-06-15 14:48:16 +08:00
parent 8aeb97a5fc
commit 4b0bf65ace
53 changed files with 4328 additions and 292 deletions
--- a/docs/superpowers/specs/2026-06-15-plugin-skill-mirroring-design.md
+++ b/docs/superpowers/specs/2026-06-15-plugin-skill-mirroring-design.md
@@ -0,0 +1,409 @@
+# Beaver Plugin Skill Mirroring Design
+
+## Decision
+
+Beaver V1 plugins are declarative skill bundles. Enabling a plugin mirrors each declared
+`SKILL.md` and its supporting files into `SkillSpecStore`. From that point onward, the
+mirrored skill is a normal Beaver skill:
+
+- it has the same resolver priority as any workspace-managed skill;
+- runtime activation, receipts, performance scoring, replay evaluation, review, publish,
+  rollback, and disable all use the existing skill lifecycle;
+- self-learning only writes Beaver-managed versions and never edits the plugin package;
+- plugin origin remains metadata, not a separate runtime class.
+
+Arbitrary in-process Python entrypoints, hooks, providers, and custom runtime code are
+out of scope for this plan. Tool-providing plugins should continue to use MCP until a
+separate executable-plugin security design is approved.
+
+## Why The Proposed Flow Is Correct
+
+The proposed "mirror, learn on the mirror, merge on plugin update, then evaluate" flow is
+correct with one important refinement: plugin upgrades must be treated as a three-way
+merge, not a two-document rewrite.
+
+The three inputs are:
+
+1. `B`, the last accepted upstream plugin snapshot;
+2. `L`, the current Beaver-published skill, including local self-learning;
+3. `U`, the newly discovered upstream plugin snapshot.
+
+This distinction prevents a plugin update from silently deleting local learning and
+prevents local learning from silently discarding new upstream safety or workflow changes.
+
+## Package Contract
+
+Each plugin directory contains `beaver.plugin.json`:
+
+```json
+{
+  "schema_version": 1,
+  "id": "baoyu-comic",
+  "name": "Baoyu Comic",
+  "version": "1.2.0",
+  "skills": [
+    {
+      "name": "baoyu-comic",
+      "path": "skills/baoyu-comic"
+    }
+  ]
+}
+```
+
+Rules:
+
+- `id` and skill names use lowercase letters, digits, `_`, and `-`.
+- Skill paths are relative to the plugin root and cannot escape it.
+- Every skill directory must contain `SKILL.md`.
+- Symlinks are rejected while mirroring.
+- Two enabled plugins cannot own the same Beaver skill name.
+- A plugin cannot overwrite an existing non-plugin workspace skill.
+- Discovery does not enable a plugin. Enablement is an explicit admin action.
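
The lexical parts of these rules can be checked before touching the filesystem. The sketch below is a minimal illustration, not Beaver's actual validator: `validate_manifest` and `NAME_PATTERN` are hypothetical names, and the symlink, `SKILL.md` presence, and cross-plugin ownership checks are left out because they need filesystem and state access.

```python
import re
from pathlib import PurePosixPath

# Lowercase letters, digits, "_" and "-", per the rules above.
NAME_PATTERN = re.compile(r"[a-z0-9_-]+")


def validate_manifest(manifest: dict) -> list[str]:
    """Return human-readable problems; an empty list means the lexical checks passed."""
    problems: list[str] = []

    plugin_id = manifest.get("id")
    if not isinstance(plugin_id, str) or not NAME_PATTERN.fullmatch(plugin_id):
        problems.append(f"invalid plugin id: {plugin_id!r}")

    for entry in manifest.get("skills", []):
        name = entry.get("name")
        if not isinstance(name, str) or not NAME_PATTERN.fullmatch(name):
            problems.append(f"invalid skill name: {name!r}")

        raw_path = entry.get("path")
        if not isinstance(raw_path, str) or not raw_path:
            problems.append(f"missing skill path for {name!r}")
            continue
        path = PurePosixPath(raw_path)
        # Purely lexical: absolute paths and any ".." component are refused.
        if path.is_absolute() or ".." in path.parts:
            problems.append(f"skill path escapes plugin root: {raw_path!r}")

    return problems
```

A lexical check like this is only a first filter. Rejecting symlinks during mirroring, as the rules require, is what stops a path that looks safe from resolving outside the plugin root.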
+
+## Storage Model
+
+Plugin packages remain outside the managed skill version tree:
+
+```text
+workspace/
+  plugins/
+    baoyu-comic/
+      beaver.plugin.json
+      skills/baoyu-comic/SKILL.md
+  .beaver/
+    plugins/state.json
+  skills/
+    baoyu-comic/
+      skill.json
+      current.json
+      upstreams/
+        baoyu-comic/
+          <tree-hash>/
+            upstream.json
+            SKILL.md
+            assets/...
+      versions/
+        v0001/
+          version.json
+          SKILL.md
+          assets/...
+```
+
+`upstreams/` stores immutable raw plugin snapshots. `versions/` stores runtime-visible
+Beaver versions. A merged Beaver version may differ from its upstream snapshot.
+
+Every upstream snapshot has two hashes:
+
+- `skill_content_hash`: canonical hash of normalized `SKILL.md`; used by the LLM merge and
+  skill-content preservation checks.
+- `skill_tree_hash`: hash of every regular file in the skill tree, including normalized
+  relative path, byte length, bytes, and executable-bit metadata. Symlinks are rejected.
+  This is the supply-chain identity used for update detection and state.
+
+The tree hash includes `SKILL.md`, templates, assets, examples, and scripts. Full Unix
+mode bits are not hashed because umask and extraction tools can change them; only whether
+any executable bit is set is normalized into the hash. Beaver metadata files such as
+`version.json` and `upstream.json` are excluded.
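
One way to compute such a tree hash is sketched below. It is an illustration under stated assumptions, not the project's implementation: the function name `skill_tree_hash`, the use of SHA-256, the length-prefixed framing, and excluding the metadata files only at the top level of the tree are all choices made for this example.

```python
import hashlib
import os
from pathlib import Path

# Assumption: Beaver metadata files are excluded only at the top of the tree.
EXCLUDED_TOP_LEVEL = {"version.json", "upstream.json"}


def skill_tree_hash(root: Path) -> str:
    """Hash every regular file under root: path, byte length, bytes, executable flag."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            if (Path(dirpath) / name).is_symlink():
                raise ValueError(f"symlink rejected: {Path(dirpath) / name}")
        for name in filenames:
            path = Path(dirpath) / name
            rel = path.relative_to(root).as_posix()  # normalized relative path
            if rel in EXCLUDED_TOP_LEVEL:
                continue
            entries.append((rel, path))

    digest = hashlib.sha256()
    for rel, path in sorted(entries):  # sorted so the result is order-independent
        data = path.read_bytes()
        rel_bytes = rel.encode("utf-8")
        # Length prefixes keep adjacent fields from running into each other.
        digest.update(len(rel_bytes).to_bytes(8, "big"))
        digest.update(rel_bytes)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
        # Only "is any executable bit set" is hashed, not the full mode.
        digest.update(b"\x01" if path.stat().st_mode & 0o111 else b"\x00")
    return digest.hexdigest()
```

Because supporting files feed the digest, a change to an asset alone changes the hash, while adding or rewriting `version.json` does not.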
+
+Every Beaver `SkillVersion` also stores a backward-compatible `tree_hash`. New versions
+compute it from the complete promoted version directory. Older versions without the field
+derive it on read, so `L.tree` is available for upgrade classification.
+
+Plugin state records:
+
+```json
+{
+  "plugins": {
+    "baoyu-comic": {
+      "enabled": true,
+      "installed_version": "1.2.0",
+      "manifest_path": "plugins/baoyu-comic/beaver.plugin.json",
+      "updates_paused": false,
+      "skills": {
+        "baoyu-comic": {
+          "accepted_upstream_tree_hash": "sha256...",
+          "observed_upstream_tree_hash": "sha256...",
+          "accepted_beaver_version": "v0003",
+          "current_beaver_version": "v0003",
+          "pending_candidate_id": null,
+          "status": "synced"
+        }
+      }
+    }
+  }
+}
+```
+
+Skill versions and drafts also carry plugin provenance. State is operational metadata;
+version provenance is the durable audit record.
+
+## Initial Enable Flow
+
+When an admin enables a valid plugin:
+
+1. Discover and validate the manifest.
+2. Copy each declared skill into an immutable upstream snapshot.
+3. Reject ownership/name conflicts before changing any skill.
+4. Run the existing deterministic skill safety checker against an in-memory initial-mirror
+   draft and reject failed or critical results.
+5. Publish an exact Beaver mirror as the next skill version.
+6. Copy supporting files into that version.
+7. Mark the skill `source_kind="plugin"` and active.
+8. Record plugin ID, plugin version, source path, upstream hash, and mirror mode in
+   `SkillVersion.provenance`.
+9. Update plugin state only after all declared skills succeed.
+
+Initial enable is an explicit trust action, so it does not require LLM synthesis. Manifest
+validation, path validation, and the existing static skill safety checks still apply.
+
+All files are first written below a transaction staging directory on the same filesystem.
+Only after manifest validation, tree hashing, conflict checks, and safety checks pass are
+immutable upstream/version directories promoted with `os.replace()`. `current.json`,
+`skill.json`, and indexes are atomically replaced under the workspace write lock; plugin
+state is written last. A failed transaction may leave an unreferenced immutable directory,
+which cleanup can remove, but it cannot make a partial version runtime-visible.
+
+For a new skill, the complete staged skill directory is promoted once. For an existing
+skill, immutable directories and metadata are promoted first and `current.json` is
+replaced last as the runtime visibility switch. This provides per-skill atomic visibility;
+the workspace lock serializes writers across a multi-skill plugin operation.
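
The pointer switch described above can be sketched with a temporary file and `os.replace()`. This is a minimal illustration: `atomic_write_json` is a hypothetical helper, the `{"version": ...}` payload is an assumed shape for `current.json`, and the workspace write lock is assumed to be held by the caller.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(target: Path, payload: dict) -> None:
    """Write payload next to target, then swap it in with a single rename."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # The staging file lives in the same directory, hence the same filesystem,
    # which is what lets os.replace() swap it in atomically.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=".staging-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, target)  # readers see the old or the new file, never a mix
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


# Hypothetical usage: flip the runtime-visible pointer last.
# atomic_write_json(Path("workspace/skills/baoyu-comic/current.json"), {"version": "v0004"})
```

If the process dies before `os.replace()`, the only residue is an unreferenced staging file, which matches the cleanup behavior described above.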
+
+## Runtime Priority
+
+Mirrored plugin skills are loaded exclusively from `SkillSpecStore`. They are not supplied
+through `SkillsLoader.extra_dirs`.
+
+This makes priority deterministic:
+
+1. active published workspace versions, including plugin-origin versions;
+2. builtin skills.
+
+`source_kind="plugin"` is displayed for provenance but does not lower selection priority
+or exclude the skill from self-learning.
+
+## Upgrade Classification
+
+For each linked skill, compare the tree hashes of `B`, `L`, and `U`:
+
+| Condition | Action |
+| --- | --- |
+| `U.tree == B.tree` | No upstream change; no action. |
+| `L.tree == U.tree` | Acknowledge the new upstream snapshot; no draft needed. |
+| `L.tree == B.tree` and `U.tree != B.tree` | Create a deterministic `fast_forward` plugin update draft containing `U`. |
+| `L.tree != B.tree` and `U.tree != B.tree` | Create a `three_way` plugin update candidate using `B`, `L`, and `U`. |
+
+Even the `fast_forward` case goes through safety, replay evaluation, review, and publish.
+It skips LLM merge synthesis because there is no local divergence.
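
The table can be read as a small decision function. The sketch below assumes the rows are evaluated top to bottom, so the first matching row wins. The spec does not state this, but the second and fourth rows overlap otherwise. `UpgradeAction` and `classify_upgrade` are hypothetical names.

```python
from enum import Enum


class UpgradeAction(Enum):
    NO_ACTION = "no_action"
    ACKNOWLEDGE = "acknowledge"
    FAST_FORWARD = "fast_forward"
    THREE_WAY = "three_way"


def classify_upgrade(base_tree: str, local_tree: str, upstream_tree: str) -> UpgradeAction:
    """Classify a plugin update from the tree hashes of B, L, and U."""
    if upstream_tree == base_tree:
        return UpgradeAction.NO_ACTION      # upstream did not change
    if local_tree == upstream_tree:
        return UpgradeAction.ACKNOWLEDGE    # local already equals the new upstream
    if local_tree == base_tree:
        return UpgradeAction.FAST_FORWARD   # no local divergence
    return UpgradeAction.THREE_WAY          # both sides moved away from B
```

For example, `classify_upgrade("b", "b", "u")` yields `FAST_FORWARD`, while `classify_upgrade("b", "l", "u")` yields `THREE_WAY`.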
+
+Candidate IDs are deterministic:
+
+```text
+plugin-update:<plugin-id>:<skill-name>:<new-upstream-hash-prefix>
+```
+
+This makes boot-time sync idempotent.
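
A short sketch of why a deterministic ID gives idempotence: the same inputs always produce the same ID, so a repeated sync finds the candidate already recorded. The function names and the 12-character prefix length are assumptions for illustration; the spec does not fix the prefix length.

```python
def plugin_update_candidate_id(
    plugin_id: str, skill_name: str, upstream_tree_hash: str, prefix_len: int = 12
) -> str:
    """Build the candidate ID from the format shown above."""
    return f"plugin-update:{plugin_id}:{skill_name}:{upstream_tree_hash[:prefix_len]}"


def ensure_candidate(existing_ids: set[str], candidate_id: str) -> bool:
    """Record the candidate once; return True only when it was newly added."""
    if candidate_id in existing_ids:
        return False
    existing_ids.add(candidate_id)
    return True
```

In the real flow the existence check and the JSONL append would both happen inside the workspace write lock. The in-memory set here only stands in for that store.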
+
+Supporting files use a deterministic path-level three-way merge:
+
+- local unchanged from `B`: take `U`;
+- upstream unchanged from `B`: keep `L`;
+- both sides equal: keep either;
+- a file added on only one side: keep it;
+- divergent edits, delete-versus-edit, or different additions at the same path: record an
+  unresolved file conflict and block publication.
+
+The LLM merges only `SKILL.md`. It does not attempt to merge arbitrary or binary
+supporting files.
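
The path-level rules above can be sketched as a function over three path-to-content-hash maps. This is an illustrative sketch: `merge_supporting_files` is a hypothetical name, and a missing key stands for "file absent on that side".

```python
from typing import Optional


def merge_supporting_files(
    base: dict[str, str], local: dict[str, str], upstream: dict[str, str]
) -> tuple[dict[str, str], list[str]]:
    """Return (merged path -> hash, conflicting paths) for B, L, and U."""
    merged: dict[str, str] = {}
    conflicts: list[str] = []
    for path in sorted(set(base) | set(local) | set(upstream)):
        b: Optional[str] = base.get(path)
        l: Optional[str] = local.get(path)
        u: Optional[str] = upstream.get(path)
        if l == u:
            chosen = l          # both sides equal: keep either
        elif l == b:
            chosen = u          # local unchanged from B: take U
        elif u == b:
            chosen = l          # upstream unchanged from B: keep L
        else:
            conflicts.append(path)  # divergent edit, delete-vs-edit, or clashing add
            continue
        if chosen is not None:  # None means the file is deleted in the result
            merged[path] = chosen
    return merged, conflicts
```

A file added on only one side falls out of the same comparisons: the other side still equals the (absent) base entry, so the addition is kept. Any non-empty conflict list would block publication.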
+
+## Learning Integration
+
+Add candidate kind `plugin_skill_update`. Its evidence contains only references:
+
+```json
+{
+  "plugin_id": "baoyu-comic",
+  "plugin_version": "1.2.0",
+  "skill_name": "baoyu-comic",
+  "merge_mode": "three_way",
+  "base_upstream_tree_hash": "old-hash",
+  "new_upstream_tree_hash": "new-hash",
+  "local_version": "v0003"
+}
+```
+
+The learning service resolves the actual snapshots from `SkillSpecStore`; raw skill
+content is not duplicated into `learning-candidates.jsonl`.
+
+For `three_way`, the synthesizer receives:
+
+- old upstream `B`;
+- current local skill `L`;
+- new upstream `U`;
+- relevant historical run evidence for `L`, when available.
+
+The synthesizer must return the merged skill plus explicit merge decisions:
+
+```json
+{
+  "frontmatter": {},
+  "content": "...",
+  "change_reason": "...",
+  "preserved_local_sections": [],
+  "adopted_upstream_sections": [],
+  "resolved_conflicts": [],
+  "dropped_sections": []
+}
+```
+
+The generated draft uses `proposal_kind="plugin_skill_update"` and carries the complete
+plugin merge provenance.
+
+## Evaluation And Publish Gates
+
+The existing flow remains authoritative:
+
+```text
+candidate
+  -> draft
+  -> static safety
+  -> replay eval
+  -> review
+  -> publish
+  -> rollback if needed
+```
+
+Replay eval compares:
+
+- baseline arm: current local version `L`;
+- candidate arm: merged draft `M`.
+
+The preservation report is extended for plugin updates:
+
+- local preservation: important instructions from `L` are not silently removed;
+- upstream adoption: new important instructions from `U` are represented;
+- safety/tool preservation: Safety and Required Tools changes require explicit handling;
+- unresolved conflicts cause evaluation failure.
+
+Publishing is blocked when:
+
+- static safety fails;
+- replay evaluation regresses;
+- confidence is low under the existing gate;
+- local or upstream preservation fails;
+- merge decisions contain unresolved `SKILL.md` conflicts;
+- the supporting-file merge plan contains unresolved path/content conflicts.
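
The blocking conditions above amount to a conjunction of gates. The sketch below collects every failing gate so a reviewer sees all reasons at once. `GateInputs`, its field names, and `publish_blockers` are hypothetical and only mirror the list in this section.

```python
from dataclasses import dataclass, field


@dataclass
class GateInputs:
    static_safety_passed: bool
    replay_regressed: bool
    confidence_low: bool
    local_preservation_passed: bool
    upstream_preservation_passed: bool
    unresolved_skill_conflicts: list[str] = field(default_factory=list)
    unresolved_file_conflicts: list[str] = field(default_factory=list)


def publish_blockers(gates: GateInputs) -> list[str]:
    """Return the reasons publication is blocked; an empty list means it may proceed."""
    reasons: list[str] = []
    if not gates.static_safety_passed:
        reasons.append("static safety failed")
    if gates.replay_regressed:
        reasons.append("replay evaluation regressed")
    if gates.confidence_low:
        reasons.append("confidence below the existing gate")
    if not (gates.local_preservation_passed and gates.upstream_preservation_passed):
        reasons.append("local or upstream preservation failed")
    if gates.unresolved_skill_conflicts:
        reasons.append("unresolved SKILL.md conflicts")
    if gates.unresolved_file_conflicts:
        reasons.append("unresolved supporting-file conflicts")
    return reasons
```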
+
+On publish, the pipeline notifies `PluginManager`, which advances
+`accepted_upstream_tree_hash`, clears the pending candidate, and records the new Beaver
+version.
+
+Observer delivery is not the source of truth. At the start of every sync, reconciliation
+inspects the current published version provenance. If it contains a valid, newer
+`plugin_skill_update` result and its upstream snapshot exists, plugin state is repaired:
+
+- advance `accepted_upstream_tree_hash`;
+- advance `accepted_beaver_version`;
+- clear the matching pending candidate;
+- set status to `synced`.
+
+Reconciliation never moves `accepted_beaver_version` backwards after a runtime rollback.
+An observer failure is audited but does not make an already-successful publish request
+fail, which avoids client retries creating misleading duplicate operations.
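
A sketch of the repair step follows. It assumes the per-skill state entry shown under Storage Model, a provenance dictionary with hypothetical `proposal_kind`, `new_upstream_tree_hash`, and `candidate_id` keys, and version labels of the form `vNNNN`. None of these shapes beyond the state keys are confirmed by the spec.

```python
def version_number(label: str) -> int:
    """Convert a label such as "v0003" to 3 (assumes the "vNNNN" form)."""
    return int(label[1:])


def reconcile_skill_state(
    state: dict, provenance: dict, published_version: str, snapshot_exists: bool
) -> bool:
    """Repair state from the published version's provenance; return True if changed."""
    if provenance.get("proposal_kind") != "plugin_skill_update":
        return False
    new_hash = provenance.get("new_upstream_tree_hash")
    if not new_hash or not snapshot_exists:
        return False
    # Only move forward: a rollback to an older version must not rewind state.
    if version_number(published_version) <= version_number(state["accepted_beaver_version"]):
        return False

    state["accepted_upstream_tree_hash"] = new_hash
    state["accepted_beaver_version"] = published_version
    if state.get("pending_candidate_id") == provenance.get("candidate_id"):
        state["pending_candidate_id"] = None
    state["status"] = "synced"
    return True
```

Running this at the start of every sync makes the observer notification an optimization rather than a correctness requirement.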
+
+## Concurrent And Failure Behavior
+
+- All plugin sync, skill version allocation/publication, plugin state mutation, and
+  learning-candidate mutation share a reentrant cross-process workspace write lock at
+  `.beaver/locks/plugin-skill-write.lock`.
+- The lock uses the repository's existing `fcntl`/`msvcrt` pattern plus an in-process
+  reentrant guard. Nested store calls reuse the held lock instead of deadlocking.
+- Candidate existence checks and JSONL writes happen inside the lock.
+- Version-number allocation and version promotion happen inside the lock.
+- Explicit enable/sync waits for the lock with a bounded timeout and returns a busy error
+  on timeout.
+- Engine boot never calls an LLM. Its auto-sync uses a non-blocking lock attempt; when the
+  lock is busy, boot proceeds with the current published skills and reports sync deferred.
+- Repeated and concurrent boot/sync is idempotent across processes, not only within one
+  Python object.
+- If another active draft targets the same skill, the plugin update remains pending and
+  is not synthesized until the skill is free.
+- If a newer plugin version appears while an older update is pending, the old candidate is
+  marked superseded and a new candidate is created against the last accepted upstream.
+- Rejecting a draft preserves the plugin package, upstream snapshots, current skill, and
+  candidate audit history. Regeneration remains possible.
+- Partial multi-skill plugin enable never promotes metadata/current pointers until every
+  staged skill passes validation.
+- Plugin files are never modified by learning or publication.
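
The non-blocking boot-time attempt could look like the sketch below. It is POSIX-only (`fcntl`), omits the `msvcrt` branch and the in-process reentrant guard mentioned above, and `try_workspace_lock` is a hypothetical name. It shows only the "try once, defer if busy" behavior.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def try_workspace_lock(lock_path: str):
    """Yield True if the exclusive lock was acquired, False if it is busy."""
    os.makedirs(os.path.dirname(lock_path) or ".", exist_ok=True)
    handle = open(lock_path, "a+")
    try:
        try:
            # LOCK_NB makes flock fail immediately instead of waiting.
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            yield False  # busy: the caller should report "sync deferred"
            return
        try:
            yield True
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
    finally:
        handle.close()
```

A boot path would wrap its auto-sync in `with try_workspace_lock(path) as acquired:` and skip the sync when `acquired` is false. An explicit enable or sync would instead retry until a bounded timeout elapses and then return a busy error.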
+
+## Pause, Disable, Missing, And Adopt
+
+- Pausing updates suspends discovery-to-candidate sync while linked skills remain active.
+- Resuming updates reconciles state and performs sync.
+- Disabling a plugin is an explicit destructive runtime action: it pauses updates and
+  disables linked skills, but never deletes versions or upstream snapshots. The API
+  requires an explicit `disable_linked_skills=true` confirmation.
+- Re-enabling restores linked skills and performs sync.
+- A missing plugin package is a supply-chain status only. It marks the plugin `missing`,
+  suspends sync/update, and leaves the current Beaver skills active.
+- An explicit `adopt` action detaches a skill from its plugin, changes
+  `source_kind` to `managed`, keeps the current version active, and prevents future plugin
+  updates from targeting it.
+
+## Management API And UI
+
+Backend endpoints:
+
+```text
+GET  /api/plugins
+POST /api/plugins/sync
+POST /api/plugins/{plugin_id}/enable
+POST /api/plugins/{plugin_id}/pause
+POST /api/plugins/{plugin_id}/resume
+POST /api/plugins/{plugin_id}/disable
+POST /api/plugins/{plugin_id}/skills/{skill_name}/adopt
+```
+
+API payloads never expose absolute server paths. Workspace manifests use workspace-relative
+paths. External manifests use a redacted display path such as
+`<external>/baoyu-comic/beaver.plugin.json`.
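
The path redaction rule can be sketched as follows. `display_manifest_path` is a hypothetical helper, and the sketch assumes both paths are already absolute and normalized. It uses pure paths so it never touches the filesystem.

```python
from pathlib import PurePosixPath


def display_manifest_path(manifest_path: PurePosixPath, workspace_root: PurePosixPath) -> str:
    """Return a workspace-relative path, or a redacted form for external manifests."""
    try:
        return manifest_path.relative_to(workspace_root).as_posix()
    except ValueError:
        # Outside the workspace: keep only the plugin directory and file name.
        return f"<external>/{manifest_path.parent.name}/{manifest_path.name}"
```

For a manifest at `/srv/ws/plugins/baoyu-comic/beaver.plugin.json` with workspace root `/srv/ws`, this returns `plugins/baoyu-comic/beaver.plugin.json`. The same file under `/opt/shared/baoyu-comic/` is shown as `<external>/baoyu-comic/beaver.plugin.json`.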
+
+The existing Skills page gains a Plugins tab showing:
+
+- discovered/enabled/missing/error state;
+- installed and discovered plugin versions;
+- declared skills and their current Beaver versions;
+- sync state and pending learning candidate;
+- enable, pause, resume, disable, sync, and adopt actions.
+
+Plugin-origin skills continue to appear in the normal Published, Candidates, and Drafts
+tabs with provenance and merge-mode labels.
+
+## Non-Goals
+
+- Importing arbitrary plugin Python modules into the Beaver process.
+- Plugin-defined hooks, providers, channels, or frontend bundles.
+- Automatic downloading from a plugin marketplace.
+- Automatically publishing plugin upgrades without review.
+- Editing or rebasing plugin source files in place.
+
+## Acceptance Criteria
+
+1. Enabling a plugin mirrors all declared skills and supporting files into managed
+   versions.
+2. Mirrored skills have the same resolver priority and learning eligibility as ordinary
+   workspace skills.
+3. Self-learning never modifies the plugin package.
+4. Plugin updates create idempotent `plugin_skill_update` candidates.
+5. Local divergence triggers a three-way merge; no divergence triggers a deterministic
+   fast-forward draft.
+6. Every plugin update passes the existing safety, replay, review, and publish gates.
+7. Publishing advances plugin state and preserves complete provenance.
+8. Pause, disable, missing package, rejection, restart, and newer-update races do not lose
+   the current skill or its learned versions; missing packages leave current skills active.
+9. Existing non-plugin skills and legacy candidate payloads remain backward compatible.
+10. Supporting-file-only updates change the upstream tree hash and create an update
+    candidate.
+11. Concurrent boot, sync, and enable operations do not allocate duplicate versions or
+    append duplicate candidates.
+12. Sync reconciliation repairs plugin state after a published version succeeds but its
+    observer/state update fails.