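feat(engine): Refine assistant-message handling in the agent loop
- Add the assistant message to the context only when there are no tool calls
- Ensure tool-call responses are added to the message context correctly
- Fix the conditional logic for message construction

fix(cron): Improve time parsing for scheduled-task scheduling
- Import the regular-expression module for parsing time display text
- Extract millisecond intervals from display text
- Make integer conversion safer to avoid type errors
- Streamline parsing of scheduled-task configuration

feat(outlook): Improve the features and stability of the Outlook integration
- Raise the default timeout from 10 seconds to 180 seconds
- Add an optional verification parameter to the status-check function
- Fetch mail overviews serially rather than in parallel
- Improve the connection-status verification logic

feat(channel): Add an option to use the device name as the session identifier
- Add a new configuration option to the terminal WebSocket adapter
- Generate the session peer ID from the device name
- Record the original peer ID and the device name as metadata
- Support creating session peer IDs from device names

feat(skills): Complete the skill-learning evaluation system and progress tracking
- Automatically schedule pending skill drafts for evaluation at application startup
- Create a dedicated loop factory for skill-evaluation work
- Implement cancellation and cleanup for asynchronous skill-evaluation tasks
- Add skill-evaluation progress reporting and status tracking
- Extend the session-list API to include more detail
- Prevent operations on nonexistent sessions
- Streamline the business logic for submitting and evaluating skill drafts

perf(skills): Improve the concurrency of skill evaluation
- Evaluate skill cases in parallel for efficiency
- Add an environment variable that controls the maximum number of parallel cases
- Implement real-time evaluation progress updates and callbacks
- Optimize resource management and synchronization during evaluation

refactor(services): Create isolated agent-loop instances
- Add a factory method for creating standalone agent loops
- Ensure new loops inherit the runtime service configuration
- Support scenarios, such as skill evaluation, that require an isolated environment
```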
Beaver Plugin Skill Mirroring Design
Decision
Beaver V1 plugins are declarative skill bundles. Enabling a plugin mirrors each declared
SKILL.md and its supporting files into SkillSpecStore. From that point onward, the
mirrored skill is a normal Beaver skill:
- it has the same resolver priority as any workspace-managed skill;
- runtime activation, receipts, performance scoring, replay evaluation, review, publish, rollback, and disable all use the existing skill lifecycle;
- self-learning only writes Beaver-managed versions and never edits the plugin package;
- plugin origin remains metadata, not a separate runtime class.
Arbitrary in-process Python entrypoints, hooks, providers, and custom runtime code are out of scope for this plan. Tool-providing plugins should continue to use MCP until a separate executable-plugin security design is approved.
Why The Proposed Flow Is Correct
The proposed "mirror, learn on the mirror, merge on plugin update, then evaluate" flow is correct with one important refinement: plugin upgrades must be treated as a three-way merge, not a two-document rewrite.
The three inputs are:
- B, the last accepted upstream plugin snapshot;
- L, the current Beaver-published skill, including local self-learning;
- U, the newly discovered upstream plugin snapshot.
This distinction prevents a plugin update from silently deleting local learning and prevents local learning from silently discarding new upstream safety or workflow changes.
Package Contract
Each plugin directory contains beaver.plugin.json:
{
  "schema_version": 1,
  "id": "baoyu-comic",
  "name": "Baoyu Comic",
  "version": "1.2.0",
  "skills": [
    {
      "name": "baoyu-comic",
      "path": "skills/baoyu-comic"
    }
  ]
}
Rules:
- id and skill names use lowercase letters, digits, _, and -.
- Skill paths are relative to the plugin root and cannot escape it.
- Every skill directory must contain SKILL.md.
- Symlinks are rejected while mirroring.
- Two enabled plugins cannot own the same Beaver skill name.
- A plugin cannot overwrite an existing non-plugin workspace skill.
- Discovery does not enable a plugin. Enablement is an explicit admin action.
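The per-manifest rules above could be checked along the following lines. This is a minimal sketch, not Beaver's validator: the function name `validate_manifest`, the error strings, and the choice to reject any `schema_version` other than 1 are assumptions made for illustration. Cross-plugin ownership conflicts and symlink rejection need workspace state and filesystem access, so they are left out.

```python
import re
from pathlib import PurePosixPath

# Lowercase letters, digits, "_" and "-", as stated in the rules.
NAME_RE = re.compile(r"[a-z0-9_-]+")


def validate_manifest(manifest: dict) -> list[str]:
    """Return rule violations; an empty list means the manifest passed.

    Hypothetical helper: names and messages are illustrative only.
    """
    errors = []
    # Assumption: only schema version 1 (the one shown in the example) is accepted.
    if manifest.get("schema_version") != 1:
        errors.append("unsupported schema_version")
    if not NAME_RE.fullmatch(str(manifest.get("id", ""))):
        errors.append("invalid plugin id")
    for skill in manifest.get("skills", []):
        name = str(skill.get("name", ""))
        path = PurePosixPath(str(skill.get("path", "")))
        if not NAME_RE.fullmatch(name):
            errors.append(f"invalid skill name: {name!r}")
        # Paths must be relative, non-empty, and free of ".." components
        # that could leave the plugin root.
        if path.is_absolute() or not path.parts or ".." in path.parts:
            errors.append(f"skill path escapes plugin root: {str(path)!r}")
    return errors
```

An empty result means only that the manifest is well-formed. Per the last rule, enabling the plugin remains a separate, explicit admin action.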
Storage Model
Plugin packages remain outside the managed skill version tree:
workspace/
plugins/
baoyu-comic/
beaver.plugin.json
skills/baoyu-comic/SKILL.md
.beaver/
plugins/state.json
skills/
baoyu-comic/
skill.json
current.json
upstreams/
baoyu-comic/
<tree-hash>/
upstream.json
SKILL.md
assets/...
versions/
v0001/
version.json
SKILL.md
assets/...
upstreams/ stores immutable raw plugin snapshots. versions/ stores runtime-visible
Beaver versions. A merged Beaver version may differ from its upstream snapshot.
Every upstream snapshot has two hashes:
- skill_content_hash: canonical hash of normalized SKILL.md; used by the LLM merge and skill-content preservation checks.
- skill_tree_hash: hash of every regular file in the skill tree, including normalized relative path, byte length, bytes, and executable-bit metadata. Symlinks are rejected. This is the supply-chain identity used for update detection and state.
The tree hash includes SKILL.md, templates, assets, examples, and scripts. Full Unix
mode bits are not hashed because umask and extraction tools can change them; only whether
any executable bit is set is normalized into the hash. Beaver metadata files such as
version.json and upstream.json are excluded.
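One way to compute a tree hash with these properties is sketched below. The byte layout (length-prefixed path, one executable flag byte, length-prefixed content), the `sha256:` prefix, and the decision to exclude metadata files only at the skill root are assumptions, not the format Beaver uses.

```python
import hashlib
import os
import stat
from pathlib import Path

# Beaver metadata files named in the text; excluded here at the skill root only (assumption).
EXCLUDED = {"version.json", "upstream.json"}
ANY_EXEC = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def skill_tree_hash(root: Path) -> str:
    """Hash relative path, executable flag, length, and bytes of every file under root."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = Path(dirpath) / name
            if full.is_symlink():
                raise ValueError(f"symlink rejected: {full}")
        for name in filenames:
            full = Path(dirpath) / name
            rel = full.relative_to(root).as_posix()  # normalized relative path
            if rel not in EXCLUDED:
                entries.append((rel, full))
    digest = hashlib.sha256()
    for rel, full in sorted(entries):  # sorted so the result is order-independent
        data = full.read_bytes()
        rel_bytes = rel.encode("utf-8")
        # Only "is any executable bit set" is hashed, not the full mode.
        executable = bool(full.stat().st_mode & ANY_EXEC)
        digest.update(len(rel_bytes).to_bytes(8, "big"))
        digest.update(rel_bytes)
        digest.update(b"\x01" if executable else b"\x00")
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return "sha256:" + digest.hexdigest()
```

Length-prefixing each field keeps two different trees from producing the same byte stream by accident, for example when one file's name ends where another file's content begins.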
Every Beaver SkillVersion also stores a backward-compatible tree_hash. New versions
compute it from the complete promoted version directory. Older versions without the field
derive it on read, so L.tree is available for upgrade classification.
Plugin state records:
{
  "plugins": {
    "baoyu-comic": {
      "enabled": true,
      "installed_version": "1.2.0",
      "manifest_path": "plugins/baoyu-comic/beaver.plugin.json",
      "updates_paused": false,
      "skills": {
        "baoyu-comic": {
          "accepted_upstream_tree_hash": "sha256...",
          "observed_upstream_tree_hash": "sha256...",
          "accepted_beaver_version": "v0003",
          "current_beaver_version": "v0003",
          "pending_candidate_id": null,
          "status": "synced"
        }
      }
    }
  }
}
Skill versions and drafts also carry plugin provenance. State is operational metadata; version provenance is the durable audit record.
Initial Enable Flow
When an admin enables a valid plugin:
- Discover and validate the manifest.
- Copy each declared skill into an immutable upstream snapshot.
- Reject ownership/name conflicts before changing any skill.
- Run the existing deterministic skill safety checker against an in-memory initial-mirror draft and reject failed or critical results.
- Publish an exact Beaver mirror as the next skill version.
- Copy supporting files into that version.
- Mark the skill source_kind="plugin" and active.
- Record plugin ID, plugin version, source path, upstream hash, and mirror mode in SkillVersion.provenance.
- Update plugin state only after all declared skills succeed.
Initial enable is an explicit trust action, so it does not require LLM synthesis. Manifest validation, path validation, and the existing static skill safety checks still apply.
All files are first written below a transaction staging directory on the same filesystem.
Only after manifest validation, tree hashing, conflict checks, and safety checks pass are
immutable upstream/version directories promoted with os.replace(). current.json,
skill.json, and indexes are atomically replaced under the workspace write lock; plugin
state is written last. A failed transaction may leave an unreferenced immutable directory,
which cleanup can remove, but it cannot make a partial version runtime-visible.
For a new skill, the complete staged skill directory is promoted once. For an existing
skill, immutable directories and metadata are promoted first and current.json is
replaced last as the runtime visibility switch. This provides per-skill atomic visibility;
the workspace lock serializes writers across a multi-skill plugin operation.
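The stage-then-promote ordering for an existing skill can be sketched as follows. The helper name `promote_version` and the `{"version": ...}` shape of current.json are hypothetical. The sketch also assumes the caller already holds the workspace write lock and has finished validation, and it omits fsync calls and index updates.

```python
import json
import os
import tempfile
from pathlib import Path


def promote_version(skill_dir: Path, version: str, files: dict[str, bytes]) -> None:
    """Stage a version directory, promote it, then switch current.json last."""
    versions = skill_dir / "versions"
    versions.mkdir(parents=True, exist_ok=True)
    # Stage next to the destination so os.replace() stays on one filesystem.
    staging = Path(tempfile.mkdtemp(prefix=".staging-", dir=skill_dir))
    for rel, data in files.items():
        target = staging / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
    # Promote the immutable directory; nothing references it yet, so a crash
    # here leaves at most an unreferenced directory for cleanup.
    os.replace(staging, versions / version)
    # Replace the pointer atomically; this is the runtime visibility switch.
    tmp_pointer = skill_dir / "current.json.tmp"
    tmp_pointer.write_text(json.dumps({"version": version}), encoding="utf-8")
    os.replace(tmp_pointer, skill_dir / "current.json")
```

Because readers resolve the active version through current.json, they see either the old pointer or the new one, never a half-written version directory.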
Runtime Priority
Mirrored plugin skills are loaded exclusively from SkillSpecStore. They are not supplied
through SkillsLoader.extra_dirs.
This makes priority deterministic:
- active published workspace versions, including plugin-origin versions;
- builtin skills.
source_kind="plugin" is displayed for provenance but does not lower selection priority
or exclude the skill from self-learning.
Upgrade Classification
For each linked skill, compare upstream tree hashes:
| Condition | Action |
|---|---|
| U.tree == B.tree | No upstream change; no action. |
| L.tree == U.tree | Acknowledge the new upstream snapshot; no draft needed. |
| L.tree == B.tree and U.tree != B.tree | Create a deterministic fast_forward plugin update draft containing U. |
| L.tree != B.tree and U.tree != B.tree | Create a three_way plugin update candidate using B, L, and U. |
Even the fast_forward case goes through safety, replay evaluation, review, and publish.
It skips LLM merge synthesis because there is no local divergence.
Candidate IDs are deterministic:
plugin-update:<plugin-id>:<skill-name>:<new-upstream-hash-prefix>
This makes boot-time sync idempotent.
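The classification table and the candidate ID format can be expressed as two small pure functions. This sketch assumes the rows are evaluated top to bottom and that hashes are bare hex digests. The enum, the function names, and the 12-character prefix length are illustrative choices, not values taken from the design.

```python
from enum import Enum


class UpdateAction(Enum):
    NONE = "none"                  # no upstream change
    ACKNOWLEDGE = "acknowledge"    # local already equals the new upstream
    FAST_FORWARD = "fast_forward"  # no local divergence
    THREE_WAY = "three_way"        # both sides changed


def classify_upgrade(base: str, local: str, upstream: str) -> UpdateAction:
    """Map the B, L, and U tree hashes to an action, checking rows in table order."""
    if upstream == base:
        return UpdateAction.NONE
    if local == upstream:
        return UpdateAction.ACKNOWLEDGE
    if local == base:
        return UpdateAction.FAST_FORWARD
    return UpdateAction.THREE_WAY


def candidate_id(plugin_id: str, skill_name: str, new_upstream_hash: str,
                 prefix_len: int = 12) -> str:
    """Build the deterministic ID; prefix_len is an assumed value."""
    return f"plugin-update:{plugin_id}:{skill_name}:{new_upstream_hash[:prefix_len]}"
```

Since the ID depends only on the plugin, the skill, and the new upstream hash, a second sync that observes the same snapshot derives the same ID and can skip creating a duplicate candidate.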
Supporting files use a deterministic path-level three-way merge:
- local unchanged from B: take U;
- upstream unchanged from B: keep L;
- both sides equal: keep either;
- a file added on only one side: keep it;
- divergent edits, delete-versus-edit, or different additions at the same path: record an unresolved file conflict and block publication.
The LLM merges only SKILL.md. It does not attempt to merge arbitrary or binary
supporting files.
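The path-level rules for supporting files fit in a short function. In this sketch each tree is modeled as a mapping from relative path to file bytes, with a missing key meaning the file is absent. The name `merge_supporting_files` and the return shape are assumptions.

```python
FileMap = dict[str, bytes]  # relative path -> file bytes; a missing key means "absent"


def merge_supporting_files(base: FileMap, local: FileMap,
                           upstream: FileMap) -> tuple[FileMap, list[str]]:
    """Return (merged files, conflicting paths) for a path-level three-way merge."""
    merged: FileMap = {}
    conflicts: list[str] = []
    for path in sorted(set(base) | set(local) | set(upstream)):
        b, l, u = base.get(path), local.get(path), upstream.get(path)
        if l == u:        # both sides equal (including both deleted): keep either
            chosen = l
        elif l == b:      # local unchanged from B: take U
            chosen = u
        elif u == b:      # upstream unchanged from B: keep L
            chosen = l
        else:             # divergent edits, delete-versus-edit, differing additions
            conflicts.append(path)
            continue
        if chosen is not None:  # None means the winning side deleted the file
            merged[path] = chosen
    return merged, conflicts
```

A file added on only one side falls out of the same comparisons: the other side still equals the absent base, so the addition is kept. Any non-empty conflict list would block publication.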
Learning Integration
Add candidate kind plugin_skill_update. Its evidence contains only references:
{
  "plugin_id": "baoyu-comic",
  "plugin_version": "1.2.0",
  "skill_name": "baoyu-comic",
  "merge_mode": "three_way",
  "base_upstream_tree_hash": "old-hash",
  "new_upstream_tree_hash": "new-hash",
  "local_version": "v0003"
}
The learning service resolves the actual snapshots from SkillSpecStore; raw skill
content is not duplicated into learning-candidates.jsonl.
For three_way, the synthesizer receives:
- old upstream B;
- current local skill L;
- new upstream U;
- relevant historical run evidence for L, when available.
The synthesizer must return the merged skill plus explicit merge decisions:
{
  "frontmatter": {},
  "content": "...",
  "change_reason": "...",
  "preserved_local_sections": [],
  "adopted_upstream_sections": [],
  "resolved_conflicts": [],
  "dropped_sections": []
}
The generated draft uses proposal_kind="plugin_skill_update" and carries the complete
plugin merge provenance.
Evaluation And Publish Gates
The existing flow remains authoritative:
candidate
-> draft
-> static safety
-> replay eval
-> review
-> publish
-> rollback if needed
Replay eval compares:
- baseline arm: current local version L;
- candidate arm: merged draft M.
The preservation report is extended for plugin updates:
- local preservation: important instructions from L are not silently removed;
- upstream adoption: new important instructions from U are represented;
- safety/tool preservation: Safety and Required Tools changes require explicit handling;
- unresolved conflicts cause evaluation failure.
Publishing is blocked when:
- static safety fails;
- replay evaluation regresses;
- confidence is low under the existing gate;
- local or upstream preservation fails;
- merge decisions contain unresolved SKILL.md conflicts;
- the supporting-file merge plan contains unresolved path/content conflicts.
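The blocking conditions above amount to a list of independent checks. The sketch below collects them into one function that returns every reason publication is blocked. `GateReport` and its field names are hypothetical stand-ins for whatever the existing pipeline produces.

```python
from dataclasses import dataclass, field


@dataclass
class GateReport:
    """Hypothetical summary of one plugin-update draft's gate results."""
    static_safety_passed: bool = True
    replay_regressed: bool = False
    confidence_low: bool = False
    local_preservation_passed: bool = True
    upstream_preservation_passed: bool = True
    skill_md_conflicts: list[str] = field(default_factory=list)
    file_conflicts: list[str] = field(default_factory=list)


def publish_blockers(report: GateReport) -> list[str]:
    """Return every reason publication is blocked; an empty list means it may proceed."""
    checks = [
        (not report.static_safety_passed, "static safety failed"),
        (report.replay_regressed, "replay evaluation regressed"),
        (report.confidence_low, "confidence is low"),
        (not report.local_preservation_passed, "local preservation failed"),
        (not report.upstream_preservation_passed, "upstream preservation failed"),
        (bool(report.skill_md_conflicts), "unresolved SKILL.md conflicts"),
        (bool(report.file_conflicts), "unresolved supporting-file conflicts"),
    ]
    return [reason for blocked, reason in checks if blocked]
```

Returning all reasons rather than stopping at the first lets a reviewer see the full picture in one pass.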
On publish, the pipeline notifies PluginManager, which advances
accepted_upstream_tree_hash, clears the pending candidate, and records the new Beaver
version.
Observer delivery is not the source of truth. At the start of every sync, reconciliation
inspects the current published version provenance. If it contains a valid, newer
plugin_skill_update result and its upstream snapshot exists, plugin state is repaired:
- advance accepted_upstream_tree_hash;
- advance accepted_beaver_version;
- clear the matching pending candidate;
- set status to synced.
Reconciliation never moves accepted_beaver_version backwards after a runtime rollback.
An observer failure is audited but does not make an already-successful publish request
fail, which avoids client retries creating misleading duplicate operations.
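A per-skill reconciliation step might look like the sketch below. It makes several assumptions: the published version's provenance exposes `proposal_kind`, `new_upstream_tree_hash`, and `candidate_id` keys; "newer" means a higher version number than the accepted one; and version names follow the `v0003` pattern from the state example. None of these details are confirmed by the design.

```python
def version_number(version: str) -> int:
    """Parse names like 'v0003' (format assumed from the state example)."""
    return int(version.lstrip("v"))


def reconcile(entry: dict, published_version: str, provenance: dict,
              snapshot_hashes: set[str]) -> bool:
    """Repair one skill's state entry in place; return True if it changed."""
    if provenance.get("proposal_kind") != "plugin_skill_update":
        return False  # the published version did not come from a plugin update
    new_hash = provenance.get("new_upstream_tree_hash")
    if new_hash not in snapshot_hashes:
        return False  # the upstream snapshot must still exist
    accepted = entry["accepted_beaver_version"]
    if version_number(published_version) <= version_number(accepted):
        return False  # not newer, e.g. after a rollback: never move backwards
    entry["accepted_upstream_tree_hash"] = new_hash
    entry["accepted_beaver_version"] = published_version
    if entry.get("pending_candidate_id") == provenance.get("candidate_id"):
        entry["pending_candidate_id"] = None  # clear only the matching candidate
    entry["status"] = "synced"
    return True
```

Running this at the start of every sync makes the observer notification an optimization: if it was lost, the next sync reaches the same state from provenance alone.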
Concurrent And Failure Behavior
- All plugin sync, skill version allocation/publication, plugin state mutation, and learning-candidate mutation share a reentrant cross-process workspace write lock at .beaver/locks/plugin-skill-write.lock.
- The lock uses the repository's existing fcntl/msvcrt pattern plus an in-process reentrant guard. Nested store calls reuse the held lock instead of deadlocking.
- Candidate existence checks and JSONL writes happen inside the lock.
- Version-number allocation and version promotion happen inside the lock.
- Explicit enable/sync waits for the lock with a bounded timeout and returns a busy error on timeout.
- Engine boot never calls an LLM. Its auto-sync uses a non-blocking lock attempt; when the lock is busy, boot proceeds with the current published skills and reports sync deferred.
- Repeated and concurrent boot/sync is idempotent across processes, not only within one Python object.
- If another active draft targets the same skill, the plugin update remains pending and is not synthesized until the skill is free.
- If a newer plugin version appears while an older update is pending, the old candidate is marked superseded and a new candidate is created against the last accepted upstream.
- Rejecting a draft preserves the plugin package, upstream snapshots, current skill, and candidate audit history. Regeneration remains possible.
- Partial multi-skill plugin enable never promotes metadata/current pointers until every staged skill passes validation.
- Plugin files are never modified by learning or publication.
Pause, Disable, Missing, And Adopt
- Pausing updates suspends discovery-to-candidate sync while linked skills remain active.
- Resuming updates reconciles state and performs sync.
- Disabling a plugin is an explicit destructive runtime action: it pauses updates and disables linked skills, but never deletes versions or upstream snapshots. The API requires an explicit disable_linked_skills=true confirmation.
- Re-enabling restores linked skills and performs sync.
- A missing plugin package is a supply-chain status only. It marks the plugin missing, suspends sync/update, and leaves the current Beaver skills active.
- An explicit adopt action detaches a skill from its plugin, changes source_kind to managed, keeps the current version active, and prevents future plugin updates from targeting it.
Management API And UI
Backend endpoints:
GET /api/plugins
POST /api/plugins/sync
POST /api/plugins/{plugin_id}/enable
POST /api/plugins/{plugin_id}/pause
POST /api/plugins/{plugin_id}/resume
POST /api/plugins/{plugin_id}/disable
POST /api/plugins/{plugin_id}/skills/{skill_name}/adopt
API payloads never expose absolute server paths. Workspace manifests use workspace-relative
paths. External manifests use a redacted display path such as
<external>/baoyu-comic/beaver.plugin.json.
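The path rule for API payloads can be illustrated with a small helper. The function name `display_manifest_path` is hypothetical, and building the redacted form from the manifest's parent directory name is inferred from the example above rather than specified.

```python
from pathlib import Path


def display_manifest_path(manifest: Path, workspace: Path) -> str:
    """Return a workspace-relative path, or a redacted form for external manifests."""
    try:
        return manifest.resolve().relative_to(workspace.resolve()).as_posix()
    except ValueError:
        # Outside the workspace: expose only the plugin directory and file name.
        return f"<external>/{manifest.parent.name}/{manifest.name}"
```

Resolving both paths before comparing them keeps a manifest reached through `..` segments from being reported as workspace-relative when it actually lives elsewhere.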
The existing Skills page gains a Plugins tab showing:
- discovered/enabled/missing/error state;
- installed and discovered plugin versions;
- declared skills and their current Beaver versions;
- sync state and pending learning candidate;
- enable, pause, resume, disable, sync, and adopt actions.
Plugin-origin skills continue to appear in the normal Published, Candidates, and Drafts tabs with provenance and merge-mode labels.
Non-Goals
- Importing arbitrary plugin Python modules into the Beaver process.
- Plugin-defined hooks, providers, channels, or frontend bundles.
- Automatic downloading from a plugin marketplace.
- Automatically publishing plugin upgrades without review.
- Editing or rebasing plugin source files in place.
Acceptance Criteria
- Enabling a plugin mirrors all declared skills and supporting files into managed versions.
- Mirrored skills have the same resolver priority and learning eligibility as ordinary workspace skills.
- Self-learning never modifies the plugin package.
- Plugin updates create idempotent plugin_skill_update candidates.
- Local divergence triggers a three-way merge; no divergence triggers a deterministic fast-forward draft.
- Every plugin update passes the existing safety, replay, review, and publish gates.
- Publishing advances plugin state and preserves complete provenance.
- Pause, disable, missing package, rejection, restart, and newer-update races do not lose the current skill or its learned versions; missing packages leave current skills active.
- Existing non-plugin skills and legacy candidate payloads remain backward compatible.
- Supporting-file-only updates change the upstream tree hash and create an update candidate.
- Concurrent boot, sync, and enable operations do not allocate duplicate versions or append duplicate candidates.
- Sync reconciliation repairs plugin state after a published version succeeds but its observer/state update fails.