# Beaver Plugin Skill Mirroring Design

## Decision

Beaver V1 plugins are declarative skill bundles. Enabling a plugin mirrors each declared
`SKILL.md` and its supporting files into `SkillSpecStore`. From that point onward, the
mirrored skill is a normal Beaver skill:

- it has the same resolver priority as any workspace-managed skill;
- runtime activation, receipts, performance scoring, replay evaluation, review, publish,
  rollback, and disable all use the existing skill lifecycle;
- self-learning only writes Beaver-managed versions and never edits the plugin package;
- plugin origin remains metadata, not a separate runtime class.

Arbitrary in-process Python entrypoints, hooks, providers, and custom runtime code are
out of scope for this plan. Tool-providing plugins should continue to use MCP until a
separate executable-plugin security design is approved.

## Why The Proposed Flow Is Correct

The proposed "mirror, learn on the mirror, merge on plugin update, then evaluate" flow is
correct with one important refinement: plugin upgrades must be treated as a three-way
merge, not a two-document rewrite.

The three inputs are:

1. `B`, the last accepted upstream plugin snapshot;
2. `L`, the current Beaver-published skill, including local self-learning;
3. `U`, the newly discovered upstream plugin snapshot.

This distinction prevents a plugin update from silently deleting local learning and
prevents local learning from silently discarding new upstream safety or workflow changes.

## Package Contract

Each plugin directory contains `beaver.plugin.json`:

```json
{
  "schema_version": 1,
  "id": "baoyu-comic",
  "name": "Baoyu Comic",
  "version": "1.2.0",
  "skills": [
    {
      "name": "baoyu-comic",
      "path": "skills/baoyu-comic"
    }
  ]
}
```

Rules:

- `id` and skill names use lowercase letters, digits, `_`, and `-`.
- Skill paths are relative to the plugin root and cannot escape it.
- Every skill directory must contain `SKILL.md`.
- Symlinks are rejected while mirroring.
- Two enabled plugins cannot own the same Beaver skill name.
- A plugin cannot overwrite an existing non-plugin workspace skill.
- Discovery does not enable a plugin. Enablement is an explicit admin action.

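
The per-package rules above (names, path containment, `SKILL.md` presence, symlink rejection) could be checked roughly as follows. This is a minimal sketch, not the Beaver implementation: `validate_plugin`, `ManifestError`, and `NAME_RE` are hypothetical names, and the cross-plugin ownership rules are left out because they need workspace state.

```python
import json
import re
from pathlib import Path, PurePosixPath

# Lowercase letters, digits, "_" and "-", per the rules above.
NAME_RE = re.compile(r"[a-z0-9_-]+")


class ManifestError(ValueError):
    """Hypothetical error type for a package that violates the contract."""


def validate_plugin(plugin_root: Path) -> dict:
    """Validate beaver.plugin.json and each declared skill directory."""
    manifest = json.loads(
        (plugin_root / "beaver.plugin.json").read_text(encoding="utf-8")
    )
    if not NAME_RE.fullmatch(str(manifest.get("id", ""))):
        raise ManifestError("invalid plugin id")

    root = plugin_root.resolve()
    for skill in manifest.get("skills", []):
        name = str(skill.get("name", ""))
        if not NAME_RE.fullmatch(name):
            raise ManifestError(f"invalid skill name: {name!r}")

        # Paths are relative to the plugin root and may not escape it.
        rel = PurePosixPath(str(skill.get("path", "")))
        if rel.is_absolute() or ".." in rel.parts:
            raise ManifestError(f"skill path escapes the plugin root: {rel}")
        skill_dir = plugin_root.joinpath(*rel.parts)

        # Symlinks are rejected: the directory itself and anything below it.
        if skill_dir.is_symlink() or any(
            entry.is_symlink() for entry in skill_dir.rglob("*")
        ):
            raise ManifestError(f"symlink found in skill {name!r}")
        if not skill_dir.resolve().is_relative_to(root):
            raise ManifestError(f"skill path escapes the plugin root: {rel}")
        if not (skill_dir / "SKILL.md").is_file():
            raise ManifestError(f"skill {name!r} has no SKILL.md")
    return manifest
```

Validation returns the parsed manifest so a caller can continue with snapshotting; it does not enable anything, which matches the rule that discovery and enablement are separate steps.
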
## Storage Model

Plugin packages remain outside the managed skill version tree:

```text
workspace/
  plugins/
    baoyu-comic/
      beaver.plugin.json
      skills/baoyu-comic/SKILL.md
  .beaver/
    plugins/state.json
    skills/
      baoyu-comic/
        skill.json
        current.json
        upstreams/
          baoyu-comic/
            <tree-hash>/
              upstream.json
              SKILL.md
              assets/...
        versions/
          v0001/
            version.json
            SKILL.md
            assets/...
```

`upstreams/` stores immutable raw plugin snapshots. `versions/` stores runtime-visible
Beaver versions. A merged Beaver version may differ from its upstream snapshot.

Every upstream snapshot has two hashes:

- `skill_content_hash`: canonical hash of normalized `SKILL.md`; used by the LLM merge and
  skill-content preservation checks.
- `skill_tree_hash`: hash of every regular file in the skill tree, including normalized
  relative path, byte length, bytes, and executable-bit metadata. Symlinks are rejected.
  This is the supply-chain identity used for update detection and state.

The tree hash includes `SKILL.md`, templates, assets, examples, and scripts. Full Unix
mode bits are not hashed because umask and extraction tools can change them; only whether
any executable bit is set is normalized into the hash. Beaver metadata files such as
`version.json` and `upstream.json` are excluded.

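
A tree hash with these properties could be computed along the following lines. This is a sketch under assumptions: the function name, the SHA-256 choice, the NUL-separated record layout, and the decision to exclude metadata files only at the top level of the skill directory are illustrative and not taken from the Beaver code.

```python
import hashlib
import os
import stat
from pathlib import Path

# Beaver metadata files excluded from the hash (assumed to sit at the top level).
EXCLUDED = {"version.json", "upstream.json"}
ANY_EXEC = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def skill_tree_hash(skill_dir: Path) -> str:
    """Hash path, length, executable flag, and bytes of every regular file."""
    files = []
    for dirpath, dirnames, filenames in os.walk(skill_dir, followlinks=False):
        for name in dirnames + filenames:
            entry = Path(dirpath) / name
            if entry.is_symlink():
                raise ValueError(f"symlink rejected: {entry}")
        files.extend(Path(dirpath) / name for name in filenames)

    digest = hashlib.sha256()
    # Sort by normalized relative path so the result is independent of walk order.
    for path in sorted(files, key=lambda p: p.relative_to(skill_dir).as_posix()):
        rel = path.relative_to(skill_dir).as_posix()
        if rel in EXCLUDED or not path.is_file():
            continue
        data = path.read_bytes()
        # Only "is any executable bit set" is hashed, not the full mode.
        executable = bool(path.stat().st_mode & ANY_EXEC)
        header = f"{rel}\0{len(data)}\0{int(executable)}\0"
        digest.update(header.encode("utf-8"))
        digest.update(data)
    return digest.hexdigest()
```

Including the byte length in each record keeps file boundaries unambiguous, so moving bytes from the end of one file to the start of the next changes the hash.
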
Every Beaver `SkillVersion` also stores a backward-compatible `tree_hash`. New versions
compute it from the complete promoted version directory. Older versions without the field
derive it on read, so `L.tree` is available for upgrade classification.

Plugin state records:

```json
{
  "plugins": {
    "baoyu-comic": {
      "enabled": true,
      "installed_version": "1.2.0",
      "manifest_path": "plugins/baoyu-comic/beaver.plugin.json",
      "updates_paused": false,
      "skills": {
        "baoyu-comic": {
          "accepted_upstream_tree_hash": "sha256...",
          "observed_upstream_tree_hash": "sha256...",
          "accepted_beaver_version": "v0003",
          "current_beaver_version": "v0003",
          "pending_candidate_id": null,
          "status": "synced"
        }
      }
    }
  }
}
```

Skill versions and drafts also carry plugin provenance. State is operational metadata;
version provenance is the durable audit record.

## Initial Enable Flow

When an admin enables a valid plugin:

1. Discover and validate the manifest.
2. Copy each declared skill into an immutable upstream snapshot.
3. Reject ownership/name conflicts before changing any skill.
4. Run the existing deterministic skill safety checker against an in-memory initial-mirror
   draft and reject failed or critical results.
5. Publish an exact Beaver mirror as the next skill version.
6. Copy supporting files into that version.
7. Mark the skill `source_kind="plugin"` and active.
8. Record plugin ID, plugin version, source path, upstream hash, and mirror mode in
   `SkillVersion.provenance`.
9. Update plugin state only after all declared skills succeed.

Initial enable is an explicit trust action, so it does not require LLM synthesis. Manifest
validation, path validation, and the existing static skill safety checks still apply.

All files are first written below a transaction staging directory on the same filesystem.
Only after manifest validation, tree hashing, conflict checks, and safety checks pass are
immutable upstream/version directories promoted with `os.replace()`. `current.json`,
`skill.json`, and indexes are atomically replaced under the workspace write lock; plugin
state is written last. A failed transaction may leave an unreferenced immutable directory,
which cleanup can remove, but it cannot make a partial version runtime-visible.

For a new skill, the complete staged skill directory is promoted once. For an existing
skill, immutable directories and metadata are promoted first and `current.json` is
replaced last as the runtime visibility switch. This provides per-skill atomic visibility;
the workspace lock serializes writers across a multi-skill plugin operation.

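
The stage-then-promote ordering for an existing skill might look like the sketch below. It assumes a hypothetical `promote_version` helper, a `.staging` directory name, and a `{"version": ...}` shape for `current.json`; none of these come from the source. The workspace write lock and the safety checks are omitted.

```python
import json
import os
import tempfile
from pathlib import Path


def promote_version(skill_root: Path, version: str, files: dict[str, bytes]) -> None:
    """Stage a version, promote it atomically, then switch current.json last."""
    # 1. Write everything below a staging directory on the same filesystem.
    staging_root = skill_root / ".staging"
    staging_root.mkdir(parents=True, exist_ok=True)
    staged = Path(tempfile.mkdtemp(dir=staging_root))
    for rel, data in files.items():
        target = staged / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)

    # 2. Promote the immutable version directory with a single rename.
    final_dir = skill_root / "versions" / version
    final_dir.parent.mkdir(parents=True, exist_ok=True)
    os.replace(staged, final_dir)

    # 3. Replace the pointer last; this is the runtime visibility switch.
    pointer_tmp = skill_root / "current.json.tmp"
    pointer_tmp.write_text(json.dumps({"version": version}), encoding="utf-8")
    os.replace(pointer_tmp, skill_root / "current.json")
```

If the process dies between steps 2 and 3, `versions/<version>/` exists but nothing points at it, which is the "unreferenced immutable directory" case that cleanup can remove.
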
## Runtime Priority

Mirrored plugin skills are loaded exclusively from `SkillSpecStore`. They are not supplied
through `SkillsLoader.extra_dirs`.

This makes priority deterministic:

1. active published workspace versions, including plugin-origin versions;
2. builtin skills.

`source_kind="plugin"` is displayed for provenance but does not lower selection priority
or exclude the skill from self-learning.

## Upgrade Classification

For each linked skill, compare tree hashes. Rows are evaluated top to bottom and the
first match wins:

| Condition | Action |
| --- | --- |
| `U.tree == B.tree` | No upstream change; no action. |
| `L.tree == U.tree` | Acknowledge the new upstream snapshot; no draft needed. |
| `L.tree == B.tree` and `U.tree != B.tree` | Create a deterministic `fast_forward` plugin update draft containing `U`. |
| `L.tree != B.tree` and `U.tree != B.tree` | Create a `three_way` plugin update candidate using `B`, `L`, and `U`. |

Even the `fast_forward` case goes through safety, replay evaluation, review, and publish.
It skips LLM merge synthesis because there is no local divergence.

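
The classification table can be expressed as a small pure function. This is an illustrative sketch: the enum and function names are hypothetical, and it assumes the rows are checked in the order listed.

```python
from enum import Enum


class UpdateAction(str, Enum):
    NONE = "none"
    ACKNOWLEDGE = "acknowledge"
    FAST_FORWARD = "fast_forward"
    THREE_WAY = "three_way"


def classify_update(base_tree: str, local_tree: str, upstream_tree: str) -> UpdateAction:
    """Map the (B, L, U) tree hashes to an action, first matching row wins."""
    if upstream_tree == base_tree:
        return UpdateAction.NONE          # no upstream change
    if local_tree == upstream_tree:
        return UpdateAction.ACKNOWLEDGE   # local already equals the new upstream
    if local_tree == base_tree:
        return UpdateAction.FAST_FORWARD  # no local divergence
    return UpdateAction.THREE_WAY         # both sides moved away from B
```

For example, `classify_update("b", "b", "u")` yields `UpdateAction.FAST_FORWARD`, while `classify_update("b", "l", "u")` yields `UpdateAction.THREE_WAY`.
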
Candidate IDs are deterministic:

```text
plugin-update:<plugin-id>:<skill-name>:<new-upstream-hash-prefix>
```

This makes boot-time sync idempotent: a repeated sync of the same upstream snapshot
derives the same ID and finds the existing candidate instead of appending a new one.

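
A sketch of how a deterministic ID makes candidate creation idempotent is shown below. The prefix length of 12 characters, the function names, and the in-memory `dict` standing in for the candidate store are all assumptions made for illustration.

```python
def plugin_update_candidate_id(
    plugin_id: str, skill_name: str, upstream_tree_hash: str, prefix_len: int = 12
) -> str:
    """Build the candidate ID from stable inputs only (no timestamps or counters)."""
    return f"plugin-update:{plugin_id}:{skill_name}:{upstream_tree_hash[:prefix_len]}"


def ensure_candidate(candidates: dict, plugin_id: str, skill_name: str, tree_hash: str) -> bool:
    """Insert the candidate if it is absent; return True only when it was created."""
    candidate_id = plugin_update_candidate_id(plugin_id, skill_name, tree_hash)
    if candidate_id in candidates:
        return False
    candidates[candidate_id] = {"plugin_id": plugin_id, "skill_name": skill_name}
    return True
```

Calling `ensure_candidate` twice with the same arguments creates one entry and returns `False` the second time, which is the behavior a repeated boot-time sync relies on.
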
Supporting files use a deterministic path-level three-way merge:

- local unchanged from `B`: take `U`;
- upstream unchanged from `B`: keep `L`;
- both sides equal: keep either;
- a file added on only one side: keep it;
- divergent edits, delete-versus-edit, or different additions at the same path: record an
  unresolved file conflict and block publication.

The LLM merges only `SKILL.md`. It does not attempt to merge arbitrary or binary
supporting files.

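
The path-level rules above reduce to comparing per-path content hashes across `B`, `L`, and `U`. The following sketch assumes each side is a mapping from relative path to content hash with `SKILL.md` already excluded by the caller; the function name and return shape are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def merge_supporting_files(
    base: Dict[str, str], local: Dict[str, str], upstream: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Return (merged path -> hash, conflicting paths). A missing path means 'absent'."""
    merged: Dict[str, str] = {}
    conflicts: List[str] = []
    for path in sorted(set(base) | set(local) | set(upstream)):
        b: Optional[str] = base.get(path)
        l: Optional[str] = local.get(path)
        u: Optional[str] = upstream.get(path)
        if l == u:
            choice = l   # both sides equal (including both deleted)
        elif l == b:
            choice = u   # local unchanged from B: take U
        elif u == b:
            choice = l   # upstream unchanged from B: keep L
        else:
            conflicts.append(path)  # divergent edit, delete-vs-edit, or clashing add
            continue
        if choice is not None:
            merged[path] = choice
    return merged, conflicts
```

A file added on only one side falls out of the same comparisons: the other side is absent in both `B` and itself, so it counts as "unchanged" and the addition is kept. A non-empty conflict list is what blocks publication.
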
## Learning Integration

Add candidate kind `plugin_skill_update`. Its evidence contains only references:

```json
{
  "plugin_id": "baoyu-comic",
  "plugin_version": "1.2.0",
  "skill_name": "baoyu-comic",
  "merge_mode": "three_way",
  "base_upstream_tree_hash": "old-hash",
  "new_upstream_tree_hash": "new-hash",
  "local_version": "v0003"
}
```

The learning service resolves the actual snapshots from `SkillSpecStore`; raw skill
content is not duplicated into `learning-candidates.jsonl`.

For `three_way`, the synthesizer receives:

- old upstream `B`;
- current local skill `L`;
- new upstream `U`;
- relevant historical run evidence for `L`, when available.

The synthesizer must return the merged skill plus explicit merge decisions:

```json
{
  "frontmatter": {},
  "content": "...",
  "change_reason": "...",
  "preserved_local_sections": [],
  "adopted_upstream_sections": [],
  "resolved_conflicts": [],
  "dropped_sections": []
}
```

The generated draft uses `proposal_kind="plugin_skill_update"` and carries the complete
plugin merge provenance.

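
Because the synthesizer output comes from an LLM, a structural check before draft creation is a natural guard. The sketch below validates only the field names and JSON types shown above; the function name and the extra "content must not be blank" rule are assumptions, not documented behavior.

```python
REQUIRED_FIELDS = {
    "frontmatter": dict,
    "content": str,
    "change_reason": str,
    "preserved_local_sections": list,
    "adopted_upstream_sections": list,
    "resolved_conflicts": list,
    "dropped_sections": list,
}


def validate_merge_result(result: dict) -> list:
    """Return a list of human-readable problems; an empty list means well-formed."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in result:
            problems.append(f"missing field: {field}")
        elif not isinstance(result[field], expected):
            problems.append(f"{field} must be a {expected.__name__}")
    # Assumption: an empty merged skill body is never acceptable.
    if isinstance(result.get("content"), str) and not result["content"].strip():
        problems.append("content is empty")
    return problems
```

Returning a list instead of raising lets the caller attach every problem to the candidate's audit record in one pass.
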
## Evaluation And Publish Gates

The existing flow remains authoritative:

```text
candidate
  -> draft
  -> static safety
  -> replay eval
  -> review
  -> publish
  -> rollback if needed
```

Replay eval compares:

- baseline arm: current local version `L`;
- candidate arm: merged draft `M`.

The preservation report is extended for plugin updates:

- local preservation: important instructions from `L` are not silently removed;
- upstream adoption: new important instructions from `U` are represented;
- safety/tool preservation: Safety and Required Tools changes require explicit handling;
- unresolved conflicts cause evaluation failure.

Publishing is blocked when:

- static safety fails;
- replay evaluation regresses;
- confidence is low under the existing gate;
- local or upstream preservation fails;
- merge decisions contain unresolved `SKILL.md` conflicts;
- the supporting-file merge plan contains unresolved path/content conflicts.

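
The blocking conditions can be collected into a single gate that reports every reason at once. In this sketch the `GateInputs` fields and the reason strings are hypothetical names chosen to mirror the list above; how each input is computed is outside its scope.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GateInputs:
    static_safety_passed: bool
    replay_regressed: bool
    confidence_low: bool
    local_preserved: bool
    upstream_preserved: bool
    unresolved_skill_conflicts: int
    unresolved_file_conflicts: int


def publish_blockers(inputs: GateInputs) -> List[str]:
    """Return every reason publishing is blocked; an empty list means it may proceed."""
    checks = [
        (not inputs.static_safety_passed, "static safety failed"),
        (inputs.replay_regressed, "replay evaluation regressed"),
        (inputs.confidence_low, "confidence is low"),
        (not inputs.local_preserved, "local preservation failed"),
        (not inputs.upstream_preserved, "upstream preservation failed"),
        (inputs.unresolved_skill_conflicts > 0, "unresolved SKILL.md conflicts"),
        (inputs.unresolved_file_conflicts > 0, "unresolved supporting-file conflicts"),
    ]
    return [reason for blocked, reason in checks if blocked]
```

Reporting all blockers, not only the first, gives the reviewer the complete picture for a rejected plugin update.
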
On publish, the pipeline notifies `PluginManager`, which advances
`accepted_upstream_tree_hash`, clears the pending candidate, and records the new Beaver
version.

Observer delivery is not the source of truth. At the start of every sync, reconciliation
inspects the current published version provenance. If it contains a valid, newer
`plugin_skill_update` result and its upstream snapshot exists, plugin state is repaired:

- advance `accepted_upstream_tree_hash`;
- advance `accepted_beaver_version`;
- clear the matching pending candidate;
- set status to `synced`.

Reconciliation never moves `accepted_beaver_version` backwards after a runtime rollback.
An observer failure is audited but does not make an already-successful publish request
fail, which avoids client retries creating misleading duplicate operations.

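
One way to picture the repair step is as a pure function from (state, published provenance) to repaired state. The sketch below is an assumption-heavy illustration: the provenance keys `proposal_kind`, `new_upstream_tree_hash`, `beaver_version`, and `candidate_id` are hypothetical, and "newer" is interpreted as a higher numeric suffix in names such as `v0004`.

```python
def version_number(version: str) -> int:
    """Parse 'v0004' into 4 (assumed version naming)."""
    return int(version.lstrip("v"))


def reconcile_skill_state(state: dict, provenance: dict, known_snapshots: set) -> dict:
    """Return a repaired copy of the per-skill plugin state, never moving backwards."""
    repaired = dict(state)
    if provenance.get("proposal_kind") != "plugin_skill_update":
        return repaired
    upstream = provenance.get("new_upstream_tree_hash")
    if upstream not in known_snapshots:
        return repaired  # the referenced upstream snapshot must exist
    published = provenance.get("beaver_version", "v0")
    if version_number(published) <= version_number(state["accepted_beaver_version"]):
        return repaired  # not newer, e.g. after a runtime rollback
    repaired["accepted_upstream_tree_hash"] = upstream
    repaired["accepted_beaver_version"] = published
    repaired["status"] = "synced"
    if state.get("pending_candidate_id") == provenance.get("candidate_id"):
        repaired["pending_candidate_id"] = None
    return repaired
```

Because the function only reads durable provenance, running it at the start of every sync gives the same result whether or not the publish-time observer call succeeded.
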
## Concurrent And Failure Behavior

- All plugin sync, skill version allocation/publication, plugin state mutation, and
  learning-candidate mutation share a reentrant cross-process workspace write lock at
  `.beaver/locks/plugin-skill-write.lock`.
- The lock uses the repository's existing `fcntl`/`msvcrt` pattern plus an in-process
  reentrant guard. Nested store calls reuse the held lock instead of deadlocking.
- Candidate existence checks and JSONL writes happen inside the lock.
- Version-number allocation and version promotion happen inside the lock.
- Explicit enable/sync waits for the lock with a bounded timeout and returns a busy error
  on timeout.
- Engine boot never calls an LLM. Its auto-sync uses a non-blocking lock attempt; when the
  lock is busy, boot proceeds with the current published skills and reports sync deferred.
- Repeated and concurrent boot/sync is idempotent across processes, not only within one
  Python object.
- If another active draft targets the same skill, the plugin update remains pending and
  is not synthesized until the skill is free.
- If a newer plugin version appears while an older update is pending, the old candidate is
  marked superseded and a new candidate is created against the last accepted upstream.
- Rejecting a draft preserves the plugin package, upstream snapshots, current skill, and
  candidate audit history. Regeneration remains possible.
- A multi-skill plugin enable never promotes metadata/current pointers until every
  staged skill passes validation, so a partially enabled plugin is never visible.
- Plugin files are never modified by learning or publication.

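
The non-blocking boot-time attempt with a reentrant in-process guard could be sketched as follows. This is not the repository's existing lock helper: the class name is hypothetical, and the choice of `fcntl.flock` on POSIX and `msvcrt.locking` on Windows is an assumption about what the "`fcntl`/`msvcrt` pattern" looks like. The bounded-timeout variant used by explicit enable/sync is omitted.

```python
import os
import threading

try:
    import fcntl  # POSIX
except ImportError:  # Windows
    fcntl = None
    import msvcrt


class WorkspaceWriteLock:
    """Cross-process file lock with a reentrant guard for the owning thread."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._guard = threading.RLock()
        self._depth = 0
        self._fd = None

    def try_acquire(self) -> bool:
        """Non-blocking attempt; nested calls from the holder reuse the lock."""
        if not self._guard.acquire(blocking=False):
            return False
        if self._depth == 0:
            os.makedirs(os.path.dirname(self._path), exist_ok=True)
            fd = os.open(self._path, os.O_RDWR | os.O_CREAT, 0o644)
            try:
                if fcntl is not None:
                    fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                else:
                    msvcrt.locking(fd, msvcrt.LK_NBLCK, 1)
            except OSError:
                os.close(fd)
                self._guard.release()
                return False  # busy: the caller defers sync
            self._fd = fd
        self._depth += 1
        return True

    def release(self) -> None:
        self._depth -= 1
        if self._depth == 0:
            if fcntl is not None:
                fcntl.flock(self._fd, fcntl.LOCK_UN)
            else:
                msvcrt.locking(self._fd, msvcrt.LK_UNLCK, 1)
            os.close(self._fd)
            self._fd = None
        self._guard.release()
```

At boot, a `False` result from `try_acquire()` would translate into "proceed with the current published skills and report sync deferred", matching the behavior described in the list above.
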
## Pause, Disable, Missing, And Adopt

- Pausing updates suspends discovery-to-candidate sync while linked skills remain active.
- Resuming updates reconciles state and performs sync.
- Disabling a plugin is an explicit destructive runtime action: it pauses updates and
  disables linked skills, but never deletes versions or upstream snapshots. The API
  requires an explicit `disable_linked_skills=true` confirmation.
- Re-enabling restores linked skills and performs sync.
- A missing plugin package is a supply-chain status only. It marks the plugin `missing`,
  suspends sync/update, and leaves the current Beaver skills active.
- An explicit `adopt` action detaches a skill from its plugin, changes
  `source_kind` to `managed`, keeps the current version active, and prevents future plugin
  updates from targeting it.

## Management API And UI

Backend endpoints:

```text
GET  /api/plugins
POST /api/plugins/sync
POST /api/plugins/{plugin_id}/enable
POST /api/plugins/{plugin_id}/pause
POST /api/plugins/{plugin_id}/resume
POST /api/plugins/{plugin_id}/disable
POST /api/plugins/{plugin_id}/skills/{skill_name}/adopt
```

API payloads never expose absolute server paths. Workspace manifests use workspace-relative
paths. External manifests use a redacted display path such as
`<external>/baoyu-comic/beaver.plugin.json`.

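
The path-redaction rule might be implemented along these lines. The helper name is hypothetical, and the choice to keep only the manifest's parent directory name for external packages is inferred from the single example above.

```python
from pathlib import Path


def display_manifest_path(manifest: Path, workspace: Path) -> str:
    """Return a workspace-relative path, or a redacted one for external manifests."""
    manifest = manifest.resolve()
    workspace = workspace.resolve()
    try:
        return manifest.relative_to(workspace).as_posix()
    except ValueError:
        # Outside the workspace: expose only the package directory and file name.
        return f"<external>/{manifest.parent.name}/{manifest.name}"
```

Resolving both paths first means a manifest reached through a `..` segment is still classified by where it really lives.
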
The existing Skills page gains a Plugins tab showing:

- discovered/enabled/missing/error state;
- installed and discovered plugin versions;
- declared skills and their current Beaver versions;
- sync state and pending learning candidate;
- enable, pause, resume, disable, sync, and adopt actions.

Plugin-origin skills continue to appear in the normal Published, Candidates, and Drafts
tabs with provenance and merge-mode labels.

## Non-Goals

- Importing arbitrary plugin Python modules into the Beaver process.
- Plugin-defined hooks, providers, channels, or frontend bundles.
- Automatic downloading from a plugin marketplace.
- Automatically publishing plugin upgrades without review.
- Editing or rebasing plugin source files in place.

## Acceptance Criteria

1. Enabling a plugin mirrors all declared skills and supporting files into managed
   versions.
2. Mirrored skills have the same resolver priority and learning eligibility as ordinary
   workspace skills.
3. Self-learning never modifies the plugin package.
4. Plugin updates create idempotent `plugin_skill_update` candidates.
5. Local divergence triggers a three-way merge; no divergence triggers a deterministic
   fast-forward draft.
6. Every plugin update passes the existing safety, replay, review, and publish gates.
7. Publishing advances plugin state and preserves complete provenance.
8. Pause, disable, missing package, rejection, restart, and newer-update races do not lose
   the current skill or its learned versions; missing packages leave current skills active.
9. Existing non-plugin skills and legacy candidate payloads remain backward compatible.
10. Supporting-file-only updates change the upstream tree hash and create an update
    candidate.
11. Concurrent boot, sync, and enable operations do not allocate duplicate versions or
    append duplicate candidates.
12. Sync reconciliation repairs plugin state after a published version succeeds but its
    observer/state update fails.