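feat(engine): Refine assistant message handling in the agent loop

- Add the assistant message to the context only when there are no tool calls
- Ensure tool call responses are correctly added to the message context
- Fix the conditional logic for message construction

fix(cron): Improve time parsing for scheduled task dispatch

- Add a regular expression import for parsing time display text
- Extract millisecond intervals from display text
- Harden integer conversion to avoid type errors
- Improve parsing of scheduled task configuration

feat(outlook): Improve Outlook integration features and stability

- Increase the default timeout from 10 seconds to 180 seconds
- Add an optional validation parameter to the status check function
- Fetch mail overviews serially instead of in parallel
- Improve connection status validation logic

feat(channel): Add an option to use the device name as the session identifier

- Add a new configuration option to the terminal WebSocket adapter
- Generate the session peer ID from the device name
- Record the original peer ID and device name as metadata
- Support creating session peer IDs from device names

feat(skills): Complete the skill learning evaluation system and progress tracking

- Automatically schedule skill drafts pending evaluation at application startup
- Create a dedicated loop factory for skill evaluation work
- Implement cancellation and cleanup for asynchronous skill evaluation tasks
- Add skill evaluation progress reporting and status tracking
- Extend the session list API to include more detail
- Prevent operations on nonexistent sessions
- Improve the business logic for skill draft submission and evaluation

perf(skills): Improve the concurrency of skill evaluation

- Evaluate skill cases in parallel for better throughput
- Add an environment variable to control the maximum number of parallel cases
- Implement real-time evaluation progress updates and callbacks
- Improve resource management and synchronization during evaluation

refactor(services): Create isolated agent loop instances

- Add a factory method for creating independent agent loops
- Ensure new loops inherit the runtime service configuration
- Support scenarios that need an isolated environment, such as skill evaluation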
```
2026-06-15 14:48:16 +08:00
parent 8aeb97a5fc
commit 4b0bf65ace
53 changed files with 4328 additions and 292 deletions

@ -0,0 +1,409 @@
# Beaver Plugin Skill Mirroring Design
## Decision
Beaver V1 plugins are declarative skill bundles. Enabling a plugin mirrors each declared
`SKILL.md` and its supporting files into `SkillSpecStore`. From that point onward, the
mirrored skill is a normal Beaver skill:
- it has the same resolver priority as any workspace-managed skill;
- runtime activation, receipts, performance scoring, replay evaluation, review, publish,
rollback, and disable all use the existing skill lifecycle;
- self-learning only writes Beaver-managed versions and never edits the plugin package;
- plugin origin remains metadata, not a separate runtime class.
Arbitrary in-process Python entrypoints, hooks, providers, and custom runtime code are
out of scope for this plan. Tool-providing plugins should continue to use MCP until a
separate executable-plugin security design is approved.
## Why The Proposed Flow Is Correct
The proposed "mirror, learn on the mirror, merge on plugin update, then evaluate" flow is
correct with one important refinement: plugin upgrades must be treated as a three-way
merge, not a two-document rewrite.
The three inputs are:
1. `B`, the last accepted upstream plugin snapshot;
2. `L`, the current Beaver-published skill, including local self-learning;
3. `U`, the newly discovered upstream plugin snapshot.
This distinction prevents a plugin update from silently deleting local learning and
prevents local learning from silently discarding new upstream safety or workflow changes.
## Package Contract
Each plugin directory contains `beaver.plugin.json`:
```json
{
"schema_version": 1,
"id": "baoyu-comic",
"name": "Baoyu Comic",
"version": "1.2.0",
"skills": [
{
"name": "baoyu-comic",
"path": "skills/baoyu-comic"
}
]
}
```
Rules:
- `id` and skill names use lowercase letters, digits, `_`, and `-`.
- Skill paths are relative to the plugin root and cannot escape it.
- Every skill directory must contain `SKILL.md`.
- Symlinks are rejected while mirroring.
- Two enabled plugins cannot own the same Beaver skill name.
- A plugin cannot overwrite an existing non-plugin workspace skill.
- Discovery does not enable a plugin. Enablement is an explicit admin action.
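The lexical part of these rules can be sketched as follows. This is a minimal illustration, not Beaver's validator: `validate_manifest` and `NAME_RE` are hypothetical names, manifest paths are assumed to use forward slashes, and the filesystem checks (presence of `SKILL.md`, symlink rejection, ownership conflicts) are deliberately left out.

```python
import re
from pathlib import PurePosixPath

# Lowercase letters, digits, "_" and "-" only (per the rules above).
NAME_RE = re.compile(r"[a-z0-9_-]+")


def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of problems; an empty list means the lexical checks passed."""
    errors: list[str] = []
    if manifest.get("schema_version") != 1:
        errors.append("unsupported schema_version")

    plugin_id = manifest.get("id")
    if not isinstance(plugin_id, str) or not NAME_RE.fullmatch(plugin_id):
        errors.append(f"invalid plugin id: {plugin_id!r}")

    for skill in manifest.get("skills", []):
        name = skill.get("name")
        path = skill.get("path")
        if not isinstance(name, str) or not NAME_RE.fullmatch(name):
            errors.append(f"invalid skill name: {name!r}")
        if not isinstance(path, str) or not path:
            errors.append(f"missing skill path for {name!r}")
            continue
        # Paths must be relative to the plugin root and must not climb out of it.
        if path.startswith("/") or ".." in PurePosixPath(path).parts:
            errors.append(f"skill path escapes plugin root: {path!r}")
    return errors
```

A lexical check alone cannot catch an escape through a symlink, which is why the rules reject symlinks separately while mirroring.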
## Storage Model
Plugin packages remain outside the managed skill version tree:
```text
workspace/
plugins/
baoyu-comic/
beaver.plugin.json
skills/baoyu-comic/SKILL.md
.beaver/
plugins/state.json
skills/
baoyu-comic/
skill.json
current.json
upstreams/
baoyu-comic/
<tree-hash>/
upstream.json
SKILL.md
assets/...
versions/
v0001/
version.json
SKILL.md
assets/...
```
`upstreams/` stores immutable raw plugin snapshots. `versions/` stores runtime-visible
Beaver versions. A merged Beaver version may differ from its upstream snapshot.
Every upstream snapshot has two hashes:
- `skill_content_hash`: canonical hash of normalized `SKILL.md`; used by the LLM merge and
skill-content preservation checks.
- `skill_tree_hash`: hash of every regular file in the skill tree, including normalized
relative path, byte length, bytes, and executable-bit metadata. Symlinks are rejected.
This is the supply-chain identity used for update detection and state.
The tree hash includes `SKILL.md`, templates, assets, examples, and scripts. Full Unix
mode bits are not hashed because umask and extraction tools can change them; only whether
any executable bit is set is normalized into the hash. Beaver metadata files such as
`version.json` and `upstream.json` are excluded.
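One way to compute such a tree hash is sketched here. The function name, the byte layout of the per-file header, and the choice to exclude metadata files only at the top level of the skill directory are assumptions made for illustration; the design above fixes only which inputs are hashed.

```python
import hashlib
import os
import stat
from pathlib import Path

# Beaver metadata files excluded from the hash (assumed here to live at the top level).
EXCLUDED = {"version.json", "upstream.json"}


def skill_tree_hash(root: Path) -> str:
    """Hash every regular file under root: path, length, executable flag, bytes."""
    files = []
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            full = Path(dirpath) / name
            if full.is_symlink():
                raise ValueError(f"symlink rejected: {full}")
        for name in filenames:
            full = Path(dirpath) / name
            rel = full.relative_to(root).as_posix()  # normalized relative path
            if rel not in EXCLUDED:
                files.append((rel, full))

    digest = hashlib.sha256()
    for rel, full in sorted(files, key=lambda item: item[0]):
        data = full.read_bytes()
        mode = full.stat().st_mode
        # Only "is any executable bit set" is hashed, not the full mode bits.
        executable = bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))
        digest.update(f"{rel}\0{len(data)}\0{int(executable)}\0".encode("utf-8"))
        digest.update(data)
    return digest.hexdigest()
```

With this layout, changing a file from mode `0644` to `0600` leaves the hash unchanged, while setting an executable bit changes it.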
Every Beaver `SkillVersion` also stores a backward-compatible `tree_hash`. New versions
compute it from the complete promoted version directory. Older versions without the field
derive it on read, so `L.tree` is available for upgrade classification.
Plugin state records:
```json
{
"plugins": {
"baoyu-comic": {
"enabled": true,
"installed_version": "1.2.0",
"manifest_path": "plugins/baoyu-comic/beaver.plugin.json",
"updates_paused": false,
"skills": {
"baoyu-comic": {
"accepted_upstream_tree_hash": "sha256...",
"observed_upstream_tree_hash": "sha256...",
"accepted_beaver_version": "v0003",
"current_beaver_version": "v0003",
"pending_candidate_id": null,
"status": "synced"
}
}
}
}
}
```
Skill versions and drafts also carry plugin provenance. State is operational metadata;
version provenance is the durable audit record.
## Initial Enable Flow
When an admin enables a valid plugin:
1. Discover and validate the manifest.
2. Copy each declared skill into an immutable upstream snapshot.
3. Reject ownership/name conflicts before changing any skill.
4. Run the existing deterministic skill safety checker against an in-memory initial-mirror
draft and reject failed or critical results.
5. Publish an exact Beaver mirror as the next skill version.
6. Copy supporting files into that version.
7. Mark the skill `source_kind="plugin"` and active.
8. Record plugin ID, plugin version, source path, upstream hash, and mirror mode in
`SkillVersion.provenance`.
9. Update plugin state only after all declared skills succeed.
Initial enable is an explicit trust action, so it does not require LLM synthesis. Manifest
validation, path validation, and the existing static skill safety checks still apply.
All files are first written below a transaction staging directory on the same filesystem.
Only after manifest validation, tree hashing, conflict checks, and safety checks pass are
immutable upstream/version directories promoted with `os.replace()`. `current.json`,
`skill.json`, and indexes are atomically replaced under the workspace write lock; plugin
state is written last. A failed transaction may leave an unreferenced immutable directory,
which cleanup can remove, but it cannot make a partial version runtime-visible.
For a new skill, the complete staged skill directory is promoted once. For an existing
skill, immutable directories and metadata are promoted first and `current.json` is
replaced last as the runtime visibility switch. This provides per-skill atomic visibility;
the workspace lock serializes writers across a multi-skill plugin operation.
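The promote-then-switch ordering for an existing skill might look like the sketch below. `promote_version` is a hypothetical helper, the `{"version": ...}` shape of `current.json` is an assumption, and the caller is assumed to hold the workspace write lock and to have staged the directory on the same filesystem.

```python
import json
import os
import tempfile
from pathlib import Path


def promote_version(staged_dir: Path, skill_dir: Path, version: str) -> None:
    """Promote a fully staged version directory, then flip current.json last."""
    versions_dir = skill_dir / "versions"
    versions_dir.mkdir(parents=True, exist_ok=True)
    target = versions_dir / version
    if target.exists():
        raise FileExistsError(f"version already promoted: {target}")

    # Step 1: move the immutable version directory into place (a rename on one filesystem).
    os.replace(staged_dir, target)

    # Step 2: write the new pointer to a temp file, then atomically swap it in.
    fd, tmp_name = tempfile.mkstemp(dir=skill_dir, prefix="current.", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump({"version": version}, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_name, skill_dir / "current.json")
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

If the process dies between the two steps, the version directory exists but nothing references it, which matches the "unreferenced immutable directory" failure mode described above.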
## Runtime Priority
Mirrored plugin skills are loaded exclusively from `SkillSpecStore`. They are not supplied
through `SkillsLoader.extra_dirs`.
This makes priority deterministic:
1. active published workspace versions, including plugin-origin versions;
2. builtin skills.
`source_kind="plugin"` is displayed for provenance but does not lower selection priority
or exclude the skill from self-learning.
## Upgrade Classification
For each linked skill, compare the tree hashes of `B`, `L`, and `U`. Rows are evaluated
top to bottom and the first match applies:
| Condition | Action |
| --- | --- |
| `U.tree == B.tree` | No upstream change; no action. |
| `L.tree == U.tree` | Acknowledge the new upstream snapshot; no draft needed. |
| `L.tree == B.tree` and `U.tree != B.tree` | Create a deterministic `fast_forward` plugin update draft containing `U`. |
| `L.tree != B.tree` and `U.tree != B.tree` | Create a `three_way` plugin update candidate using `B`, `L`, and `U`. |
Even the `fast_forward` case goes through safety, replay evaluation, review, and publish.
It skips LLM merge synthesis because there is no local divergence.
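The table reduces to a small pure function. In this sketch `fast_forward` and `three_way` are the merge modes named above, while the labels `no_change` and `acknowledge` and the function name itself are placeholders invented for illustration.

```python
def classify_upgrade(base_tree: str, local_tree: str, upstream_tree: str) -> str:
    """Classify a plugin update from the tree hashes of B, L and U."""
    if upstream_tree == base_tree:
        return "no_change"      # upstream did not move
    if local_tree == upstream_tree:
        return "acknowledge"    # local already equals the new upstream
    if local_tree == base_tree:
        return "fast_forward"   # no local divergence: draft contains U
    return "three_way"          # both sides moved: merge B, L and U
```

The order of the checks matters: when `L.tree == U.tree` but both differ from `B.tree`, the fourth row's condition also holds, so the acknowledge case has to be tested first.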
Candidate IDs are deterministic:
```text
plugin-update:<plugin-id>:<skill-name>:<new-upstream-hash-prefix>
```
This makes boot-time sync idempotent.
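A sketch of why a deterministic ID gives idempotence: the same inputs always yield the same ID, so a repeated sync finds the existing candidate instead of appending a second one. The 12-character prefix length and both function names are assumptions; the design does not specify them.

```python
def plugin_update_candidate_id(
    plugin_id: str, skill_name: str, new_upstream_hash: str, prefix_len: int = 12
) -> str:
    """Build the deterministic candidate ID (prefix_len is an assumed value)."""
    return f"plugin-update:{plugin_id}:{skill_name}:{new_upstream_hash[:prefix_len]}"


def ensure_candidate(existing_ids: set[str], candidate_id: str) -> bool:
    """Register the candidate once; return True only when it was newly created."""
    if candidate_id in existing_ids:
        return False
    existing_ids.add(candidate_id)
    return True
```

In the real system the existence check and the JSONL append would both run inside the workspace write lock; the in-memory set here only stands in for that store.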
Supporting files use a deterministic path-level three-way merge:
- local unchanged from `B`: take `U`;
- upstream unchanged from `B`: keep `L`;
- both sides equal: keep either;
- a file added on only one side: keep it;
- divergent edits, delete-versus-edit, or different additions at the same path: record an
unresolved file conflict and block publication.
The LLM merges only `SKILL.md`. It does not attempt to merge arbitrary or binary
supporting files.
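The path-level rules for supporting files can be expressed over three maps from relative path to content hash, with a missing key meaning the file is absent on that side. This is an illustrative sketch with a hypothetical function name, not the store's actual merge code.

```python
FileMap = dict[str, str]  # relative path -> content hash; absent key = no file


def merge_supporting_files(
    base: FileMap, local: FileMap, upstream: FileMap
) -> tuple[FileMap, list[str]]:
    """Return (merged files, conflicting paths) for a three-way path-level merge."""
    merged: FileMap = {}
    conflicts: list[str] = []
    for path in sorted(set(base) | set(local) | set(upstream)):
        b, l, u = base.get(path), local.get(path), upstream.get(path)
        if l == u:
            chosen = l          # both sides equal (or both deleted)
        elif l == b:
            chosen = u          # local unchanged from B: take U
        elif u == b:
            chosen = l          # upstream unchanged from B: keep L
        else:
            conflicts.append(path)  # divergent edit, delete-vs-edit, or clashing add
            continue
        if chosen is not None:
            merged[path] = chosen
    return merged, conflicts
```

A file added on only one side falls out of the same rules: the other side still equals the (absent) base, so the addition is kept. Any non-empty conflict list would block publication.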
## Learning Integration
Add candidate kind `plugin_skill_update`. Its evidence contains only references:
```json
{
"plugin_id": "baoyu-comic",
"plugin_version": "1.2.0",
"skill_name": "baoyu-comic",
"merge_mode": "three_way",
"base_upstream_tree_hash": "old-hash",
"new_upstream_tree_hash": "new-hash",
"local_version": "v0003"
}
```
The learning service resolves the actual snapshots from `SkillSpecStore`; raw skill
content is not duplicated into `learning-candidates.jsonl`.
For `three_way`, the synthesizer receives:
- old upstream `B`;
- current local skill `L`;
- new upstream `U`;
- relevant historical run evidence for `L`, when available.
The synthesizer must return the merged skill plus explicit merge decisions:
```json
{
"frontmatter": {},
"content": "...",
"change_reason": "...",
"preserved_local_sections": [],
"adopted_upstream_sections": [],
"resolved_conflicts": [],
"dropped_sections": []
}
```
The generated draft uses `proposal_kind="plugin_skill_update"` and carries the complete
plugin merge provenance.
## Evaluation And Publish Gates
The existing flow remains authoritative:
```text
candidate
-> draft
-> static safety
-> replay eval
-> review
-> publish
-> rollback if needed
```
Replay eval compares:
- baseline arm: current local version `L`;
- candidate arm: merged draft `M`.
The preservation report is extended for plugin updates:
- local preservation: important instructions from `L` are not silently removed;
- upstream adoption: new important instructions from `U` are represented;
- safety/tool preservation: Safety and Required Tools changes require explicit handling;
- unresolved conflicts cause evaluation failure.
Publishing is blocked when:
- static safety fails;
- replay evaluation regresses;
- confidence is low under the existing gate;
- local or upstream preservation fails;
- merge decisions contain unresolved `SKILL.md` conflicts;
- the supporting-file merge plan contains unresolved path/content conflicts.
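The blocking conditions compose into a single check that returns every reason at once, which is convenient for a review UI. The `GateReport` fields and the reason strings are hypothetical names chosen for this sketch; only the six conditions come from the list above.

```python
from dataclasses import dataclass, field


@dataclass
class GateReport:
    """Hypothetical summary of gate results for one plugin update draft."""
    static_safety_passed: bool
    replay_regressed: bool
    confidence_low: bool
    local_preservation_passed: bool
    upstream_preservation_passed: bool
    unresolved_skill_conflicts: list[str] = field(default_factory=list)
    unresolved_file_conflicts: list[str] = field(default_factory=list)


def blocking_reasons(report: GateReport) -> list[str]:
    """Return every reason publication is blocked; an empty list means publishable."""
    reasons = []
    if not report.static_safety_passed:
        reasons.append("static safety failed")
    if report.replay_regressed:
        reasons.append("replay evaluation regressed")
    if report.confidence_low:
        reasons.append("confidence below the existing gate")
    if not (report.local_preservation_passed and report.upstream_preservation_passed):
        reasons.append("local or upstream preservation failed")
    if report.unresolved_skill_conflicts:
        reasons.append("unresolved SKILL.md conflicts")
    if report.unresolved_file_conflicts:
        reasons.append("unresolved supporting-file conflicts")
    return reasons
```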
On publish, the pipeline notifies `PluginManager`, which advances
`accepted_upstream_tree_hash`, clears the pending candidate, and records the new Beaver
version.
Observer delivery is not the source of truth. At the start of every sync, reconciliation
inspects the current published version provenance. If it contains a valid, newer
`plugin_skill_update` result and its upstream snapshot exists, plugin state is repaired:
- advance `accepted_upstream_tree_hash`;
- advance `accepted_beaver_version`;
- clear the matching pending candidate;
- set status to `synced`.
Reconciliation never moves `accepted_beaver_version` backwards after a runtime rollback.
An observer failure is audited but does not fail a publish request that has already
succeeded, so client retries cannot create misleading duplicate operations.
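The repair step can be sketched as a function over one skill's entry in plugin state. The provenance keys `proposal_kind`, `new_upstream_tree_hash`, and `candidate_id` are assumed field names, "newer" is interpreted as a higher version number than `accepted_beaver_version`, and `snapshot_exists` stands in for a lookup in `upstreams/`.

```python
def version_number(version: str) -> int:
    """Parse a label such as 'v0003' into 3."""
    return int(version.lstrip("v"))


def reconcile_skill_state(
    skill_state: dict, published_version: str, provenance: dict, snapshot_exists: bool
) -> bool:
    """Repair one skill's plugin state from published provenance; True if repaired."""
    if provenance.get("proposal_kind") != "plugin_skill_update":
        return False
    if not snapshot_exists:
        return False
    accepted = skill_state["accepted_beaver_version"]
    # Never move backwards: a runtime rollback publishes an older version number.
    if version_number(published_version) <= version_number(accepted):
        return False

    skill_state["accepted_upstream_tree_hash"] = provenance["new_upstream_tree_hash"]
    skill_state["accepted_beaver_version"] = published_version
    if skill_state.get("pending_candidate_id") == provenance.get("candidate_id"):
        skill_state["pending_candidate_id"] = None
    skill_state["status"] = "synced"
    return True
```

Because the function reads only durable version provenance, running it at the start of every sync makes a lost observer notification harmless.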
## Concurrent And Failure Behavior
- All plugin sync, skill version allocation/publication, plugin state mutation, and
learning-candidate mutation share a reentrant cross-process workspace write lock at
`.beaver/locks/plugin-skill-write.lock`.
- The lock uses the repository's existing `fcntl`/`msvcrt` pattern plus an in-process
reentrant guard. Nested store calls reuse the held lock instead of deadlocking.
- Candidate existence checks and JSONL writes happen inside the lock.
- Version-number allocation and version promotion happen inside the lock.
- Explicit enable/sync waits for the lock with a bounded timeout and returns a busy error
on timeout.
- Engine boot never calls an LLM. Its auto-sync uses a non-blocking lock attempt; when the
lock is busy, boot proceeds with the current published skills and reports sync deferred.
- Repeated and concurrent boot/sync operations are idempotent across processes, not only
  within one Python object.
- If another active draft targets the same skill, the plugin update remains pending and
is not synthesized until the skill is free.
- If a newer plugin version appears while an older update is pending, the old candidate is
marked superseded and a new candidate is created against the last accepted upstream.
- Rejecting a draft preserves the plugin package, upstream snapshots, current skill, and
candidate audit history. Regeneration remains possible.
- A multi-skill plugin enable promotes no metadata or current pointers until every
  staged skill passes validation, so a partial enable is never runtime-visible.
- Plugin files are never modified by learning or publication.
## Pause, Disable, Missing, And Adopt
- Pausing updates suspends discovery-to-candidate sync while linked skills remain active.
- Resuming updates reconciles state and performs sync.
- Disabling a plugin is an explicit, disruptive runtime action: it pauses updates and
  disables linked skills, but never deletes versions or upstream snapshots. The API
  requires an explicit `disable_linked_skills=true` confirmation.
- Re-enabling restores linked skills and performs sync.
- A missing plugin package is a supply-chain status only. It marks the plugin `missing`,
suspends sync/update, and leaves the current Beaver skills active.
- An explicit `adopt` action detaches a skill from its plugin, changes
`source_kind` to `managed`, keeps the current version active, and prevents future plugin
updates from targeting it.
## Management API And UI
Backend endpoints:
```text
GET /api/plugins
POST /api/plugins/sync
POST /api/plugins/{plugin_id}/enable
POST /api/plugins/{plugin_id}/pause
POST /api/plugins/{plugin_id}/resume
POST /api/plugins/{plugin_id}/disable
POST /api/plugins/{plugin_id}/skills/{skill_name}/adopt
```
API payloads never expose absolute server paths. Workspace manifests use workspace-relative
paths. External manifests use a redacted display path such as
`<external>/baoyu-comic/beaver.plugin.json`.
The existing Skills page gains a Plugins tab showing:
- discovered/enabled/missing/error state;
- installed and discovered plugin versions;
- declared skills and their current Beaver versions;
- sync state and pending learning candidate;
- enable, pause, resume, disable, sync, and adopt actions.
Plugin-origin skills continue to appear in the normal Published, Candidates, and Drafts
tabs with provenance and merge-mode labels.
## Non-Goals
- Importing arbitrary plugin Python modules into the Beaver process.
- Plugin-defined hooks, providers, channels, or frontend bundles.
- Automatic downloading from a plugin marketplace.
- Automatically publishing plugin upgrades without review.
- Editing or rebasing plugin source files in place.
## Acceptance Criteria
1. Enabling a plugin mirrors all declared skills and supporting files into managed
versions.
2. Mirrored skills have the same resolver priority and learning eligibility as ordinary
workspace skills.
3. Self-learning never modifies the plugin package.
4. Plugin updates create idempotent `plugin_skill_update` candidates.
5. Local divergence triggers a three-way merge; no divergence triggers a deterministic
fast-forward draft.
6. Every plugin update passes the existing safety, replay, review, and publish gates.
7. Publishing advances plugin state and preserves complete provenance.
8. Pause, disable, missing package, rejection, restart, and newer-update races do not lose
the current skill or its learned versions; missing packages leave current skills active.
9. Existing non-plugin skills and legacy candidate payloads remain backward compatible.
10. Supporting-file-only updates change the upstream tree hash and create an update
candidate.
11. Concurrent boot, sync, and enable operations do not allocate duplicate versions or
append duplicate candidates.
12. Sync reconciliation repairs plugin state when a version publishes successfully but
its observer notification or state update fails.