# Plugin Skill Mirroring And Upgrade Learning Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add declarative Beaver plugins whose skills are mirrored as normal managed skills, learn normally, and merge plugin upgrades through the existing safety, replay evaluation, review, publish, and rollback lifecycle.

**Architecture:** A new `beaver.plugins` package discovers and validates `beaver.plugin.json`, computes content and full-tree hashes, persists enable/sync state, and stages immutable upstream/version trees before atomic promotion under a workspace write lock. Plugin upgrades become deterministic `plugin_skill_update` learning candidates using old upstream, current local, and new upstream inputs; the existing learning pipeline remains the only path for update publication, with sync-time reconciliation repairing failed state acknowledgements.

**Tech Stack:** Python dataclasses and file-backed JSON stores, existing `SkillSpecStore` and skill-learning pipeline, FastAPI, pytest, Next.js/TypeScript, existing shadcn UI components.

---

## Scope

This plan implements declarative skill plugins only. Do not add Python plugin entrypoints,
hooks, providers, channels, dependency installation, or marketplace download support.
Plugin-provided tools continue to use MCP.

## File Structure

Create focused plugin modules:

- `app-instance/backend/beaver/plugins/models.py`: manifest, discovery, state, and sync result dataclasses.
- `app-instance/backend/beaver/plugins/manifest.py`: JSON parsing, identifier validation, and contained-path validation.
- `app-instance/backend/beaver/plugins/hashing.py`: canonical skill-content and full-tree hashing.
- `app-instance/backend/beaver/plugins/tree_merge.py`: deterministic three-way supporting-file merge plans.
- `app-instance/backend/beaver/plugins/state.py`: atomic `.beaver/plugins/state.json` persistence.
- `app-instance/backend/beaver/plugins/discovery.py`: scan workspace and configured plugin roots.
- `app-instance/backend/beaver/plugins/transaction.py`: same-filesystem staging and immutable directory promotion.
- `app-instance/backend/beaver/plugins/skills.py`: initial mirror, update classification, candidate creation, reconciliation, pause/resume, disable, and adopt.
- `app-instance/backend/beaver/plugins/__init__.py`: public exports.
- `app-instance/backend/beaver/foundation/utils/file_lock.py`: reentrant cross-process workspace write lock.

Modify skill lifecycle modules:

- `app-instance/backend/beaver/skills/specs/models.py`: add upstream snapshot and draft provenance models.
- `app-instance/backend/beaver/skills/specs/storage.py`: persist immutable upstream snapshots and safely copy supporting files.
- `app-instance/backend/beaver/memory/skills/store.py`: lock candidate existence checks and JSONL mutations.
- `app-instance/backend/beaver/skills/drafts/service.py`: create plugin update drafts.
- `app-instance/backend/beaver/skills/learning/service.py`: synthesize `plugin_skill_update`.
- `app-instance/backend/beaver/skills/learning/synthesizer.py`: three-way plugin merge prompt and result.
- `app-instance/backend/beaver/skills/learning/eval.py`: plugin merge preservation report.
- `app-instance/backend/beaver/skills/learning/pipeline.py`: acknowledge successful plugin update publication.
- `app-instance/backend/beaver/skills/publisher/service.py`: carry draft provenance into published versions.

Modify runtime and management surfaces:

- `app-instance/backend/beaver/foundation/config/schema.py`
- `app-instance/backend/beaver/foundation/config/loader.py`
- `app-instance/backend/beaver/engine/loader.py`
- `app-instance/backend/beaver/interfaces/web/app.py`
- `app-instance/frontend/types/index.ts`
- `app-instance/frontend/lib/api.ts`
- `app-instance/frontend/app/(app)/skills/page.tsx`

Add tests:

- `app-instance/backend/tests/unit/test_plugin_manifest.py`
- `app-instance/backend/tests/unit/test_plugin_hashing.py`
- `app-instance/backend/tests/unit/test_plugin_state.py`
- `app-instance/backend/tests/unit/test_workspace_write_lock.py`
- `app-instance/backend/tests/unit/test_plugin_skill_storage.py`
- `app-instance/backend/tests/unit/test_plugin_skill_sync.py`
- `app-instance/backend/tests/unit/test_plugin_skill_learning.py`
- `app-instance/backend/tests/unit/test_plugin_runtime.py`
- `app-instance/backend/tests/unit/test_plugin_web_api.py`
- `app-instance/frontend/lib/plugin-api.test.ts`

---

### Task 1: Add Plugin Configuration And Manifest Models

**Files:**

- Create: `app-instance/backend/beaver/plugins/models.py`
- Create: `app-instance/backend/beaver/plugins/manifest.py`
- Create: `app-instance/backend/beaver/plugins/hashing.py`
- Create: `app-instance/backend/beaver/plugins/__init__.py`
- Modify: `app-instance/backend/beaver/foundation/config/schema.py`
- Modify: `app-instance/backend/beaver/foundation/config/loader.py`
- Modify: `app-instance/backend/beaver/foundation/config/__init__.py`
- Test: `app-instance/backend/tests/unit/test_plugin_manifest.py`
- Test: `app-instance/backend/tests/unit/test_plugin_hashing.py`
- Test: `app-instance/backend/tests/unit/test_config_loader.py`

- [ ] **Step 1: Write failing manifest validation tests**

Create tests covering:

```python
import json
from pathlib import Path

import pytest

# load_plugin_manifest comes from the new beaver.plugins package.


def test_load_plugin_manifest_accepts_declared_skill(tmp_path: Path) -> None:
    root = tmp_path / "comic"
    (root / "skills" / "comic").mkdir(parents=True)
    (root / "skills" / "comic" / "SKILL.md").write_text("# Comic\n", encoding="utf-8")
    (root / "beaver.plugin.json").write_text(
        json.dumps(
            {
                "schema_version": 1,
                "id": "baoyu-comic",
                "name": "Baoyu Comic",
                "version": "1.2.0",
                "skills": [{"name": "baoyu-comic", "path": "skills/comic"}],
            }
        ),
        encoding="utf-8",
    )

    manifest = load_plugin_manifest(root / "beaver.plugin.json")

    assert manifest.plugin_id == "baoyu-comic"
    assert manifest.skills[0].name == "baoyu-comic"
    assert manifest.skills[0].root == root / "skills" / "comic"


@pytest.mark.parametrize("value", ["../outside", "/absolute", "skills/../../outside"])
def test_load_plugin_manifest_rejects_escaping_skill_path(tmp_path: Path, value: str) -> None:
    path = tmp_path / "beaver.plugin.json"
    path.write_text(
        json.dumps(
            {
                "schema_version": 1,
                "id": "unsafe",
                "name": "Unsafe",
                "version": "1.0.0",
                "skills": [{"name": "unsafe", "path": value}],
            }
        ),
        encoding="utf-8",
    )

    with pytest.raises(ValueError, match="contained"):
        load_plugin_manifest(path)
```

Also test invalid IDs, duplicate skill names, unsupported schema versions, missing
`SKILL.md`, and symlinked skill roots.

Add tree-hash tests:

```python
from pathlib import Path

# hash_plugin_skill_tree comes from the new beaver.plugins package.


def test_skill_tree_hash_changes_when_supporting_file_changes(tmp_path: Path) -> None:
    root = tmp_path / "skill"
    root.mkdir()
    (root / "SKILL.md").write_text("# Skill\n", encoding="utf-8")
    (root / "templates").mkdir()
    template = root / "templates" / "report.md"
    template.write_text("v1", encoding="utf-8")

    first = hash_plugin_skill_tree(root)
    template.write_text("v2", encoding="utf-8")
    second = hash_plugin_skill_tree(root)

    assert first.skill_content_hash == second.skill_content_hash
    assert first.skill_tree_hash != second.skill_tree_hash
```

Also verify path changes and executable-bit changes affect `skill_tree_hash`, while mtime
and non-executable permission changes do not.

- [ ] **Step 2: Run tests and verify failure**

Run:

```bash
cd app-instance/backend
pytest tests/unit/test_plugin_manifest.py tests/unit/test_plugin_hashing.py tests/unit/test_config_loader.py -q
```

Expected: FAIL because `beaver.plugins` and `PluginsConfig` do not exist.

- [ ] **Step 3: Implement immutable plugin models and config**

Put plugin package models in `beaver/plugins/models.py`:

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True, slots=True)
class PluginSkillDeclaration:
    name: str
    relative_path: str
    root: Path


@dataclass(frozen=True, slots=True)
class PluginManifest:
    schema_version: int
    plugin_id: str
    name: str
    version: str
    root: Path
    manifest_path: Path
    display_path: str
    skills: tuple[PluginSkillDeclaration, ...]


@dataclass(frozen=True, slots=True)
class PluginSkillFileDigest:
    path: str
    size: int
    executable: bool
    content_hash: str


@dataclass(frozen=True, slots=True)
class PluginSkillTreeDigest:
    skill_content_hash: str
    skill_tree_hash: str
    files: tuple[PluginSkillFileDigest, ...]
```

Put configuration in `beaver/foundation/config/schema.py` so that the foundation layer stays
self-contained and config never imports plugin runtime modules:

```python
from dataclasses import dataclass, field


@dataclass(slots=True)
class PluginsConfig:
    search_paths: list[str] = field(default_factory=list)
    auto_sync: bool = True
```

Add `plugins: PluginsConfig` to `BeaverConfig`. Parse both camelCase and snake_case:

```python
def _parse_plugins(raw: Any) -> PluginsConfig:
    data = _as_dict(raw)
    return PluginsConfig(
        search_paths=_string_list(data.get("searchPaths") or data.get("search_paths")),
        auto_sync=_bool(data.get("autoSync") if "autoSync" in data else data.get("auto_sync"), default=True),
    )
```

- [ ] **Step 4: Implement strict JSON manifest loading**

`load_plugin_manifest()` must:

1. parse a JSON object;
2. require schema version `1`;
3. validate identifiers with `^[a-z0-9][a-z0-9_-]*$`;
4. resolve every skill root and check `resolved.is_relative_to(plugin_root)`;
5. reject symlinks in the path from plugin root to skill root;
6. require a regular `SKILL.md`;
7. initialize `display_path` without exposing an absolute path;
8. return frozen dataclasses.

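The containment rules in items 4 to 6 can be sketched as follows. This is a minimal illustration, not the plan's implementation: `resolve_contained_skill_root` is a hypothetical helper name, and treating manifest paths as POSIX-style strings is an assumption. The error messages contain the word "contained" only because the Step 1 test matches on it.

```python
from pathlib import Path, PurePosixPath


def resolve_contained_skill_root(plugin_root: Path, relative_path: str) -> Path:
    """Hypothetical helper: resolve a declared skill path and keep it inside plugin_root."""
    declared = PurePosixPath(relative_path)
    # Absolute paths and parent-directory segments can never be contained.
    if declared.is_absolute() or ".." in declared.parts:
        raise ValueError(f"skill path must be contained in the plugin root: {relative_path!r}")

    root = plugin_root.resolve()
    current = root
    for part in declared.parts:
        current = current / part
        # Reject symlinks anywhere between the plugin root and the skill root.
        if current.is_symlink():
            raise ValueError(f"skill path must be contained and symlink-free: {relative_path!r}")

    resolved = current.resolve()
    if not resolved.is_relative_to(root):
        raise ValueError(f"skill path must be contained in the plugin root: {relative_path!r}")

    # A skill root is only valid when it holds a regular (non-symlink) SKILL.md.
    skill_file = resolved / "SKILL.md"
    if skill_file.is_symlink() or not skill_file.is_file():
        raise ValueError(f"missing regular SKILL.md under {relative_path!r}")
    return resolved
```

Checking the declared string before touching the filesystem means `skills/../../outside` is rejected even when the intermediate directories do not exist.
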
- [ ] **Step 5: Implement deterministic dual hashing**

`hash_plugin_skill_tree(root)` must:

1. reject symlinks and non-regular files;
2. enumerate regular files by normalized POSIX relative path;
3. compute `skill_content_hash` from normalized `SKILL.md`;
4. compute `skill_tree_hash` from each path, byte length, file bytes, and one normalized
   executable-bit flag;
5. include `SKILL.md` and every supporting file;
6. exclude Beaver metadata such as `version.json` and `upstream.json`;
7. ignore mtime, uid/gid, and non-executable mode bits.

Use length-prefixed binary fields in the digest input instead of ambiguous string
concatenation.

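A minimal sketch of the tree-hash half of these rules is shown below. It is illustrative only: the function name `sketch_skill_tree_hash`, the choice of SHA-256, the 8-byte length prefix, treating any executable bit as the normalized flag, and excluding metadata files only at the top level are all assumptions. The normalization used for `skill_content_hash` is not specified here, so it is left out.

```python
import hashlib
import os
import stat
from pathlib import Path

# Metadata names come from the plan; excluding them only at the top level is an assumption.
EXCLUDED_METADATA = frozenset({"version.json", "upstream.json"})


def _length_prefixed(data: bytes) -> bytes:
    """Prefix a field with its 8-byte big-endian length so field boundaries are unambiguous."""
    return len(data).to_bytes(8, "big") + data


def sketch_skill_tree_hash(root: Path) -> str:
    entries: list[tuple[str, Path, bool]] = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            entry = Path(dirpath) / name
            mode = entry.lstat().st_mode
            if stat.S_ISLNK(mode):
                raise ValueError(f"symlink not allowed: {entry}")
            if not (stat.S_ISDIR(mode) or stat.S_ISREG(mode)):
                raise ValueError(f"non-regular file not allowed: {entry}")
        for name in filenames:
            path = Path(dirpath) / name
            relative = path.relative_to(root).as_posix()
            if relative in EXCLUDED_METADATA:
                continue
            # Only the executable bit is hashed; other mode bits, mtime, and owner are ignored.
            executable = bool(path.lstat().st_mode & 0o111)
            entries.append((relative, path, executable))

    digest = hashlib.sha256()
    # Sort by POSIX relative path so directory enumeration order cannot change the hash.
    for relative, path, executable in sorted(entries, key=lambda item: item[0]):
        digest.update(_length_prefixed(relative.encode("utf-8")))
        digest.update(_length_prefixed(path.read_bytes()))
        digest.update(b"\x01" if executable else b"\x00")
    return digest.hexdigest()
```

The length prefix is what keeps a file named `ab` with content `c` from hashing the same as a file named `a` with content `bc`, which plain concatenation would allow.
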
- [ ] **Step 6: Run focused tests**

```bash
cd app-instance/backend
pytest tests/unit/test_plugin_manifest.py tests/unit/test_plugin_hashing.py tests/unit/test_config_loader.py -q
```

Expected: PASS.

- [ ] **Step 7: Commit**

```bash
git add app-instance/backend/beaver/plugins app-instance/backend/beaver/foundation/config app-instance/backend/tests/unit/test_plugin_manifest.py app-instance/backend/tests/unit/test_plugin_hashing.py app-instance/backend/tests/unit/test_config_loader.py
git commit -m "feat(plugins): add declarative skill manifest"
```

---

### Task 2: Add Discovery And Atomic Plugin State

**Files:**

- Create: `app-instance/backend/beaver/plugins/discovery.py`
- Create: `app-instance/backend/beaver/plugins/state.py`
- Create: `app-instance/backend/beaver/foundation/utils/file_lock.py`
- Modify: `app-instance/backend/beaver/plugins/models.py`
- Modify: `app-instance/backend/beaver/plugins/__init__.py`
- Test: `app-instance/backend/tests/unit/test_plugin_state.py`
- Test: `app-instance/backend/tests/unit/test_workspace_write_lock.py`

- [ ] **Step 1: Write failing discovery and state tests**

Cover workspace discovery, configured search paths, duplicate plugin IDs, malformed
manifests reported as errors instead of crashing the full scan, and state round trips:

```python
def test_plugin_state_round_trip_is_atomic(tmp_path: Path) -> None:
    store = PluginStateStore(tmp_path)
    store.set_enabled("baoyu-comic", True)
    store.update_skill_binding(
        "baoyu-comic",
        "baoyu-comic",
        PluginSkillBinding(
            accepted_upstream_tree_hash="old",
            observed_upstream_tree_hash="new",
            accepted_beaver_version="v0001",
            current_beaver_version="v0002",
            pending_candidate_id="plugin-update:baoyu-comic:baoyu-comic:new",
            status="update_pending",
        ),
    )

    reloaded = PluginStateStore(tmp_path).get_plugin("baoyu-comic")

    assert reloaded is not None
    assert reloaded.enabled is True
    assert reloaded.skills["baoyu-comic"].accepted_upstream_tree_hash == "old"
    assert not (tmp_path / ".beaver" / "plugins" / "state.json.tmp").exists()
```

Add a multiprocess lock test in which two processes enter the same workspace lock and
assert their critical sections never overlap. Add a reentrancy test in which nested
acquisitions in one process complete without deadlock.

- [ ] **Step 2: Run tests and verify failure**

```bash
cd app-instance/backend
pytest tests/unit/test_plugin_state.py tests/unit/test_workspace_write_lock.py -q
```

Expected: FAIL because discovery and state stores are missing.

- [ ] **Step 3: Implement state dataclasses**

Add backward-compatible `to_dict()` and `from_dict()` methods for:

```python
from dataclasses import dataclass, field


@dataclass(slots=True)
class PluginSkillBinding:
    accepted_upstream_tree_hash: str | None = None
    observed_upstream_tree_hash: str | None = None
    accepted_beaver_version: str | None = None
    current_beaver_version: str | None = None
    pending_candidate_id: str | None = None
    status: str = "discovered"
    last_error: str | None = None


@dataclass(slots=True)
class PluginState:
    plugin_id: str
    enabled: bool = False
    updates_paused: bool = False
    installed_version: str | None = None
    manifest_path: str | None = None
    status: str = "discovered"
    last_error: str | None = None
    skills: dict[str, PluginSkillBinding] = field(default_factory=dict)
```

- [ ] **Step 4: Implement atomic state persistence**

Store data at `<workspace>/.beaver/plugins/state.json`. Write a complete JSON document to
`state.json.tmp`, flush it, then replace `state.json`. Public methods:

```python
list_plugins()
get_plugin(plugin_id)
set_enabled(plugin_id, enabled)
upsert_plugin(plugin_state)
update_skill_binding(plugin_id, skill_name, binding)
```

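The write-then-replace pattern described above might look like the sketch below. `write_json_atomically` is a hypothetical helper name, and the `os.fsync()` call is an extra durability step assumed here (the step text only asks for a flush).

```python
import json
import os
from pathlib import Path
from typing import Any


def write_json_atomically(path: Path, payload: dict[str, Any]) -> None:
    """Hypothetical helper: readers see either the old document or the new one, never a partial file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp_path = path.with_name(path.name + ".tmp")
    try:
        with tmp_path.open("w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())  # assumption: force the bytes to disk before the rename
        # os.replace() is atomic when source and target are on the same filesystem.
        os.replace(tmp_path, path)
    finally:
        # After a successful replace the temporary file no longer exists; this only
        # cleans up when serialization or the rename failed.
        tmp_path.unlink(missing_ok=True)
```

Keeping the temporary file next to `state.json` guarantees both paths share a filesystem, which is what makes the final rename atomic.
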
- [ ] **Step 5: Implement the shared workspace write lock**

Add:

```python
from contextlib import contextmanager
from pathlib import Path


class WorkspaceWriteLock:
    def __init__(self, workspace: str | Path) -> None:
        self.path = Path(workspace) / ".beaver" / "locks" / "plugin-skill-write.lock"

    @contextmanager
    def acquire(self, *, timeout_seconds: float | None = None, blocking: bool = True):
        ...
```

Requirements:

- use `fcntl.flock()` on POSIX and `msvcrt.locking()` on Windows, matching
  `memory/curated/store.py`;
- guard with a process-local `threading.RLock`;
- track per-thread recursion depth so nested store calls reuse the OS lock;
- support non-blocking acquisition for Engine boot;
- raise `WorkspaceWriteLockBusy` on timeout/contention;
- keep the lock file separate from atomically replaced data files.

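One way these requirements could fit together is sketched below for POSIX only. It is not the plan's implementation: the class is renamed `SketchWorkspaceWriteLock` to make that clear, the Windows `msvcrt.locking()` branch is omitted, the 50 ms polling interval is an arbitrary assumption, and a single depth counter stands in for per-thread tracking because it is only touched while the `RLock` is held by the owning thread.

```python
import fcntl  # POSIX only; the Windows branch is omitted in this sketch
import threading
import time
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator, Optional, Union


class WorkspaceWriteLockBusy(RuntimeError):
    """Raised when the lock is contended and the caller asked not to wait (or timed out)."""


class SketchWorkspaceWriteLock:
    def __init__(self, workspace: Union[str, Path]) -> None:
        self.path = Path(workspace) / ".beaver" / "locks" / "plugin-skill-write.lock"
        self._thread_lock = threading.RLock()
        self._depth = 0  # only modified while _thread_lock is held
        self._handle = None

    def _lock_file(self, blocking: bool, deadline: Optional[float]):
        self.path.parent.mkdir(parents=True, exist_ok=True)
        handle = self.path.open("a+")
        while True:
            try:
                fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
                return handle
            except BlockingIOError:
                expired = deadline is not None and time.monotonic() >= deadline
                if not blocking or expired:
                    handle.close()
                    raise WorkspaceWriteLockBusy(str(self.path)) from None
                time.sleep(0.05)  # assumption: simple polling instead of a blocking flock

    @contextmanager
    def acquire(self, *, timeout_seconds: Optional[float] = None, blocking: bool = True) -> Iterator[None]:
        deadline = None if timeout_seconds is None else time.monotonic() + timeout_seconds
        if not blocking:
            acquired = self._thread_lock.acquire(blocking=False)
        elif timeout_seconds is None:
            acquired = self._thread_lock.acquire()
        else:
            acquired = self._thread_lock.acquire(timeout=max(timeout_seconds, 0.0))
        if not acquired:
            raise WorkspaceWriteLockBusy(str(self.path))
        try:
            if self._depth == 0:
                # Outermost acquisition: take the cross-process OS lock.
                self._handle = self._lock_file(blocking, deadline)
            self._depth += 1
            try:
                yield
            finally:
                self._depth -= 1
                if self._depth == 0:
                    # Last nested exit: release the OS lock and close the handle.
                    fcntl.flock(self._handle.fileno(), fcntl.LOCK_UN)
                    self._handle.close()
                    self._handle = None
        finally:
            self._thread_lock.release()
```

Nested `acquire()` calls on the same instance only bump the depth counter, so store methods can take the lock unconditionally even when a caller such as `PluginManager` already holds it.
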
- [ ] **Step 6: Implement discovery**

Scan:

1. `<workspace>/plugins`;
2. each configured `plugins.search_paths`.

Only direct child directories containing `beaver.plugin.json` are plugins. Return a
`PluginDiscoveryResult` containing valid manifests and per-path errors. Duplicate IDs are
errors and neither duplicate is activated. Discovery records a workspace-relative
manifest display path when possible and a redacted
`<external>/<plugin-dir>/beaver.plugin.json` path otherwise; absolute paths remain
internal.

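The scan, duplicate handling, and display-path redaction can be sketched roughly as below. This is a simplified illustration: `SketchDiscoveryResult`, `display_path`, and `discover_plugins` are hypothetical names, and the sketch reads only the `id` field instead of running the full manifest validation from Task 1.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path

MANIFEST_NAME = "beaver.plugin.json"


@dataclass
class SketchDiscoveryResult:
    manifests: dict[str, Path] = field(default_factory=dict)  # plugin id -> manifest path
    errors: list[tuple[str, str]] = field(default_factory=list)  # (display path, message)


def display_path(workspace: Path, manifest_path: Path) -> str:
    """Workspace-relative when possible, otherwise a redacted external path."""
    try:
        return manifest_path.relative_to(workspace).as_posix()
    except ValueError:
        return f"<external>/{manifest_path.parent.name}/{MANIFEST_NAME}"


def discover_plugins(workspace: Path, search_paths: list[str]) -> SketchDiscoveryResult:
    result = SketchDiscoveryResult()
    candidates: dict[str, list[Path]] = {}
    for root in [workspace / "plugins", *(Path(item) for item in search_paths)]:
        if not root.is_dir():
            continue
        # Only direct children are considered; nested directories are never scanned.
        for child in sorted(root.iterdir()):
            manifest_path = child / MANIFEST_NAME
            if not (child.is_dir() and manifest_path.is_file()):
                continue
            try:
                data = json.loads(manifest_path.read_text(encoding="utf-8"))
                plugin_id = data["id"]
                if not isinstance(plugin_id, str):
                    raise ValueError("plugin id must be a string")
            except (OSError, ValueError, KeyError, TypeError) as exc:
                # A malformed manifest becomes a per-path error instead of aborting the scan.
                result.errors.append((display_path(workspace, manifest_path), f"invalid manifest: {exc}"))
                continue
            candidates.setdefault(plugin_id, []).append(manifest_path)

    for plugin_id, paths in candidates.items():
        if len(paths) == 1:
            result.manifests[plugin_id] = paths[0]
            continue
        # Neither duplicate is activated; every conflicting path is reported.
        for path in paths:
            result.errors.append((display_path(workspace, path), f"duplicate plugin id {plugin_id!r}"))
    return result
```

Resolving duplicates only after the whole scan finishes is what allows both conflicting manifests to be rejected, regardless of which root was visited first.
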
- [ ] **Step 7: Run focused tests**

```bash
cd app-instance/backend
pytest tests/unit/test_plugin_state.py tests/unit/test_workspace_write_lock.py tests/unit/test_plugin_manifest.py -q
```

Expected: PASS.

- [ ] **Step 8: Commit**

```bash
git add app-instance/backend/beaver/plugins app-instance/backend/beaver/foundation/utils/file_lock.py app-instance/backend/tests/unit/test_plugin_state.py app-instance/backend/tests/unit/test_workspace_write_lock.py
git commit -m "feat(plugins): discover packages and persist state"
```

---

### Task 3: Persist Immutable Upstream Skill Snapshots

**Files:**

- Create: `app-instance/backend/beaver/plugins/transaction.py`
- Modify: `app-instance/backend/beaver/skills/specs/models.py`
- Modify: `app-instance/backend/beaver/skills/specs/storage.py`
- Modify: `app-instance/backend/beaver/skills/specs/__init__.py`
- Test: `app-instance/backend/tests/unit/test_plugin_skill_storage.py`

- [ ] **Step 1: Write failing snapshot storage tests**

Test exact content, supporting files, idempotence, symlink rejection, and source
immutability:

```python
def test_write_upstream_snapshot_copies_skill_without_mutating_source(tmp_path: Path) -> None:
    source = tmp_path / "plugin" / "skills" / "comic"
    source.mkdir(parents=True)
    (source / "SKILL.md").write_text("# Comic\n\nOriginal.\n", encoding="utf-8")
    (source / "templates").mkdir()
    (source / "templates" / "panel.txt").write_text("panel", encoding="utf-8")
    store = SkillSpecStore(tmp_path / "workspace")
    transaction = PluginSkillTransaction(tmp_path / "workspace")

    snapshot = store.stage_upstream_snapshot(
        transaction,
        skill_name="baoyu-comic",
        source_kind="plugin",
        source_id="baoyu-comic",
        source_version="1.0.0",
        source_path="skills/comic",
        source_root=source,
    )
    store.promote_upstream_snapshot(transaction, snapshot)

    loaded = store.read_upstream_snapshot("baoyu-comic", "baoyu-comic", snapshot.skill_tree_hash)
    assert loaded is not None
    assert loaded.content == "# Comic\n\nOriginal.\n"
    assert (loaded.root / "templates" / "panel.txt").read_text(encoding="utf-8") == "panel"
    assert (source / "SKILL.md").read_text(encoding="utf-8") == "# Comic\n\nOriginal.\n"
```

Also test:

- changing only `templates/panel.txt` creates a different snapshot directory;
- `SkillVersion.from_dict()` remains compatible without `tree_hash`;
- reading a legacy version derives its complete tree hash;
- staging does not make a snapshot visible to `read_upstream_snapshot()`;
- promoting a staged snapshot uses `os.replace()` and is idempotent;
- a failed metadata write leaves no current pointer to the staged version.

- [ ] **Step 2: Run test and verify failure**

```bash
cd app-instance/backend
pytest tests/unit/test_plugin_skill_storage.py -q
```

Expected: FAIL because upstream snapshot APIs do not exist.

- [ ] **Step 3: Add upstream snapshot models**

Add:

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(slots=True)
class SkillUpstreamSnapshot:
    skill_name: str
    source_kind: str
    source_id: str
    source_version: str
    source_path: str
    skill_content_hash: str
    skill_tree_hash: str
    created_at: str
    frontmatter: dict[str, Any] = field(default_factory=dict)
```

Add `LoadedSkillUpstreamSnapshot(snapshot, content, root)` for storage reads. Extend
`SkillVersion` with a backward-compatible `tree_hash: str = ""`; new versions persist the
complete version-tree hash, while `read_published_skill()` derives it for legacy metadata
that lacks the field.

- [ ] **Step 4: Add safe tree-copy helper**

Refactor a private `SkillSpecStore._copy_regular_tree(source_root, target_root)` that:

- rejects any symlink;
- rejects paths containing empty, `.`, or `..` segments;
- copies regular files only;
- creates parents;
- never writes outside `target_root`.

Use it for transaction staging now; Task 4 will reuse it for mirrored versions.

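A standalone sketch of such a helper follows. It is written as a free function named `copy_regular_tree` for illustration, rather than the private method the step calls for. Two behaviors are assumptions: non-regular files are rejected instead of skipped, and `shutil.copy()` is used so the executable bit, which feeds the tree hash, survives the copy.

```python
import os
import shutil
from pathlib import Path


def copy_regular_tree(source_root: Path, target_root: Path) -> None:
    """Illustrative helper: copy only regular files, validating the whole tree first."""
    if source_root.is_symlink():
        raise ValueError(f"symlink not allowed: {source_root}")

    relative_files: list[Path] = []
    for dirpath, dirnames, filenames in os.walk(source_root):
        for name in dirnames + filenames:
            entry = Path(dirpath) / name
            if entry.is_symlink():
                raise ValueError(f"symlink not allowed: {entry}")
        for name in filenames:
            entry = Path(dirpath) / name
            if not entry.is_file():
                raise ValueError(f"non-regular file not allowed: {entry}")
            relative_files.append(entry.relative_to(source_root))

    # Copy only after validation succeeded, so a rejected tree writes nothing.
    for relative in relative_files:
        if any(part in ("", ".", "..") for part in relative.parts):
            raise ValueError(f"unsafe path segment in {relative}")
        target = target_root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(source_root / relative, target)  # copies content and permission bits
```

Because every target path is built from a validated relative path joined onto `target_root`, the copy cannot land outside the target directory.
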
- [ ] **Step 5: Implement same-filesystem staging and promotion**

`PluginSkillTransaction` creates:

```text
<workspace>/.beaver/staging/plugin-skills/<transaction-id>/
```

The staging root must be on the same filesystem as `<workspace>/skills`. It exposes:

```python
stage_upstream_snapshot(...)
stage_skill_version(...)
promote_directory(staged, final)
cleanup()
```

`promote_directory()` uses `os.replace()` and never replaces an existing non-identical
immutable directory. Cleanup removes only the transaction's staging root.

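The promotion rule can be sketched as a free function, shown below under a few assumptions: identity between an existing directory and the staged one is decided by comparing per-file SHA-256 digests, the same-filesystem requirement is checked through `st_dev`, and the boolean return value is invented here to signal whether a rename happened. The sketch also assumes the caller already holds the workspace write lock, so the existence check and the rename cannot race.

```python
import hashlib
import os
import shutil
from pathlib import Path


def _tree_fingerprint(root: Path) -> dict[str, str]:
    """Map each file's POSIX relative path to the SHA-256 of its bytes."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def promote_directory(staged: Path, final: Path) -> bool:
    """Return True when the staged tree was renamed into place, False when it already existed."""
    final.parent.mkdir(parents=True, exist_ok=True)
    # A rename is only atomic when both directories live on the same filesystem.
    if staged.stat().st_dev != final.parent.stat().st_dev:
        raise OSError("staging and final directories must share a filesystem")

    if final.exists():
        if _tree_fingerprint(staged) != _tree_fingerprint(final):
            raise FileExistsError(f"refusing to replace non-identical immutable directory: {final}")
        # Identical content is already promoted, so the operation is idempotent.
        shutil.rmtree(staged)
        return False

    os.replace(staged, final)
    return True
```
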
- [ ] **Step 6: Implement snapshot APIs**

Write snapshots to:

```text
skills/<skill>/upstreams/<source-id>/<skill-tree-hash>/
```

The snapshot metadata stores both hashes. If the directory already exists, verify all
stored metadata and return it without rewriting.

Public methods:

```python
stage_upstream_snapshot(transaction, ...)
promote_upstream_snapshot(transaction, snapshot)
read_upstream_snapshot(skill_name, source_id, skill_tree_hash)
```

- [ ] **Step 7: Make JSON/current/index writes atomic**

Change `SkillSpecStore._write_json()` and current/index pointer writes to create a temporary
file in the target directory, flush and `fsync`, then `os.replace()`. Immutable version
directories are promoted first; runtime visibility changes only when `current.json`,
`skill.json`, and the published index are atomically replaced under the workspace lock.

- [ ] **Step 8: Run focused and existing storage tests**

```bash
cd app-instance/backend
pytest tests/unit/test_plugin_skill_storage.py tests/unit/test_phase5_skills_runtime.py -q
```

Expected: PASS.

- [ ] **Step 9: Commit**

```bash
git add app-instance/backend/beaver/plugins/transaction.py app-instance/backend/beaver/skills/specs app-instance/backend/tests/unit/test_plugin_skill_storage.py
git commit -m "feat(skills): store immutable plugin upstream snapshots"
```

---

### Task 4: Mirror Initial Plugin Skills As First-Class Skills

**Files:**

- Create: `app-instance/backend/beaver/plugins/skills.py`
- Modify: `app-instance/backend/beaver/plugins/models.py`
- Modify: `app-instance/backend/beaver/plugins/__init__.py`
- Modify: `app-instance/backend/beaver/skills/specs/storage.py`
- Test: `app-instance/backend/tests/unit/test_plugin_skill_sync.py`

- [ ] **Step 1: Write failing initial mirror tests**

Cover:

- enabling mirrors `SKILL.md` and supporting files;
- mirrored skill is returned by `SkillsLoader.list_published_skills()`;
- `source_kind` is `plugin`, but runtime source is still workspace;
- existing non-plugin name collision fails without modification;
- any validation/safety failure in a multi-skill plugin occurs before promotion and leaves
  every linked skill unchanged;
- repeated sync is idempotent;
- supporting files are present in the promoted version;
- concurrent enable calls allocate only one version.

Core assertion:

```python
result = manager.enable("baoyu-comic")
record = SkillsLoader(workspace).get_skill_record("baoyu-comic")
loaded = SkillSpecStore(workspace).read_published_skill("baoyu-comic")

assert result.status == "synced"
assert record is not None and record.source == "workspace"
assert record.source_kind == "plugin"
assert loaded is not None
assert loaded.version.version == "v0001"
assert loaded.version.provenance["plugin_id"] == "baoyu-comic"
assert loaded.version.provenance["upstream_skill_content_hash"]
assert loaded.version.provenance["upstream_skill_tree_hash"]
```

- [ ] **Step 2: Run tests and verify failure**

```bash
cd app-instance/backend
pytest tests/unit/test_plugin_skill_sync.py -q
```

Expected: FAIL because `PluginManager` does not exist.

- [ ] **Step 3: Implement `PluginManager` constructor and discovery view**

Constructor dependencies:

```python
class PluginManager:
    def __init__(
        self,
        *,
        workspace: Path,
        manifests: dict[str, PluginManifest],
        discovery_errors: list[PluginDiscoveryError],
        state_store: PluginStateStore,
        skill_store: SkillSpecStore,
        learning_store: SkillLearningStore,
        publisher: SkillPublisher,
        safety_checker: SkillDraftSafetyChecker,
        write_lock: WorkspaceWriteLock,
    ) -> None:
        ...
```

Keep all filesystem and lifecycle dependencies injectable for tests.

- [ ] **Step 4: Implement exact initial mirror publication**

Acquire the workspace write lock before reading state, allocating versions, or writing
candidates. For each declared skill:

1. persist the upstream snapshot;
2. validate ownership conflict;
3. parse frontmatter/body and create an in-memory `SkillDraft` with
   `proposal_kind="plugin_initial_mirror"`;
4. run `SkillDraftSafetyChecker.check()` and reject failed or critical reports;
5. allocate the next `vNNNN` while holding the lock;
6. stage a `SkillVersion` whose content exactly equals upstream `SKILL.md`;
7. stage snapshot supporting files into the version directory;
8. generate the complete next `SkillSpec`, current pointer, index, and plugin-state JSON
   payloads in memory.

Use provenance:

```python
{
    "source_kind": "plugin",
    "plugin_id": manifest.plugin_id,
    "plugin_version": manifest.version,
    "plugin_skill_path": declaration.relative_path,
    "upstream_skill_content_hash": snapshot.skill_content_hash,
    "upstream_skill_tree_hash": snapshot.skill_tree_hash,
    "merge_mode": "initial_mirror",
}
```

- [ ] **Step 5: Promote the complete staged transaction**

After every declared skill passes validation:

1. for a new skill, promote its complete staged skill directory with one `os.replace()`;
2. for an existing skill, promote immutable upstream/version directories, atomically
   replace spec/index metadata, and replace `current.json` last as the visibility switch;
3. atomically write plugin state last;
4. clean the staging directory.

Do not implement reverse rollback across already-promoted immutable directories. If a
metadata write fails, those directories remain unreferenced and harmless; the previous
current pointers remain authoritative. Add startup cleanup for staging directories older
than 24 hours.

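The startup cleanup mentioned above might be sketched as follows. `cleanup_stale_staging` is a hypothetical function name, and using each transaction directory's own mtime as its age is an assumption; the staging path comes from Task 3, Step 5.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def cleanup_stale_staging(
    workspace: Path,
    *,
    max_age_seconds: float = 24 * 60 * 60,
    now: Optional[float] = None,
) -> list[Path]:
    """Remove transaction staging directories older than max_age_seconds; return what was removed."""
    staging_root = workspace / ".beaver" / "staging" / "plugin-skills"
    if not staging_root.is_dir():
        return []

    current_time = time.time() if now is None else now
    removed: list[Path] = []
    for child in sorted(staging_root.iterdir()):
        # Only real directories are transaction roots; never follow symlinks.
        if child.is_symlink() or not child.is_dir():
            continue
        if current_time - child.stat().st_mtime > max_age_seconds:
            shutil.rmtree(child)
            removed.append(child)
    return removed
```

Passing `now` explicitly keeps the age check deterministic in tests.
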
- [ ] **Step 6: Run focused and loader tests**

```bash
cd app-instance/backend
pytest tests/unit/test_plugin_skill_sync.py tests/unit/test_phase5_skills_runtime.py -q
```

Expected: PASS.

- [ ] **Step 7: Commit**

```bash
git add app-instance/backend/beaver/plugins app-instance/backend/beaver/skills/specs/storage.py app-instance/backend/tests/unit/test_plugin_skill_sync.py
git commit -m "feat(plugins): mirror enabled plugin skills"
```

---

### Task 5: Detect Upgrades And Create Idempotent Learning Candidates

**Files:**

- Modify: `app-instance/backend/beaver/plugins/skills.py`
- Modify: `app-instance/backend/beaver/memory/skills/models.py`
- Modify: `app-instance/backend/beaver/memory/skills/store.py`
- Test: `app-instance/backend/tests/unit/test_plugin_skill_sync.py`
- Test: `app-instance/backend/tests/unit/test_skill_learning_candidate_state.py`

- [ ] **Step 1: Write failing upgrade classification tests**

Use four tree-hash fixture values in the base (`B`), local (`L`), and upstream (`U`)
positions:

```python
@pytest.mark.parametrize(
    ("base", "local", "upstream", "expected"),
    [
        ("A", "A", "A", "unchanged"),
        ("A", "B", "B", "already_applied"),
        ("A", "A", "B", "fast_forward"),
        ("A", "LOCAL", "UPSTREAM", "three_way"),
    ],
)
def test_classify_plugin_skill_update(base: str, local: str, upstream: str, expected: str) -> None:
    assert classify_plugin_skill_update(base, local, upstream) == expected
```

Also test:

- a supporting-file-only change returns `fast_forward` or `three_way`, never `unchanged`;
- candidate ID stability across repeated sync;
- new upstream supersedes an older pending candidate;
- candidate evidence contains hashes/version references but no raw skill body;
- legacy candidate payloads still parse;
- two processes syncing the same update append only one candidate record.

- [ ] **Step 2: Run tests and verify failure**

```bash
cd app-instance/backend
pytest tests/unit/test_plugin_skill_sync.py tests/unit/test_skill_learning_candidate_state.py -q
```

Expected: FAIL because update classification and candidate kind are missing.

- [ ] **Step 3: Add `plugin_skill_update` candidate support**

Do not add a special status. Existing candidate statuses remain sufficient. Ensure
`SkillLearningCandidate.from_dict()` accepts the new `kind` without changing legacy
defaults.

Use evidence:

```python
{
    "plugin_id": plugin_id,
    "plugin_version": manifest.version,
    "skill_name": skill_name,
    "merge_mode": merge_mode,
    "base_upstream_tree_hash": accepted_tree_hash,
    "new_upstream_tree_hash": snapshot.skill_tree_hash,
    "local_version": current.version.version,
}
```

Set `priority=10`, `confidence=1.0`, `trigger_reason="plugin_update"`.

- [ ] **Step 4: Implement update classification and candidate creation**

Use canonical hashes and deterministic IDs:

```python
candidate_id = (
    f"plugin-update:{plugin_id}:{skill_name}:"
    f"{new_upstream_tree_hash[:12]}"
)
```

For `already_applied`, advance state without a candidate. For `fast_forward` and
`three_way`, record an open candidate. If the same ID exists in any status, do not append
another JSONL record.

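A classification function consistent with the four Step 1 test cases could look like the sketch below. The function name and the expected labels come from that test; the body, and in particular the order of the comparisons, is one possible reading and not the confirmed implementation. `plugin_update_candidate_id` is a hypothetical wrapper around the ID format shown above.

```python
def classify_plugin_skill_update(base: str, local: str, upstream: str) -> str:
    """Classify an upgrade from base (accepted upstream), local, and new upstream tree hashes."""
    if upstream == base:
        # Upstream did not move, so there is nothing to merge (assumed to hold
        # even when the local tree has diverged through learning).
        return "unchanged"
    if local == upstream:
        # Local content already matches the new upstream tree.
        return "already_applied"
    if local == base:
        # No local edits since the accepted upstream: take upstream as-is.
        return "fast_forward"
    # Both sides changed relative to the base.
    return "three_way"


def plugin_update_candidate_id(plugin_id: str, skill_name: str, new_upstream_tree_hash: str) -> str:
    """Hypothetical helper: deterministic ID, so repeated syncs map to the same candidate."""
    return f"plugin-update:{plugin_id}:{skill_name}:{new_upstream_tree_hash[:12]}"
```

The `upstream == base` check has to come first: with all three hashes equal, testing `local == upstream` first would misreport `already_applied` instead of `unchanged`.
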
- [ ] **Step 5: Make candidate mutation atomic under the shared lock**
|
|
|
|
Add an optional `WorkspaceWriteLock` to `SkillLearningStore`; EngineLoader supplies the
|
|
shared workspace instance, while isolated unit-test construction falls back to a
|
|
store-local lock. Add:
|
|
|
|
```python
|
|
record_learning_candidate_if_absent(candidate) -> tuple[SkillLearningCandidate, bool]
|
|
```
|
|
|
|
Inside one lock acquisition, read current candidates, check the deterministic ID, and
|
|
atomically rewrite or append the JSONL record. Apply the same lock to candidate update and
|
|
transition methods. Nested calls from `PluginManager` reuse the reentrant lock.
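
The check-then-append behavior described above can be sketched as follows. This is a minimal illustration, not the store's actual implementation: `CandidateLog`, `candidate_id()`, and the dict-shaped candidate are hypothetical stand-ins, and `threading.RLock` only models the reentrant behavior inside one process, whereas the plan's `WorkspaceWriteLock` is also expected to exclude other processes.

```python
import json
import tempfile
import threading
from pathlib import Path


def candidate_id(plugin_id: str, skill_name: str, new_upstream_tree_hash: str) -> str:
    """Deterministic ID built from the formula in Step 4."""
    return f"plugin-update:{plugin_id}:{skill_name}:{new_upstream_tree_hash[:12]}"


class CandidateLog:
    """Hypothetical stand-in for the candidate part of SkillLearningStore."""

    def __init__(self, path: Path, lock=None) -> None:
        self.path = path
        # Fall back to a store-local reentrant lock when none is shared.
        self.lock = lock if lock is not None else threading.RLock()

    def read_all(self) -> list[dict]:
        if not self.path.exists():
            return []
        lines = self.path.read_text(encoding="utf-8").splitlines()
        return [json.loads(line) for line in lines if line.strip()]

    def record_if_absent(self, candidate: dict) -> tuple[dict, bool]:
        # Read, check, and append inside a single lock acquisition.
        with self.lock:
            for existing in self.read_all():
                if existing["candidate_id"] == candidate["candidate_id"]:
                    return existing, False
            with self.path.open("a", encoding="utf-8") as handle:
                handle.write(json.dumps(candidate, sort_keys=True) + "\n")
            return candidate, True


with tempfile.TemporaryDirectory() as tmp:
    log = CandidateLog(Path(tmp) / "candidates.jsonl")
    cid = candidate_id("baoyu", "baoyu-comic", "ab" * 32)
    first = log.record_if_absent({"candidate_id": cid, "status": "open"})
    second = log.record_if_absent({"candidate_id": cid, "status": "open"})
    stored = log.read_all()
```

The second call returns the existing record with `False`, so a repeated sync leaves a single JSONL line regardless of the candidate's current status.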
|
|
|
|
- [ ] **Step 6: Supersede stale pending updates**
|
|
|
|
When a different pending candidate exists for the same plugin skill:
|
|
|
|
```python
learning_store.transition_learning_candidate(
    old_candidate_id,
    "superseded",
    event_type="plugin_update_superseded",
    payload={"replacement_candidate_id": new_candidate_id},
)
```
|
|
|
|
- [ ] **Step 7: Run focused tests**
|
|
|
|
```bash
|
|
cd app-instance/backend
|
|
pytest tests/unit/test_plugin_skill_sync.py tests/unit/test_skill_learning_candidate_state.py -q
|
|
```
|
|
|
|
Expected: PASS.
|
|
|
|
- [ ] **Step 8: Commit**
|
|
|
|
```bash
|
|
git add app-instance/backend/beaver/plugins/skills.py app-instance/backend/beaver/memory/skills/models.py app-instance/backend/beaver/memory/skills/store.py app-instance/backend/tests/unit/test_plugin_skill_sync.py app-instance/backend/tests/unit/test_skill_learning_candidate_state.py
|
|
git commit -m "feat(plugins): enqueue skill upgrade candidates"
|
|
```
|
|
|
|
---
|
|
|
|
### Task 6: Add Plugin Update Draft Provenance And Fast-Forward Synthesis
|
|
|
|
**Files:**
|
|
- Modify: `app-instance/backend/beaver/skills/specs/models.py`
|
|
- Modify: `app-instance/backend/beaver/skills/drafts/service.py`
|
|
- Modify: `app-instance/backend/beaver/skills/publisher/service.py`
|
|
- Modify: `app-instance/backend/beaver/skills/learning/service.py`
|
|
- Test: `app-instance/backend/tests/unit/test_plugin_skill_learning.py`
|
|
- Test: `app-instance/backend/tests/unit/test_skill_learning_pipeline.py`
|
|
|
|
- [ ] **Step 1: Write failing model and fast-forward tests**
|
|
|
|
Test backward-compatible draft parsing and exact upstream fast-forward:
|
|
|
|
```python
draft = asyncio.run(service.synthesize_draft(candidate.candidate_id, provider_bundle))

assert draft.proposal_kind == "plugin_skill_update"
assert draft.proposed_content == new_upstream.content
assert draft.base_version == "v0001"
assert draft.provenance["merge_mode"] == "fast_forward"
assert draft.provenance["new_upstream_tree_hash"] == new_upstream.snapshot.skill_tree_hash
assert provider.calls == []
```
|
|
|
|
After publish, assert the new version contains the new upstream supporting files even when
|
|
`SKILL.md` did not change.
|
|
|
|
- [ ] **Step 2: Run tests and verify failure**
|
|
|
|
```bash
|
|
cd app-instance/backend
|
|
pytest tests/unit/test_plugin_skill_learning.py tests/unit/test_skill_learning_pipeline.py -q
|
|
```
|
|
|
|
Expected: FAIL because drafts have no provenance and the learning service has no plugin
|
|
update branch.
|
|
|
|
- [ ] **Step 3: Add backward-compatible draft provenance**
|
|
|
|
Extend `SkillDraft`:
|
|
|
|
```python
|
|
provenance: dict[str, Any] = field(default_factory=dict)
|
|
```
|
|
|
|
Include it in `to_dict()` and parse missing values as `{}` in `from_dict()`.
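
A minimal sketch of the backward-compatible parsing, assuming a trimmed, hypothetical `DraftSketch` in place of the real `SkillDraft` (whose other fields are not shown here); the `"revision"` default for `proposal_kind` is also an assumption:

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DraftSketch:
    """Hypothetical, trimmed stand-in for SkillDraft."""

    draft_id: str
    proposal_kind: str = "revision"  # assumed legacy default
    provenance: dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> dict[str, Any]:
        return {
            "draft_id": self.draft_id,
            "proposal_kind": self.proposal_kind,
            "provenance": dict(self.provenance),
        }

    @classmethod
    def from_dict(cls, payload: dict[str, Any]) -> "DraftSketch":
        return cls(
            draft_id=payload["draft_id"],
            proposal_kind=payload.get("proposal_kind", "revision"),
            # Legacy payloads have no "provenance" key; treat it as empty.
            provenance=dict(payload.get("provenance") or {}),
        )
```

Copying the dict in both directions keeps a parsed draft from sharing mutable state with the payload it came from.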
|
|
|
|
- [ ] **Step 4: Add a focused draft constructor**
|
|
|
|
Add:
|
|
|
|
```python
def create_plugin_update_draft(
    self,
    *,
    skill_name: str,
    base_version: str,
    proposed_content: str,
    proposed_frontmatter: dict,
    created_by: str,
    reason: str,
    provenance: dict,
    evidence_refs: list[dict] | None = None,
) -> SkillDraft:
    ...
```
|
|
|
|
It writes `proposal_kind="plugin_skill_update"`.
|
|
|
|
- [ ] **Step 5: Implement fast-forward synthesis**
|
|
|
|
In `SkillLearningService.synthesize_draft()`, branch before ordinary revision:
|
|
|
|
```python
if candidate.kind == "plugin_skill_update":
    return await self._synthesize_plugin_update(candidate, provider_bundle)
```
|
|
|
|
For `merge_mode == "fast_forward"`, load `U` from `SkillSpecStore`, parse its
|
|
frontmatter/body, and create a draft exactly equal to `U`. Do not call the provider.
|
|
|
|
- [ ] **Step 6: Serialize all skill publication**
|
|
|
|
Add an optional `WorkspaceWriteLock` to `SkillPublisher`; EngineLoader supplies the shared
|
|
workspace instance and isolated tests use a publisher-local fallback. Hold it across
|
|
`_next_version()`, version staging/promotion, spec/current/index replacement, rollback,
|
|
and disable. This protects ordinary learned skills as well as plugin-origin skills from
|
|
racing with boot or explicit plugin sync.
|
|
|
|
- [ ] **Step 7: Materialize referenced supporting files during publish**
|
|
|
|
For `proposal_kind="plugin_skill_update"`, resolve the snapshot and supporting-file plan
|
|
from draft provenance. Stage the complete next version directory, including `SKILL.md`
|
|
and supporting files, before promoting it. Reject missing snapshots, path conflicts, or
|
|
tree-hash mismatches. Ordinary skill publication keeps its current behavior.
|
|
|
|
- [ ] **Step 8: Preserve draft provenance on publish**
|
|
|
|
Change `SkillPublisher.publish()` provenance construction to:
|
|
|
|
```python
provenance={
    **dict(draft.provenance),
    "draft_id": draft_id,
    "proposal_kind": draft.proposal_kind,
    "trigger_run_id": draft.trigger_run_id,
    "trigger_session_id": draft.trigger_session_id,
}
```
|
|
|
|
- [ ] **Step 9: Run focused tests**
|
|
|
|
```bash
|
|
cd app-instance/backend
|
|
pytest tests/unit/test_plugin_skill_learning.py tests/unit/test_skill_learning_pipeline.py -q
|
|
```
|
|
|
|
Expected: PASS.
|
|
|
|
- [ ] **Step 10: Commit**
|
|
|
|
```bash
|
|
git add app-instance/backend/beaver/skills app-instance/backend/tests/unit/test_plugin_skill_learning.py app-instance/backend/tests/unit/test_skill_learning_pipeline.py
|
|
git commit -m "feat(skill-learning): create plugin update drafts"
|
|
```
|
|
|
|
---
|
|
|
|
### Task 7: Implement Three-Way Plugin Skill Synthesis
|
|
|
|
**Files:**
|
|
- Create: `app-instance/backend/beaver/plugins/tree_merge.py`
|
|
- Modify: `app-instance/backend/beaver/skills/learning/synthesizer.py`
|
|
- Modify: `app-instance/backend/beaver/skills/learning/service.py`
|
|
- Test: `app-instance/backend/tests/unit/test_plugin_skill_learning.py`
|
|
- Test: `app-instance/backend/tests/unit/test_skill_learning_synthesizer_preservation.py`
|
|
|
|
- [ ] **Step 1: Write failing three-way prompt and parse tests**
|
|
|
|
Assert the prompt contains labeled `OLD UPSTREAM`, `CURRENT LOCAL`, and `NEW UPSTREAM`
|
|
sections and does not confuse the current local version with the merge base.
|
|
|
|
Test response parsing for:
|
|
|
|
```json
{
  "frontmatter": {"name": "baoyu-comic", "description": "Comic workflow", "tools": []},
  "content": "# Baoyu Comic\n...",
  "change_reason": "Adopt upstream layout while preserving learned review step.",
  "preserved_local_sections": ["Review"],
  "adopted_upstream_sections": ["Panel Layout"],
  "resolved_conflicts": ["Output ordering"],
  "dropped_sections": []
}
```
|
|
|
|
Add supporting-file merge tests:
|
|
|
|
```python
def test_supporting_file_merge_adopts_upstream_when_local_is_unchanged() -> None:
    plan = merge_supporting_file_trees(
        base={"a.txt": "A"}, local={"a.txt": "A"}, upstream={"a.txt": "U"}
    )
    assert plan.files["a.txt"].source == "upstream"
    assert plan.conflicts == []


def test_supporting_file_merge_blocks_divergent_edits() -> None:
    plan = merge_supporting_file_trees(
        base={"a.txt": "A"}, local={"a.txt": "L"}, upstream={"a.txt": "U"}
    )
    assert plan.conflicts[0].path == "a.txt"
```
|
|
|
|
- [ ] **Step 2: Run tests and verify failure**
|
|
|
|
```bash
|
|
cd app-instance/backend
|
|
pytest tests/unit/test_plugin_skill_learning.py tests/unit/test_skill_learning_synthesizer_preservation.py -q
|
|
```
|
|
|
|
Expected: FAIL because three-way synthesis does not exist.
|
|
|
|
- [ ] **Step 3: Add `synthesize_plugin_update()`**
|
|
|
|
Signature:
|
|
|
|
```python
async def synthesize_plugin_update(
    self,
    candidate: SkillLearningCandidate,
    evidence_packet: EvidencePacket,
    provider: LLMProvider,
    model: str,
    *,
    old_upstream: dict[str, Any],
    current_local: dict[str, Any],
    new_upstream: dict[str, Any],
) -> dict[str, Any]:
    ...
```
|
|
|
|
The system message must require JSON only and state:
|
|
|
|
- preserve valid local learning;
|
|
- adopt upstream fixes and safety changes;
|
|
- do not concatenate duplicate sections;
|
|
- list every intentional drop;
|
|
- leave `resolved_conflicts` empty only when no semantic conflict exists.
|
|
|
|
- [ ] **Step 4: Load all three snapshots in the learning service**
|
|
|
|
Resolve:
|
|
|
|
- `B` using `base_upstream_tree_hash`;
|
|
- `L` using `local_version`;
|
|
- `U` using `new_upstream_tree_hash`.
|
|
|
|
Raise a specific `ValueError` when any referenced snapshot/version is missing. Do not
fall back to a two-way merge.
|
|
|
|
- [ ] **Step 5: Build the deterministic supporting-file merge plan**
|
|
|
|
Compare files by path and content/executable digest:
|
|
|
|
- `L == B`: use `U`;
|
|
- `U == B`: use `L`;
|
|
- `L == U`: use either;
|
|
- one-sided addition: use the added file;
|
|
- divergent edit, different same-path additions, and delete-versus-edit: conflict.
|
|
|
|
Exclude `SKILL.md` because the synthesizer handles it. Store selected source references
|
|
and conflict records in draft provenance; do not duplicate file bytes in JSON.
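
The rules above can be sketched as a small three-way merge over path-to-digest maps. This is an illustrative sketch only: `FileChoice`, `FileConflict`, and `MergePlan` are hypothetical shapes chosen to match the assertions in Step 1, each map value is assumed to be an opaque content/executable digest, and a missing key means the file is absent on that side.

```python
from dataclasses import dataclass, field


@dataclass
class FileChoice:
    source: str  # "local", "upstream", or "deleted"


@dataclass
class FileConflict:
    path: str
    reason: str


@dataclass
class MergePlan:
    files: dict[str, FileChoice] = field(default_factory=dict)
    conflicts: list[FileConflict] = field(default_factory=list)


def merge_supporting_file_trees(*, base: dict, local: dict, upstream: dict) -> MergePlan:
    """Three-way merge of {path: digest} maps; no file bytes are copied."""
    plan = MergePlan()
    for path in sorted(set(base) | set(local) | set(upstream)):
        if path == "SKILL.md":
            continue  # merged by the synthesizer, not by this plan
        b, loc, up = base.get(path), local.get(path), upstream.get(path)
        if loc == up:
            chosen = "local"  # both sides agree, so either works
        elif loc == b:
            chosen = "upstream"  # only upstream changed
        elif up == b:
            chosen = "local"  # only local changed
        else:
            # Divergent edit, differing additions, or delete-versus-edit.
            plan.conflicts.append(FileConflict(path, "divergent"))
            continue
        digest = loc if chosen == "local" else up
        plan.files[path] = FileChoice("deleted" if digest is None else chosen)
    return plan
```

Because absent files compare as `None`, one-sided additions and one-sided deletions fall out of the same three comparisons without extra branches.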
|
|
|
|
- [ ] **Step 6: Create the plugin update draft**
|
|
|
|
Store merge decisions in draft provenance:
|
|
|
|
```python
{
    **plugin_reference_fields,
    "merge_mode": "three_way",
    "preserved_local_sections": payload["preserved_local_sections"],
    "adopted_upstream_sections": payload["adopted_upstream_sections"],
    "resolved_conflicts": payload["resolved_conflicts"],
    "dropped_sections": payload["dropped_sections"],
    "supporting_file_plan": supporting_file_plan.to_dict(),
}
```
|
|
|
|
If the supporting-file plan contains conflicts, the draft may be inspected but cannot be
|
|
published. V1 does not ask the LLM to merge arbitrary or binary files.
|
|
|
|
- [ ] **Step 7: Run focused tests**
|
|
|
|
```bash
|
|
cd app-instance/backend
|
|
pytest tests/unit/test_plugin_skill_learning.py tests/unit/test_skill_learning_synthesizer_preservation.py -q
|
|
```
|
|
|
|
Expected: PASS.
|
|
|
|
- [ ] **Step 8: Commit**
|
|
|
|
```bash
|
|
git add app-instance/backend/beaver/plugins/tree_merge.py app-instance/backend/beaver/skills/learning app-instance/backend/tests/unit/test_plugin_skill_learning.py app-instance/backend/tests/unit/test_skill_learning_synthesizer_preservation.py
|
|
git commit -m "feat(skill-learning): synthesize three-way plugin updates"
|
|
```
|
|
|
|
---
|
|
|
|
### Task 8: Extend Replay Preservation For Plugin Merges
|
|
|
|
**Files:**
|
|
- Modify: `app-instance/backend/beaver/skills/learning/preservation.py`
|
|
- Modify: `app-instance/backend/beaver/skills/learning/eval.py`
|
|
- Modify: `app-instance/backend/beaver/skills/learning/pipeline.py`
|
|
- Test: `app-instance/backend/tests/unit/test_skill_learning_preservation.py`
|
|
- Test: `app-instance/backend/tests/unit/test_skill_learning_eval.py`
|
|
- Test: `app-instance/backend/tests/unit/test_skill_learning_pipeline.py`
|
|
|
|
- [ ] **Step 1: Write failing plugin merge preservation tests**
|
|
|
|
Cover:
|
|
|
|
- merged draft preserves local Safety and adopts new upstream Safety;
|
|
- silently dropping either Safety section fails;
|
|
- explicitly resolved non-safety conflicts pass;
|
|
- unresolved conflicts block publish;
|
|
- unresolved supporting-file conflicts block publish;
|
|
- baseline replay remains current local `L`.
|
|
|
|
Expected report shape:
|
|
|
|
```python
assert report.preservation_report == {
    "mode": "plugin_three_way",
    "passed": True,
    "local": {...},
    "upstream": {...},
    "unresolved_conflicts": [],
}
```
|
|
|
|
- [ ] **Step 2: Run tests and verify failure**
|
|
|
|
```bash
|
|
cd app-instance/backend
|
|
pytest tests/unit/test_skill_learning_preservation.py tests/unit/test_skill_learning_eval.py tests/unit/test_skill_learning_pipeline.py -q
|
|
```
|
|
|
|
Expected: FAIL because preservation only checks one base skill.
|
|
|
|
- [ ] **Step 3: Add plugin merge preservation helper**
|
|
|
|
Add:
|
|
|
|
```python
def check_plugin_merge_preservation(
    *,
    local_content: str,
    upstream_content: str,
    draft_content: str,
    merge_decisions: dict[str, Any],
) -> dict[str, Any]:
    ...
```
|
|
|
|
It calls existing `check_preservation()` for local and upstream content, gives Safety and
|
|
Required Tools sections blocking weight, and reports unresolved conflicts separately.
|
|
|
|
- [ ] **Step 4: Use current local as replay baseline**
|
|
|
|
When `draft.proposal_kind == "plugin_skill_update"`, load `draft.base_version` as the
|
|
baseline skill. Continue to run the candidate arm with the draft context. Do not use raw
|
|
upstream `B` or `U` as the replay baseline.
|
|
|
|
- [ ] **Step 5: Tighten publish gate**
|
|
|
|
Add:
|
|
|
|
```python
if draft.proposal_kind == "plugin_skill_update":
    preservation = eval_report.preservation_report or {}
    if preservation.get("mode") != "plugin_three_way" and draft.provenance.get("merge_mode") == "three_way":
        raise ValueError("Plugin update requires a three-way preservation report")
    if preservation.get("unresolved_conflicts"):
        raise ValueError("Plugin update has unresolved merge conflicts")
    if draft.provenance.get("supporting_file_plan", {}).get("conflicts"):
        raise ValueError("Plugin update has unresolved supporting-file conflicts")
```
|
|
|
|
The existing `passed is False` gate remains active.
|
|
|
|
- [ ] **Step 6: Run focused tests**
|
|
|
|
```bash
|
|
cd app-instance/backend
|
|
pytest tests/unit/test_skill_learning_preservation.py tests/unit/test_skill_learning_eval.py tests/unit/test_skill_learning_pipeline.py -q
|
|
```
|
|
|
|
Expected: PASS.
|
|
|
|
- [ ] **Step 7: Commit**
|
|
|
|
```bash
|
|
git add app-instance/backend/beaver/skills/learning app-instance/backend/tests/unit/test_skill_learning_preservation.py app-instance/backend/tests/unit/test_skill_learning_eval.py app-instance/backend/tests/unit/test_skill_learning_pipeline.py
|
|
git commit -m "feat(skill-learning): gate plugin merge preservation"
|
|
```
|
|
|
|
---
|
|
|
|
### Task 9: Reconcile Publication And Implement Pause/Disable/Adopt
|
|
|
|
**Files:**
|
|
- Modify: `app-instance/backend/beaver/plugins/skills.py`
|
|
- Modify: `app-instance/backend/beaver/skills/learning/pipeline.py`
|
|
- Modify: `app-instance/backend/beaver/skills/publisher/service.py`
|
|
- Test: `app-instance/backend/tests/unit/test_plugin_skill_sync.py`
|
|
- Test: `app-instance/backend/tests/unit/test_skill_learning_pipeline.py`
|
|
|
|
- [ ] **Step 1: Write failing lifecycle tests**
|
|
|
|
Test:
|
|
|
|
- publishing a plugin update advances accepted upstream tree hash;
|
|
- pending candidate clears;
|
|
- simulated observer failure leaves the published version intact;
|
|
- the next sync reconciles state from current version provenance and does not recreate the
|
|
candidate;
|
|
- reconciliation never moves `accepted_beaver_version` backwards after rollback;
|
|
- pause leaves linked skills active and creates no update candidates;
|
|
- resume reconciles and syncs;
|
|
- disabling plugin disables linked skills without deletion;
|
|
- re-enable restores and syncs;
|
|
- missing package sets plugin status `missing`, suspends sync, and leaves linked skills
|
|
active;
|
|
- adopt changes `source_kind` to `managed`, removes binding, and keeps the skill active.
|
|
|
|
- [ ] **Step 2: Run tests and verify failure**
|
|
|
|
```bash
|
|
cd app-instance/backend
|
|
pytest tests/unit/test_plugin_skill_sync.py tests/unit/test_skill_learning_pipeline.py -q
|
|
```
|
|
|
|
Expected: FAIL because publication has no plugin acknowledgement callback.
|
|
|
|
- [ ] **Step 3: Add a narrow publication observer**
|
|
|
|
Extend pipeline construction with:
|
|
|
|
```python
|
|
publish_observer: Callable[[SkillDraft, SkillVersion | SkillSpec], None] | None = None
|
|
```
|
|
|
|
After successful publish, call it before returning. Observer failure must be recorded and
|
|
audited as `plugin_publish_ack_failed`; it must not delete the already-published version
|
|
or turn the publish API response into a failure. Mark the learning candidate published
|
|
before invoking the best-effort observer so clients do not retry a successful publish.
|
|
The next sync is responsible for reconciliation.
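
The ordering described above (publish, mark the candidate published, then notify on a best-effort basis) might look like the sketch below. All four callables are hypothetical parameters introduced for illustration; only the `plugin_publish_ack_failed` event name comes from this plan.

```python
from typing import Any, Callable, Optional


def publish_and_notify(
    publish: Callable[[], Any],
    mark_candidate_published: Callable[[], None],
    audit: Callable[[str, str], None],
    publish_observer: Optional[Callable[[Any], None]] = None,
) -> Any:
    """Publish first; an observer failure is audited, never propagated."""
    published = publish()
    # Mark the candidate before the observer runs so clients do not retry.
    mark_candidate_published()
    if publish_observer is not None:
        try:
            publish_observer(published)
        except Exception as exc:  # best-effort acknowledgement
            audit("plugin_publish_ack_failed", str(exc))
    return published
```

The caller still receives the published version when the observer raises, which leaves the repair to sync-time reconciliation.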
|
|
|
|
- [ ] **Step 4: Implement `PluginManager.on_skill_published()`**
|
|
|
|
For `proposal_kind="plugin_skill_update"`:
|
|
|
|
1. validate plugin ID, skill name, and new upstream tree hash from draft provenance;
|
|
2. set `accepted_upstream_tree_hash = new_upstream_tree_hash`;
|
|
3. set `observed_upstream_tree_hash = new_upstream_tree_hash`;
|
|
4. set `accepted_beaver_version = published.version`;
|
|
5. set `current_beaver_version = published.version`;
|
|
6. clear `pending_candidate_id`;
|
|
7. set status `synced`.
|
|
|
|
- [ ] **Step 5: Implement sync-time reconciliation**
|
|
|
|
At the beginning of `sync_enabled()`, inspect each linked skill's current published
|
|
version. When provenance contains:
|
|
|
|
```python
{
    "proposal_kind": "plugin_skill_update",
    "plugin_id": plugin_id,
    "new_upstream_tree_hash": tree_hash,
}
```
|
|
|
|
and the referenced upstream snapshot exists, advance state only if the current version
|
|
number is newer than `accepted_beaver_version`. Clear only the matching pending candidate.
|
|
Never regress state when the runtime current pointer was rolled back to an older version.
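
A minimal sketch of the forward-only rule, under several assumptions: the binding is modeled as a plain dict, version labels look like `v0003`, and a pending candidate "matches" when its ID equals the deterministic ID derived from the same tree hash. `reconcile_binding()` and `version_number()` are hypothetical names.

```python
def version_number(label: str) -> int:
    """Assumes version labels such as "v0003"."""
    return int(label.lstrip("v"))


def reconcile_binding(
    binding: dict,
    *,
    plugin_id: str,
    skill_name: str,
    current_version: str,
    provenance: dict,
    known_snapshots: set,
) -> bool:
    """Advance binding state from published provenance; True when state moved."""
    if provenance.get("proposal_kind") != "plugin_skill_update":
        return False
    if provenance.get("plugin_id") != plugin_id:
        return False
    tree_hash = provenance.get("new_upstream_tree_hash")
    if not tree_hash or tree_hash not in known_snapshots:
        return False
    accepted = binding.get("accepted_beaver_version")
    if accepted is not None and version_number(current_version) <= version_number(accepted):
        return False  # rolled back or already acknowledged: never regress
    binding["accepted_upstream_tree_hash"] = tree_hash
    binding["accepted_beaver_version"] = current_version
    binding["current_beaver_version"] = current_version
    expected = f"plugin-update:{plugin_id}:{skill_name}:{tree_hash[:12]}"
    if binding.get("pending_candidate_id") == expected:
        binding["pending_candidate_id"] = None  # clear only the matching one
    return True
```

Comparing numeric versions rather than strings keeps the check independent of label padding.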
|
|
|
|
- [ ] **Step 6: Implement pause, resume, disable, missing, and adopt**
|
|
|
|
`pause(plugin_id)` sets `updates_paused=True` and leaves linked skills unchanged.
|
|
`resume(plugin_id)` clears the flag and performs reconciliation/sync.
|
|
|
|
`disable(plugin_id, disable_linked_skills=True)` rejects calls without the explicit
confirmation and calls `SkillPublisher.disable()` for every still-linked skill.

`adopt(plugin_id, skill_name)`:
|
|
|
|
- requires an existing binding;
|
|
- changes `SkillSpec.source_kind` to `managed`;
|
|
- appends `adopted_from_plugin:<plugin-id>` to lineage;
|
|
- removes the binding;
|
|
- leaves the current version active.
|
|
|
|
When discovery cannot find a previously known plugin, set status `missing`, preserve
|
|
`enabled` and `updates_paused`, skip update generation, and do not disable any linked
|
|
skill.
|
|
|
|
- [ ] **Step 7: Run focused tests**
|
|
|
|
```bash
|
|
cd app-instance/backend
|
|
pytest tests/unit/test_plugin_skill_sync.py tests/unit/test_skill_learning_pipeline.py -q
|
|
```
|
|
|
|
Expected: PASS.
|
|
|
|
- [ ] **Step 8: Commit**
|
|
|
|
```bash
|
|
git add app-instance/backend/beaver/plugins/skills.py app-instance/backend/beaver/skills/learning/pipeline.py app-instance/backend/beaver/skills/publisher/service.py app-instance/backend/tests/unit/test_plugin_skill_sync.py app-instance/backend/tests/unit/test_skill_learning_pipeline.py
|
|
git commit -m "feat(plugins): track published updates and ownership"
|
|
```
|
|
|
|
---
|
|
|
|
### Task 10: Wire Plugin Sync Into Engine Loading
|
|
|
|
**Files:**
|
|
- Modify: `app-instance/backend/beaver/engine/loader.py`
|
|
- Modify: `app-instance/backend/beaver/plugins/__init__.py`
|
|
- Test: `app-instance/backend/tests/unit/test_plugin_runtime.py`
|
|
- Test: `app-instance/backend/tests/unit/test_phase5_skills_runtime.py`
|
|
|
|
- [ ] **Step 1: Write failing runtime assembly tests**
|
|
|
|
Test:
|
|
|
|
- discovered disabled plugins do not mirror;
|
|
- enabled plugin mirrors before `EngineLoadResult.skills` is calculated;
|
|
- changed plugin creates a candidate but never calls an LLM during boot;
|
|
- repeated boot creates no duplicate versions/candidates;
|
|
- concurrent multi-process boot creates no duplicate versions/candidates;
|
|
- boot skips auto-sync and reports `deferred_lock_busy` when an explicit sync holds the
|
|
workspace lock;
|
|
- `EngineLoadResult.plugin_manager` and plugin summaries are available.
|
|
|
|
- [ ] **Step 2: Run tests and verify failure**
|
|
|
|
```bash
|
|
cd app-instance/backend
|
|
pytest tests/unit/test_plugin_runtime.py tests/unit/test_phase5_skills_runtime.py -q
|
|
```
|
|
|
|
Expected: FAIL because `EngineLoader` does not assemble plugin services.
|
|
|
|
- [ ] **Step 3: Extend `EngineLoadResult` and loader injection**
|
|
|
|
Add:
|
|
|
|
```python
|
|
plugin_manager: PluginManager | None = None
|
|
plugins: list[dict] = field(default_factory=list)
|
|
```
|
|
|
|
Allow `plugin_manager` injection in `EngineLoader.__init__()` for tests.
|
|
|
|
- [ ] **Step 4: Assemble in dependency order**
|
|
|
|
Required order:
|
|
|
|
1. config/workspace;
|
|
2. `SkillSpecStore`, learning store, and `SkillsLoader`;
|
|
3. tool registry and builtins, including skill-view tools using that loader;
|
|
4. draft/review/publisher and a safety checker using the completed tool registry;
|
|
5. discovery and `PluginStateStore`;
|
|
6. `PluginManager`;
|
|
7. `plugin_manager.sync_enabled(blocking=False)` when `config.plugins.auto_sync`;
|
|
8. learning service/pipeline with publication observer;
|
|
9. result summaries.
|
|
|
|
Do not use `SkillsLoader.extra_dirs` for plugin skills. Explicit API enable/sync uses a
|
|
bounded blocking lock timeout; Engine boot uses a non-blocking attempt and proceeds with
|
|
the current published skill set if another writer owns the lock.
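
The two locking modes can be sketched with a standard-library lock. This is an in-process analogue only: `run_sync()` is a hypothetical helper, and the plan's `WorkspaceWriteLock` is assumed to provide the same acquire semantics across processes.

```python
import threading


def run_sync(lock, do_sync, *, blocking: bool, timeout: float = 5.0) -> str:
    """Boot path: try once and defer. Explicit path: wait up to `timeout`."""
    if blocking:
        if not lock.acquire(timeout=timeout):
            # The API layer maps this to 409 plugin_write_busy.
            raise TimeoutError("plugin_write_busy")
    elif not lock.acquire(blocking=False):
        # Another writer owns the lock; boot proceeds with published skills.
        return "deferred_lock_busy"
    try:
        do_sync()
        return "synced"
    finally:
        lock.release()
```

With this split, engine boot never stalls behind an explicit sync, while an API caller gets a bounded wait and a clear busy error.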
|
|
|
|
- [ ] **Step 5: Run runtime tests**
|
|
|
|
```bash
|
|
cd app-instance/backend
|
|
pytest tests/unit/test_plugin_runtime.py tests/unit/test_phase5_skills_runtime.py -q
|
|
```
|
|
|
|
Expected: PASS.
|
|
|
|
- [ ] **Step 6: Commit**
|
|
|
|
```bash
|
|
git add app-instance/backend/beaver/engine/loader.py app-instance/backend/beaver/plugins app-instance/backend/tests/unit/test_plugin_runtime.py app-instance/backend/tests/unit/test_phase5_skills_runtime.py
|
|
git commit -m "feat(runtime): sync declarative plugins at boot"
|
|
```
|
|
|
|
---
|
|
|
|
### Task 11: Add Plugin Management API
|
|
|
|
**Files:**
|
|
- Modify: `app-instance/backend/beaver/interfaces/web/app.py`
|
|
- Test: `app-instance/backend/tests/unit/test_plugin_web_api.py`
|
|
|
|
- [ ] **Step 1: Write failing API tests**
|
|
|
|
Cover:
|
|
|
|
```text
|
|
GET /api/plugins
|
|
POST /api/plugins/sync
|
|
POST /api/plugins/{plugin_id}/enable
|
|
POST /api/plugins/{plugin_id}/pause
|
|
POST /api/plugins/{plugin_id}/resume
|
|
POST /api/plugins/{plugin_id}/disable
|
|
POST /api/plugins/{plugin_id}/skills/{skill_name}/adopt
|
|
```
|
|
|
|
Assert `404` for unknown plugin, `409` for skill ownership conflict, and `400` for invalid
|
|
manifest/sync errors. Assert lock timeout maps to `409 plugin_write_busy`. Assert no
|
|
payload contains the real absolute workspace or external search-root path. Assert disable
|
|
without `{"disable_linked_skills": true}` is rejected.
|
|
|
|
- [ ] **Step 2: Run tests and verify failure**
|
|
|
|
```bash
|
|
cd app-instance/backend
|
|
pytest tests/unit/test_plugin_web_api.py -q
|
|
```
|
|
|
|
Expected: FAIL with missing routes.
|
|
|
|
- [ ] **Step 3: Add normalized plugin payload helper**
|
|
|
|
Return:
|
|
|
|
```python
{
    "id": manifest.plugin_id,
    "name": manifest.name,
    "discovered_version": manifest.version,
    "installed_version": state.installed_version,
    "enabled": state.enabled,
    "status": state.status,
    "last_error": state.last_error,
    "manifest_path": manifest.display_path,
    "updates_paused": state.updates_paused,
    "skills": [
        {
            "name": declaration.name,
            "status": binding.status,
            "current_beaver_version": binding.current_beaver_version,
            "accepted_upstream_tree_hash": binding.accepted_upstream_tree_hash,
            "observed_upstream_tree_hash": binding.observed_upstream_tree_hash,
            "accepted_beaver_version": binding.accepted_beaver_version,
            "pending_candidate_id": binding.pending_candidate_id,
        }
    ],
}
```
|
|
|
|
Never return arbitrary plugin file content, secrets, or absolute server paths.
|
|
|
|
- [ ] **Step 4: Implement routes**
|
|
|
|
Each mutating endpoint boots one runtime, invokes its `plugin_manager`, and returns the
|
|
updated plugin payload. Map `ValueError` messages to stable HTTP status codes.
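
One way to keep the status mapping stable is to key it on a short error code rather than on free-form text. The sketch below assumes, hypothetically, that manager errors are raised as `ValueError("<code>: <detail>")`; the codes `plugin_not_found` and `skill_ownership_conflict` are invented for illustration, and only `plugin_write_busy` is named in this plan.

```python
# Hypothetical error codes; unknown codes fall back to 400.
STATUS_BY_CODE = {
    "plugin_not_found": 404,
    "skill_ownership_conflict": 409,
    "plugin_write_busy": 409,
}


def status_for_error(error: Exception) -> tuple[int, str]:
    """Return (HTTP status, stable error code) for a plugin-manager error."""
    if isinstance(error, TimeoutError):
        return 409, "plugin_write_busy"
    code = str(error).split(":", 1)[0].strip()
    if code in STATUS_BY_CODE:
        return STATUS_BY_CODE[code], code
    # Invalid manifest and sync errors share the generic 400 response.
    return 400, "plugin_error"
```

A route handler would catch the exception, call `status_for_error()`, and pass the pair to the web framework's error response, keeping message wording free to change without breaking clients.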
|
|
|
|
- [ ] **Step 5: Run focused and existing web tests**
|
|
|
|
```bash
|
|
cd app-instance/backend
|
|
pytest tests/unit/test_plugin_web_api.py tests/unit/test_skill_learning_web_api.py -q
|
|
```
|
|
|
|
Expected: PASS.
|
|
|
|
- [ ] **Step 6: Commit**
|
|
|
|
```bash
|
|
git add app-instance/backend/beaver/interfaces/web/app.py app-instance/backend/tests/unit/test_plugin_web_api.py
|
|
git commit -m "feat(api): manage declarative plugins"
|
|
```
|
|
|
|
---
|
|
|
|
### Task 12: Add Plugin Management To The Skills UI
|
|
|
|
**Files:**
|
|
- Modify: `app-instance/frontend/types/index.ts`
|
|
- Modify: `app-instance/frontend/lib/api.ts`
|
|
- Modify: `app-instance/frontend/app/(app)/skills/page.tsx`
|
|
- Test: `app-instance/frontend/lib/plugin-api.test.ts`
|
|
|
|
- [ ] **Step 1: Write failing API client tests**
|
|
|
|
Test URL, method, and response typing for list, sync, enable, pause, resume, disable, and
|
|
adopt.
|
|
|
|
- [ ] **Step 2: Run frontend test and verify failure**
|
|
|
|
Run the repository's existing frontend test command targeting:
|
|
|
|
```bash
|
|
cd app-instance/frontend
|
|
npx vitest run lib/plugin-api.test.ts
|
|
```
|
|
|
|
Expected: FAIL because plugin API functions do not exist.
|
|
|
|
- [ ] **Step 3: Add frontend types**
|
|
|
|
Add:
|
|
|
|
```typescript
export interface PluginSkillBinding {
  name: string;
  status: string;
  current_beaver_version?: string | null;
  accepted_upstream_tree_hash?: string | null;
  observed_upstream_tree_hash?: string | null;
  accepted_beaver_version?: string | null;
  pending_candidate_id?: string | null;
}

export interface BeaverPlugin {
  id: string;
  name: string;
  discovered_version?: string | null;
  installed_version?: string | null;
  enabled: boolean;
  updates_paused: boolean;
  status: string;
  last_error?: string | null;
  manifest_path?: string | null;
  skills: PluginSkillBinding[];
}
```
|
|
|
|
- [ ] **Step 4: Add API functions**
|
|
|
|
Implement:
|
|
|
|
```typescript
|
|
listPlugins()
|
|
syncPlugins()
|
|
enablePlugin(pluginId)
|
|
pausePlugin(pluginId)
|
|
resumePlugin(pluginId)
|
|
disablePlugin(pluginId, { disable_linked_skills: true })
|
|
adoptPluginSkill(pluginId, skillName)
|
|
```
|
|
|
|
- [ ] **Step 5: Add a `plugins` Skills tab**
|
|
|
|
Extend `SkillsTab` and render a compact table with:
|
|
|
|
- plugin name and versions;
|
|
- enabled/status badges;
|
|
- linked skills and pending candidate link;
|
|
- icon buttons with tooltips for sync, enable, pause, resume, disable, and adopt;
|
|
- confirmation before disable/adopt;
|
|
- missing-source warning stating that current skills remain active but updates are
|
|
suspended;
|
|
- existing `runAction()` and error handling.
|
|
|
|
Do not add a separate marketing-style page or nested cards.
|
|
|
|
- [ ] **Step 6: Label plugin-origin skills and update candidates**
|
|
|
|
In existing Published/Candidates/Drafts views:
|
|
|
|
- show `Plugin` source badge when `source_kind === "plugin"`;
|
|
- render `plugin_skill_update` as `插件升级合并 / Plugin update merge`;
|
|
- show `fast_forward` or `three_way` from candidate evidence/provenance.
|
|
|
|
- [ ] **Step 7: Run frontend tests and type checks**
|
|
|
|
```bash
|
|
cd app-instance/frontend
|
|
npx vitest run lib/plugin-api.test.ts
|
|
npm run lint
|
|
npx tsc --noEmit
|
|
```
|
|
|
|
Expected: PASS.
|
|
|
|
- [ ] **Step 8: Commit**
|
|
|
|
```bash
|
|
git add app-instance/frontend/types/index.ts app-instance/frontend/lib/api.ts app-instance/frontend/lib/plugin-api.test.ts 'app-instance/frontend/app/(app)/skills/page.tsx'
|
|
git commit -m "feat(skills-ui): manage plugin skill mirrors"
|
|
```
|
|
|
|
---
|
|
|
|
### Task 13: Add End-To-End Lifecycle Coverage And Documentation
|
|
|
|
**Files:**
|
|
- Create: `app-instance/backend/tests/integration/test_plugin_skill_lifecycle.py`
|
|
- Create: `docs/plugins/skill-plugins.md`
|
|
- Modify: `docs/product-discovery/beaver/README.md`
|
|
|
|
- [ ] **Step 1: Write the end-to-end lifecycle test**
|
|
|
|
The test must:
|
|
|
|
1. create plugin `1.0.0`;
|
|
2. enable it and assert mirror `v0001`;
|
|
3. publish a normal learned local revision `v0002`;
|
|
4. replace the package with plugin `1.1.0`;
|
|
5. sync and assert one `three_way` candidate;
|
|
6. synthesize with a stub provider;
|
|
7. run safety and replay evaluation with a stub runner;
|
|
8. submit, approve, and publish `v0003`;
|
|
9. assert accepted upstream tree hash and provenance advanced;
|
|
10. rollback to `v0002`;
|
|
11. assert plugin source files were never modified;
|
|
12. update only a supporting file and assert a new update candidate is created;
|
|
13. simulate publish-observer failure and assert the next sync reconciles state;
|
|
14. remove the plugin package and assert the plugin is `missing` while the current skill
|
|
remains active;
|
|
15. run two sync processes and assert no duplicate version or candidate is created.
|
|
|
|
- [ ] **Step 2: Run the integration test and fix only lifecycle defects**
|
|
|
|
```bash
|
|
cd app-instance/backend
|
|
pytest tests/integration/test_plugin_skill_lifecycle.py -v
|
|
```
|
|
|
|
Expected: PASS.
|
|
|
|
- [ ] **Step 3: Write operator documentation**
|
|
|
|
Document:
|
|
|
|
- package layout and manifest;
|
|
- discovery roots;
|
|
- explicit enable requirement;
|
|
- mirror and three-way merge behavior;
|
|
- dual content/tree hashing and supporting-file merge conflicts;
|
|
- update candidate review flow;
|
|
- pause/resume versus disable/adopt;
|
|
- recovery from missing/invalid plugins;
|
|
- workspace locking, deferred boot sync, and publication reconciliation;
|
|
- why plugin Python code is not executed in V1.
|
|
|
|
- [ ] **Step 4: Run the complete relevant backend suite**
|
|
|
|
```bash
cd app-instance/backend
pytest \
  tests/unit/test_plugin_manifest.py \
  tests/unit/test_plugin_hashing.py \
  tests/unit/test_plugin_state.py \
  tests/unit/test_workspace_write_lock.py \
  tests/unit/test_plugin_skill_storage.py \
  tests/unit/test_plugin_skill_sync.py \
  tests/unit/test_plugin_skill_learning.py \
  tests/unit/test_plugin_runtime.py \
  tests/unit/test_plugin_web_api.py \
  tests/unit/test_skill_learning_candidate_state.py \
  tests/unit/test_skill_learning_pipeline.py \
  tests/unit/test_skill_learning_eval.py \
  tests/unit/test_skill_learning_worker.py \
  tests/unit/test_phase5_skills_runtime.py \
  tests/integration/test_plugin_skill_lifecycle.py \
  -q
```
|
|
|
|
Expected: PASS.
|
|
|
|
- [ ] **Step 5: Run frontend verification**
|
|
|
|
```bash
|
|
cd app-instance/frontend
|
|
npx vitest run lib/plugin-api.test.ts
|
|
npm run lint
|
|
npx tsc --noEmit
|
|
```
|
|
|
|
Expected: PASS.
|
|
|
|
- [ ] **Step 6: Run a dirty-worktree-safe diff review**
|
|
|
|
```bash
|
|
git status --short
|
|
git diff --check
|
|
git diff --stat
|
|
```
|
|
|
|
Expected:
|
|
|
|
- no whitespace errors;
|
|
- only plugin/skill lifecycle files and planned docs/tests are included in this feature;
|
|
- unrelated pre-existing user changes remain untouched.
|
|
|
|
- [ ] **Step 7: Commit**
|
|
|
|
```bash
|
|
git add app-instance/backend/tests/integration/test_plugin_skill_lifecycle.py docs/plugins/skill-plugins.md docs/product-discovery/beaver/README.md
|
|
git commit -m "docs(plugins): document skill mirror lifecycle"
|
|
```
|
|
|
|
---
|
|
|
|
## Release Sequence
|
|
|
|
1. Ship backend manifest, state, snapshots, and initial mirror behind the Plugins API.
|
|
2. Enable update candidate generation after initial mirror tests pass in a real workspace.
|
|
3. Enable three-way synthesis and replay publish gates.
|
|
4. Ship the Plugins UI.
|
|
5. Keep executable plugin code disabled; design it separately with process isolation and
|
|
permission boundaries.
|
|
|
|
## Rollout Metrics
|
|
|
|
Track:
|
|
|
|
- plugin discovery and manifest error count;
|
|
- initial mirror success/failure count;
|
|
- plugin update candidates created, superseded, rejected, and published;
|
|
- plugin update candidates caused by supporting-file-only changes;
|
|
- fast-forward versus three-way update ratio;
|
|
- write-lock contention and deferred boot sync count;
|
|
- publication reconciliation repair count;
|
|
- replay regression and preservation failure rate;
|
|
- time from upstream discovery to accepted publication;
|
|
- rollback count for plugin-origin versions.
|
|
|
|
## Final Acceptance Test
|
|
|
|
The feature is complete only when a plugin-origin skill can:
|
|
|
|
1. be enabled and used with normal skill priority;
|
|
2. accumulate a normal Beaver-learned revision;
|
|
3. receive a newer upstream plugin version;
|
|
4. produce a three-way update draft without editing the plugin package;
|
|
5. pass the same safety, replay, review, and publish gates as ordinary skills;
|
|
6. retain full upstream and local provenance;
|
|
7. detect and publish supporting-file-only updates;
|
|
8. survive concurrent boot/sync without duplicate versions or candidates;
|
|
9. recover plugin state after observer failure;
|
|
10. remain active when its plugin package is temporarily missing;
|
|
11. be paused, resumed, rolled back, disabled, re-enabled, or adopted without data loss.
|