Plugin Skill Mirroring And Upgrade Learning Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.
Goal: Add declarative Beaver plugins whose skills are mirrored as normal managed skills, learn normally, and merge plugin upgrades through the existing safety, replay evaluation, review, publish, and rollback lifecycle.
Architecture: A new beaver.plugins package discovers and validates beaver.plugin.json, computes content and full-tree hashes, persists enable/sync state, and stages immutable upstream/version trees before atomic promotion under a workspace write lock. Plugin upgrades become deterministic plugin_skill_update learning candidates using old upstream, current local, and new upstream inputs; the existing learning pipeline remains the only path for update publication, with sync-time reconciliation repairing failed state acknowledgements.
Tech Stack: Python dataclasses and file-backed JSON stores, existing SkillSpecStore and skill-learning pipeline, FastAPI, pytest, Next.js/TypeScript, existing shadcn UI components.
Scope
This plan implements declarative skill plugins only. Do not add Python plugin entrypoints, hooks, providers, channels, dependency installation, or marketplace download support. Plugin-provided tools continue to use MCP.
File Structure
Create focused plugin modules:
- app-instance/backend/beaver/plugins/models.py: manifest, discovery, state, and sync result dataclasses.
- app-instance/backend/beaver/plugins/manifest.py: JSON parsing, identifier validation, and contained-path validation.
- app-instance/backend/beaver/plugins/hashing.py: canonical skill-content and full-tree hashing.
- app-instance/backend/beaver/plugins/tree_merge.py: deterministic three-way supporting-file merge plans.
- app-instance/backend/beaver/plugins/state.py: atomic .beaver/plugins/state.json persistence.
- app-instance/backend/beaver/plugins/discovery.py: scan workspace and configured plugin roots.
- app-instance/backend/beaver/plugins/transaction.py: same-filesystem staging and immutable directory promotion.
- app-instance/backend/beaver/plugins/skills.py: initial mirror, update classification, candidate creation, reconciliation, pause/resume, disable, and adopt.
- app-instance/backend/beaver/plugins/__init__.py: public exports.
- app-instance/backend/beaver/foundation/utils/file_lock.py: reentrant cross-process workspace write lock.
Modify skill lifecycle modules:
- app-instance/backend/beaver/skills/specs/models.py: add upstream snapshot and draft provenance models.
- app-instance/backend/beaver/skills/specs/storage.py: persist immutable upstream snapshots and safely copy supporting files.
- app-instance/backend/beaver/memory/skills/store.py: lock candidate existence checks and JSONL mutations.
- app-instance/backend/beaver/skills/drafts/service.py: create plugin update drafts.
- app-instance/backend/beaver/skills/learning/service.py: synthesize plugin_skill_update.
- app-instance/backend/beaver/skills/learning/synthesizer.py: three-way plugin merge prompt and result.
- app-instance/backend/beaver/skills/learning/eval.py: plugin merge preservation report.
- app-instance/backend/beaver/skills/learning/pipeline.py: acknowledge successful plugin update publication.
- app-instance/backend/beaver/skills/publisher/service.py: carry draft provenance into published versions.
Modify runtime and management surfaces:
- app-instance/backend/beaver/foundation/config/schema.py
- app-instance/backend/beaver/foundation/config/loader.py
- app-instance/backend/beaver/engine/loader.py
- app-instance/backend/beaver/interfaces/web/app.py
- app-instance/frontend/types/index.ts
- app-instance/frontend/lib/api.ts
- app-instance/frontend/app/(app)/skills/page.tsx
Add tests:
- app-instance/backend/tests/unit/test_plugin_manifest.py
- app-instance/backend/tests/unit/test_plugin_hashing.py
- app-instance/backend/tests/unit/test_plugin_state.py
- app-instance/backend/tests/unit/test_workspace_write_lock.py
- app-instance/backend/tests/unit/test_plugin_skill_storage.py
- app-instance/backend/tests/unit/test_plugin_skill_sync.py
- app-instance/backend/tests/unit/test_plugin_skill_learning.py
- app-instance/backend/tests/unit/test_plugin_runtime.py
- app-instance/backend/tests/unit/test_plugin_web_api.py
- app-instance/frontend/lib/plugin-api.test.ts
Task 1: Add Plugin Configuration And Manifest Models
Files:
- Create: app-instance/backend/beaver/plugins/models.py
- Create: app-instance/backend/beaver/plugins/manifest.py
- Create: app-instance/backend/beaver/plugins/hashing.py
- Create: app-instance/backend/beaver/plugins/__init__.py
- Modify: app-instance/backend/beaver/foundation/config/schema.py
- Modify: app-instance/backend/beaver/foundation/config/loader.py
- Modify: app-instance/backend/beaver/foundation/config/__init__.py
- Test: app-instance/backend/tests/unit/test_plugin_manifest.py
- Test: app-instance/backend/tests/unit/test_plugin_hashing.py
- Test: app-instance/backend/tests/unit/test_config_loader.py
- Step 1: Write failing manifest validation tests
Create tests covering:
import json
from pathlib import Path

import pytest

# Assumed import path: the plan places manifest loading in beaver/plugins/manifest.py.
from beaver.plugins.manifest import load_plugin_manifest


def test_load_plugin_manifest_accepts_declared_skill(tmp_path: Path) -> None:
    root = tmp_path / "comic"
    (root / "skills" / "comic").mkdir(parents=True)
    (root / "skills" / "comic" / "SKILL.md").write_text("# Comic\n", encoding="utf-8")
    (root / "beaver.plugin.json").write_text(
        json.dumps(
            {
                "schema_version": 1,
                "id": "baoyu-comic",
                "name": "Baoyu Comic",
                "version": "1.2.0",
                "skills": [{"name": "baoyu-comic", "path": "skills/comic"}],
            }
        ),
        encoding="utf-8",
    )
    manifest = load_plugin_manifest(root / "beaver.plugin.json")
    assert manifest.plugin_id == "baoyu-comic"
    assert manifest.skills[0].name == "baoyu-comic"
    assert manifest.skills[0].root == root / "skills" / "comic"


@pytest.mark.parametrize("value", ["../outside", "/absolute", "skills/../../outside"])
def test_load_plugin_manifest_rejects_escaping_skill_path(tmp_path: Path, value: str) -> None:
    path = tmp_path / "beaver.plugin.json"
    path.write_text(
        json.dumps(
            {
                "schema_version": 1,
                "id": "unsafe",
                "name": "Unsafe",
                "version": "1.0.0",
                "skills": [{"name": "unsafe", "path": value}],
            }
        ),
        encoding="utf-8",
    )
    # The error message must mention containment so callers can tell why the path failed.
    with pytest.raises(ValueError, match="contained"):
        load_plugin_manifest(path)
Also test invalid IDs, duplicate skill names, unsupported schema versions, missing
SKILL.md, and symlinked skill roots.
Add tree-hash tests:
def test_skill_tree_hash_changes_when_supporting_file_changes(tmp_path: Path) -> None:
    root = tmp_path / "skill"
    root.mkdir()
    (root / "SKILL.md").write_text("# Skill\n", encoding="utf-8")
    (root / "templates").mkdir()
    template = root / "templates" / "report.md"
    template.write_text("v1", encoding="utf-8")
    first = hash_plugin_skill_tree(root)
    template.write_text("v2", encoding="utf-8")
    second = hash_plugin_skill_tree(root)
    assert first.skill_content_hash == second.skill_content_hash
    assert first.skill_tree_hash != second.skill_tree_hash
Also verify path changes and executable-bit changes affect skill_tree_hash, while mtime
and non-executable permission changes do not.
- Step 2: Run tests and verify failure
Run:
cd app-instance/backend
pytest tests/unit/test_plugin_manifest.py tests/unit/test_plugin_hashing.py tests/unit/test_config_loader.py -q
Expected: FAIL because beaver.plugins and PluginsConfig do not exist.
- Step 3: Implement immutable plugin models and config
Put plugin package models in beaver/plugins/models.py:
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True, slots=True)
class PluginSkillDeclaration:
    name: str
    relative_path: str
    root: Path


@dataclass(frozen=True, slots=True)
class PluginManifest:
    schema_version: int
    plugin_id: str
    name: str
    version: str
    root: Path
    manifest_path: Path
    display_path: str
    skills: tuple[PluginSkillDeclaration, ...]


@dataclass(frozen=True, slots=True)
class PluginSkillFileDigest:
    path: str
    size: int
    executable: bool
    content_hash: str


@dataclass(frozen=True, slots=True)
class PluginSkillTreeDigest:
    skill_content_hash: str
    skill_tree_hash: str
    files: tuple[PluginSkillFileDigest, ...]
Put configuration in beaver/foundation/config/schema.py to preserve the foundation layer
and avoid importing plugin runtime modules from config:
@dataclass(slots=True)
class PluginsConfig:
    search_paths: list[str] = field(default_factory=list)
    auto_sync: bool = True
Add plugins: PluginsConfig to BeaverConfig. Parse both camelCase and snake_case:
def _parse_plugins(raw: Any) -> PluginsConfig:
    data = _as_dict(raw)
    return PluginsConfig(
        search_paths=_string_list(data.get("searchPaths") or data.get("search_paths")),
        auto_sync=_bool(data.get("autoSync") if "autoSync" in data else data.get("auto_sync"), default=True),
    )
- Step 4: Implement strict JSON manifest loading
load_plugin_manifest() must:
- parse a JSON object;
- require schema version 1;
- validate identifiers with ^[a-z0-9][a-z0-9_-]*$;
- resolve every skill root and check resolved.is_relative_to(plugin_root);
- reject symlinks in the path from plugin root to skill root;
- require a regular SKILL.md;
- initialize display_path without exposing an absolute path;
- return frozen dataclasses.
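The identifier and containment checks listed above could look roughly like the following sketch. The helper names validate_identifier and resolve_contained_skill_root are hypothetical, and the error wording is chosen only so that it satisfies the "contained" match used in the test; the plan itself fixes just the regex and the is_relative_to() check. Rejecting any ".." segment outright is a stricter assumption than the plan requires.

```python
import re
import tempfile
from pathlib import Path

# Identifier pattern taken from the plan.
_IDENTIFIER = re.compile(r"^[a-z0-9][a-z0-9_-]*$")


def validate_identifier(value: str) -> str:
    """Return value unchanged when it matches the identifier pattern."""
    if not isinstance(value, str) or not _IDENTIFIER.fullmatch(value):
        raise ValueError(f"invalid identifier: {value!r}")
    return value


def resolve_contained_skill_root(plugin_root: Path, relative: str) -> Path:
    """Resolve a declared skill path and require it to stay inside plugin_root."""
    root = plugin_root.resolve()
    declared = Path(relative)
    if declared.is_absolute() or ".." in declared.parts:
        raise ValueError(f"skill path must be contained in the plugin root: {relative!r}")
    current = root
    for part in declared.parts:
        current = current / part
        # Reject symlinks on every component between plugin root and skill root.
        if current.is_symlink():
            raise ValueError(f"skill path must be contained and symlink-free: {relative!r}")
    resolved = current.resolve()
    if not resolved.is_relative_to(root):
        raise ValueError(f"skill path must be contained in the plugin root: {relative!r}")
    skill_md = resolved / "SKILL.md"
    if skill_md.is_symlink() or not skill_md.is_file():
        raise ValueError(f"missing regular SKILL.md under {relative!r}")
    return resolved


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        plugin_root = Path(tmp) / "comic"
        (plugin_root / "skills" / "comic").mkdir(parents=True)
        (plugin_root / "skills" / "comic" / "SKILL.md").write_text("# Comic\n", encoding="utf-8")
        print(resolve_contained_skill_root(plugin_root, "skills/comic"))
```

Checking each path component for a symlink before calling resolve() keeps the symlink rule separate from the containment rule, so a link that happens to point back inside the plugin root is still rejected.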
- Step 5: Implement deterministic dual hashing
hash_plugin_skill_tree(root) must:
- reject symlinks and non-regular files;
- enumerate regular files by normalized POSIX relative path;
- compute skill_content_hash from normalized SKILL.md;
- compute skill_tree_hash from each path, byte length, file bytes, and one normalized executable-bit flag;
- include SKILL.md and every supporting file;
- exclude Beaver metadata such as version.json and upstream.json;
- ignore mtime, uid/gid, and non-executable mode bits.
Use length-prefixed binary fields in the digest input instead of ambiguous string concatenation.
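A minimal sketch of the dual hash with length-prefixed fields follows. It is illustrative only: the function name hash_tree_sketch, the choice of SHA-256, the 8-byte big-endian length prefix, the CRLF-to-LF normalization of SKILL.md, and excluding metadata only at the skill root are all assumptions that the plan does not specify.

```python
import hashlib
import stat
import tempfile
from pathlib import Path

# Beaver metadata named in the plan; assumed here to live at the skill root.
EXCLUDED = frozenset({"version.json", "upstream.json"})


def _field(data: bytes) -> bytes:
    """Length-prefix one field so adjacent fields cannot run together."""
    return len(data).to_bytes(8, "big") + data


def hash_tree_sketch(root: Path) -> tuple[str, str]:
    """Return (skill_content_hash, skill_tree_hash) for a skill directory."""
    tree = hashlib.sha256()
    content_hash = ""
    paths = sorted(root.rglob("*"), key=lambda p: p.relative_to(root).as_posix())
    for path in paths:
        mode = path.lstat().st_mode  # lstat so symlinks are seen as symlinks
        if stat.S_ISDIR(mode):
            continue
        if not stat.S_ISREG(mode):
            raise ValueError(f"symlink or non-regular file rejected: {path}")
        rel = path.relative_to(root).as_posix()
        if rel in EXCLUDED:
            continue
        data = path.read_bytes()
        # One normalized flag: any execute bit counts, other mode bits are ignored.
        executable = b"\x01" if mode & 0o111 else b"\x00"
        tree.update(_field(rel.encode("utf-8")))
        tree.update(_field(data))  # the prefix carries the byte length
        tree.update(executable)
        if rel == "SKILL.md":
            normalized = data.replace(b"\r\n", b"\n")
            content_hash = hashlib.sha256(normalized).hexdigest()
    if not content_hash:
        raise ValueError("SKILL.md is required")
    return content_hash, tree.hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        skill = Path(tmp)
        (skill / "SKILL.md").write_text("# Skill\n", encoding="utf-8")
        print(hash_tree_sketch(skill))
```

Without the length prefix, the pairs ("ab", "c") and ("a", "bc") would feed identical bytes into the digest; prefixing each field removes that ambiguity.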
- Step 6: Run focused tests
cd app-instance/backend
pytest tests/unit/test_plugin_manifest.py tests/unit/test_plugin_hashing.py tests/unit/test_config_loader.py -q
Expected: PASS.
- Step 7: Commit
git add app-instance/backend/beaver/plugins app-instance/backend/beaver/foundation/config app-instance/backend/tests/unit/test_plugin_manifest.py app-instance/backend/tests/unit/test_plugin_hashing.py app-instance/backend/tests/unit/test_config_loader.py
git commit -m "feat(plugins): add declarative skill manifest"
Task 2: Add Discovery And Atomic Plugin State
Files:
- Create: app-instance/backend/beaver/plugins/discovery.py
- Create: app-instance/backend/beaver/plugins/state.py
- Create: app-instance/backend/beaver/foundation/utils/file_lock.py
- Modify: app-instance/backend/beaver/plugins/models.py
- Modify: app-instance/backend/beaver/plugins/__init__.py
- Test: app-instance/backend/tests/unit/test_plugin_state.py
- Test: app-instance/backend/tests/unit/test_workspace_write_lock.py
- Step 1: Write failing discovery and state tests
Cover workspace discovery, configured search paths, duplicate plugin IDs, malformed manifests reported as errors instead of crashing the full scan, and state round trips:
def test_plugin_state_round_trip_is_atomic(tmp_path: Path) -> None:
    store = PluginStateStore(tmp_path)
    store.set_enabled("baoyu-comic", True)
    store.update_skill_binding(
        "baoyu-comic",
        "baoyu-comic",
        PluginSkillBinding(
            accepted_upstream_tree_hash="old",
            observed_upstream_tree_hash="new",
            accepted_beaver_version="v0001",
            current_beaver_version="v0002",
            pending_candidate_id="plugin-update:baoyu-comic:baoyu-comic:new",
            status="update_pending",
        ),
    )
    reloaded = PluginStateStore(tmp_path).get_plugin("baoyu-comic")
    assert reloaded is not None
    assert reloaded.enabled is True
    assert reloaded.skills["baoyu-comic"].accepted_upstream_tree_hash == "old"
    assert not (tmp_path / ".beaver" / "plugins" / "state.json.tmp").exists()
Add a multiprocess lock test in which two processes enter the same workspace lock and assert their critical sections never overlap. Add a reentrancy test in which nested acquisitions in one process complete without deadlock.
- Step 2: Run tests and verify failure
cd app-instance/backend
pytest tests/unit/test_plugin_state.py tests/unit/test_workspace_write_lock.py -q
Expected: FAIL because discovery and state stores are missing.
- Step 3: Implement state dataclasses
Add backward-compatible to_dict() and from_dict() methods for:
@dataclass(slots=True)
class PluginSkillBinding:
    accepted_upstream_tree_hash: str | None = None
    observed_upstream_tree_hash: str | None = None
    accepted_beaver_version: str | None = None
    current_beaver_version: str | None = None
    pending_candidate_id: str | None = None
    status: str = "discovered"
    last_error: str | None = None


@dataclass(slots=True)
class PluginState:
    plugin_id: str
    enabled: bool = False
    updates_paused: bool = False
    installed_version: str | None = None
    manifest_path: str | None = None
    status: str = "discovered"
    last_error: str | None = None
    skills: dict[str, PluginSkillBinding] = field(default_factory=dict)
- Step 4: Implement atomic state persistence
Store data at <workspace>/.beaver/plugins/state.json. Write a complete JSON document to
state.json.tmp, flush it, then replace state.json. Public methods:
list_plugins()
get_plugin(plugin_id)
set_enabled(plugin_id, enabled)
upsert_plugin(plugin_state)
update_skill_binding(plugin_id, skill_name, binding)
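The write-then-replace sequence described in this step might be sketched as below. The helper name write_json_atomically is hypothetical, and the os.fsync() call goes one step beyond the "flush it" wording of this step; treat it as an optional durability assumption.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any


def write_json_atomically(path: Path, payload: dict[str, Any]) -> None:
    """Write payload next to path as <name>.tmp, then swap it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_name(path.name + ".tmp")
    with tmp.open("w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2, sort_keys=True)
        handle.flush()
        os.fsync(handle.fileno())  # assumption: force the bytes to disk before the swap
    # os.replace() is atomic when source and target share a filesystem,
    # so readers see either the old document or the complete new one.
    os.replace(tmp, path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        state_path = Path(tmp_dir) / ".beaver" / "plugins" / "state.json"
        write_json_atomically(state_path, {"plugins": {}})
        print(state_path.read_text(encoding="utf-8"))
```

Because the temporary file is a sibling of state.json, a successful call leaves no state.json.tmp behind, which is what the round-trip test asserts.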
- Step 5: Implement the shared workspace write lock
Add:
class WorkspaceWriteLock:
    def __init__(self, workspace: str | Path) -> None:
        self.path = Path(workspace) / ".beaver" / "locks" / "plugin-skill-write.lock"

    @contextmanager
    def acquire(self, *, timeout_seconds: float | None = None, blocking: bool = True):
        ...
Requirements:
- use fcntl.flock() on POSIX and msvcrt.locking() on Windows, matching memory/curated/store.py;
- guard with a process-local threading.RLock;
- track per-thread recursion depth so nested store calls reuse the OS lock;
- support non-blocking acquisition for Engine boot;
- raise WorkspaceWriteLockBusy on timeout/contention;
- keep the lock file separate from atomically replaced data files.
- Step 6: Implement discovery
Scan:
- <workspace>/plugins;
- each configured plugins.search_paths.
Only direct child directories containing beaver.plugin.json are plugins. Return a
PluginDiscoveryResult containing valid manifests and per-path errors. Duplicate IDs are
errors and neither duplicate is activated. Discovery records a workspace-relative
manifest display path when possible and a redacted
<external>/<plugin-dir>/beaver.plugin.json path otherwise; absolute paths remain
internal.
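The scan, duplicate handling, and display-path redaction described above can be sketched as three small helpers. All three function names are hypothetical; the real PluginDiscoveryResult would also carry parsed manifests and per-path errors, which this sketch leaves out.

```python
import tempfile
from collections import defaultdict
from pathlib import Path

MANIFEST_NAME = "beaver.plugin.json"


def find_manifest_paths(roots: list[Path]) -> list[Path]:
    """Return manifests found in direct child directories of each root."""
    found: list[Path] = []
    for root in roots:
        if not root.is_dir():
            continue
        for child in sorted(root.iterdir()):
            manifest = child / MANIFEST_NAME
            if child.is_dir() and manifest.is_file():
                found.append(manifest)
    return found


def display_path(manifest_path: Path, workspace: Path) -> str:
    """Workspace-relative path when possible, otherwise a redacted external path."""
    try:
        return manifest_path.resolve().relative_to(workspace.resolve()).as_posix()
    except ValueError:
        return f"<external>/{manifest_path.parent.name}/{manifest_path.name}"


def split_duplicate_ids(ids: dict[Path, str]) -> tuple[dict[str, Path], dict[str, list[Path]]]:
    """Separate uniquely identified manifests from duplicated plugin IDs."""
    by_id: dict[str, list[Path]] = defaultdict(list)
    for path, plugin_id in ids.items():
        by_id[plugin_id].append(path)
    unique = {pid: paths[0] for pid, paths in by_id.items() if len(paths) == 1}
    duplicates = {pid: paths for pid, paths in by_id.items() if len(paths) > 1}
    return unique, duplicates


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp)
        (workspace / "plugins" / "comic").mkdir(parents=True)
        (workspace / "plugins" / "comic" / MANIFEST_NAME).write_text("{}", encoding="utf-8")
        for manifest in find_manifest_paths([workspace / "plugins"]):
            print(display_path(manifest, workspace))
```

Returning duplicates as their own mapping makes it straightforward to report every conflicting path as an error while activating none of them.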
- Step 7: Run focused tests
cd app-instance/backend
pytest tests/unit/test_plugin_state.py tests/unit/test_workspace_write_lock.py tests/unit/test_plugin_manifest.py -q
Expected: PASS.
- Step 8: Commit
git add app-instance/backend/beaver/plugins app-instance/backend/beaver/foundation/utils/file_lock.py app-instance/backend/tests/unit/test_plugin_state.py app-instance/backend/tests/unit/test_workspace_write_lock.py
git commit -m "feat(plugins): discover packages and persist state"
Task 3: Persist Immutable Upstream Skill Snapshots
Files:
- Create: app-instance/backend/beaver/plugins/transaction.py
- Modify: app-instance/backend/beaver/skills/specs/models.py
- Modify: app-instance/backend/beaver/skills/specs/storage.py
- Modify: app-instance/backend/beaver/skills/specs/__init__.py
- Test: app-instance/backend/tests/unit/test_plugin_skill_storage.py
- Step 1: Write failing snapshot storage tests
Test exact content, supporting files, idempotence, symlink rejection, and source immutability:
def test_write_upstream_snapshot_copies_skill_without_mutating_source(tmp_path: Path) -> None:
    source = tmp_path / "plugin" / "skills" / "comic"
    source.mkdir(parents=True)
    (source / "SKILL.md").write_text("# Comic\n\nOriginal.\n", encoding="utf-8")
    (source / "templates").mkdir()
    (source / "templates" / "panel.txt").write_text("panel", encoding="utf-8")
    store = SkillSpecStore(tmp_path / "workspace")
    transaction = PluginSkillTransaction(tmp_path / "workspace")
    snapshot = store.stage_upstream_snapshot(
        transaction,
        skill_name="baoyu-comic",
        source_kind="plugin",
        source_id="baoyu-comic",
        source_version="1.0.0",
        source_path="skills/comic",
        source_root=source,
    )
    store.promote_upstream_snapshot(transaction, snapshot)
    loaded = store.read_upstream_snapshot("baoyu-comic", "baoyu-comic", snapshot.skill_tree_hash)
    assert loaded is not None
    assert loaded.content == "# Comic\n\nOriginal.\n"
    assert (loaded.root / "templates" / "panel.txt").read_text(encoding="utf-8") == "panel"
    assert (source / "SKILL.md").read_text(encoding="utf-8") == "# Comic\n\nOriginal.\n"
Also test:
- changing only templates/panel.txt creates a different snapshot directory;
- SkillVersion.from_dict() remains compatible without tree_hash;
- reading a legacy version derives its complete tree hash;
- staging does not make a snapshot visible to read_upstream_snapshot();
- promoting a staged snapshot uses os.replace() and is idempotent;
- a failed metadata write leaves no current pointer to the staged version.
- Step 2: Run test and verify failure
cd app-instance/backend
pytest tests/unit/test_plugin_skill_storage.py -q
Expected: FAIL because upstream snapshot APIs do not exist.
- Step 3: Add upstream snapshot models
Add:
@dataclass(slots=True)
class SkillUpstreamSnapshot:
    skill_name: str
    source_kind: str
    source_id: str
    source_version: str
    source_path: str
    skill_content_hash: str
    skill_tree_hash: str
    created_at: str
    frontmatter: dict[str, Any] = field(default_factory=dict)
Add LoadedSkillUpstreamSnapshot(snapshot, content, root) for storage reads. Extend
SkillVersion with a backward-compatible tree_hash: str = ""; new versions persist the
complete version-tree hash, while read_published_skill() derives it for legacy metadata
that lacks the field.
- Step 4: Add safe tree-copy helper
Factor out a private SkillSpecStore._copy_regular_tree(source_root, target_root) helper that:
- rejects any symlink;
- rejects paths containing empty, ., or .. segments;
- copies regular files only;
- creates parents;
- never writes outside target_root.
Use it for transaction staging now; Task 4 will reuse it for mirrored versions.
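A standalone sketch of such a helper is shown below, written as a free function rather than a SkillSpecStore method. Validating the whole tree before copying anything, and using shutil.copy() so the executable bit survives, are assumptions on top of the listed rules.

```python
import shutil
import stat
import tempfile
from pathlib import Path


def copy_regular_tree(source_root: Path, target_root: Path) -> list[str]:
    """Copy regular files from source_root into target_root; return relative paths."""
    planned: list[Path] = []
    # First pass: validate everything so a bad entry aborts before any write.
    for path in sorted(source_root.rglob("*")):
        mode = path.lstat().st_mode
        if stat.S_ISLNK(mode):
            raise ValueError(f"symlink rejected: {path}")
        if stat.S_ISDIR(mode):
            continue
        if not stat.S_ISREG(mode):
            raise ValueError(f"non-regular file rejected: {path}")
        rel = path.relative_to(source_root)
        if any(part in ("", ".", "..") for part in rel.parts):
            raise ValueError(f"unsafe path segment in {rel}")
        planned.append(rel)
    # Second pass: copy. Every destination is target_root joined with a vetted
    # relative path, so nothing can land outside target_root.
    for rel in planned:
        destination = target_root / rel
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(source_root / rel, destination)  # copies bytes and mode bits
    return [rel.as_posix() for rel in planned]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        (src / "templates").mkdir(parents=True)
        (src / "SKILL.md").write_text("# Comic\n", encoding="utf-8")
        (src / "templates" / "panel.txt").write_text("panel", encoding="utf-8")
        print(copy_regular_tree(src, Path(tmp) / "dst"))
```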
- Step 5: Implement same-filesystem staging and promotion
PluginSkillTransaction creates:
<workspace>/.beaver/staging/plugin-skills/<transaction-id>/
The staging root must be on the same filesystem as <workspace>/skills. It exposes:
stage_upstream_snapshot(...)
stage_skill_version(...)
promote_directory(staged, final)
cleanup()
promote_directory() uses os.replace() and never replaces an existing non-identical
immutable directory. Cleanup removes only the transaction's staging root.
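The promotion rule can be illustrated with the sketch below. The byte-for-byte comparison in _snapshot is an assumption made for brevity; the plan would more plausibly compare stored tree hashes. The existence check and the os.replace() call are only safe together when the caller holds the workspace write lock.

```python
import os
import shutil
import tempfile
from pathlib import Path


def _snapshot(root: Path) -> dict[str, bytes]:
    """Map each regular file's relative POSIX path to its bytes."""
    return {
        path.relative_to(root).as_posix(): path.read_bytes()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def promote_directory(staged: Path, final: Path) -> bool:
    """Move staged to final. Return False when an identical final already exists."""
    if final.exists():
        if _snapshot(staged) != _snapshot(final):
            raise FileExistsError(f"refusing to replace non-identical directory: {final}")
        shutil.rmtree(staged)  # idempotent repeat: drop the redundant staged copy
        return False
    final.parent.mkdir(parents=True, exist_ok=True)
    os.replace(staged, final)  # a rename, atomic on the same filesystem
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        staged = Path(tmp) / "staging" / "tx-1"
        staged.mkdir(parents=True)
        (staged / "SKILL.md").write_text("# Comic\n", encoding="utf-8")
        print(promote_directory(staged, Path(tmp) / "skills" / "comic" / "v0001"))
```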
- Step 6: Implement snapshot APIs
Write snapshots to:
skills/<skill>/upstreams/<source-id>/<skill-tree-hash>/
The snapshot metadata stores both hashes. If the directory already exists, verify all stored metadata and return it without rewriting.
Public methods:
stage_upstream_snapshot(transaction, ...)
promote_upstream_snapshot(transaction, snapshot)
read_upstream_snapshot(skill_name, source_id, skill_tree_hash)
- Step 7: Make JSON/current/index writes atomic
Change SkillSpecStore._write_json() and current/index pointer writes to create a temporary
file in the target directory, flush and fsync, then os.replace(). Immutable version
directories are promoted first; runtime visibility changes only when current.json,
skill.json, and the published index are atomically replaced under the workspace lock.
- Step 8: Run focused and existing storage tests
cd app-instance/backend
pytest tests/unit/test_plugin_skill_storage.py tests/unit/test_phase5_skills_runtime.py -q
Expected: PASS.
- Step 9: Commit
git add app-instance/backend/beaver/plugins/transaction.py app-instance/backend/beaver/skills/specs app-instance/backend/tests/unit/test_plugin_skill_storage.py
git commit -m "feat(skills): store immutable plugin upstream snapshots"
Task 4: Mirror Initial Plugin Skills As First-Class Skills
Files:
- Create: app-instance/backend/beaver/plugins/skills.py
- Modify: app-instance/backend/beaver/plugins/models.py
- Modify: app-instance/backend/beaver/plugins/__init__.py
- Modify: app-instance/backend/beaver/skills/specs/storage.py
- Test: app-instance/backend/tests/unit/test_plugin_skill_sync.py
- Step 1: Write failing initial mirror tests
Cover:
- enabling mirrors SKILL.md and supporting files;
- mirrored skill is returned by SkillsLoader.list_published_skills();
- source_kind is plugin, but runtime source is still workspace;
- existing non-plugin name collision fails without modification;
- any validation/safety failure in a multi-skill plugin occurs before promotion and leaves every linked skill unchanged;
- repeated sync is idempotent;
- supporting files are present in the promoted version;
- concurrent enable calls allocate only one version.
Core assertion:
result = manager.enable("baoyu-comic")
record = SkillsLoader(workspace).get_skill_record("baoyu-comic")
loaded = SkillSpecStore(workspace).read_published_skill("baoyu-comic")
assert result.status == "synced"
assert record is not None and record.source == "workspace"
assert record.source_kind == "plugin"
assert loaded is not None
assert loaded.version.version == "v0001"
assert loaded.version.provenance["plugin_id"] == "baoyu-comic"
assert loaded.version.provenance["upstream_skill_content_hash"]
assert loaded.version.provenance["upstream_skill_tree_hash"]
- Step 2: Run tests and verify failure
cd app-instance/backend
pytest tests/unit/test_plugin_skill_sync.py -q
Expected: FAIL because PluginManager does not exist.
- Step 3: Implement PluginManager constructor and discovery view
Constructor dependencies:
class PluginManager:
    def __init__(
        self,
        *,
        workspace: Path,
        manifests: dict[str, PluginManifest],
        discovery_errors: list[PluginDiscoveryError],
        state_store: PluginStateStore,
        skill_store: SkillSpecStore,
        learning_store: SkillLearningStore,
        publisher: SkillPublisher,
        safety_checker: SkillDraftSafetyChecker,
        write_lock: WorkspaceWriteLock,
    ) -> None:
        ...
Keep all filesystem and lifecycle dependencies injectable for tests.
- Step 4: Implement exact initial mirror publication
Acquire the workspace write lock before reading state, allocating versions, or writing candidates. For each declared skill:
- persist the upstream snapshot;
- check for ownership conflicts;
- parse frontmatter/body and create an in-memory SkillDraft with proposal_kind="plugin_initial_mirror";
- run SkillDraftSafetyChecker.check() and reject failed or critical reports;
- allocate the next vNNNN while holding the lock;
- stage a SkillVersion whose content exactly equals upstream SKILL.md;
- stage snapshot supporting files into the version directory;
- generate the complete next SkillSpec, current pointer, index, and plugin-state JSON payloads in memory.
Use provenance:
{
    "source_kind": "plugin",
    "plugin_id": manifest.plugin_id,
    "plugin_version": manifest.version,
    "plugin_skill_path": declaration.relative_path,
    "upstream_skill_content_hash": snapshot.skill_content_hash,
    "upstream_skill_tree_hash": snapshot.skill_tree_hash,
    "merge_mode": "initial_mirror",
}
- Step 5: Promote the complete staged transaction
After every declared skill passes validation:
- for a new skill, promote its complete staged skill directory with one os.replace();
- for an existing skill, promote immutable upstream/version directories, atomically replace spec/index metadata, and replace current.json last as the visibility switch;
- atomically write plugin state last;
- clean the staging directory.
Do not implement reverse rollback across already-promoted immutable directories. If a metadata write fails, those directories remain unreferenced and harmless; the previous current pointers remain authoritative. Add startup cleanup for staging directories older than 24 hours.
- Step 6: Run focused and loader tests
cd app-instance/backend
pytest tests/unit/test_plugin_skill_sync.py tests/unit/test_phase5_skills_runtime.py -q
Expected: PASS.
- Step 7: Commit
git add app-instance/backend/beaver/plugins app-instance/backend/beaver/skills/specs/storage.py app-instance/backend/tests/unit/test_plugin_skill_sync.py
git commit -m "feat(plugins): mirror enabled plugin skills"
Task 5: Detect Upgrades And Create Idempotent Learning Candidates
Files:
- Modify: app-instance/backend/beaver/plugins/skills.py
- Modify: app-instance/backend/beaver/memory/skills/models.py
- Modify: app-instance/backend/beaver/memory/skills/store.py
- Test: app-instance/backend/tests/unit/test_plugin_skill_sync.py
- Test: app-instance/backend/tests/unit/test_skill_learning_candidate_state.py
- Step 1: Write failing upgrade classification tests
Create four parametrized cases over tree-hash fixtures for the base (B), local (L), and upstream (U) trees:
@pytest.mark.parametrize(
    ("base", "local", "upstream", "expected"),
    [
        ("A", "A", "A", "unchanged"),
        ("A", "B", "B", "already_applied"),
        ("A", "A", "B", "fast_forward"),
        ("A", "LOCAL", "UPSTREAM", "three_way"),
    ],
)
def test_classify_plugin_skill_update(base: str, local: str, upstream: str, expected: str) -> None:
    assert classify_plugin_skill_update(base, local, upstream) == expected
Also test:
- a supporting-file-only change returns fast_forward or three_way, never unchanged;
- candidate ID stability across repeated sync;
- new upstream supersedes an older pending candidate;
- candidate evidence contains hashes/version references but no raw skill body;
- legacy candidate payloads still parse;
- two processes syncing the same update append only one candidate record.
- Step 2: Run tests and verify failure
cd app-instance/backend
pytest tests/unit/test_plugin_skill_sync.py tests/unit/test_skill_learning_candidate_state.py -q
Expected: FAIL because update classification and candidate kind are missing.
- Step 3: Add plugin_skill_update candidate support
Do not add a special status. Existing candidate statuses remain sufficient. Ensure
SkillLearningCandidate.from_dict() accepts the new kind without changing legacy
defaults.
Use evidence:
{
    "plugin_id": plugin_id,
    "plugin_version": manifest.version,
    "skill_name": skill_name,
    "merge_mode": merge_mode,
    "base_upstream_tree_hash": accepted_tree_hash,
    "new_upstream_tree_hash": snapshot.skill_tree_hash,
    "local_version": current.version.version,
}
Set priority=10, confidence=1.0, trigger_reason="plugin_update".
- Step 4: Implement update classification and candidate creation
Use canonical hashes and deterministic IDs:
candidate_id = (
    f"plugin-update:{plugin_id}:{skill_name}:"
    f"{new_upstream_tree_hash[:12]}"
)
For already_applied, advance state without a candidate. For fast_forward and
three_way, record an open candidate. If the same ID exists in any status, do not append
another JSONL record.
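A small sketch of the classification and ID rules follows. It satisfies the four parametrized cases from Step 1, but the behavior when upstream equals the base while local has diverged (reported here as unchanged, since there is nothing new to merge) is an assumption that the plan does not spell out. The helper name plugin_update_candidate_id is hypothetical.

```python
def classify_plugin_skill_update(base: str, local: str, upstream: str) -> str:
    """Classify an update from the base (B), local (L), and upstream (U) tree hashes."""
    if upstream == base:
        return "unchanged"  # assumption: also covers local-only edits
    if local == upstream:
        return "already_applied"  # local already matches the new upstream
    if local == base:
        return "fast_forward"  # no local learning to preserve
    return "three_way"  # both sides moved away from the base


def plugin_update_candidate_id(plugin_id: str, skill_name: str, new_upstream_tree_hash: str) -> str:
    """Build the deterministic candidate ID from the first 12 hash characters."""
    return f"plugin-update:{plugin_id}:{skill_name}:{new_upstream_tree_hash[:12]}"


if __name__ == "__main__":
    print(classify_plugin_skill_update("A", "A", "B"))
    print(plugin_update_candidate_id("baoyu-comic", "baoyu-comic", "0123456789abcdef"))
```

Because the ID depends only on the plugin, the skill, and the new upstream tree hash, repeated syncs of the same upgrade map to the same candidate regardless of its current status.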
- Step 5: Make candidate mutation atomic under the shared lock
Add an optional WorkspaceWriteLock to SkillLearningStore; EngineLoader supplies the
shared workspace instance, while isolated unit-test construction falls back to a
store-local lock. Add:
record_learning_candidate_if_absent(candidate) -> tuple[SkillLearningCandidate, bool]
Inside one lock acquisition, read current candidates, check the deterministic ID, and
atomically rewrite or append the JSONL record. Apply the same lock to candidate update and
transition methods. Nested calls from PluginManager reuse the reentrant lock.
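The check-then-append behavior could be sketched as follows. CandidateLog is a hypothetical stand-in for SkillLearningStore that stores plain dicts, and its threading.RLock fallback only serializes threads within one process; cross-process safety comes from the shared WorkspaceWriteLock that the plan injects.

```python
import json
import tempfile
import threading
from pathlib import Path
from typing import Any


class CandidateLog:
    """Hypothetical JSONL store; stands in for SkillLearningStore."""

    def __init__(self, path: Path, lock: Any = None) -> None:
        self.path = path
        # Store-local fallback for isolated tests (threads only, not processes).
        self._lock = lock if lock is not None else threading.RLock()

    def read_all(self) -> list[dict[str, Any]]:
        if not self.path.exists():
            return []
        with self.path.open("r", encoding="utf-8") as handle:
            return [json.loads(line) for line in handle if line.strip()]

    def record_if_absent(self, candidate: dict[str, Any]) -> tuple[dict[str, Any], bool]:
        """Return (stored_candidate, created) for a deterministic candidate ID."""
        with self._lock:
            # The existence check and the append share one lock acquisition.
            for existing in self.read_all():
                if existing.get("candidate_id") == candidate["candidate_id"]:
                    return existing, False
            self.path.parent.mkdir(parents=True, exist_ok=True)
            with self.path.open("a", encoding="utf-8") as handle:
                handle.write(json.dumps(candidate, sort_keys=True) + "\n")
            return candidate, True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = CandidateLog(Path(tmp) / "candidates.jsonl")
        item = {"candidate_id": "plugin-update:p:s:abc", "kind": "plugin_skill_update"}
        print(log.record_if_absent(item)[1], log.record_if_absent(item)[1])
```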
- Step 6: Supersede stale pending updates
When a different pending candidate exists for the same plugin skill:
learning_store.transition_learning_candidate(
    old_candidate_id,
    "superseded",
    event_type="plugin_update_superseded",
    payload={"replacement_candidate_id": new_candidate_id},
)
- Step 7: Run focused tests
cd app-instance/backend
pytest tests/unit/test_plugin_skill_sync.py tests/unit/test_skill_learning_candidate_state.py -q
Expected: PASS.
- Step 8: Commit
git add app-instance/backend/beaver/plugins/skills.py app-instance/backend/beaver/memory/skills/models.py app-instance/backend/beaver/memory/skills/store.py app-instance/backend/tests/unit/test_plugin_skill_sync.py app-instance/backend/tests/unit/test_skill_learning_candidate_state.py
git commit -m "feat(plugins): enqueue skill upgrade candidates"
Task 6: Add Plugin Update Draft Provenance And Fast-Forward Synthesis
Files:
- Modify: app-instance/backend/beaver/skills/specs/models.py
- Modify: app-instance/backend/beaver/skills/drafts/service.py
- Modify: app-instance/backend/beaver/skills/publisher/service.py
- Modify: app-instance/backend/beaver/skills/learning/service.py
- Test: app-instance/backend/tests/unit/test_plugin_skill_learning.py
- Test: app-instance/backend/tests/unit/test_skill_learning_pipeline.py
- Step 1: Write failing model and fast-forward tests
Test backward-compatible draft parsing and exact upstream fast-forward:
draft = asyncio.run(service.synthesize_draft(candidate.candidate_id, provider_bundle))
assert draft.proposal_kind == "plugin_skill_update"
assert draft.proposed_content == new_upstream.content
assert draft.base_version == "v0001"
assert draft.provenance["merge_mode"] == "fast_forward"
assert draft.provenance["new_upstream_tree_hash"] == new_upstream.snapshot.skill_tree_hash
assert provider.calls == []
After publish, assert the new version contains the new upstream supporting files even when
SKILL.md did not change.
- Step 2: Run tests and verify failure
cd app-instance/backend
pytest tests/unit/test_plugin_skill_learning.py tests/unit/test_skill_learning_pipeline.py -q
Expected: FAIL because drafts have no provenance and the learning service has no plugin update branch.
- Step 3: Add backward-compatible draft provenance
Extend SkillDraft:
provenance: dict[str, Any] = field(default_factory=dict)
Include it in to_dict() and parse missing values as {} in from_dict().
- Step 4: Add a focused draft constructor
Add:
def create_plugin_update_draft(
    self,
    *,
    skill_name: str,
    base_version: str,
    proposed_content: str,
    proposed_frontmatter: dict,
    created_by: str,
    reason: str,
    provenance: dict,
    evidence_refs: list[dict] | None = None,
) -> SkillDraft:
    ...
It writes proposal_kind="plugin_skill_update".
- Step 5: Implement fast-forward synthesis
In SkillLearningService.synthesize_draft(), branch before ordinary revision:
if candidate.kind == "plugin_skill_update":
    return await self._synthesize_plugin_update(candidate, provider_bundle)
For merge_mode == "fast_forward", load U from SkillSpecStore, parse its
frontmatter/body, and create a draft exactly equal to U. Do not call the provider.
- Step 6: Serialize all skill publication
Add an optional WorkspaceWriteLock to SkillPublisher; EngineLoader supplies the shared
workspace instance and isolated tests use a publisher-local fallback. Hold it across
_next_version(), version staging/promotion, spec/current/index replacement, rollback,
and disable. This protects ordinary learned skills as well as plugin-origin skills from
racing with boot or explicit plugin sync.
- Step 7: Materialize referenced supporting files during publish
For proposal_kind="plugin_skill_update", resolve the snapshot and supporting-file plan
from draft provenance. Stage the complete next version directory, including SKILL.md
and supporting files, before promoting it. Reject missing snapshots, path conflicts, or
tree-hash mismatches. Ordinary skill publication keeps its current behavior.
- Step 8: Preserve draft provenance on publish
Change SkillPublisher.publish() provenance construction to:
provenance={
    **dict(draft.provenance),
    "draft_id": draft_id,
    "proposal_kind": draft.proposal_kind,
    "trigger_run_id": draft.trigger_run_id,
    "trigger_session_id": draft.trigger_session_id,
}
- Step 9: Run focused tests
cd app-instance/backend
pytest tests/unit/test_plugin_skill_learning.py tests/unit/test_skill_learning_pipeline.py -q
Expected: PASS.
- Step 10: Commit
git add app-instance/backend/beaver/skills app-instance/backend/tests/unit/test_plugin_skill_learning.py app-instance/backend/tests/unit/test_skill_learning_pipeline.py
git commit -m "feat(skill-learning): create plugin update drafts"
Task 7: Implement Three-Way Plugin Skill Synthesis
Files:
- Create: app-instance/backend/beaver/plugins/tree_merge.py
- Modify: app-instance/backend/beaver/skills/learning/synthesizer.py
- Modify: app-instance/backend/beaver/skills/learning/service.py
- Test: app-instance/backend/tests/unit/test_plugin_skill_learning.py
- Test: app-instance/backend/tests/unit/test_skill_learning_synthesizer_preservation.py
- Step 1: Write failing three-way prompt and parse tests
Assert the prompt contains labeled OLD UPSTREAM, CURRENT LOCAL, and NEW UPSTREAM
sections and does not confuse the current local version with the merge base.
Test response parsing for:
{
    "frontmatter": {"name": "baoyu-comic", "description": "Comic workflow", "tools": []},
    "content": "# Baoyu Comic\n...",
    "change_reason": "Adopt upstream layout while preserving learned review step.",
    "preserved_local_sections": ["Review"],
    "adopted_upstream_sections": ["Panel Layout"],
    "resolved_conflicts": ["Output ordering"],
    "dropped_sections": []
}
Add supporting-file merge tests:
from beaver.plugins.tree_merge import merge_supporting_file_trees


def test_supporting_file_merge_adopts_upstream_when_local_is_unchanged() -> None:
    plan = merge_supporting_file_trees(base={"a.txt": "A"}, local={"a.txt": "A"}, upstream={"a.txt": "U"})
    assert plan.files["a.txt"].source == "upstream"
    assert plan.conflicts == []


def test_supporting_file_merge_blocks_divergent_edits() -> None:
    plan = merge_supporting_file_trees(base={"a.txt": "A"}, local={"a.txt": "L"}, upstream={"a.txt": "U"})
    assert plan.conflicts[0].path == "a.txt"
- Step 2: Run tests and verify failure
cd app-instance/backend
pytest tests/unit/test_plugin_skill_learning.py tests/unit/test_skill_learning_synthesizer_preservation.py -q
Expected: FAIL because three-way synthesis does not exist.
- Step 3: Add synthesize_plugin_update()
Signature:
async def synthesize_plugin_update(
    self,
    candidate: SkillLearningCandidate,
    evidence_packet: EvidencePacket,
    provider: LLMProvider,
    model: str,
    *,
    old_upstream: dict[str, Any],
    current_local: dict[str, Any],
    new_upstream: dict[str, Any],
) -> dict[str, Any]:
The system message must require JSON only and state:
- preserve valid local learning;
- adopt upstream fixes and safety changes;
- do not concatenate duplicate sections;
- list every intentional drop;
- leave resolved_conflicts empty only when no semantic conflict exists.
- Step 4: Load all three snapshots in the learning service
Resolve:
- B using base_upstream_tree_hash;
- L using local_version;
- U using new_upstream_tree_hash.
Raise a specific ValueError when any referenced snapshot/version is missing. Do not
fall back to a two-way merge.
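A minimal sketch of the strict resolution, assuming snapshots and versions can be looked up in plain mappings. `MergeInputs` and `load_merge_inputs` are hypothetical names; the real service would read from SkillSpecStore and the plugin snapshot store.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class MergeInputs:
    base: str      # B: old upstream
    local: str     # L: current local
    upstream: str  # U: new upstream


def load_merge_inputs(
    snapshots: Mapping[str, str],
    versions: Mapping[str, str],
    *,
    base_upstream_tree_hash: str,
    local_version: str,
    new_upstream_tree_hash: str,
) -> MergeInputs:
    # Each missing input gets its own message so the failure is actionable.
    if base_upstream_tree_hash not in snapshots:
        raise ValueError(f"missing base upstream snapshot: {base_upstream_tree_hash}")
    if local_version not in versions:
        raise ValueError(f"missing local version: {local_version}")
    if new_upstream_tree_hash not in snapshots:
        raise ValueError(f"missing new upstream snapshot: {new_upstream_tree_hash}")
    # No partial result is ever returned, so callers cannot degrade to two-way.
    return MergeInputs(
        base=snapshots[base_upstream_tree_hash],
        local=versions[local_version],
        upstream=snapshots[new_upstream_tree_hash],
    )
```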
- Step 5: Build the deterministic supporting-file merge plan
Compare files by path and content/executable digest:
- L == B: use U;
- U == B: use L;
- L == U: use either;
- one-sided addition: use the added file;
- divergent edit, different same-path additions, and delete-versus-edit: conflict.
Exclude SKILL.md because the synthesizer handles it. Store selected source references
and conflict records in draft provenance; do not duplicate file bytes in JSON.
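The rules above can be sketched as a pure function over path-to-digest mappings. The dataclass names and the `reason` strings are assumptions; only `merge_supporting_file_trees`, `plan.files[...].source`, and `plan.conflicts[...].path` come from the tests in Step 1. Treating an upstream deletion of an unchanged local file as a deletion is also an assumption that follows from "L == B: use U".

```python
from dataclasses import dataclass, field
from typing import Mapping, Optional


@dataclass(frozen=True)
class FileDecision:
    source: str  # "local" or "upstream"


@dataclass(frozen=True)
class FileConflict:
    path: str
    reason: str


@dataclass
class SupportingFilePlan:
    files: dict[str, FileDecision] = field(default_factory=dict)
    conflicts: list[FileConflict] = field(default_factory=list)


def _conflict_reason(b: Optional[str], loc: Optional[str], up: Optional[str]) -> str:
    if b is None:
        return "different same-path additions"
    if loc is None or up is None:
        return "delete versus edit"
    return "divergent edit"


def merge_supporting_file_trees(
    *, base: Mapping[str, str], local: Mapping[str, str], upstream: Mapping[str, str]
) -> SupportingFilePlan:
    """Each mapping is path -> digest; a missing path means the file is absent."""
    plan = SupportingFilePlan()
    for path in sorted(set(base) | set(local) | set(upstream)):
        if path == "SKILL.md":
            continue  # merged by the synthesizer, not by this plan
        b, loc, up = base.get(path), local.get(path), upstream.get(path)
        if loc == up:
            chosen, source = loc, "local"    # both sides agree: use either
        elif loc == b:
            chosen, source = up, "upstream"  # only upstream changed
        elif up == b:
            chosen, source = loc, "local"    # only local changed
        else:
            plan.conflicts.append(FileConflict(path, _conflict_reason(b, loc, up)))
            continue
        if chosen is not None:               # None: the winning side deleted the file
            plan.files[path] = FileDecision(source)
    return plan
```

Iterating over sorted paths keeps the plan deterministic, so the same three trees always yield the same provenance record.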
- Step 6: Create the plugin update draft
Store merge decisions in draft provenance:
{
    **plugin_reference_fields,
    "merge_mode": "three_way",
    "preserved_local_sections": payload["preserved_local_sections"],
    "adopted_upstream_sections": payload["adopted_upstream_sections"],
    "resolved_conflicts": payload["resolved_conflicts"],
    "dropped_sections": payload["dropped_sections"],
    "supporting_file_plan": supporting_file_plan.to_dict(),
}
If the supporting-file plan contains conflicts, the draft may be inspected but cannot be published. V1 does not ask the LLM to merge arbitrary or binary files.
- Step 7: Run focused tests
cd app-instance/backend
pytest tests/unit/test_plugin_skill_learning.py tests/unit/test_skill_learning_synthesizer_preservation.py -q
Expected: PASS.
- Step 8: Commit
git add app-instance/backend/beaver/plugins/tree_merge.py app-instance/backend/beaver/skills/learning app-instance/backend/tests/unit/test_plugin_skill_learning.py app-instance/backend/tests/unit/test_skill_learning_synthesizer_preservation.py
git commit -m "feat(skill-learning): synthesize three-way plugin updates"
Task 8: Extend Replay Preservation For Plugin Merges
Files:
- Modify: app-instance/backend/beaver/skills/learning/preservation.py
- Modify: app-instance/backend/beaver/skills/learning/eval.py
- Modify: app-instance/backend/beaver/skills/learning/pipeline.py
- Test: app-instance/backend/tests/unit/test_skill_learning_preservation.py
- Test: app-instance/backend/tests/unit/test_skill_learning_eval.py
- Test: app-instance/backend/tests/unit/test_skill_learning_pipeline.py
- Step 1: Write failing plugin merge preservation tests
Cover:
- merged draft preserves local Safety and adopts new upstream Safety;
- silently dropping either Safety section fails;
- explicitly resolved non-safety conflicts pass;
- unresolved conflicts block publish;
- unresolved supporting-file conflicts block publish;
- baseline replay remains current local L.
Expected report shape:
assert report.preservation_report == {
    "mode": "plugin_three_way",
    "passed": True,
    "local": {...},
    "upstream": {...},
    "unresolved_conflicts": [],
}
- Step 2: Run tests and verify failure
cd app-instance/backend
pytest tests/unit/test_skill_learning_preservation.py tests/unit/test_skill_learning_eval.py tests/unit/test_skill_learning_pipeline.py -q
Expected: FAIL because preservation only checks one base skill.
- Step 3: Add plugin merge preservation helper
Add:
def check_plugin_merge_preservation(
    *,
    local_content: str,
    upstream_content: str,
    draft_content: str,
    merge_decisions: dict[str, Any],
) -> dict[str, Any]:
It calls existing check_preservation() for local and upstream content, gives Safety and
Required Tools sections blocking weight, and reports unresolved conflicts separately.
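One possible shape for the helper, as a sketch only. The heading scan in `_section_titles` is a crude stand-in for the existing check_preservation(), whose signature is not shown in this plan, and reading unresolved conflicts from a `merge_decisions["unresolved_conflicts"]` key is an assumption.

```python
import re
from typing import Any

BLOCKING_SECTIONS = {"Safety", "Required Tools"}


def _section_titles(markdown: str) -> set[str]:
    """Collect Markdown heading titles (stand-in for check_preservation())."""
    pattern = r"^#{1,6}[ \t]+(.+?)[ \t]*$"
    return {m.group(1) for m in re.finditer(pattern, markdown, re.MULTILINE)}


def _side_report(source: str, draft: str, declared_drops: set[str]) -> dict[str, Any]:
    missing = _section_titles(source) - _section_titles(draft)
    blocking = sorted(missing & BLOCKING_SECTIONS)
    return {
        "silently_dropped": sorted(missing - declared_drops),
        "blocking_missing": blocking,
        "passed": not blocking,  # a lost Safety/Required Tools section always fails
    }


def check_plugin_merge_preservation(
    *,
    local_content: str,
    upstream_content: str,
    draft_content: str,
    merge_decisions: dict[str, Any],
) -> dict[str, Any]:
    declared = set(merge_decisions.get("dropped_sections", []))
    local = _side_report(local_content, draft_content, declared)
    upstream = _side_report(upstream_content, draft_content, declared)
    # Assumption: unresolved conflicts are passed in under this key.
    unresolved = list(merge_decisions.get("unresolved_conflicts", []))
    return {
        "mode": "plugin_three_way",
        "passed": local["passed"] and upstream["passed"] and not unresolved,
        "local": local,
        "upstream": upstream,
        "unresolved_conflicts": unresolved,
    }
```

Reporting the local and upstream sides separately lets a reviewer see which source lost a section, rather than a single merged pass/fail flag.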
- Step 4: Use current local as replay baseline
When draft.proposal_kind == "plugin_skill_update", load draft.base_version as the
baseline skill. Continue to run the candidate arm with the draft context. Do not use raw
upstream B or U as the replay baseline.
- Step 5: Tighten publish gate
Add:
if draft.proposal_kind == "plugin_skill_update":
    preservation = eval_report.preservation_report or {}
    if preservation.get("mode") != "plugin_three_way" and draft.provenance.get("merge_mode") == "three_way":
        raise ValueError("Plugin update requires a three-way preservation report")
    if preservation.get("unresolved_conflicts"):
        raise ValueError("Plugin update has unresolved merge conflicts")
    if draft.provenance.get("supporting_file_plan", {}).get("conflicts"):
        raise ValueError("Plugin update has unresolved supporting-file conflicts")
The existing gate that blocks publication when passed is False remains active.
- Step 6: Run focused tests
cd app-instance/backend
pytest tests/unit/test_skill_learning_preservation.py tests/unit/test_skill_learning_eval.py tests/unit/test_skill_learning_pipeline.py -q
Expected: PASS.
- Step 7: Commit
git add app-instance/backend/beaver/skills/learning app-instance/backend/tests/unit/test_skill_learning_preservation.py app-instance/backend/tests/unit/test_skill_learning_eval.py app-instance/backend/tests/unit/test_skill_learning_pipeline.py
git commit -m "feat(skill-learning): gate plugin merge preservation"
Task 9: Reconcile Publication And Implement Pause/Disable/Adopt
Files:
- Modify: app-instance/backend/beaver/plugins/skills.py
- Modify: app-instance/backend/beaver/skills/learning/pipeline.py
- Modify: app-instance/backend/beaver/skills/publisher/service.py
- Test: app-instance/backend/tests/unit/test_plugin_skill_sync.py
- Test: app-instance/backend/tests/unit/test_skill_learning_pipeline.py
- Step 1: Write failing lifecycle tests
Test:
- publishing a plugin update advances accepted upstream tree hash;
- pending candidate clears;
- simulated observer failure leaves the published version intact;
- the next sync reconciles state from current version provenance and does not recreate the candidate;
- reconciliation never moves accepted_beaver_version backwards after rollback;
- pause leaves linked skills active and creates no update candidates;
- resume reconciles and syncs;
- disabling plugin disables linked skills without deletion;
- re-enable restores and syncs;
- missing package sets plugin status missing, suspends sync, and leaves linked skills active;
- adopt changes source_kind to managed, removes binding, and keeps the skill active.
- Step 2: Run tests and verify failure
cd app-instance/backend
pytest tests/unit/test_plugin_skill_sync.py tests/unit/test_skill_learning_pipeline.py -q
Expected: FAIL because publication has no plugin acknowledgement callback.
- Step 3: Add a narrow publication observer
Extend pipeline construction with:
publish_observer: Callable[[SkillDraft, SkillVersion | SkillSpec], None] | None = None
After successful publish, call it before returning. Observer failure must be recorded and
audited as plugin_publish_ack_failed; it must not delete the already-published version
or turn the publish API response into a failure. Mark the learning candidate published
before invoking the best-effort observer so clients do not retry a successful publish.
The next sync is responsible for reconciliation.
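The ordering described above can be sketched with plain callables. `publish_with_observer` and its parameters are hypothetical; the real pipeline would pass SkillDraft and SkillVersion objects rather than a version string.

```python
from typing import Callable, Optional


def publish_with_observer(
    publish: Callable[[], str],
    mark_published: Callable[[str], None],
    audit: Callable[[str, str], None],
    publish_observer: Optional[Callable[[str], None]] = None,
) -> str:
    # A failure in publish() itself still propagates to the caller.
    version = publish()
    # The candidate is marked published before the observer runs, so a
    # client never retries a publish that already succeeded.
    mark_published(version)
    if publish_observer is not None:
        try:
            publish_observer(version)
        except Exception as exc:
            # Best-effort: record the failure and let the next sync reconcile.
            audit("plugin_publish_ack_failed", str(exc))
    return version
```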
- Step 4: Implement PluginManager.on_skill_published()
For proposal_kind="plugin_skill_update":
- validate plugin ID, skill name, and new upstream tree hash from draft provenance;
- set accepted_upstream_tree_hash = new_upstream_tree_hash;
- set observed_upstream_tree_hash = new_upstream_tree_hash;
- set accepted_beaver_version = published.version;
- set current_beaver_version = published.version;
- clear pending_candidate_id;
- set status synced.
- Step 5: Implement sync-time reconciliation
At the beginning of sync_enabled(), inspect each linked skill's current published
version. When provenance contains:
{
    "proposal_kind": "plugin_skill_update",
    "plugin_id": plugin_id,
    "new_upstream_tree_hash": tree_hash,
}
and the referenced upstream snapshot exists, advance state only if the current version
number is newer than accepted_beaver_version. Clear only the matching pending candidate.
Never regress state when the runtime current pointer was rolled back to an older version.
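A sketch of the forward-only reconciliation rule. The `Binding` dataclass mirrors the field names used in this plan, but the `vNNNN` version parsing, the `candidate_id` provenance key, and the initial status value are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class Binding:
    accepted_upstream_tree_hash: str
    observed_upstream_tree_hash: str
    accepted_beaver_version: str
    current_beaver_version: str
    pending_candidate_id: Optional[str] = None
    status: str = "update_pending"  # hypothetical status label


def version_number(version: str) -> int:
    return int(version.lstrip("v"))  # assumes names like "v0003"


def reconcile_binding(
    binding: Binding,
    *,
    plugin_id: str,
    current_version: str,
    provenance: Mapping[str, Any],
    known_snapshots: set[str],
) -> bool:
    """Advance binding state from published provenance; return True if changed."""
    if provenance.get("proposal_kind") != "plugin_skill_update":
        return False
    if provenance.get("plugin_id") != plugin_id:
        return False
    tree_hash = provenance.get("new_upstream_tree_hash")
    if tree_hash not in known_snapshots:
        return False
    # Forward-only: a rollback to an older version never regresses state.
    if version_number(current_version) <= version_number(binding.accepted_beaver_version):
        return False
    binding.accepted_upstream_tree_hash = tree_hash
    binding.observed_upstream_tree_hash = tree_hash
    binding.accepted_beaver_version = current_version
    binding.current_beaver_version = current_version
    # Assumption: provenance records which candidate produced this version.
    if binding.pending_candidate_id == provenance.get("candidate_id"):
        binding.pending_candidate_id = None
    binding.status = "synced"
    return True
```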
- Step 6: Implement pause, resume, disable, missing, and adopt
pause(plugin_id) sets updates_paused=True and leaves linked skills unchanged.
resume(plugin_id) clears the flag and performs reconciliation/sync.
disable(plugin_id, disable_linked_skills=True) rejects calls without the explicit
confirmation and calls SkillPublisher.disable() for every still-linked skill.
adopt(plugin_id, skill_name):
- requires an existing binding;
- changes SkillSpec.source_kind to managed;
- appends adopted_from_plugin:<plugin-id> to lineage;
- removes the binding;
- leaves the current version active.
When discovery cannot find a previously known plugin, set status missing, preserve
enabled and updates_paused, skip update generation, and do not disable any linked
skill.
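As an illustration of the adopt rules above, here is a sketch over in-memory structures. `SkillSpecSketch` and the `(plugin_id, skill_name)` binding key are hypothetical; the real code would go through SkillSpecStore and PluginStateStore under the workspace write lock.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SkillSpecSketch:
    """Hypothetical subset of SkillSpec fields touched by adopt()."""
    name: str
    source_kind: str
    lineage: list[str] = field(default_factory=list)
    active: bool = True


def adopt(
    bindings: dict[tuple[str, str], Any],
    specs: dict[str, SkillSpecSketch],
    plugin_id: str,
    skill_name: str,
) -> SkillSpecSketch:
    key = (plugin_id, skill_name)
    if key not in bindings:
        raise ValueError(f"skill {skill_name!r} is not linked to plugin {plugin_id!r}")
    spec = specs[skill_name]
    spec.source_kind = "managed"
    spec.lineage.append(f"adopted_from_plugin:{plugin_id}")
    del bindings[key]  # future plugin syncs no longer touch this skill
    # spec.active is deliberately left unchanged: the current version stays live.
    return spec
```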
- Step 7: Run focused tests
cd app-instance/backend
pytest tests/unit/test_plugin_skill_sync.py tests/unit/test_skill_learning_pipeline.py -q
Expected: PASS.
- Step 8: Commit
git add app-instance/backend/beaver/plugins/skills.py app-instance/backend/beaver/skills/learning/pipeline.py app-instance/backend/beaver/skills/publisher/service.py app-instance/backend/tests/unit/test_plugin_skill_sync.py app-instance/backend/tests/unit/test_skill_learning_pipeline.py
git commit -m "feat(plugins): track published updates and ownership"
Task 10: Wire Plugin Sync Into Engine Loading
Files:
- Modify: app-instance/backend/beaver/engine/loader.py
- Modify: app-instance/backend/beaver/plugins/__init__.py
- Test: app-instance/backend/tests/unit/test_plugin_runtime.py
- Test: app-instance/backend/tests/unit/test_phase5_skills_runtime.py
- Step 1: Write failing runtime assembly tests
Test:
- discovered disabled plugins do not mirror;
- enabled plugin mirrors before EngineLoadResult.skills is calculated;
- changed plugin creates a candidate but never calls an LLM during boot;
- repeated boot creates no duplicate versions/candidates;
- concurrent multi-process boot creates no duplicate versions/candidates;
- boot skips auto-sync and reports deferred_lock_busy when an explicit sync holds the workspace lock;
- EngineLoadResult.plugin_manager and plugin summaries are available.
- Step 2: Run tests and verify failure
cd app-instance/backend
pytest tests/unit/test_plugin_runtime.py tests/unit/test_phase5_skills_runtime.py -q
Expected: FAIL because EngineLoader does not assemble plugin services.
- Step 3: Extend EngineLoadResult and loader injection
Add:
plugin_manager: PluginManager | None = None
plugins: list[dict] = field(default_factory=list)
Allow plugin_manager injection in EngineLoader.__init__() for tests.
- Step 4: Assemble in dependency order
Required order:
- config/workspace;
- SkillSpecStore, learning store, and SkillsLoader;
- tool registry and builtins, including skill-view tools using that loader;
- draft/review/publisher and a safety checker using the completed tool registry;
- discovery and PluginStateStore;
- PluginManager;
- plugin_manager.sync_enabled(blocking=False) when config.plugins.auto_sync;
- learning service/pipeline with publication observer;
- result summaries.
Do not use SkillsLoader.extra_dirs for plugin skills. Explicit API enable/sync uses a
bounded blocking lock timeout; Engine boot uses a non-blocking attempt and proceeds with
the current published skill set if another writer owns the lock.
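The two acquisition modes can be sketched with a standard-library lock. `sync_with_lock` is a hypothetical helper and `threading.Lock` is an in-process placeholder for the cross-process workspace lock; the `deferred_lock_busy` and `plugin_write_busy` strings follow the names used elsewhere in this plan.

```python
import threading
from typing import Callable


def sync_with_lock(
    lock: threading.Lock,
    do_sync: Callable[[], str],
    *,
    blocking: bool,
    timeout: float = 5.0,
) -> str:
    if blocking:
        # Explicit API enable/sync: wait, but only for a bounded time.
        if not lock.acquire(timeout=timeout):
            raise TimeoutError("plugin_write_busy")
    else:
        # Engine boot: one non-blocking attempt, then carry on without syncing.
        if not lock.acquire(blocking=False):
            return "deferred_lock_busy"
    try:
        return do_sync()
    finally:
        lock.release()
```

Boot therefore never stalls behind a long explicit sync; it serves the currently published skills and leaves the update to a later sync.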
- Step 5: Run runtime tests
cd app-instance/backend
pytest tests/unit/test_plugin_runtime.py tests/unit/test_phase5_skills_runtime.py -q
Expected: PASS.
- Step 6: Commit
git add app-instance/backend/beaver/engine/loader.py app-instance/backend/beaver/plugins app-instance/backend/tests/unit/test_plugin_runtime.py app-instance/backend/tests/unit/test_phase5_skills_runtime.py
git commit -m "feat(runtime): sync declarative plugins at boot"
Task 11: Add Plugin Management API
Files:
- Modify: app-instance/backend/beaver/interfaces/web/app.py
- Test: app-instance/backend/tests/unit/test_plugin_web_api.py
- Step 1: Write failing API tests
Cover:
GET /api/plugins
POST /api/plugins/sync
POST /api/plugins/{plugin_id}/enable
POST /api/plugins/{plugin_id}/pause
POST /api/plugins/{plugin_id}/resume
POST /api/plugins/{plugin_id}/disable
POST /api/plugins/{plugin_id}/skills/{skill_name}/adopt
Assert 404 for unknown plugin, 409 for skill ownership conflict, and 400 for invalid
manifest/sync errors. Assert lock timeout maps to 409 plugin_write_busy. Assert no
payload contains the real absolute workspace or external search-root path. Assert disable
without {"disable_linked_skills": true} is rejected.
- Step 2: Run tests and verify failure
cd app-instance/backend
pytest tests/unit/test_plugin_web_api.py -q
Expected: FAIL with missing routes.
- Step 3: Add normalized plugin payload helper
Return:
{
    "id": manifest.plugin_id,
    "name": manifest.name,
    "discovered_version": manifest.version,
    "installed_version": state.installed_version,
    "enabled": state.enabled,
    "status": state.status,
    "last_error": state.last_error,
    "manifest_path": manifest.display_path,
    "updates_paused": state.updates_paused,
    "skills": [
        {
            "name": declaration.name,
            "status": binding.status,
            "current_beaver_version": binding.current_beaver_version,
            "accepted_upstream_tree_hash": binding.accepted_upstream_tree_hash,
            "observed_upstream_tree_hash": binding.observed_upstream_tree_hash,
            "accepted_beaver_version": binding.accepted_beaver_version,
            "pending_candidate_id": binding.pending_candidate_id,
        }
    ],
}
Never return arbitrary plugin file content, secrets, or absolute server paths.
- Step 4: Implement routes
Each mutating endpoint boots one runtime, invokes its plugin_manager, and returns the
updated plugin payload. Map ValueError messages to stable HTTP status codes.
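One way to keep the status mapping stable is a small pure function that the routes share. This is a sketch: the `code: detail` message convention and the code names other than plugin_write_busy are assumptions, not part of the existing error model.

```python
# Hypothetical error codes carried as a "code: detail" prefix in the message.
STATUS_BY_CODE = {
    "plugin_not_found": 404,
    "skill_ownership_conflict": 409,
    "plugin_write_busy": 409,
}


def http_error_for(exc: Exception) -> tuple[int, str]:
    """Map a plugin-manager exception to (status_code, stable_error_code)."""
    if isinstance(exc, TimeoutError):
        # Lock timeouts always surface as a retryable conflict.
        return 409, "plugin_write_busy"
    code = str(exc).split(":", 1)[0].strip()
    if code in STATUS_BY_CODE:
        return STATUS_BY_CODE[code], code
    # Everything else (invalid manifest, sync errors) is a client-visible 400.
    return 400, "plugin_invalid"
```

A route handler would catch the exception, call this function, and raise the framework's HTTP error with the returned pair, so the tests in Step 1 can assert on codes rather than on message text.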
- Step 5: Run focused and existing web tests
cd app-instance/backend
pytest tests/unit/test_plugin_web_api.py tests/unit/test_skill_learning_web_api.py -q
Expected: PASS.
- Step 6: Commit
git add app-instance/backend/beaver/interfaces/web/app.py app-instance/backend/tests/unit/test_plugin_web_api.py
git commit -m "feat(api): manage declarative plugins"
Task 12: Add Plugin Management To The Skills UI
Files:
- Modify: app-instance/frontend/types/index.ts
- Modify: app-instance/frontend/lib/api.ts
- Modify: app-instance/frontend/app/(app)/skills/page.tsx
- Test: app-instance/frontend/lib/plugin-api.test.ts
- Step 1: Write failing API client tests
Test URL, method, and response typing for list, sync, enable, pause, resume, disable, and adopt.
- Step 2: Run frontend test and verify failure
Run the repository's existing frontend test command targeting:
cd app-instance/frontend
npx vitest run lib/plugin-api.test.ts
Expected: FAIL because plugin API functions do not exist.
- Step 3: Add frontend types
Add:
export interface PluginSkillBinding {
  name: string;
  status: string;
  current_beaver_version?: string | null;
  accepted_upstream_tree_hash?: string | null;
  observed_upstream_tree_hash?: string | null;
  accepted_beaver_version?: string | null;
  pending_candidate_id?: string | null;
}

export interface BeaverPlugin {
  id: string;
  name: string;
  discovered_version?: string | null;
  installed_version?: string | null;
  enabled: boolean;
  updates_paused: boolean;
  status: string;
  last_error?: string | null;
  manifest_path?: string | null;
  skills: PluginSkillBinding[];
}
- Step 4: Add API functions
Implement:
listPlugins()
syncPlugins()
enablePlugin(pluginId)
pausePlugin(pluginId)
resumePlugin(pluginId)
disablePlugin(pluginId, { disable_linked_skills: true })
adoptPluginSkill(pluginId, skillName)
- Step 5: Add a plugins Skills tab
Extend SkillsTab and render a compact table with:
- plugin name and versions;
- enabled/status badges;
- linked skills and pending candidate link;
- icon buttons with tooltips for sync, enable, pause, resume, disable, and adopt;
- confirmation before disable/adopt;
- missing-source warning stating that current skills remain active but updates are suspended;
- existing runAction() and error handling.
Do not add a separate marketing-style page or nested cards.
- Step 6: Label plugin-origin skills and update candidates
In existing Published/Candidates/Drafts views:
- show Plugin source badge when source_kind === "plugin";
- render plugin_skill_update as 插件升级合并 / Plugin update merge;
- show fast_forward or three_way from candidate evidence/provenance.
- Step 7: Run frontend tests and type checks
cd app-instance/frontend
npx vitest run lib/plugin-api.test.ts
npm run lint
npx tsc --noEmit
Expected: PASS.
- Step 8: Commit
git add app-instance/frontend/types/index.ts app-instance/frontend/lib/api.ts app-instance/frontend/lib/plugin-api.test.ts 'app-instance/frontend/app/(app)/skills/page.tsx'
git commit -m "feat(skills-ui): manage plugin skill mirrors"
Task 13: Add End-To-End Lifecycle Coverage And Documentation
Files:
- Create: app-instance/backend/tests/integration/test_plugin_skill_lifecycle.py
- Create: docs/plugins/skill-plugins.md
- Modify: docs/product-discovery/beaver/README.md
- Step 1: Write the end-to-end lifecycle test
The test must:
- create plugin 1.0.0;
- enable it and assert mirror v0001;
- publish a normal learned local revision v0002;
- replace the package with plugin 1.1.0;
- sync and assert one three_way candidate;
- synthesize with a stub provider;
- run safety and replay evaluation with a stub runner;
- submit, approve, and publish v0003;
- assert accepted upstream tree hash and provenance advanced;
- roll back to v0002;
- assert plugin source files were never modified;
- update only a supporting file and assert a new update candidate is created;
- simulate publish-observer failure and assert the next sync reconciles state;
- remove the plugin package and assert the plugin is missing while the current skill remains active;
- run two sync processes and assert no duplicate version or candidate is created.
- Step 2: Run the integration test and fix only lifecycle defects
cd app-instance/backend
pytest tests/integration/test_plugin_skill_lifecycle.py -v
Expected: PASS.
- Step 3: Write operator documentation
Document:
- package layout and manifest;
- discovery roots;
- explicit enable requirement;
- mirror and three-way merge behavior;
- dual content/tree hashing and supporting-file merge conflicts;
- update candidate review flow;
- pause/resume versus disable/adopt;
- recovery from missing/invalid plugins;
- workspace locking, deferred boot sync, and publication reconciliation;
- why plugin Python code is not executed in V1.
- Step 4: Run the complete relevant backend suite
cd app-instance/backend
pytest \
tests/unit/test_plugin_manifest.py \
tests/unit/test_plugin_hashing.py \
tests/unit/test_plugin_state.py \
tests/unit/test_workspace_write_lock.py \
tests/unit/test_plugin_skill_storage.py \
tests/unit/test_plugin_skill_sync.py \
tests/unit/test_plugin_skill_learning.py \
tests/unit/test_plugin_runtime.py \
tests/unit/test_plugin_web_api.py \
tests/unit/test_skill_learning_candidate_state.py \
tests/unit/test_skill_learning_pipeline.py \
tests/unit/test_skill_learning_eval.py \
tests/unit/test_skill_learning_worker.py \
tests/unit/test_phase5_skills_runtime.py \
tests/integration/test_plugin_skill_lifecycle.py \
-q
Expected: PASS.
- Step 5: Run frontend verification
cd app-instance/frontend
npx vitest run lib/plugin-api.test.ts
npm run lint
npx tsc --noEmit
Expected: PASS.
- Step 6: Run a dirty-worktree-safe diff review
git status --short
git diff --check
git diff --stat
Expected:
- no whitespace errors;
- only plugin/skill lifecycle files and planned docs/tests are included in this feature;
- unrelated pre-existing user changes remain untouched.
- Step 7: Commit
git add app-instance/backend/tests/integration/test_plugin_skill_lifecycle.py docs/plugins/skill-plugins.md docs/product-discovery/beaver/README.md
git commit -m "docs(plugins): document skill mirror lifecycle"
Release Sequence
- Ship backend manifest, state, snapshots, and initial mirror behind the Plugins API.
- Enable update candidate generation after initial mirror tests pass in a real workspace.
- Enable three-way synthesis and replay publish gates.
- Ship the Plugins UI.
- Keep executable plugin code disabled; design it separately with process isolation and permission boundaries.
Rollout Metrics
Track:
- plugin discovery and manifest error count;
- initial mirror success/failure count;
- plugin update candidates created, superseded, rejected, and published;
- plugin update candidates caused by supporting-file-only changes;
- fast-forward versus three-way update ratio;
- write-lock contention and deferred boot sync count;
- publication reconciliation repair count;
- replay regression and preservation failure rate;
- time from upstream discovery to accepted publication;
- rollback count for plugin-origin versions.
Final Acceptance Test
The feature is complete only when a plugin-origin skill can:
- be enabled and used with normal skill priority;
- accumulate a normal Beaver-learned revision;
- receive a newer upstream plugin version;
- produce a three-way update draft without editing the plugin package;
- pass the same safety, replay, review, and publish gates as ordinary skills;
- retain full upstream and local provenance;
- detect and publish supporting-file-only updates;
- survive concurrent boot/sync without duplicate versions or candidates;
- recover plugin state after observer failure;
- remain active when its plugin package is temporarily missing;
- be paused, resumed, rolled back, disabled, re-enabled, or adopted without data loss.