feat(app): remove built-in agents, add CORS support, and improve skill upload

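Removed all built-in agent configurations from agents/registry.json, leaving the agents array empty. Added CORS middleware to the web application so that the specified frontend origins can make cross-origin requests. Refactored skill upload to add an LLM rewrite step that automatically normalizes the format of uploaded skills. Added tool-name extraction that automatically identifies the Required Tools section in a skill's body. Updated the payload structure for skill-learning candidates and drafts to include evaluation report statistics. Revised the intent-routing skill's instructions and improved the task status management logic.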
2026-06-12 13:25:20 +08:00
312 changed files with 56029 additions and 1829 deletions
--- a/.agents/skills/speckit-agent-context-update/SKILL.md
+++ b/.agents/skills/speckit-agent-context-update/SKILL.md
@ -0,0 +1,31 @@
+---
+name: speckit-agent-context-update
+description: Refresh the managed Spec Kit section in the coding agent context file
+compatibility: Requires spec-kit project structure with .specify/ directory
+metadata:
+  author: github-spec-kit
+  source: agent-context:commands/speckit.agent-context.update.md
+---
+
+# Update Coding Agent Context
+
+Refresh the managed Spec Kit section inside the active coding agent's context/instruction file (e.g. `CLAUDE.md`, `.github/copilot-instructions.md`, `AGENTS.md`).
+
+## Behavior
+
+The script reads the agent-context extension config at
+`.specify/extensions/agent-context/agent-context-config.yml` to discover:
+
+- `context_file` — the path of the coding agent context file to manage.
+- `context_markers.start` / `.end` — the delimiters surrounding the managed section. Defaults to `<!-- SPECKIT START -->` and `<!-- SPECKIT END -->` when the field is missing.
+
+It then creates, replaces, or appends the managed block so that the section points at the most recent plan path when one can be discovered (`specs/<feature>/plan.md`).
+
+If `context_file` is empty or the file cannot be located, the command reports nothing to do and exits successfully.
+
+## Execution
+
+- **Bash**: `.specify/extensions/agent-context/scripts/bash/update-agent-context.sh [plan_path]`
+- **PowerShell**: `.specify/extensions/agent-context/scripts/powershell/update-agent-context.ps1 [plan_path]`
+
+When `plan_path` is omitted, the script auto-detects the most recently modified `specs/*/plan.md`.
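Review note: the create/replace/append behavior described in the skill above can be sketched as follows. This is a minimal illustration, not the shipped `update-agent-context.sh` logic; the function name `upsert_managed_block` and the block body text are hypothetical, and only the two default markers come from the SKILL.md text.

```python
# Default delimiters, as documented in the SKILL.md above.
DEFAULT_START = "<!-- SPECKIT START -->"
DEFAULT_END = "<!-- SPECKIT END -->"


def upsert_managed_block(text: str, body: str,
                         start: str = DEFAULT_START,
                         end: str = DEFAULT_END) -> str:
    """Return `text` with the managed block created, replaced, or appended.

    Hypothetical helper: the real script may handle edge cases differently.
    """
    block = f"{start}\n{body}\n{end}"
    s = text.find(start)
    e = text.find(end, s + len(start)) if s != -1 else -1
    if s != -1 and e != -1:
        # Replace: swap everything between (and including) the markers.
        return text[:s] + block + text[e + len(end):]
    if not text:
        # Create: the context file is empty, so the block is the whole file.
        return block + "\n"
    # Append: keep existing content and add the block after a blank line.
    sep = "" if text.endswith("\n") else "\n"
    return text + sep + "\n" + block + "\n"
```

Content outside the markers is left untouched in all three cases, which is what makes the section safe to refresh repeatedly.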
--- a/.agents/skills/speckit-analyze/SKILL.md
+++ b/.agents/skills/speckit-analyze/SKILL.md
@ -0,0 +1,257 @@
+---
+name: "speckit-analyze"
+description: "Perform a non-destructive cross-artifact consistency and quality analysis across spec.md, plan.md, and tasks.md after task generation."
+compatibility: "Requires spec-kit project structure with .specify/ directory"
+metadata:
+  author: "github-spec-kit"
+  source: "templates/commands/analyze.md"
+---
+
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Pre-Execution Checks
+
+**Check for extension hooks (before analysis)**:
+- Check if `.specify/extensions.yml` exists in the project root.
+- If it exists, read it and look for entries under the `hooks.before_analyze` key
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Pre-Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+  - **Mandatory hook** (`optional: false`):
+    ```
+    ## Extension Hooks
+
+    **Automatic Pre-Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+
+    Wait for the result of the hook command before proceeding to the Goal.
+    ```
+- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently
+
+## Goal
+
+Identify inconsistencies, duplications, ambiguities, and underspecified items across the three core artifacts (`spec.md`, `plan.md`, `tasks.md`) before implementation. This command MUST run only after `/speckit-tasks` has successfully produced a complete `tasks.md`.
+
+## Operating Constraints
+
+**STRICTLY READ-ONLY**: Do **not** modify any files. Output a structured analysis report. You may offer an optional remediation plan, but the user must explicitly approve it before any follow-up editing commands are run manually.
+
+**Constitution Authority**: The project constitution (`.specify/memory/constitution.md`) is **non-negotiable** within this analysis scope. Constitution conflicts are automatically CRITICAL and require adjustment of the spec, plan, or tasks—not dilution, reinterpretation, or silent ignoring of the principle. If a principle itself needs to change, that must occur in a separate, explicit constitution update outside `/speckit-analyze`.
+
+## Execution Steps
+
+### 1. Initialize Analysis Context
+
+Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` once from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS. Derive absolute paths:
+
+- SPEC = FEATURE_DIR/spec.md
+- PLAN = FEATURE_DIR/plan.md
+- TASKS = FEATURE_DIR/tasks.md
+
+Abort with an error message if any required file is missing (instruct the user to run the missing prerequisite command).
+For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+### 2. Load Artifacts (Progressive Disclosure)
+
+Load only the minimal necessary context from each artifact:
+
+**From spec.md:**
+
+- Overview/Context
+- Functional Requirements
+- Success Criteria (measurable outcomes — e.g., performance, security, availability, user success, business impact)
+- User Stories
+- Edge Cases (if present)
+
+**From plan.md:**
+
+- Architecture/stack choices
+- Data Model references
+- Phases
+- Technical constraints
+
+**From tasks.md:**
+
+- Task IDs
+- Descriptions
+- Phase grouping
+- Parallel markers [P]
+- Referenced file paths
+
+**From constitution:**
+
+- Load `.specify/memory/constitution.md` for principle validation
+
+### 3. Build Semantic Models
+
+Create internal representations (do not include raw artifacts in output):
+
+- **Requirements inventory**: For each Functional Requirement (FR-###) and Success Criterion (SC-###), record a stable key. Use the explicit FR-/SC- identifier as the primary key when present, and optionally also derive an imperative-phrase slug for readability (e.g., "User can upload file" → `user-can-upload-file`). Include only Success Criteria items that require buildable work (e.g., load-testing infrastructure, security audit tooling), and exclude post-launch outcome metrics and business KPIs (e.g., "Reduce support tickets by 50%").
+- **User story/action inventory**: Discrete user actions with acceptance criteria
+- **Task coverage mapping**: Map each task to one or more requirements or stories (inference by keyword / explicit reference patterns like IDs or key phrases)
+- **Constitution rule set**: Extract principle names and MUST/SHOULD normative statements
+
+### 4. Detection Passes (Token-Efficient Analysis)
+
+Focus on high-signal findings. Limit to 50 findings total; aggregate remainder in overflow summary.
+
+#### A. Duplication Detection
+
+- Identify near-duplicate requirements
+- Mark lower-quality phrasing for consolidation
+
+#### B. Ambiguity Detection
+
+- Flag vague adjectives (fast, scalable, secure, intuitive, robust) lacking measurable criteria
+- Flag unresolved placeholders (TODO, TKTK, ???, `<placeholder>`, etc.)
+
+#### C. Underspecification
+
+- Requirements with verbs but missing object or measurable outcome
+- User stories missing acceptance criteria alignment
+- Tasks referencing files or components not defined in spec/plan
+
+#### D. Constitution Alignment
+
+- Any requirement or plan element conflicting with a MUST principle
+- Missing mandated sections or quality gates from constitution
+
+#### E. Coverage Gaps
+
+- Requirements with zero associated tasks
+- Tasks with no mapped requirement/story
+- Success Criteria requiring buildable work (performance, security, availability) not reflected in tasks
+
+#### F. Inconsistency
+
+- Terminology drift (same concept named differently across files)
+- Data entities referenced in plan but absent in spec (or vice versa)
+- Task ordering contradictions (e.g., integration tasks before foundational setup tasks without dependency note)
+- Conflicting requirements (e.g., one requires Next.js while another specifies Vue)
+
+### 5. Severity Assignment
+
+Use this heuristic to prioritize findings:
+
+- **CRITICAL**: Violates constitution MUST, missing core spec artifact, or requirement with zero coverage that blocks baseline functionality
+- **HIGH**: Duplicate or conflicting requirement, ambiguous security/performance attribute, untestable acceptance criterion
+- **MEDIUM**: Terminology drift, missing non-functional task coverage, underspecified edge case
+- **LOW**: Style/wording improvements, minor redundancy not affecting execution order
+
+### 6. Produce Compact Analysis Report
+
+Output a Markdown report (no file writes) with the following structure:
+
+## Specification Analysis Report
+
+| ID | Category | Severity | Location(s) | Summary | Recommendation |
+|----|----------|----------|-------------|---------|----------------|
+| A1 | Duplication | HIGH | spec.md:L120-134 | Two similar requirements ... | Merge phrasing; keep clearer version |
+
+(Add one row per finding; generate stable IDs prefixed by category initial.)
+
+**Coverage Summary Table:**
+
+| Requirement Key | Has Task? | Task IDs | Notes |
+|-----------------|-----------|----------|-------|
+
+**Constitution Alignment Issues:** (if any)
+
+**Unmapped Tasks:** (if any)
+
+**Metrics:**
+
+- Total Requirements
+- Total Tasks
+- Coverage % (requirements with >=1 task)
+- Ambiguity Count
+- Duplication Count
+- Critical Issues Count
+
+### 7. Provide Next Actions
+
+At end of report, output a concise Next Actions block:
+
+- If CRITICAL issues exist: Recommend resolving before `/speckit-implement`
+- If only LOW/MEDIUM: User may proceed, but provide improvement suggestions
+- Provide explicit command suggestions: e.g., "Run /speckit-specify with refinement", "Run /speckit-plan to adjust architecture", "Manually edit tasks.md to add coverage for 'performance-metrics'"
+
+### 8. Offer Remediation
+
+Ask the user: "Would you like me to suggest concrete remediation edits for the top N issues?" (Do NOT apply them automatically.)
+
+### 9. Check for extension hooks
+
+After reporting, check if `.specify/extensions.yml` exists in the project root.
+- If it exists, read it and look for entries under the `hooks.after_analyze` key
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+  - **Mandatory hook** (`optional: false`):
+    ```
+    ## Extension Hooks
+
+    **Automatic Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+    ```
+- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently
+
+## Operating Principles
+
+### Context Efficiency
+
+- **Minimal high-signal tokens**: Focus on actionable findings, not exhaustive documentation
+- **Progressive disclosure**: Load artifacts incrementally; don't dump all content into analysis
+- **Token-efficient output**: Limit findings table to 50 rows; summarize overflow
+- **Deterministic results**: Rerunning without changes should produce consistent IDs and counts
+
+### Analysis Guidelines
+
+- **NEVER modify files** (this is read-only analysis)
+- **NEVER hallucinate missing sections** (if absent, report them accurately)
+- **Prioritize constitution violations** (these are always CRITICAL)
+- **Use examples over exhaustive rules** (cite specific instances, not generic patterns)
+- **Report zero issues gracefully** (emit success report with coverage statistics)
+
+## Context
+
+$ARGUMENTS
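Review note: the hook-filtering rules in the pre- and post-execution sections above can be sketched in a few lines. This assumes the `hooks.before_analyze` or `hooks.after_analyze` entries have already been parsed from `.specify/extensions.yml` into a list of dictionaries; the function names and the `command` key are assumptions made for illustration, not part of the spec-kit API.

```python
def hook_to_slash_command(name: str) -> str:
    """Build a slash command from a hook command name (dots become hyphens)."""
    return "/" + name.replace(".", "-")


def executable_hooks(hooks):
    """Filter parsed hook entries down to the ones treated as executable.

    Hypothetical helper: `hooks` is assumed to be a list of dicts, or None
    when nothing is registered.
    """
    result = []
    for hook in hooks or []:
        # Only an explicit `enabled: false` disables a hook.
        if hook.get("enabled") is False:
            continue
        condition = hook.get("condition")
        if isinstance(condition, str):
            condition = condition.strip()
        # A non-empty condition is never evaluated here; it is skipped and
        # left to the HookExecutor implementation.
        if condition:
            continue
        result.append(hook)
    return result
```

Checking `enabled is False` rather than general falsiness keeps hooks without an `enabled` field enabled by default, as the skill text requires.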
--- a/.agents/skills/speckit-checklist/SKILL.md
+++ b/.agents/skills/speckit-checklist/SKILL.md
@ -0,0 +1,371 @@
+---
+name: "speckit-checklist"
+description: "Generate a custom checklist for the current feature based on user requirements."
+compatibility: "Requires spec-kit project structure with .specify/ directory"
+metadata:
+  author: "github-spec-kit"
+  source: "templates/commands/checklist.md"
+---
+
+
+## Checklist Purpose: "Unit Tests for English"
+
+**CRITICAL CONCEPT**: Checklists are **UNIT TESTS FOR REQUIREMENTS WRITING** - they validate the quality, clarity, and completeness of requirements in a given domain.
+
+**NOT for verification/testing**:
+
+- ❌ NOT "Verify the button clicks correctly"
+- ❌ NOT "Test error handling works"
+- ❌ NOT "Confirm the API returns 200"
+- ❌ NOT checking if code/implementation matches the spec
+
+**FOR requirements quality validation**:
+
+- ✅ "Are visual hierarchy requirements defined for all card types?" (completeness)
+- ✅ "Is 'prominent display' quantified with specific sizing/positioning?" (clarity)
+- ✅ "Are hover state requirements consistent across all interactive elements?" (consistency)
+- ✅ "Are accessibility requirements defined for keyboard navigation?" (coverage)
+- ✅ "Does the spec define what happens when logo image fails to load?" (edge cases)
+
+**Metaphor**: If your spec is code written in English, the checklist is its unit test suite. You're testing whether the requirements are well-written, complete, unambiguous, and ready for implementation - NOT whether the implementation works.
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Pre-Execution Checks
+
+**Check for extension hooks (before checklist generation)**:
+- Check if `.specify/extensions.yml` exists in the project root.
+- If it exists, read it and look for entries under the `hooks.before_checklist` key
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Pre-Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+  - **Mandatory hook** (`optional: false`):
+    ```
+    ## Extension Hooks
+
+    **Automatic Pre-Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+
+    Wait for the result of the hook command before proceeding to the Execution Steps.
+    ```
+- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently
+
+## Execution Steps
+
+1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS list.
+   - All file paths must be absolute.
+   - For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+2. **IF EXISTS**: Load `.specify/memory/constitution.md` for project principles and governance constraints.
+
+3. **Clarify intent (dynamic)**: Derive up to THREE initial contextual clarifying questions (no pre-baked catalog). They MUST:
+   - Be generated from the user's phrasing + extracted signals from spec/plan/tasks
+   - Only ask about information that materially changes checklist content
+   - Be skipped individually if already unambiguous in `$ARGUMENTS`
+   - Prefer precision over breadth
+
+   Generation algorithm:
+   1. Extract signals: feature domain keywords (e.g., auth, latency, UX, API), risk indicators ("critical", "must", "compliance"), stakeholder hints ("QA", "review", "security team"), and explicit deliverables ("a11y", "rollback", "contracts").
+   2. Cluster signals into candidate focus areas (max 4) ranked by relevance.
+   3. Identify probable audience & timing (author, reviewer, QA, release) if not explicit.
+   4. Detect missing dimensions: scope breadth, depth/rigor, risk emphasis, exclusion boundaries, measurable acceptance criteria.
+   5. Formulate questions chosen from these archetypes:
+      - Scope refinement (e.g., "Should this include integration touchpoints with X and Y or stay limited to local module correctness?")
+      - Risk prioritization (e.g., "Which of these potential risk areas should receive mandatory gating checks?")
+      - Depth calibration (e.g., "Is this a lightweight pre-commit sanity list or a formal release gate?")
+      - Audience framing (e.g., "Will this be used by the author only or peers during PR review?")
+      - Boundary exclusion (e.g., "Should we explicitly exclude performance tuning items this round?")
+      - Scenario class gap (e.g., "No recovery flows detected—are rollback / partial failure paths in scope?")
+
+   Question formatting rules:
+   - If presenting options, generate a compact table with columns: Option | Candidate | Why It Matters
+   - Limit to A–E options maximum; omit table if a free-form answer is clearer
+   - Never ask the user to restate what they already said
+   - Avoid speculative categories (no hallucination). If uncertain, ask explicitly: "Confirm whether X belongs in scope."
+
+   Defaults when interaction impossible:
+   - Depth: Standard
+   - Audience: Reviewer (PR) if code-related; Author otherwise
+   - Focus: Top 2 relevance clusters
+
+   Output the questions (label Q1/Q2/Q3). After answers: if ≥2 scenario classes (Alternate / Exception / Recovery / Non-Functional domain) remain unclear, you MAY ask up to TWO more targeted follow‑ups (Q4/Q5) with a one-line justification each (e.g., "Unresolved recovery path risk"). Do not exceed five total questions. Skip escalation if user explicitly declines more.
+
+4. **Understand user request**: Combine `$ARGUMENTS` + clarifying answers:
+   - Derive checklist theme (e.g., security, review, deploy, ux)
+   - Consolidate explicit must-have items mentioned by user
+   - Map focus selections to category scaffolding
+   - Infer any missing context from spec/plan/tasks (do NOT hallucinate)
+
+5. **Load feature context**: Read from FEATURE_DIR:
+   - spec.md: Feature requirements and scope
+   - plan.md (if exists): Technical details, dependencies
+   - tasks.md (if exists): Implementation tasks
+
+   **Context Loading Strategy**:
+   - Load only necessary portions relevant to active focus areas (avoid full-file dumping)
+   - Prefer summarizing long sections into concise scenario/requirement bullets
+   - Use progressive disclosure: add follow-on retrieval only if gaps detected
+   - If source docs are large, generate interim summary items instead of embedding raw text
+
+6. **Generate checklist** - Create "Unit Tests for Requirements":
+   - Create `FEATURE_DIR/checklists/` directory if it doesn't exist
+   - Generate unique checklist filename:
+     - Use short, descriptive name based on domain (e.g., `ux.md`, `api.md`, `security.md`)
+     - Format: `[domain].md`
+   - File handling behavior:
+     - If file does NOT exist: Create new file and number items starting from CHK001
+     - If file exists: Append new items to existing file, continuing from the last CHK ID (e.g., if last item is CHK015, start new items at CHK016)
+   - Never delete or replace existing checklist content - always preserve and append
+
+   **CORE PRINCIPLE - Test the Requirements, Not the Implementation**:
+   Every checklist item MUST evaluate the REQUIREMENTS THEMSELVES for:
+   - **Completeness**: Are all necessary requirements present?
+   - **Clarity**: Are requirements unambiguous and specific?
+   - **Consistency**: Do requirements align with each other?
+   - **Measurability**: Can requirements be objectively verified?
+   - **Coverage**: Are all scenarios/edge cases addressed?
+
+   **Category Structure** - Group items by requirement quality dimensions:
+   - **Requirement Completeness** (Are all necessary requirements documented?)
+   - **Requirement Clarity** (Are requirements specific and unambiguous?)
+   - **Requirement Consistency** (Do requirements align without conflicts?)
+   - **Acceptance Criteria Quality** (Are success criteria measurable?)
+   - **Scenario Coverage** (Are all flows/cases addressed?)
+   - **Edge Case Coverage** (Are boundary conditions defined?)
+   - **Non-Functional Requirements** (Performance, Security, Accessibility, etc. - are they specified?)
+   - **Dependencies & Assumptions** (Are they documented and validated?)
+   - **Ambiguities & Conflicts** (What needs clarification?)
+
+   **HOW TO WRITE CHECKLIST ITEMS - "Unit Tests for English"**:
+
+   ❌ **WRONG** (Testing implementation):
+   - "Verify landing page displays 3 episode cards"
+   - "Test hover states work on desktop"
+   - "Confirm logo click navigates home"
+
+   ✅ **CORRECT** (Testing requirements quality):
+   - "Are the exact number and layout of featured episodes specified?" [Completeness]
+   - "Is 'prominent display' quantified with specific sizing/positioning?" [Clarity]
+   - "Are hover state requirements consistent across all interactive elements?" [Consistency]
+   - "Are keyboard navigation requirements defined for all interactive UI?" [Coverage]
+   - "Is the fallback behavior specified when logo image fails to load?" [Edge Cases]
+   - "Are loading states defined for asynchronous episode data?" [Completeness]
+   - "Does the spec define visual hierarchy for competing UI elements?" [Clarity]
+
+   **ITEM STRUCTURE**:
+   Each item should follow this pattern:
+   - Question format asking about requirement quality
+   - Focus on what's WRITTEN (or not written) in the spec/plan
+   - Include quality dimension in brackets [Completeness/Clarity/Consistency/etc.]
+   - Reference spec section `[Spec §X.Y]` when checking existing requirements
+   - Use `[Gap]` marker when checking for missing requirements
+
+   **EXAMPLES BY QUALITY DIMENSION**:
+
+   Completeness:
+   - "Are error handling requirements defined for all API failure modes? [Gap]"
+   - "Are accessibility requirements specified for all interactive elements? [Completeness]"
+   - "Are mobile breakpoint requirements defined for responsive layouts? [Gap]"
+
+   Clarity:
+   - "Is 'fast loading' quantified with specific timing thresholds? [Clarity, Spec §NFR-2]"
+   - "Are 'related episodes' selection criteria explicitly defined? [Clarity, Spec §FR-5]"
+   - "Is 'prominent' defined with measurable visual properties? [Ambiguity, Spec §FR-4]"
+
+   Consistency:
+   - "Do navigation requirements align across all pages? [Consistency, Spec §FR-10]"
+   - "Are card component requirements consistent between landing and detail pages? [Consistency]"
+
+   Coverage:
+   - "Are requirements defined for zero-state scenarios (no episodes)? [Coverage, Edge Case]"
+   - "Are concurrent user interaction scenarios addressed? [Coverage, Gap]"
+   - "Are requirements specified for partial data loading failures? [Coverage, Exception Flow]"
+
+   Measurability:
+   - "Are visual hierarchy requirements measurable/testable? [Acceptance Criteria, Spec §FR-1]"
+   - "Can 'balanced visual weight' be objectively verified? [Measurability, Spec §FR-2]"
+
+   **Scenario Classification & Coverage** (Requirements Quality Focus):
+   - Check if requirements exist for: Primary, Alternate, Exception/Error, Recovery, Non-Functional scenarios
+   - For each scenario class, ask: "Are [scenario type] requirements complete, clear, and consistent?"
+   - If scenario class missing: "Are [scenario type] requirements intentionally excluded or missing? [Gap]"
+   - Include resilience/rollback when state mutation occurs: "Are rollback requirements defined for migration failures? [Gap]"
+
+   **Traceability Requirements**:
+   - MINIMUM: ≥80% of items MUST include at least one traceability reference
+   - Each item should reference: spec section `[Spec §X.Y]`, or use markers: `[Gap]`, `[Ambiguity]`, `[Conflict]`, `[Assumption]`
+   - If no ID system exists: "Is a requirement & acceptance criteria ID scheme established? [Traceability]"
+
+   **Surface & Resolve Issues** (Requirements Quality Problems):
+   Ask questions about the requirements themselves:
+   - Ambiguities: "Is the term 'fast' quantified with specific metrics? [Ambiguity, Spec §NFR-1]"
+   - Conflicts: "Do navigation requirements conflict between §FR-10 and §FR-10a? [Conflict]"
+   - Assumptions: "Is the assumption of 'always available podcast API' validated? [Assumption]"
+   - Dependencies: "Are external podcast API requirements documented? [Dependency, Gap]"
+   - Missing definitions: "Is 'visual hierarchy' defined with measurable criteria? [Gap]"
+
+   **Content Consolidation**:
+   - Soft cap: If raw candidate items > 40, prioritize by risk/impact
+   - Merge near-duplicates checking the same requirement aspect
+   - If >5 low-impact edge cases, create one item: "Are edge cases X, Y, Z addressed in requirements? [Coverage]"
+
+   **🚫 ABSOLUTELY PROHIBITED** - These make it an implementation test, not a requirements test:
+   - ❌ Any item starting with "Verify", "Test", "Confirm", "Check" + implementation behavior
+   - ❌ References to code execution, user actions, system behavior
+   - ❌ "Displays correctly", "works properly", "functions as expected"
+   - ❌ "Click", "navigate", "render", "load", "execute"
+   - ❌ Test cases, test plans, QA procedures
+   - ❌ Implementation details (frameworks, APIs, algorithms)
+
+   **✅ REQUIRED PATTERNS** - These test requirements quality:
+   - ✅ "Are [requirement type] defined/specified/documented for [scenario]?"
+   - ✅ "Is [vague term] quantified/clarified with specific criteria?"
+   - ✅ "Are requirements consistent between [section A] and [section B]?"
+   - ✅ "Can [requirement] be objectively measured/verified?"
+   - ✅ "Are [edge cases/scenarios] addressed in requirements?"
+   - ✅ "Does the spec define [missing aspect]?"
+
+7. **Structure Reference**: Generate the checklist following the canonical template in `.specify/templates/checklist-template.md` for title, meta section, category headings, and ID formatting. If template is unavailable, use: H1 title, purpose/created meta lines, `##` category sections containing `- [ ] CHK### <requirement item>` lines with globally incrementing IDs starting at CHK001.
+
+8. **Report**: Output the full path to the checklist file, the item count, and whether the run created a new file or appended to an existing one. Also summarize:
+   - Focus areas selected
+   - Depth level
+   - Actor/timing
+   - Any explicit user-specified must-have items incorporated
+
+**Important**: Each `/speckit-checklist` command invocation uses a short, descriptive checklist filename and either creates a new file or appends to an existing one. This allows:
+
+- Multiple checklists of different types (e.g., `ux.md`, `test.md`, `security.md`)
+- Simple, memorable filenames that indicate checklist purpose
+- Easy identification and navigation in the `checklists/` folder
+
+To avoid clutter, use descriptive types and clean up obsolete checklists when done.
+
+## Example Checklist Types & Sample Items
+
+**UX Requirements Quality:** `ux.md`
+
+Sample items (testing the requirements, NOT the implementation):
+
+- "Are visual hierarchy requirements defined with measurable criteria? [Clarity, Spec §FR-1]"
+- "Are the number and positioning of UI elements explicitly specified? [Completeness, Spec §FR-1]"
+- "Are interaction state requirements (hover, focus, active) consistently defined? [Consistency]"
+- "Are accessibility requirements specified for all interactive elements? [Coverage, Gap]"
+- "Is fallback behavior defined when images fail to load? [Edge Case, Gap]"
+- "Can 'prominent display' be objectively measured? [Measurability, Spec §FR-4]"
+
+**API Requirements Quality:** `api.md`
+
+Sample items:
+
+- "Are error response formats specified for all failure scenarios? [Completeness]"
+- "Are rate limiting requirements quantified with specific thresholds? [Clarity]"
+- "Are authentication requirements consistent across all endpoints? [Consistency]"
+- "Are retry/timeout requirements defined for external dependencies? [Coverage, Gap]"
+- "Is versioning strategy documented in requirements? [Gap]"
+
+**Performance Requirements Quality:** `performance.md`
+
+Sample items:
+
+- "Are performance requirements quantified with specific metrics? [Clarity]"
+- "Are performance targets defined for all critical user journeys? [Coverage]"
+- "Are performance requirements under different load conditions specified? [Completeness]"
+- "Can performance requirements be objectively measured? [Measurability]"
+- "Are degradation requirements defined for high-load scenarios? [Edge Case, Gap]"
+
+**Security Requirements Quality:** `security.md`
+
+Sample items:
+
+- "Are authentication requirements specified for all protected resources? [Coverage]"
+- "Are data protection requirements defined for sensitive information? [Completeness]"
+- "Is the threat model documented and requirements aligned to it? [Traceability]"
+- "Are security requirements consistent with compliance obligations? [Consistency]"
+- "Are security failure/breach response requirements defined? [Gap, Exception Flow]"
+
+## Anti-Examples: What NOT To Do
+
+**❌ WRONG - These test implementation, not requirements:**
+
+```markdown
+- [ ] CHK001 - Verify landing page displays 3 episode cards [Spec §FR-001]
+- [ ] CHK002 - Test hover states work correctly on desktop [Spec §FR-003]
+- [ ] CHK003 - Confirm logo click navigates to home page [Spec §FR-010]
+- [ ] CHK004 - Check that related episodes section shows 3-5 items [Spec §FR-005]
+```
+
+**✅ CORRECT - These test requirements quality:**
+
+```markdown
+- [ ] CHK001 - Are the number and layout of featured episodes explicitly specified? [Completeness, Spec §FR-001]
+- [ ] CHK002 - Are hover state requirements consistently defined for all interactive elements? [Consistency, Spec §FR-003]
+- [ ] CHK003 - Are navigation requirements clear for all clickable brand elements? [Clarity, Spec §FR-010]
+- [ ] CHK004 - Are the selection criteria for related episodes documented? [Gap, Spec §FR-005]
+- [ ] CHK005 - Are loading state requirements defined for asynchronous episode data? [Gap]
+- [ ] CHK006 - Can "visual hierarchy" requirements be objectively measured? [Measurability, Spec §FR-001]
+```
+
+**Key Differences:**
+
+- Wrong: Tests if the system works correctly
+- Correct: Tests if the requirements are written correctly
+- Wrong: Verification of behavior
+- Correct: Validation of requirement quality
+- Wrong: "Does it do X?"
+- Correct: "Is X clearly specified?"
+
+## Post-Execution Checks
+
+**Check for extension hooks (after checklist generation)**:
+- Check if `.specify/extensions.yml` exists in the project root.
+- If it exists, read it and look for entries under the `hooks.after_checklist` key
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+  - **Mandatory hook** (`optional: false`):
+    ```
+    ## Extension Hooks
+
+    **Automatic Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+    ```
+- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently
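Note on the hook rules above: the filtering and command-naming steps can be sketched in Python as follows. This is a minimal illustration, not the HookExecutor implementation. It assumes the YAML has already been parsed into a dictionary and that each hook entry stores its command name under a `command` key; that key name and both function names are assumptions made for the example.

```python
from typing import Any


def slash_command(command_name: str) -> str:
    """Build a slash command from a hook command name (dots become hyphens)."""
    return "/" + command_name.replace(".", "-")


def executable_hooks(config: dict[str, Any], event: str) -> list[dict[str, Any]]:
    """Return the hooks registered for `event` that should be reported.

    `config` is the already-parsed content of extensions.yml (assumed shape:
    {"hooks": {"<event>": [ {...hook...}, ... ]}}).
    """
    hooks = (config.get("hooks") or {}).get(event) or []
    selected = []
    for hook in hooks:
        # Only an explicit `enabled: false` disables a hook; a missing
        # field counts as enabled.
        if hook.get("enabled") is False:
            continue
        # A non-empty condition is never evaluated here; such hooks are
        # skipped and left to the HookExecutor.
        if hook.get("condition"):
            continue
        selected.append(hook)
    return selected


if __name__ == "__main__":
    parsed = {
        "hooks": {
            "after_checklist": [
                {"command": "speckit.git.commit", "optional": True},
                {"command": "example.disabled", "enabled": False},
                {"command": "example.conditional", "condition": "some expression"},
            ]
        }
    }
    for hook in executable_hooks(parsed, "after_checklist"):
        print(slash_command(hook["command"]))
```

An empty or missing `hooks` section yields an empty list, which corresponds to the "skip silently" case.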
--- a/.agents/skills/speckit-clarify/SKILL.md
+++ b/.agents/skills/speckit-clarify/SKILL.md
@ -0,0 +1,283 @@
+---
+name: "speckit-clarify"
+description: "Identify underspecified areas in the current feature spec by asking up to 5 highly targeted clarification questions and encoding answers back into the spec."
+compatibility: "Requires spec-kit project structure with .specify/ directory"
+metadata:
+  author: "github-spec-kit"
+  source: "templates/commands/clarify.md"
+---
+
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Pre-Execution Checks
+
+**Check for extension hooks (before clarification)**:
+- Check if `.specify/extensions.yml` exists in the project root.
+- If it exists, read it and look for entries under the `hooks.before_clarify` key
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Pre-Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+  - **Mandatory hook** (`optional: false`):
+    ```
+    ## Extension Hooks
+
+    **Automatic Pre-Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+
+    Wait for the result of the hook command before proceeding to the Outline.
+    ```
+- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently
+
+## Outline
+
+Goal: Detect and reduce ambiguity or missing decision points in the active feature specification and record the clarifications directly in the spec file.
+
+Note: This clarification workflow is expected to run (and be completed) BEFORE invoking `/speckit-plan`. If the user explicitly states they are skipping clarification (e.g., exploratory spike), you may proceed, but must warn that downstream rework risk increases.
+
+Execution steps:
+
+1. Run `.specify/scripts/bash/check-prerequisites.sh --json --paths-only` from repo root **once** (combined `--json --paths-only` mode / `-Json -PathsOnly`). Parse minimal JSON payload fields:
+   - `FEATURE_DIR`
+   - `FEATURE_SPEC`
+   - (Optionally capture `IMPL_PLAN`, `TASKS` for future chained flows.)
+   - If JSON parsing fails, abort and instruct user to re-run `/speckit-specify` or verify feature branch environment.
+   - For single quotes in args like "I'm Groot", use escape syntax, e.g., 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+2. **IF EXISTS**: Load `.specify/memory/constitution.md` for project principles and governance constraints.
+
+3. Load the current spec file. Perform a structured ambiguity & coverage scan using this taxonomy. For each category, mark status: Clear / Partial / Missing. Produce an internal coverage map used for prioritization (do not output raw map unless no questions will be asked).
+
+   Functional Scope & Behavior:
+   - Core user goals & success criteria
+   - Explicit out-of-scope declarations
+   - User roles / personas differentiation
+
+   Domain & Data Model:
+   - Entities, attributes, relationships
+   - Identity & uniqueness rules
+   - Lifecycle/state transitions
+   - Data volume / scale assumptions
+
+   Interaction & UX Flow:
+   - Critical user journeys / sequences
+   - Error/empty/loading states
+   - Accessibility or localization notes
+
+   Non-Functional Quality Attributes:
+   - Performance (latency, throughput targets)
+   - Scalability (horizontal/vertical, limits)
+   - Reliability & availability (uptime, recovery expectations)
+   - Observability (logging, metrics, tracing signals)
+   - Security & privacy (authN/Z, data protection, threat assumptions)
+   - Compliance / regulatory constraints (if any)
+
+   Integration & External Dependencies:
+   - External services/APIs and failure modes
+   - Data import/export formats
+   - Protocol/versioning assumptions
+
+   Edge Cases & Failure Handling:
+   - Negative scenarios
+   - Rate limiting / throttling
+   - Conflict resolution (e.g., concurrent edits)
+
+   Constraints & Tradeoffs:
+   - Technical constraints (language, storage, hosting)
+   - Explicit tradeoffs or rejected alternatives
+
+   Terminology & Consistency:
+   - Canonical glossary terms
+   - Avoided synonyms / deprecated terms
+
+   Completion Signals:
+   - Acceptance criteria testability
+   - Measurable Definition of Done style indicators
+
+   Misc / Placeholders:
+   - TODO markers / unresolved decisions
+   - Ambiguous adjectives ("robust", "intuitive") lacking quantification
+
+   For each category with Partial or Missing status, add a candidate question opportunity unless:
+   - Clarification would not materially change implementation or validation strategy
+   - Information is better deferred to planning phase (note internally)
+
+4. Generate (internally) a prioritized queue of candidate clarification questions (maximum 5). Do NOT output them all at once. Apply these constraints:
+    - Maximum of 5 total questions across the whole session.
+    - Each question must be answerable with EITHER:
+       - A short multiple‑choice selection (2–5 distinct, mutually exclusive options), OR
+       - A one-word / short‑phrase answer (explicitly constrain: "Answer in <=5 words").
+    - Only include questions whose answers materially impact architecture, data modeling, task decomposition, test design, UX behavior, operational readiness, or compliance validation.
+    - Ensure category coverage balance: attempt to cover the highest impact unresolved categories first; avoid asking two low-impact questions when a single high-impact area (e.g., security posture) is unresolved.
+    - Exclude questions already answered, trivial stylistic preferences, or plan-level execution details (unless blocking correctness).
+    - Favor clarifications that reduce downstream rework risk or prevent misaligned acceptance tests.
+    - If more than 5 categories remain unresolved, select the top 5 by (Impact * Uncertainty) heuristic.
+
+5. Sequential questioning loop (interactive):
+    - Present EXACTLY ONE question at a time.
+    - For multiple‑choice questions:
+       - **Analyze all options** and determine the **most suitable option** based on:
+          - Best practices for the project type
+          - Common patterns in similar implementations
+          - Risk reduction (security, performance, maintainability)
+          - Alignment with any explicit project goals or constraints visible in the spec
+       - Present your **recommended option prominently** at the top with clear reasoning (1-2 sentences explaining why this is the best choice).
+       - Format as: `**Recommended:** Option [X] - <reasoning>`
+       - Then render all options as a Markdown table:
+
+       | Option | Description |
+       |--------|-------------|
+       | A | <Option A description> |
+       | B | <Option B description> |
+       | C | <Option C description> (add D/E as needed up to 5) |
+       | Short | Provide a different short answer (<=5 words) (Include only if free-form alternative is appropriate) |
+
+       - After the table, add: `You can reply with the option letter (e.g., "A"), accept the recommendation by saying "yes" or "recommended", or provide your own short answer.`
+    - For short‑answer style (no meaningful discrete options):
+       - Provide your **suggested answer** based on best practices and context.
+       - Format as: `**Suggested:** <your proposed answer> - <brief reasoning>`
+       - Then output: `Format: Short answer (<=5 words). You can accept the suggestion by saying "yes" or "suggested", or provide your own answer.`
+    - After the user answers:
+       - If the user replies with "yes", "recommended", or "suggested", use your previously stated recommendation/suggestion as the answer.
+       - Otherwise, validate the answer maps to one option or fits the <=5 word constraint.
+       - If ambiguous, ask for a quick disambiguation (count still belongs to same question; do not advance).
+       - Once satisfactory, record it in working memory (do not yet write to disk) and move to the next queued question.
+    - Stop asking further questions when:
+       - All critical ambiguities resolved early (remaining queued items become unnecessary), OR
+       - User signals completion ("done", "good", "no more"), OR
+       - You reach 5 asked questions.
+    - Never reveal future queued questions in advance.
+    - If no valid questions exist at start, immediately report no critical ambiguities.
+
+6. Integration after EACH accepted answer (incremental update approach):
+    - Maintain in-memory representation of the spec (loaded once at start) plus the raw file contents.
+    - For the first integrated answer in this session:
+       - Ensure a `## Clarifications` section exists (create it just after the highest-level contextual/overview section per the spec template if missing).
+       - Under it, create (if not present) a `### Session YYYY-MM-DD` subheading for today.
+    - Append a bullet line immediately after acceptance: `- Q: <question> → A: <final answer>`.
+    - Then immediately apply the clarification to the most appropriate section(s):
+       - Functional ambiguity → Update or add a bullet in Functional Requirements.
+       - User interaction / actor distinction → Update User Stories or Actors subsection (if present) with clarified role, constraint, or scenario.
+       - Data shape / entities → Update Data Model (add fields, types, relationships) preserving ordering; note added constraints succinctly.
+       - Non-functional constraint → Add/modify measurable criteria in Success Criteria > Measurable Outcomes (convert vague adjective to metric or explicit target).
+       - Edge case / negative flow → Add a new bullet under Edge Cases / Error Handling (or create such subsection if template provides placeholder for it).
+       - Terminology conflict → Normalize term across spec; retain original only if necessary by adding `(formerly referred to as "X")` once.
+    - If the clarification invalidates an earlier ambiguous statement, replace that statement instead of duplicating; leave no obsolete contradictory text.
+    - Save the spec file AFTER each integration to minimize risk of context loss (atomic overwrite).
+    - Preserve formatting: do not reorder unrelated sections; keep heading hierarchy intact.
+    - Keep each inserted clarification minimal and testable (avoid narrative drift).
+
+7. Validation (performed after EACH write plus final pass):
+   - Clarifications session contains exactly one bullet per accepted answer (no duplicates).
+   - Total asked (accepted) questions ≤ 5.
+   - Updated sections contain no lingering vague placeholders the new answer was meant to resolve.
+   - No contradictory earlier statement remains (scan to confirm that now-invalid alternative choices were removed).
+   - Markdown structure valid; only allowed new headings: `## Clarifications`, `### Session YYYY-MM-DD`.
+   - Terminology consistency: same canonical term used across all updated sections.
+
+8. Write the updated spec back to `FEATURE_SPEC`.
+
+9. **Re-validate Spec Quality Checklist** (if it exists):
+   - Check if `FEATURE_DIR/checklists/requirements.md` exists.
+   - If it does NOT exist, skip this step silently.
+   - If it exists:
+     1. Read the checklist file.
+     2. Identify all GitHub task-list checkbox lines — lines matching `- [ ]`, `- [x]`, or `- [X]` (case-insensitive, tolerant of leading whitespace for nested items) outside of code fences. Ignore all other content (headings, notes, non-checkbox bullets, metadata).
+     3. For each checkbox line, record its current marker state (checked or unchecked) and item text into a before-snapshot list.
+     4. Re-evaluate each checkbox item against the **updated** spec (the version just saved in step 8).
+     5. For each checkbox item, update only if the checked/unchecked state actually changes:
+        - If the item now passes and was unchecked: change `[ ]` to `[x]`.
+        - If the item now fails and was checked: change `[x]`/`[X]` to `[ ]`.
+        - If the state is unchanged: leave the marker as-is (preserve existing case to avoid cosmetic diffs).
+     6. Save the updated checklist file. **Only toggle the `[ ]`/`[x]` marker portion of checkbox lines whose state changed.** All other file content — headings, metadata, notes, line ordering, whitespace — must remain unchanged to avoid noisy diffs.
+     7. Compare the before-snapshot with the current state to compute three lists for the Completion Report:
+        - **Newly passing**: items that changed from unchecked to checked.
+        - **Regressions**: items that changed from checked to unchecked.
+        - **Still unchecked**: items that remain unchecked.
+     8. Record the before/after pass counts as checked/total checkbox items (e.g., "12/16 → 15/16 items passing").
+
+Behavior rules:
+
+- If no meaningful ambiguities found (or all potential questions would be low-impact), respond: "No critical ambiguities detected worth formal clarification." and suggest proceeding.
+- If spec file missing, instruct user to run `/speckit-specify` first (do not create a new spec here).
+- Never exceed 5 total asked questions (clarification retries for a single question do not count as new questions).
+- Avoid speculative tech stack questions unless the absence blocks functional clarity.
+- Respect user early termination signals ("stop", "done", "proceed").
+- If no questions asked due to full coverage, output a compact coverage summary (all categories Clear) then suggest advancing.
+- If quota reached with unresolved high-impact categories remaining, explicitly flag them under Deferred with rationale.
+
+Context for prioritization: $ARGUMENTS
+
+## Mandatory Post-Execution Hooks
+
+**You MUST complete this section before reporting completion to the user.**
+
+Check if `.specify/extensions.yml` exists in the project root.
+- If it does not exist, or no hooks are registered under `hooks.after_clarify`, skip to the Completion Report.
+- If it exists, read it and look for entries under the `hooks.after_clarify` key.
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue to the Completion Report.
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Mandatory hook** (`optional: false`) — **You MUST emit `EXECUTE_COMMAND:` for each mandatory hook**:
+    ```
+    ## Extension Hooks
+
+    **Automatic Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+    ```
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+
+## Completion Report
+
+Report completion (after questioning loop ends or early termination):
+- Number of questions asked & answered.
+- Path to updated spec.
+- Sections touched (list names).
+- Spec quality checklist status (if `FEATURE_DIR/checklists/requirements.md` was re-validated): show before/after pass counts (e.g., "Spec Quality Checklist: 12/16 → 15/16 items passing") and list any items that changed state — both newly checked (unchecked → checked) and any regressions (checked → unchecked). If any items remain unchecked, list them as areas needing attention.
+- Coverage summary table listing each taxonomy category with Status: Resolved (was Partial/Missing and addressed), Deferred (exceeds question quota or better suited for planning), Clear (already sufficient), Outstanding (still Partial/Missing but low impact).
+- If any Outstanding or Deferred remain, recommend whether to proceed to `/speckit-plan` or run `/speckit-clarify` again later post-plan.
+- Suggested next command.
+
+## Done When
+
+- [ ] Spec ambiguities identified and clarifications integrated into spec file
+- [ ] Spec quality checklist re-validated against updated spec (if `FEATURE_DIR/checklists/requirements.md` exists)
+- [ ] Extension hooks dispatched or skipped according to the rules in Mandatory Post-Execution Hooks above
+- [ ] Completion reported to user with questions answered, sections touched, checklist status, and coverage summary
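Note on step 9 above: the checkbox snapshot, minimal-diff toggle, and before/after counts can be sketched as below. This is a rough illustration under stated assumptions, not the skill's actual mechanism. The `passes` callback stands in for the agent's re-evaluation of an item against the updated spec and is hypothetical, as are the function and key names. Fence tracking is simplified: any line that opens with three or more backticks or tildes toggles the "inside a code fence" state.

```python
import re
from typing import Callable

# A task-list line: optional indentation, "- [ ]" / "- [x]" / "- [X]", then text.
CHECKBOX = re.compile(r"^\s*- \[( |x|X)\]\s+(.*)$")
# A code-fence delimiter (three or more backticks or tildes).
FENCE = re.compile(r"^\s*(`{3,}|~{3,})")


def revalidate(markdown: str, passes: Callable[[str], bool]) -> tuple[str, dict]:
    """Re-check every checkbox outside code fences.

    Only the marker character of lines whose state changed is rewritten;
    unchanged lines (including an existing "[X]") are left byte-for-byte.
    """
    lines = markdown.split("\n")
    newly_passing, regressions, still_unchecked = [], [], []
    before = after = total = 0
    in_fence = False
    for i, line in enumerate(lines):
        if FENCE.match(line):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        m = CHECKBOX.match(line)
        if not m:
            continue
        text = m.group(2).strip()
        was_checked = m.group(1) != " "
        now_checked = passes(text)
        total += 1
        before += was_checked
        after += now_checked
        if now_checked != was_checked:
            marker = "x" if now_checked else " "
            lines[i] = line[: m.start(1)] + marker + line[m.end(1):]
            (newly_passing if now_checked else regressions).append(text)
        elif not now_checked:
            still_unchecked.append(text)
    report = {
        "newly_passing": newly_passing,
        "regressions": regressions,
        "still_unchecked": still_unchecked,
        "summary": f"{before}/{total} → {after}/{total} items passing",
    }
    return "\n".join(lines), report
```

Splitting on `"\n"` and joining with the same separator keeps line ordering and whitespace intact, which is what keeps the diff limited to the toggled markers.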
--- a/.agents/skills/speckit-constitution/SKILL.md
+++ b/.agents/skills/speckit-constitution/SKILL.md
@ -0,0 +1,154 @@
+---
+name: "speckit-constitution"
+description: "Create or update the project constitution from interactive or provided principle inputs, ensuring all dependent templates stay in sync."
+compatibility: "Requires spec-kit project structure with .specify/ directory"
+metadata:
+  author: "github-spec-kit"
+  source: "templates/commands/constitution.md"
+---
+
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Pre-Execution Checks
+
+**Check for extension hooks (before constitution update)**:
+- Check if `.specify/extensions.yml` exists in the project root.
+- If it exists, read it and look for entries under the `hooks.before_constitution` key
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Pre-Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+  - **Mandatory hook** (`optional: false`):
+    ```
+    ## Extension Hooks
+
+    **Automatic Pre-Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+
+    Wait for the result of the hook command before proceeding to the Outline.
+    ```
+- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently
+
+## Outline
+
+You are updating the project constitution at `.specify/memory/constitution.md`. This file is a TEMPLATE containing placeholder tokens in square brackets (e.g. `[PROJECT_NAME]`, `[PRINCIPLE_1_NAME]`). Your job is to (a) collect/derive concrete values, (b) fill the template precisely, and (c) propagate any amendments across dependent artifacts.
+
+**Note**: If `.specify/memory/constitution.md` does not exist yet, it should have been initialized from `.specify/templates/constitution-template.md` during project setup. If it's missing, copy the template first.
+
+Follow this execution flow:
+
+1. Load the existing constitution at `.specify/memory/constitution.md`.
+   - Identify every placeholder token of the form `[ALL_CAPS_IDENTIFIER]`.
+   **IMPORTANT**: The user might require fewer or more principles than the template uses. If a number is specified, respect it while following the general template structure, and update the document accordingly.
+
+2. Collect/derive values for placeholders:
+   - If user input (conversation) supplies a value, use it.
+   - Otherwise infer from existing repo context (README, docs, prior constitution versions if embedded).
+   - For governance dates: `RATIFICATION_DATE` is the original adoption date (if unknown ask or mark TODO), `LAST_AMENDED_DATE` is today if changes are made, otherwise keep previous.
+   - `CONSTITUTION_VERSION` must increment according to semantic versioning rules:
+     - MAJOR: Backward incompatible governance/principle removals or redefinitions.
+     - MINOR: New principle/section added or materially expanded guidance.
+     - PATCH: Clarifications, wording, typo fixes, non-semantic refinements.
+   - If version bump type ambiguous, propose reasoning before finalizing.
+
+3. Draft the updated constitution content:
+   - Replace every placeholder with concrete text (no bracketed tokens left except intentionally retained template slots that the project has chosen not to define yet—explicitly justify any left).
+   - Preserve the heading hierarchy. Comments can be removed once their placeholders are replaced, unless they still add clarifying guidance.
+   - Ensure each Principle section: succinct name line, paragraph (or bullet list) capturing non‑negotiable rules, explicit rationale if not obvious.
+   - Ensure Governance section lists amendment procedure, versioning policy, and compliance review expectations.
+
+4. Consistency propagation checklist (convert prior checklist into active validations):
+   - Read `.specify/templates/plan-template.md` and ensure any "Constitution Check" or rules align with updated principles.
+   - Read `.specify/templates/spec-template.md` for scope/requirements alignment—update if constitution adds/removes mandatory sections or constraints.
+   - Read `.specify/templates/tasks-template.md` and ensure task categorization reflects new or removed principle-driven task types (e.g., observability, versioning, testing discipline).
+   - Read each command file in `.specify/templates/commands/*.md` (including this one) to verify no outdated references (agent-specific names like CLAUDE only) remain when generic guidance is required.
+   - Read any runtime guidance docs (e.g., `README.md`, `docs/quickstart.md`, or agent-specific guidance files if present). Update references to principles changed.
+
+5. Produce a Sync Impact Report (prepend as an HTML comment at top of the constitution file after update):
+   - Version change: old → new
+   - List of modified principles (old title → new title if renamed)
+   - Added sections
+   - Removed sections
+   - Templates requiring updates (✅ updated / ⚠ pending) with file paths
+   - Follow-up TODOs if any placeholders intentionally deferred.
+
+6. Validation before final output:
+   - No remaining unexplained bracket tokens.
+   - Version line matches report.
+   - Dates ISO format YYYY-MM-DD.
+   - Principles are declarative, testable, and free of vague language ("should" → replace with MUST/SHOULD rationale where appropriate).
+
+7. Write the completed constitution back to `.specify/memory/constitution.md` (overwrite).
+
+8. Output a final summary to the user with:
+   - New version and bump rationale.
+   - Any files flagged for manual follow-up.
+   - Suggested commit message (e.g., `docs: amend constitution to vX.Y.Z (principle additions + governance update)`).
+
+Formatting & Style Requirements:
+
+- Use Markdown headings exactly as in the template (do not demote/promote levels).
+- Wrap long rationale lines to keep readability (<100 chars ideally) but do not hard enforce with awkward breaks.
+- Keep a single blank line between sections.
+- Avoid trailing whitespace.
+
+If the user supplies partial updates (e.g., only one principle revision), still perform validation and version decision steps.
+
+If critical info missing (e.g., ratification date truly unknown), insert `TODO(<FIELD_NAME>): explanation` and include in the Sync Impact Report under deferred items.
+
+Do not create a new template; always operate on the existing `.specify/memory/constitution.md` file.
+
+## Post-Execution Checks
+
+**Check for extension hooks (after constitution update)**:
+- Check if `.specify/extensions.yml` exists in the project root.
+- If it exists, read it and look for entries under the `hooks.after_constitution` key
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+  - **Mandatory hook** (`optional: false`):
+    ```
+    ## Extension Hooks
+
+    **Automatic Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+    ```
+- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently
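Note on the constitution flow above: two of its mechanical checks, finding leftover `[ALL_CAPS_IDENTIFIER]` tokens and bumping `CONSTITUTION_VERSION`, can be sketched as below. This is only an illustration. The function names are invented for the example, the version is assumed to be a plain `MAJOR.MINOR.PATCH` string, and choosing which bump type applies remains a judgment call that the code does not make.

```python
import re

# Bracketed token made of capitals, digits and underscores, e.g. [PROJECT_NAME].
PLACEHOLDER = re.compile(r"\[[A-Z][A-Z0-9_]*\]")


def remaining_placeholders(text: str) -> list[str]:
    """Return the distinct placeholder tokens still present, sorted."""
    return sorted(set(PLACEHOLDER.findall(text)))


def bump_version(version: str, kind: str) -> str:
    """Apply a semantic-versioning bump; lower components reset to zero."""
    major, minor, patch = (int(part) for part in version.split("."))
    if kind == "major":
        return f"{major + 1}.0.0"
    if kind == "minor":
        return f"{major}.{minor + 1}.0"
    if kind == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump type: {kind!r}")


if __name__ == "__main__":
    draft = "# [PROJECT_NAME] Constitution\n\n## [PRINCIPLE_1_NAME]\n"
    print(remaining_placeholders(draft))
    print(bump_version("1.2.3", "minor"))
```

The pattern would also match an all-caps Markdown link label such as `[README](...)`, so any hit still needs a manual look before it is treated as an unresolved placeholder.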
--- a/.agents/skills/speckit-git-commit/SKILL.md
+++ b/.agents/skills/speckit-git-commit/SKILL.md
@ -0,0 +1,53 @@
+---
+name: speckit-git-commit
+description: Auto-commit changes after a Spec Kit command completes
+compatibility: Requires spec-kit project structure with .specify/ directory
+metadata:
+  author: github-spec-kit
+  source: git:commands/speckit.git.commit.md
+---
+
+# Auto-Commit Changes
+
+Automatically stage and commit all changes after a Spec Kit command completes.
+
+## Behavior
+
+This command is invoked as a hook after (or before) core commands. It:
+
+1. Determines the event name from the hook context (e.g., if invoked as an `after_specify` hook, the event is `after_specify`; if `before_plan`, the event is `before_plan`)
+2. Checks `.specify/extensions/git/git-config.yml` for the `auto_commit` section
+3. Looks up the specific event key to see if auto-commit is enabled
+4. Falls back to `auto_commit.default` if no event-specific key exists
+5. Uses the per-command `message` if configured, otherwise a default message
+6. If enabled and there are uncommitted changes, runs `git add .` + `git commit`
+
+## Execution
+
+Determine the event name from the hook that triggered this command, then run the script:
+
+- **Bash**: `.specify/extensions/git/scripts/bash/auto-commit.sh <event_name>`
+- **PowerShell**: `.specify/extensions/git/scripts/powershell/auto-commit.ps1 <event_name>`
+
+Replace `<event_name>` with the actual hook event (e.g., `after_specify`, `before_plan`, `after_implement`).
+
+## Configuration
+
+In `.specify/extensions/git/git-config.yml`:
+
+```yaml
+auto_commit:
+  default: false          # Global toggle — set true to enable for all commands
+  after_specify:
+    enabled: true          # Override per-command
+    message: "[Spec Kit] Add specification"
+  after_plan:
+    enabled: false
+    message: "[Spec Kit] Add implementation plan"
+```
+
+## Graceful Degradation
+
+- If Git is not available or the current directory is not a repository: skips with a warning
+- If no config file exists: skips (disabled by default)
+- If no changes to commit: skips with a message
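Note on the auto-commit lookup above: the event-specific lookup with fallback to `auto_commit.default` can be sketched as below. This is a guess at the logic described in the Behavior list, not a transcription of `auto-commit.sh`. The function name and the fallback message text are hypothetical, and treating an event entry without an `enabled` key as "use the default" is an assumption.

```python
from typing import Any

# Hypothetical fallback text; the real script's default message is not shown here.
DEFAULT_MESSAGE = "[Spec Kit] Auto-commit ({event})"


def resolve_auto_commit(config: dict[str, Any], event: str) -> tuple[bool, str]:
    """Return (enabled, commit_message) for a hook event such as "after_specify".

    `config` is the already-parsed content of git-config.yml.
    """
    auto = (config or {}).get("auto_commit") or {}
    entry = auto.get(event)
    if not isinstance(entry, dict):
        entry = {}
    if "enabled" in entry:
        # Event-specific setting wins over the global toggle.
        enabled = bool(entry["enabled"])
    else:
        # No event-specific key: fall back to auto_commit.default (off if absent).
        enabled = bool(auto.get("default", False))
    message = entry.get("message") or DEFAULT_MESSAGE.format(event=event)
    return enabled, message


if __name__ == "__main__":
    parsed = {
        "auto_commit": {
            "default": False,
            "after_specify": {"enabled": True, "message": "[Spec Kit] Add specification"},
            "after_plan": {"enabled": False, "message": "[Spec Kit] Add implementation plan"},
        }
    }
    print(resolve_auto_commit(parsed, "after_specify"))
    print(resolve_auto_commit(parsed, "after_tasks"))
```

With the sample configuration from the Configuration section, `after_specify` resolves to enabled with its own message, while an unlisted event inherits `default: false`.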
--- a/.agents/skills/speckit-git-feature/SKILL.md
+++ b/.agents/skills/speckit-git-feature/SKILL.md
@ -0,0 +1,72 @@
+---
+name: speckit-git-feature
+description: Create a feature branch with sequential or timestamp numbering
+compatibility: Requires spec-kit project structure with .specify/ directory
+metadata:
+  author: github-spec-kit
+  source: git:commands/speckit.git.feature.md
+---
+
+# Create Feature Branch
+
+Create and switch to a new git feature branch for the given specification. This command handles **branch creation only** — the spec directory and files are created by the core `/speckit-specify` workflow.
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Environment Variable Override
+
+If the user explicitly provided `GIT_BRANCH_NAME` (e.g., via environment variable, argument, or in their request), pass it through to the script by setting the `GIT_BRANCH_NAME` environment variable before invoking the script. When `GIT_BRANCH_NAME` is set:
+- The script uses the exact value as the branch name, bypassing all prefix/suffix generation
+- `--short-name`, `--number`, and `--timestamp` flags are ignored
+- `FEATURE_NUM` is extracted from the name if it starts with a numeric prefix, otherwise set to the full branch name
+
+## Prerequisites
+
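+- Verify that Git is available and the current directory is inside a work tree by running `git rev-parse --is-inside-work-tree 2>/dev/null`
+- If the command fails (Git is missing or the directory is not a repository), warn the user and skip branch creation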
+
+## Branch Numbering Mode
+
+Determine the branch numbering strategy by checking configuration in this order:
+
+1. Check `.specify/extensions/git/git-config.yml` for `branch_numbering` value
+2. Check `.specify/init-options.json` for `branch_numbering` value (backward compatibility)
+3. Default to `sequential` if neither exists
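+
+As an illustration only, a `git-config.yml` that selects timestamp numbering might look like the sketch below. It assumes `branch_numbering` is a top-level key and that `timestamp` and `sequential` are the accepted values; neither detail is confirmed beyond the names used in this file.
+
+```yaml
+# .specify/extensions/git/git-config.yml (hypothetical sketch)
+branch_numbering: timestamp   # assumed alternative: sequential (the default)
+```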
+
+## Execution
+
+Generate a concise short name (2-4 words) for the branch:
+- Analyze the feature description and extract the most meaningful keywords
+- Use action-noun format when possible (e.g., "add-user-auth", "fix-payment-bug")
+- Preserve technical terms and acronyms (OAuth2, API, JWT, etc.)
+
+Run the appropriate script based on your platform:
+
+- **Bash**: `.specify/extensions/git/scripts/bash/create-new-feature.sh --json --short-name "<short-name>" "<feature description>"`
+- **Bash (timestamp)**: `.specify/extensions/git/scripts/bash/create-new-feature.sh --json --timestamp --short-name "<short-name>" "<feature description>"`
+- **PowerShell**: `.specify/extensions/git/scripts/powershell/create-new-feature.ps1 -Json -ShortName "<short-name>" "<feature description>"`
+- **PowerShell (timestamp)**: `.specify/extensions/git/scripts/powershell/create-new-feature.ps1 -Json -Timestamp -ShortName "<short-name>" "<feature description>"`
+
+**IMPORTANT**:
+- Do NOT pass `--number` — the script determines the correct next number automatically
+- Always include the JSON flag (`--json` for Bash, `-Json` for PowerShell) so the output can be parsed reliably
+- You must only ever run this script once per feature
+- The JSON output will contain `BRANCH_NAME` and `FEATURE_NUM`
+
+## Graceful Degradation
+
+If Git is not installed or the current directory is not a Git repository:
+- Branch creation is skipped with a warning: `[specify] Warning: Git repository not detected; skipped branch creation`
+- The script still outputs `BRANCH_NAME` and `FEATURE_NUM` so the caller can reference them
+
+## Output
+
+The script outputs JSON with:
+- `BRANCH_NAME`: The branch name (e.g., `003-user-auth` or `20260319-143022-user-auth`)
+- `FEATURE_NUM`: The numeric or timestamp prefix used
--- a/.agents/skills/speckit-git-initialize/SKILL.md
+++ b/.agents/skills/speckit-git-initialize/SKILL.md
@ -0,0 +1,54 @@
+---
+name: speckit-git-initialize
+description: Initialize a Git repository with an initial commit
+compatibility: Requires spec-kit project structure with .specify/ directory
+metadata:
+  author: github-spec-kit
+  source: git:commands/speckit.git.initialize.md
+---
+
+# Initialize Git Repository
+
+Initialize a Git repository in the current project directory if one does not already exist.
+
+## Execution
+
+Run the appropriate script from the project root:
+
+- **Bash**: `.specify/extensions/git/scripts/bash/initialize-repo.sh`
+- **PowerShell**: `.specify/extensions/git/scripts/powershell/initialize-repo.ps1`
+
+If the extension scripts are not found, fall back to:
+- **Bash**: `git init && git add . && git commit -m "Initial commit from Specify template"`
+- **PowerShell**: `git init; git add .; git commit -m "Initial commit from Specify template"`
+
+The script handles all checks internally:
+- Skips if Git is not available
+- Skips if already inside a Git repository
+- Runs `git init`, `git add .`, and `git commit` with an initial commit message
+
+## Customization
+
+Replace the script to add project-specific Git initialization steps:
+- Custom `.gitignore` templates
+- Default branch naming (`git config init.defaultBranch`)
+- Git LFS setup
+- Git hooks installation
+- Commit signing configuration
+- Git Flow initialization
+
+## Output
+
+On success:
+- `[OK] Git repository initialized`
+
+## Graceful Degradation
+
+If Git is not installed:
+- Warn the user
+- Skip repository initialization
+- The project continues to function without Git (specs can still be created under `specs/`)
+
+If Git is installed but `git init`, `git add .`, or `git commit` fails:
+- Surface the error to the user
+- Stop this command rather than continuing with a partially initialized repository
--- a/.agents/skills/speckit-git-remote/SKILL.md
+++ b/.agents/skills/speckit-git-remote/SKILL.md
@ -0,0 +1,50 @@
+---
+name: speckit-git-remote
+description: Detect Git remote URL for GitHub integration
+compatibility: Requires spec-kit project structure with .specify/ directory
+metadata:
+  author: github-spec-kit
+  source: git:commands/speckit.git.remote.md
+---
+
+# Detect Git Remote URL
+
+Detect the Git remote URL for integration with GitHub services (e.g., issue creation).
+
+## Prerequisites
+
+- Check that Git is available and the current directory is inside a work tree by running `git rev-parse --is-inside-work-tree 2>/dev/null`
+- If the check fails, output a warning and return an empty result:
+  ```
+  [specify] Warning: Git repository not detected; cannot determine remote URL
+  ```
+
+## Execution
+
+Run the following command to get the remote URL:
+
+```bash
+git config --get remote.origin.url
+```
+
+## Output
+
+Parse the remote URL and determine:
+
+1. **Repository owner**: Extract from the URL (e.g., `github` from `https://github.com/github/spec-kit.git`)
+2. **Repository name**: Extract from the URL (e.g., `spec-kit` from `https://github.com/github/spec-kit.git`)
+3. **Is GitHub**: Whether the remote points to a GitHub repository
+
+Supported URL formats:
+- HTTPS: `https://github.com/<owner>/<repo>.git`
+- SSH: `git@github.com:<owner>/<repo>.git`
+
+> [!CAUTION]
+> ONLY report a GitHub repository if the remote URL actually points to github.com.
+> Do NOT assume the remote is GitHub if the URL format doesn't match.
+
+## Graceful Degradation
+
+If Git is not installed, the directory is not a Git repository, or no remote is configured:
+- Return an empty result
+- Do NOT error — other workflows should continue without Git remote information
--- a/.agents/skills/speckit-git-validate/SKILL.md
+++ b/.agents/skills/speckit-git-validate/SKILL.md
@ -0,0 +1,54 @@
+---
+name: speckit-git-validate
+description: Validate current branch follows feature branch naming conventions
+compatibility: Requires spec-kit project structure with .specify/ directory
+metadata:
+  author: github-spec-kit
+  source: git:commands/speckit.git.validate.md
+---
+
+# Validate Feature Branch
+
+Validate that the current Git branch follows the expected feature branch naming conventions.
+
+## Prerequisites
+
+- Check that Git is available and the current directory is inside a work tree by running `git rev-parse --is-inside-work-tree 2>/dev/null`
+- If the check fails, output a warning and skip validation:
+  ```
+  [specify] Warning: Git repository not detected; skipped branch validation
+  ```
+
+## Validation Rules
+
+Get the current branch name:
+
+```bash
+git rev-parse --abbrev-ref HEAD
+```
+
+The branch name must match one of these patterns:
+
+1. **Sequential**: `^[0-9]{3,}-` (e.g., `001-feature-name`, `042-fix-bug`, `1000-big-feature`)
+2. **Timestamp**: `^[0-9]{8}-[0-9]{6}-` (e.g., `20260319-143022-feature-name`)
+
+## Execution
+
+If on a feature branch (matches either pattern):
+- Output: `✓ On feature branch: <branch-name>`
+- Check if the corresponding spec directory exists under `specs/`:
+  - For sequential branches, look for `specs/<prefix>-*` where prefix matches the numeric portion
+  - For timestamp branches, look for `specs/<prefix>-*` where prefix matches the `YYYYMMDD-HHMMSS` portion
+- If spec directory exists: `✓ Spec directory found: <path>`
+- If spec directory missing: `⚠ No spec directory found for prefix <prefix>`
+
+If NOT on a feature branch:
+- Output: `✗ Not on a feature branch. Current branch: <branch-name>`
+- Output: `Feature branches should be named like: 001-feature-name or 20260319-143022-feature-name`
+
+## Graceful Degradation
+
+If Git is not installed or the directory is not a Git repository:
+- Check the `SPECIFY_FEATURE` environment variable as a fallback
+- If set, validate that value against the naming patterns
+- If not set, skip validation with a warning
--- a/.agents/skills/speckit-implement/SKILL.md
+++ b/.agents/skills/speckit-implement/SKILL.md
@ -0,0 +1,221 @@
+---
+name: "speckit-implement"
+description: "Execute the implementation plan by processing and executing all tasks defined in tasks.md"
+compatibility: "Requires spec-kit project structure with .specify/ directory"
+metadata:
+  author: "github-spec-kit"
+  source: "templates/commands/implement.md"
+---
+
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Pre-Execution Checks
+
+**Check for extension hooks (before implementation)**:
+- Check if `.specify/extensions.yml` exists in the project root.
+- If it exists, read it and look for entries under the `hooks.before_implement` key
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Pre-Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+  - **Mandatory hook** (`optional: false`):
+    ```
+    ## Extension Hooks
+
+    **Automatic Pre-Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+    
+    Wait for the result of the hook command before proceeding to the Outline.
+    ```
+- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently
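+
+For orientation, the filtering rules above can be traced against a sample `.specify/extensions.yml`. This is a hedged sketch: the list layout under `hooks.before_implement` and the `extension`, `command`, `description`, and `prompt` field names are assumptions inferred from the placeholders in the output templates, and the `condition` expression is made up.
+
+```yaml
+# .specify/extensions.yml (hypothetical sketch, not the authoritative schema)
+hooks:
+  before_implement:
+    - extension: git
+      command: speckit.git.validate    # rendered as /speckit-git-validate
+      optional: true                   # no `enabled` field: treated as enabled
+      description: Validate the feature branch name
+      prompt: Validate the current branch before implementing?
+    - extension: git
+      command: speckit.git.commit
+      enabled: false                   # explicitly disabled: filtered out
+    - extension: git
+      command: speckit.git.remote
+      optional: false
+      condition: "made-up expression"  # non-empty: skipped, left to HookExecutor
+```
+
+Under these assumptions only the first entry is reported, as an optional pre-hook.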
+
+## Outline
+
+1. Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` from the repo root and parse FEATURE_DIR and the AVAILABLE_DOCS list. All paths must be absolute. For arguments containing single quotes, such as "I'm Groot", use escape syntax, e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+2. **Check checklists status** (if FEATURE_DIR/checklists/ exists):
+   - Scan all checklist files in the checklists/ directory
+   - For each checklist, count:
+     - Total items: All lines matching `- [ ]` or `- [X]` or `- [x]`
+     - Completed items: Lines matching `- [X]` or `- [x]`
+     - Incomplete items: Lines matching `- [ ]`
+   - Create a status table:
+
+     ```text
+     | Checklist | Total | Completed | Incomplete | Status |
+     |-----------|-------|-----------|------------|--------|
+     | ux.md     | 12    | 12        | 0          | ✓ PASS |
+     | test.md   | 8     | 5         | 3          | ✗ FAIL |
+     | security.md | 6   | 6         | 0          | ✓ PASS |
+     ```
+
+   - Calculate overall status:
+     - **PASS**: All checklists have 0 incomplete items
+     - **FAIL**: One or more checklists have incomplete items
+
+   - **If any checklist is incomplete**:
+     - Display the table with incomplete item counts
+     - **STOP** and ask: "Some checklists are incomplete. Do you want to proceed with implementation anyway? (yes/no)"
+     - Wait for user response before continuing
+     - If user says "no" or "wait" or "stop", halt execution
+     - If user says "yes" or "proceed" or "continue", proceed to step 3
+
+   - **If all checklists are complete**:
+     - Display the table showing all checklists passed
+     - Automatically proceed to step 3
+
+3. Load and analyze the implementation context:
+   - **REQUIRED**: Read tasks.md for the complete task list and execution plan
+   - **REQUIRED**: Read plan.md for tech stack, architecture, and file structure
+   - **IF EXISTS**: Read data-model.md for entities and relationships
+   - **IF EXISTS**: Read contracts/ for API specifications and test requirements
+   - **IF EXISTS**: Read research.md for technical decisions and constraints
+   - **IF EXISTS**: Read .specify/memory/constitution.md for governance constraints
+   - **IF EXISTS**: Read quickstart.md for integration scenarios
+
+4. **Project Setup Verification**:
+   - **REQUIRED**: Create/verify ignore files based on actual project setup:
+
+   **Detection & Creation Logic**:
+   - Run the following command; if it succeeds, the project is a Git repository, so create/verify .gitignore:
+
+     ```sh
+     git rev-parse --git-dir 2>/dev/null
+     ```
+
+   - Check if Dockerfile* exists or Docker is referenced in plan.md → create/verify .dockerignore
+   - Check if .eslintrc* exists → create/verify .eslintignore
+   - Check if eslint.config.* exists → ensure the config's `ignores` entries cover required patterns
+   - Check if .prettierrc* exists → create/verify .prettierignore
+   - Check if .npmrc or package.json exists → create/verify .npmignore (if publishing)
+   - Check if terraform files (*.tf) exist → create/verify .terraformignore
+   - Check if Helm charts are present → create/verify .helmignore
+
+   - **If the ignore file already exists**: Verify it contains the essential patterns and append only the missing critical ones
+   - **If the ignore file is missing**: Create it with the full pattern set for the detected technology
+
+   **Common Patterns by Technology** (from plan.md tech stack):
+   - **Node.js/JavaScript/TypeScript**: `node_modules/`, `dist/`, `build/`, `*.log`, `.env*`
+   - **Python**: `__pycache__/`, `*.pyc`, `.venv/`, `venv/`, `dist/`, `*.egg-info/`
+   - **Java**: `target/`, `*.class`, `*.jar`, `.gradle/`, `build/`
+   - **C#/.NET**: `bin/`, `obj/`, `*.user`, `*.suo`, `packages/`
+   - **Go**: `*.exe`, `*.test`, `vendor/`, `*.out`
+   - **Ruby**: `.bundle/`, `log/`, `tmp/`, `*.gem`, `vendor/bundle/`
+   - **PHP**: `vendor/`, `*.log`, `*.cache`, `*.env`
+   - **Rust**: `target/`, `debug/`, `release/`, `*.rs.bk`, `*.rlib`, `*.prof*`, `.idea/`, `*.log`, `.env*`
+   - **Kotlin**: `build/`, `out/`, `.gradle/`, `.idea/`, `*.class`, `*.jar`, `*.iml`, `*.log`, `.env*`
+   - **C++**: `build/`, `bin/`, `obj/`, `out/`, `*.o`, `*.so`, `*.a`, `*.exe`, `*.dll`, `.idea/`, `*.log`, `.env*`
+   - **C**: `build/`, `bin/`, `obj/`, `out/`, `*.o`, `*.a`, `*.so`, `*.exe`, `*.dll`, `autom4te.cache/`, `config.status`, `config.log`, `.idea/`, `*.log`, `.env*`
+   - **Swift**: `.build/`, `DerivedData/`, `*.swiftpm/`, `Packages/`
+   - **R**: `.Rproj.user/`, `.Rhistory`, `.RData`, `.Ruserdata`, `*.Rproj`, `packrat/`, `renv/`
+   - **Universal**: `.DS_Store`, `Thumbs.db`, `*.tmp`, `*.swp`, `.vscode/`, `.idea/`
+
+   **Tool-Specific Patterns**:
+   - **Docker**: `node_modules/`, `.git/`, `Dockerfile*`, `.dockerignore`, `*.log*`, `.env*`, `coverage/`
+   - **ESLint**: `node_modules/`, `dist/`, `build/`, `coverage/`, `*.min.js`
+   - **Prettier**: `node_modules/`, `dist/`, `build/`, `coverage/`, `package-lock.json`, `yarn.lock`, `pnpm-lock.yaml`
+   - **Terraform**: `.terraform/`, `*.tfstate*`, `*.tfvars`, `.terraform.lock.hcl`
+   - **Kubernetes/k8s**: `*.secret.yaml`, `secrets/`, `.kube/`, `kubeconfig*`, `*.key`, `*.crt`
+
+5. Parse tasks.md structure and extract:
+   - **Task phases**: Setup, Tests, Core, Integration, Polish
+   - **Task dependencies**: Sequential vs parallel execution rules
+   - **Task details**: ID, description, file paths, parallel markers [P]
+   - **Execution flow**: Order and dependency requirements
+
+6. Execute implementation following the task plan:
+   - **Phase-by-phase execution**: Complete each phase before moving to the next
+   - **Respect dependencies**: Run sequential tasks in order; parallel tasks [P] can run together
+   - **Follow TDD approach**: Execute test tasks before their corresponding implementation tasks
+   - **File-based coordination**: Tasks affecting the same files must run sequentially
+   - **Validation checkpoints**: Verify each phase completion before proceeding
+
+7. Implementation execution rules:
+   - **Setup first**: Initialize project structure, dependencies, configuration
+   - **Tests before code**: If tests are required, write the tests for contracts, entities, and integration scenarios before the code they cover
+   - **Core development**: Implement models, services, CLI commands, endpoints
+   - **Integration work**: Database connections, middleware, logging, external services
+   - **Polish and validation**: Unit tests, performance optimization, documentation
+
+8. Progress tracking and error handling:
+   - Report progress after each completed task
+   - Halt execution if any non-parallel task fails
+   - For parallel tasks [P], continue with successful tasks, report failed ones
+   - Provide clear error messages with context for debugging
+   - Suggest next steps if implementation cannot proceed
+   - **IMPORTANT**: Mark each completed task as [X] in the tasks file.
+
+9. Completion validation:
+   - Verify all required tasks are completed
+   - Check that implemented features match the original specification
+   - Validate that tests pass and coverage meets requirements
+   - Confirm the implementation follows the technical plan
+
+Note: This command assumes a complete task breakdown exists in tasks.md. If tasks are incomplete or missing, suggest running `/speckit-tasks` first to regenerate the task list.
+
+## Mandatory Post-Execution Hooks
+
+**You MUST complete this section before reporting completion to the user.**
+
+Check if `.specify/extensions.yml` exists in the project root.
+- If it does not exist, or no hooks are registered under `hooks.after_implement`, skip to the Completion Report.
+- If it exists, read it and look for entries under the `hooks.after_implement` key.
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue to the Completion Report.
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Mandatory hook** (`optional: false`) — **You MUST emit `EXECUTE_COMMAND:` for each mandatory hook**:
+    ```
+    ## Extension Hooks
+
+    **Automatic Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+    ```
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
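+
+As a rough sketch of how the two output shapes are selected, consider the entries below. The field names other than `enabled`, `condition`, and `optional` are assumed from the template placeholders, and the second entry's extension and command names are invented for illustration.
+
+```yaml
+# .specify/extensions.yml (hypothetical sketch)
+hooks:
+  after_implement:
+    - extension: git
+      command: speckit.git.commit      # slash form: /speckit-git-commit
+      optional: false                  # mandatory: reported as an Automatic Hook
+    - extension: example-ext           # invented name
+      command: example.report          # slash form: /example-report
+      optional: true                   # optional: only presented to the user
+      description: Summarize the implementation
+      prompt: Generate a summary report?
+```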
+
+## Completion Report
+
+Report final status with summary of completed work.
+
+## Done When
+
+- [ ] All tasks in tasks.md completed and marked `[X]`
+- [ ] Implementation validated against specification, plan, and test coverage
+- [ ] Extension hooks dispatched or skipped according to the rules in Mandatory Post-Execution Hooks above
+- [ ] Completion reported to user with summary of completed work
--- a/.agents/skills/speckit-plan/SKILL.md
+++ b/.agents/skills/speckit-plan/SKILL.md
@ -0,0 +1,168 @@
+---
+name: "speckit-plan"
+description: "Execute the implementation planning workflow using the plan template to generate design artifacts."
+compatibility: "Requires spec-kit project structure with .specify/ directory"
+metadata:
+  author: "github-spec-kit"
+  source: "templates/commands/plan.md"
+---
+
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Pre-Execution Checks
+
+**Check for extension hooks (before planning)**:
+- Check if `.specify/extensions.yml` exists in the project root.
+- If it exists, read it and look for entries under the `hooks.before_plan` key
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Pre-Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+  - **Mandatory hook** (`optional: false`):
+    ```
+    ## Extension Hooks
+
+    **Automatic Pre-Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+
+    Wait for the result of the hook command before proceeding to the Outline.
+    ```
+- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently
+
+## Outline
+
+1. **Setup**: Run `.specify/scripts/bash/setup-plan.sh --json` from the repo root and parse the JSON for FEATURE_SPEC, IMPL_PLAN, SPECS_DIR, BRANCH. For arguments containing single quotes, such as "I'm Groot", use escape syntax, e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+2. **Load context**: Read FEATURE_SPEC and `.specify/memory/constitution.md`. Load IMPL_PLAN template (already copied).
+
+3. **Execute plan workflow**: Follow the structure in IMPL_PLAN template to:
+   - Fill Technical Context (mark unknowns as "NEEDS CLARIFICATION")
+   - Fill Constitution Check section from constitution
+   - Evaluate gates (ERROR if violations unjustified)
+   - Phase 0: Generate research.md (resolve all NEEDS CLARIFICATION)
+   - Phase 1: Generate data-model.md, contracts/, quickstart.md
+   - Phase 1: Update agent context by running the agent script
+   - Re-evaluate Constitution Check post-design
+
+## Mandatory Post-Execution Hooks
+
+**You MUST complete this section before reporting completion to the user.**
+
+Check if `.specify/extensions.yml` exists in the project root.
+- If it does not exist, or no hooks are registered under `hooks.after_plan`, skip to the Completion Report.
+- If it exists, read it and look for entries under the `hooks.after_plan` key.
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue to the Completion Report.
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Mandatory hook** (`optional: false`) — **You MUST emit `EXECUTE_COMMAND:` for each mandatory hook**:
+    ```
+    ## Extension Hooks
+
+    **Automatic Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+    ```
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+
+## Completion Report
+
+The command ends once Phase 1 (Design & Contracts) is complete. Report the branch, the IMPL_PLAN path, and the generated artifacts.
+
+## Phases
+
+### Phase 0: Outline & Research
+
+1. **Extract unknowns from Technical Context** above:
+   - For each NEEDS CLARIFICATION → research task
+   - For each dependency → best practices task
+   - For each integration → patterns task
+
+2. **Generate and dispatch research agents**:
+
+   ```text
+   For each unknown in Technical Context:
+     Task: "Research {unknown} for {feature context}"
+   For each technology choice:
+     Task: "Find best practices for {tech} in {domain}"
+   ```
+
+3. **Consolidate findings** in `research.md` using format:
+   - Decision: [what was chosen]
+   - Rationale: [why chosen]
+   - Alternatives considered: [what else evaluated]
+
+**Output**: research.md with all NEEDS CLARIFICATION resolved
+
+### Phase 1: Design & Contracts
+
+**Prerequisites:** `research.md` complete
+
+1. **Extract entities from feature spec** → `data-model.md`:
+   - Entity name, fields, relationships
+   - Validation rules from requirements
+   - State transitions if applicable
+
+2. **Define interface contracts** (if project has external interfaces) → `/contracts/`:
+   - Identify what interfaces the project exposes to users or other systems
+   - Document the contract format appropriate for the project type
+   - Examples: public APIs for libraries, command schemas for CLI tools, endpoints for web services, grammars for parsers, UI contracts for applications
+   - Skip if project is purely internal (build scripts, one-off tools, etc.)
+
+3. **Create quickstart validation guide** → `quickstart.md`:
+   - Document runnable validation scenarios that prove the feature works end-to-end
+   - Include prerequisites, setup commands, test/run commands, and expected outcomes
+   - Use links or references to contracts and data model details instead of duplicating them
+   - Do not include full implementation code, model/service/controller bodies, migrations, or complete test suites
+   - Keep this artifact as a validation/run guide; implementation details belong in `tasks.md` and the implementation phase
+
+4. **Agent context update**:
+   - Update the plan reference between the `<!-- SPECKIT START -->` and `<!-- SPECKIT END -->` markers in `AGENTS.md` to point to the plan file set up in Outline step 1 (the IMPL_PLAN path)
+
+**Output**: data-model.md, /contracts/*, quickstart.md, updated agent context file
+
+## Key rules
+
+- Use absolute paths for filesystem operations; use project-relative paths for references in documentation and agent context files
+- ERROR on gate failures or unresolved clarifications
+
+## Done When
+
+- [ ] Plan workflow executed and design artifacts generated
+- [ ] Extension hooks dispatched or skipped according to the rules in Mandatory Post-Execution Hooks above
+- [ ] Completion reported to user with branch, plan path, and generated artifacts
--- a/.agents/skills/speckit-specify/SKILL.md
+++ b/.agents/skills/speckit-specify/SKILL.md
@ -0,0 +1,342 @@
+---
+name: "speckit-specify"
+description: "Create or update the feature specification from a natural language feature description."
+compatibility: "Requires spec-kit project structure with .specify/ directory"
+metadata:
+  author: "github-spec-kit"
+  source: "templates/commands/specify.md"
+---
+
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Pre-Execution Checks
+
+**Check for extension hooks (before specification)**:
+- Check if `.specify/extensions.yml` exists in the project root.
+- If it exists, read it and look for entries under the `hooks.before_specify` key
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Pre-Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+  - **Mandatory hook** (`optional: false`):
+    ```
+    ## Extension Hooks
+
+    **Automatic Pre-Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+
+    Wait for the result of the hook command before proceeding to the Outline.
+    ```
+- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently
+
+## Outline
+
+The text the user typed after `/speckit-specify` in the triggering message **is** the feature description. Assume you always have it available in this conversation even if `$ARGUMENTS` appears literally below. Do not ask the user to repeat it unless they provided an empty command.
+
+Given that feature description, do this:
+
+1. **Generate a concise short name** (2-4 words) for the feature:
+   - Analyze the feature description and extract the most meaningful keywords
+   - Combine them into a short name that captures the essence of the feature
+   - Use action-noun format when possible (e.g., "add-user-auth", "fix-payment-bug")
+   - Preserve technical terms and acronyms (OAuth2, API, JWT, etc.)
+   - Keep it concise but descriptive enough to understand the feature at a glance
+   - Examples:
+     - "I want to add user authentication" → "user-auth"
+     - "Implement OAuth2 integration for the API" → "oauth2-api-integration"
+     - "Create a dashboard for analytics" → "analytics-dashboard"
+     - "Fix payment processing timeout bug" → "fix-payment-timeout"
+
+2. **Branch creation** (optional, via hook):
+
+   If a `before_specify` hook ran successfully in the Pre-Execution Checks above, it will have created/switched to a git branch and output JSON containing `BRANCH_NAME` and `FEATURE_NUM`. Note these values for reference, but the branch name does **not** dictate the spec directory name.
+
+   If the user explicitly provided `GIT_BRANCH_NAME`, pass it through to the hook so the branch script uses the exact value as the branch name (bypassing all prefix/suffix generation).
+
+3. **Create the spec feature directory**:
+
+   Specs live under the default `specs/` directory unless the user explicitly provides `SPECIFY_FEATURE_DIRECTORY`.
+
+   **Resolution order for `SPECIFY_FEATURE_DIRECTORY`**:
+   1. If the user explicitly provided `SPECIFY_FEATURE_DIRECTORY` (e.g., via environment variable, argument, or configuration), use it as-is
+   2. Otherwise, auto-generate it under `specs/`:
+      - Check `.specify/init-options.json` for `branch_numbering`
+      - If `"timestamp"`: prefix is `YYYYMMDD-HHMMSS` (current timestamp)
+      - If `"sequential"` or absent: prefix is `NNN` (next available 3-digit number after scanning existing directories in `specs/`)
+      - Construct the directory name: `<prefix>-<short-name>` (e.g., `003-user-auth` or `20260319-143022-user-auth`)
+      - Set `SPECIFY_FEATURE_DIRECTORY` to `specs/<directory-name>`
+
+   **Create the directory and spec file**:
+   - `mkdir -p SPECIFY_FEATURE_DIRECTORY`
+   - Resolve the active `spec-template` through the Spec Kit preset/template resolution stack (equivalent to `specify preset resolve spec-template`)
+   - Copy the resolved `spec-template` file to `SPECIFY_FEATURE_DIRECTORY/spec.md` as the starting point
+   - Set `SPEC_FILE` to `SPECIFY_FEATURE_DIRECTORY/spec.md`
+   - Persist the resolved path to `.specify/feature.json`:
+     ```json
+     {
+       "feature_directory": "<resolved feature dir>"
+     }
+     ```
+     Write the actual resolved directory path value (for example, `specs/003-user-auth`), not the literal string `SPECIFY_FEATURE_DIRECTORY`.
+     This allows downstream commands (`/speckit-plan`, `/speckit-tasks`, etc.) to locate the feature directory without relying on git branch name conventions.
+
+   **IMPORTANT**:
+   - You must only create one feature per `/speckit-specify` invocation
+   - The spec directory name and the git branch name are independent — they may be the same but that is the user's choice
+   - The spec directory and file are always created by this command, never by the hook
+
+4. Load the resolved active `spec-template` file to understand required sections.
+
+5. **IF EXISTS**: Load `.specify/memory/constitution.md` for project principles and governance constraints.
+
+6. Follow this execution flow:
+    1. Parse user description from arguments
+       If empty: ERROR "No feature description provided"
+    2. Extract key concepts from description
+       Identify: actors, actions, data, constraints
+    3. For unclear aspects:
+       - Make informed guesses based on context and industry standards
+       - Only mark with [NEEDS CLARIFICATION: specific question] if:
+         - The choice significantly impacts feature scope or user experience
+         - Multiple reasonable interpretations exist with different implications
+         - No reasonable default exists
+       - **LIMIT: Maximum 3 [NEEDS CLARIFICATION] markers total**
+       - Prioritize clarifications by impact: scope > security/privacy > user experience > technical details
+    4. Fill User Scenarios & Testing section
+       If no clear user flow: ERROR "Cannot determine user scenarios"
+    5. Generate Functional Requirements
+       Each requirement must be testable
+       Use reasonable defaults for unspecified details (document assumptions in Assumptions section)
+    6. Define Success Criteria
+       Create measurable, technology-agnostic outcomes
+       Include both quantitative metrics (time, performance, volume) and qualitative measures (user satisfaction, task completion)
+       Each criterion must be verifiable without implementation details
+    7. Identify Key Entities (if data involved)
+    8. Return: SUCCESS (spec ready for planning)
+
+7. Write the specification to SPEC_FILE using the template structure, replacing placeholders with concrete details derived from the feature description (arguments) while preserving section order and headings.
+
+8. **Specification Quality Validation**: After writing the initial spec, validate it against quality criteria:
+
+   a. **Create Spec Quality Checklist**: Generate a checklist file at `SPECIFY_FEATURE_DIRECTORY/checklists/requirements.md` using the checklist template structure with these validation items:
+
+      ```markdown
+      # Specification Quality Checklist: [FEATURE NAME]
+      
+      **Purpose**: Validate specification completeness and quality before proceeding to planning
+      **Created**: [DATE]
+      **Feature**: [Link to spec.md]
+      
+      ## Content Quality
+      
+      - [ ] No implementation details (languages, frameworks, APIs)
+      - [ ] Focused on user value and business needs
+      - [ ] Written for non-technical stakeholders
+      - [ ] All mandatory sections completed
+      
+      ## Requirement Completeness
+      
+      - [ ] No [NEEDS CLARIFICATION] markers remain
+      - [ ] Requirements are testable and unambiguous
+      - [ ] Success criteria are measurable
+      - [ ] Success criteria are technology-agnostic (no implementation details)
+      - [ ] All acceptance scenarios are defined
+      - [ ] Edge cases are identified
+      - [ ] Scope is clearly bounded
+      - [ ] Dependencies and assumptions identified
+      
+      ## Feature Readiness
+      
+      - [ ] All functional requirements have clear acceptance criteria
+      - [ ] User scenarios cover primary flows
+      - [ ] Feature meets measurable outcomes defined in Success Criteria
+      - [ ] No implementation details leak into specification
+      
+      ## Notes
+      
+      - Items marked incomplete require spec updates before `/speckit-clarify` or `/speckit-plan`
+      ```
+
+   b. **Run Validation Check**: Review the spec against each checklist item:
+      - For each item, determine if it passes or fails
+      - Document specific issues found (quote relevant spec sections)
+
+   c. **Handle Validation Results**:
+
+      - **If all items pass**: Mark checklist complete and proceed to the Mandatory Post-Execution Hooks section
+
+      - **If items fail (excluding [NEEDS CLARIFICATION])**:
+        1. List the failing items and specific issues
+        2. Update the spec to address each issue
+        3. Re-run validation until all items pass (max 3 iterations)
+        4. If still failing after 3 iterations, document remaining issues in checklist notes and warn user
+
+      - **If [NEEDS CLARIFICATION] markers remain**:
+        1. Extract all [NEEDS CLARIFICATION: ...] markers from the spec
+        2. **LIMIT CHECK**: If more than 3 markers exist, keep only the 3 most critical (by scope/security/UX impact) and make informed guesses for the rest
+        3. For each clarification needed (max 3), present options to user in this format:
+
+           ```markdown
+           ## Question [N]: [Topic]
+           
+           **Context**: [Quote relevant spec section]
+           
+           **What we need to know**: [Specific question from NEEDS CLARIFICATION marker]
+           
+           **Suggested Answers**:
+           
+           | Option | Answer | Implications |
+           |--------|--------|--------------|
+           | A      | [First suggested answer] | [What this means for the feature] |
+           | B      | [Second suggested answer] | [What this means for the feature] |
+           | C      | [Third suggested answer] | [What this means for the feature] |
+           | Custom | Provide your own answer | [Explain how to provide custom input] |
+           
+           **Your choice**: _[Wait for user response]_
+           ```
+
+        4. **CRITICAL - Table Formatting**: Ensure markdown tables are properly formatted:
+           - Use consistent spacing with pipes aligned
+           - Each cell should have spaces around content: `| Content |` not `|Content|`
+           - Header separator must have at least 3 dashes: `|--------|`
+           - Test that the table renders correctly in markdown preview
+        5. Number questions sequentially (Q1, Q2, Q3 - max 3 total)
+        6. Present all questions together before waiting for responses
+        7. Wait for user to respond with their choices for all questions (e.g., "Q1: A, Q2: Custom - [details], Q3: B")
+        8. Update the spec by replacing each [NEEDS CLARIFICATION] marker with the user's selected or provided answer
+        9. Re-run validation after all clarifications are resolved
+
+   d. **Update Checklist**: After each validation iteration, update the checklist file with current pass/fail status
+
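The resolution order for `SPECIFY_FEATURE_DIRECTORY` in step 3 can be sketched in Python as below. This is a minimal illustration, not the Spec Kit implementation: the function names and parameters are hypothetical, and "next available 3-digit number" is assumed to mean one more than the highest existing `NNN-` prefix.

```python
import re
from datetime import datetime


def next_sequential_prefix(existing_names):
    """Return the next 3-digit prefix (assumed: highest existing NNN + 1)."""
    numbers = [
        int(m.group(1))
        for name in existing_names
        if (m := re.match(r"^(\d{3})-", name))  # ignores timestamp-style names
    ]
    return f"{max(numbers, default=0) + 1:03d}"


def resolve_feature_directory(short_name, existing_names,
                              branch_numbering=None, explicit_dir=None, now=None):
    """Hypothetical helper mirroring the documented resolution order."""
    # 1. An explicitly provided directory is used as-is.
    if explicit_dir:
        return explicit_dir
    # 2. Otherwise auto-generate "<prefix>-<short-name>" under specs/.
    if branch_numbering == "timestamp":
        prefix = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    else:  # "sequential" or absent
        prefix = next_sequential_prefix(existing_names)
    return f"specs/{prefix}-{short_name}"
```

Here `existing_names` stands for the directory names found under `specs/`, and `branch_numbering` for the value read from `.specify/init-options.json`; reading those from disk is left out to keep the sketch short.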
+## Mandatory Post-Execution Hooks
+
+**You MUST complete this section before reporting completion to the user.**
+
+Check if `.specify/extensions.yml` exists in the project root.
+- If it does not exist, or no hooks are registered under `hooks.after_specify`, skip to the Completion Report.
+- If it exists, read it and look for entries under the `hooks.after_specify` key.
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue to the Completion Report.
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Mandatory hook** (`optional: false`) — **You MUST emit `EXECUTE_COMMAND:` for each mandatory hook**:
+    ```
+    ## Extension Hooks
+
+    **Automatic Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+    ```
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+
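The hook selection rules above can be sketched in Python as follows. This is an illustrative sketch only: it assumes `.specify/extensions.yml` has already been parsed into a dictionary, the function names are hypothetical, and treating a hook with no `optional` field as optional is an assumption the rules above do not settle.

```python
def slash_command(command_name):
    """Turn a hook command name into a slash command (dots become hyphens)."""
    return "/" + command_name.replace(".", "-")


def executable_hooks(config, event):
    """Return (mandatory, optional) slash commands for one hook event.

    `config` is the already-parsed extensions file; `event` is a key such
    as "after_specify".
    """
    hooks = (config.get("hooks") or {}).get(event) or []
    mandatory, optional = [], []
    for hook in hooks:
        # Only an explicit `enabled: false` disables a hook.
        if hook.get("enabled") is False:
            continue
        # A non-empty condition is never evaluated here; the hook is skipped.
        if hook.get("condition"):
            continue
        # Assumption: only an explicit `optional: false` makes a hook mandatory.
        target = mandatory if hook.get("optional") is False else optional
        target.append(slash_command(hook["command"]))
    return mandatory, optional
```

For the `after_specify` entries in `.specify/extensions.yml`, both hooks are optional, so the mandatory list is empty and the optional list holds `/speckit-git-commit` and `/speckit-agent-context-update`.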
+## Completion Report
+
+Report completion to the user with:
+- `SPECIFY_FEATURE_DIRECTORY` — the feature directory path
+- `SPEC_FILE` — the spec file path
+- Checklist results summary
+- Readiness for the next phase (`/speckit-clarify` or `/speckit-plan`)
+
+**NOTE:** Branch creation is handled by the `before_specify` hook (git extension). Spec directory and file creation are always handled by this core command.
+
+## Quick Guidelines
+
+- Focus on **WHAT** users need and **WHY**.
+- Avoid HOW to implement (no tech stack, APIs, code structure).
+- Written for business stakeholders, not developers.
+- DO NOT create any checklists that are embedded in the spec. That will be a separate command.
+
+### Section Requirements
+
+- **Mandatory sections**: Must be completed for every feature
+- **Optional sections**: Include only when relevant to the feature
+- When a section doesn't apply, remove it entirely (don't leave as "N/A")
+
+### For AI Generation
+
+When creating this spec from a user prompt:
+
+1. **Make informed guesses**: Use context, industry standards, and common patterns to fill gaps
+2. **Document assumptions**: Record reasonable defaults in the Assumptions section
+3. **Limit clarifications**: Maximum 3 [NEEDS CLARIFICATION] markers - use only for critical decisions that:
+   - Significantly impact feature scope or user experience
+   - Have multiple reasonable interpretations with different implications
+   - Lack any reasonable default
+4. **Prioritize clarifications**: scope > security/privacy > user experience > technical details
+5. **Think like a tester**: Every vague requirement should fail the "testable and unambiguous" checklist item
+6. **Common areas needing clarification** (only if no reasonable default exists):
+   - Feature scope and boundaries (include/exclude specific use cases)
+   - User types and permissions (if multiple conflicting interpretations possible)
+   - Security/compliance requirements (when legally/financially significant)
+
+**Examples of reasonable defaults** (don't ask about these):
+
+- Data retention: Industry-standard practices for the domain
+- Performance targets: Standard web/mobile app expectations unless specified
+- Error handling: User-friendly messages with appropriate fallbacks
+- Authentication method: Standard session-based or OAuth2 for web apps
+- Integration patterns: Use project-appropriate patterns (REST/GraphQL for web services, function calls for libraries, CLI args for tools, etc.)
+
+### Success Criteria Guidelines
+
+Success criteria must be:
+
+1. **Measurable**: Include specific metrics (time, percentage, count, rate)
+2. **Technology-agnostic**: No mention of frameworks, languages, databases, or tools
+3. **User-focused**: Describe outcomes from user/business perspective, not system internals
+4. **Verifiable**: Can be tested/validated without knowing implementation details
+
+**Good examples**:
+
+- "Users can complete checkout in under 3 minutes"
+- "System supports 10,000 concurrent users"
+- "95% of searches return results in under 1 second"
+- "Task completion rate improves by 40%"
+
+**Bad examples** (implementation-focused):
+
+- "API response time is under 200ms" (too technical, use "Users see results instantly")
+- "Database can handle 1000 TPS" (implementation detail, use user-facing metric)
+- "React components render efficiently" (framework-specific)
+- "Redis cache hit rate above 80%" (technology-specific)
+
+## Done When
+
+- [ ] Specification written to `SPEC_FILE` and validated against quality checklist
+- [ ] Extension hooks dispatched or skipped according to the rules in Mandatory Post-Execution Hooks above
+- [ ] Completion reported to user with feature directory, spec file path, and checklist results
--- a/.agents/skills/speckit-tasks/SKILL.md
+++ b/.agents/skills/speckit-tasks/SKILL.md
@ -0,0 +1,212 @@
+---
+name: "speckit-tasks"
+description: "Generate an actionable, dependency-ordered tasks.md for the feature based on available design artifacts."
+compatibility: "Requires spec-kit project structure with .specify/ directory"
+metadata:
+  author: "github-spec-kit"
+  source: "templates/commands/tasks.md"
+---
+
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Pre-Execution Checks
+
+**Check for extension hooks (before tasks generation)**:
+- Check if `.specify/extensions.yml` exists in the project root.
+- If it exists, read it and look for entries under the `hooks.before_tasks` key
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Pre-Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+  - **Mandatory hook** (`optional: false`):
+    ```
+    ## Extension Hooks
+
+    **Automatic Pre-Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+    
+    Wait for the result of the hook command before proceeding to the Outline.
+    ```
+- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently
+
+## Outline
+
+1. **Setup**: Run `.specify/scripts/bash/setup-tasks.sh --json` from repo root and parse FEATURE_DIR, TASKS_TEMPLATE, and the AVAILABLE_DOCS list. `FEATURE_DIR` and `TASKS_TEMPLATE` must be absolute paths when provided. `AVAILABLE_DOCS` is a list of document names/relative paths available under `FEATURE_DIR` (for example `research.md` or `contracts/`). For single quotes in args like "I'm Groot", use escape syntax, e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+2. **Load design documents**: Read from FEATURE_DIR:
+   - **Required**: plan.md (tech stack, libraries, structure), spec.md (user stories with priorities)
+   - **Optional**: data-model.md (entities), contracts/ (interface contracts), research.md (decisions), quickstart.md (test scenarios)
+   - **IF EXISTS**: Load `.specify/memory/constitution.md` for project principles and governance constraints
+   - Note: Not all projects have all documents. Generate tasks based on what's available.
+
+3. **Execute task generation workflow**:
+   - Load plan.md and extract tech stack, libraries, project structure
+   - Load spec.md and extract user stories with their priorities (P1, P2, P3, etc.)
+   - If data-model.md exists: Extract entities and map to user stories
+   - If contracts/ exists: Map interface contracts to user stories
+   - If research.md exists: Extract decisions for setup tasks
+   - Generate tasks organized by user story (see Task Generation Rules below)
+   - Generate dependency graph showing user story completion order
+   - Create parallel execution examples per user story
+   - Validate task completeness (each user story has all needed tasks, independently testable)
+
+4. **Generate tasks.md**: Read the tasks template from TASKS_TEMPLATE (from the JSON output above) and use it as structure. If TASKS_TEMPLATE is empty, fall back to `.specify/templates/tasks-template.md`. Fill with:
+   - Correct feature name from plan.md
+   - Phase 1: Setup tasks (project initialization)
+   - Phase 2: Foundational tasks (blocking prerequisites for all user stories)
+   - Phase 3+: One phase per user story (in priority order from spec.md)
+   - Each phase includes: story goal, independent test criteria, tests (if requested), implementation tasks
+   - Final Phase: Polish & cross-cutting concerns
+   - All tasks must follow the strict checklist format (see Task Generation Rules below)
+   - Clear file paths for each task
+   - Dependencies section showing story completion order
+   - Parallel execution examples per story
+   - Implementation strategy section (MVP first, incremental delivery)
+
+## Mandatory Post-Execution Hooks
+
+**You MUST complete this section before reporting completion to the user.**
+
+Check if `.specify/extensions.yml` exists in the project root.
+- If it does not exist, or no hooks are registered under `hooks.after_tasks`, skip to the Completion Report.
+- If it exists, read it and look for entries under the `hooks.after_tasks` key.
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue to the Completion Report.
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Mandatory hook** (`optional: false`) — **You MUST emit `EXECUTE_COMMAND:` for each mandatory hook**:
+    ```
+    ## Extension Hooks
+
+    **Automatic Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+    ```
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+
+## Completion Report
+
+Output path to generated tasks.md and summary:
+- Total task count
+- Task count per user story
+- Parallel opportunities identified
+- Independent test criteria for each story
+- Suggested MVP scope (typically just User Story 1)
+- Format validation: Confirm ALL tasks follow the checklist format (checkbox, ID, labels, file paths)
+
+Context for task generation: $ARGUMENTS
+
+The tasks.md should be immediately executable - each task must be specific enough that an LLM can complete it without additional context.
+
+## Task Generation Rules
+
+**CRITICAL**: Tasks MUST be organized by user story to enable independent implementation and testing.
+
+**Tests are OPTIONAL**: Only generate test tasks if explicitly requested in the feature specification or if user requests TDD approach.
+
+### Checklist Format (REQUIRED)
+
+Every task MUST strictly follow this format:
+
+```text
+- [ ] [TaskID] [P?] [Story?] Description with file path
+```
+
+**Format Components**:
+
+1. **Checkbox**: ALWAYS start with `- [ ]` (markdown checkbox)
+2. **Task ID**: Sequential number (T001, T002, T003...) in execution order
+3. **[P] marker**: Include ONLY if task is parallelizable (different files, no dependencies on incomplete tasks)
+4. **[Story] label**: REQUIRED for user story phase tasks only
+   - Format: [US1], [US2], [US3], etc. (maps to user stories from spec.md)
+   - Setup phase: NO story label
+   - Foundational phase: NO story label
+   - User Story phases: MUST have story label
+   - Polish phase: NO story label
+5. **Description**: Clear action with exact file path
+
+**Examples**:
+
+- ✅ CORRECT: `- [ ] T001 Create project structure per implementation plan`
+- ✅ CORRECT: `- [ ] T005 [P] Implement authentication middleware in src/middleware/auth.py`
+- ✅ CORRECT: `- [ ] T012 [P] [US1] Create User model in src/models/user.py`
+- ✅ CORRECT: `- [ ] T014 [US1] Implement UserService in src/services/user_service.py`
+- ❌ WRONG: `- [ ] Create User model` (missing ID and Story label)
+- ❌ WRONG: `T001 [US1] Create model` (missing checkbox)
+- ❌ WRONG: `- [ ] [US1] Create User model` (missing Task ID)
+- ❌ WRONG: `- [ ] T001 [US1] Create model` (missing file path)
+
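The checklist format above lends itself to a mechanical check. The following Python sketch is one possible validator, not part of Spec Kit: the pattern and the `parse_task` name are hypothetical, and it only checks the line shape (checkbox, task ID, optional `[P]`, optional story label, description). It does not verify that the description contains a file path or that story labels appear only in user story phases.

```python
import re

# Assumed shape: "- [ ] T001 [P] [US1] Description"
TASK_LINE = re.compile(
    r"^- \[ \] (?P<id>T\d{3,})"       # checkbox and task ID
    r"(?P<parallel> \[P\])?"          # optional parallel marker
    r"(?: \[(?P<story>US\d+)\])?"     # optional story label
    r" (?P<desc>\S.*)$"               # description
)


def parse_task(line):
    """Return the parsed fields of a task line, or None if it is malformed."""
    m = TASK_LINE.match(line)
    if m is None:
        return None
    return {
        "id": m["id"],
        "parallel": m["parallel"] is not None,
        "story": m["story"],
        "description": m["desc"],
    }
```

Lines missing the checkbox or the task ID return `None`, which matches the first three WRONG examples above; the fourth (missing file path) would need an additional check on the description.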
+### Task Organization
+
+1. **From User Stories (spec.md)** - PRIMARY ORGANIZATION:
+   - Each user story (P1, P2, P3...) gets its own phase
+   - Map all related components to their story:
+     - Models needed for that story
+     - Services needed for that story
+     - Interfaces/UI needed for that story
+     - If tests requested: Tests specific to that story
+   - Mark story dependencies (most stories should be independent)
+
+2. **From Contracts**:
+   - Map each interface contract to the user story it serves
+   - If tests requested: Each interface contract → contract test task [P] before implementation in that story's phase
+
+3. **From Data Model**:
+   - Map each entity to the user story(ies) that need it
+   - If entity serves multiple stories: Put in earliest story or Setup phase
+   - Relationships → service layer tasks in appropriate story phase
+
+4. **From Setup/Infrastructure**:
+   - Shared infrastructure → Setup phase (Phase 1)
+   - Foundational/blocking tasks → Foundational phase (Phase 2)
+   - Story-specific setup → within that story's phase
+
+### Phase Structure
+
+- **Phase 1**: Setup (project initialization)
+- **Phase 2**: Foundational (blocking prerequisites - MUST complete before user stories)
+- **Phase 3+**: User Stories in priority order (P1, P2, P3...)
+  - Within each story: Tests (if requested) → Models → Services → Endpoints → Integration
+  - Each phase should be a complete, independently testable increment
+- **Final Phase**: Polish & Cross-Cutting Concerns
+
+## Done When
+
+- [ ] tasks.md generated with all phases, task IDs, and file paths
+- [ ] Extension hooks dispatched or skipped according to the rules in Mandatory Post-Execution Hooks above
+- [ ] Completion reported to user with task count, story breakdown, and MVP scope
--- a/.agents/skills/speckit-taskstoissues/SKILL.md
+++ b/.agents/skills/speckit-taskstoissues/SKILL.md
@ -0,0 +1,104 @@
+---
+name: "speckit-taskstoissues"
+description: "Convert existing tasks into actionable, dependency-ordered GitHub issues for the feature based on available design artifacts."
+compatibility: "Requires spec-kit project structure with .specify/ directory"
+metadata:
+  author: "github-spec-kit"
+  source: "templates/commands/taskstoissues.md"
+---
+
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Pre-Execution Checks
+
+**Check for extension hooks (before tasks-to-issues conversion)**:
+- Check if `.specify/extensions.yml` exists in the project root.
+- If it exists, read it and look for entries under the `hooks.before_taskstoissues` key
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Pre-Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+  - **Mandatory hook** (`optional: false`):
+    ```
+    ## Extension Hooks
+
+    **Automatic Pre-Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+
+    Wait for the result of the hook command before proceeding to the Outline.
+    ```
+- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently
+
+## Outline
+
+1. Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` from repo root and parse FEATURE_DIR and the AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax, e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+1. **IF EXISTS**: Load `.specify/memory/constitution.md` for project principles and governance constraints.
+1. From the executed script, extract the path to **tasks**.
+1. Get the Git remote by running:
+
+```bash
+git config --get remote.origin.url
+```
+
+> [!CAUTION]
+> ONLY PROCEED TO NEXT STEPS IF THE REMOTE IS A GITHUB URL
+
+1. For each task in the list, use the GitHub MCP server to create a new issue in the repository that the Git remote points to.
+
+> [!CAUTION]
+> UNDER NO CIRCUMSTANCES EVER CREATE ISSUES IN REPOSITORIES THAT DO NOT MATCH THE REMOTE URL
+
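The two cautions above amount to a guard on the remote URL. A minimal Python sketch of such a guard follows, assuming only `github.com` remotes count as GitHub URLs (GitHub Enterprise hosts are not handled). The pattern and the `github_repo` name are hypothetical and not part of Spec Kit.

```python
import re

# Accepts HTTPS, scp-style SSH, and ssh:// remotes on github.com only.
GITHUB_REMOTE = re.compile(
    r"^(?:https://(?:[^@/]+@)?github\.com/|git@github\.com:|ssh://git@github\.com/)"
    r"(?P<owner>[^/]+)/(?P<repo>[^/]+?)(?:\.git)?/?$"
)


def github_repo(remote_url):
    """Return (owner, repo) for a github.com remote, or None otherwise."""
    m = GITHUB_REMOTE.match(remote_url.strip())
    return (m["owner"], m["repo"]) if m else None
```

Issue creation would proceed only when `github_repo` returns a pair, and every issue would be created in exactly that owner and repository.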
+## Post-Execution Checks
+
+**Check for extension hooks (after tasks-to-issues conversion)**:
+- Check if `.specify/extensions.yml` exists in the project root.
+- If it exists, read it and look for entries under the `hooks.after_taskstoissues` key
+- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
+- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
+- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
+  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
+  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
+- When constructing slash commands from hook command names, replace dots (`.`) with hyphens (`-`). For example, `speckit.git.commit` → `/speckit-git-commit`.
+- For each executable hook, output the following based on its `optional` flag:
+  - **Optional hook** (`optional: true`):
+    ```
+    ## Extension Hooks
+
+    **Optional Hook**: {extension}
+    Command: `/{command}`
+    Description: {description}
+
+    Prompt: {prompt}
+    To execute: `/{command}`
+    ```
+  - **Mandatory hook** (`optional: false`):
+    ```
+    ## Extension Hooks
+
+    **Automatic Hook**: {extension}
+    Executing: `/{command}`
+    EXECUTE_COMMAND: {command}
+    ```
+- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently
--- a/.codex
+++ b/.codex
--- a/.env.example
+++ b/.env.example
@ -1,24 +1,56 @@
-# Standalone app-instance defaults
+# Shared values used by the root deployment flow in README.md

-BEAVER_PUBLIC_URL=http://172.19.0.245:18080
-BEAVER_HOST_IP=172.19.0.245
-BEAVER_HOST_PORT=18080
+PROJECT_ROOT=/home/ivan/xuan/beaver_project
+BEAVER_NET=beaver-instance-edge
+BEAVER_PROXY_CONTAINER_NAME=beaver-router-proxy

+BEAVER_DEPLOY_TOKEN=change-me
+BEAVER_AUTHZ_INTERNAL_TOKEN=change-me
+
+BEAVER_SERVER_IP=127.0.0.1
+BEAVER_BASE_DOMAIN=localhost
+
+BEAVER_PROVIDER=openai
+BEAVER_MODEL=openai/gpt-5
+BEAVER_API_KEY=sk-xxxxxxxx
+BEAVER_API_BASE=
+
+# Per-instance Beaver backend config. In Docker app-instance this should point
+# to the mounted single-user sandbox config, not to frontend env.
 BEAVER_HOME=/root/.beaver
 BEAVER_CONFIG_PATH=/root/.beaver/config.json
 BEAVER_WORKSPACE=/root/.beaver/workspace

-# Keycloak SSO
-BEAVER_KEYCLOAK_ISSUER=https://keycloak.bwgdi.com/realms/beaver
-BEAVER_KEYCLOAK_CLIENT_ID=beaver-agnet
-BEAVER_KEYCLOAK_REDIRECT_URI=http://172.19.0.245:18080/auth/callback
-BEAVER_KEYCLOAK_POST_LOGOUT_REDIRECT_URI=http://172.19.0.245:18080/logout/callback
+# Must be reachable from app-instance containers.
+BEAVER_AUTHZ_URL=http://beaver-authz-service:19090
+BEAVER_OUTLOOK_MCP_URL=
+BEAVER_OUTLOOK_MCP_SERVER_ID=outlook_mcp

-# Model provider settings used when generating config.json with run-standalone.sh
-BEAVER_PROVIDER=openai
-BEAVER_MODEL=openai/gpt-5
-BEAVER_API_KEY=
-BEAVER_API_BASE=
-
-# Optional runtime limits
+# User file system backed by MinIO/S3.
+BEAVER_MINIO_ROOT_USER=
+BEAVER_MINIO_ROOT_PASSWORD=
+BEAVER_USER_FILES_BUCKET=beaver-user-files
+BEAVER_USER_FILES_MINIO_ENDPOINT=
 BEAVER_USER_FILES_MAX_UPLOAD_BYTES=5368709120
+
+# Must be reachable from auth-portal and authz-service containers.
+BEAVER_DEPLOY_URL=http://beaver-deploy-control:8090
+
+# External connector sidecar
+EXTERNAL_CONNECTOR_BASE_URL=http://external-connector:8787
+# Required for connector management API authentication.
+EXTERNAL_CONNECTOR_TOKEN=change-me-connector-token
+# Required for sidecar -> Beaver bridge authentication.
+BEAVER_BRIDGE_TOKEN=change-me-bridge-token
+BEAVER_BRIDGE_BASE_URL=http://app-instance:8080
+EXTERNAL_CONNECTOR_PORT=8787
+CONNECTOR_PUBLIC_BASE_URL=http://localhost:8787
+# fake | official | vendor_cli | weixin_ilink | feishu_bot
+CONNECTOR_PROVIDER=official
+CONNECTOR_COMMAND_TIMEOUT_SECONDS=120
+WEIXIN_CONNECT_COMMAND=
+WEIXIN_STATUS_COMMAND=
+WEIXIN_SEND_COMMAND=
+FEISHU_CONNECT_COMMAND=
+FEISHU_STATUS_COMMAND=
+FEISHU_SEND_COMMAND=
--- a/.specify/extensions.yml
+++ b/.specify/extensions.yml
@ -0,0 +1,164 @@
+installed:
+- agent-context
+- git
+settings:
+  auto_execute_hooks: true
+hooks:
+  before_constitution:
+  - extension: git
+    command: speckit.git.initialize
+    enabled: true
+    optional: false
+    prompt: Execute speckit.git.initialize?
+    description: Initialize Git repository before constitution setup
+    condition: null
+  before_specify:
+  - extension: git
+    command: speckit.git.feature
+    enabled: true
+    optional: false
+    prompt: Execute speckit.git.feature?
+    description: Create feature branch before specification
+    condition: null
+  before_clarify:
+  - extension: git
+    command: speckit.git.commit
+    enabled: true
+    optional: true
+    prompt: Commit outstanding changes before clarification?
+    description: Auto-commit before spec clarification
+    condition: null
+  before_plan:
+  - extension: git
+    command: speckit.git.commit
+    enabled: true
+    optional: true
+    prompt: Commit outstanding changes before planning?
+    description: Auto-commit before implementation planning
+    condition: null
+  before_tasks:
+  - extension: git
+    command: speckit.git.commit
+    enabled: true
+    optional: true
+    prompt: Commit outstanding changes before task generation?
+    description: Auto-commit before task generation
+    condition: null
+  before_implement:
+  - extension: git
+    command: speckit.git.commit
+    enabled: true
+    optional: true
+    prompt: Commit outstanding changes before implementation?
+    description: Auto-commit before implementation
+    condition: null
+  before_checklist:
+  - extension: git
+    command: speckit.git.commit
+    enabled: true
+    optional: true
+    prompt: Commit outstanding changes before checklist?
+    description: Auto-commit before checklist generation
+    condition: null
+  before_analyze:
+  - extension: git
+    command: speckit.git.commit
+    enabled: true
+    optional: true
+    prompt: Commit outstanding changes before analysis?
+    description: Auto-commit before analysis
+    condition: null
+  before_taskstoissues:
+  - extension: git
+    command: speckit.git.commit
+    enabled: true
+    optional: true
+    prompt: Commit outstanding changes before issue sync?
+    description: Auto-commit before tasks-to-issues conversion
+    condition: null
+  after_constitution:
+  - extension: git
+    command: speckit.git.commit
+    enabled: true
+    optional: true
+    prompt: Commit constitution changes?
+    description: Auto-commit after constitution update
+    condition: null
+  after_specify:
+  - extension: git
+    command: speckit.git.commit
+    enabled: true
+    optional: true
+    prompt: Commit specification changes?
+    description: Auto-commit after specification
+    condition: null
+  - extension: agent-context
+    command: speckit.agent-context.update
+    enabled: true
+    optional: true
+    prompt: Execute speckit.agent-context.update?
+    description: Refresh agent context after specification
+    condition: null
+  after_clarify:
+  - extension: git
+    command: speckit.git.commit
+    enabled: true
+    optional: true
+    prompt: Commit clarification changes?
+    description: Auto-commit after spec clarification
+    condition: null
+  after_plan:
+  - extension: git
+    command: speckit.git.commit
+    enabled: true
+    optional: true
+    prompt: Commit plan changes?
+    description: Auto-commit after implementation planning
+    condition: null
+  - extension: agent-context
+    command: speckit.agent-context.update
+    enabled: true
+    optional: true
+    prompt: Execute speckit.agent-context.update?
+    description: Refresh agent context after planning
+    condition: null
+  after_tasks:
+  - extension: git
+    command: speckit.git.commit
+    enabled: true
+    optional: true
+    prompt: Commit task changes?
+    description: Auto-commit after task generation
+    condition: null
+  after_implement:
+  - extension: git
+    command: speckit.git.commit
+    enabled: true
+    optional: true
+    prompt: Commit implementation changes?
+    description: Auto-commit after implementation
+    condition: null
+  after_checklist:
+  - extension: git
+    command: speckit.git.commit
+    enabled: true
+    optional: true
+    prompt: Commit checklist changes?
+    description: Auto-commit after checklist generation
+    condition: null
+  after_analyze:
+  - extension: git
+    command: speckit.git.commit
+    enabled: true
+    optional: true
+    prompt: Commit analysis results?
+    description: Auto-commit after analysis
+    condition: null
+  after_taskstoissues:
+  - extension: git
+    command: speckit.git.commit
+    enabled: true
+    optional: true
+    prompt: Commit after syncing issues?
+    description: Auto-commit after tasks-to-issues conversion
+    condition: null
--- a/.specify/extensions/.registry
+++ b/.specify/extensions/.registry
@ -0,0 +1,47 @@
+{
+  "schema_version": "1.0",
+  "extensions": {
+    "git": {
+      "version": "1.0.0",
+      "source": "local",
+      "manifest_hash": "sha256:9731aa8143a72fbebfdb440f155038ab42642517c2b2bdbbf67c8fdbe076ed79",
+      "enabled": true,
+      "priority": 10,
+      "registered_commands": {
+        "agy": [
+          "speckit.git.feature",
+          "speckit.git.validate",
+          "speckit.git.remote",
+          "speckit.git.initialize",
+          "speckit.git.commit"
+        ],
+        "codex": [
+          "speckit.git.feature",
+          "speckit.git.validate",
+          "speckit.git.remote",
+          "speckit.git.initialize",
+          "speckit.git.commit"
+        ]
+      },
+      "registered_skills": [],
+      "installed_at": "2026-06-08T01:33:59.604628+00:00"
+    },
+    "agent-context": {
+      "version": "1.0.0",
+      "source": "local",
+      "manifest_hash": "sha256:9a1dc02d2d0139bb03860392ecacef79183be2c442feda2f9ccaa4e5907b1e47",
+      "enabled": true,
+      "priority": 10,
+      "registered_commands": {
+        "agy": [
+          "speckit.agent-context.update"
+        ],
+        "codex": [
+          "speckit.agent-context.update"
+        ]
+      },
+      "registered_skills": [],
+      "installed_at": "2026-06-08T01:33:59.640587+00:00"
+    }
+  }
+}
--- a/.specify/extensions/agent-context/README.md
+++ b/.specify/extensions/agent-context/README.md
@ -0,0 +1,57 @@
+# Coding Agent Context Extension
+
+This bundled extension manages the **coding agent context/instruction file** (e.g. `CLAUDE.md`, `.github/copilot-instructions.md`, `AGENTS.md`, `GEMINI.md`, …) for the active integration.
+
+It owns the lifecycle of the managed section delimited by the configurable start/end markers (defaults: `<!-- SPECKIT START -->` / `<!-- SPECKIT END -->`).
+
+## Why an extension?
+
+Not every Spec Kit user wants Spec Kit to write into the coding agent's context file. Extracting this behavior into a dedicated extension lets users:
+
+- **Opt out** entirely with `specify extension disable agent-context` — Spec Kit will then never create or modify the agent context file.
+- **Customize the markers** by editing `.specify/extensions/agent-context/agent-context-config.yml` — both the Python layer and the bundled scripts honor the same `context_markers` value.
+- **Refresh on demand** with `/speckit.agent-context.update`, or automatically through the hooks declared in `extension.yml` (`after_specify`, `after_plan`).
+
+## Commands
+
+| Command | Description |
+|---------|-------------|
+| `speckit.agent-context.update` | Refresh the managed section in the agent context file with the current plan path. |
+
+## Configuration
+
+All configuration flows through the extension's own config file at
+`.specify/extensions/agent-context/agent-context-config.yml`:
+
+```yaml
+# Path to the coding agent context file managed by this extension
+context_file: CLAUDE.md
+
+# Delimiters for the managed Spec Kit section
+context_markers:
+  start: "<!-- SPECKIT START -->"
+  end: "<!-- SPECKIT END -->"
+```
+
+- `context_file` — the project-relative path to the coding agent context file, written by `specify init` and `specify integration install`.
+- `context_markers.start` / `.end` — the delimiters around the managed section. Edit these to use custom markers.
+
+## Requirements
+
+The bundled update scripts require **Python 3** with **PyYAML** for YAML/upsert processing (PowerShell can also use `ConvertFrom-Yaml` when available).
+
+PyYAML ships with the `specify` CLI and is normally available via the same `python3` interpreter. If a hook reports *"PyYAML is required … not available in the current Python environment"*, it means the system `python3` differs from the one used to install Spec Kit. To resolve, run:
+
+```bash
+pip install pyyaml
+# or target the specific interpreter Spec Kit uses:
+/path/to/speckit-python -m pip install pyyaml
+```
+
+## Disable
+
+```bash
+specify extension disable agent-context
+```
+
+When disabled, Spec Kit skips context file creation, updates, and removal (the gates are inside `upsert_context_section()` and `remove_context_section()`).
--- a/.specify/extensions/agent-context/agent-context-config.yml
+++ b/.specify/extensions/agent-context/agent-context-config.yml
@ -0,0 +1,4 @@
+context_file: AGENTS.md
+context_markers:
+  start: <!-- SPECKIT START -->
+  end: <!-- SPECKIT END -->
--- a/.specify/extensions/agent-context/commands/speckit.agent-context.update.md
+++ b/.specify/extensions/agent-context/commands/speckit.agent-context.update.md
@ -0,0 +1,26 @@
+---
+description: "Refresh the managed Spec Kit section in the coding agent context file"
+---
+
+# Update Coding Agent Context
+
+Refresh the managed Spec Kit section inside the active coding agent's context/instruction file (e.g. `CLAUDE.md`, `.github/copilot-instructions.md`, `AGENTS.md`).
+
+## Behavior
+
+The script reads the agent-context extension config at
+`.specify/extensions/agent-context/agent-context-config.yml` to discover:
+
+- `context_file` — the path of the coding agent context file to manage.
+- `context_markers.start` / `.end` — the delimiters surrounding the managed section. Defaults to `<!-- SPECKIT START -->` and `<!-- SPECKIT END -->` when the field is missing.
+
+It then creates, replaces, or appends the managed block so that the section points at the most recent plan path when one can be discovered (`specs/<feature>/plan.md`).
+
+If `context_file` is empty or the file cannot be located, the command reports nothing to do and exits successfully.
+
+## Execution
+
+- **Bash**: `.specify/extensions/agent-context/scripts/bash/update-agent-context.sh [plan_path]`
+- **PowerShell**: `.specify/extensions/agent-context/scripts/powershell/update-agent-context.ps1 [plan_path]`
+
+When `plan_path` is omitted, the script auto-detects the most recently modified `specs/*/plan.md`.
--- a/.specify/extensions/agent-context/extension.yml
+++ b/.specify/extensions/agent-context/extension.yml
@ -0,0 +1,34 @@
+schema_version: "1.0"
+
+extension:
+  id: agent-context
+  name: "Coding Agent Context"
+  version: "1.0.0"
+  description: "Manages coding agent context/instruction files (e.g., CLAUDE.md, copilot-instructions.md) with project-specific plan references and configurable markers"
+  author: spec-kit-core
+  repository: https://github.com/github/spec-kit
+  license: MIT
+
+requires:
+  speckit_version: ">=0.2.0"
+
+provides:
+  commands:
+    - name: speckit.agent-context.update
+      file: commands/speckit.agent-context.update.md
+      description: "Refresh the managed Spec Kit section in the coding agent context file"
+
+hooks:
+  after_specify:
+    command: speckit.agent-context.update
+    optional: true
+    description: "Refresh agent context after specification"
+  after_plan:
+    command: speckit.agent-context.update
+    optional: true
+    description: "Refresh agent context after planning"
+
+tags:
+  - "agent"
+  - "context"
+  - "core"
--- a/.specify/extensions/agent-context/scripts/bash/update-agent-context.sh
+++ b/.specify/extensions/agent-context/scripts/bash/update-agent-context.sh
@ -0,0 +1,200 @@
+#!/usr/bin/env bash
+# update-agent-context.sh
+#
+# Refresh the managed Spec Kit section in the coding agent's context file
+# (e.g. CLAUDE.md, .github/copilot-instructions.md, AGENTS.md).
+#
+# Reads `context_file` and `context_markers.{start,end}` from the
+# agent-context extension config:
+#   .specify/extensions/agent-context/agent-context-config.yml
+#
+# Usage: update-agent-context.sh [plan_path]
+#
+# When `plan_path` is omitted, the script picks the most recently modified
+# `specs/*/plan.md` if any exist, otherwise emits the section without a
+# concrete plan path.
+
+set -euo pipefail
+
+PROJECT_ROOT="$(pwd)"
+EXT_CONFIG="$PROJECT_ROOT/.specify/extensions/agent-context/agent-context-config.yml"
+DEFAULT_START="<!-- SPECKIT START -->"
+DEFAULT_END="<!-- SPECKIT END -->"
+
+if [[ ! -f "$EXT_CONFIG" ]]; then
+  echo "agent-context: $EXT_CONFIG not found; nothing to do." >&2
+  exit 0
+fi
+
+# Locate a suitable Python interpreter (python3, then python).
+_python=""
+if command -v python3 >/dev/null 2>&1; then
+  _python="python3"
+elif command -v python >/dev/null 2>&1 && python --version 2>&1 | grep -q "^Python 3"; then
+  _python="python"
+fi
+
+if [[ -z "$_python" ]]; then
+  echo "agent-context: Python 3 not found on PATH; skipping update." >&2
+  exit 0
+fi
+
+# Parse extension config once; emit three newline-separated fields:
+# context_file, context_markers.start, context_markers.end
+if ! _raw_opts="$("$_python" - "$EXT_CONFIG" <<'PY'
+import sys
+try:
+    import yaml
+except ImportError:
+    print(
+        "agent-context: PyYAML is required to parse extension config but is not available "
+        "in the current Python environment.\n"
+        "  To resolve: pip install pyyaml (or install it into the environment used by python3).\n"
+        "  Context file will not be updated until PyYAML is importable.",
+        file=sys.stderr,
+    )
+    sys.exit(2)
+try:
+    with open(sys.argv[1], "r", encoding="utf-8") as fh:
+        data = yaml.safe_load(fh)
+except Exception as exc:
+    print(
+        f"agent-context: unable to parse {sys.argv[1]} ({exc}); cannot update context.",
+        file=sys.stderr,
+    )
+    sys.exit(2)
+if not isinstance(data, dict):
+    data = {}
+def get_str(obj, *keys):
+    node = obj
+    for k in keys:
+        if isinstance(node, dict) and k in node:
+            node = node[k]
+        else:
+            return ""
+    return node if isinstance(node, str) else ""
+print(get_str(data, "context_file"))
+print(get_str(data, "context_markers", "start"))
+print(get_str(data, "context_markers", "end"))
+PY
+)"; then
+  echo "agent-context: skipping update (see above for details)." >&2
+  exit 0
+fi
+
+_opts_lines=()
+while IFS= read -r _line || [[ -n "$_line" ]]; do
+  _opts_lines+=("$_line")
+done < <(printf '%s\n' "$_raw_opts")
+if (( ${#_opts_lines[@]} < 3 )); then
+  echo "agent-context: malformed config parser output; expected 3 lines (context_file, marker_start, marker_end), got ${#_opts_lines[@]}; skipping update." >&2
+  exit 0
+fi
+CONTEXT_FILE="${_opts_lines[0]}"
+MARKER_START="${_opts_lines[1]}"
+MARKER_END="${_opts_lines[2]}"
+
+if [[ -z "$CONTEXT_FILE" ]]; then
+  echo "agent-context: context_file not set in extension config; nothing to do." >&2
+  exit 0
+fi
+
+# Reject absolute paths, backslash separators, and '..' path segments in context_file
+if [[ "$CONTEXT_FILE" == /* ]] || [[ "$CONTEXT_FILE" =~ ^[A-Za-z]: ]]; then
+  echo "agent-context: context_file must be a project-relative path; got '$CONTEXT_FILE'." >&2
+  exit 1
+fi
+if [[ "$CONTEXT_FILE" == *\\* ]]; then
+  echo "agent-context: context_file must not contain backslash separators; got '$CONTEXT_FILE'." >&2
+  exit 1
+fi
+IFS='/' read -ra _cf_parts <<< "$CONTEXT_FILE"
+for _seg in "${_cf_parts[@]}"; do
+  if [[ "$_seg" == ".." ]]; then
+    echo "agent-context: context_file must not contain '..' path segments; got '$CONTEXT_FILE'." >&2
+    exit 1
+  fi
+done
+unset _cf_parts _seg
+
+[[ -z "$MARKER_START" ]] && MARKER_START="$DEFAULT_START"
+[[ -z "$MARKER_END"   ]] && MARKER_END="$DEFAULT_END"
+
+PLAN_PATH="${1:-}"
+if [[ -z "$PLAN_PATH" ]]; then
+  # Pick the most recently modified plan.md one level deep (specs/<feature>/plan.md).
+  # Use Python (pathlib glob sorted by modification time) rather than ls/head,
+  # which is fragile with spaces in paths and can hit SIGPIPE under pipefail.
+  _plan_abs="$("$_python" - "$PROJECT_ROOT" <<'PY'
+import sys
+from pathlib import Path
+specs = Path(sys.argv[1]) / "specs"
+plans = sorted(
+    specs.glob("*/plan.md"),
+    key=lambda p: p.stat().st_mtime,
+    reverse=True,
+)
+print(plans[0] if plans else "")
+PY
+)"
+  if [[ -n "$_plan_abs" ]]; then
+    PLAN_PATH="${_plan_abs#"$PROJECT_ROOT/"}"
+  fi
+fi
+
+CTX_PATH="$PROJECT_ROOT/$CONTEXT_FILE"
+mkdir -p "$(dirname "$CTX_PATH")"
+
+# Build the managed section
+TMP_SECTION="$(mktemp)"
+trap 'rm -f "$TMP_SECTION"' EXIT
+{
+  echo "$MARKER_START"
+  echo "For additional context about technologies to be used, project structure,"
+  echo "shell commands, and other important information, read the current plan"
+  if [[ -n "$PLAN_PATH" ]]; then
+    echo "at $PLAN_PATH"
+  fi
+  echo "$MARKER_END"
+} > "$TMP_SECTION"
+
+"$_python" - "$CTX_PATH" "$MARKER_START" "$MARKER_END" "$TMP_SECTION" <<'PY'
+import sys, os
+ctx_path, start, end, section_path = sys.argv[1:5]
+with open(section_path, "r", encoding="utf-8") as fh:
+    section = fh.read().rstrip("\n") + "\n"
+
+if os.path.exists(ctx_path):
+    with open(ctx_path, "r", encoding="utf-8-sig") as fh:
+        content = fh.read()
+    s = content.find(start)
+    e = content.find(end, s if s != -1 else 0)
+    if s != -1 and e != -1 and e > s:
+        end_of_marker = e + len(end)
+        if end_of_marker < len(content) and content[end_of_marker] == "\r":
+            end_of_marker += 1
+        if end_of_marker < len(content) and content[end_of_marker] == "\n":
+            end_of_marker += 1
+        new_content = content[:s] + section + content[end_of_marker:]
+    elif s != -1:
+        new_content = content[:s] + section
+    elif e != -1:
+        end_of_marker = e + len(end)
+        if end_of_marker < len(content) and content[end_of_marker] == "\r":
+            end_of_marker += 1
+        if end_of_marker < len(content) and content[end_of_marker] == "\n":
+            end_of_marker += 1
+        new_content = section + content[end_of_marker:]
+    else:
+        if content and not content.endswith("\n"):
+            content += "\n"
+        new_content = (content + "\n" + section) if content else section
+else:
+    new_content = section
+
+new_content = new_content.replace("\r\n", "\n").replace("\r", "\n")
+with open(ctx_path, "wb") as fh:
+    fh.write(new_content.encode("utf-8"))
+PY
+
+echo "agent-context: updated $CONTEXT_FILE"
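Review note: the marker upsert embedded in the heredoc above is easier to reason about as a pure function. The following is a minimal sketch of the same four cases (both markers present, only start, only end, neither), assuming a hypothetical helper name `upsert_section`; it is not part of the shipped script and only restates the logic shown in the diff.

```python
def upsert_section(content: str, section: str, start: str, end: str) -> str:
    """Return `content` with the managed block replaced, or appended if absent.

    `upsert_section` is a hypothetical name used for illustration only.
    """

    def skip_newline(pos: int) -> int:
        # Consume one trailing line ending (CR, LF, or CRLF) after the end marker.
        if pos < len(content) and content[pos] == "\r":
            pos += 1
        if pos < len(content) and content[pos] == "\n":
            pos += 1
        return pos

    s = content.find(start)
    e = content.find(end, s if s != -1 else 0)
    if s != -1 and e != -1 and e > s:
        # Both markers present: replace the block in place.
        new = content[:s] + section + content[skip_newline(e + len(end)):]
    elif s != -1:
        # Only the start marker: everything from it onward is replaced.
        new = content[:s] + section
    elif e != -1:
        # Only the end marker: everything up to and including it is replaced.
        new = section + content[skip_newline(e + len(end)):]
    else:
        # No markers: append after a blank line, or use the section alone.
        if content and not content.endswith("\n"):
            content += "\n"
        new = (content + "\n" + section) if content else section
    # Normalize line endings to LF, as the script does before writing.
    return new.replace("\r\n", "\n").replace("\r", "\n")
```

Written this way, the behavior can be checked without touching the filesystem; in particular, applying the function twice with the same section should leave the file unchanged.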
--- a/.specify/extensions/agent-context/scripts/powershell/update-agent-context.ps1
+++ b/.specify/extensions/agent-context/scripts/powershell/update-agent-context.ps1
@ -0,0 +1,237 @@
+#!/usr/bin/env pwsh
+# update-agent-context.ps1
+#
+# Refresh the managed Spec Kit section in the coding agent's context file
+# (e.g. CLAUDE.md, .github/copilot-instructions.md, AGENTS.md).
+#
+# Reads `context_file` and `context_markers.{start,end}` from the
+# agent-context extension config:
+#   .specify/extensions/agent-context/agent-context-config.yml
+#
+# Usage: update-agent-context.ps1 [plan_path]
+
+[CmdletBinding()]
+param(
+    [Parameter(Position = 0)]
+    [string]$PlanPath
+)
+
+function Get-ConfigValue {
+    param(
+        [AllowNull()][object]$Object,
+        [Parameter(Mandatory = $true)][string]$Key
+    )
+
+    if ($null -eq $Object) {
+        return $null
+    }
+    if ($Object -is [System.Collections.IDictionary]) {
+        return $Object[$Key]
+    }
+    $prop = $Object.PSObject.Properties[$Key]
+    if ($prop) {
+        return $prop.Value
+    }
+    return $null
+}
+
+function Test-ConfigObject {
+    param(
+        [AllowNull()][object]$Object
+    )
+
+    if ($null -eq $Object) {
+        return $false
+    }
+    if ($Object -is [System.Collections.IDictionary]) {
+        return $true
+    }
+    if ($Object -is [System.Management.Automation.PSCustomObject]) {
+        return $true
+    }
+    return $false
+}
+
+$ErrorActionPreference = 'Stop'
+$DefaultStart = '<!-- SPECKIT START -->'
+$DefaultEnd   = '<!-- SPECKIT END -->'
+$ProjectRoot  = (Get-Location).Path
+$ExtConfig    = Join-Path $ProjectRoot '.specify/extensions/agent-context/agent-context-config.yml'
+
+if (-not (Test-Path -LiteralPath $ExtConfig)) {
+    Write-Warning "agent-context: $ExtConfig not found; nothing to do."
+    exit 0
+}
+
+$Options = $null
+if (Get-Command ConvertFrom-Yaml -ErrorAction SilentlyContinue) {
+    try {
+        $Options = Get-Content -LiteralPath $ExtConfig -Raw | ConvertFrom-Yaml -ErrorAction Stop
+    } catch {
+        # fall through to Python fallback
+    }
+}
+
+if ($null -eq $Options) {
+    # ConvertFrom-Yaml unavailable or failed; fall back to Python+PyYAML.
+    $pythonCmd = $null
+    foreach ($candidate in @('python3', 'python')) {
+        if (Get-Command $candidate -ErrorAction SilentlyContinue) {
+            # Verify it is Python 3
+            $verOut = & $candidate --version 2>&1
+            if ($verOut -match 'Python 3') {
+                $pythonCmd = $candidate
+                break
+            }
+        }
+    }
+
+    if ($pythonCmd) {
+        try {
+            $jsonOut = & $pythonCmd -c @'
+import json
+import sys
+try:
+    import yaml
+except ImportError:
+    print(
+        "agent-context: PyYAML is required to parse extension config; cannot update context.",
+        file=sys.stderr,
+    )
+    sys.exit(2)
+
+try:
+    with open(sys.argv[1], "r", encoding="utf-8") as fh:
+        data = yaml.safe_load(fh)
+except Exception as exc:
+    print(
+        f"agent-context: unable to parse {sys.argv[1]} ({exc}); cannot update context.",
+        file=sys.stderr,
+    )
+    sys.exit(2)
+
+if not isinstance(data, dict):
+    data = {}
+
+print(json.dumps(data))
+'@ $ExtConfig
+            if ($LASTEXITCODE -eq 0 -and $jsonOut) {
+                $Options = $jsonOut | ConvertFrom-Json -ErrorAction Stop
+            }
+        } catch {
+            $Options = $null
+        }
+    }
+
+    if (-not $Options) {
+        Write-Warning "agent-context: unable to parse $ExtConfig; skipping update."
+        exit 0
+    }
+}
+
+if (-not (Test-ConfigObject -Object $Options)) {
+    Write-Warning "agent-context: $ExtConfig must contain a YAML mapping; skipping update."
+    exit 0
+}
+
+$ContextFile = Get-ConfigValue -Object $Options -Key 'context_file'
+if (-not $ContextFile) {
+    Write-Warning 'agent-context: context_file not set in extension config; nothing to do.'
+    exit 0
+}
+
+# Reject absolute paths and '..' path segments in context_file
+if ([System.IO.Path]::IsPathRooted($ContextFile)) {
+    Write-Warning "agent-context: context_file must be a project-relative path; got '$ContextFile'."
+    exit 1
+}
+$cfSegments = $ContextFile -split '[/\\]'
+if ($cfSegments -contains '..') {
+    Write-Warning "agent-context: context_file must not contain '..' path segments; got '$ContextFile'."
+    exit 1
+}
+
+$MarkerStart = $DefaultStart
+$MarkerEnd   = $DefaultEnd
+$cm = Get-ConfigValue -Object $Options -Key 'context_markers'
+if ($cm) {
+    $cmStart = Get-ConfigValue -Object $cm -Key 'start'
+    if ($cmStart -is [string] -and $cmStart) {
+        $MarkerStart = $cmStart
+    }
+    $cmEnd = Get-ConfigValue -Object $cm -Key 'end'
+    if ($cmEnd -is [string] -and $cmEnd) {
+        $MarkerEnd = $cmEnd
+    }
+}
+
+if (-not $PlanPath) {
+    # Discover plan.md exactly one level deep (specs/<feature>/plan.md),
+    # matching the bash glob specs/*/plan.md. Wrap in try/catch so access errors under
+    # $ErrorActionPreference = 'Stop' don't abort the script.
+    try {
+        $specsDir = Join-Path $ProjectRoot 'specs'
+        $candidate = Get-ChildItem -Path $specsDir -Directory -ErrorAction SilentlyContinue |
+            ForEach-Object { Get-Item -LiteralPath (Join-Path $_.FullName 'plan.md') -ErrorAction SilentlyContinue } |
+            Where-Object { $_ } |
+            Sort-Object LastWriteTime -Descending |
+            Select-Object -First 1
+        if ($candidate) {
+            $PlanPath = [System.IO.Path]::GetRelativePath($ProjectRoot, $candidate.FullName).Replace('\','/')
+        }
+    } catch {
+        # Non-fatal: continue without a plan path.
+    }
+}
+
+$CtxPath = Join-Path $ProjectRoot $ContextFile
+$CtxDir  = Split-Path -Parent $CtxPath
+if ($CtxDir -and -not (Test-Path -LiteralPath $CtxDir)) {
+    New-Item -ItemType Directory -Path $CtxDir -Force | Out-Null
+}
+
+$lines = @($MarkerStart,
+           'For additional context about technologies to be used, project structure,',
+           'shell commands, and other important information, read the current plan')
+if ($PlanPath) {
+    $lines += "at $PlanPath"
+}
+$lines += $MarkerEnd
+$Section = ($lines -join "`n") + "`n"
+
+if (Test-Path -LiteralPath $CtxPath) {
+    $rawBytes = [System.IO.File]::ReadAllBytes($CtxPath)
+    # Strip UTF-8 BOM if present
+    if ($rawBytes.Length -ge 3 -and $rawBytes[0] -eq 0xEF -and $rawBytes[1] -eq 0xBB -and $rawBytes[2] -eq 0xBF) {
+        $content = [System.Text.Encoding]::UTF8.GetString($rawBytes, 3, $rawBytes.Length - 3)
+    } else {
+        $content = [System.Text.Encoding]::UTF8.GetString($rawBytes)
+    }
+
+    $s = $content.IndexOf($MarkerStart)
+    $e = if ($s -ge 0) { $content.IndexOf($MarkerEnd, $s) } else { $content.IndexOf($MarkerEnd) }
+
+    if ($s -ge 0 -and $e -ge 0 -and $e -gt $s) {
+        $endOfMarker = $e + $MarkerEnd.Length
+        if ($endOfMarker -lt $content.Length -and $content[$endOfMarker] -eq "`r") { $endOfMarker++ }
+        if ($endOfMarker -lt $content.Length -and $content[$endOfMarker] -eq "`n") { $endOfMarker++ }
+        $newContent = $content.Substring(0, $s) + $Section + $content.Substring($endOfMarker)
+    } elseif ($s -ge 0) {
+        $newContent = $content.Substring(0, $s) + $Section
+    } elseif ($e -ge 0) {
+        $endOfMarker = $e + $MarkerEnd.Length
+        if ($endOfMarker -lt $content.Length -and $content[$endOfMarker] -eq "`r") { $endOfMarker++ }
+        if ($endOfMarker -lt $content.Length -and $content[$endOfMarker] -eq "`n") { $endOfMarker++ }
+        $newContent = $Section + $content.Substring($endOfMarker)
+    } else {
+        if ($content -and -not $content.EndsWith("`n")) { $content += "`n" }
+        if ($content) { $newContent = $content + "`n" + $Section } else { $newContent = $Section }
+    }
+} else {
+    $newContent = $Section
+}
+
+$newContent = $newContent.Replace("`r`n", "`n").Replace("`r", "`n")
+[System.IO.File]::WriteAllText($CtxPath, $newContent, (New-Object System.Text.UTF8Encoding($false)))
+
+Write-Host "agent-context: updated $ContextFile"
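Review note: both scripts validate `context_file` before writing, but with slightly different rules. The Bash script rejects any backslash outright, while the PowerShell script treats backslash as a separator and relies on `IsPathRooted`. The sketch below follows the Bash rules only; the function name `validate_context_file` is hypothetical and the error strings are paraphrased from the script's messages.

```python
import re
from typing import Optional


def validate_context_file(path: str) -> Optional[str]:
    """Return an error message for an unsafe path, or None if it is acceptable.

    Hypothetical helper mirroring the checks in update-agent-context.sh.
    """
    # Absolute POSIX paths and Windows drive prefixes (e.g. "C:") are rejected.
    if path.startswith("/") or re.match(r"^[A-Za-z]:", path):
        return "context_file must be a project-relative path"
    # The Bash script refuses backslash separators entirely.
    if "\\" in path:
        return "context_file must not contain backslash separators"
    # Only whole '..' segments are rejected; names like 'a..b' are fine.
    if ".." in path.split("/"):
        return "context_file must not contain '..' path segments"
    return None
```

Keeping the two implementations aligned on a shared set of cases like these would make the cross-platform behavior easier to verify.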
--- a/.specify/extensions/git/README.md
+++ b/.specify/extensions/git/README.md
@ -0,0 +1,100 @@
+# Git Branching Workflow Extension
+
+Git repository initialization, feature branch creation, numbering (sequential/timestamp), validation, remote detection, and auto-commit for Spec Kit.
+
+## Overview
+
+This extension provides Git operations as an optional, self-contained module. It manages:
+
+- **Repository initialization** with configurable commit messages
+- **Feature branch creation** with sequential (`001-feature-name`) or timestamp (`20260319-143022-feature-name`) numbering
+- **Branch validation** to ensure branches follow naming conventions
+- **Git remote detection** for GitHub integration (e.g., issue creation)
+- **Auto-commit** after core commands (configurable per-command with custom messages)
+
+## Commands
+
+| Command | Description |
+|---------|-------------|
+| `speckit.git.initialize` | Initialize a Git repository with a configurable commit message |
+| `speckit.git.feature` | Create a feature branch with sequential or timestamp numbering |
+| `speckit.git.validate` | Validate current branch follows feature branch naming conventions |
+| `speckit.git.remote` | Detect Git remote URL for GitHub integration |
+| `speckit.git.commit` | Auto-commit changes (configurable per-command enable/disable and messages) |
+
+## Hooks
+
+| Event | Command | Optional | Description |
+|-------|---------|----------|-------------|
+| `before_constitution` | `speckit.git.initialize` | No | Init git repo before constitution |
+| `before_specify` | `speckit.git.feature` | No | Create feature branch before specification |
+| `before_clarify` | `speckit.git.commit` | Yes | Commit outstanding changes before clarification |
+| `before_plan` | `speckit.git.commit` | Yes | Commit outstanding changes before planning |
+| `before_tasks` | `speckit.git.commit` | Yes | Commit outstanding changes before task generation |
+| `before_implement` | `speckit.git.commit` | Yes | Commit outstanding changes before implementation |
+| `before_checklist` | `speckit.git.commit` | Yes | Commit outstanding changes before checklist |
+| `before_analyze` | `speckit.git.commit` | Yes | Commit outstanding changes before analysis |
+| `before_taskstoissues` | `speckit.git.commit` | Yes | Commit outstanding changes before issue sync |
+| `after_constitution` | `speckit.git.commit` | Yes | Auto-commit after constitution update |
+| `after_specify` | `speckit.git.commit` | Yes | Auto-commit after specification |
+| `after_clarify` | `speckit.git.commit` | Yes | Auto-commit after clarification |
+| `after_plan` | `speckit.git.commit` | Yes | Auto-commit after planning |
+| `after_tasks` | `speckit.git.commit` | Yes | Auto-commit after task generation |
+| `after_implement` | `speckit.git.commit` | Yes | Auto-commit after implementation |
+| `after_checklist` | `speckit.git.commit` | Yes | Auto-commit after checklist |
+| `after_analyze` | `speckit.git.commit` | Yes | Auto-commit after analysis |
+| `after_taskstoissues` | `speckit.git.commit` | Yes | Auto-commit after issue sync |
+
+## Configuration
+
+Configuration is stored in `.specify/extensions/git/git-config.yml`:
+
+```yaml
+# Branch numbering strategy: "sequential" or "timestamp"
+branch_numbering: sequential
+
+# Custom commit message for git init
+init_commit_message: "[Spec Kit] Initial commit"
+
+# Auto-commit per command (all disabled by default)
+# Example: enable auto-commit after specify
+auto_commit:
+  default: false
+  after_specify:
+    enabled: true
+    message: "[Spec Kit] Add specification"
+```
+
+## Installation
+
+```bash
+# Install the bundled git extension (no network required)
+specify extension add git
+```
+
+## Disabling
+
+```bash
+# Disable the git extension (spec creation continues without branching)
+specify extension disable git
+
+# Re-enable it
+specify extension enable git
+```
+
+## Graceful Degradation
+
+When Git is not installed or the directory is not a Git repository:
+- Spec directories are still created under `specs/`
+- Branch creation is skipped with a warning
+- Branch validation is skipped with a warning
+- Remote detection returns empty results
+
+## Scripts
+
+The extension bundles cross-platform scripts:
+
+- `scripts/bash/create-new-feature.sh` — Bash implementation
+- `scripts/bash/git-common.sh` — Shared Git utilities (Bash)
+- `scripts/powershell/create-new-feature.ps1` — PowerShell implementation
+- `scripts/powershell/git-common.ps1` — Shared Git utilities (PowerShell)
--- a/.specify/extensions/git/commands/speckit.git.commit.md
+++ b/.specify/extensions/git/commands/speckit.git.commit.md
@ -0,0 +1,48 @@
+---
+description: "Auto-commit changes after a Spec Kit command completes"
+---
+
+# Auto-Commit Changes
+
+Automatically stage and commit all changes before or after a Spec Kit command, depending on the hook that invoked it.
+
+## Behavior
+
+This command is invoked as a hook after (or before) core commands. It:
+
+1. Determines the event name from the hook context (e.g., if invoked as an `after_specify` hook, the event is `after_specify`; if `before_plan`, the event is `before_plan`)
+2. Checks `.specify/extensions/git/git-config.yml` for the `auto_commit` section
+3. Looks up the specific event key to see if auto-commit is enabled
+4. Falls back to `auto_commit.default` if no event-specific key exists
+5. Uses the per-command `message` if configured, otherwise a default message
+6. If enabled and there are uncommitted changes, runs `git add .` followed by `git commit`
+
+## Execution
+
+Determine the event name from the hook that triggered this command, then run the script:
+
+- **Bash**: `.specify/extensions/git/scripts/bash/auto-commit.sh <event_name>`
+- **PowerShell**: `.specify/extensions/git/scripts/powershell/auto-commit.ps1 <event_name>`
+
+Replace `<event_name>` with the actual hook event (e.g., `after_specify`, `before_plan`, `after_implement`).
+
+## Configuration
+
+In `.specify/extensions/git/git-config.yml`:
+
+```yaml
+auto_commit:
+  default: false          # Global toggle — set true to enable for all commands
+  after_specify:
+    enabled: true          # Override per-command
+    message: "[Spec Kit] Add specification"
+  after_plan:
+    enabled: false
+    message: "[Spec Kit] Add implementation plan"
+```
+
+## Graceful Degradation
+
+- If Git is not available or the current directory is not a repository: skips with a warning
+- If no config file exists: skips (disabled by default)
+- If no changes to commit: skips with a message
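
Note on the lookup order in steps 3 and 4 above: an explicit per-event `enabled` value takes precedence, and `auto_commit.default` applies only when the event key is absent. A minimal sketch of that decision, assuming the two values have already been read from `git-config.yml` (the helper name `resolve_auto_commit` is hypothetical and not part of the extension):

```bash
#!/usr/bin/env bash
# Hypothetical helper, for illustration only.
#   $1: value of auto_commit.default ("true" or "false")
#   $2: value of auto_commit.<event>.enabled, or empty when the event key is absent
resolve_auto_commit() {
    local default_enabled="${1:-false}"
    local event_enabled="${2:-}"
    if [ -n "$event_enabled" ]; then
        # An explicit per-event setting always wins over the default.
        echo "$event_enabled"
    else
        # No event-specific key: fall back to the global default.
        echo "$default_enabled"
    fi
}

resolve_auto_commit false true   # prints "true": the event override enables the commit
resolve_auto_commit true ""      # prints "true": no event key, default applies
resolve_auto_commit true false   # prints "false": explicit false beats default true
```

With the sample configuration shown in the Configuration section, `after_specify` would commit and `after_plan` would not, regardless of `default`.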
--- a/.specify/extensions/git/commands/speckit.git.feature.md
+++ b/.specify/extensions/git/commands/speckit.git.feature.md
@ -0,0 +1,67 @@
+---
+description: "Create a feature branch with sequential or timestamp numbering"
+---
+
+# Create Feature Branch
+
+Create and switch to a new git feature branch for the given specification. This command handles **branch creation only** — the spec directory and files are created by the core `__SPECKIT_COMMAND_SPECIFY__` workflow.
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Environment Variable Override
+
+If the user explicitly provided `GIT_BRANCH_NAME` (e.g., via environment variable, argument, or in their request), pass it through to the script by setting the `GIT_BRANCH_NAME` environment variable before invoking the script. When `GIT_BRANCH_NAME` is set:
+- The script uses the exact value as the branch name, bypassing all prefix/suffix generation
+- `--short-name`, `--number`, and `--timestamp` flags are ignored
+- `FEATURE_NUM` is extracted from the name if it starts with a numeric prefix, otherwise set to the full branch name
+
+## Prerequisites
+
+- Verify Git is available by running `git rev-parse --is-inside-work-tree 2>/dev/null`
+- If Git is not available, warn the user and skip branch creation
+
+## Branch Numbering Mode
+
+Determine the branch numbering strategy by checking configuration in this order:
+
+1. Check `.specify/extensions/git/git-config.yml` for `branch_numbering` value
+2. Check `.specify/init-options.json` for `branch_numbering` value (backward compatibility)
+3. Default to `sequential` if neither exists
+
+## Execution
+
+Generate a concise short name (2-4 words) for the branch:
+- Analyze the feature description and extract the most meaningful keywords
+- Use action-noun format when possible (e.g., "add-user-auth", "fix-payment-bug")
+- Preserve technical terms and acronyms (OAuth2, API, JWT, etc.)
+
+Run the appropriate script based on your platform:
+
+- **Bash**: `.specify/extensions/git/scripts/bash/create-new-feature.sh --json --short-name "<short-name>" "<feature description>"`
+- **Bash (timestamp)**: `.specify/extensions/git/scripts/bash/create-new-feature.sh --json --timestamp --short-name "<short-name>" "<feature description>"`
+- **PowerShell**: `.specify/extensions/git/scripts/powershell/create-new-feature.ps1 -Json -ShortName "<short-name>" "<feature description>"`
+- **PowerShell (timestamp)**: `.specify/extensions/git/scripts/powershell/create-new-feature.ps1 -Json -Timestamp -ShortName "<short-name>" "<feature description>"`
+
+**IMPORTANT**:
+- Do NOT pass `--number` — the script determines the correct next number automatically
+- Always include the JSON flag (`--json` for Bash, `-Json` for PowerShell) so the output can be parsed reliably
+- You must only ever run this script once per feature
+- The JSON output will contain `BRANCH_NAME` and `FEATURE_NUM`
+
+## Graceful Degradation
+
+If Git is not installed or the current directory is not a Git repository:
+- Branch creation is skipped with a warning: `[specify] Warning: Git repository not detected; skipped branch creation`
+- The script still outputs `BRANCH_NAME` and `FEATURE_NUM` so the caller can reference them
+
+## Output
+
+The script outputs JSON with:
+- `BRANCH_NAME`: The branch name (e.g., `003-user-auth` or `20260319-143022-user-auth`)
+- `FEATURE_NUM`: The numeric or timestamp prefix used
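
Note on the `GIT_BRANCH_NAME` override: the rule that `FEATURE_NUM` comes from a numeric prefix, or else falls back to the full branch name, can be sketched as below. This is an illustration under assumptions, not the script's actual code: the function name `feature_num_from_branch` is hypothetical, and treating a `YYYYMMDD-HHMMSS` prefix as a single unit is inferred from the example output values above.

```bash
#!/usr/bin/env bash
# Hypothetical helper, for illustration only.
feature_num_from_branch() {
    local branch="$1"
    local re_timestamp='^([0-9]{8}-[0-9]{6})-'
    local re_numeric='^([0-9]+)-'
    if [[ "$branch" =~ $re_timestamp ]]; then
        # Timestamp prefix (assumed to be kept whole).
        echo "${BASH_REMATCH[1]}"
    elif [[ "$branch" =~ $re_numeric ]]; then
        # Plain numeric prefix.
        echo "${BASH_REMATCH[1]}"
    else
        # No numeric prefix: fall back to the full branch name.
        echo "$branch"
    fi
}

feature_num_from_branch "003-user-auth"              # prints "003"
feature_num_from_branch "20260319-143022-user-auth"  # prints "20260319-143022"
feature_num_from_branch "my-branch"                  # prints "my-branch"
```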
--- a/.specify/extensions/git/commands/speckit.git.initialize.md
+++ b/.specify/extensions/git/commands/speckit.git.initialize.md
@ -0,0 +1,49 @@
+---
+description: "Initialize a Git repository with an initial commit"
+---
+
+# Initialize Git Repository
+
+Initialize a Git repository in the current project directory if one does not already exist.
+
+## Execution
+
+Run the appropriate script from the project root:
+
+- **Bash**: `.specify/extensions/git/scripts/bash/initialize-repo.sh`
+- **PowerShell**: `.specify/extensions/git/scripts/powershell/initialize-repo.ps1`
+
+If the extension scripts are not found, fall back to:
+- **Bash**: `git init && git add . && git commit -m "Initial commit from Specify template"`
+- **PowerShell**: `git init; git add .; git commit -m "Initial commit from Specify template"`
+
+The script handles all checks internally:
+- Skips if Git is not available
+- Skips if already inside a Git repository
+- Runs `git init`, `git add .`, and `git commit` with an initial commit message
+
+## Customization
+
+Replace the script to add project-specific Git initialization steps:
+- Custom `.gitignore` templates
+- Default branch naming (`git config init.defaultBranch`)
+- Git LFS setup
+- Git hooks installation
+- Commit signing configuration
+- Git Flow initialization
+
+## Output
+
+On success:
+- `[OK] Git repository initialized`
+
+## Graceful Degradation
+
+If Git is not installed:
+- Warn the user
+- Skip repository initialization
+- The project continues to function without Git (specs can still be created under `specs/`)
+
+If Git is installed but `git init`, `git add .`, or `git commit` fails:
+- Surface the error to the user
+- Stop this command rather than continuing with a partially initialized repository
--- a/.specify/extensions/git/commands/speckit.git.remote.md
+++ b/.specify/extensions/git/commands/speckit.git.remote.md
@ -0,0 +1,45 @@
+---
+description: "Detect Git remote URL for GitHub integration"
+---
+
+# Detect Git Remote URL
+
+Detect the Git remote URL for integration with GitHub services (e.g., issue creation).
+
+## Prerequisites
+
+- Check if Git is available by running `git rev-parse --is-inside-work-tree 2>/dev/null`
+- If Git is not available, output a warning and return empty:
+  ```
+  [specify] Warning: Git repository not detected; cannot determine remote URL
+  ```
+
+## Execution
+
+Run the following command to get the remote URL:
+
+```bash
+git config --get remote.origin.url
+```
+
+## Output
+
+Parse the remote URL and determine:
+
+1. **Repository owner**: Extract from the URL (e.g., `github` from `https://github.com/github/spec-kit.git`)
+2. **Repository name**: Extract from the URL (e.g., `spec-kit` from `https://github.com/github/spec-kit.git`)
+3. **Is GitHub**: Whether the remote points to a GitHub repository
+
+Supported URL formats:
+- HTTPS: `https://github.com/<owner>/<repo>.git`
+- SSH: `git@github.com:<owner>/<repo>.git`
+
+> [!CAUTION]
+> ONLY report a GitHub repository if the remote URL actually points to github.com.
+> Do NOT assume the remote is GitHub if the URL format doesn't match.
+
+## Graceful Degradation
+
+If Git is not installed, the directory is not a Git repository, or no remote is configured:
+- Return an empty result
+- Do NOT error — other workflows should continue without Git remote information
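
Note on parsing the remote URL: the two supported formats can be matched with Bash regular expressions. The sketch below is one possible approach, not the extension's implementation. The function name `parse_github_remote` and its output format (owner and repository separated by a space) are assumptions made for illustration. It returns a non-zero status and prints nothing for non-GitHub remotes, in line with the caution above.

```bash
#!/usr/bin/env bash
# Hypothetical helper, for illustration only.
# Prints "<owner> <repo>" and returns 0 for a github.com remote;
# prints nothing and returns 1 for anything else.
parse_github_remote() {
    local url="$1"
    local re_https='^https://github\.com/([^/]+)/([^/]+)$'
    local re_ssh='^git@github\.com:([^/]+)/([^/]+)$'
    if [[ "$url" =~ $re_https ]] || [[ "$url" =~ $re_ssh ]]; then
        local owner="${BASH_REMATCH[1]}"
        # Strip a trailing ".git" from the repository name if present.
        local repo="${BASH_REMATCH[2]%.git}"
        echo "$owner $repo"
        return 0
    fi
    return 1
}

parse_github_remote "https://github.com/github/spec-kit.git"   # prints "github spec-kit"
parse_github_remote "git@github.com:github/spec-kit.git"       # prints "github spec-kit"
```

In practice the argument would come from `git config --get remote.origin.url`, with an empty value treated as "no remote configured".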
--- a/.specify/extensions/git/commands/speckit.git.validate.md
+++ b/.specify/extensions/git/commands/speckit.git.validate.md
@ -0,0 +1,49 @@
+---
+description: "Validate current branch follows feature branch naming conventions"
+---
+
+# Validate Feature Branch
+
+Validate that the current Git branch follows the expected feature branch naming conventions.
+
+## Prerequisites
+
+- Check if Git is available by running `git rev-parse --is-inside-work-tree 2>/dev/null`
+- If Git is not available, output a warning and skip validation:
+  ```
+  [specify] Warning: Git repository not detected; skipped branch validation
+  ```
+
+## Validation Rules
+
+Get the current branch name:
+
+```bash
+git rev-parse --abbrev-ref HEAD
+```
+
+The branch name must match one of these patterns:
+
+1. **Sequential**: `^[0-9]{3,}-` (e.g., `001-feature-name`, `042-fix-bug`, `1000-big-feature`)
+2. **Timestamp**: `^[0-9]{8}-[0-9]{6}-` (e.g., `20260319-143022-feature-name`)
+
+## Execution
+
+If on a feature branch (matches either pattern):
+- Output: `✓ On feature branch: <branch-name>`
+- Check if the corresponding spec directory exists under `specs/`:
+  - For sequential branches, look for `specs/<prefix>-*` where prefix matches the numeric portion
+  - For timestamp branches, look for `specs/<prefix>-*` where prefix matches the `YYYYMMDD-HHMMSS` portion
+- If spec directory exists: `✓ Spec directory found: <path>`
+- If spec directory missing: `⚠ No spec directory found for prefix <prefix>`
+
+If NOT on a feature branch:
+- Output: `✗ Not on a feature branch. Current branch: <branch-name>`
+- Output: `Feature branches should be named like: 001-feature-name or 20260319-143022-feature-name`
+
+## Graceful Degradation
+
+If Git is not installed or the directory is not a Git repository:
+- Check the `SPECIFY_FEATURE` environment variable as a fallback
+- If set, validate that value against the naming patterns
+- If not set, skip validation with a warning
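
Note on the validation rules: every timestamp branch also matches the sequential pattern `^[0-9]{3,}-`, so a classifier has to test the timestamp pattern first. A minimal sketch, assuming a hypothetical function name `classify_branch` and the labels `timestamp`, `sequential`, and `none`:

```bash
#!/usr/bin/env bash
# Hypothetical helper, for illustration only.
classify_branch() {
    local name="$1"
    local re_timestamp='^[0-9]{8}-[0-9]{6}-'
    local re_sequential='^[0-9]{3,}-'
    # Check the more specific timestamp pattern first, because a
    # timestamp prefix also satisfies the sequential pattern.
    if [[ "$name" =~ $re_timestamp ]]; then
        echo "timestamp"
    elif [[ "$name" =~ $re_sequential ]]; then
        echo "sequential"
    else
        echo "none"
    fi
}

classify_branch "001-feature-name"              # prints "sequential"
classify_branch "20260319-143022-feature-name"  # prints "timestamp"
classify_branch "main"                          # prints "none"
```

The same function could be applied to the `SPECIFY_FEATURE` value when Git is unavailable.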
--- a/.specify/extensions/git/config-template.yml
+++ b/.specify/extensions/git/config-template.yml
@ -0,0 +1,62 @@
+# Git Branching Workflow Extension Configuration
+# Copied to .specify/extensions/git/git-config.yml on install
+
+# Branch numbering strategy: "sequential" (001, 002, ...) or "timestamp" (YYYYMMDD-HHMMSS)
+branch_numbering: sequential
+
+# Commit message used by `git commit` during repository initialization
+init_commit_message: "[Spec Kit] Initial commit"
+
+# Auto-commit before/after core commands.
+# Set "default" to enable for all commands, then override per-command.
+# Each event has an "enabled" flag (true/false) and its own customizable "message".
+auto_commit:
+  default: false
+  before_clarify:
+    enabled: false
+    message: "[Spec Kit] Save progress before clarification"
+  before_plan:
+    enabled: false
+    message: "[Spec Kit] Save progress before planning"
+  before_tasks:
+    enabled: false
+    message: "[Spec Kit] Save progress before task generation"
+  before_implement:
+    enabled: false
+    message: "[Spec Kit] Save progress before implementation"
+  before_checklist:
+    enabled: false
+    message: "[Spec Kit] Save progress before checklist"
+  before_analyze:
+    enabled: false
+    message: "[Spec Kit] Save progress before analysis"
+  before_taskstoissues:
+    enabled: false
+    message: "[Spec Kit] Save progress before issue sync"
+  after_constitution:
+    enabled: false
+    message: "[Spec Kit] Add project constitution"
+  after_specify:
+    enabled: false
+    message: "[Spec Kit] Add specification"
+  after_clarify:
+    enabled: false
+    message: "[Spec Kit] Clarify specification"
+  after_plan:
+    enabled: false
+    message: "[Spec Kit] Add implementation plan"
+  after_tasks:
+    enabled: false
+    message: "[Spec Kit] Add tasks"
+  after_implement:
+    enabled: false
+    message: "[Spec Kit] Implementation progress"
+  after_checklist:
+    enabled: false
+    message: "[Spec Kit] Add checklist"
+  after_analyze:
+    enabled: false
+    message: "[Spec Kit] Add analysis report"
+  after_taskstoissues:
+    enabled: false
+    message: "[Spec Kit] Sync tasks to issues"
--- a/.specify/extensions/git/extension.yml
+++ b/.specify/extensions/git/extension.yml
@ -0,0 +1,140 @@
+schema_version: "1.0"
+
+extension:
+  id: git
+  name: "Git Branching Workflow"
+  version: "1.0.0"
+  description: "Feature branch creation, numbering (sequential/timestamp), validation, and Git remote detection"
+  author: spec-kit-core
+  repository: https://github.com/github/spec-kit
+  license: MIT
+
+requires:
+  speckit_version: ">=0.2.0"
+  tools:
+    - name: git
+      required: false
+
+provides:
+  commands:
+    - name: speckit.git.feature
+      file: commands/speckit.git.feature.md
+      description: "Create a feature branch with sequential or timestamp numbering"
+    - name: speckit.git.validate
+      file: commands/speckit.git.validate.md
+      description: "Validate current branch follows feature branch naming conventions"
+    - name: speckit.git.remote
+      file: commands/speckit.git.remote.md
+      description: "Detect Git remote URL for GitHub integration"
+    - name: speckit.git.initialize
+      file: commands/speckit.git.initialize.md
+      description: "Initialize a Git repository with an initial commit"
+    - name: speckit.git.commit
+      file: commands/speckit.git.commit.md
+      description: "Auto-commit changes after a Spec Kit command completes"
+
+  config:
+    - name: "git-config.yml"
+      template: "config-template.yml"
+      description: "Git branching configuration"
+      required: false
+
+hooks:
+  before_constitution:
+    command: speckit.git.initialize
+    optional: false
+    description: "Initialize Git repository before constitution setup"
+  before_specify:
+    command: speckit.git.feature
+    optional: false
+    description: "Create feature branch before specification"
+  before_clarify:
+    command: speckit.git.commit
+    optional: true
+    prompt: "Commit outstanding changes before clarification?"
+    description: "Auto-commit before spec clarification"
+  before_plan:
+    command: speckit.git.commit
+    optional: true
+    prompt: "Commit outstanding changes before planning?"
+    description: "Auto-commit before implementation planning"
+  before_tasks:
+    command: speckit.git.commit
+    optional: true
+    prompt: "Commit outstanding changes before task generation?"
+    description: "Auto-commit before task generation"
+  before_implement:
+    command: speckit.git.commit
+    optional: true
+    prompt: "Commit outstanding changes before implementation?"
+    description: "Auto-commit before implementation"
+  before_checklist:
+    command: speckit.git.commit
+    optional: true
+    prompt: "Commit outstanding changes before checklist?"
+    description: "Auto-commit before checklist generation"
+  before_analyze:
+    command: speckit.git.commit
+    optional: true
+    prompt: "Commit outstanding changes before analysis?"
+    description: "Auto-commit before analysis"
+  before_taskstoissues:
+    command: speckit.git.commit
+    optional: true
+    prompt: "Commit outstanding changes before issue sync?"
+    description: "Auto-commit before tasks-to-issues conversion"
+  after_constitution:
+    command: speckit.git.commit
+    optional: true
+    prompt: "Commit constitution changes?"
+    description: "Auto-commit after constitution update"
+  after_specify:
+    command: speckit.git.commit
+    optional: true
+    prompt: "Commit specification changes?"
+    description: "Auto-commit after specification"
+  after_clarify:
+    command: speckit.git.commit
+    optional: true
+    prompt: "Commit clarification changes?"
+    description: "Auto-commit after spec clarification"
+  after_plan:
+    command: speckit.git.commit
+    optional: true
+    prompt: "Commit plan changes?"
+    description: "Auto-commit after implementation planning"
+  after_tasks:
+    command: speckit.git.commit
+    optional: true
+    prompt: "Commit task changes?"
+    description: "Auto-commit after task generation"
+  after_implement:
+    command: speckit.git.commit
+    optional: true
+    prompt: "Commit implementation changes?"
+    description: "Auto-commit after implementation"
+  after_checklist:
+    command: speckit.git.commit
+    optional: true
+    prompt: "Commit checklist changes?"
+    description: "Auto-commit after checklist generation"
+  after_analyze:
+    command: speckit.git.commit
+    optional: true
+    prompt: "Commit analysis results?"
+    description: "Auto-commit after analysis"
+  after_taskstoissues:
+    command: speckit.git.commit
+    optional: true
+    prompt: "Commit after syncing issues?"
+    description: "Auto-commit after tasks-to-issues conversion"
+
+tags:
+  - "git"
+  - "branching"
+  - "workflow"
+
+config:
+  defaults:
+    branch_numbering: sequential
+    init_commit_message: "[Spec Kit] Initial commit"
--- a/.specify/extensions/git/git-config.yml
+++ b/.specify/extensions/git/git-config.yml
@ -0,0 +1,62 @@
+# Git Branching Workflow Extension Configuration
+# Copied to .specify/extensions/git/git-config.yml on install
+
+# Branch numbering strategy: "sequential" (001, 002, ...) or "timestamp" (YYYYMMDD-HHMMSS)
+branch_numbering: sequential
+
+# Commit message used by `git commit` during repository initialization
+init_commit_message: "[Spec Kit] Initial commit"
+
+# Auto-commit before/after core commands.
+# Set "default" to enable for all commands, then override per-command.
+# Each event has an "enabled" flag (true/false) and its own customizable "message".
+auto_commit:
+  default: false
+  before_clarify:
+    enabled: false
+    message: "[Spec Kit] Save progress before clarification"
+  before_plan:
+    enabled: false
+    message: "[Spec Kit] Save progress before planning"
+  before_tasks:
+    enabled: false
+    message: "[Spec Kit] Save progress before task generation"
+  before_implement:
+    enabled: false
+    message: "[Spec Kit] Save progress before implementation"
+  before_checklist:
+    enabled: false
+    message: "[Spec Kit] Save progress before checklist"
+  before_analyze:
+    enabled: false
+    message: "[Spec Kit] Save progress before analysis"
+  before_taskstoissues:
+    enabled: false
+    message: "[Spec Kit] Save progress before issue sync"
+  after_constitution:
+    enabled: false
+    message: "[Spec Kit] Add project constitution"
+  after_specify:
+    enabled: false
+    message: "[Spec Kit] Add specification"
+  after_clarify:
+    enabled: false
+    message: "[Spec Kit] Clarify specification"
+  after_plan:
+    enabled: false
+    message: "[Spec Kit] Add implementation plan"
+  after_tasks:
+    enabled: false
+    message: "[Spec Kit] Add tasks"
+  after_implement:
+    enabled: false
+    message: "[Spec Kit] Implementation progress"
+  after_checklist:
+    enabled: false
+    message: "[Spec Kit] Add checklist"
+  after_analyze:
+    enabled: false
+    message: "[Spec Kit] Add analysis report"
+  after_taskstoissues:
+    enabled: false
+    message: "[Spec Kit] Sync tasks to issues"
--- a/.specify/extensions/git/scripts/bash/auto-commit.sh
+++ b/.specify/extensions/git/scripts/bash/auto-commit.sh
@ -0,0 +1,140 @@
+#!/usr/bin/env bash
+# Git extension: auto-commit.sh
+# Automatically commit changes after a Spec Kit command completes.
+# Checks per-command config keys in git-config.yml before committing.
+#
+# Usage: auto-commit.sh <event_name>
+#   e.g.: auto-commit.sh after_specify
+
+set -e
+
+EVENT_NAME="${1:-}"
+if [ -z "$EVENT_NAME" ]; then
+    echo "Usage: $0 <event_name>" >&2
+    exit 1
+fi
+
+SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+_find_project_root() {
+    local dir="$1"
+    while [ "$dir" != "/" ]; do
+        if [ -d "$dir/.specify" ] || [ -d "$dir/.git" ]; then
+            echo "$dir"
+            return 0
+        fi
+        dir="$(dirname "$dir")"
+    done
+    return 1
+}
+
+REPO_ROOT=$(_find_project_root "$SCRIPT_DIR") || REPO_ROOT="$(pwd)"
+cd "$REPO_ROOT"
+
+# Check if git is available
+if ! command -v git >/dev/null 2>&1; then
+    echo "[specify] Warning: Git not found; skipped auto-commit" >&2
+    exit 0
+fi
+
+if ! git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
+    echo "[specify] Warning: Not a Git repository; skipped auto-commit" >&2
+    exit 0
+fi
+
+# Read per-command config from git-config.yml
+_config_file="$REPO_ROOT/.specify/extensions/git/git-config.yml"
+_enabled=false
+_commit_msg=""
+
+if [ -f "$_config_file" ]; then
+    # Parse the auto_commit section for this event.
+    # Look for auto_commit.<event_name>.enabled and .message
+    # Also check auto_commit.default as fallback.
+    _in_auto_commit=false
+    _in_event=false
+    _default_enabled=false
+
+    while IFS= read -r _line; do
+        # Detect auto_commit: section
+        if echo "$_line" | grep -q '^auto_commit:'; then
+            _in_auto_commit=true
+            _in_event=false
+            continue
+        fi
+
+        # Exit auto_commit section on next top-level key
+        if $_in_auto_commit && echo "$_line" | grep -Eq '^[a-z]'; then
+            break
+        fi
+
+        if $_in_auto_commit; then
+            # Check default key
+            if echo "$_line" | grep -Eq "^[[:space:]]+default:[[:space:]]"; then
+                _val=$(echo "$_line" | sed 's/^[^:]*:[[:space:]]*//' | tr -d '[:space:]' | tr '[:upper:]' '[:lower:]')
+                [ "$_val" = "true" ] && _default_enabled=true
+            fi
+
+            # Detect our event subsection
+            if echo "$_line" | grep -Eq "^[[:space:]]+${EVENT_NAME}:"; then
+                _in_event=true
+                continue
+            fi
+
+            # Inside our event subsection
+            if $_in_event; then
+                # Exit on next sibling key (same indent level as event name)
+                if echo "$_line" | grep -Eq '^[[:space:]]{2}[a-z]' && ! echo "$_line" | grep -Eq '^[[:space:]]{4}'; then
+                    _in_event=false
+                    continue
+                fi
+                if echo "$_line" | grep -Eq '[[:space:]]+enabled:'; then
+                    _val=$(echo "$_line" | sed 's/^[^:]*:[[:space:]]*//' | tr -d '[:space:]' | tr '[:upper:]' '[:lower:]')
+                    [ "$_val" = "true" ] && _enabled=true
+                    [ "$_val" = "false" ] && _enabled=false
+                fi
+                if echo "$_line" | grep -Eq '[[:space:]]+message:'; then
+                    _commit_msg=$(echo "$_line" | sed 's/^[^:]*:[[:space:]]*//' | sed 's/^["'\'']//' | sed 's/["'\'']*$//')
+                fi
+            fi
+        fi
+    done < "$_config_file"
+
+    # If event-specific key not found, use default
+    if [ "$_enabled" = "false" ] && [ "$_default_enabled" = "true" ]; then
+        # Only use default if the event wasn't explicitly set to false
+        # Check if event section existed at all
+        if ! grep -q "^[[:space:]]*${EVENT_NAME}:" "$_config_file" 2>/dev/null; then
+            _enabled=true
+        fi
+    fi
+else
+    # No config file — auto-commit disabled by default
+    exit 0
+fi
+
+if [ "$_enabled" != "true" ]; then
+    exit 0
+fi
+
+# Check if there are changes to commit
+if git diff --quiet HEAD 2>/dev/null && git diff --cached --quiet 2>/dev/null && [ -z "$(git ls-files --others --exclude-standard 2>/dev/null)" ]; then
+    echo "[specify] No changes to commit after $EVENT_NAME" >&2
+    exit 0
+fi
+
+# Derive a human-readable command name from the event
+# e.g., after_specify -> specify, before_plan -> plan
+_command_name=$(echo "$EVENT_NAME" | sed 's/^after_//' | sed 's/^before_//')
+_phase=$(echo "$EVENT_NAME" | grep -q '^before_' && echo 'before' || echo 'after')
+
+# Use custom message if configured, otherwise default
+if [ -z "$_commit_msg" ]; then
+    _commit_msg="[Spec Kit] Auto-commit ${_phase} ${_command_name}"
+fi
+
+# Stage and commit
+_git_out=$(git add . 2>&1) || { echo "[specify] Error: git add failed: $_git_out" >&2; exit 1; }
+_git_out=$(git commit -q -m "$_commit_msg" 2>&1) || { echo "[specify] Error: git commit failed: $_git_out" >&2; exit 1; }
+
+echo "[OK] Changes committed ${_phase} ${_command_name}" >&2
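
Note on the event-name handling near the end of `auto-commit.sh`: the phase and command name are derived there with `sed` and `grep` pipelines. The same split can be expressed with shell parameter expansion alone, as sketched below. The function name `split_event` is hypothetical, and this is an alternative formulation rather than the script's code.

```bash
#!/usr/bin/env bash
# Hypothetical helper, for illustration only.
# Prints "<phase> <command>" for a hook event such as after_specify.
split_event() {
    local event="$1"
    local phase="after"
    # Anything that does not start with "before_" is treated as "after",
    # matching the behavior of the script above.
    case "$event" in
        before_*) phase="before" ;;
    esac
    # Strip either prefix to get the bare command name.
    local command="${event#after_}"
    command="${command#before_}"
    echo "$phase $command"
}

split_event "after_specify"   # prints "after specify"
split_event "before_plan"     # prints "before plan"
```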
--- a/.specify/extensions/git/scripts/bash/create-new-feature.sh
+++ b/.specify/extensions/git/scripts/bash/create-new-feature.sh
@ -0,0 +1,453 @@
+#!/usr/bin/env bash
+# Git extension: create-new-feature.sh
+# Adapted from core scripts/bash/create-new-feature.sh for extension layout.
+# Sources common.sh from the project's installed scripts, falling back to
+# git-common.sh for minimal git helpers.
+
+set -e
+
+JSON_MODE=false
+DRY_RUN=false
+ALLOW_EXISTING=false
+SHORT_NAME=""
+BRANCH_NUMBER=""
+USE_TIMESTAMP=false
+ARGS=()
+i=1
+while [ $i -le $# ]; do
+    arg="${!i}"
+    case "$arg" in
+        --json)
+            JSON_MODE=true
+            ;;
+        --dry-run)
+            DRY_RUN=true
+            ;;
+        --allow-existing-branch)
+            ALLOW_EXISTING=true
+            ;;
+        --short-name)
+            if [ $((i + 1)) -gt $# ]; then
+                echo 'Error: --short-name requires a value' >&2
+                exit 1
+            fi
+            i=$((i + 1))
+            next_arg="${!i}"
+            if [[ "$next_arg" == --* ]]; then
+                echo 'Error: --short-name requires a value' >&2
+                exit 1
+            fi
+            SHORT_NAME="$next_arg"
+            ;;
+        --number)
+            if [ $((i + 1)) -gt $# ]; then
+                echo 'Error: --number requires a value' >&2
+                exit 1
+            fi
+            i=$((i + 1))
+            next_arg="${!i}"
+            if [[ "$next_arg" == --* ]]; then
+                echo 'Error: --number requires a value' >&2
+                exit 1
+            fi
+            BRANCH_NUMBER="$next_arg"
+            if [[ ! "$BRANCH_NUMBER" =~ ^[0-9]+$ ]]; then
+                echo 'Error: --number must be a non-negative integer' >&2
+                exit 1
+            fi
+            ;;
+        --timestamp)
+            USE_TIMESTAMP=true
+            ;;
+        --help|-h)
+            echo "Usage: $0 [--json] [--dry-run] [--allow-existing-branch] [--short-name <name>] [--number N] [--timestamp] <feature_description>"
+            echo ""
+            echo "Options:"
+            echo "  --json              Output in JSON format"
+            echo "  --dry-run           Compute branch name without creating the branch"
+            echo "  --allow-existing-branch  Switch to branch if it already exists instead of failing"
+            echo "  --short-name <name> Provide a custom short name (2-4 words) for the branch"
+            echo "  --number N          Specify branch number manually (overrides auto-detection)"
+            echo "  --timestamp         Use timestamp prefix (YYYYMMDD-HHMMSS) instead of sequential numbering"
+            echo "  --help, -h          Show this help message"
+            echo ""
+            echo "Environment variables:"
+            echo "  GIT_BRANCH_NAME     Use this exact branch name, bypassing all prefix/suffix generation"
+            echo ""
+            echo "Examples:"
+            echo "  $0 'Add user authentication system' --short-name 'user-auth'"
+            echo "  $0 'Implement OAuth2 integration for API' --number 5"
+            echo "  $0 --timestamp --short-name 'user-auth' 'Add user authentication'"
+            echo "  GIT_BRANCH_NAME=my-branch $0 'feature description'"
+            exit 0
+            ;;
+        *)
+            ARGS+=("$arg")
+            ;;
+    esac
+    i=$((i + 1))
+done
+
+FEATURE_DESCRIPTION="${ARGS[*]}"
+if [ -z "$FEATURE_DESCRIPTION" ]; then
+    echo "Usage: $0 [--json] [--dry-run] [--allow-existing-branch] [--short-name <name>] [--number N] [--timestamp] <feature_description>" >&2
+    exit 1
+fi
+
+# Trim whitespace and validate description is not empty
+FEATURE_DESCRIPTION=$(echo "$FEATURE_DESCRIPTION" | sed -E 's/^[[:space:]]+|[[:space:]]+$//g')
+if [ -z "$FEATURE_DESCRIPTION" ]; then
+    echo "Error: Feature description cannot be empty or contain only whitespace" >&2
+    exit 1
+fi
+
+# Function to get highest number from specs directory
+get_highest_from_specs() {
+    local specs_dir="$1"
+    local highest=0
+
+    if [ -d "$specs_dir" ]; then
+        for dir in "$specs_dir"/*; do
+            [ -d "$dir" ] || continue
+            dirname=$(basename "$dir")
+            # Match sequential prefixes (>=3 digits), but skip timestamp dirs.
+            if echo "$dirname" | grep -Eq '^[0-9]{3,}-' && ! echo "$dirname" | grep -Eq '^[0-9]{8}-[0-9]{6}-'; then
+                number=$(echo "$dirname" | grep -Eo '^[0-9]+')
+                number=$((10#$number))
+                if [ "$number" -gt "$highest" ]; then
+                    highest=$number
+                fi
+            fi
+        done
+    fi
+
+    echo "$highest"
+}
+
+# Function to get highest number from git branches
+get_highest_from_branches() {
+    git branch -a 2>/dev/null | sed 's/^[* ]*//; s|^remotes/[^/]*/||' | _extract_highest_number
+}
+
+# Extract the highest sequential feature number from a list of ref names (one per line).
+_extract_highest_number() {
+    local highest=0
+    while IFS= read -r name; do
+        [ -z "$name" ] && continue
+        if echo "$name" | grep -Eq '^[0-9]{3,}-' && ! echo "$name" | grep -Eq '^[0-9]{8}-[0-9]{6}-'; then
+            number=$(echo "$name" | grep -Eo '^[0-9]+' || echo "0")
+            number=$((10#$number))
+            if [ "$number" -gt "$highest" ]; then
+                highest=$number
+            fi
+        fi
+    done
+    echo "$highest"
+}
+
+# Function to get highest number from remote branches without fetching (side-effect-free)
+get_highest_from_remote_refs() {
+    local highest=0
+
+    for remote in $(git remote 2>/dev/null); do
+        local remote_highest
+        remote_highest=$(GIT_TERMINAL_PROMPT=0 git ls-remote --heads "$remote" 2>/dev/null | sed 's|.*refs/heads/||' | _extract_highest_number)
+        if [ "$remote_highest" -gt "$highest" ]; then
+            highest=$remote_highest
+        fi
+    done
+
+    echo "$highest"
+}
+
+# Function to check existing branches and return next available number.
+check_existing_branches() {
+    local specs_dir="$1"
+    local skip_fetch="${2:-false}"
+
+    if [ "$skip_fetch" = true ]; then
+        local highest_remote=$(get_highest_from_remote_refs)
+        local highest_branch=$(get_highest_from_branches)
+        if [ "$highest_remote" -gt "$highest_branch" ]; then
+            highest_branch=$highest_remote
+        fi
+    else
+        git fetch --all --prune >/dev/null 2>&1 || true
+        local highest_branch=$(get_highest_from_branches)
+    fi
+
+    local highest_spec=$(get_highest_from_specs "$specs_dir")
+
+    local max_num=$highest_branch
+    if [ "$highest_spec" -gt "$max_num" ]; then
+        max_num=$highest_spec
+    fi
+
+    echo $((max_num + 1))
+}
+
+# Function to clean and format a branch name
+clean_branch_name() {
+    local name="$1"
+    echo "$name" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/-/g' | sed -E 's/-+/-/g' | sed 's/^-//' | sed 's/-$//'
+}
+
+# ---------------------------------------------------------------------------
+# Source common.sh for resolve_template, json_escape, get_repo_root, has_git.
+#
+# Search locations in priority order:
+#  1. .specify/scripts/bash/common.sh under the project root (installed project)
+#  2. scripts/bash/common.sh under the project root (source checkout fallback)
+#  3. git-common.sh next to this script (minimal fallback — lacks resolve_template)
+# ---------------------------------------------------------------------------
+SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+# Find project root by walking up from the script location
+_find_project_root() {
+    local dir="$1"
+    while [ "$dir" != "/" ]; do
+        if [ -d "$dir/.specify" ] || [ -d "$dir/.git" ]; then
+            echo "$dir"
+            return 0
+        fi
+        dir="$(dirname "$dir")"
+    done
+    return 1
+}
+
+_common_loaded=false
+_PROJECT_ROOT=$(_find_project_root "$SCRIPT_DIR") || true
+
+if [ -n "$_PROJECT_ROOT" ] && [ -f "$_PROJECT_ROOT/.specify/scripts/bash/common.sh" ]; then
+    source "$_PROJECT_ROOT/.specify/scripts/bash/common.sh"
+    _common_loaded=true
+elif [ -n "$_PROJECT_ROOT" ] && [ -f "$_PROJECT_ROOT/scripts/bash/common.sh" ]; then
+    source "$_PROJECT_ROOT/scripts/bash/common.sh"
+    _common_loaded=true
+elif [ -f "$SCRIPT_DIR/git-common.sh" ]; then
+    source "$SCRIPT_DIR/git-common.sh"
+    _common_loaded=true
+fi
+
+if [ "$_common_loaded" != "true" ]; then
+    echo "Error: Could not locate common.sh or git-common.sh. Please ensure the Specify core scripts are installed." >&2
+    exit 1
+fi
+
+# Resolve repository root
+if type get_repo_root >/dev/null 2>&1; then
+    REPO_ROOT=$(get_repo_root)
+elif git rev-parse --show-toplevel >/dev/null 2>&1; then
+    REPO_ROOT=$(git rev-parse --show-toplevel)
+elif [ -n "$_PROJECT_ROOT" ]; then
+    REPO_ROOT="$_PROJECT_ROOT"
+else
+    echo "Error: Could not determine repository root." >&2
+    exit 1
+fi
+
+# Check if git is available at this repo root
+if type has_git >/dev/null 2>&1; then
+    if has_git "$REPO_ROOT"; then
+        HAS_GIT=true
+    else
+        HAS_GIT=false
+    fi
+elif git -C "$REPO_ROOT" rev-parse --is-inside-work-tree >/dev/null 2>&1; then
+    HAS_GIT=true
+else
+    HAS_GIT=false
+fi
+
+cd "$REPO_ROOT"
+
+SPECS_DIR="$REPO_ROOT/specs"
+
+# Function to generate branch name with stop word filtering
+generate_branch_name() {
+    local description="$1"
+
+    local stop_words="^(i|a|an|the|to|for|of|in|on|at|by|with|from|is|are|was|were|be|been|being|have|has|had|do|does|did|will|would|should|could|can|may|might|must|shall|this|that|these|those|my|your|our|their|want|need|add|get|set)$"
+
+    local clean_name=$(echo "$description" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/ /g')
+
+    local meaningful_words=()
+    for word in $clean_name; do
+        [ -z "$word" ] && continue
+        if ! echo "$word" | grep -qiE "$stop_words"; then
+            if [ ${#word} -ge 3 ]; then
+                meaningful_words+=("$word")
+            elif echo "$description" | grep -qw -- "$(printf '%s' "$word" | tr '[:lower:]' '[:upper:]')"; then
+                meaningful_words+=("$word")
+            fi
+        fi
+    done
+
+    if [ ${#meaningful_words[@]} -gt 0 ]; then
+        local max_words=3
+        if [ ${#meaningful_words[@]} -eq 4 ]; then max_words=4; fi
+
+        local result=""
+        local count=0
+        for word in "${meaningful_words[@]}"; do
+            if [ $count -ge $max_words ]; then break; fi
+            if [ -n "$result" ]; then result="$result-"; fi
+            result="$result$word"
+            count=$((count + 1))
+        done
+        echo "$result"
+    else
+        local cleaned=$(clean_branch_name "$description")
+        echo "$cleaned" | tr '-' '\n' | grep -v '^$' | head -3 | tr '\n' '-' | sed 's/-$//'
+    fi
+}
+
+# Check for GIT_BRANCH_NAME env var override (exact branch name, no prefix/suffix)
+if [ -n "${GIT_BRANCH_NAME:-}" ]; then
+    BRANCH_NAME="$GIT_BRANCH_NAME"
+    # Extract FEATURE_NUM from the branch name if it starts with a numeric prefix
+    # Check timestamp pattern first (YYYYMMDD-HHMMSS-) since it also matches the simpler ^[0-9]+ pattern
+    if echo "$BRANCH_NAME" | grep -Eq '^[0-9]{8}-[0-9]{6}-'; then
+        FEATURE_NUM=$(echo "$BRANCH_NAME" | grep -Eo '^[0-9]{8}-[0-9]{6}')
+        BRANCH_SUFFIX="${BRANCH_NAME#${FEATURE_NUM}-}"
+    elif echo "$BRANCH_NAME" | grep -Eq '^[0-9]+-'; then
+        FEATURE_NUM=$(echo "$BRANCH_NAME" | grep -Eo '^[0-9]+')
+        BRANCH_SUFFIX="${BRANCH_NAME#${FEATURE_NUM}-}"
+    else
+        FEATURE_NUM="$BRANCH_NAME"
+        BRANCH_SUFFIX="$BRANCH_NAME"
+    fi
+else
+    # Generate branch name
+    if [ -n "$SHORT_NAME" ]; then
+        BRANCH_SUFFIX=$(clean_branch_name "$SHORT_NAME")
+    else
+        BRANCH_SUFFIX=$(generate_branch_name "$FEATURE_DESCRIPTION")
+    fi
+
+    # Warn if --number and --timestamp are both specified
+    if [ "$USE_TIMESTAMP" = true ] && [ -n "$BRANCH_NUMBER" ]; then
+        >&2 echo "[specify] Warning: --number is ignored when --timestamp is used"
+        BRANCH_NUMBER=""
+    fi
+
+    # Determine branch prefix
+    if [ "$USE_TIMESTAMP" = true ]; then
+        FEATURE_NUM=$(date +%Y%m%d-%H%M%S)
+        BRANCH_NAME="${FEATURE_NUM}-${BRANCH_SUFFIX}"
+    else
+        if [ -z "$BRANCH_NUMBER" ]; then
+            if [ "$DRY_RUN" = true ] && [ "$HAS_GIT" = true ]; then
+                BRANCH_NUMBER=$(check_existing_branches "$SPECS_DIR" true)
+            elif [ "$DRY_RUN" = true ]; then
+                HIGHEST=$(get_highest_from_specs "$SPECS_DIR")
+                BRANCH_NUMBER=$((HIGHEST + 1))
+            elif [ "$HAS_GIT" = true ]; then
+                BRANCH_NUMBER=$(check_existing_branches "$SPECS_DIR")
+            else
+                HIGHEST=$(get_highest_from_specs "$SPECS_DIR")
+                BRANCH_NUMBER=$((HIGHEST + 1))
+            fi
+        fi
+
+        FEATURE_NUM=$(printf "%03d" "$((10#$BRANCH_NUMBER))")
+        BRANCH_NAME="${FEATURE_NUM}-${BRANCH_SUFFIX}"
+    fi
+fi
+
+# GitHub enforces a 244-byte limit on branch names
+MAX_BRANCH_LENGTH=244
+_byte_length() { printf '%s' "$1" | LC_ALL=C wc -c | tr -d ' '; }
+BRANCH_BYTE_LEN=$(_byte_length "$BRANCH_NAME")
+if [ -n "${GIT_BRANCH_NAME:-}" ] && [ "$BRANCH_BYTE_LEN" -gt $MAX_BRANCH_LENGTH ]; then
+    >&2 echo "Error: GIT_BRANCH_NAME must be 244 bytes or fewer in UTF-8. Provided value is ${BRANCH_BYTE_LEN} bytes."
+    exit 1
+elif [ "$BRANCH_BYTE_LEN" -gt $MAX_BRANCH_LENGTH ]; then
+    PREFIX_LENGTH=$(( ${#FEATURE_NUM} + 1 ))
+    MAX_SUFFIX_LENGTH=$((MAX_BRANCH_LENGTH - PREFIX_LENGTH))
+
+    TRUNCATED_SUFFIX=$(echo "$BRANCH_SUFFIX" | cut -c1-$MAX_SUFFIX_LENGTH)
+    TRUNCATED_SUFFIX=$(echo "$TRUNCATED_SUFFIX" | sed 's/-$//')
+
+    ORIGINAL_BRANCH_NAME="$BRANCH_NAME"
+    BRANCH_NAME="${FEATURE_NUM}-${TRUNCATED_SUFFIX}"
+
+    >&2 echo "[specify] Warning: Branch name exceeded GitHub's 244-byte limit"
+    >&2 echo "[specify] Original: $ORIGINAL_BRANCH_NAME (${#ORIGINAL_BRANCH_NAME} bytes)"
+    >&2 echo "[specify] Truncated to: $BRANCH_NAME (${#BRANCH_NAME} bytes)"
+fi
+
+if [ "$DRY_RUN" != true ]; then
+    if [ "$HAS_GIT" = true ]; then
+        branch_create_error=""
+        if ! branch_create_error=$(git checkout -q -b "$BRANCH_NAME" 2>&1); then
+            current_branch="$(git rev-parse --abbrev-ref HEAD 2>/dev/null || true)"
+            if git branch --list "$BRANCH_NAME" | grep -q .; then
+                if [ "$ALLOW_EXISTING" = true ]; then
+                    if [ "$current_branch" = "$BRANCH_NAME" ]; then
+                        :
+                    elif ! switch_branch_error=$(git checkout -q "$BRANCH_NAME" 2>&1); then
+                        >&2 echo "Error: Failed to switch to existing branch '$BRANCH_NAME'. Please resolve any local changes or conflicts and try again."
+                        if [ -n "$switch_branch_error" ]; then
+                            >&2 printf '%s\n' "$switch_branch_error"
+                        fi
+                        exit 1
+                    fi
+                elif [ "$USE_TIMESTAMP" = true ]; then
+                    >&2 echo "Error: Branch '$BRANCH_NAME' already exists. Rerun to get a new timestamp or use a different --short-name."
+                    exit 1
+                else
+                    >&2 echo "Error: Branch '$BRANCH_NAME' already exists. Please use a different feature name or specify a different number with --number."
+                    exit 1
+                fi
+            else
+                >&2 echo "Error: Failed to create git branch '$BRANCH_NAME'."
+                if [ -n "$branch_create_error" ]; then
+                    >&2 printf '%s\n' "$branch_create_error"
+                else
+                    >&2 echo "Please check your git configuration and try again."
+                fi
+                exit 1
+            fi
+        fi
+    else
+        >&2 echo "[specify] Warning: Git repository not detected; skipped branch creation for $BRANCH_NAME"
+    fi
+
+    printf '# To persist: export SPECIFY_FEATURE=%q\n' "$BRANCH_NAME" >&2
+fi
+
+if $JSON_MODE; then
+    if command -v jq >/dev/null 2>&1; then
+        if [ "$DRY_RUN" = true ]; then
+            jq -cn \
+                --arg branch_name "$BRANCH_NAME" \
+                --arg feature_num "$FEATURE_NUM" \
+                '{BRANCH_NAME:$branch_name,FEATURE_NUM:$feature_num,DRY_RUN:true}'
+        else
+            jq -cn \
+                --arg branch_name "$BRANCH_NAME" \
+                --arg feature_num "$FEATURE_NUM" \
+                '{BRANCH_NAME:$branch_name,FEATURE_NUM:$feature_num}'
+        fi
+    else
+        if type json_escape >/dev/null 2>&1; then
+            _je_branch=$(json_escape "$BRANCH_NAME")
+            _je_num=$(json_escape "$FEATURE_NUM")
+        else
+            _je_branch="$BRANCH_NAME"
+            _je_num="$FEATURE_NUM"
+        fi
+        if [ "$DRY_RUN" = true ]; then
+            printf '{"BRANCH_NAME":"%s","FEATURE_NUM":"%s","DRY_RUN":true}\n' "$_je_branch" "$_je_num"
+        else
+            printf '{"BRANCH_NAME":"%s","FEATURE_NUM":"%s"}\n' "$_je_branch" "$_je_num"
+        fi
+    fi
+else
+    echo "BRANCH_NAME: $BRANCH_NAME"
+    echo "FEATURE_NUM: $FEATURE_NUM"
+    if [ "$DRY_RUN" != true ]; then
+        printf '# To persist in your shell: export SPECIFY_FEATURE=%q\n' "$BRANCH_NAME"
+    fi
+fi
--- a/.specify/extensions/git/scripts/bash/git-common.sh
+++ b/.specify/extensions/git/scripts/bash/git-common.sh
@ -0,0 +1,54 @@
+#!/usr/bin/env bash
+# Git-specific common functions for the git extension.
+# Extracted from scripts/bash/common.sh — contains only git-specific
+# branch validation and detection logic.
+
+# Check if we have git available at the repo root
+has_git() {
+    local repo_root="${1:-$(pwd)}"
+    { [ -d "$repo_root/.git" ] || [ -f "$repo_root/.git" ]; } && \
+        command -v git >/dev/null 2>&1 && \
+        git -C "$repo_root" rev-parse --is-inside-work-tree >/dev/null 2>&1
+}
+
+# Strip a single optional path segment (e.g. gitflow "feat/004-name" -> "004-name").
+# Only when the full name is exactly two slash-free segments; otherwise returns the raw name.
+spec_kit_effective_branch_name() {
+    local raw="$1"
+    if [[ "$raw" =~ ^([^/]+)/([^/]+)$ ]]; then
+        printf '%s\n' "${BASH_REMATCH[2]}"
+    else
+        printf '%s\n' "$raw"
+    fi
+}
+
+# Validate that a branch name matches the expected feature branch pattern.
+# Accepts sequential (###-* with >=3 digits) or timestamp (YYYYMMDD-HHMMSS-*) formats.
+# Logic aligned with scripts/bash/common.sh check_feature_branch after effective-name normalization.
+check_feature_branch() {
+    local raw="$1"
+    local has_git_repo="$2"
+
+    # For non-git repos, we can't enforce branch naming but still provide output
+    if [[ "$has_git_repo" != "true" ]]; then
+        echo "[specify] Warning: Git repository not detected; skipped branch validation" >&2
+        return 0
+    fi
+
+    local branch
+    branch=$(spec_kit_effective_branch_name "$raw")
+
+    # Accept sequential prefix (3+ digits) but exclude malformed timestamps
+    # Malformed: 7-or-8 digit date + 6-digit time with no trailing slug (e.g. "2026031-143022" or "20260319-143022")
+    local is_sequential=false
+    if [[ "$branch" =~ ^[0-9]{3,}- ]] && [[ ! "$branch" =~ ^[0-9]{7}-[0-9]{6}- ]] && [[ ! "$branch" =~ ^[0-9]{7,8}-[0-9]{6}$ ]]; then
+        is_sequential=true
+    fi
+    if [[ "$is_sequential" != "true" ]] && [[ ! "$branch" =~ ^[0-9]{8}-[0-9]{6}- ]]; then
+        echo "ERROR: Not on a feature branch. Current branch: $raw" >&2
+        echo "Feature branches should be named like: 001-feature-name, 1234-feature-name, or 20260319-143022-feature-name" >&2
+        return 1
+    fi
+
+    return 0
+}
--- a/.specify/extensions/git/scripts/bash/initialize-repo.sh
+++ b/.specify/extensions/git/scripts/bash/initialize-repo.sh
@ -0,0 +1,54 @@
+#!/usr/bin/env bash
+# Git extension: initialize-repo.sh
+# Initialize a Git repository with an initial commit.
+# Customizable — replace this script to add .gitignore templates,
+# default branch config, git-flow, LFS, signing, etc.
+
+set -e
+
+SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+# Find project root
+_find_project_root() {
+    local dir="$1"
+    while [ "$dir" != "/" ]; do
+        if [ -d "$dir/.specify" ] || [ -d "$dir/.git" ]; then
+            echo "$dir"
+            return 0
+        fi
+        dir="$(dirname "$dir")"
+    done
+    return 1
+}
+
+REPO_ROOT=$(_find_project_root "$SCRIPT_DIR") || REPO_ROOT="$(pwd)"
+cd "$REPO_ROOT"
+
+# Read commit message from extension config, fall back to default
+COMMIT_MSG="[Spec Kit] Initial commit"
+_config_file="$REPO_ROOT/.specify/extensions/git/git-config.yml"
+if [ -f "$_config_file" ]; then
+    _msg=$(grep '^init_commit_message:' "$_config_file" 2>/dev/null | sed 's/^init_commit_message:[[:space:]]*//' | sed 's/^["'\'']//' | sed 's/["'\'']*$//')
+    if [ -n "$_msg" ]; then
+        COMMIT_MSG="$_msg"
+    fi
+fi
+
+# Check if git is available
+if ! command -v git >/dev/null 2>&1; then
+    echo "[specify] Warning: Git not found; skipped repository initialization" >&2
+    exit 0
+fi
+
+# Check if already a git repo
+if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
+    echo "[specify] Git repository already initialized; skipping" >&2
+    exit 0
+fi
+
+# Initialize
+_git_out=$(git init -q 2>&1) || { echo "[specify] Error: git init failed: $_git_out" >&2; exit 1; }
+_git_out=$(git add . 2>&1) || { echo "[specify] Error: git add failed: $_git_out" >&2; exit 1; }
+_git_out=$(git commit --allow-empty -q -m "$COMMIT_MSG" 2>&1) || { echo "[specify] Error: git commit failed: $_git_out" >&2; exit 1; }
+
+echo "✓ Git repository initialized" >&2
--- a/.specify/extensions/git/scripts/powershell/auto-commit.ps1
+++ b/.specify/extensions/git/scripts/powershell/auto-commit.ps1
@ -0,0 +1,169 @@
+#!/usr/bin/env pwsh
+# Git extension: auto-commit.ps1
+# Automatically commit changes after a Spec Kit command completes.
+# Checks per-command config keys in git-config.yml before committing.
+#
+# Usage: auto-commit.ps1 <event_name>
+#   e.g.: auto-commit.ps1 after_specify
+param(
+    [Parameter(Position = 0, Mandatory = $true)]
+    [string]$EventName
+)
+$ErrorActionPreference = 'Stop'
+
+function Find-ProjectRoot {
+    param([string]$StartDir)
+    $current = Resolve-Path $StartDir
+    while ($true) {
+        foreach ($marker in @('.specify', '.git')) {
+            if (Test-Path (Join-Path $current $marker)) {
+                return $current
+            }
+        }
+        $parent = Split-Path $current -Parent
+        if ($parent -eq $current) { return $null }
+        $current = $parent
+    }
+}
+
+$repoRoot = Find-ProjectRoot -StartDir $PSScriptRoot
+if (-not $repoRoot) { $repoRoot = Get-Location }
+Set-Location $repoRoot
+
+# Check if git is available
+if (-not (Get-Command git -ErrorAction SilentlyContinue)) {
+    Write-Warning "[specify] Warning: Git not found; skipped auto-commit"
+    exit 0
+}
+
+# Temporarily relax ErrorActionPreference so git stderr warnings
+# (e.g. CRLF notices on Windows) do not become terminating errors.
+$savedEAP = $ErrorActionPreference
+$ErrorActionPreference = 'Continue'
+try {
+    git rev-parse --is-inside-work-tree 2>$null | Out-Null
+    $isRepo = $LASTEXITCODE -eq 0
+} finally {
+    $ErrorActionPreference = $savedEAP
+}
+if (-not $isRepo) {
+    Write-Warning "[specify] Warning: Not a Git repository; skipped auto-commit"
+    exit 0
+}
+
+# Read per-command config from git-config.yml
+$configFile = Join-Path $repoRoot ".specify/extensions/git/git-config.yml"
+$enabled = $false
+$commitMsg = ""
+
+if (Test-Path $configFile) {
+    # Parse YAML to find auto_commit section
+    $inAutoCommit = $false
+    $inEvent = $false
+    $defaultEnabled = $false
+
+    foreach ($line in Get-Content $configFile) {
+        # Detect auto_commit: section
+        if ($line -match '^auto_commit:') {
+            $inAutoCommit = $true
+            $inEvent = $false
+            continue
+        }
+
+        # Exit auto_commit section on next top-level key
+        if ($inAutoCommit -and $line -match '^[a-z]') {
+            break
+        }
+
+        if ($inAutoCommit) {
+            # Check default key
+            if ($line -match '^\s+default:\s*(.+)$') {
+                $val = $matches[1].Trim().ToLower()
+                if ($val -eq 'true') { $defaultEnabled = $true }
+            }
+
+            # Detect our event subsection
+            if ($line -match "^\s+${EventName}:") {
+                $inEvent = $true
+                continue
+            }
+
+            # Inside our event subsection
+            if ($inEvent) {
+                # Exit on next sibling key (2-space indent, not 4+)
+                if ($line -match '^\s{2}[a-z]' -and $line -notmatch '^\s{4}') {
+                    $inEvent = $false
+                    continue
+                }
+                if ($line -match '\s+enabled:\s*(.+)$') {
+                    $val = $matches[1].Trim().ToLower()
+                    if ($val -eq 'true') { $enabled = $true }
+                    if ($val -eq 'false') { $enabled = $false }
+                }
+                if ($line -match '\s+message:\s*(.+)$') {
+                    $commitMsg = $matches[1].Trim() -replace '^["'']' -replace '["'']$'
+                }
+            }
+        }
+    }
+
+    # If event-specific key not found, use default
+    if (-not $enabled -and $defaultEnabled) {
+        $hasEventKey = Select-String -Path $configFile -Pattern "^\s*${EventName}:" -Quiet
+        if (-not $hasEventKey) {
+            $enabled = $true
+        }
+    }
+} else {
+    # No config file -- auto-commit disabled by default
+    exit 0
+}
+
+if (-not $enabled) {
+    exit 0
+}
+
+# Check if there are changes to commit
+# Relax ErrorActionPreference so CRLF warnings on stderr do not terminate.
+$savedEAP = $ErrorActionPreference
+$ErrorActionPreference = 'Continue'
+try {
+    git diff --quiet HEAD 2>$null; $d1 = $LASTEXITCODE
+    git diff --cached --quiet 2>$null; $d2 = $LASTEXITCODE
+    $untracked = git ls-files --others --exclude-standard 2>$null
+} finally {
+    $ErrorActionPreference = $savedEAP
+}
+
+if ($d1 -eq 0 -and $d2 -eq 0 -and -not $untracked) {
+    Write-Host "[specify] No changes to commit after $EventName" -ForegroundColor DarkGray
+    exit 0
+}
+
+# Derive a human-readable command name from the event
+$commandName = $EventName -replace '^after_', '' -replace '^before_', ''
+$phase = if ($EventName -match '^before_') { 'before' } else { 'after' }
+
+# Use custom message if configured, otherwise default
+if (-not $commitMsg) {
+    $commitMsg = "[Spec Kit] Auto-commit $phase $commandName"
+}
+
+# Stage and commit
+# Relax ErrorActionPreference so CRLF warnings on stderr do not terminate,
+# while still allowing redirected error output to be captured for diagnostics.
+$savedEAP = $ErrorActionPreference
+$ErrorActionPreference = 'Continue'
+try {
+    $out = git add . 2>&1 | Out-String
+    if ($LASTEXITCODE -ne 0) { throw "git add failed: $out" }
+    $out = git commit -q -m $commitMsg 2>&1 | Out-String
+    if ($LASTEXITCODE -ne 0) { throw "git commit failed: $out" }
+} catch {
+    Write-Warning "[specify] Error: $_"
+    exit 1
+} finally {
+    $ErrorActionPreference = $savedEAP
+}
+
+Write-Host "[OK] Changes committed $phase $commandName"
--- a/.specify/extensions/git/scripts/powershell/create-new-feature.ps1
+++ b/.specify/extensions/git/scripts/powershell/create-new-feature.ps1
@ -0,0 +1,403 @@
+#!/usr/bin/env pwsh
+# Git extension: create-new-feature.ps1
+# Adapted from core scripts/powershell/create-new-feature.ps1 for extension layout.
+# Sources common.ps1 from the project's installed scripts, falling back to
+# git-common.ps1 for minimal git helpers.
+[CmdletBinding()]
+param(
+    [switch]$Json,
+    [switch]$AllowExistingBranch,
+    [switch]$DryRun,
+    [string]$ShortName,
+    [Parameter()]
+    [long]$Number = 0,
+    [switch]$Timestamp,
+    [switch]$Help,
+    [Parameter(Position = 0, ValueFromRemainingArguments = $true)]
+    [string[]]$FeatureDescription
+)
+$ErrorActionPreference = 'Stop'
+
+if ($Help) {
+    Write-Host "Usage: ./create-new-feature.ps1 [-Json] [-DryRun] [-AllowExistingBranch] [-ShortName <name>] [-Number N] [-Timestamp] <feature description>"
+    Write-Host ""
+    Write-Host "Options:"
+    Write-Host "  -Json               Output in JSON format"
+    Write-Host "  -DryRun             Compute branch name without creating the branch"
+    Write-Host "  -AllowExistingBranch  Switch to branch if it already exists instead of failing"
+    Write-Host "  -ShortName <name>   Provide a custom short name (2-4 words) for the branch"
+    Write-Host "  -Number N           Specify branch number manually (overrides auto-detection)"
+    Write-Host "  -Timestamp          Use timestamp prefix (YYYYMMDD-HHMMSS) instead of sequential numbering"
+    Write-Host "  -Help               Show this help message"
+    Write-Host ""
+    Write-Host "Environment variables:"
+    Write-Host "  GIT_BRANCH_NAME     Use this exact branch name, bypassing all prefix/suffix generation"
+    Write-Host ""
+    exit 0
+}
+
+if (-not $FeatureDescription -or $FeatureDescription.Count -eq 0) {
+    Write-Error "Usage: ./create-new-feature.ps1 [-Json] [-DryRun] [-AllowExistingBranch] [-ShortName <name>] [-Number N] [-Timestamp] <feature description>"
+    exit 1
+}
+
+$featureDesc = ($FeatureDescription -join ' ').Trim()
+
+if ([string]::IsNullOrWhiteSpace($featureDesc)) {
+    Write-Error "Error: Feature description cannot be empty or contain only whitespace"
+    exit 1
+}
+
+function Get-HighestNumberFromSpecs {
+    param([string]$SpecsDir)
+
+    [long]$highest = 0
+    if (Test-Path $SpecsDir) {
+        Get-ChildItem -Path $SpecsDir -Directory | ForEach-Object {
+            if ($_.Name -match '^(\d{3,})-' -and $_.Name -notmatch '^\d{8}-\d{6}-') {
+                [long]$num = 0
+                if ([long]::TryParse($matches[1], [ref]$num) -and $num -gt $highest) {
+                    $highest = $num
+                }
+            }
+        }
+    }
+    return $highest
+}
+
+function Get-HighestNumberFromNames {
+    param([string[]]$Names)
+
+    [long]$highest = 0
+    foreach ($name in $Names) {
+        if ($name -match '^(\d{3,})-' -and $name -notmatch '^\d{8}-\d{6}-') {
+            [long]$num = 0
+            if ([long]::TryParse($matches[1], [ref]$num) -and $num -gt $highest) {
+                $highest = $num
+            }
+        }
+    }
+    return $highest
+}
+
+function Get-HighestNumberFromBranches {
+    param()
+
+    try {
+        $branches = git branch -a 2>$null
+        if ($LASTEXITCODE -eq 0 -and $branches) {
+            $cleanNames = $branches | ForEach-Object {
+                $_.Trim() -replace '^\*?\s+', '' -replace '^remotes/[^/]+/', ''
+            }
+            return Get-HighestNumberFromNames -Names $cleanNames
+        }
+    } catch {
+        Write-Verbose "Could not check Git branches: $_"
+    }
+    return 0
+}
+
+function Get-HighestNumberFromRemoteRefs {
+    [long]$highest = 0
+    try {
+        $remotes = git remote 2>$null
+        if ($remotes) {
+            foreach ($remote in $remotes) {
+                $env:GIT_TERMINAL_PROMPT = '0'
+                $refs = git ls-remote --heads $remote 2>$null
+                $env:GIT_TERMINAL_PROMPT = $null
+                if ($LASTEXITCODE -eq 0 -and $refs) {
+                    $refNames = $refs | ForEach-Object {
+                        if ($_ -match 'refs/heads/(.+)$') { $matches[1] }
+                    } | Where-Object { $_ }
+                    $remoteHighest = Get-HighestNumberFromNames -Names $refNames
+                    if ($remoteHighest -gt $highest) { $highest = $remoteHighest }
+                }
+            }
+        }
+    } catch {
+        Write-Verbose "Could not query remote refs: $_"
+    }
+    return $highest
+}
+
+function Get-NextBranchNumber {
+    param(
+        [string]$SpecsDir,
+        [switch]$SkipFetch
+    )
+
+    if ($SkipFetch) {
+        $highestBranch = Get-HighestNumberFromBranches
+        $highestRemote = Get-HighestNumberFromRemoteRefs
+        $highestBranch = [Math]::Max($highestBranch, $highestRemote)
+    } else {
+        try {
+            git fetch --all --prune 2>$null | Out-Null
+        } catch { }
+        $highestBranch = Get-HighestNumberFromBranches
+    }
+
+    $highestSpec = Get-HighestNumberFromSpecs -SpecsDir $SpecsDir
+    $maxNum = [Math]::Max($highestBranch, $highestSpec)
+    return $maxNum + 1
+}
+
+function ConvertTo-CleanBranchName {
+    param([string]$Name)
+    return $Name.ToLower() -replace '[^a-z0-9]', '-' -replace '-{2,}', '-' -replace '^-', '' -replace '-$', ''
+}
+
+# ---------------------------------------------------------------------------
+# Source common.ps1 from the project's installed scripts.
+# Search locations in priority order:
+#  1. .specify/scripts/powershell/common.ps1 under the project root
+#  2. scripts/powershell/common.ps1 under the project root (source checkout)
+#  3. git-common.ps1 next to this script (minimal fallback)
+# ---------------------------------------------------------------------------
+function Find-ProjectRoot {
+    param([string]$StartDir)
+    $current = Resolve-Path $StartDir
+    while ($true) {
+        foreach ($marker in @('.specify', '.git')) {
+            if (Test-Path (Join-Path $current $marker)) {
+                return $current
+            }
+        }
+        $parent = Split-Path $current -Parent
+        if ($parent -eq $current) { return $null }
+        $current = $parent
+    }
+}
+
+$projectRoot = Find-ProjectRoot -StartDir $PSScriptRoot
+$commonLoaded = $false
+
+if ($projectRoot) {
+    $candidates = @(
+        (Join-Path $projectRoot ".specify/scripts/powershell/common.ps1"),
+        (Join-Path $projectRoot "scripts/powershell/common.ps1")
+    )
+    foreach ($candidate in $candidates) {
+        if (Test-Path $candidate) {
+            . $candidate
+            $commonLoaded = $true
+            break
+        }
+    }
+}
+
+if (-not $commonLoaded -and (Test-Path "$PSScriptRoot/git-common.ps1")) {
+    . "$PSScriptRoot/git-common.ps1"
+    $commonLoaded = $true
+}
+
+if (-not $commonLoaded) {
+    throw "Unable to locate common script file. Please ensure the Specify core scripts are installed."
+}
+
+# Resolve repository root
+if (Get-Command Get-RepoRoot -ErrorAction SilentlyContinue) {
+    $repoRoot = Get-RepoRoot
+} elseif ($projectRoot) {
+    $repoRoot = $projectRoot
+} else {
+    throw "Could not determine repository root."
+}
+
+# Check if git is available
+if (Get-Command Test-HasGit -ErrorAction SilentlyContinue) {
+    # Call without parameters for compatibility with core common.ps1 (no -RepoRoot param)
+    # and git-common.ps1 (has -RepoRoot param with default).
+    $hasGit = Test-HasGit
+} else {
+    try {
+        git -C $repoRoot rev-parse --is-inside-work-tree 2>$null | Out-Null
+        $hasGit = ($LASTEXITCODE -eq 0)
+    } catch {
+        $hasGit = $false
+    }
+}
+
+Set-Location $repoRoot
+
+$specsDir = Join-Path $repoRoot 'specs'
+
+function Get-BranchName {
+    param([string]$Description)
+
+    $stopWords = @(
+        'i', 'a', 'an', 'the', 'to', 'for', 'of', 'in', 'on', 'at', 'by', 'with', 'from',
+        'is', 'are', 'was', 'were', 'be', 'been', 'being', 'have', 'has', 'had',
+        'do', 'does', 'did', 'will', 'would', 'should', 'could', 'can', 'may', 'might', 'must', 'shall',
+        'this', 'that', 'these', 'those', 'my', 'your', 'our', 'their',
+        'want', 'need', 'add', 'get', 'set'
+    )
+
+    $cleanName = $Description.ToLower() -replace '[^a-z0-9\s]', ' '
+    $words = $cleanName -split '\s+' | Where-Object { $_ }
+
+    $meaningfulWords = @()
+    foreach ($word in $words) {
+        if ($stopWords -contains $word) { continue }
+        if ($word.Length -ge 3) {
+            $meaningfulWords += $word
+        } elseif ($Description -cmatch "\b$($word.ToUpper())\b") {
+            $meaningfulWords += $word
+        }
+    }
+
+    if ($meaningfulWords.Count -gt 0) {
+        $maxWords = if ($meaningfulWords.Count -eq 4) { 4 } else { 3 }
+        $result = ($meaningfulWords | Select-Object -First $maxWords) -join '-'
+        return $result
+    } else {
+        $result = ConvertTo-CleanBranchName -Name $Description
+        $fallbackWords = ($result -split '-') | Where-Object { $_ } | Select-Object -First 3
+        return [string]::Join('-', $fallbackWords)
+    }
+}
+
+# Check for GIT_BRANCH_NAME env var override (exact branch name, no prefix/suffix)
+if ($env:GIT_BRANCH_NAME) {
+    $branchName = $env:GIT_BRANCH_NAME
+    # Check 244-byte limit (UTF-8) for override names
+    $branchNameUtf8ByteCount = [System.Text.Encoding]::UTF8.GetByteCount($branchName)
+    if ($branchNameUtf8ByteCount -gt 244) {
+        throw "GIT_BRANCH_NAME must be 244 bytes or fewer in UTF-8. Provided value is $branchNameUtf8ByteCount bytes; please supply a shorter override branch name."
+    }
+    # Extract FEATURE_NUM from the branch name if it starts with a numeric prefix
+    # Check timestamp pattern first (YYYYMMDD-HHMMSS-) since it also matches the simpler ^\d+ pattern
+    if ($branchName -match '^(\d{8}-\d{6})-') {
+        $featureNum = $matches[1]
+    } elseif ($branchName -match '^(\d+)-') {
+        $featureNum = $matches[1]
+    } else {
+        $featureNum = $branchName
+    }
+} else {
+    if ($ShortName) {
+        $branchSuffix = ConvertTo-CleanBranchName -Name $ShortName
+    } else {
+        $branchSuffix = Get-BranchName -Description $featureDesc
+    }
+
+    if ($Timestamp -and $Number -ne 0) {
+        Write-Warning "[specify] Warning: -Number is ignored when -Timestamp is used"
+        $Number = 0
+    }
+
+    if ($Timestamp) {
+        $featureNum = Get-Date -Format 'yyyyMMdd-HHmmss'
+        $branchName = "$featureNum-$branchSuffix"
+    } else {
+        if ($Number -eq 0) {
+            if ($DryRun -and $hasGit) {
+                $Number = Get-NextBranchNumber -SpecsDir $specsDir -SkipFetch
+            } elseif ($DryRun) {
+                $Number = (Get-HighestNumberFromSpecs -SpecsDir $specsDir) + 1
+            } elseif ($hasGit) {
+                $Number = Get-NextBranchNumber -SpecsDir $specsDir
+            } else {
+                $Number = (Get-HighestNumberFromSpecs -SpecsDir $specsDir) + 1
+            }
+        }
+
+        $featureNum = ('{0:000}' -f $Number)
+        $branchName = "$featureNum-$branchSuffix"
+    }
+}
+
+$maxBranchLength = 244
+if ($branchName.Length -gt $maxBranchLength) {
+    $prefixLength = $featureNum.Length + 1
+    $maxSuffixLength = $maxBranchLength - $prefixLength
+
+    $truncatedSuffix = $branchSuffix.Substring(0, [Math]::Min($branchSuffix.Length, $maxSuffixLength))
+    $truncatedSuffix = $truncatedSuffix -replace '-$', ''
+
+    $originalBranchName = $branchName
+    $branchName = "$featureNum-$truncatedSuffix"
+
+    Write-Warning "[specify] Branch name exceeded GitHub's 244-byte limit"
+    Write-Warning "[specify] Original: $originalBranchName ($($originalBranchName.Length) bytes)"
+    Write-Warning "[specify] Truncated to: $branchName ($($branchName.Length) bytes)"
+}
+
+if (-not $DryRun) {
+    if ($hasGit) {
+        $branchCreated = $false
+        $branchCreateError = ''
+        try {
+            $branchCreateError = git checkout -q -b $branchName 2>&1 | Out-String
+            if ($LASTEXITCODE -eq 0) {
+                $branchCreated = $true
+            }
+        } catch {
+            $branchCreateError = $_.Exception.Message
+        }
+
+        if (-not $branchCreated) {
+            $currentBranch = ''
+            try { $currentBranch = (git rev-parse --abbrev-ref HEAD 2>$null).Trim() } catch {}
+            $existingBranch = git branch --list $branchName 2>$null
+            if ($existingBranch) {
+                if ($AllowExistingBranch) {
+                    if ($currentBranch -eq $branchName) {
+                        # Already on the target branch
+                    } else {
+                        $switchBranchError = git checkout -q $branchName 2>&1 | Out-String
+                        if ($LASTEXITCODE -ne 0) {
+                            if ($switchBranchError) {
+                                Write-Error "Error: Branch '$branchName' exists but could not be checked out.`n$($switchBranchError.Trim())"
+                            } else {
+                                Write-Error "Error: Branch '$branchName' exists but could not be checked out. Resolve any uncommitted changes or conflicts and try again."
+                            }
+                            exit 1
+                        }
+                    }
+                } elseif ($Timestamp) {
+                    Write-Error "Error: Branch '$branchName' already exists. Rerun to get a new timestamp or use a different -ShortName."
+                    exit 1
+                } else {
+                    Write-Error "Error: Branch '$branchName' already exists. Please use a different feature name or specify a different number with -Number."
+                    exit 1
+                }
+            } else {
+                if ($branchCreateError) {
+                    Write-Error "Error: Failed to create git branch '$branchName'.`n$($branchCreateError.Trim())"
+                } else {
+                    Write-Error "Error: Failed to create git branch '$branchName'. Please check your git configuration and try again."
+                }
+                exit 1
+            }
+        }
+    } else {
+        if ($Json) {
+            [Console]::Error.WriteLine("[specify] Warning: Git repository not detected; skipped branch creation for $branchName")
+        } else {
+            Write-Warning "[specify] Warning: Git repository not detected; skipped branch creation for $branchName"
+        }
+    }
+
+    $env:SPECIFY_FEATURE = $branchName
+}
+
+if ($Json) {
+    $obj = [PSCustomObject]@{
+        BRANCH_NAME = $branchName
+        FEATURE_NUM = $featureNum
+        HAS_GIT = $hasGit
+    }
+    if ($DryRun) {
+        $obj | Add-Member -NotePropertyName 'DRY_RUN' -NotePropertyValue $true
+    }
+    $obj | ConvertTo-Json -Compress
+} else {
+    Write-Output "BRANCH_NAME: $branchName"
+    Write-Output "FEATURE_NUM: $featureNum"
+    Write-Output "HAS_GIT: $hasGit"
+    if (-not $DryRun) {
+        Write-Output "SPECIFY_FEATURE environment variable set to: $branchName"
+    }
+}
--- a/.specify/extensions/git/scripts/powershell/git-common.ps1
+++ b/.specify/extensions/git/scripts/powershell/git-common.ps1
@ -0,0 +1,51 @@
+#!/usr/bin/env pwsh
+# Git-specific common functions for the git extension.
+# Extracted from scripts/powershell/common.ps1 -- contains only git-specific
+# branch validation and detection logic.
+
+function Test-HasGit {
+    param([string]$RepoRoot = (Get-Location))
+    try {
+        if (-not (Test-Path (Join-Path $RepoRoot '.git'))) { return $false }
+        if (-not (Get-Command git -ErrorAction SilentlyContinue)) { return $false }
+        git -C $RepoRoot rev-parse --is-inside-work-tree 2>$null | Out-Null
+        return ($LASTEXITCODE -eq 0)
+    } catch {
+        return $false
+    }
+}
+
+function Get-SpecKitEffectiveBranchName {
+    param([string]$Branch)
+    if ($Branch -match '^([^/]+)/([^/]+)$') {
+        return $Matches[2]
+    }
+    return $Branch
+}
+
+function Test-FeatureBranch {
+    param(
+        [string]$Branch,
+        [bool]$HasGit = $true
+    )
+
+    # For non-git repos, we can't enforce branch naming but still provide output
+    if (-not $HasGit) {
+        Write-Warning "[specify] Warning: Git repository not detected; skipped branch validation"
+        return $true
+    }
+
+    $raw = $Branch
+    $Branch = Get-SpecKitEffectiveBranchName $raw
+
+    # Accept sequential prefix (3+ digits) but exclude malformed timestamps
+    # Malformed: 7-digit date + 6-digit time followed by a slug (e.g. "2026031-143022-name"), or 7-or-8 digit date + 6-digit time with no trailing slug (e.g. "2026031-143022" or "20260319-143022")
+    $hasMalformedTimestamp = ($Branch -match '^[0-9]{7}-[0-9]{6}-') -or ($Branch -match '^(?:\d{7}|\d{8})-\d{6}$')
+    $isSequential = ($Branch -match '^[0-9]{3,}-') -and (-not $hasMalformedTimestamp)
+    if (-not $isSequential -and $Branch -notmatch '^\d{8}-\d{6}-') {
+        [Console]::Error.WriteLine("ERROR: Not on a feature branch. Current branch: $raw")
+        [Console]::Error.WriteLine("Feature branches should be named like: 001-feature-name, 1234-feature-name, or 20260319-143022-feature-name")
+        return $false
+    }
+    return $true
+}
--- a/.specify/extensions/git/scripts/powershell/initialize-repo.ps1
+++ b/.specify/extensions/git/scripts/powershell/initialize-repo.ps1
@ -0,0 +1,69 @@
+#!/usr/bin/env pwsh
+# Git extension: initialize-repo.ps1
+# Initialize a Git repository with an initial commit.
+# Customizable -- replace this script to add .gitignore templates,
+# default branch config, git-flow, LFS, signing, etc.
+$ErrorActionPreference = 'Stop'
+
+# Find project root
+function Find-ProjectRoot {
+    param([string]$StartDir)
+    $current = Resolve-Path $StartDir
+    while ($true) {
+        foreach ($marker in @('.specify', '.git')) {
+            if (Test-Path (Join-Path $current $marker)) {
+                return $current
+            }
+        }
+        $parent = Split-Path $current -Parent
+        if ($parent -eq $current) { return $null }
+        $current = $parent
+    }
+}
+
+$repoRoot = Find-ProjectRoot -StartDir $PSScriptRoot
+if (-not $repoRoot) { $repoRoot = Get-Location }
+Set-Location $repoRoot
+
+# Read commit message from extension config, fall back to default
+$commitMsg = "[Spec Kit] Initial commit"
+$configFile = Join-Path $repoRoot ".specify/extensions/git/git-config.yml"
+if (Test-Path $configFile) {
+    foreach ($line in Get-Content $configFile) {
+        if ($line -match '^init_commit_message:\s*(.+)$') {
+            $val = $matches[1].Trim() -replace '^["'']' -replace '["'']$'
+            if ($val) { $commitMsg = $val }
+            break
+        }
+    }
+}
+
+# Check if git is available
+if (-not (Get-Command git -ErrorAction SilentlyContinue)) {
+    Write-Warning "[specify] Warning: Git not found; skipped repository initialization"
+    exit 0
+}
+
+# Check if already a git repo
+try {
+    git rev-parse --is-inside-work-tree 2>$null | Out-Null
+    if ($LASTEXITCODE -eq 0) {
+        Write-Warning "[specify] Git repository already initialized; skipping"
+        exit 0
+    }
+} catch { }
+
+# Initialize
+try {
+    $out = git init -q 2>&1 | Out-String
+    if ($LASTEXITCODE -ne 0) { throw "git init failed: $out" }
+    $out = git add . 2>&1 | Out-String
+    if ($LASTEXITCODE -ne 0) { throw "git add failed: $out" }
+    $out = git commit --allow-empty -q -m $commitMsg 2>&1 | Out-String
+    if ($LASTEXITCODE -ne 0) { throw "git commit failed: $out" }
+} catch {
+    Write-Warning "[specify] Error: $_"
+    exit 1
+}
+
+Write-Host "[OK] Git repository initialized"
--- a/.specify/init-options.json
+++ b/.specify/init-options.json
@ -0,0 +1,9 @@
+{
+  "ai": "codex",
+  "ai_skills": true,
+  "branch_numbering": "sequential",
+  "here": true,
+  "integration": "codex",
+  "script": "sh",
+  "speckit_version": "0.9.5"
+}
--- a/.specify/integration.json
+++ b/.specify/integration.json
@ -0,0 +1,19 @@
+{
+  "version": "0.9.5",
+  "integration_state_schema": 1,
+  "installed_integrations": [
+    "codex"
+  ],
+  "integration_settings": {
+    "codex": {
+      "script": "sh",
+      "raw_options": "--skills",
+      "parsed_options": {
+        "skills": true
+      },
+      "invoke_separator": "-"
+    }
+  },
+  "integration": "codex",
+  "default_integration": "codex"
+}
--- a/.specify/integrations/codex.manifest.json
+++ b/.specify/integrations/codex.manifest.json
@ -0,0 +1,16 @@
+{
+  "integration": "codex",
+  "version": "0.9.5",
+  "installed_at": "2026-06-08T01:33:59.539838+00:00",
+  "files": {
+    ".agents/skills/speckit-analyze/SKILL.md": "753f1d49d830abc130132ad2864c780ea61fd57bbc71aa9888be24fdf0774800",
+    ".agents/skills/speckit-clarify/SKILL.md": "08e643cb56c88adf1f4b28821d490360186f6bc0dfb1f21a059e16e4b8e89b91",
+    ".agents/skills/speckit-constitution/SKILL.md": "e2cbe859958c5a05be52a44d63821e6a84d39f3d37acc05b550cc7ad85da0dab",
+    ".agents/skills/speckit-implement/SKILL.md": "796ab9a7f04281fee7d390087e89438f4215cbe2396a8a0118dafd12c0268894",
+    ".agents/skills/speckit-plan/SKILL.md": "67bfc751600f8ba46c8cf6fd32e609e1b5a468ab7eb62e26aaae70009ecb89f0",
+    ".agents/skills/speckit-checklist/SKILL.md": "734393e5698b390283db49135e1140d6ad529b65eae439bb0a53bc5acab2b529",
+    ".agents/skills/speckit-specify/SKILL.md": "e74c7b705bebbdf457d0b01e928a4d4f25bd3f77b8c650a2ef3c463e706550ec",
+    ".agents/skills/speckit-tasks/SKILL.md": "bb461317a2b17eda72250202197a8307e519a24cd22758d3091389c70d869af1",
+    ".agents/skills/speckit-taskstoissues/SKILL.md": "a3efcf92cf532420c10abf7b9253204ade34b7329888c28e271fa3da7750c584"
+  }
+}
--- a/.specify/integrations/speckit.manifest.json
+++ b/.specify/integrations/speckit.manifest.json
@ -0,0 +1,17 @@
+{
+  "integration": "speckit",
+  "version": "0.9.5",
+  "installed_at": "2026-06-08T01:33:59.545362+00:00",
+  "files": {
+    ".specify/scripts/bash/check-prerequisites.sh": "f4541a00257f035aa55a9fede6d964e51e6851c3dc2f81d0a6f367db18944765",
+    ".specify/scripts/bash/setup-tasks.sh": "7aeee15192a5ab3ba9ff3c3ae450d9994043bf0493c1eabc840da72a9742fc87",
+    ".specify/scripts/bash/setup-plan.sh": "b23cca3d769a217ab812a6059adb549622471f6893af234cf98ca2019ac4e1a1",
+    ".specify/scripts/bash/create-new-feature.sh": "bcf4964ca0c6c78717bb42d9e66b8c7e5ee82779cd96afc5aa7b08b75abe5790",
+    ".specify/scripts/bash/common.sh": "1b52fdc114424b83784d59477256e1854c23ee3135273625904eb0231cc0c37e",
+    ".specify/templates/plan-template.md": "cc7f7979cf8d8836ec26492785affd80791d3422a2b745062ec695be8c985ef7",
+    ".specify/templates/constitution-template.md": "ce7549540fa45543cca797a150201d868e64495fdff39dc38246fb17bd4024b3",
+    ".specify/templates/checklist-template.md": "c37695297e5d3153d64f82c21223509940b13932046c7961c42d1d669516130c",
+    ".specify/templates/spec-template.md": "3945437fc35cd30a5b2bf7beea680337c3516826d3efa5a6b92c4a7eca1ba28e",
+    ".specify/templates/tasks-template.md": "fc29a233f6f5a27ca31f1aa46b596af6500c627441c6e62b2bc4a1d721525842"
+  }
+}
--- a/.specify/memory/constitution.md
+++ b/.specify/memory/constitution.md
@ -0,0 +1,50 @@
+# [PROJECT_NAME] Constitution
+<!-- Example: Spec Constitution, TaskFlow Constitution, etc. -->
+
+## Core Principles
+
+### [PRINCIPLE_1_NAME]
+<!-- Example: I. Library-First -->
+[PRINCIPLE_1_DESCRIPTION]
+<!-- Example: Every feature starts as a standalone library; Libraries must be self-contained, independently testable, documented; Clear purpose required - no organizational-only libraries -->
+
+### [PRINCIPLE_2_NAME]
+<!-- Example: II. CLI Interface -->
+[PRINCIPLE_2_DESCRIPTION]
+<!-- Example: Every library exposes functionality via CLI; Text in/out protocol: stdin/args → stdout, errors → stderr; Support JSON + human-readable formats -->
+
+### [PRINCIPLE_3_NAME]
+<!-- Example: III. Test-First (NON-NEGOTIABLE) -->
+[PRINCIPLE_3_DESCRIPTION]
+<!-- Example: TDD mandatory: Tests written → User approved → Tests fail → Then implement; Red-Green-Refactor cycle strictly enforced -->
+
+### [PRINCIPLE_4_NAME]
+<!-- Example: IV. Integration Testing -->
+[PRINCIPLE_4_DESCRIPTION]
+<!-- Example: Focus areas requiring integration tests: New library contract tests, Contract changes, Inter-service communication, Shared schemas -->
+
+### [PRINCIPLE_5_NAME]
+<!-- Example: V. Observability, VI. Versioning & Breaking Changes, VII. Simplicity -->
+[PRINCIPLE_5_DESCRIPTION]
+<!-- Example: Text I/O ensures debuggability; Structured logging required; Or: MAJOR.MINOR.BUILD format; Or: Start simple, YAGNI principles -->
+
+## [SECTION_2_NAME]
+<!-- Example: Additional Constraints, Security Requirements, Performance Standards, etc. -->
+
+[SECTION_2_CONTENT]
+<!-- Example: Technology stack requirements, compliance standards, deployment policies, etc. -->
+
+## [SECTION_3_NAME]
+<!-- Example: Development Workflow, Review Process, Quality Gates, etc. -->
+
+[SECTION_3_CONTENT]
+<!-- Example: Code review requirements, testing gates, deployment approval process, etc. -->
+
+## Governance
+<!-- Example: Constitution supersedes all other practices; Amendments require documentation, approval, migration plan -->
+
+[GOVERNANCE_RULES]
+<!-- Example: All PRs/reviews must verify compliance; Complexity must be justified; Use [GUIDANCE_FILE] for runtime development guidance -->
+
+**Version**: [CONSTITUTION_VERSION] | **Ratified**: [RATIFICATION_DATE] | **Last Amended**: [LAST_AMENDED_DATE]
+<!-- Example: Version: 2.1.1 | Ratified: 2025-06-13 | Last Amended: 2025-07-16 -->
--- a/.specify/scripts/bash/check-prerequisites.sh
+++ b/.specify/scripts/bash/check-prerequisites.sh
@ -0,0 +1,192 @@
+#!/usr/bin/env bash
+
+# Consolidated prerequisite checking script
+#
+# This script provides unified prerequisite checking for the Spec-Driven Development workflow.
+# It replaces the functionality previously spread across multiple scripts.
+#
+# Usage: ./check-prerequisites.sh [OPTIONS]
+#
+# OPTIONS:
+#   --json              Output in JSON format
+#   --require-tasks     Require tasks.md to exist (for implementation phase)
+#   --include-tasks     Include tasks.md in AVAILABLE_DOCS list
+#   --paths-only        Only output path variables (no validation)
+#   --help, -h          Show help message
+#
+# OUTPUTS:
+#   JSON mode: {"FEATURE_DIR":"...", "AVAILABLE_DOCS":["..."]}
+#   Text mode: FEATURE_DIR:... \n AVAILABLE_DOCS: \n ✓/✗ file.md
+#   Paths only: REPO_ROOT: ... \n BRANCH: ... \n FEATURE_DIR: ... etc.
+
+set -e
+
+# Parse command line arguments
+JSON_MODE=false
+REQUIRE_TASKS=false
+INCLUDE_TASKS=false
+PATHS_ONLY=false
+
+for arg in "$@"; do
+    case "$arg" in
+        --json)
+            JSON_MODE=true
+            ;;
+        --require-tasks)
+            REQUIRE_TASKS=true
+            ;;
+        --include-tasks)
+            INCLUDE_TASKS=true
+            ;;
+        --paths-only)
+            PATHS_ONLY=true
+            ;;
+        --help|-h)
+            cat << 'EOF'
+Usage: check-prerequisites.sh [OPTIONS]
+
+Consolidated prerequisite checking for the Spec-Driven Development workflow.
+
+OPTIONS:
+  --json              Output in JSON format
+  --require-tasks     Require tasks.md to exist (for implementation phase)
+  --include-tasks     Include tasks.md in AVAILABLE_DOCS list
+  --paths-only        Only output path variables (no prerequisite validation)
+  --help, -h          Show this help message
+
+EXAMPLES:
+  # Check task prerequisites (plan.md required)
+  ./check-prerequisites.sh --json
+  
+  # Check implementation prerequisites (plan.md + tasks.md required)
+  ./check-prerequisites.sh --json --require-tasks --include-tasks
+  
+  # Get feature paths only (no validation)
+  ./check-prerequisites.sh --paths-only
+  
+EOF
+            exit 0
+            ;;
+        *)
+            echo "ERROR: Unknown option '$arg'. Use --help for usage information." >&2
+            exit 1
+            ;;
+    esac
+done
+
+# Source common functions
+SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+source "$SCRIPT_DIR/common.sh"
+
+# Get feature paths
+_paths_output=$(get_feature_paths) || { echo "ERROR: Failed to resolve feature paths" >&2; exit 1; }
+eval "$_paths_output"
+unset _paths_output
+
+# If paths-only mode, output paths and exit (no validation)
+if $PATHS_ONLY; then
+    if $JSON_MODE; then
+        # Minimal JSON paths payload (no validation performed)
+        if has_jq; then
+            jq -cn \
+                --arg repo_root "$REPO_ROOT" \
+                --arg branch "$CURRENT_BRANCH" \
+                --arg feature_dir "$FEATURE_DIR" \
+                --arg feature_spec "$FEATURE_SPEC" \
+                --arg impl_plan "$IMPL_PLAN" \
+                --arg tasks "$TASKS" \
+                '{REPO_ROOT:$repo_root,BRANCH:$branch,FEATURE_DIR:$feature_dir,FEATURE_SPEC:$feature_spec,IMPL_PLAN:$impl_plan,TASKS:$tasks}'
+        else
+            printf '{"REPO_ROOT":"%s","BRANCH":"%s","FEATURE_DIR":"%s","FEATURE_SPEC":"%s","IMPL_PLAN":"%s","TASKS":"%s"}\n' \
+                "$(json_escape "$REPO_ROOT")" "$(json_escape "$CURRENT_BRANCH")" "$(json_escape "$FEATURE_DIR")" "$(json_escape "$FEATURE_SPEC")" "$(json_escape "$IMPL_PLAN")" "$(json_escape "$TASKS")"
+        fi
+    else
+        echo "REPO_ROOT: $REPO_ROOT"
+        echo "BRANCH: $CURRENT_BRANCH"
+        echo "FEATURE_DIR: $FEATURE_DIR"
+        echo "FEATURE_SPEC: $FEATURE_SPEC"
+        echo "IMPL_PLAN: $IMPL_PLAN"
+        echo "TASKS: $TASKS"
+    fi
+    exit 0
+fi
+
+# Validate branch name
+check_feature_branch "$CURRENT_BRANCH" "$HAS_GIT" || exit 1
+
+# Validate required directories and files
+if [[ ! -d "$FEATURE_DIR" ]]; then
+    echo "ERROR: Feature directory not found: $FEATURE_DIR" >&2
+    echo "Run /speckit-specify first to create the feature structure." >&2
+    exit 1
+fi
+
+if [[ ! -f "$IMPL_PLAN" ]]; then
+    echo "ERROR: plan.md not found in $FEATURE_DIR" >&2
+    echo "Run /speckit-plan first to create the implementation plan." >&2
+    exit 1
+fi
+
+# Check for tasks.md if required
+if $REQUIRE_TASKS && [[ ! -f "$TASKS" ]]; then
+    echo "ERROR: tasks.md not found in $FEATURE_DIR" >&2
+    echo "Run /speckit-tasks first to create the task list." >&2
+    exit 1
+fi
+
+# Build list of available documents
+docs=()
+
+# Always check these optional docs
+[[ -f "$RESEARCH" ]] && docs+=("research.md")
+[[ -f "$DATA_MODEL" ]] && docs+=("data-model.md")
+
+# Check contracts directory (only if it exists and has files)
+if [[ -d "$CONTRACTS_DIR" ]] && [[ -n "$(ls -A "$CONTRACTS_DIR" 2>/dev/null)" ]]; then
+    docs+=("contracts/")
+fi
+
+[[ -f "$QUICKSTART" ]] && docs+=("quickstart.md")
+
+# Include tasks.md if requested and it exists
+if $INCLUDE_TASKS && [[ -f "$TASKS" ]]; then
+    docs+=("tasks.md")
+fi
+
+# Output results
+if $JSON_MODE; then
+    # Build JSON array of documents
+    if has_jq; then
+        if [[ ${#docs[@]} -eq 0 ]]; then
+            json_docs="[]"
+        else
+            json_docs=$(printf '%s\n' "${docs[@]}" | jq -R . | jq -s .)
+        fi
+        jq -cn \
+            --arg feature_dir "$FEATURE_DIR" \
+            --argjson docs "$json_docs" \
+            '{FEATURE_DIR:$feature_dir,AVAILABLE_DOCS:$docs}'
+    else
+        if [[ ${#docs[@]} -eq 0 ]]; then
+            json_docs="[]"
+        else
+            json_docs=$(for d in "${docs[@]}"; do printf '"%s",' "$(json_escape "$d")"; done)
+            json_docs="[${json_docs%,}]"
+        fi
+        printf '{"FEATURE_DIR":"%s","AVAILABLE_DOCS":%s}\n' "$(json_escape "$FEATURE_DIR")" "$json_docs"
+    fi
+else
+    # Text output
+    echo "FEATURE_DIR:$FEATURE_DIR"
+    echo "AVAILABLE_DOCS:"
+    
+    # Show status of each potential document
+    check_file "$RESEARCH" "research.md"
+    check_file "$DATA_MODEL" "data-model.md"
+    check_dir "$CONTRACTS_DIR" "contracts/"
+    check_file "$QUICKSTART" "quickstart.md"
+    
+    if $INCLUDE_TASKS; then
+        check_file "$TASKS" "tasks.md"
+    fi
+fi
--- a/.specify/scripts/bash/common.sh
+++ b/.specify/scripts/bash/common.sh
@ -0,0 +1,721 @@
+#!/usr/bin/env bash
+# Common functions and variables for all scripts
+
+# Find repository root by searching upward for .specify directory
+# This is the primary marker for spec-kit projects
+find_specify_root() {
+    local dir="${1:-$(pwd)}"
+    # Normalize to absolute path to prevent infinite loop with relative paths
+    # Use -- to handle paths starting with - (e.g., -P, -L)
+    dir="$(cd -- "$dir" 2>/dev/null && pwd)" || return 1
+    local prev_dir=""
+    while true; do
+        if [ -d "$dir/.specify" ]; then
+            echo "$dir"
+            return 0
+        fi
+        # Stop if we've reached filesystem root or dirname stops changing
+        if [ "$dir" = "/" ] || [ "$dir" = "$prev_dir" ]; then
+            break
+        fi
+        prev_dir="$dir"
+        dir="$(dirname "$dir")"
+    done
+    return 1
+}
+
+# Get repository root, prioritizing .specify directory over git
+# This prevents using a parent git repo when spec-kit is initialized in a subdirectory
+get_repo_root() {
+    # First, look for .specify directory (spec-kit's own marker)
+    local specify_root
+    if specify_root=$(find_specify_root); then
+        echo "$specify_root"
+        return
+    fi
+
+    # Fallback to git if no .specify found
+    if git rev-parse --show-toplevel >/dev/null 2>&1; then
+        git rev-parse --show-toplevel
+        return
+    fi
+
+    # Final fallback to script location for non-git repos
+    local script_dir="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+    (cd "$script_dir/../../.." && pwd)
+}
+
+# Get current branch, with fallback for non-git repositories
+get_current_branch() {
+    # First check if SPECIFY_FEATURE environment variable is set
+    if [[ -n "${SPECIFY_FEATURE:-}" ]]; then
+        echo "$SPECIFY_FEATURE"
+        return
+    fi
+
+    # Then check git if available at the spec-kit root (not parent)
+    local repo_root=$(get_repo_root)
+    if has_git; then
+        git -C "$repo_root" rev-parse --abbrev-ref HEAD
+        return
+    fi
+
+    # For non-git repos, try to find the latest feature directory
+    local specs_dir="$repo_root/specs"
+
+    if [[ -d "$specs_dir" ]]; then
+        local latest_feature=""
+        local highest=0
+        local latest_timestamp=""
+
+        for dir in "$specs_dir"/*; do
+            if [[ -d "$dir" ]]; then
+                local dirname=$(basename "$dir")
+                if [[ "$dirname" =~ ^([0-9]{8}-[0-9]{6})- ]]; then
+                    # Timestamp-based branch: compare lexicographically
+                    local ts="${BASH_REMATCH[1]}"
+                    if [[ "$ts" > "$latest_timestamp" ]]; then
+                        latest_timestamp="$ts"
+                        latest_feature=$dirname
+                    fi
+                elif [[ "$dirname" =~ ^([0-9]{3,})- ]]; then
+                    local number=${BASH_REMATCH[1]}
+                    number=$((10#$number))
+                    if [[ "$number" -gt "$highest" ]]; then
+                        highest=$number
+                        # Only update if no timestamp branch found yet
+                        if [[ -z "$latest_timestamp" ]]; then
+                            latest_feature=$dirname
+                        fi
+                    fi
+                fi
+            fi
+        done
+
+        if [[ -n "$latest_feature" ]]; then
+            echo "$latest_feature"
+            return
+        fi
+    fi
+
+    echo "main"  # Final fallback
+}
+
+# Check if we have git available at the spec-kit root level
+# Returns true only if git is installed and the repo root is inside a git work tree
+# Handles both regular repos (.git directory) and worktrees/submodules (.git file)
+has_git() {
+    # First check if git command is available (before calling get_repo_root which may use git)
+    command -v git >/dev/null 2>&1 || return 1
+    local repo_root=$(get_repo_root)
+    # Check if .git exists (directory or file for worktrees/submodules)
+    [ -e "$repo_root/.git" ] || return 1
+    # Verify it's actually a valid git work tree
+    git -C "$repo_root" rev-parse --is-inside-work-tree >/dev/null 2>&1
+}
+
+# Strip a single optional path segment (e.g. gitflow "feat/004-name" -> "004-name").
+# Only when the full name is exactly two slash-free segments; otherwise returns the raw name.
+spec_kit_effective_branch_name() {
+    local raw="$1"
+    if [[ "$raw" =~ ^([^/]+)/([^/]+)$ ]]; then
+        printf '%s\n' "${BASH_REMATCH[2]}"
+    else
+        printf '%s\n' "$raw"
+    fi
+}
+
+check_feature_branch() {
+    local raw="$1"
+    local has_git_repo="$2"
+
+    # For non-git repos, we can't enforce branch naming but still provide output
+    if [[ "$has_git_repo" != "true" ]]; then
+        echo "[specify] Warning: Git repository not detected; skipped branch validation" >&2
+        return 0
+    fi
+
+    local branch
+    branch=$(spec_kit_effective_branch_name "$raw")
+
+    # Accept sequential prefix (3+ digits) but exclude malformed timestamps
+    # Malformed: 7-digit date + 6-digit time followed by a slug (e.g. "2026031-143022-name"), or 7-or-8 digit date + 6-digit time with no trailing slug (e.g. "2026031-143022" or "20260319-143022")
+    local is_sequential=false
+    if [[ "$branch" =~ ^[0-9]{3,}- ]] && [[ ! "$branch" =~ ^[0-9]{7}-[0-9]{6}- ]] && [[ ! "$branch" =~ ^[0-9]{7,8}-[0-9]{6}$ ]]; then
+        is_sequential=true
+    fi
+    if [[ "$is_sequential" != "true" ]] && [[ ! "$branch" =~ ^[0-9]{8}-[0-9]{6}- ]]; then
+        echo "ERROR: Not on a feature branch. Current branch: $raw" >&2
+        echo "Feature branches should be named like: 001-feature-name, 1234-feature-name, or 20260319-143022-feature-name" >&2
+        return 1
+    fi
+
+    return 0
+}
+
+# Safely read .specify/feature.json's "feature_directory" value.
+# Prints the raw value (possibly relative) to stdout, or empty string if the file
+# is missing, unparseable, or does not contain the key. Always returns 0 so callers
+# under `set -e` cannot be aborted by parser failure.
+# Parser order mirrors the historical get_feature_paths behavior: jq -> python3 -> grep/sed.
+read_feature_json_feature_directory() {
+    local repo_root="$1"
+    local fj="$repo_root/.specify/feature.json"
+    [[ -f "$fj" ]] || { printf '%s' ''; return 0; }
+
+    local _fd=''
+    if command -v jq >/dev/null 2>&1; then
+        if ! _fd=$(jq -r '.feature_directory // empty' "$fj" 2>/dev/null); then
+            _fd=''
+        fi
+    elif command -v python3 >/dev/null 2>&1; then
+        # Use Python so pretty-printed/multi-line JSON still parses correctly.
+        if ! _fd=$(python3 -c "import json,sys; d=json.load(open(sys.argv[1])); v=d.get('feature_directory'); print(v if v else '')" "$fj" 2>/dev/null); then
+            _fd=''
+        fi
+    else
+        # Last-resort single-line grep/sed fallback. The `|| true` guards against
+        # grep returning 1 (no match) aborting under `set -e` / `pipefail`.
+        _fd=$( { grep -E '"feature_directory"[[:space:]]*:' "$fj" 2>/dev/null || true; } \
+            | head -n 1 \
+            | sed -E 's/^[^:]*:[[:space:]]*"([^"]*)".*$/\1/' )
+    fi
+
+    printf '%s' "$_fd"
+    return 0
+}
+
+# Returns 0 when .specify/feature.json lists a feature_directory value that exists as a directory
+# and matches the resolved active FEATURE_DIR (so /speckit-plan can skip git branch pattern checks).
+# Delegates parsing to read_feature_json_feature_directory, which is safe under `set -e`.
+feature_json_matches_feature_dir() {
+    local repo_root="$1"
+    local active_feature_dir="$2"
+
+    local _fd
+    _fd=$(read_feature_json_feature_directory "$repo_root")
+
+    [[ -n "$_fd" ]] || return 1
+    [[ "$_fd" != /* ]] && _fd="$repo_root/$_fd"
+    [[ -d "$_fd" ]] || return 1
+
+    local norm_json norm_active
+    norm_json="$(cd -- "$_fd" 2>/dev/null && pwd -P)" || return 1
+    norm_active="$(cd -- "$active_feature_dir" 2>/dev/null && pwd -P)" || return 1
+
+    [[ "$norm_json" == "$norm_active" ]]
+}
+
+# Find feature directory by numeric prefix instead of exact branch match
+# This allows multiple branches to work on the same spec (e.g., 004-fix-bug, 004-add-feature)
+find_feature_dir_by_prefix() {
+    local repo_root="$1"
+    local branch_name
+    branch_name=$(spec_kit_effective_branch_name "$2")
+    local specs_dir="$repo_root/specs"
+
+    # Extract prefix from branch (e.g., "004" from "004-whatever" or "20260319-143022" from timestamp branches)
+    local prefix=""
+    if [[ "$branch_name" =~ ^([0-9]{8}-[0-9]{6})- ]]; then
+        prefix="${BASH_REMATCH[1]}"
+    elif [[ "$branch_name" =~ ^([0-9]{3,})- ]]; then
+        prefix="${BASH_REMATCH[1]}"
+    else
+        # If branch doesn't have a recognized prefix, fall back to exact match
+        echo "$specs_dir/$branch_name"
+        return
+    fi
+
+    # Search for directories in specs/ that start with this prefix
+    local matches=()
+    if [[ -d "$specs_dir" ]]; then
+        for dir in "$specs_dir"/"$prefix"-*; do
+            if [[ -d "$dir" ]]; then
+                matches+=("$(basename "$dir")")
+            fi
+        done
+    fi
+
+    # Handle results
+    if [[ ${#matches[@]} -eq 0 ]]; then
+        # No match found - return the branch name path (will fail later with clear error)
+        echo "$specs_dir/$branch_name"
+    elif [[ ${#matches[@]} -eq 1 ]]; then
+        # Exactly one match - perfect!
+        echo "$specs_dir/${matches[0]}"
+    else
+        # Multiple matches - this shouldn't happen with a proper naming convention
+        echo "ERROR: Multiple spec directories found with prefix '$prefix': ${matches[*]}" >&2
+        echo "Please ensure only one spec directory exists per prefix." >&2
+        return 1
+    fi
+}
+
+get_feature_paths() {
+    local repo_root=$(get_repo_root)
+    local current_branch=$(get_current_branch)
+    local has_git_repo="false"
+
+    if has_git; then
+        has_git_repo="true"
+    fi
+
+    # Resolve feature directory.  Priority:
+    #   1. SPECIFY_FEATURE_DIRECTORY env var (explicit override)
+    #   2. .specify/feature.json "feature_directory" key (persisted by /speckit-specify)
+    #   3. Branch-name-based prefix lookup (legacy fallback)
+    local feature_dir
+    if [[ -n "${SPECIFY_FEATURE_DIRECTORY:-}" ]]; then
+        feature_dir="$SPECIFY_FEATURE_DIRECTORY"
+        # Normalize relative paths to absolute under repo root
+        [[ "$feature_dir" != /* ]] && feature_dir="$repo_root/$feature_dir"
+    elif [[ -f "$repo_root/.specify/feature.json" ]]; then
+        # Shared, set -e-safe parser: jq -> python3 -> grep/sed. Returns empty on
+        # missing/unparseable/unset so we fall through to the branch-prefix lookup.
+        local _fd
+        _fd=$(read_feature_json_feature_directory "$repo_root")
+        if [[ -n "$_fd" ]]; then
+            feature_dir="$_fd"
+            # Normalize relative paths to absolute under repo root
+            [[ "$feature_dir" != /* ]] && feature_dir="$repo_root/$feature_dir"
+        elif ! feature_dir=$(find_feature_dir_by_prefix "$repo_root" "$current_branch"); then
+            echo "ERROR: Failed to resolve feature directory" >&2
+            return 1
+        fi
+    elif ! feature_dir=$(find_feature_dir_by_prefix "$repo_root" "$current_branch"); then
+        echo "ERROR: Failed to resolve feature directory" >&2
+        return 1
+    fi
+
+    # Use printf '%q' to safely quote values, preventing shell injection
+    # via crafted branch names or paths containing special characters
+    printf 'REPO_ROOT=%q\n' "$repo_root"
+    printf 'CURRENT_BRANCH=%q\n' "$current_branch"
+    printf 'HAS_GIT=%q\n' "$has_git_repo"
+    printf 'FEATURE_DIR=%q\n' "$feature_dir"
+    printf 'FEATURE_SPEC=%q\n' "$feature_dir/spec.md"
+    printf 'IMPL_PLAN=%q\n' "$feature_dir/plan.md"
+    printf 'TASKS=%q\n' "$feature_dir/tasks.md"
+    printf 'RESEARCH=%q\n' "$feature_dir/research.md"
+    printf 'DATA_MODEL=%q\n' "$feature_dir/data-model.md"
+    printf 'QUICKSTART=%q\n' "$feature_dir/quickstart.md"
+    printf 'CONTRACTS_DIR=%q\n' "$feature_dir/contracts"
+}
+
+# Check if jq is available for safe JSON construction
+has_jq() {
+    command -v jq >/dev/null 2>&1
+}
+
+get_invoke_separator() {
+    local repo_root="${1:-$(get_repo_root)}"
+    if [[ "${_SPECIFY_INVOKE_SEPARATOR_CACHE_REPO_ROOT:-}" == "$repo_root" && -n "${_SPECIFY_INVOKE_SEPARATOR_CACHE_VALUE:-}" ]]; then
+        printf '%s\n' "$_SPECIFY_INVOKE_SEPARATOR_CACHE_VALUE"
+        return 0
+    fi
+
+    local integration_json="$repo_root/.specify/integration.json"
+    local separator="."
+    local parsed_with_jq=0
+
+    if [[ -f "$integration_json" ]]; then
+        if command -v jq >/dev/null 2>&1; then
+            local jq_separator
+            if jq_separator=$(jq -r '(.default_integration // .integration // "") as $k | if $k == "" then "." else (.integration_settings[$k].invoke_separator // ".") end' "$integration_json" 2>/dev/null); then
+                parsed_with_jq=1
+                case "$jq_separator" in
+                    "."|"-") separator="$jq_separator" ;;
+                esac
+            fi
+        fi
+
+        if [[ "$parsed_with_jq" -eq 0 ]] && command -v python3 >/dev/null 2>&1; then
+            if separator=$(python3 - "$integration_json" <<'PY' 2>/dev/null
+import json
+import sys
+
+try:
+    with open(sys.argv[1], encoding="utf-8") as fh:
+        state = json.load(fh)
+    key = state.get("default_integration") or state.get("integration") or ""
+    settings = state.get("integration_settings")
+    separator = "."
+    if isinstance(key, str) and isinstance(settings, dict):
+        entry = settings.get(key)
+        if isinstance(entry, dict) and entry.get("invoke_separator") in {".", "-"}:
+            separator = entry["invoke_separator"]
+    print(separator)
+except Exception:
+    print(".")
+PY
+); then
+                case "$separator" in
+                    "."|"-") ;;
+                    *) separator="." ;;
+                esac
+            else
+                separator="."
+            fi
+        fi
+    fi
+
+    _SPECIFY_INVOKE_SEPARATOR_CACHE_REPO_ROOT="$repo_root"
+    _SPECIFY_INVOKE_SEPARATOR_CACHE_VALUE="$separator"
+    printf '%s\n' "$separator"
+}
+
+format_speckit_command() {
+    local command_name="$1"
+    local repo_root="${2:-$(get_repo_root)}"
+    local separator
+    if [[ "${_SPECIFY_INVOKE_SEPARATOR_CACHE_REPO_ROOT:-}" == "$repo_root" && -n "${_SPECIFY_INVOKE_SEPARATOR_CACHE_VALUE:-}" ]]; then
+        separator="$_SPECIFY_INVOKE_SEPARATOR_CACHE_VALUE"
+    else
+        separator=$(get_invoke_separator "$repo_root")
+        _SPECIFY_INVOKE_SEPARATOR_CACHE_REPO_ROOT="$repo_root"
+        _SPECIFY_INVOKE_SEPARATOR_CACHE_VALUE="$separator"
+    fi
+
+    command_name="${command_name#/}"
+    command_name="${command_name#speckit.}"
+    command_name="${command_name#speckit-}"
+    command_name="${command_name//./$separator}"
+
+    printf '/speckit%s%s\n' "$separator" "$command_name"
+}
+
+# Escape a string for safe embedding in a JSON value (fallback when jq is unavailable).
+# Handles backslash, double-quote, and JSON-required control character escapes (RFC 8259).
+json_escape() {
+    local s="$1"
+    s="${s//\\/\\\\}"
+    s="${s//\"/\\\"}"
+    s="${s//$'\n'/\\n}"
+    s="${s//$'\t'/\\t}"
+    s="${s//$'\r'/\\r}"
+    s="${s//$'\b'/\\b}"
+    s="${s//$'\f'/\\f}"
+    # Escape any remaining U+0001-U+001F control characters as \uXXXX.
+    # (U+0000/NUL cannot appear in bash strings and is excluded.)
+    # LC_ALL=C ensures ${#s} counts bytes and ${s:$i:1} yields single bytes,
+    # so multi-byte UTF-8 sequences (first byte >= 0xC0) pass through intact.
+    local LC_ALL=C
+    local i char code
+    for (( i=0; i<${#s}; i++ )); do
+        char="${s:$i:1}"
+        printf -v code '%d' "'$char" 2>/dev/null || code=256
+        if (( code >= 1 && code <= 31 )); then
+            printf '\\u%04x' "$code"
+        else
+            printf '%s' "$char"
+        fi
+    done
+}
+
+check_file() { [[ -f "$1" ]] && echo "  ✓ $2" || echo "  ✗ $2"; }
+check_dir() { [[ -d "$1" && -n $(ls -A "$1" 2>/dev/null) ]] && echo "  ✓ $2" || echo "  ✗ $2"; }
+
+# Resolve a template name to a file path using the priority stack:
+#   1. .specify/templates/overrides/
+#   2. .specify/presets/<preset-id>/templates/ (sorted by priority from .registry)
+#   3. .specify/extensions/<ext-id>/templates/
+#   4. .specify/templates/ (core)
+resolve_template() {
+    local template_name="$1"
+    local repo_root="$2"
+    local base="$repo_root/.specify/templates"
+
+    # Priority 1: Project overrides
+    local override="$base/overrides/${template_name}.md"
+    [ -f "$override" ] && echo "$override" && return 0
+
+    # Priority 2: Installed presets (sorted by priority from .registry)
+    local presets_dir="$repo_root/.specify/presets"
+    if [ -d "$presets_dir" ]; then
+        local registry_file="$presets_dir/.registry"
+        if [ -f "$registry_file" ] && command -v python3 >/dev/null 2>&1; then
+            # Read preset IDs sorted by priority (lower number = higher precedence).
+            # The python3 call is wrapped in an if-condition so that set -e does not
+            # abort the function when python3 exits non-zero (e.g. invalid JSON).
+            local sorted_presets=""
+            if sorted_presets=$(SPECKIT_REGISTRY="$registry_file" python3 -c "
+import json, sys, os
+try:
+    with open(os.environ['SPECKIT_REGISTRY']) as f:
+        data = json.load(f)
+    presets = data.get('presets', {})
+    for pid, meta in sorted(presets.items(), key=lambda x: x[1].get('priority', 10) if isinstance(x[1], dict) else 10):
+        if isinstance(meta, dict) and meta.get('enabled', True) is not False:
+            print(pid)
+except Exception:
+    sys.exit(1)
+" 2>/dev/null); then
+                if [ -n "$sorted_presets" ]; then
+                    # python3 succeeded and returned preset IDs — search in priority order
+                    while IFS= read -r preset_id; do
+                        local candidate="$presets_dir/$preset_id/templates/${template_name}.md"
+                        [ -f "$candidate" ] && echo "$candidate" && return 0
+                    done <<< "$sorted_presets"
+                fi
+                # python3 succeeded but registry has no presets — nothing to search
+            else
+                # python3 exited non-zero (e.g. registry parse error); fall back to an alphabetical directory scan that ignores priorities
+                for preset in "$presets_dir"/*/; do
+                    [ -d "$preset" ] || continue
+                    local candidate="$preset/templates/${template_name}.md"
+                    [ -f "$candidate" ] && echo "$candidate" && return 0
+                done
+            fi
+        else
+            # Fallback: alphabetical directory order (python3 or the .registry file is unavailable)
+            for preset in "$presets_dir"/*/; do
+                [ -d "$preset" ] || continue
+                local candidate="$preset/templates/${template_name}.md"
+                [ -f "$candidate" ] && echo "$candidate" && return 0
+            done
+        fi
+    fi
+
+    # Priority 3: Extension-provided templates
+    local ext_dir="$repo_root/.specify/extensions"
+    if [ -d "$ext_dir" ]; then
+        for ext in "$ext_dir"/*/; do
+            [ -d "$ext" ] || continue
+            # Skip hidden directories (e.g. .backup, .cache)
+            case "$(basename "$ext")" in .*) continue;; esac
+            local candidate="$ext/templates/${template_name}.md"
+            [ -f "$candidate" ] && echo "$candidate" && return 0
+        done
+    fi
+
+    # Priority 4: Core templates
+    local core="$base/${template_name}.md"
+    [ -f "$core" ] && echo "$core" && return 0
+
+    # Template not found in any location.
+    # Return 1 so callers can distinguish "not found" from "found".
+    # Callers running under set -e should use: TEMPLATE=$(resolve_template ...) || true
+    return 1
+}
+
+# Resolve a template name to composed content using composition strategies.
+# Reads strategy metadata from preset manifests and composes content
+# from multiple layers using prepend, append, or wrap strategies.
+#
+# Usage: CONTENT=$(resolve_template_content "template-name" "$REPO_ROOT")
+# Returns composed content string on stdout; exit code 1 if not found.
+resolve_template_content() {
+    local template_name="$1"
+    local repo_root="$2"
+    local base="$repo_root/.specify/templates"
+
+    # Collect all layers (highest priority first)
+    local -a layer_paths=()
+    local -a layer_strategies=()
+
+    # Priority 1: Project overrides (always "replace")
+    local override="$base/overrides/${template_name}.md"
+    if [ -f "$override" ]; then
+        layer_paths+=("$override")
+        layer_strategies+=("replace")
+    fi
+
+    # Priority 2: Installed presets (sorted by priority from .registry)
+    local presets_dir="$repo_root/.specify/presets"
+    if [ -d "$presets_dir" ]; then
+        local registry_file="$presets_dir/.registry"
+        local sorted_presets=""
+        if [ -f "$registry_file" ] && command -v python3 >/dev/null 2>&1; then
+            if sorted_presets=$(SPECKIT_REGISTRY="$registry_file" python3 -c "
+import json, sys, os
+try:
+    with open(os.environ['SPECKIT_REGISTRY']) as f:
+        data = json.load(f)
+    presets = data.get('presets', {})
+    for pid, meta in sorted(presets.items(), key=lambda x: x[1].get('priority', 10) if isinstance(x[1], dict) else 10):
+        if isinstance(meta, dict) and meta.get('enabled', True) is not False:
+            print(pid)
+except Exception:
+    sys.exit(1)
+" 2>/dev/null); then
+                if [ -n "$sorted_presets" ]; then
+                    local yaml_warned=false
+                    while IFS= read -r preset_id; do
+                        # Read strategy and file path from preset manifest
+                        local strategy="replace"
+                        local manifest_file=""
+                        local manifest="$presets_dir/$preset_id/preset.yml"
+                        if [ -f "$manifest" ] && command -v python3 >/dev/null 2>&1; then
+                            # Requires PyYAML; falls back to replace/convention if unavailable
+                            local result
+                            local py_stderr
+                            py_stderr=$(mktemp)
+                            result=$(SPECKIT_MANIFEST="$manifest" SPECKIT_TMPL="$template_name" python3 -c "
+import sys, os
+try:
+    import yaml
+except ImportError:
+    print('yaml_missing', file=sys.stderr)
+    print('replace\t')
+    sys.exit(0)
+try:
+    with open(os.environ['SPECKIT_MANIFEST']) as f:
+        data = yaml.safe_load(f)
+    for t in data.get('provides', {}).get('templates', []):
+        if t.get('name') == os.environ['SPECKIT_TMPL'] and t.get('type', 'template') == 'template':
+            print(t.get('strategy', 'replace') + '\t' + t.get('file', ''))
+            sys.exit(0)
+    print('replace\t')
+except Exception:
+    print('replace\t')
+" 2>"$py_stderr")
+                            local parse_status=$?
+                            if [ $parse_status -eq 0 ] && [ -n "$result" ]; then
+                                IFS=$'\t' read -r strategy manifest_file <<< "$result"
+                                strategy=$(printf '%s' "$strategy" | tr '[:upper:]' '[:lower:]')
+                            fi
+                            if [ "$yaml_warned" = false ] && grep -q 'yaml_missing' "$py_stderr" 2>/dev/null; then
+                                echo "Warning: PyYAML not available; composition strategies may be ignored" >&2
+                                yaml_warned=true
+                            fi
+                            rm -f "$py_stderr"
+                        fi
+                        # Try manifest file path first, then convention path
+                        local candidate=""
+                        if [ -n "$manifest_file" ]; then
+                            # Reject absolute paths and parent traversal
+                            case "$manifest_file" in
+                                /*|..|../*|*/..|*/../*) manifest_file="" ;;
+                            esac
+                        fi
+                        if [ -n "$manifest_file" ]; then
+                            local mf="$presets_dir/$preset_id/$manifest_file"
+                            [ -f "$mf" ] && candidate="$mf"
+                        fi
+                        if [ -z "$candidate" ]; then
+                            local cf="$presets_dir/$preset_id/templates/${template_name}.md"
+                            [ -f "$cf" ] && candidate="$cf"
+                        fi
+                        if [ -n "$candidate" ]; then
+                            layer_paths+=("$candidate")
+                            layer_strategies+=("$strategy")
+                        fi
+                    done <<< "$sorted_presets"
+                fi
+            else
+                # python3 failed — fall back to unordered directory scan (replace only)
+                for preset in "$presets_dir"/*/; do
+                    [ -d "$preset" ] || continue
+                    local candidate="$preset/templates/${template_name}.md"
+                    if [ -f "$candidate" ]; then
+                        layer_paths+=("$candidate")
+                        layer_strategies+=("replace")
+                    fi
+                done
+            fi
+        else
+            # No python3 or registry — fall back to unordered directory scan (replace only)
+            for preset in "$presets_dir"/*/; do
+                [ -d "$preset" ] || continue
+                local candidate="$preset/templates/${template_name}.md"
+                if [ -f "$candidate" ]; then
+                    layer_paths+=("$candidate")
+                    layer_strategies+=("replace")
+                fi
+            done
+        fi
+    fi
+
+    # Priority 3: Extension-provided templates (always "replace")
+    local ext_dir="$repo_root/.specify/extensions"
+    if [ -d "$ext_dir" ]; then
+        for ext in "$ext_dir"/*/; do
+            [ -d "$ext" ] || continue
+            case "$(basename "$ext")" in .*) continue;; esac
+            local candidate="$ext/templates/${template_name}.md"
+            if [ -f "$candidate" ]; then
+                layer_paths+=("$candidate")
+                layer_strategies+=("replace")
+            fi
+        done
+    fi
+
+    # Priority 4: Core templates (always "replace")
+    local core="$base/${template_name}.md"
+    if [ -f "$core" ]; then
+        layer_paths+=("$core")
+        layer_strategies+=("replace")
+    fi
+
+    local count=${#layer_paths[@]}
+    [ "$count" -eq 0 ] && return 1
+
+    # Check if any layer uses a non-replace strategy
+    local has_composition=false
+    for s in "${layer_strategies[@]}"; do
+        [ "$s" != "replace" ] && has_composition=true && break
+    done
+
+    # If the top (highest-priority) layer is replace, it wins entirely —
+    # lower layers are irrelevant regardless of their strategies.
+    if [ "${layer_strategies[0]}" = "replace" ]; then
+        cat "${layer_paths[0]}"
+        return 0
+    fi
+
+    if [ "$has_composition" = false ]; then
+        cat "${layer_paths[0]}"
+        return 0
+    fi
+
+    # Find the effective base: scan from highest priority (index 0) downward
+    # to find the nearest replace layer. Only compose layers above that base.
+    local base_idx=-1
+    local i
+    for (( i=0; i<count; i++ )); do
+        if [ "${layer_strategies[$i]}" = "replace" ]; then
+            base_idx=$i
+            break
+        fi
+    done
+
+    if [ $base_idx -lt 0 ]; then
+        return 1  # no base layer found
+    fi
+
+    # Read the base content; compose layers above the base (higher priority)
+    local content
+    content=$(cat "${layer_paths[$base_idx]}"; printf x)
+    content="${content%x}"
+
+    for (( i=base_idx-1; i>=0; i-- )); do
+        local path="${layer_paths[$i]}"
+        local strat="${layer_strategies[$i]}"
+        local layer_content
+        # Preserve trailing newlines
+        layer_content=$(cat "$path"; printf x)
+        layer_content="${layer_content%x}"
+
+        case "$strat" in
+            replace) content="$layer_content" ;;
+            prepend) content="$(printf '%s\n\n%s' "$layer_content" "$content")" ;;
+            append)  content="$(printf '%s\n\n%s' "$content" "$layer_content")" ;;
+            wrap)
+                case "$layer_content" in
+                    *'{CORE_TEMPLATE}'*) ;;
+                    *) echo "Error: wrap strategy missing {CORE_TEMPLATE} placeholder" >&2; return 1 ;;
+                esac
+                while [[ "$layer_content" == *'{CORE_TEMPLATE}'* ]]; do
+                    local before="${layer_content%%\{CORE_TEMPLATE\}*}"
+                    local after="${layer_content#*\{CORE_TEMPLATE\}}"
+                    layer_content="${before}${content}${after}"
+                done
+                content="$layer_content"
+                ;;
+            *) echo "Error: unknown strategy '$strat'" >&2; return 1 ;;
+        esac
+    done
+
+    printf '%s' "$content"
+    return 0
+}
--- a/.specify/scripts/bash/create-new-feature.sh
+++ b/.specify/scripts/bash/create-new-feature.sh
@ -0,0 +1,413 @@
+#!/usr/bin/env bash
+
+set -e
+
+JSON_MODE=false
+DRY_RUN=false
+ALLOW_EXISTING=false
+SHORT_NAME=""
+BRANCH_NUMBER=""
+USE_TIMESTAMP=false
+ARGS=()
+i=1
+while [ $i -le $# ]; do
+    arg="${!i}"
+    case "$arg" in
+        --json)
+            JSON_MODE=true
+            ;;
+        --dry-run)
+            DRY_RUN=true
+            ;;
+        --allow-existing-branch)
+            ALLOW_EXISTING=true
+            ;;
+        --short-name)
+            if [ $((i + 1)) -gt $# ]; then
+                echo 'Error: --short-name requires a value' >&2
+                exit 1
+            fi
+            i=$((i + 1))
+            next_arg="${!i}"
+            # Check if the next argument is another option (starts with --)
+            if [[ "$next_arg" == --* ]]; then
+                echo 'Error: --short-name requires a value' >&2
+                exit 1
+            fi
+            SHORT_NAME="$next_arg"
+            ;;
+        --number)
+            if [ $((i + 1)) -gt $# ]; then
+                echo 'Error: --number requires a value' >&2
+                exit 1
+            fi
+            i=$((i + 1))
+            next_arg="${!i}"
+            if [[ "$next_arg" == --* ]]; then
+                echo 'Error: --number requires a value' >&2
+                exit 1
+            fi
+            BRANCH_NUMBER="$next_arg"
+            ;;
+        --timestamp)
+            USE_TIMESTAMP=true
+            ;;
+        --help|-h)
+            echo "Usage: $0 [--json] [--dry-run] [--allow-existing-branch] [--short-name <name>] [--number N] [--timestamp] <feature_description>"
+            echo ""
+            echo "Options:"
+            echo "  --json              Output in JSON format"
+            echo "  --dry-run           Compute branch name and paths without creating branches, directories, or files"
+            echo "  --allow-existing-branch  Switch to branch if it already exists instead of failing"
+            echo "  --short-name <name> Provide a custom short name (2-4 words) for the branch"
+            echo "  --number N          Specify branch number manually (overrides auto-detection)"
+            echo "  --timestamp         Use timestamp prefix (YYYYMMDD-HHMMSS) instead of sequential numbering"
+            echo "  --help, -h          Show this help message"
+            echo ""
+            echo "Examples:"
+            echo "  $0 'Add user authentication system' --short-name 'user-auth'"
+            echo "  $0 'Implement OAuth2 integration for API' --number 5"
+            echo "  $0 --timestamp --short-name 'user-auth' 'Add user authentication'"
+            exit 0
+            ;;
+        *)
+            ARGS+=("$arg")
+            ;;
+    esac
+    i=$((i + 1))
+done
+
+FEATURE_DESCRIPTION="${ARGS[*]}"
+if [ -z "$FEATURE_DESCRIPTION" ]; then
+    echo "Usage: $0 [--json] [--dry-run] [--allow-existing-branch] [--short-name <name>] [--number N] [--timestamp] <feature_description>" >&2
+    exit 1
+fi
+
+# Trim whitespace and validate description is not empty (e.g., user passed only whitespace)
+FEATURE_DESCRIPTION=$(echo "$FEATURE_DESCRIPTION" | sed -E 's/^[[:space:]]+|[[:space:]]+$//g')
+if [ -z "$FEATURE_DESCRIPTION" ]; then
+    echo "Error: Feature description cannot be empty or contain only whitespace" >&2
+    exit 1
+fi
+
+# Function to get highest number from specs directory
+get_highest_from_specs() {
+    local specs_dir="$1"
+    local highest=0
+    
+    if [ -d "$specs_dir" ]; then
+        for dir in "$specs_dir"/*; do
+            [ -d "$dir" ] || continue
+            dirname=$(basename "$dir")
+            # Match sequential prefixes (>=3 digits), but skip timestamp dirs.
+            if echo "$dirname" | grep -Eq '^[0-9]{3,}-' && ! echo "$dirname" | grep -Eq '^[0-9]{8}-[0-9]{6}-'; then
+                number=$(echo "$dirname" | grep -Eo '^[0-9]+')
+                number=$((10#$number))
+                if [ "$number" -gt "$highest" ]; then
+                    highest=$number
+                fi
+            fi
+        done
+    fi
+    
+    echo "$highest"
+}
+
+# Function to get highest number from git branches
+get_highest_from_branches() {
+    git branch -a 2>/dev/null | sed 's/^[* ]*//; s|^remotes/[^/]*/||' | _extract_highest_number
+}
+
+# Extract the highest sequential feature number from a list of ref names (one per line).
+# Shared by get_highest_from_branches and get_highest_from_remote_refs.
+_extract_highest_number() {
+    local highest=0
+    while IFS= read -r name; do
+        [ -z "$name" ] && continue
+        if echo "$name" | grep -Eq '^[0-9]{3,}-' && ! echo "$name" | grep -Eq '^[0-9]{8}-[0-9]{6}-'; then
+            number=$(echo "$name" | grep -Eo '^[0-9]+' || echo "0")
+            number=$((10#$number))
+            if [ "$number" -gt "$highest" ]; then
+                highest=$number
+            fi
+        fi
+    done
+    echo "$highest"
+}
+
+# Function to get highest number from remote branches without fetching (side-effect-free)
+get_highest_from_remote_refs() {
+    local highest=0
+
+    for remote in $(git remote 2>/dev/null); do
+        local remote_highest
+        remote_highest=$(GIT_TERMINAL_PROMPT=0 git ls-remote --heads "$remote" 2>/dev/null | sed 's|.*refs/heads/||' | _extract_highest_number)
+        if [ "$remote_highest" -gt "$highest" ]; then
+            highest=$remote_highest
+        fi
+    done
+
+    echo "$highest"
+}
+
+# Function to check existing branches (local and remote) and return next available number.
+# When skip_fetch is true, queries remotes via ls-remote (read-only) instead of fetching.
+check_existing_branches() {
+    local specs_dir="$1"
+    local skip_fetch="${2:-false}"
+
+    if [ "$skip_fetch" = true ]; then
+        # Side-effect-free: query remotes via ls-remote
+        local highest_remote=$(get_highest_from_remote_refs)
+        local highest_branch=$(get_highest_from_branches)
+        if [ "$highest_remote" -gt "$highest_branch" ]; then
+            highest_branch=$highest_remote
+        fi
+    else
+        # Fetch all remotes to get latest branch info (suppress errors if no remotes)
+        git fetch --all --prune >/dev/null 2>&1 || true
+        local highest_branch=$(get_highest_from_branches)
+    fi
+
+    # Get highest number from ALL specs (not just matching short name)
+    local highest_spec=$(get_highest_from_specs "$specs_dir")
+
+    # Take the maximum of both
+    local max_num=$highest_branch
+    if [ "$highest_spec" -gt "$max_num" ]; then
+        max_num=$highest_spec
+    fi
+
+    # Return next number
+    echo $((max_num + 1))
+}
+
+# Function to clean and format a branch name
+clean_branch_name() {
+    local name="$1"
+    echo "$name" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/-/g' | sed 's/--*/-/g' | sed 's/^-//' | sed 's/-$//'
+}
+
+# Resolve repository root using common.sh functions which prioritize .specify over git
+SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+source "$SCRIPT_DIR/common.sh"
+
+REPO_ROOT=$(get_repo_root)
+
+# Check if git is available at this repo root (not a parent)
+if has_git; then
+    HAS_GIT=true
+else
+    HAS_GIT=false
+fi
+
+cd "$REPO_ROOT"
+
+SPECS_DIR="$REPO_ROOT/specs"
+if [ "$DRY_RUN" != true ]; then
+    mkdir -p "$SPECS_DIR"
+fi
+
+# Function to generate branch name with stop word filtering and length filtering
+generate_branch_name() {
+    local description="$1"
+    
+    # Common stop words to filter out
+    local stop_words="^(i|a|an|the|to|for|of|in|on|at|by|with|from|is|are|was|were|be|been|being|have|has|had|do|does|did|will|would|should|could|can|may|might|must|shall|this|that|these|those|my|your|our|their|want|need|add|get|set)$"
+    
+    # Convert to lowercase and split into words
+    local clean_name=$(echo "$description" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/ /g')
+    
+    # Filter words: remove stop words and words shorter than 3 chars (unless they're uppercase acronyms in original)
+    local meaningful_words=()
+    for word in $clean_name; do
+        # Skip empty words
+        [ -z "$word" ] && continue
+        
+        # Keep words that are NOT stop words AND (length >= 3 OR are potential acronyms)
+        if ! echo "$word" | grep -qiE "$stop_words"; then
+            if [ ${#word} -ge 3 ]; then
+                meaningful_words+=("$word")
+            elif echo "$description" | grep -qw -- "$(printf '%s' "$word" | tr '[:lower:]' '[:upper:]')"; then
+                # Keep short words if they appear as uppercase in original (likely acronyms)
+                meaningful_words+=("$word")
+            fi
+        fi
+    done
+    
+    # If we have meaningful words, use the first 3 (or all 4 when there are exactly 4)
+    if [ ${#meaningful_words[@]} -gt 0 ]; then
+        local max_words=3
+        if [ ${#meaningful_words[@]} -eq 4 ]; then max_words=4; fi
+        
+        local result=""
+        local count=0
+        for word in "${meaningful_words[@]}"; do
+            if [ $count -ge $max_words ]; then break; fi
+            if [ -n "$result" ]; then result="$result-"; fi
+            result="$result$word"
+            count=$((count + 1))
+        done
+        echo "$result"
+    else
+        # Fallback when no meaningful words remain: first 3 hyphen-separated words of the cleaned description
+        local cleaned=$(clean_branch_name "$description")
+        echo "$cleaned" | tr '-' '\n' | grep -v '^$' | head -3 | tr '\n' '-' | sed 's/-$//'
+    fi
+}
+
+# Generate branch name
+if [ -n "$SHORT_NAME" ]; then
+    # Use provided short name, just clean it up
+    BRANCH_SUFFIX=$(clean_branch_name "$SHORT_NAME")
+else
+    # Generate from description with smart filtering
+    BRANCH_SUFFIX=$(generate_branch_name "$FEATURE_DESCRIPTION")
+fi
+
+# Warn if --number and --timestamp are both specified
+if [ "$USE_TIMESTAMP" = true ] && [ -n "$BRANCH_NUMBER" ]; then
+    >&2 echo "[specify] Warning: --number is ignored when --timestamp is used"
+    BRANCH_NUMBER=""
+fi
+
+# Determine branch prefix
+if [ "$USE_TIMESTAMP" = true ]; then
+    FEATURE_NUM=$(date +%Y%m%d-%H%M%S)
+    BRANCH_NAME="${FEATURE_NUM}-${BRANCH_SUFFIX}"
+else
+    # Determine branch number
+    if [ -z "$BRANCH_NUMBER" ]; then
+        if [ "$DRY_RUN" = true ] && [ "$HAS_GIT" = true ]; then
+            # Dry-run: query remotes via ls-remote (side-effect-free, no fetch)
+            BRANCH_NUMBER=$(check_existing_branches "$SPECS_DIR" true)
+        elif [ "$DRY_RUN" = true ]; then
+            # Dry-run without git: local spec dirs only
+            HIGHEST=$(get_highest_from_specs "$SPECS_DIR")
+            BRANCH_NUMBER=$((HIGHEST + 1))
+        elif [ "$HAS_GIT" = true ]; then
+            # Fetch remotes, then derive the next number from local/remote branches and spec dirs
+            BRANCH_NUMBER=$(check_existing_branches "$SPECS_DIR")
+        else
+            # Fall back to local directory check
+            HIGHEST=$(get_highest_from_specs "$SPECS_DIR")
+            BRANCH_NUMBER=$((HIGHEST + 1))
+        fi
+    fi
+
+    # Force base-10 interpretation to prevent octal conversion (e.g., 010 → 8 in octal, but should be 10 in decimal)
+    FEATURE_NUM=$(printf "%03d" "$((10#$BRANCH_NUMBER))")
+    BRANCH_NAME="${FEATURE_NUM}-${BRANCH_SUFFIX}"
+fi
+
+# GitHub enforces a 244-byte limit on branch names
+# Validate and truncate if necessary
+MAX_BRANCH_LENGTH=244
+if [ ${#BRANCH_NAME} -gt $MAX_BRANCH_LENGTH ]; then
+    # Calculate how much we need to trim from suffix
+    # Account for prefix length: timestamp (15) + hyphen (1) = 16, or sequential (3) + hyphen (1) = 4
+    PREFIX_LENGTH=$(( ${#FEATURE_NUM} + 1 ))
+    MAX_SUFFIX_LENGTH=$((MAX_BRANCH_LENGTH - PREFIX_LENGTH))
+    
+    # Truncate the suffix to the maximum allowed length (plain character cut, not word-aware)
+    TRUNCATED_SUFFIX=$(echo "$BRANCH_SUFFIX" | cut -c1-$MAX_SUFFIX_LENGTH)
+    # Remove trailing hyphen if truncation created one
+    TRUNCATED_SUFFIX=$(echo "$TRUNCATED_SUFFIX" | sed 's/-$//')
+    
+    ORIGINAL_BRANCH_NAME="$BRANCH_NAME"
+    BRANCH_NAME="${FEATURE_NUM}-${TRUNCATED_SUFFIX}"
+    
+    >&2 echo "[specify] Warning: Branch name exceeded GitHub's 244-byte limit"
+    >&2 echo "[specify] Original: $ORIGINAL_BRANCH_NAME (${#ORIGINAL_BRANCH_NAME} bytes)"
+    >&2 echo "[specify] Truncated to: $BRANCH_NAME (${#BRANCH_NAME} bytes)"
+fi
+
+FEATURE_DIR="$SPECS_DIR/$BRANCH_NAME"
+SPEC_FILE="$FEATURE_DIR/spec.md"
+
+if [ "$DRY_RUN" != true ]; then
+    if [ "$HAS_GIT" = true ]; then
+        branch_create_error=""
+        if ! branch_create_error=$(git checkout -q -b "$BRANCH_NAME" 2>&1); then
+            current_branch="$(git rev-parse --abbrev-ref HEAD 2>/dev/null || true)"
+            # Check if branch already exists
+            if git branch --list "$BRANCH_NAME" | grep -q .; then
+                if [ "$ALLOW_EXISTING" = true ]; then
+                    # If we're already on the branch, continue without another checkout.
+                    if [ "$current_branch" = "$BRANCH_NAME" ]; then
+                        :
+                    # Otherwise switch to the existing branch instead of failing.
+                    elif ! switch_branch_error=$(git checkout -q "$BRANCH_NAME" 2>&1); then
+                        >&2 echo "Error: Failed to switch to existing branch '$BRANCH_NAME'. Please resolve any local changes or conflicts and try again."
+                        if [ -n "$switch_branch_error" ]; then
+                            >&2 printf '%s\n' "$switch_branch_error"
+                        fi
+                        exit 1
+                    fi
+                elif [ "$USE_TIMESTAMP" = true ]; then
+                    >&2 echo "Error: Branch '$BRANCH_NAME' already exists. Rerun to get a new timestamp or use a different --short-name."
+                    exit 1
+                else
+                    >&2 echo "Error: Branch '$BRANCH_NAME' already exists. Please use a different feature name or specify a different number with --number."
+                    exit 1
+                fi
+            else
+                >&2 echo "Error: Failed to create git branch '$BRANCH_NAME'."
+                if [ -n "$branch_create_error" ]; then
+                    >&2 printf '%s\n' "$branch_create_error"
+                else
+                    >&2 echo "Please check your git configuration and try again."
+                fi
+                exit 1
+            fi
+        fi
+    else
+        >&2 echo "[specify] Warning: Git repository not detected; skipped branch creation for $BRANCH_NAME"
+    fi
+
+    mkdir -p "$FEATURE_DIR"
+
+    if [ ! -f "$SPEC_FILE" ]; then
+        TEMPLATE=$(resolve_template "spec-template" "$REPO_ROOT") || true
+        if [ -n "$TEMPLATE" ] && [ -f "$TEMPLATE" ]; then
+            cp "$TEMPLATE" "$SPEC_FILE"
+        else
+            echo "Warning: Spec template not found; created empty spec file" >&2
+            touch "$SPEC_FILE"
+        fi
+    fi
+
+    # Inform the user how to persist the feature variable in their own shell
+    printf '# To persist: export SPECIFY_FEATURE=%q\n' "$BRANCH_NAME" >&2
+fi
+
+if $JSON_MODE; then
+    if command -v jq >/dev/null 2>&1; then
+        if [ "$DRY_RUN" = true ]; then
+            jq -cn \
+                --arg branch_name "$BRANCH_NAME" \
+                --arg spec_file "$SPEC_FILE" \
+                --arg feature_num "$FEATURE_NUM" \
+                '{BRANCH_NAME:$branch_name,SPEC_FILE:$spec_file,FEATURE_NUM:$feature_num,DRY_RUN:true}'
+        else
+            jq -cn \
+                --arg branch_name "$BRANCH_NAME" \
+                --arg spec_file "$SPEC_FILE" \
+                --arg feature_num "$FEATURE_NUM" \
+                '{BRANCH_NAME:$branch_name,SPEC_FILE:$spec_file,FEATURE_NUM:$feature_num}'
+        fi
+    else
+        if [ "$DRY_RUN" = true ]; then
+            printf '{"BRANCH_NAME":"%s","SPEC_FILE":"%s","FEATURE_NUM":"%s","DRY_RUN":true}\n' "$(json_escape "$BRANCH_NAME")" "$(json_escape "$SPEC_FILE")" "$(json_escape "$FEATURE_NUM")"
+        else
+            printf '{"BRANCH_NAME":"%s","SPEC_FILE":"%s","FEATURE_NUM":"%s"}\n' "$(json_escape "$BRANCH_NAME")" "$(json_escape "$SPEC_FILE")" "$(json_escape "$FEATURE_NUM")"
+        fi
+    fi
+else
+    echo "BRANCH_NAME: $BRANCH_NAME"
+    echo "SPEC_FILE: $SPEC_FILE"
+    echo "FEATURE_NUM: $FEATURE_NUM"
+    if [ "$DRY_RUN" != true ]; then
+        printf '# To persist in your shell: export SPECIFY_FEATURE=%q\n' "$BRANCH_NAME"
+    fi
+fi
--- a/.specify/scripts/bash/setup-plan.sh
+++ b/.specify/scripts/bash/setup-plan.sh
@ -0,0 +1,91 @@
+#!/usr/bin/env bash
+
+set -e
+
+# Parse command line arguments
+JSON_MODE=false
+ARGS=()
+
+for arg in "$@"; do
+    case "$arg" in
+        --json)
+            JSON_MODE=true
+            ;;
+        --help|-h)
+            echo "Usage: $0 [--json]"
+            echo "  --json    Output results in JSON format"
+            echo "  --help    Show this help message"
+            exit 0
+            ;;
+        *)
+            ARGS+=("$arg")
+            ;;
+    esac
+done
+
+# Get script directory and load common functions
+SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+source "$SCRIPT_DIR/common.sh"
+
+# Get all paths and variables from common functions
+_paths_output=$(get_feature_paths) || { echo "ERROR: Failed to resolve feature paths" >&2; exit 1; }
+eval "$_paths_output"
+unset _paths_output
+
+# If feature.json pins an existing feature directory, branch naming is not required.
+if ! feature_json_matches_feature_dir "$REPO_ROOT" "$FEATURE_DIR"; then
+    check_feature_branch "$CURRENT_BRANCH" "$HAS_GIT" || exit 1
+fi
+
+# Ensure the feature directory exists
+mkdir -p "$FEATURE_DIR"
+
+# Copy plan template if plan doesn't already exist
+if [[ -f "$IMPL_PLAN" ]]; then
+    if $JSON_MODE; then
+        echo "Plan already exists at $IMPL_PLAN, skipping template copy" >&2
+    else
+        echo "Plan already exists at $IMPL_PLAN, skipping template copy"
+    fi
+else
+    TEMPLATE=$(resolve_template "plan-template" "$REPO_ROOT") || true
+    if [[ -n "$TEMPLATE" ]] && [[ -f "$TEMPLATE" ]]; then
+        cp "$TEMPLATE" "$IMPL_PLAN"
+        if $JSON_MODE; then
+            echo "Copied plan template to $IMPL_PLAN" >&2
+        else
+            echo "Copied plan template to $IMPL_PLAN"
+        fi
+    else
+        if $JSON_MODE; then
+            echo "Warning: Plan template not found" >&2
+        else
+            echo "Warning: Plan template not found"
+        fi
+        # Create a basic plan file if template doesn't exist
+        touch "$IMPL_PLAN"
+    fi
+fi
+
+# Output results
+if $JSON_MODE; then
+    if has_jq; then
+        jq -cn \
+            --arg feature_spec "$FEATURE_SPEC" \
+            --arg impl_plan "$IMPL_PLAN" \
+            --arg specs_dir "$FEATURE_DIR" \
+            --arg branch "$CURRENT_BRANCH" \
+            --arg has_git "$HAS_GIT" \
+            '{FEATURE_SPEC:$feature_spec,IMPL_PLAN:$impl_plan,SPECS_DIR:$specs_dir,BRANCH:$branch,HAS_GIT:$has_git}'
+    else
+        printf '{"FEATURE_SPEC":"%s","IMPL_PLAN":"%s","SPECS_DIR":"%s","BRANCH":"%s","HAS_GIT":"%s"}\n' \
+            "$(json_escape "$FEATURE_SPEC")" "$(json_escape "$IMPL_PLAN")" "$(json_escape "$FEATURE_DIR")" "$(json_escape "$CURRENT_BRANCH")" "$(json_escape "$HAS_GIT")"
+    fi
+else
+    echo "FEATURE_SPEC: $FEATURE_SPEC"
+    echo "IMPL_PLAN: $IMPL_PLAN"
+    echo "SPECS_DIR: $FEATURE_DIR"
+    echo "BRANCH: $CURRENT_BRANCH"
+    echo "HAS_GIT: $HAS_GIT"
+fi
+
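Review note on `setup-plan.sh`: in `--json` mode the script sends every status message to stderr, so stdout carries only the final JSON object and stays safe to pipe into a parser. The sketch below factors that pattern into a helper. It is a minimal illustration, not part of the commit: `log_status` and `emit_result` are hypothetical names, and the JSON line assumes the path needs no escaping (the real script goes through `jq` or `json_escape`).

```bash
#!/usr/bin/env bash
# Sketch only: log_status and emit_result are illustrative helpers,
# not functions defined in setup-plan.sh or common.sh.
JSON_MODE=${JSON_MODE:-false}

# Print a status message. In JSON mode it goes to stderr so that
# stdout stays machine-readable.
log_status() {
    if [ "$JSON_MODE" = true ]; then
        printf '%s\n' "$1" >&2
    else
        printf '%s\n' "$1"
    fi
}

# Report the plan path in the active output format.
emit_result() {
    log_status "Copied plan template to $1"
    if [ "$JSON_MODE" = true ]; then
        # Assumption: $1 contains no characters that need JSON escaping.
        printf '{"IMPL_PLAN":"%s"}\n' "$1"
    else
        printf 'IMPL_PLAN: %s\n' "$1"
    fi
}
```

With a helper like this, each `if $JSON_MODE; then echo ... >&2; else echo ...; fi` pair in the script could collapse to a single call, at the cost of one more function to maintain.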
--- a/.specify/scripts/bash/setup-tasks.sh
+++ b/.specify/scripts/bash/setup-tasks.sh
@ -0,0 +1,96 @@
+#!/usr/bin/env bash
+
+set -e
+
+# Parse command line arguments
+JSON_MODE=false
+
+for arg in "$@"; do
+    case "$arg" in
+        --json) JSON_MODE=true ;;
+        --help|-h)
+            echo "Usage: $0 [--json]"
+            echo "  --json    Output results in JSON format"
+            echo "  --help    Show this help message"
+            exit 0
+            ;;
+        *) echo "ERROR: Unknown option '$arg'" >&2; exit 1 ;;
+    esac
+done
+
+# Source common functions
+SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+source "$SCRIPT_DIR/common.sh"
+
+# Get feature paths
+_paths_output=$(get_feature_paths) || { echo "ERROR: Failed to resolve feature paths" >&2; exit 1; }
+eval "$_paths_output"
+unset _paths_output
+
+# Validate branch
+# If feature.json pins an existing feature directory, branch naming is not required.
+if ! feature_json_matches_feature_dir "$REPO_ROOT" "$FEATURE_DIR"; then
+    check_feature_branch "$CURRENT_BRANCH" "$HAS_GIT" || exit 1
+fi
+
+if [[ ! -f "$IMPL_PLAN" ]]; then
+    echo "ERROR: plan.md not found in $FEATURE_DIR" >&2
+    echo "Run /speckit-plan first to create the implementation plan." >&2
+    exit 1
+fi
+
+if [[ ! -f "$FEATURE_SPEC" ]]; then
+    echo "ERROR: spec.md not found in $FEATURE_DIR" >&2
+    echo "Run /speckit-specify first to create the feature structure." >&2
+    exit 1
+fi
+
+# Build available docs list
+docs=()
+[[ -f "$RESEARCH" ]] && docs+=("research.md")
+[[ -f "$DATA_MODEL" ]] && docs+=("data-model.md")
+if [[ -d "$CONTRACTS_DIR" ]] && [[ -n "$(ls -A "$CONTRACTS_DIR" 2>/dev/null)" ]]; then
+    docs+=("contracts/")
+fi
+[[ -f "$QUICKSTART" ]] && docs+=("quickstart.md")
+
+# Resolve tasks template through override stack
+TASKS_TEMPLATE=$(resolve_template "tasks-template" "$REPO_ROOT") || true
+if [[ -z "$TASKS_TEMPLATE" ]] || [[ ! -f "$TASKS_TEMPLATE" ]]; then
+    echo "ERROR: Could not resolve required tasks-template from the template override stack for $REPO_ROOT" >&2
+    echo "Template 'tasks-template' was not found in any supported location (overrides, presets, extensions, or shared core). Add an override at .specify/templates/overrides/tasks-template.md, or run 'specify init' / reinstall shared infra to restore the core .specify/templates/tasks-template.md template." >&2
+    exit 1
+fi
+
+# Output results
+if $JSON_MODE; then
+    if has_jq; then
+        if [[ ${#docs[@]} -eq 0 ]]; then
+            json_docs="[]"
+        else
+            json_docs=$(printf '%s\n' "${docs[@]}" | jq -R . | jq -s .)
+        fi
+        jq -cn \
+            --arg feature_dir "$FEATURE_DIR" \
+            --argjson docs "$json_docs" \
+            --arg tasks_template "${TASKS_TEMPLATE:-}" \
+            '{FEATURE_DIR:$feature_dir,AVAILABLE_DOCS:$docs,TASKS_TEMPLATE:$tasks_template}'
+    else
+        if [[ ${#docs[@]} -eq 0 ]]; then
+            json_docs="[]"
+        else
+            json_docs=$(for d in "${docs[@]}"; do printf '"%s",' "$(json_escape "$d")"; done)
+            json_docs="[${json_docs%,}]"
+        fi
+        printf '{"FEATURE_DIR":"%s","AVAILABLE_DOCS":%s,"TASKS_TEMPLATE":"%s"}\n' \
+            "$(json_escape "$FEATURE_DIR")" "$json_docs" "$(json_escape "${TASKS_TEMPLATE:-}")"
+    fi
+else
+    echo "FEATURE_DIR: $FEATURE_DIR"
+    echo "TASKS_TEMPLATE: ${TASKS_TEMPLATE:-not found}"
+    echo "AVAILABLE_DOCS:"
+    check_file "$RESEARCH" "research.md"
+    check_file "$DATA_MODEL" "data-model.md"
+    check_dir "$CONTRACTS_DIR" "contracts/"
+    check_file "$QUICKSTART" "quickstart.md"
+fi
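Review note on `setup-tasks.sh`: when `jq` is unavailable, the script builds the `AVAILABLE_DOCS` array by hand, emitting each element with a trailing comma and then stripping the last comma with `${json_docs%,}`. The sketch below isolates that technique. It is written under stated assumptions: `json_array` is a hypothetical wrapper, and this `json_escape` is a simplified stand-in that handles only backslashes and double quotes, since the real helper in `common.sh` is not shown in this part of the diff.

```bash
#!/usr/bin/env bash
# Sketch only: simplified stand-in for the json_escape helper in common.sh.
# Escapes backslashes and double quotes; control characters are not handled.
json_escape() {
    local s=$1
    s=${s//\\/\\\\}
    s=${s//\"/\\\"}
    printf '%s' "$s"
}

# Build a JSON array of strings from the arguments without jq.
# json_array is a hypothetical name used for illustration.
json_array() {
    if [ "$#" -eq 0 ]; then
        printf '[]'
        return
    fi
    local joined="" item
    for item in "$@"; do
        joined+="\"$(json_escape "$item")\","
    done
    # Drop the trailing comma left by the loop, then wrap in brackets.
    printf '[%s]' "${joined%,}"
}
```

Called as `json_array "${docs[@]}"`, this covers both the empty and the non-empty branch of the fallback in one place.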
--- a/.specify/templates/checklist-template.md
+++ b/.specify/templates/checklist-template.md
@ -0,0 +1,40 @@
+# [CHECKLIST TYPE] Checklist: [FEATURE NAME]
+
+**Purpose**: [Brief description of what this checklist covers]
+**Created**: [DATE]
+**Feature**: [Link to spec.md or relevant documentation]
+
+**Note**: This checklist is generated by the `/speckit-checklist` command based on feature context and requirements.
+
+<!-- 
+  ============================================================================
+  IMPORTANT: The checklist items below are SAMPLE ITEMS for illustration only.
+  
+  The /speckit-checklist command MUST replace these with actual items based on:
+  - User's specific checklist request
+  - Feature requirements from spec.md
+  - Technical context from plan.md
+  - Implementation details from tasks.md
+  
+  DO NOT keep these sample items in the generated checklist file.
+  ============================================================================
+-->
+
+## [Category 1]
+
+- [ ] CHK001 First checklist item with clear action
+- [ ] CHK002 Second checklist item
+- [ ] CHK003 Third checklist item
+
+## [Category 2]
+
+- [ ] CHK004 Another category item
+- [ ] CHK005 Item with specific criteria
+- [ ] CHK006 Final item in this category
+
+## Notes
+
+- Check items off as completed: `[x]`
+- Add comments or findings inline
+- Link to relevant resources or documentation
+- Items are numbered sequentially for easy reference
--- a/.specify/templates/constitution-template.md
+++ b/.specify/templates/constitution-template.md
@ -0,0 +1,50 @@
+# [PROJECT_NAME] Constitution
+<!-- Example: Spec Constitution, TaskFlow Constitution, etc. -->
+
+## Core Principles
+
+### [PRINCIPLE_1_NAME]
+<!-- Example: I. Library-First -->
+[PRINCIPLE_1_DESCRIPTION]
+<!-- Example: Every feature starts as a standalone library; Libraries must be self-contained, independently testable, documented; Clear purpose required - no organizational-only libraries -->
+
+### [PRINCIPLE_2_NAME]
+<!-- Example: II. CLI Interface -->
+[PRINCIPLE_2_DESCRIPTION]
+<!-- Example: Every library exposes functionality via CLI; Text in/out protocol: stdin/args → stdout, errors → stderr; Support JSON + human-readable formats -->
+
+### [PRINCIPLE_3_NAME]
+<!-- Example: III. Test-First (NON-NEGOTIABLE) -->
+[PRINCIPLE_3_DESCRIPTION]
+<!-- Example: TDD mandatory: Tests written → User approved → Tests fail → Then implement; Red-Green-Refactor cycle strictly enforced -->
+
+### [PRINCIPLE_4_NAME]
+<!-- Example: IV. Integration Testing -->
+[PRINCIPLE_4_DESCRIPTION]
+<!-- Example: Focus areas requiring integration tests: New library contract tests, Contract changes, Inter-service communication, Shared schemas -->
+
+### [PRINCIPLE_5_NAME]
+<!-- Example: V. Observability, VI. Versioning & Breaking Changes, VII. Simplicity -->
+[PRINCIPLE_5_DESCRIPTION]
+<!-- Example: Text I/O ensures debuggability; Structured logging required; Or: MAJOR.MINOR.BUILD format; Or: Start simple, YAGNI principles -->
+
+## [SECTION_2_NAME]
+<!-- Example: Additional Constraints, Security Requirements, Performance Standards, etc. -->
+
+[SECTION_2_CONTENT]
+<!-- Example: Technology stack requirements, compliance standards, deployment policies, etc. -->
+
+## [SECTION_3_NAME]
+<!-- Example: Development Workflow, Review Process, Quality Gates, etc. -->
+
+[SECTION_3_CONTENT]
+<!-- Example: Code review requirements, testing gates, deployment approval process, etc. -->
+
+## Governance
+<!-- Example: Constitution supersedes all other practices; Amendments require documentation, approval, migration plan -->
+
+[GOVERNANCE_RULES]
+<!-- Example: All PRs/reviews must verify compliance; Complexity must be justified; Use [GUIDANCE_FILE] for runtime development guidance -->
+
+**Version**: [CONSTITUTION_VERSION] | **Ratified**: [RATIFICATION_DATE] | **Last Amended**: [LAST_AMENDED_DATE]
+<!-- Example: Version: 2.1.1 | Ratified: 2025-06-13 | Last Amended: 2025-07-16 -->
--- a/.specify/templates/plan-template.md
+++ b/.specify/templates/plan-template.md
@ -0,0 +1,113 @@
+# Implementation Plan: [FEATURE]
+
+**Branch**: `[###-feature-name]` | **Date**: [DATE] | **Spec**: [link]
+
+**Input**: Feature specification from `/specs/[###-feature-name]/spec.md`
+
+**Note**: This template is filled in by the `/speckit-plan` command. See `.specify/templates/plan-template.md` for the execution workflow.
+
+## Summary
+
+[Extract from feature spec: primary requirement + technical approach from research]
+
+## Technical Context
+
+<!--
+  ACTION REQUIRED: Replace the content in this section with the technical details
+  for the project. The structure here is presented in an advisory capacity to guide
+  the iteration process.
+-->
+
+**Language/Version**: [e.g., Python 3.11, Swift 5.9, Rust 1.75 or NEEDS CLARIFICATION]
+
+**Primary Dependencies**: [e.g., FastAPI, UIKit, LLVM or NEEDS CLARIFICATION]
+
+**Storage**: [if applicable, e.g., PostgreSQL, CoreData, files or N/A]
+
+**Testing**: [e.g., pytest, XCTest, cargo test or NEEDS CLARIFICATION]
+
+**Target Platform**: [e.g., Linux server, iOS 15+, WASM or NEEDS CLARIFICATION]
+
+**Project Type**: [e.g., library/cli/web-service/mobile-app/compiler/desktop-app or NEEDS CLARIFICATION]
+
+**Performance Goals**: [domain-specific, e.g., 1000 req/s, 10k lines/sec, 60 fps or NEEDS CLARIFICATION]
+
+**Constraints**: [domain-specific, e.g., <200ms p95, <100MB memory, offline-capable or NEEDS CLARIFICATION]
+
+**Scale/Scope**: [domain-specific, e.g., 10k users, 1M LOC, 50 screens or NEEDS CLARIFICATION]
+
+## Constitution Check
+
+*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
+
+[Gates determined based on constitution file]
+
+## Project Structure
+
+### Documentation (this feature)
+
+```text
+specs/[###-feature]/
+├── plan.md              # This file (/speckit-plan command output)
+├── research.md          # Phase 0 output (/speckit-plan command)
+├── data-model.md        # Phase 1 output (/speckit-plan command)
+├── quickstart.md        # Phase 1 output (/speckit-plan command)
+├── contracts/           # Phase 1 output (/speckit-plan command)
+└── tasks.md             # Phase 2 output (/speckit-tasks command - NOT created by /speckit-plan)
+```
+
+### Source Code (repository root)
+<!--
+  ACTION REQUIRED: Replace the placeholder tree below with the concrete layout
+  for this feature. Delete unused options and expand the chosen structure with
+  real paths (e.g., apps/admin, packages/something). The delivered plan must
+  not include Option labels.
+-->
+
+```text
+# [REMOVE IF UNUSED] Option 1: Single project (DEFAULT)
+src/
+├── models/
+├── services/
+├── cli/
+└── lib/
+
+tests/
+├── contract/
+├── integration/
+└── unit/
+
+# [REMOVE IF UNUSED] Option 2: Web application (when "frontend" + "backend" detected)
+backend/
+├── src/
+│   ├── models/
+│   ├── services/
+│   └── api/
+└── tests/
+
+frontend/
+├── src/
+│   ├── components/
+│   ├── pages/
+│   └── services/
+└── tests/
+
+# [REMOVE IF UNUSED] Option 3: Mobile + API (when "iOS/Android" detected)
+api/
+└── [same as backend above]
+
+ios/ or android/
+└── [platform-specific structure: feature modules, UI flows, platform tests]
+```
+
+**Structure Decision**: [Document the selected structure and reference the real
+directories captured above]
+
+## Complexity Tracking
+
+> **Fill ONLY if Constitution Check has violations that must be justified**
+
+| Violation | Why Needed | Simpler Alternative Rejected Because |
+|-----------|------------|-------------------------------------|
+| [e.g., 4th project] | [current need] | [why 3 projects insufficient] |
+| [e.g., Repository pattern] | [specific problem] | [why direct DB access insufficient] |
--- a/.specify/templates/spec-template.md
+++ b/.specify/templates/spec-template.md
@ -0,0 +1,131 @@
+# Feature Specification: [FEATURE NAME]
+
+**Feature Branch**: `[###-feature-name]`
+
+**Created**: [DATE]
+
+**Status**: Draft
+
+**Input**: User description: "$ARGUMENTS"
+
+## User Scenarios & Testing *(mandatory)*
+
+<!--
+  IMPORTANT: User stories should be PRIORITIZED as user journeys ordered by importance.
+  Each user story/journey must be INDEPENDENTLY TESTABLE - meaning if you implement just ONE of them,
+  you should still have a viable MVP (Minimum Viable Product) that delivers value.
+
+  Assign priorities (P1, P2, P3, etc.) to each story, where P1 is the most critical.
+  Think of each story as a standalone slice of functionality that can be:
+  - Developed independently
+  - Tested independently
+  - Deployed independently
+  - Demonstrated to users independently
+-->
+
+### User Story 1 - [Brief Title] (Priority: P1)
+
+[Describe this user journey in plain language]
+
+**Why this priority**: [Explain the value and why it has this priority level]
+
+**Independent Test**: [Describe how this can be tested independently - e.g., "Can be fully tested by [specific action] and delivers [specific value]"]
+
+**Acceptance Scenarios**:
+
+1. **Given** [initial state], **When** [action], **Then** [expected outcome]
+2. **Given** [initial state], **When** [action], **Then** [expected outcome]
+
+---
+
+### User Story 2 - [Brief Title] (Priority: P2)
+
+[Describe this user journey in plain language]
+
+**Why this priority**: [Explain the value and why it has this priority level]
+
+**Independent Test**: [Describe how this can be tested independently]
+
+**Acceptance Scenarios**:
+
+1. **Given** [initial state], **When** [action], **Then** [expected outcome]
+
+---
+
+### User Story 3 - [Brief Title] (Priority: P3)
+
+[Describe this user journey in plain language]
+
+**Why this priority**: [Explain the value and why it has this priority level]
+
+**Independent Test**: [Describe how this can be tested independently]
+
+**Acceptance Scenarios**:
+
+1. **Given** [initial state], **When** [action], **Then** [expected outcome]
+
+---
+
+[Add more user stories as needed, each with an assigned priority]
+
+### Edge Cases
+
+<!--
+  ACTION REQUIRED: The content in this section represents placeholders.
+  Fill them out with the right edge cases.
+-->
+
+- What happens when [boundary condition]?
+- How does the system handle [error scenario]?
+
+## Requirements *(mandatory)*
+
+<!--
+  ACTION REQUIRED: The content in this section represents placeholders.
+  Fill them out with the right functional requirements.
+-->
+
+### Functional Requirements
+
+- **FR-001**: System MUST [specific capability, e.g., "allow users to create accounts"]
+- **FR-002**: System MUST [specific capability, e.g., "validate email addresses"]
+- **FR-003**: Users MUST be able to [key interaction, e.g., "reset their password"]
+- **FR-004**: System MUST [data requirement, e.g., "persist user preferences"]
+- **FR-005**: System MUST [behavior, e.g., "log all security events"]
+
+*Example of marking unclear requirements:*
+
+- **FR-006**: System MUST authenticate users via [NEEDS CLARIFICATION: auth method not specified - email/password, SSO, OAuth?]
+- **FR-007**: System MUST retain user data for [NEEDS CLARIFICATION: retention period not specified]
+
+### Key Entities *(include if feature involves data)*
+
+- **[Entity 1]**: [What it represents, key attributes without implementation]
+- **[Entity 2]**: [What it represents, relationships to other entities]
+
+## Success Criteria *(mandatory)*
+
+<!--
+  ACTION REQUIRED: Define measurable success criteria.
+  These must be technology-agnostic and measurable.
+-->
+
+### Measurable Outcomes
+
+- **SC-001**: [Measurable metric, e.g., "Users can complete account creation in under 2 minutes"]
+- **SC-002**: [Measurable metric, e.g., "System handles 1000 concurrent users without degradation"]
+- **SC-003**: [User satisfaction metric, e.g., "90% of users successfully complete primary task on first attempt"]
+- **SC-004**: [Business metric, e.g., "Reduce support tickets related to [X] by 50%"]
+
+## Assumptions
+
+<!--
+  ACTION REQUIRED: The content in this section represents placeholders.
+  Fill them out with the right assumptions based on reasonable defaults
+  chosen when the feature description did not specify certain details.
+-->
+
+- [Assumption about target users, e.g., "Users have stable internet connectivity"]
+- [Assumption about scope boundaries, e.g., "Mobile support is out of scope for v1"]
+- [Assumption about data/environment, e.g., "Existing authentication system will be reused"]
+- [Dependency on existing system/service, e.g., "Requires access to the existing user profile API"]
--- a/.specify/templates/tasks-template.md
+++ b/.specify/templates/tasks-template.md
@ -0,0 +1,252 @@
+---
+
+description: "Task list template for feature implementation"
+---
+
+# Tasks: [FEATURE NAME]
+
+**Input**: Design documents from `/specs/[###-feature-name]/`
+
+**Prerequisites**: plan.md (required), spec.md (required for user stories), research.md, data-model.md, contracts/
+
+**Tests**: The examples below include test tasks. Tests are OPTIONAL - only include them if explicitly requested in the feature specification.
+
+**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.
+
+## Format: `[ID] [P?] [Story] Description`
+
+- **[P]**: Can run in parallel (different files, no dependencies)
+- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
+- Include exact file paths in descriptions
+
+## Path Conventions
+
+- **Single project**: `src/`, `tests/` at repository root
+- **Web app**: `backend/src/`, `frontend/src/`
+- **Mobile**: `api/src/`, `ios/src/` or `android/src/`
+- Paths shown below assume single project - adjust based on plan.md structure
+
+<!--
+  ============================================================================
+  IMPORTANT: The tasks below are SAMPLE TASKS for illustration purposes only.
+
+  The /speckit-tasks command MUST replace these with actual tasks based on:
+  - User stories from spec.md (with their priorities P1, P2, P3...)
+  - Feature requirements from plan.md
+  - Entities from data-model.md
+  - Endpoints from contracts/
+
+  Tasks MUST be organized by user story so each story can be:
+  - Implemented independently
+  - Tested independently
+  - Delivered as an MVP increment
+
+  DO NOT keep these sample tasks in the generated tasks.md file.
+  ============================================================================
+-->
+
+## Phase 1: Setup (Shared Infrastructure)
+
+**Purpose**: Project initialization and basic structure
+
+- [ ] T001 Create project structure per implementation plan
+- [ ] T002 Initialize [language] project with [framework] dependencies
+- [ ] T003 [P] Configure linting and formatting tools
+
+---
+
+## Phase 2: Foundational (Blocking Prerequisites)
+
+**Purpose**: Core infrastructure that MUST be complete before ANY user story can be implemented
+
+**⚠️ CRITICAL**: No user story work can begin until this phase is complete
+
+Examples of foundational tasks (adjust based on your project):
+
+- [ ] T004 Set up database schema and migrations framework
+- [ ] T005 [P] Implement authentication/authorization framework
+- [ ] T006 [P] Set up API routing and middleware structure
+- [ ] T007 Create base models/entities that all stories depend on
+- [ ] T008 Configure error handling and logging infrastructure
+- [ ] T009 Set up environment configuration management
+
+**Checkpoint**: Foundation ready - user story implementation can now begin in parallel
+
+---
+
+## Phase 3: User Story 1 - [Title] (Priority: P1) 🎯 MVP
+
+**Goal**: [Brief description of what this story delivers]
+
+**Independent Test**: [How to verify this story works on its own]
+
+### Tests for User Story 1 (OPTIONAL - only if tests requested) ⚠️
+
+> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**
+
+- [ ] T010 [P] [US1] Contract test for [endpoint] in tests/contract/test_[name].py
+- [ ] T011 [P] [US1] Integration test for [user journey] in tests/integration/test_[name].py
+
+### Implementation for User Story 1
+
+- [ ] T012 [P] [US1] Create [Entity1] model in src/models/[entity1].py
+- [ ] T013 [P] [US1] Create [Entity2] model in src/models/[entity2].py
+- [ ] T014 [US1] Implement [Service] in src/services/[service].py (depends on T012, T013)
+- [ ] T015 [US1] Implement [endpoint/feature] in src/[location]/[file].py
+- [ ] T016 [US1] Add validation and error handling
+- [ ] T017 [US1] Add logging for user story 1 operations
+
+**Checkpoint**: At this point, User Story 1 should be fully functional and testable independently
+
+---
+
+## Phase 4: User Story 2 - [Title] (Priority: P2)
+
+**Goal**: [Brief description of what this story delivers]
+
+**Independent Test**: [How to verify this story works on its own]
+
+### Tests for User Story 2 (OPTIONAL - only if tests requested) ⚠️
+
+- [ ] T018 [P] [US2] Contract test for [endpoint] in tests/contract/test_[name].py
+- [ ] T019 [P] [US2] Integration test for [user journey] in tests/integration/test_[name].py
+
+### Implementation for User Story 2
+
+- [ ] T020 [P] [US2] Create [Entity] model in src/models/[entity].py
+- [ ] T021 [US2] Implement [Service] in src/services/[service].py
+- [ ] T022 [US2] Implement [endpoint/feature] in src/[location]/[file].py
+- [ ] T023 [US2] Integrate with User Story 1 components (if needed)
+
+**Checkpoint**: At this point, User Stories 1 AND 2 should both work independently
+
+---
+
+## Phase 5: User Story 3 - [Title] (Priority: P3)
+
+**Goal**: [Brief description of what this story delivers]
+
+**Independent Test**: [How to verify this story works on its own]
+
+### Tests for User Story 3 (OPTIONAL - only if tests requested) ⚠️
+
+- [ ] T024 [P] [US3] Contract test for [endpoint] in tests/contract/test_[name].py
+- [ ] T025 [P] [US3] Integration test for [user journey] in tests/integration/test_[name].py
+
+### Implementation for User Story 3
+
+- [ ] T026 [P] [US3] Create [Entity] model in src/models/[entity].py
+- [ ] T027 [US3] Implement [Service] in src/services/[service].py
+- [ ] T028 [US3] Implement [endpoint/feature] in src/[location]/[file].py
+
+**Checkpoint**: All user stories should now be independently functional
+
+---
+
+[Add more user story phases as needed, following the same pattern]
+
+---
+
+## Phase N: Polish & Cross-Cutting Concerns
+
+**Purpose**: Improvements that affect multiple user stories
+
+- [ ] TXXX [P] Documentation updates in docs/
+- [ ] TXXX Code cleanup and refactoring
+- [ ] TXXX Performance optimization across all stories
+- [ ] TXXX [P] Additional unit tests (if requested) in tests/unit/
+- [ ] TXXX Security hardening
+- [ ] TXXX Run quickstart.md validation
+
+---
+
+## Dependencies & Execution Order
+
+### Phase Dependencies
+
+- **Setup (Phase 1)**: No dependencies - can start immediately
+- **Foundational (Phase 2)**: Depends on Setup completion - BLOCKS all user stories
+- **User Stories (Phase 3+)**: All depend on Foundational phase completion
+  - User stories can then proceed in parallel (if staffed)
+  - Or sequentially in priority order (P1 → P2 → P3)
+- **Polish (Final Phase)**: Depends on all desired user stories being complete
+
+### User Story Dependencies
+
+- **User Story 1 (P1)**: Can start after Foundational (Phase 2) - No dependencies on other stories
+- **User Story 2 (P2)**: Can start after Foundational (Phase 2) - May integrate with US1 but should be independently testable
+- **User Story 3 (P3)**: Can start after Foundational (Phase 2) - May integrate with US1/US2 but should be independently testable
+
+### Within Each User Story
+
+- Tests (if included) MUST be written and FAIL before implementation
+- Models before services
+- Services before endpoints
+- Core implementation before integration
+- Story complete before moving to next priority
+
+### Parallel Opportunities
+
+- All Setup tasks marked [P] can run in parallel
+- All Foundational tasks marked [P] can run in parallel (within Phase 2)
+- Once Foundational phase completes, all user stories can start in parallel (if team capacity allows)
+- All tests for a user story marked [P] can run in parallel
+- Models within a story marked [P] can run in parallel
+- Different user stories can be worked on in parallel by different team members
+
+---
+
+## Parallel Example: User Story 1
+
+```bash
+# Launch all tests for User Story 1 together (if tests requested):
+Task: "Contract test for [endpoint] in tests/contract/test_[name].py"
+Task: "Integration test for [user journey] in tests/integration/test_[name].py"
+
+# Launch all models for User Story 1 together:
+Task: "Create [Entity1] model in src/models/[entity1].py"
+Task: "Create [Entity2] model in src/models/[entity2].py"
+```
+
+---
+
+## Implementation Strategy
+
+### MVP First (User Story 1 Only)
+
+1. Complete Phase 1: Setup
+2. Complete Phase 2: Foundational (CRITICAL - blocks all stories)
+3. Complete Phase 3: User Story 1
+4. **STOP and VALIDATE**: Test User Story 1 independently
+5. Deploy/demo if ready
+
+### Incremental Delivery
+
+1. Complete Setup + Foundational → Foundation ready
+2. Add User Story 1 → Test independently → Deploy/Demo (MVP!)
+3. Add User Story 2 → Test independently → Deploy/Demo
+4. Add User Story 3 → Test independently → Deploy/Demo
+5. Each story adds value without breaking previous stories
+
+### Parallel Team Strategy
+
+With multiple developers:
+
+1. Team completes Setup + Foundational together
+2. Once Foundational is done:
+   - Developer A: User Story 1
+   - Developer B: User Story 2
+   - Developer C: User Story 3
+3. Stories complete and integrate independently
+
+---
+
+## Notes
+
+- [P] tasks = different files, no dependencies
+- [Story] label maps task to specific user story for traceability
+- Each user story should be independently completable and testable
+- Verify tests fail before implementing
+- Commit after each task or logical group
+- Stop at any checkpoint to validate story independently
+- Avoid: vague tasks, same-file conflicts, cross-story dependencies that break independence
--- a/.specify/workflows/speckit/workflow.yml
+++ b/.specify/workflows/speckit/workflow.yml
@ -0,0 +1,77 @@
+schema_version: "1.0"
+workflow:
+  id: "speckit"
+  name: "Full SDD Cycle"
+  version: "1.0.0"
+  author: "GitHub"
+  description: "Runs specify → plan → tasks → implement with review gates"
+
+requires:
+  # 0.8.5 is the first release with engine-side resolution of the
+  # ``integration: "auto"`` default. Older versions would treat "auto"
+  # as a literal integration key and fail at dispatch.
+  speckit_version: ">=0.8.5"
+  integrations:
+    # The four commands below (specify, plan, tasks, implement) are core
+    # spec-kit commands provided by every integration. The list here is an
+    # advisory, non-exhaustive compatibility hint following the documented
+    # ``any: [...]`` schema -- it is NOT a closed set. The workflow runs
+    # against any integration the project was initialized with, including
+    # ones not listed below, as long as that integration provides the four
+    # core commands referenced in ``steps``.
+    any:
+      - "claude"
+      - "copilot"
+      - "gemini"
+      - "opencode"
+
+inputs:
+  spec:
+    type: string
+    required: true
+    prompt: "Describe what you want to build"
+  integration:
+    type: string
+    default: "auto"
+    prompt: "Integration to use (e.g. claude, copilot, gemini; 'auto' uses the project's initialized integration)"
+  scope:
+    type: string
+    default: "full"
+    enum: ["full", "backend-only", "frontend-only"]
+
+steps:
+  - id: specify
+    command: speckit.specify
+    integration: "{{ inputs.integration }}"
+    input:
+      args: "{{ inputs.spec }}"
+
+  - id: review-spec
+    type: gate
+    message: "Review the generated spec before planning."
+    options: [approve, reject]
+    on_reject: abort
+
+  - id: plan
+    command: speckit.plan
+    integration: "{{ inputs.integration }}"
+    input:
+      args: "{{ inputs.spec }}"
+
+  - id: review-plan
+    type: gate
+    message: "Review the plan before generating tasks."
+    options: [approve, reject]
+    on_reject: abort
+
+  - id: tasks
+    command: speckit.tasks
+    integration: "{{ inputs.integration }}"
+    input:
+      args: "{{ inputs.spec }}"
+
+  - id: implement
+    command: speckit.implement
+    integration: "{{ inputs.integration }}"
+    input:
+      args: "{{ inputs.spec }}"
--- a/.specify/workflows/workflow-registry.json
+++ b/.specify/workflows/workflow-registry.json
@ -0,0 +1,13 @@
+{
+  "schema_version": "1.0",
+  "workflows": {
+    "speckit": {
+      "name": "Full SDD Cycle",
+      "version": "1.0.0",
+      "description": "Runs specify \u2192 plan \u2192 tasks \u2192 implement with review gates",
+      "source": "bundled",
+      "installed_at": "2026-06-08T01:33:59.624631+00:00",
+      "updated_at": "2026-06-08T01:33:59.624637+00:00"
+    }
+  }
+}
--- a/2026-06-01-hermes-gateway-llm-design.md
+++ b/2026-06-01-hermes-gateway-llm-design.md
@ -0,0 +1,458 @@
+# Beaver Terminal WebSocket Integration Guide
+
+Date: 2026-06-01
+
+Audience: the small-terminal-side Codex agent that will modify terminal firmware or terminal app code.
+
+## Goal
+
+Connect the small terminal device to Beaver through a text-only WebSocket channel.
+
+The first acceptance target is simple:
+
+1. The terminal opens a WebSocket connection to Beaver.
+2. The terminal sends a `connect` frame with a stable `peer_id`.
+3. The terminal sends one text `message` frame.
+4. The terminal receives an `ack`.
+5. The terminal receives the final assistant text response from Beaver.
+6. The terminal can reconnect with the same `peer_id` and keep the same Beaver session.
+
+This document replaces the earlier Hermes LiveKit LLM adapter design for the terminal-side work. Do not implement a LiveKit LLM adapter from this document.
+
+## Non-Goals
+
+- Do not implement audio streaming.
+- Do not implement camera, screen, image, or multimodal frames.
+- Do not implement token streaming.
+- Do not implement terminal-side tools.
+- Do not implement AuthZ, device registration, OAuth, or pairing in the first pass.
+- Do not call Beaver REST chat endpoints or the existing Web UI `/ws/{session_id}` endpoint.
+- Do not build an OpenAI-compatible proxy.
+- Do not implement Hermes Agent or LiveKit changes on the terminal side.
+
+## Beaver Endpoint
+
+The terminal connects to:
+
+```text
+ws://<beaver-host>/api/channels/<channel_id>/ws
+```
+
+For local development through the Beaver app instance nginx port:
+
+```text
+ws://127.0.0.1:8080/api/channels/terminal-dev/ws
+```
+
+For direct backend development without nginx:
+
+```text
+ws://127.0.0.1:18080/api/channels/terminal-dev/ws
+```
+
+Use `wss://` when Beaver is deployed behind TLS.
+
+The expected first channel id is:
+
+```text
+terminal-dev
+```
+
+The terminal implementation should make the URL configurable, for example:
+
+```text
+BEAVER_WS_URL=ws://127.0.0.1:8080/api/channels/terminal-dev/ws
+TERMINAL_PEER_ID=device-001
+TERMINAL_DEVICE_NAME=desk-terminal
+```
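
One way to read the three settings above is sketched below. This is a minimal sketch, not Beaver's or the terminal's actual loader: `TerminalConfig` and `load_config` are hypothetical names, and the default URL and the fallback of `device_name` to `peer_id` are assumptions made for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TerminalConfig:
    ws_url: str
    peer_id: str
    device_name: str


def load_config(env=None) -> TerminalConfig:
    """Read BEAVER_WS_URL, TERMINAL_PEER_ID and TERMINAL_DEVICE_NAME (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumed default: the local nginx URL shown in the guide.
    ws_url = env.get("BEAVER_WS_URL", "ws://127.0.0.1:8080/api/channels/terminal-dev/ws")
    if not ws_url.startswith(("ws://", "wss://")):
        raise ValueError(f"BEAVER_WS_URL must be a ws:// or wss:// URL, got {ws_url!r}")
    peer_id = env.get("TERMINAL_PEER_ID", "").strip()
    if not peer_id:
        # peer_id is a required field of the connect frame, so fail early.
        raise ValueError("TERMINAL_PEER_ID is required")
    # Assumption: fall back to peer_id when no device name is configured.
    device_name = env.get("TERMINAL_DEVICE_NAME", peer_id)
    return TerminalConfig(ws_url, peer_id, device_name)
```

Passing a plain dict instead of `os.environ` keeps the loader easy to exercise on a development machine.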
+
+## Protocol Overview
+
+The transport is JSON over WebSocket.
+
+All frames are UTF-8 JSON objects. The terminal should ignore unknown fields. Beaver also ignores unknown fields, but it rejects a frame whose `type` is not supported.
+
+The protocol is request/reply oriented in this phase. Beaver sends only final assistant messages, not token deltas.
+
+Required frame flow:
+
+```text
+terminal -> Beaver: connect
+Beaver -> terminal: connected
+terminal -> Beaver: message
+Beaver -> terminal: ack
+Beaver -> terminal: message
+```
+
+Optional heartbeat:
+
+```text
+terminal -> Beaver: ping
+Beaver -> terminal: pong
+```
+
+## Connect Frame
+
+The terminal must send `connect` immediately after the WebSocket opens.
+
+Terminal to Beaver:
+
+```json
+{
+  "type": "connect",
+  "peer_id": "device-001",
+  "device_name": "desk-terminal",
+  "capabilities": ["text"]
+}
+```
+
+Required fields:
+
+- `type`: must be `"connect"`.
+- `peer_id`: stable terminal identity. Reuse this value across reconnects.
+
+Recommended fields:
+
+- `device_name`: human-readable terminal name.
+- `capabilities`: include `"text"`.
+
+Optional fields:
+
+- `thread_id`: optional sub-session key. Omit it for the first pass.
+- `user_id`: optional user identity. Omit it unless the terminal already has a stable user id.
+
+Beaver to terminal:
+
+```json
+{
+  "type": "connected",
+  "channel_id": "terminal-dev",
+  "session_id": "terminal-dev:local:device-001"
+}
+```
+
+The terminal should store `session_id` for logging and diagnostics. It does not need to send `session_id` back in message frames.
+
+## Message Frame
+
+Terminal to Beaver:
+
+```json
+{
+  "type": "message",
+  "message_id": "m-001",
+  "text": "hello"
+}
+```
+
+Required fields:
+
+- `type`: must be `"message"`.
+- `message_id`: unique id for this user message.
+- `text`: non-empty user text.
+
+Recommended `message_id` format:
+
+```text
+<peer_id>-<monotonic-counter>
+```
+
+Example:
+
+```text
+device-001-000001
+device-001-000002
+```
+
+The terminal should persist the counter if practical. If persistence is unavailable, generate a UUID or timestamp-based id. Reusing the same `message_id` tells Beaver to treat the frame as a duplicate.
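
The recommended id format and the UUID fallback can be sketched as follows. The function names are hypothetical, the six-digit zero padding is inferred from the example ids above, and persisting the counter is left to the caller.

```python
import itertools
import uuid


def message_id_generator(peer_id, start=0):
    """Yield ids shaped like <peer_id>-<monotonic-counter>, zero-padded to six digits.

    `start` is the last counter value already used (for example, loaded from storage).
    """
    for n in itertools.count(start + 1):
        yield f"{peer_id}-{n:06d}"


def fallback_message_id(peer_id):
    """For terminals that cannot persist a counter: a random UUID avoids reuse after restart."""
    return f"{peer_id}-{uuid.uuid4().hex}"
```

A restart that resets an unpersisted counter would reuse old ids, which Beaver treats as duplicates; that is the case the UUID fallback is meant to avoid.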
+
+Optional fields:
+
+- `thread_id`: use only when the terminal intentionally wants a separate Beaver session.
+- `user_id`: use only when the terminal has a stable user id.
+
+## Ack Frame
+
+Beaver sends an ack after accepting or deduplicating the inbound message.
+
+Accepted:
+
+```json
+{
+  "type": "ack",
+  "message_id": "device-001-000001",
+  "session_id": "terminal-dev:local:device-001",
+  "accepted": true
+}
+```
+
+Duplicate still processing:
+
+```json
+{
+  "type": "ack",
+  "message_id": "device-001-000001",
+  "session_id": "terminal-dev:local:device-001",
+  "accepted": false,
+  "duplicate": true,
+  "pending": true
+}
+```
+
+Duplicate already completed:
+
+```json
+{
+  "type": "ack",
+  "message_id": "device-001-000001",
+  "session_id": "terminal-dev:local:device-001",
+  "accepted": false,
+  "duplicate": true,
+  "pending": false,
+  "reply": "cached assistant reply"
+}
+```
+
+Terminal behavior:
+
+- If `accepted` is true, wait for the assistant `message`.
+- If `duplicate` and `reply` is present, display the cached reply.
+- If `duplicate` and `pending` is true, keep waiting on the socket.
+- If `error` is present, display or log the error.
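
The four rules above can be sketched as a small classifier. `classify_ack` and its action labels are hypothetical; checking `error` before the other fields is an assumption, since the guide does not say which takes precedence.

```python
def classify_ack(frame):
    """Map an ack frame to (action, payload) following the terminal behavior rules."""
    if frame.get("type") != "ack":
        raise ValueError("not an ack frame")
    if frame.get("error"):
        # Assumption: an error on the ack wins over every other field.
        return ("error", frame["error"])
    if frame.get("accepted"):
        return ("wait", None)  # wait for the assistant message
    if frame.get("duplicate") and frame.get("reply") is not None:
        return ("show", frame["reply"])  # display the cached reply
    if frame.get("duplicate") and frame.get("pending"):
        return ("wait", None)  # original run is still in progress
    return ("unknown", None)
```

Returning an explicit `"unknown"` action, rather than raising, keeps the terminal tolerant of ack shapes added in later protocol phases.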
+
+## Assistant Message Frame
+
+Beaver to terminal:
+
+```json
+{
+  "type": "message",
+  "role": "assistant",
+  "message_id": "device-001-000001",
+  "run_id": "run-id",
+  "text": "assistant reply",
+  "finish_reason": "stop"
+}
+```
+
+Fields:
+
+- `type`: `"message"`.
+- `role`: `"assistant"`.
+- `message_id`: the user message id this response belongs to.
+- `run_id`: Beaver run id for diagnostics.
+- `text`: final assistant response.
+- `finish_reason`: usually `"stop"`, or `"error"` when the run failed.
+
+Terminal behavior:
+
+- Render or speak `text`.
+- Treat `finish_reason == "error"` as a failed turn.
+- Do not expect token-level streaming in this phase.
+
+## Ping And Pong
+
+Terminal to Beaver:
+
+```json
+{"type": "ping"}
+```
+
+Beaver to terminal:
+
+```json
+{"type": "pong"}
+```
+
+Recommended heartbeat interval:
+
+```text
+30 seconds
+```
+
+If no pong or other frame is received after a reasonable timeout, reconnect.
+
+## Error Frame
+
+Beaver may send:
+
+```json
+{
+  "type": "error",
+  "error": "human readable error"
+}
+```
+
+Terminal behavior:
+
+- Log the error.
+- Keep the connection open unless the WebSocket closes.
+- If the error is for a user message, allow the user to retry with a new `message_id`.
+
+Common first-pass errors:
+
+- `connect` is required before `message`.
+- `peer_id` is required.
+- `message_id` is required.
+- `text` is required.
+- Unsupported websocket frame type.
+
+## Terminal State Machine
+
+Implement the terminal client as a small state machine.
+
+```text
+DISCONNECTED
+  -> connect websocket
+CONNECTING
+  -> websocket open, send connect frame
+WAIT_CONNECTED
+  -> receive connected
+READY
+  -> send message frame
+WAIT_ACK
+  -> receive ack
+WAIT_REPLY
+  -> receive assistant message
+READY
+```
+
+On WebSocket close or network failure, transition to `DISCONNECTED` and reconnect with backoff.
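
The state diagram can be encoded as a transition table. This is a sketch under assumptions: the state names come from the diagram, while the event names (`"websocket_open"`, `"closed"`, and so on) and the `next_state` helper are made up for illustration.

```python
from enum import Enum, auto


class State(Enum):
    DISCONNECTED = auto()
    CONNECTING = auto()
    WAIT_CONNECTED = auto()
    READY = auto()
    WAIT_ACK = auto()
    WAIT_REPLY = auto()


# (current state, event) -> next state; event names are illustrative only.
TRANSITIONS = {
    (State.DISCONNECTED, "connect_websocket"): State.CONNECTING,
    (State.CONNECTING, "websocket_open"): State.WAIT_CONNECTED,
    (State.WAIT_CONNECTED, "connected"): State.READY,
    (State.READY, "send_message"): State.WAIT_ACK,
    (State.WAIT_ACK, "ack"): State.WAIT_REPLY,
    (State.WAIT_REPLY, "assistant_message"): State.READY,
}


def next_state(state, event):
    """Return the next state; a close or network failure always leads to DISCONNECTED."""
    if event == "closed":
        return State.DISCONNECTED
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"unexpected event {event!r} in state {state.name}") from None
```

A duplicate ack that already carries a cached `reply` would need an extra `WAIT_ACK -> READY` edge; the table mirrors only the transitions drawn in the diagram.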
+
+Recommended reconnect policy:
+
+- Start at 1 second.
+- Double up to 30 seconds.
+- Reset backoff after a successful `connected` frame.
+
+On reconnect, use the same `peer_id`.
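
The reconnect policy above (start at 1 second, double up to 30 seconds, reset after `connected`) can be sketched as a small helper. The `Backoff` class is hypothetical, and no jitter is added because the guide does not mention any.

```python
class Backoff:
    """Exponential reconnect delay: starts at `initial`, doubles, and is capped at `maximum`."""

    def __init__(self, initial=1.0, maximum=30.0):
        self.initial = initial
        self.maximum = maximum
        self._next = initial

    def next_delay(self):
        """Return the delay to sleep before the next reconnect attempt."""
        delay = self._next
        self._next = min(self._next * 2, self.maximum)
        return delay

    def reset(self):
        """Call after a successful `connected` frame."""
        self._next = self.initial
```

The client would sleep for `next_delay()` seconds after each failed attempt and call `reset()` once the `connected` frame arrives.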
+
+## Terminal Implementation Requirements
+
+The terminal-side code should provide:
+
+- A configurable Beaver WebSocket URL.
+- A stable `peer_id`.
+- A configurable `device_name`.
+- A monotonic or otherwise unique `message_id` generator.
+- JSON encoding and decoding.
+- Connect frame on socket open.
+- Ping/pong heartbeat.
+- Reconnect with backoff.
+- A queue or guard so only one user text turn is in flight at a time for the first pass.
+- Logging for `session_id`, `message_id`, `run_id`, and errors.
+
+The terminal-side code does not need:
+
+- Multi-room session logic.
+- Hermes session management.
+- LiveKit `AgentSession`.
+- Audio chunking.
+- Tool calls.
+- OAuth or token refresh.
+
+## Example Client Pseudocode
+
+```python
+peer_id = load_or_create_peer_id()
+counter = load_counter()
+
+async def run_terminal_client():
+    while True:
+        try:
+            async with connect(BEAVER_WS_URL) as ws:
+                await ws.send_json({
+                    "type": "connect",
+                    "peer_id": peer_id,
+                    "device_name": DEVICE_NAME,
+                    "capabilities": ["text"],
+                })
+
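+                connected = await ws.receive_json()
+                if connected.get("type") != "connected":
+                    raise RuntimeError(f"expected connected frame, got {connected}")
+                log("session_id", connected["session_id"])
+                reset_backoff()  # reconnect policy: reset after a successful connected frame
+
+                await read_send_receive_loop(ws)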
+        except Exception as exc:
+            log("websocket disconnected", exc)
+            await sleep(next_backoff())
+
+async def send_user_text(ws, text):
+    global counter
+    counter += 1
+    save_counter(counter)
+    message_id = f"{peer_id}-{counter:06d}"
+
+    await ws.send_json({
+        "type": "message",
+        "message_id": message_id,
+        "text": text,
+    })
+
+    while True:
+        frame = await ws.receive_json()
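+        if frame["type"] == "ack" and frame.get("message_id") == message_id:
+            if frame.get("error"):
+                raise RuntimeError(frame["error"])  # the ack reported an error
+            if frame.get("reply"):
+                return frame["reply"]  # duplicate with a cached reply
+            continue  # accepted or still pending: keep waiting
+        if frame["type"] == "message" and frame.get("role") == "assistant":
+            if frame.get("message_id") == message_id:
+                if frame.get("finish_reason") == "error":
+                    raise RuntimeError(frame.get("text", "run failed"))
+                return frame.get("text", "")
+        if frame["type"] == "error":
+            raise RuntimeError(frame.get("error", "unknown error"))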
+```
+
+Adapt the pseudocode to the terminal runtime language and WebSocket library.
+
+## Manual Test With websocat
+
+If `websocat` is available, a developer can manually test the protocol:
+
+```bash
+websocat ws://127.0.0.1:8080/api/channels/terminal-dev/ws
+```
+
+Then paste:
+
+```json
+{"type":"connect","peer_id":"device-001","device_name":"desk-terminal","capabilities":["text"]}
+```
+
+Expected response:
+
+```json
+{"type":"connected","channel_id":"terminal-dev","session_id":"terminal-dev:local:device-001"}
+```
+
+Then paste:
+
+```json
+{"type":"message","message_id":"device-001-000001","text":"hello"}
+```
+
+Expected responses:
+
+```json
+{"type":"ack","message_id":"device-001-000001","session_id":"terminal-dev:local:device-001","accepted":true}
+```
+
+Then, after Beaver finishes the run:
+
+```json
+{"type":"message","role":"assistant","message_id":"device-001-000001","run_id":"...","text":"...","finish_reason":"stop"}
+```
+
+## Acceptance Checklist For Terminal-Side Codex
+
+- The terminal opens the configured Beaver WebSocket URL.
+- The terminal sends `connect` immediately after open.
+- The terminal receives and logs `connected.session_id`.
+- The terminal sends text using a unique `message_id`.
+- The terminal receives `ack`.
+- The terminal receives and displays assistant `message.text`.
+- The terminal handles `ping`/`pong`.
+- The terminal reconnects with the same `peer_id`.
+- The terminal does not use REST chat or `/ws/{session_id}`.
+- The terminal implementation remains text-only for the first pass.
+
+When this checklist passes against Beaver, the first-stage device integration is accepted from the terminal side.
--- a/AGENTS.md
+++ b/AGENTS.md
@ -1,12 +1,4 @@
-# Repository Instructions
-
-This branch is scoped to the standalone `app-instance` package.
-
-Useful entry points:
-
- `app-instance/README.md`
- `app-instance/run-standalone.sh`
- `app-instance/backend`
- `app-instance/frontend`
-
-The repository no longer owns the old auth portal, authz service, deploy control plane, router proxy, or per-user instance orchestration.
+<!-- SPECKIT START -->
+For additional context about technologies to be used, project structure,
+shell commands, and other important information, read the current plan
+<!-- SPECKIT END -->
--- a/DESIGN.md
+++ b/DESIGN.md
@ -0,0 +1,411 @@
+# DESIGN.md
+
+## Brand
+
+**Beaver — Taupe**
+
+A calm editorial UI system focused on rhythm, hierarchy, and soft neutral contrast.  
+Designed for AI-native tools, dashboards, and minimalist productivity software.
+
+The interface emphasizes:
+
+- Spacious layouts
+- Soft grayscale surfaces
+- Typography-first hierarchy
+- Rounded geometry
+- Quiet shadows
+- Dense information with low visual noise
+
+The visual tone should feel:
+
+- thoughtful
+- mature
+- calm
+- premium
+- architectural
+- editorial
+
+Avoid:
+
+- saturated colors
+- hard borders
+- sharp corners
+- excessive gradients
+- loud shadows
+- playful illustration-heavy UI
+
+---
+
+# Colors
+
+## Core Palette
+
+| Token | Hex | Usage |
+|---|---|---|
+| background | `#F5F3F1` | Main app background |
+| foreground | `#0B0B0B` | Primary text |
+| primary | `#1D1715` | Primary actions |
+| secondary | `#E5E2DF` | Secondary surfaces |
+| muted | `#DDD9D6` | Muted backgrounds |
+| accent | `#CAC5C0` | Borders / subtle emphasis |
+
+---
+
+## Neutral Scale
+
+| Token | Hex |
+|---|---|
+| zinc-50 | `#F7F5F4` |
+| zinc-100 | `#ECE8E5` |
+| zinc-200 | `#D8D2CE` |
+| zinc-300 | `#B8AEA8` |
+| zinc-400 | `#8B7E77` |
+| zinc-500 | `#6A5E58` |
+| zinc-600 | `#4F4642` |
+| zinc-700 | `#342E2B` |
+
+---
+
+## Semantic Colors
+
+### Taupe
+
+| Step | Hex |
+|---|---|
+| taupe-100 | `#E7E2DE` |
+| taupe-300 | `#B8AEA8` |
+| taupe-500 | `#8B7E77` |
+| taupe-700 | `#5F5550` |
+
+### Sage
+
+| Step | Hex |
+|---|---|
+| sage-100 | `#E3E8E2` |
+| sage-300 | `#B7C2B5` |
+| sage-500 | `#869683` |
+| sage-700 | `#657162` |
+
+### Slate
+
+| Step | Hex |
+|---|---|
+| slate-100 | `#E4E7EB` |
+| slate-300 | `#BCC4CE` |
+| slate-500 | `#8C96A3` |
+| slate-700 | `#697281` |
+
+---
+
+# Typography
+
+## Philosophy
+
+Typography drives hierarchy.
+
+The system should feel like a modern editorial publication mixed with a productivity dashboard.
+
+Large headings use elegant serif typography.  
+UI and body copy use neutral grotesk sans-serif typography.
+
+---
+
+## Font Stack
+
+### Serif
+
+```css
+font-family: "Lora", Georgia, serif;
+```
+
+Used for:
+
+- hero titles
+- article headings
+- marketing emphasis
+- editorial sections
+
+---
+
+### Sans
+
+```css
+font-family: "Public Sans", Inter, sans-serif;
+```
+
+Used for:
+
+- UI
+- labels
+- forms
+- dashboards
+- buttons
+- navigation
+
+---
+
+## Type Scale
+
+| Style | Size | Weight | Line Height |
+|---|---|---|---|
+| h1 | 48px | 600 | 1.1 |
+| h2 | 36px | 600 | 1.15 |
+| h3 | 28px | 500 | 1.2 |
+| body-lg | 18px | 400 | 1.7 |
+| body | 16px | 400 | 1.6 |
+| small | 14px | 400 | 1.5 |
+| mono | 13px | 500 | 1.4 |
+
+---
+
+# Radius
+
+Rounded geometry should feel soft but architectural.
+
+| Token | Radius |
+|---|---|
+| xs | 4px |
+| sm | 8px |
+| md | 12px |
+| lg | 16px |
+| xl | 24px |
+| full | 999px |
+
+Cards should primarily use:
+
+```css
+border-radius: 16px;
+```
+
+---
+
+# Shadows
+
+Shadows should be subtle and diffused.
+
+Avoid strong elevation.
+
+## Soft
+
+```css
+box-shadow:
+0 1px 2px rgba(0,0,0,0.04),
+0 6px 24px rgba(0,0,0,0.03);
+```
+
+## Floating
+
+```css
+box-shadow:
+0 12px 40px rgba(0,0,0,0.06);
+```
+
+---
+
+# Grid
+
+## Layout
+
+- 12-column grid
+- Max width: `1280px`
+- Horizontal padding: `32px`
+- Large whitespace between sections
+
+---
+
+## Content Widths
+
+| Type | Width |
+|---|---|
+| reading | 720px |
+| dashboard | 1280px |
+| modal | 480px |
+| form | 560px |
+
+---
+
+# Spacing
+
+Base unit:
+
+```txt
+4px
+```
+
+Spacing scale:
+
+| Token | Value |
+|---|---|
+| 1 | 4px |
+| 2 | 8px |
+| 3 | 12px |
+| 4 | 16px |
+| 5 | 20px |
+| 6 | 24px |
+| 8 | 32px |
+| 10 | 40px |
+| 12 | 48px |
+| 16 | 64px |
+
+Use generous vertical rhythm.
+
+Sections should breathe.
+
+---
+
+# Components
+
+## Buttons
+
+### Primary
+
+- Dark background
+- White text
+- Pill radius
+- Minimal shadow
+
+```css
+background: #1D1715;
+color: white;
+border-radius: 999px;
+height: 40px;
+padding: 0 16px;
+```
+
+### Secondary
+
+```css
+background: #ECE8E5;
+color: #1D1715;
+```
+
+### Ghost
+
+Transparent background with subtle hover fill.
+
+---
+
+## Cards
+
+Cards are soft containers with quiet separation.
+
+```css
+background: rgba(255,255,255,0.7);
+border: 1px solid rgba(0,0,0,0.04);
+border-radius: 16px;
+```
+
+Avoid heavy borders.
+
+---
+
+## Inputs
+
+Inputs should feel invisible until focused.
+
+```css
+background: #F7F5F4;
+border: 1px solid transparent;
+```
+
+Focus:
+
+```css
+border-color: #8B7E77;
+box-shadow: 0 0 0 3px rgba(139,126,119,0.12);
+```
+
+---
+
+## Charts
+
+Charts should use muted earthy tones.
+
+Preferred palette:
+
+- taupe
+- sage
+- slate
+
+Avoid:
+
+- neon colors
+- bright blue dashboards
+- rainbow charts
+
+---
+
+# Motion
+
+Motion should be restrained and smooth.
+
+Preferred easing:
+
+```css
+cubic-bezier(0.22, 1, 0.36, 1)
+```
+
+Preferred duration:
+
+| Type | Duration |
+|---|---|
+| hover | 150ms |
+| panel | 250ms |
+| modal | 350ms |
+
+---
+
+# Layout Skeleton
+
+Application layout:
+
+- Left sidebar
+- Large content canvas
+- Floating top toolbar
+- Soft dashboard cards
+- Spacious internal padding
+
+The UI should always feel:
+
+- breathable
+- editorial
+- premium
+- calm
+
+Never dense or overly enterprise-looking.
+
+---
+
+# Design Keywords
+
+Use these words when generating UI:
+
+- editorial
+- taupe
+- soft neutral
+- premium minimal
+- typography-first
+- architectural spacing
+- calm dashboard
+- quiet luxury
+- modern serif
+- subtle shadows
+- muted grayscale
+- sophisticated SaaS
+
+---
+
+# AI Agent Instructions
+
+When generating UI:
+
+1. Prioritize whitespace over density
+2. Typography should create hierarchy
+3. Use muted neutral palettes
+4. Prefer soft cards over hard sections
+5. Avoid excessive color usage
+6. Keep interactions subtle
+7. Use serif fonts sparingly for emphasis
+8. Maintain premium visual restraint
+9. Design should feel timeless rather than trendy
+10. Every screen should feel breathable
--- a/README.md
+++ b/README.md
@ -1,75 +1,183 @@
-# Beaver App Instance
+# Beaver Project

-This branch contains a standalone Beaver app instance for an external orchestrator to deploy.
+`Beaver Project` is a multi-instance runtime environment deployed with Docker on a single machine:

-The app instance is a single Docker image that runs:
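+- Users first go to the standalone `auth-portal` to register or log in.
+- Registration triggers `authz-service` to call `deploy-control`.
+- `deploy-control` creates a dedicated `app-instance` container on the same machine.
+- `router-proxy` forwards traffic to the matching container by instance domain.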

- Next.js frontend
- FastAPI backend
- Nginx reverse proxy
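+The currently recommended minimal deployment is one Linux / WSL2 Ubuntu machine with Docker. Production domains and HTTPS can be handled outside the project by Nginx, Caddy, Traefik, or a cloud load balancer.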

-Authentication is delegated to Keycloak. This repository does not maintain local user registration, passwords, per-user instance routing, or deployment orchestration.
+## Components

-## Runtime Contract
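+| Directory | Responsibility | Default port |
+| --- | --- | --- |
+| `auth-portal/` | Entry point for user login, registration, and model-configuration onboarding | `3081` |
+| `authz-service/` | AuthZ service; orchestrates account and backend identities | `19090` |
+| `deploy-control/` | Deployment control plane; calls Docker to create and manage instances | `8090` |
+| `router-proxy/` | Unified instance ingress proxy; dispatches to instance containers by Host | `8088` |
+| `app-instance/` | Single-user runtime instance; the container holds the frontend, backend, and Nginx | `8080` inside the container |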

-The external orchestrator is responsible for:
+Public deployments usually expose only:

- Building or pulling the app image
- Providing a mounted Beaver home directory
- Providing `config.json`
- Choosing the public host and port
- Registering the public callback URL and post-logout callback URL in Keycloak
- Managing lifecycle, routing, TLS, and scaling
+- `auth-portal`: `3081`, or `https://portal.example.com` behind an outer proxy
+- `router-proxy`: `8088`, or `https://<slug>.apps.example.com` behind an outer proxy

-The app instance exposes container port `8080`:
+Do not expose `deploy-control:8090` or `authz-service:19090` directly to the public internet.
+
+## Request Flow
+
+Registration:

 ```text
-/      frontend
-/api   backend API
-/ws    backend WebSocket
+Browser
+  -> auth-portal
+  -> authz-service POST /portal/register
+  -> deploy-control POST /api/instances/register
+  -> app-instance/create-instance.sh
+  -> app-instance POST /api/auth/register
+  -> authz-service /oauth/register or /backends/register
+  -> auth-portal provider onboarding
+  -> deploy-control POST /api/instances/configure-provider
 ```

-## Keycloak
-
-Default IdP settings:
+Login:

 ```text
-issuer:    https://keycloak.bwgdi.com/realms/beaver
-client_id: beaver-agnet
+Browser
+  -> auth-portal
+  -> deploy-control POST /api/instances/resolve
+  -> app-instance POST /api/auth/login
+  -> app-instance frontend URL
 ```

-For the current plain-IP deployment, register these values in Keycloak:
+## Quick Start

-```text
-web_origin:   http://172.19.0.245:18080
-redirect_uri: http://172.19.0.245:18080/auth/callback
-post_logout_redirect_uri: http://172.19.0.245:18080/logout/callback
-```
+For the full local workflow, see:

-The app validates Keycloak JWT issuer, signature, expiry, audience/azp, and nonce where applicable. The Keycloak `sub` claim is used as the application user ID.
+- [部署指南.md](./部署指南.md)

-## Build
+For domains, HTTPS, and public reverse proxying, see:
+
+- [域名配置指引.md](./域名配置指引.md)
+
+Minimal configuration variables:

 ```bash
-cd app-instance
-docker build -t beaver/app-instance:keycloak-login .
+export PROJECT_ROOT=/home/ivan/xuan/beaver_project
+export BEAVER_NET=beaver-instance-edge
+export BEAVER_PROXY_CONTAINER_NAME=beaver-router-proxy
+
+export BEAVER_DEPLOY_TOKEN="$(openssl rand -hex 32)"
+export BEAVER_AUTHZ_INTERNAL_TOKEN="$(openssl rand -hex 32)"
+
+export BEAVER_BASE_DOMAIN=localhost
+export BEAVER_AUTHZ_URL='http://beaver-authz-service:19090'
+export BEAVER_DEPLOY_URL='http://beaver-deploy-control:8090'
+
+export BEAVER_OUTLOOK_MCP_URL=''
+export BEAVER_OUTLOOK_MCP_SERVER_ID='outlook_mcp'
 ```

-## Run Locally
+Startup order:
+
+1. Create the runtime directories.
+2. Build the four images.
+3. Create the shared Docker network.
+4. Start `router-proxy`.
+5. Start `authz-service`.
+6. Start `deploy-control`.
+7. Start `auth-portal`.
+8. Open `http://127.0.0.1:3081/register` to test registration.
+
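+## Key Configuration Relationships
+
+`DEPLOY_API_TOKEN` and `DEPLOY_CONTROL_API_TOKEN` must be equal:
+
+- `auth-portal` / `authz-service` use `DEPLOY_API_TOKEN` when calling `deploy-control`.
+- `deploy-control` uses `DEPLOY_CONTROL_API_TOKEN` to validate requests.
+
+In this single-machine deployment, `AUTHZ_ISSUER` must be set to the container-network address: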
+
+```text
+http://beaver-authz-service:19090
+```
+
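+Do not set it to `http://127.0.0.1:19090`: inside a newly created `app-instance` container, `127.0.0.1` refers to that container itself, not the AuthZ container.
+
+`DEPLOY_PUBLIC_*` determines the URL shown to users for a new instance: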

 ```bash
-cd app-instance
-./run-standalone.sh --build --replace
+DEPLOY_PUBLIC_SCHEME=http
+DEPLOY_PUBLIC_BASE_DOMAIN=localhost
+DEPLOY_PUBLIC_PORT=8088
 ```

-The script prints the `web_origin`, `redirect_uri`, and `post_logout_redirect_uri` values that must be configured in Keycloak.
+For local testing, instance URLs look like:

-## Useful Paths
+```text
+http://alice.localhost:8088
+```

- `app-instance/` - app image, frontend, backend, runtime helper script, app tests
- `skills/` - reusable skill content retained for product/runtime packaging
- `agents/` - agent registry retained for runtime packaging
- `docs/` - product and architecture documentation
- `AGENTS.md` - local coding-agent instructions
+For a production HTTPS domain, this usually becomes:

-See `app-instance/README.md` for the app-specific contract and environment variables.
+```bash
+DEPLOY_PUBLIC_SCHEME=https
+DEPLOY_PUBLIC_BASE_DOMAIN=apps.example.com
+DEPLOY_PUBLIC_PORT=443
+```
+
+Instance URLs then look like:
+
+```text
+https://alice.apps.example.com
+```
+
+This assumes you have already forwarded `80/443` traffic for `*.apps.example.com` to `router-proxy:8088` outside the project.
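
The two URL examples above are consistent with a rule like the one sketched here. This is an illustration only: `instance_url` is a hypothetical helper, and dropping the port when it is the scheme's default is inferred from the examples, not confirmed `deploy-control` behavior.

```python
def instance_url(slug, scheme, base_domain, port):
    """Build the user-facing URL from the DEPLOY_PUBLIC_* values (hypothetical helper)."""
    default_port = {"http": 80, "https": 443}.get(scheme)
    host = f"{slug}.{base_domain}"
    # Assumption: the port is omitted when it is the default for the scheme.
    if port == default_port:
        return f"{scheme}://{host}"
    return f"{scheme}://{host}:{port}"
```

With the local settings this yields a URL on port `8088`, and with the HTTPS settings the `:443` suffix is left out, matching the two examples.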
+
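+## Model Configuration
+
+The current version does not write a model provider, model, or API key when registration creates an instance.
+
+The flow is:
+
+1. Registration first creates an instance without model credentials.
+2. `auth-portal` opens the model-configuration onboarding page.
+3. After the user confirms, the Portal calls `deploy-control /api/instances/configure-provider`.
+4. `deploy-control` writes the instance's `config.json` and restarts the corresponding container.
+
+If the user skips onboarding, the instance is still created successfully, but the provider configuration must be completed inside the instance before it can call models.
+
+## Persistent Directories
+
+Keep at least: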
+
+```text
+authz-service/runtime/data
+app-instance/runtime/instances
+app-instance/runtime/registry
+router-proxy/runtime/conf.d
+```
+
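+Do not delete these directories if you need to keep accounts, instances, or configuration.
+
+## Template Files
+
+These environment variable templates can be used as references: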
+
+- [`.env.example`](./.env.example)
+- [`auth-portal/src/.env.example`](./auth-portal/src/.env.example)
+- [`authz-service/.env.example`](./authz-service/.env.example)
+- [`deploy-control/.env.example`](./deploy-control/.env.example)
+- [`router-proxy/.env.example`](./router-proxy/.env.example)
+
+These templates are not loaded automatically by any script. You can `export` the variables manually or pass `--env-file` to `docker run`.
+
+## Subproject Documentation
+
+- [`app-instance/README.md`](./app-instance/README.md)
+- [`auth-portal/src/README.md`](./auth-portal/src/README.md)
+- [`authz-service/README.md`](./authz-service/README.md)
+- [`deploy-control/README.md`](./deploy-control/README.md)
+- [`router-proxy/README.md`](./router-proxy/README.md)
--- a/agents/registry.json
+++ b/agents/registry.json
@ -1,145 +1,4 @@
 {
-  "agents": [
-    {
-      "agent_id": "researcher",
-      "capabilities": [
-        "research",
-        "analysis",
-        "source review",
-        "requirements"
-      ],
-      "created_at": "2026-05-11T03:13:06.912240+00:00",
-      "description": "Finds facts, references, constraints, and implementation options.",
-      "display_name": "Researcher",
-      "metadata": {},
-      "model": null,
-      "name": "researcher",
-      "priority": 50,
-      "provider_name": null,
-      "role": "research",
-      "skill_names": [],
-      "source": "builtin",
-      "status": "active",
-      "system_prompt": "You are a research specialist. Gather concise evidence and tradeoffs for the parent task.",
-      "tags": [
-        "planning",
-        "research"
-      ],
-      "tool_hints": [],
-      "updated_at": "2026-05-11T03:13:06.912247+00:00"
-    },
-    {
-      "agent_id": "implementer",
-      "capabilities": [
-        "implementation",
-        "coding",
-        "refactor",
-        "integration"
-      ],
-      "created_at": "2026-05-11T03:13:06.912250+00:00",
-      "description": "Builds scoped implementation slices and proposes concrete changes.",
-      "display_name": "Implementer",
-      "metadata": {},
-      "model": null,
-      "name": "implementer",
-      "priority": 45,
-      "provider_name": null,
-      "role": "implementation",
-      "skill_names": [],
-      "source": "builtin",
-      "status": "active",
-      "system_prompt": "You are an implementation specialist. Produce practical, scoped implementation output.",
-      "tags": [
-        "coding",
-        "build"
-      ],
-      "tool_hints": [],
-      "updated_at": "2026-05-11T03:13:06.912251+00:00"
-    },
-    {
-      "agent_id": "reviewer",
-      "capabilities": [
-        "review",
-        "quality",
-        "risk",
-        "verification"
-      ],
-      "created_at": "2026-05-11T03:13:06.912252+00:00",
-      "description": "Reviews plans, code, outputs, and risks before final synthesis.",
-      "display_name": "Reviewer",
-      "metadata": {},
-      "model": null,
-      "name": "reviewer",
-      "priority": 45,
-      "provider_name": null,
-      "role": "review",
-      "skill_names": [],
-      "source": "builtin",
-      "status": "active",
-      "system_prompt": "You are a review specialist. Focus on defects, missing requirements, and risks.",
-      "tags": [
-        "review",
-        "quality"
-      ],
-      "tool_hints": [],
-      "updated_at": "2026-05-11T03:13:06.912253+00:00"
-    },
-    {
-      "agent_id": "tester",
-      "capabilities": [
-        "testing",
-        "verification",
-        "regression",
-        "qa"
-      ],
-      "created_at": "2026-05-11T03:13:06.912255+00:00",
-      "description": "Designs and executes verification checks for task outputs.",
-      "display_name": "Tester",
-      "metadata": {},
-      "model": null,
-      "name": "tester",
-      "priority": 40,
-      "provider_name": null,
-      "role": "testing",
-      "skill_names": [],
-      "source": "builtin",
-      "status": "active",
-      "system_prompt": "You are a testing specialist. Identify focused checks and report pass/fail evidence.",
-      "tags": [
-        "test",
-        "quality"
-      ],
-      "tool_hints": [],
-      "updated_at": "2026-05-11T03:13:06.912256+00:00"
-    },
-    {
-      "agent_id": "documenter",
-      "capabilities": [
-        "documentation",
-        "explanation",
-        "migration notes",
-        "release notes"
-      ],
-      "created_at": "2026-05-11T03:13:06.912257+00:00",
-      "description": "Writes and reconciles user-facing and internal documentation updates.",
-      "display_name": "Documenter",
-      "metadata": {},
-      "model": null,
-      "name": "documenter",
-      "priority": 35,
-      "provider_name": null,
-      "role": "documentation",
-      "skill_names": [],
-      "source": "builtin",
-      "status": "active",
-      "system_prompt": "You are a documentation specialist. Produce concise docs aligned with the implementation.",
-      "tags": [
-        "docs",
-        "communication"
-      ],
-      "tool_hints": [],
-      "updated_at": "2026-05-11T03:13:06.912258+00:00"
-    }
-  ],
+  "agents": [],
  "version": 1
 }
--- a/app-instance/.dockerignore
+++ b/app-instance/.dockerignore
@ -1,5 +1,4 @@
 runtime/
-.env
 backend/.git/
 backend/.venv/
 backend/.pytest_cache/
@ -14,3 +13,4 @@ frontend/node_modules/
 frontend/.next/
 frontend/.next-dev/
 frontend/tsconfig.tsbuildinfo
+
--- a/app-instance/.env.example
+++ b/app-instance/.env.example
@ -1,28 +0,0 @@
-# Docker image/runtime
-IMAGE_NAME=beaver/app-instance:keycloak
-CONTAINER_NAME=beaver-app-instance
-HOST_IP=172.19.0.245
-HOST_PORT=18080
-HOST_BIND_IP=0.0.0.0
-DATA_DIR=runtime/standalone
-DOCKER_NETWORK=
-
-# Beaver model config rendered into runtime/standalone/beaver-home/config.json
-BEAVER_PROVIDER=custom
-BEAVER_MODEL=
-BEAVER_API_KEY=
-BEAVER_API_BASE=
-
-# Keycloak public client
-KEYCLOAK_ISSUER=https://keycloak.bwgdi.com/realms/beaver
-KEYCLOAK_CLIENT_ID=beaver-agnet
-KEYCLOAK_REDIRECT_URI=
-KEYCLOAK_POST_LOGOUT_REDIRECT_URI=
-
-# Build mirrors
-NPM_REGISTRY=https://registry.npmmirror.com
-NPM_FETCH_RETRIES=5
-NPM_FETCH_RETRY_MIN_TIMEOUT=20000
-NPM_FETCH_RETRY_MAX_TIMEOUT=120000
-APT_MIRROR=https://mirrors.tuna.tsinghua.edu.cn/debian
-PYPI_INDEX_URL=https://pypi.tuna.tsinghua.edu.cn/simple
--- a/app-instance/Dockerfile
+++ b/app-instance/Dockerfile
@ -20,15 +20,11 @@ RUN --mount=type=cache,target=/root/.npm \

 COPY frontend/ ./

-ARG NEXT_PUBLIC_KEYCLOAK_ISSUER="https://keycloak.bwgdi.com/realms/beaver"
-ARG NEXT_PUBLIC_KEYCLOAK_CLIENT_ID="beaver-agnet"
-ARG NEXT_PUBLIC_KEYCLOAK_REDIRECT_URI=""
-ARG NEXT_PUBLIC_KEYCLOAK_POST_LOGOUT_REDIRECT_URI=""
+ARG NEXT_PUBLIC_AUTH_PORTAL_URL=""
+ARG NEXT_PUBLIC_AUTH_PORTAL_PORT="3081"

-ENV NEXT_PUBLIC_KEYCLOAK_ISSUER=${NEXT_PUBLIC_KEYCLOAK_ISSUER}
-ENV NEXT_PUBLIC_KEYCLOAK_CLIENT_ID=${NEXT_PUBLIC_KEYCLOAK_CLIENT_ID}
-ENV NEXT_PUBLIC_KEYCLOAK_REDIRECT_URI=${NEXT_PUBLIC_KEYCLOAK_REDIRECT_URI}
-ENV NEXT_PUBLIC_KEYCLOAK_POST_LOGOUT_REDIRECT_URI=${NEXT_PUBLIC_KEYCLOAK_POST_LOGOUT_REDIRECT_URI}
+ENV NEXT_PUBLIC_AUTH_PORTAL_URL=${NEXT_PUBLIC_AUTH_PORTAL_URL}
+ENV NEXT_PUBLIC_AUTH_PORTAL_PORT=${NEXT_PUBLIC_AUTH_PORTAL_PORT}

 # API / WS go through the same-origin reverse proxy; no instance address is hard-coded at build time.
 RUN npm run build
@ -43,6 +39,7 @@ ENV DEBIAN_FRONTEND=noninteractive \
    BEAVER_HOME=/root/.beaver \
    BEAVER_CONFIG_PATH=/root/.beaver/config.json \
    BEAVER_WORKSPACE=/root/.beaver/workspace \
+    BEAVER_AUTH_FILE=/root/.beaver/web_auth_users.json \
    PORT=3000 \
    HOSTNAME=127.0.0.1

--- a/app-instance/README.md
+++ b/app-instance/README.md
@ -1,56 +1,120 @@
 # app-instance

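-Monolithic app instance: one Docker container runs the frontend, backend, and Nginx. This directory only provides a clean application instance; it no longer handles user registration and login, instance orchestration, the registry, or reverse-proxy routing.
+Single-instance application unit:
+
+- One Docker container runs the frontend, backend, and Nginx reverse proxy together
+- The frontend is served at `/`
+- The backend API is served at `/api`
+- WebSocket is served at `/ws`

 ## Key Files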

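- `Dockerfile`: unified image build entry point
- `entrypoint.sh`: starts the frontend, backend, and Nginx inside the container
- `run-standalone.sh`: reads parameters from `.env`, then builds and runs the monolithic container
- `.env.example`: a deployment parameter template you can copy
+- `Dockerfile`
+  - Unified image build entry point
+- `entrypoint.sh`
+  - Starts the frontend, backend, and Nginx inside the container
+- `create-instance.sh`
+  - Creates the instance directory, generates configuration, starts the container, and writes the registry entry
+- `remove-instance.sh`
+  - Removes the container, removes the registry entry, and optionally cleans up the instance directory
+- `list-instances.sh`
+  - Lists the currently registered instances
+- `instance-registry.py`
+  - Maintains `runtime/registry/instances.json`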

-## Keycloak SSO
+## Registry

-Login is currently delegated to Keycloak:
-
- issuer: `https://keycloak.bwgdi.com/realms/beaver`
- client_id: `beaver-agnet`
- flow: Authorization Code + PKCE S256
- password grant: disabled
- client secret: none
-
-The application requires the following to be configured in the Keycloak client:
-
- `redirect_uri`: `http://<host-ip>:<host-port>/auth/callback`
- `post_logout_redirect_uri`: `http://<host-ip>:<host-port>/logout/callback`
- `web_origin`: `http://<host-ip>:<host-port>`
-
-Default single-host deployment example:
+Default registry path:

 ```text
-redirect_uri: http://172.19.0.245:18080/auth/callback
-post_logout_redirect_uri: http://172.19.0.245:18080/logout/callback
-web_origin: http://172.19.0.245:18080
+runtime/registry/instances.json
 ```

-## Deployment
+Each record contains at least:
+
+- `instance_id`
+- `instance_slug`
+- `container_name`
+- `host_port`
+- `public_url`
+- `instance_root`
+- `image_name`
+
+## Common commands
+
+### 1. Build the image

 ```bash
-cp .env.example .env
-./run-standalone.sh --build --replace
+docker build -t beaver/app-instance:latest .
 ```

-The script prints the `redirect_uri`, `post_logout_redirect_uri`, and `web_origin` values that currently need to be entered in Keycloak.
+### 2. Create an instance

-## Data directory
+```bash
+./create-instance.sh \
+  --image beaver/app-instance:latest \
+  --instance-id demo-001 \
+  --auth-username admin \
+  --auth-password 123456 \
+  --api-key 'your-api-key'
+```

-Default data directory:
+Optional parameters:
+
+- `--host-port`
+- `--public-url`
+- `--username`
+- `--email`
+- `--instance-host`
+- `--authz-base-url`
+- `--backend-id`
+- `--client-id`
+- `--client-secret`
+- `--network`
+- `--host-bind-ip`
+- `--initial-skills-dir`
+- `--skip-initial-skills`
+- `--build`
+- `--replace`
+
+### 3. List instances
+
+```bash
+./list-instances.sh
+./list-instances.sh --json
+```
+
+### 4. Remove an instance
+
+```bash
+./remove-instance.sh --instance-id demo-001
+```
+
+To also delete the instance directory:
+
+```bash
+./remove-instance.sh --instance-id demo-001 --purge-data
+```
+
+## Directory conventions
+
+Default instance data directory:

 ```text
-runtime/standalone/beaver-home/
+runtime/instances/<instance-slug>/
 ```

-Mounted inside the container at:
+The following is generated inside it:
+
+```text
+runtime/instances/<instance-slug>/
+└── beaver-home
+    ├── config.json
+    ├── web_auth_users.json
+    └── workspace/
+```
+
+This directory is the configuration and data boundary of the single-user sandbox. Inside the container it is mounted at:

 ```text
 /root/.beaver/
@ -63,4 +127,36 @@ BEAVER_CONFIG_PATH=/root/.beaver/config.json
 BEAVER_WORKSPACE=/root/.beaver/workspace
 ```

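-The model provider, API key, and workspace data are all maintained under this directory.
+So the model `provider/api_key/api_base/model` only needs to be configured once; Web / channel requests do not need to carry an API key, and should not.
+
+By default, `create-instance.sh` copies the repository root `skills/` directory into the instance workspace without overwriting, and mounts the same directory read-only at `/opt/app/initial-skills` in the instance container. On every start, `entrypoint.sh` uses that directory to fill in any missing published initial skills; existing skill directories are not overwritten, and the index is only extended by union.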
+
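+## Current status
+
+This layer already supports:
+
+- Unified image build
+- Installing and starting the new `beaver` backend inside the image
+- Instance creation
+- Instance removal
+- Instance listing
+- Registry-based port allocation
+- Recording the username and instance host for deploy-control / router-proxy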
+
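+## Production notes
+
+- The instance container's host port binds only to `127.0.0.1` by default
+- External access should always go through `router-proxy`
+- If you really need to expose a single instance port directly to the public internet, pass `--host-bind-ip 0.0.0.0` explicitly
+- When using the shared `external-connector` sidecar, every instance container must carry its own internal callback address:
+  `EXTERNAL_CONNECTOR_CALLBACK_BASE_URL=http://<app-instance-container-name>:8080`
+- When an instance is created with `create-instance.sh --network <docker-network>`, the script defaults to
+  `http://<container-name>:8080` as the callback address; production deployments can also override it explicitly with
+  `--external-connector-callback-base-url <url>`
+- `BEAVER_BRIDGE_BASE_URL` serves only as the sidecar's legacy connection or fallback address; multi-instance deployments must not rely on it to route all inbound events
+
+Possible next steps:
+
+- Instance creation triggered from the portal
+- URL allocation and reverse proxying
+- Instance renewal / disable / enable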
--- a/app-instance/backend/beaver/interfaces/web/app.py
+++ b/app-instance/backend/beaver/interfaces/web/app.py
@ -7,6 +7,8 @@ import asyncio
 import io
 import mimetypes
 import os
+import re
+import secrets
 import shutil
 import time
 import zipfile
@ -16,8 +18,6 @@ from pathlib import Path
 from types import SimpleNamespace
 from typing import Any

-import httpx
-
 from beaver.engine.providers.registry import PROVIDERS, find_by_name
 from beaver.foundation.config import default_config_path, load_config
 from beaver.foundation.events import ChannelIdentity, InboundMessage
@ -50,9 +50,11 @@ from beaver.services.user_file_resolver import (
    UserFileStorageResolver,
    build_file_auth_context,
 )
-from beaver.skills.learning import SkillLearningWorker, SkillLearningWorkerConfig
+from beaver.skills.authoring import canonical_skill_format_instructions, ensure_canonical_skill_body, normalize_skill_frontmatter
+from beaver.skills.authoring.format import parse_skill_rewrite_json
+from beaver.skills.learning import SkillLearningService, SkillLearningWorker, SkillLearningWorkerConfig
 from beaver.skills.learning.replay import ReplayRunner
-from beaver.skills.catalog.utils import parse_frontmatter
+from beaver.skills.catalog.utils import extract_required_tool_names, parse_frontmatter

 from .deps import get_agent_service
 from .files import (
@ -70,12 +72,6 @@ from .files import (
    workspace_file_preview,
    workspace_file_path,
 )
-from .keycloak_auth import (
-    KeycloakAuthConfig,
-    KeycloakIdentity,
-    KeycloakTokenVerifier,
-    extract_bearer_token,
-)
 from .schemas import (
    WebChatAcceptanceRequest,
    WebChatAcceptanceResponse,
@ -103,8 +99,11 @@ from .schemas import (

 try:
    from fastapi import FastAPI, File, Form, Header, HTTPException, Request, UploadFile, WebSocket, WebSocketDisconnect
+    from fastapi.middleware.cors import CORSMiddleware
    from fastapi.responses import JSONResponse, Response
 except ModuleNotFoundError:  # pragma: no cover - fallback for skeleton-only environments
+    CORSMiddleware = None  # type: ignore[assignment]
+
    def File(default: Any = None) -> Any:  # type: ignore[override]
        return default

@ -281,6 +280,7 @@ async def _app_lifespan(
        worker = SkillLearningWorker(
            pipeline=loaded.skill_learning_pipeline,  # type: ignore[arg-type]
            provider_bundle_factory=lambda: attached_service._make_provider_bundle_for_task(loaded, {}),  # noqa: SLF001
+            replay_runner_factory=lambda: ReplayRunner(agent_loop=attached_service.create_loop()),
            config=worker_config,
        )
        worker_task = asyncio.create_task(worker.run_forever())
@ -523,6 +523,20 @@ def _self_restart_enabled() -> bool:
    return os.getenv("BEAVER_ENABLE_SELF_RESTART", "1").strip() not in {"0", "false", "False"}


+def _cors_allow_origins() -> list[str]:
+    raw = os.getenv("BEAVER_CORS_ALLOW_ORIGINS", "").strip()
+    if raw:
+        return [origin.strip().rstrip("/") for origin in raw.split(",") if origin.strip()]
+    return [
+        "http://127.0.0.1:3000",
+        "http://localhost:3000",
+        "http://127.0.0.1:3080",
+        "http://localhost:3080",
+        "http://127.0.0.1:3081",
+        "http://localhost:3081",
+    ]
+
+
 def _schedule_self_restart(delay_seconds: float = 0.75) -> None:
    import threading
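
Review note: the parsing rule that `_cors_allow_origins` applies to `BEAVER_CORS_ALLOW_ORIGINS` can be checked in isolation. The following is a minimal sketch that assumes only what the hunk above shows (comma-separated values, whitespace trimmed, trailing slash removed, a fixed localhost fallback). `parse_cors_origins` and `DEFAULT_ORIGINS` are hypothetical names used for illustration and are not part of the commit.

```python
# Hypothetical standalone version of the origin parsing shown in the hunk.
DEFAULT_ORIGINS = [
    "http://127.0.0.1:3000",
    "http://localhost:3000",
    "http://127.0.0.1:3080",
    "http://localhost:3080",
    "http://127.0.0.1:3081",
    "http://localhost:3081",
]


def parse_cors_origins(raw: str, defaults: list[str] = DEFAULT_ORIGINS) -> list[str]:
    """Split a comma-separated origin list, trimming blanks and trailing slashes."""
    raw = raw.strip()
    if raw:
        # Empty segments (for example from a trailing comma) are dropped.
        return [origin.strip().rstrip("/") for origin in raw.split(",") if origin.strip()]
    # An unset or blank value falls back to the local development origins.
    return list(defaults)


if __name__ == "__main__":
    print(parse_cors_origins(" https://a.example/ , http://b.example:3081 "))
    print(parse_cors_origins(""))
```

One consequence of this rule is that a value consisting only of separators (for example `","`) is non-empty after stripping, so it produces an empty allow-list rather than the defaults.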

@ -563,22 +577,25 @@ def create_app(
            shutdown_force=shutdown_force,
        ),
    )
-    app.state.keycloak_auth_config = KeycloakAuthConfig.from_env()
-    app.state.keycloak_token_verifier = KeycloakTokenVerifier(config=app.state.keycloak_auth_config)
+    if CORSMiddleware is not None:
+        app.add_middleware(
+            CORSMiddleware,
+            allow_origins=_cors_allow_origins(),
+            allow_credentials=True,
+            allow_methods=["*"],
+            allow_headers=["*"],
+        )
+    app.state.auth_tokens = {}
+    app.state.handoff_codes = {}
+    app.state.auth_file = Path(os.getenv("BEAVER_AUTH_FILE") or "")
    max_file_size = 50 * 1024 * 1024
    max_user_file_upload_size = _int_env("BEAVER_USER_FILES_MAX_UPLOAD_BYTES", 5 * 1024 * 1024 * 1024)
    user_file_upload_part_size = _int_env("BEAVER_USER_FILES_UPLOAD_PART_SIZE", 10 * 1024 * 1024)

    def _user_file_resolver(request: Request, authorization: str | None) -> UserFileStorageResolver:
-        identity = _require_web_identity(app, authorization)
+        username = _require_web_user(app, authorization)
        loaded = get_agent_service(request).create_loop().boot()
-        auth_context = build_file_auth_context(
-            username=identity.username,
-            config=loaded.config,
-            user_id=identity.user_id,
-            scopes=identity.realm_roles + identity.client_roles,
-            auth_source="keycloak",
-        )
+        auth_context = build_file_auth_context(username=username, config=loaded.config)
        return UserFileStorageResolver(config=loaded.config, workspace=loaded.workspace, auth_context=auth_context)

    async def _user_file_service(request: Request, authorization: str | None) -> UserFileService:
@ -982,72 +999,168 @@ def create_app(
        _schedule_self_restart()
        return JSONResponse({"ok": True, "restarting": True}, status_code=202)

-    @app.post("/api/auth/callback")
-    async def auth_callback(request: Request, payload: dict[str, Any]) -> dict[str, Any]:
-        code = _clean_text(payload.get("code"))
-        code_verifier = _clean_text(payload.get("code_verifier"))
-        redirect_uri = _clean_text(payload.get("redirect_uri"))
-        nonce = _clean_text(payload.get("nonce")) or None
-        if not code or not code_verifier or not redirect_uri:
-            raise HTTPException(status_code=400, detail="code, code_verifier, and redirect_uri are required")
+    @app.post("/api/auth/login")
+    async def auth_login(request: Request, payload: dict[str, Any]) -> dict[str, Any]:
+        username = _clean_text(payload.get("username"))
+        password = str(payload.get("password") or "")
+        if not username or not password:
+            raise HTTPException(status_code=400, detail="Username and password are required")

-        keycloak_config: KeycloakAuthConfig = app.state.keycloak_auth_config
-        try:
-            async with httpx.AsyncClient(timeout=15.0, trust_env=False) as client:
-                response = await client.post(
-                    keycloak_config.token_url,
-                    data={
-                        "grant_type": "authorization_code",
-                        "client_id": keycloak_config.client_id,
-                        "code": code,
-                        "redirect_uri": redirect_uri,
-                        "code_verifier": code_verifier,
-                    },
-                    headers={"Accept": "application/json"},
-                )
-        except httpx.HTTPError as exc:
-            raise HTTPException(status_code=502, detail=f"Keycloak token exchange failed: {exc}") from exc
-        if response.is_error:
-            raise HTTPException(status_code=401, detail=f"Keycloak token exchange rejected: {response.text}")
-        token_payload = response.json()
-        if not isinstance(token_payload, dict):
-            raise HTTPException(status_code=502, detail="Invalid Keycloak token response")
-        access_token = _clean_text(token_payload.get("access_token"))
-        id_token = _clean_text(token_payload.get("id_token"))
-        refresh_token = _clean_text(token_payload.get("refresh_token"))
-        if not access_token:
-            raise HTTPException(status_code=502, detail="Keycloak token response missing access_token")
+        users = _load_auth_users(_auth_file_path())
+        expected = users.get(username)
+        if expected is None or not secrets.compare_digest(expected, password):
+            raise HTTPException(status_code=401, detail="Invalid username or password")

-        verifier: KeycloakTokenVerifier = app.state.keycloak_token_verifier
-        identity = verifier.verify(id_token, expected_nonce=nonce) if id_token else verifier.verify(access_token)
-        verifier.verify(access_token)
+        token = _issue_web_token(app, username)
+        handoff_code, handoff_expires_at = _issue_handoff_code(app, username, token)
        return {
-            "access_token": access_token,
-            "id_token": id_token,
-            "refresh_token": refresh_token,
-            "expires_in": token_payload.get("expires_in"),
-            "token_type": token_payload.get("token_type") or "bearer",
-            "user_id": identity.user_id,
-            "username": identity.username,
-            "email": identity.email,
+            "access_token": token,
+            "refresh_token": "",
+            "token_type": "bearer",
+            "user_id": username,
+            "username": username,
            "role": "owner",
+            "handoff_code": handoff_code,
+            "handoff_expires_at": handoff_expires_at,
            "backend_connection": _backend_connection_view(request),
            "local_backend": _local_backend_view(),
        }

+    @app.post("/api/auth/register")
+    async def auth_register(request: Request, payload: dict[str, Any]) -> dict[str, Any]:
+        username = _clean_text(payload.get("username"))
+        password = str(payload.get("password") or "")
+        email = _clean_text(payload.get("email")) or ""
+        if not username or not password:
+            raise HTTPException(status_code=400, detail="Username and password are required")
+
+        auth_file = _auth_file_path()
+        users = _load_auth_users_if_present(auth_file)
+        user_exists = username in users
+        if user_exists and not secrets.compare_digest(users[username], password):
+            raise HTTPException(
+                status_code=409,
+                detail="Username already exists. Use the existing password to finish setup or log in.",
+            )
+
+        agent_service = get_agent_service(request)
+        loaded = agent_service.create_loop().boot()
+        config = loaded.config
+        authz_base_url = _clean_text(payload.get("authz_base_url")) or (config.authz.base_url if config.authz.enabled else "")
+        backend_name = _clean_text(payload.get("backend_name")) or config.backend_identity.name or username
+        requested_backend_id = _clean_text(payload.get("backend_id")) or config.backend_identity.backend_id or None
+        public_base_url = (
+            _clean_text(payload.get("base_url"))
+            or config.backend_identity.public_base_url
+            or os.getenv("BEAVER_FRONTEND_PUBLIC_BASE_URL")
+            or str(request.base_url).rstrip("/")
+        )
+        frontend_base_url = _clean_text(payload.get("frontend_base_url")) or public_base_url
+
+        authz_user_registered = False
+        authz_backend_registered = False
+        local_backend: dict[str, Any] | None = None
+
+        if authz_base_url:
+            from beaver.integrations.authz import AuthzClient
+
+            try:
+                authz_payload = await AuthzClient(
+                    authz_base_url,
+                    timeout_seconds=config.authz.request_timeout_seconds,
+                ).register_user(
+                    username=username,
+                    password=password,
+                    email=email or None,
+                    backend_name=backend_name,
+                    backend_id=requested_backend_id,
+                    base_url=public_base_url,
+                    frontend_base_url=frontend_base_url,
+                )
+            except Exception as exc:  # noqa: BLE001 - expose upstream setup failures to portal
+                raise HTTPException(status_code=502, detail=f"AuthZ registration failed: {exc}") from exc
+
+            backend = authz_payload.get("backend") if isinstance(authz_payload, dict) else {}
+            if isinstance(backend, dict):
+                backend_id = _clean_text(backend.get("backend_id")) or requested_backend_id
+                client_id = _clean_text(backend.get("client_id")) or backend_id
+                client_secret = _clean_text(backend.get("client_secret")) or config.backend_identity.client_secret
+                if backend_id and client_id and client_secret:
+                    local_backend = _save_backend_identity(
+                        agent_service,
+                        config_path=config.config_path or default_config_path(workspace=loaded.workspace),
+                        backend_id=backend_id,
+                        client_id=client_id,
+                        client_secret=client_secret,
+                        name=_clean_text(backend.get("name")) or backend_name,
+                        public_base_url=public_base_url,
+                        authz_base_url=authz_base_url,
+                    )
+                    authz_backend_registered = True
+            authz_user_registered = bool(authz_payload)
+
+        if not user_exists:
+            users[username] = password
+            _save_auth_users(auth_file, users)
+
+        token = _issue_web_token(app, username)
+        handoff_code, handoff_expires_at = _issue_handoff_code(app, username, token)
+        backend_connection = {
+            **_backend_connection_view(request),
+            "public_base_url": public_base_url,
+            "api_base_url": public_base_url,
+            "frontend_base_url": frontend_base_url,
+            "registered": bool(local_backend),
+        }
+        if local_backend is not None:
+            backend_connection.update(
+                {
+                    "backend_id": local_backend.get("backend_id"),
+                    "client_id": local_backend.get("client_id"),
+                    "name": local_backend.get("name"),
+                }
+            )
+        return {
+            "access_token": token,
+            "refresh_token": "",
+            "token_type": "bearer",
+            "user_id": username,
+            "username": username,
+            "email": email,
+            "role": "owner",
+            "handoff_code": handoff_code,
+            "handoff_expires_at": handoff_expires_at,
+            "existing_user": user_exists,
+            "authz": {
+                "enabled": bool(authz_base_url),
+                "base_url": authz_base_url or None,
+                "user_registered": authz_user_registered,
+                "backend_registered": authz_backend_registered,
+            },
+            "backend_connection": backend_connection,
+            "local_backend": local_backend or _local_backend_view(),
+        }
+
+    @app.post("/api/auth/handoff/consume")
+    async def auth_handoff_consume(payload: dict[str, Any]) -> dict[str, Any]:
+        return _consume_handoff_code(app, str(payload.get("code") or ""))
+
    @app.get("/api/auth/me")
    async def auth_me(authorization: str | None = Header(default=None)) -> dict[str, Any]:
-        identity = _require_web_identity(app, authorization)
+        username = _require_web_user(app, authorization)
        return {
-            "id": identity.user_id,
-            "username": identity.username,
-            "email": identity.email,
+            "id": username,
+            "username": username,
+            "email": os.getenv("BEAVER_BACKEND_IDENTITY__EMAIL", ""),
            "role": "owner",
            "quota_tier": "single-user",
        }

    @app.post("/api/auth/logout")
-    async def auth_logout() -> dict[str, Any]:
+    async def auth_logout(authorization: str | None = Header(default=None)) -> dict[str, Any]:
+        if authorization and authorization.lower().startswith("bearer "):
+            token = authorization[7:].strip()
+            app.state.auth_tokens.pop(token, None)
        return {"ok": True}

    @app.post("/api/providers/{provider_name}/config", response_model=WebProviderConfigResponse)
@ -1908,13 +2021,19 @@ def create_app(
        filename = file.filename or ""
        if not filename.endswith(".zip"):
            raise HTTPException(status_code=400, detail="File must be a .zip archive")
-        loaded = get_agent_service(request).create_loop().boot()
+        agent_service = get_agent_service(request)
+        loaded = agent_service.create_loop().boot()
        try:
            content = await file.read()
-            draft = _create_skill_upload_draft(loaded, filename, content)
+            draft_payload = _create_skill_upload_draft(loaded, filename, content)
+            draft = loaded.draft_service.get_draft(draft_payload["skill_name"], draft_payload["draft_id"])
+            if draft is not None:
+                await _rewrite_uploaded_skill_draft_with_llm(agent_service, loaded, draft, filename=filename)
+                draft = loaded.draft_service.get_draft(draft.skill_name, draft.draft_id) or draft
+                draft_payload = draft.to_dict()
        except ValueError as exc:
            raise HTTPException(status_code=400, detail=str(exc)) from exc
-        return draft
+        return draft_payload

    @app.get("/api/marketplaces/skills/search")
    async def search_skillhub(
@ -1984,13 +2103,17 @@ def create_app(
    @app.get("/api/skills/candidates")
    async def list_skill_candidates(request: Request, status: str | None = None) -> list[dict[str, Any]]:
        loaded = get_agent_service(request).create_loop().boot()
-        return [item.to_dict() for item in loaded.skill_learning_pipeline.list_candidates(status=status)]  # type: ignore[union-attr]
+        return [
+            _skill_learning_candidate_payload(loaded, item)
+            for item in loaded.skill_learning_pipeline.list_candidates(status=status)  # type: ignore[union-attr]
+        ]

    @app.get("/api/skills/candidates/{candidate_id}")
    async def get_skill_candidate(candidate_id: str, request: Request) -> dict[str, Any]:
        loaded = get_agent_service(request).create_loop().boot()
        try:
-            return loaded.skill_learning_pipeline.get_candidate(candidate_id).to_dict()  # type: ignore[union-attr]
+            candidate = loaded.skill_learning_pipeline.get_candidate(candidate_id)  # type: ignore[union-attr]
+            return _skill_learning_candidate_payload(loaded, candidate)
        except ValueError as exc:
            raise HTTPException(status_code=404, detail=str(exc)) from exc

@ -2003,25 +2126,19 @@ def create_app(
            candidate = loaded.skill_learning_pipeline.get_candidate(candidate_id)  # type: ignore[union-attr]
            if candidate.draft_skill_name and candidate.draft_id:
                try:
-                    return _skill_draft_payload(loaded, candidate.draft_skill_name, candidate.draft_id)
+                    loaded.skill_learning_pipeline.get_draft(candidate.draft_skill_name, candidate.draft_id)  # type: ignore[union-attr]
                except ValueError:
                    pass
+                else:
+                    return _skill_draft_payload(loaded, candidate.draft_skill_name, candidate.draft_id)
            provider_bundle = agent_service._make_provider_bundle_for_task(loaded, {})  # noqa: SLF001
            draft = await loaded.skill_learning_pipeline.synthesize_draft(  # type: ignore[union-attr]
                candidate_id,
                provider_bundle=provider_bundle,
            )
-            loaded.skill_learning_pipeline.check_safety(draft.skill_name, draft.draft_id)  # type: ignore[union-attr]
-            await loaded.skill_learning_pipeline.evaluate_draft(  # type: ignore[union-attr]
-                candidate_id,
-                draft.skill_name,
-                draft.draft_id,
-                provider_bundle=provider_bundle,
-                replay_runner=ReplayRunner(agent_loop=loop),
-            )
        except ValueError as exc:
            raise HTTPException(status_code=404, detail=str(exc)) from exc
-        return draft.to_dict()
+        return _skill_draft_payload(loaded, draft.skill_name, draft.draft_id)

    @app.post("/api/skills/candidates/{candidate_id}/regenerate")
    async def regenerate_skill_draft(candidate_id: str, request: Request) -> dict[str, Any]:
@ -2034,17 +2151,9 @@ def create_app(
                candidate_id,
                provider_bundle=provider_bundle,
            )
-            loaded.skill_learning_pipeline.check_safety(draft.skill_name, draft.draft_id)  # type: ignore[union-attr]
-            await loaded.skill_learning_pipeline.evaluate_draft(  # type: ignore[union-attr]
-                candidate_id,
-                draft.skill_name,
-                draft.draft_id,
-                provider_bundle=provider_bundle,
-                replay_runner=ReplayRunner(agent_loop=loop),
-            )
        except ValueError as exc:
            raise HTTPException(status_code=404, detail=str(exc)) from exc
-        return draft.to_dict()
+        return _skill_draft_payload(loaded, draft.skill_name, draft.draft_id)

    @app.post("/api/skills/learning/run-once")
    async def run_skill_learning_once(request: Request) -> dict[str, Any]:
@ -2101,17 +2210,31 @@ def create_app(

    @app.post("/api/skills/{skill_name}/drafts/{draft_id}/submit")
    async def submit_skill_draft(skill_name: str, draft_id: str, request: Request, payload: dict[str, Any] | None = None) -> dict[str, Any]:
-        loaded = get_agent_service(request).create_loop().boot()
+        agent_service = get_agent_service(request)
+        loop = agent_service.create_loop()
+        loaded = loop.boot()
        try:
-            review = loaded.skill_learning_pipeline.submit_review(  # type: ignore[union-attr]
-                skill_name,
-                draft_id,
-                requested_by=str((payload or {}).get("requested_by") or "web"),
-                notes=str((payload or {}).get("notes") or ""),
-            )
+            safety = loaded.skill_learning_pipeline.check_safety(skill_name, draft_id)  # type: ignore[union-attr]
+            if safety.passed and safety.risk_level != "critical":
+                loaded.skill_learning_pipeline.submit_review(  # type: ignore[union-attr]
+                    skill_name,
+                    draft_id,
+                    requested_by=str((payload or {}).get("requested_by") or "web"),
+                    notes=str((payload or {}).get("notes") or ""),
+                )
+                candidate_id = _skill_learning_candidate_id_for_draft(loaded, skill_name, draft_id)
+                if candidate_id is not None:
+                    provider_bundle = agent_service._make_provider_bundle_for_task(loaded, {})  # noqa: SLF001
+                    await loaded.skill_learning_pipeline.evaluate_draft(  # type: ignore[union-attr]
+                        candidate_id,
+                        skill_name,
+                        draft_id,
+                        provider_bundle=provider_bundle,
+                        replay_runner=ReplayRunner(agent_loop=loop),
+                    )
        except ValueError as exc:
            raise _skill_draft_http_error(exc) from exc
-        return review.to_dict()
+        return _skill_draft_payload(loaded, skill_name, draft_id)

    @app.post("/api/skills/{skill_name}/drafts/{draft_id}/approve")
    async def approve_skill_draft(skill_name: str, draft_id: str, request: Request, payload: dict[str, Any] | None = None) -> dict[str, Any]:
@ -2635,47 +2758,70 @@ def _create_skill_upload_draft(loaded: Any, filename: str, content: bytes) -> di
        if not file_infos:
            raise ValueError("Zip archive is empty")
        skill_entries = []
-        for info in file_infos:
-            parts = Path(info.filename.replace("\\", "/")).parts
-            if "__MACOSX" in parts or Path(info.filename).name == ".DS_Store":
-                continue
-            if info.filename.replace("\\", "/").startswith("/") or any(part in {"", ".", ".."} for part in parts):
-                raise ValueError(f"Unsafe archive entry: {info.filename}")
-            if parts[-1] == "SKILL.md":
-                if len(parts) not in (1, 2):
-                    raise ValueError("SKILL.md must be at root or inside one top-level directory")
-                skill_entries.append(info.filename)
-        if not skill_entries:
-            raise ValueError("Zip must contain SKILL.md")
-        skill_entry = skill_entries[0]
-        top = Path(skill_entry).parts[0] if len(Path(skill_entry).parts) == 2 else ""
-        raw_skill = archive.read(skill_entry).decode("utf-8", errors="replace")
-        frontmatter, body = parse_frontmatter(raw_skill)
-        skill_name = str(frontmatter.get("name") or top or Path(filename).stem).strip().replace(" ", "-")
-        if not skill_name or "/" in skill_name or "\\" in skill_name or skill_name in {".", ".."}:
-            raise ValueError("Could not determine a safe skill name")
-        files: list[tuple[str, bytes]] = []
+        safe_entries: list[tuple[Any, str, tuple[str, ...]]] = []
        for info in file_infos:
            raw = info.filename.replace("\\", "/")
            parts = Path(raw).parts
            if "__MACOSX" in parts or Path(raw).name == ".DS_Store":
                continue
-            if raw.startswith("/"):
+            if raw.startswith("/") or any(part in {"", ".", ".."} for part in parts):
                raise ValueError(f"Unsafe archive entry: {info.filename}")
-            if top and parts and parts[0] != top:
-                raise ValueError("Zip archive must contain a single top-level skill directory")
-            rel_parts = parts[1:] if top and parts and parts[0] == top else parts
+            safe_entries.append((info, raw, tuple(parts)))
+            if _is_skill_markdown_entry(parts[-1]):
+                skill_entries.append(raw)
+        if not skill_entries:
+            raise ValueError("Zip must contain SKILL.md")
+        if len(skill_entries) > 1:
+            raise ValueError("Zip must contain exactly one SKILL.md")
+        skill_entry = skill_entries[0]
+        skill_root = tuple(Path(skill_entry).parts[:-1])
+        raw_skill = archive.read(skill_entry).decode("utf-8", errors="replace")
+        frontmatter, body = parse_frontmatter(raw_skill)
+        skill_name = str(frontmatter.get("name") or (skill_root[-1] if skill_root else "") or Path(filename).stem).strip().replace(" ", "-")
+        if not skill_name or "/" in skill_name or "\\" in skill_name or skill_name in {".", ".."}:
+            raise ValueError("Could not determine a safe skill name")
+        proposed_frontmatter = normalize_skill_frontmatter(
+            {
+                **dict(frontmatter),
+                "name": skill_name,
+                "description": frontmatter.get("description") or skill_name,
+            },
+            skill_name=skill_name,
+        )
+        proposed_frontmatter["tools"] = _merge_tool_names(
+            proposed_frontmatter.get("tools"),
+            extract_required_tool_names(body),
+            _infer_uploaded_skill_tools(
+                skill_name=skill_name,
+                filename=filename,
+                frontmatter=proposed_frontmatter,
+                content=body,
+                loaded=loaded,
+            ),
+        )
+        proposed_content = ensure_canonical_skill_body(
+            body,
+            title=skill_name,
+            description=str(proposed_frontmatter.get("description") or ""),
+            tools=list(proposed_frontmatter.get("tools") or []),
+        )
+        files: list[tuple[str, bytes]] = []
+        for info, raw, parts in safe_entries:
+            if raw == skill_entry:
+                continue
+            if skill_root:
+                if parts[: len(skill_root)] != skill_root:
+                    continue
+                rel_parts = parts[len(skill_root):]
+            else:
+                rel_parts = parts
            if not rel_parts or any(part in {"", ".", ".."} for part in rel_parts):
                raise ValueError(f"Unsafe archive entry: {info.filename}")
            files.append(("/".join(rel_parts), archive.read(info)))
    draft = loaded.draft_service.create_new_skill_draft(
        skill_name=skill_name,
-        proposed_content=body,
-        proposed_frontmatter={
-            **dict(frontmatter),
-            "name": skill_name,
-            "description": frontmatter.get("description") or skill_name,
-        },
+        proposed_content=proposed_content,
+        proposed_frontmatter=proposed_frontmatter,
        created_by="web-upload",
        reason=f"Uploaded {filename}",
        evidence_refs=[{"kind": "upload", "filename": filename, "files": sorted(path for path, _ in files)}],
@ -2700,6 +2846,162 @@ def _create_skill_upload_draft(loaded: Any, filename: str, content: bytes) -> di
    return draft.to_dict()


+def _is_skill_markdown_entry(filename: str) -> bool:
+    return filename.strip().lower() in {"skill.md", "skills.md"}
+
+
+def _merge_tool_names(*groups: Any) -> list[str]:
+    result: list[str] = []
+    for group in groups:
+        if isinstance(group, str):
+            raw_items = group.split(",")
+        elif isinstance(group, (list, tuple, set)):
+            raw_items = list(group)
+        else:
+            raw_items = []
+        for item in raw_items:
+            cleaned = str(item).strip()
+            if cleaned and cleaned not in result:
+                result.append(cleaned)
+    return result
+
+
+def _infer_uploaded_skill_tools(
+    *,
+    skill_name: str,
+    filename: str,
+    frontmatter: dict[str, Any],
+    content: str,
+    loaded: Any,
+) -> list[str]:
+    available = _available_runtime_tool_names(loaded)
+    text = "\n".join(
+        [
+            skill_name,
+            filename,
+            json.dumps(frontmatter, ensure_ascii=False, sort_keys=True),
+            content,
+        ]
+    ).lower()
+    inferred: list[str] = []
+
+    for tool_name in sorted(available or _COMMON_RUNTIME_TOOL_NAMES):
+        if re.search(rf"(?<![a-z0-9_]){re.escape(tool_name.lower())}(?![a-z0-9_])", text):
+            inferred.append(tool_name)
+
+    def add_if_available(*tool_names: str) -> None:
+        for tool_name in tool_names:
+            if available is not None and tool_name not in available:
+                continue
+            if tool_name not in inferred:
+                inferred.append(tool_name)
+
+    if re.search(r"\b(weather|forecast|temperature|precipitation|rain|snow|humidity|wind|air quality|aqi)\b", text):
+        add_if_available("web_fetch", "web_search")
+    if re.search(r"\b(latest|current|today|tomorrow|news|search|query|lookup|find online|web search)\b", text):
+        add_if_available("web_search")
+    if re.search(r"\b(url|http|https|website|webpage|page|fetch|crawl|browser|online source)\b", text):
+        add_if_available("web_fetch")
+
+    return inferred
+
+
+def _available_runtime_tool_names(loaded: Any) -> set[str] | None:
+    registry = getattr(loaded, "tool_registry", None)
+    if registry is None:
+        return None
+    try:
+        return {spec.name for spec in registry.list_specs()}
+    except Exception:
+        return None
+
+
+_COMMON_RUNTIME_TOOL_NAMES = {
+    "web_fetch",
+    "web_search",
+    "read_file",
+    "write_file",
+    "patch_file",
+    "search_files",
+    "list_directory",
+    "memory",
+    "terminal",
+    "process",
+    "execute_code",
+    "skill_view",
+    "skills_list",
+    "skill_manage",
+    "cron",
+}
+
+
+async def _rewrite_uploaded_skill_draft_with_llm(agent_service: Any, loaded: Any, draft: Any, *, filename: str) -> None:
+    try:
+        provider_bundle = agent_service._make_provider_bundle_for_task(loaded, {})  # noqa: SLF001
+        provider = getattr(provider_bundle, "auxiliary_provider", None) or getattr(provider_bundle, "main_provider", None)
+        runtime = getattr(provider_bundle, "auxiliary_runtime", None) or getattr(provider_bundle, "main_runtime", None)
+        if provider is None:
+            return
+        available_tool_names = sorted(_available_runtime_tool_names(loaded) or _COMMON_RUNTIME_TOOL_NAMES)
+        response = await provider.chat(
+            messages=[
+                {
+                    "role": "system",
+                    "content": (
+                        "You rewrite uploaded Beaver skills into the required house style. "
+                        "Return only JSON with keys: frontmatter, content, change_reason. "
+                        "Do not include markdown fences."
+                    ),
+                },
+                {
+                    "role": "user",
+                    "content": (
+                        f"Uploaded filename: {filename}\n"
+                        f"Skill name: {draft.skill_name}\n"
+                        f"Current frontmatter:\n{json.dumps(draft.proposed_frontmatter, ensure_ascii=False, sort_keys=True)}\n\n"
+                        f"Current content:\n{draft.proposed_content}\n\n"
+                        f"Available runtime tool names:\n{json.dumps(available_tool_names, ensure_ascii=False)}\n\n"
+                        f"{canonical_skill_format_instructions()}\n\n"
+                        "Rewrite the skill so it is operational, concrete, and ready for review/publish. "
+                        "Infer exact required runtime tools from the uploaded content when the workflow depends on tools. "
+                        "Keep frontmatter.tools and the Required Tools section consistent."
+                    ),
+                },
+            ],
+            tools=None,
+            model=getattr(runtime, "model", None),
+            max_tokens=4096,
+            temperature=0,
+        )
+        payload = parse_skill_rewrite_json(response.content or "", skill_name=draft.skill_name)
+        if payload is None:
+            return
+        payload["frontmatter"]["tools"] = _merge_tool_names(
+            payload["frontmatter"].get("tools"),
+            extract_required_tool_names(payload["content"]),
+            _infer_uploaded_skill_tools(
+                skill_name=draft.skill_name,
+                filename=filename,
+                frontmatter=payload["frontmatter"],
+                content=payload["content"],
+                loaded=loaded,
+            ),
+        )
+        payload["content"] = ensure_canonical_skill_body(
+            payload["content"],
+            title=str(payload["frontmatter"].get("name") or draft.skill_name),
+            description=str(payload["frontmatter"].get("description") or ""),
+            tools=list(payload["frontmatter"].get("tools") or []),
+        )
+        draft.proposed_frontmatter = payload["frontmatter"]
+        draft.proposed_content = payload["content"]
+        if payload.get("change_reason"):
+            draft.reason = f"{draft.reason}; LLM rewrite: {payload['change_reason']}"
+        loaded.skill_spec_store.write_draft(draft)
+    except Exception:
+        return
+
+
 def _debug_runs_for_session(session_manager: Any, session_id: str) -> list[dict[str, Any]]:
    grouped: dict[str, list[Any]] = {}
    run_order: list[str] = []
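
Review note: the tool inference added in this hunk rests on two small mechanisms, an order-preserving merge and a word-boundary match built from regex lookarounds. The sketch below reproduces both in isolation so their behavior is easier to check. `merge_tool_names` and `mentions_tool` are illustrative names rather than functions from the patch, and the sample inputs are made up.

```python
import re
from typing import Any


def merge_tool_names(*groups: Any) -> list[str]:
    """Merge tool names from several sources, keeping first-seen order."""
    result: list[str] = []
    for group in groups:
        if isinstance(group, str):
            raw_items = group.split(",")  # "a, b" style frontmatter values
        elif isinstance(group, (list, tuple, set)):
            raw_items = list(group)
        else:
            raw_items = []  # None and other types contribute nothing
        for item in raw_items:
            cleaned = str(item).strip()
            if cleaned and cleaned not in result:
                result.append(cleaned)
    return result


def mentions_tool(tool_name: str, text: str) -> bool:
    """True when tool_name appears as a whole identifier in text."""
    # The lookarounds reject matches embedded in a longer identifier.
    pattern = rf"(?<![a-z0-9_]){re.escape(tool_name.lower())}(?![a-z0-9_])"
    return re.search(pattern, text.lower()) is not None


merged = merge_tool_names("web_search, read_file", ["read_file", "terminal"], None)
print(merged)
print(mentions_tool("web_fetch", "Call web_fetch on the URL."))
print(mentions_tool("web_fetch", "my_web_fetch_helper"))
```

Because duplicates are dropped on first sight, tools already declared in the frontmatter keep their position ahead of inferred ones.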
@ -3204,6 +3506,82 @@ def _provider_enabled(provider_name: str, provider_cfg: Any) -> bool:
    )


+def _auth_file_path() -> Path:
+    raw = os.getenv("BEAVER_AUTH_FILE")
+    if raw:
+        return Path(raw)
+    return Path.home() / ".beaver" / "web_auth_users.json"
+
+
+def _load_auth_users(path: Path) -> dict[str, str]:
+    if not path.exists():
+        raise HTTPException(status_code=500, detail=f"Auth file not found: {path}")
+    try:
+        raw = json.loads(path.read_text(encoding="utf-8"))
+    except json.JSONDecodeError as exc:
+        raise HTTPException(status_code=500, detail=f"Invalid auth file: {path}") from exc
+
+    users: dict[str, str] = {}
+    if isinstance(raw, dict):
+        entries = raw.get("users") or raw.get("accounts")
+        if isinstance(entries, list):
+            for entry in entries:
+                if not isinstance(entry, dict):
+                    continue
+                username = _clean_text(entry.get("username"))
+                password = entry.get("password")
+                if username and isinstance(password, str):
+                    users[username] = password
+        for key, value in raw.items():
+            if key in {"users", "accounts"}:
+                continue
+            username = _clean_text(key)
+            if username and isinstance(value, str):
+                users[username] = value
+    if not users:
+        raise HTTPException(status_code=500, detail=f"No valid users found in auth file: {path}")
+    return users
+
+
+def _load_auth_users_if_present(path: Path) -> dict[str, str]:
+    if not path.exists():
+        return {}
+    return _load_auth_users(path)
+
+
+def _save_auth_users(path: Path, users: dict[str, str]) -> None:
+    path.parent.mkdir(parents=True, exist_ok=True)
+    payload = {
+        "users": [
+            {"username": username, "password": password}
+            for username, password in sorted(users.items())
+        ]
+    }
+    path.write_text(json.dumps(payload, ensure_ascii=False, indent=2) + "\n", encoding="utf-8")
+
+
+def _issue_web_token(app: FastAPI, username: str) -> str:
+    token = secrets.token_urlsafe(32)
+    app.state.auth_tokens[token] = username
+    return token
+
+
+def _handoff_ttl_seconds() -> int:
+    raw = os.getenv("BEAVER_HANDOFF_CODE_TTL_SECONDS", "90").strip()
+    try:
+        return max(15, int(raw))
+    except ValueError:
+        return 90
+
+
+def _handoff_replay_window_seconds() -> int:
+    raw = os.getenv("BEAVER_HANDOFF_REPLAY_WINDOW_SECONDS", "15").strip()
+    try:
+        return max(1, int(raw))
+    except ValueError:
+        return 15
+
+
 def _int_env(name: str, default: int) -> int:
    raw = os.getenv(name, "").strip()
    if not raw:
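
Review note: `_load_auth_users` in this hunk appears to accept two JSON shapes, a `users`/`accounts` list of objects and a flat `username: password` mapping. The sketch below is a simplified stand-in that shows both. `parse_users` is a hypothetical name, `str(...).strip()` is assumed in place of the `_clean_text` helper (whose body is not part of this hunk), and an empty result is returned where the patch raises `HTTPException`.

```python
import json
from typing import Any


def parse_users(raw: Any) -> dict[str, str]:
    """Collect username/password pairs from either supported shape."""
    users: dict[str, str] = {}
    if not isinstance(raw, dict):
        return users
    # Shape 1: {"users": [{"username": ..., "password": ...}, ...]}
    entries = raw.get("users") or raw.get("accounts")
    if isinstance(entries, list):
        for entry in entries:
            if not isinstance(entry, dict):
                continue
            username = str(entry.get("username") or "").strip()
            password = entry.get("password")
            if username and isinstance(password, str):
                users[username] = password
    # Shape 2: {"alice": "secret", ...}; non-string values are skipped.
    for key, value in raw.items():
        if key in {"users", "accounts"}:
            continue
        username = str(key).strip()
        if username and isinstance(value, str):
            users[username] = value
    return users


list_form = json.loads('{"users": [{"username": "alice", "password": "pw1"}]}')
flat_form = json.loads('{"bob": "pw2", "note": 3}')
print(parse_users(list_form))
print(parse_users(flat_form))
```

`_save_auth_users` always writes the list shape, so a file that started in the flat shape would be rewritten into the list shape on the first save.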
@ -3225,10 +3603,81 @@ def _human_upload_size(size: int) -> str:
    return f"{size}B"


-def _require_web_identity(app: FastAPI, authorization: str | None) -> KeycloakIdentity:
-    token = extract_bearer_token(authorization)
-    verifier: KeycloakTokenVerifier = app.state.keycloak_token_verifier
-    return verifier.verify(token)
+def _prune_handoff_codes(app: FastAPI) -> None:
+    now = time.time()
+    replay_window = _handoff_replay_window_seconds()
+    expired = []
+    for code, payload in list(app.state.handoff_codes.items()):
+        expires_at = float(payload.get("expires_at") or 0)
+        consumed_at = payload.get("consumed_at")
+        if expires_at <= now:
+            expired.append(code)
+        elif consumed_at is not None and now - float(consumed_at) > replay_window:
+            expired.append(code)
+    for code in expired:
+        app.state.handoff_codes.pop(code, None)
+
+
+def _issue_handoff_code(app: FastAPI, username: str, access_token: str, refresh_token: str = "") -> tuple[str, int]:
+    _prune_handoff_codes(app)
+    code = secrets.token_urlsafe(24)
+    expires_at = int(time.time()) + _handoff_ttl_seconds()
+    app.state.handoff_codes[code] = {
+        "username": username,
+        "access_token": access_token,
+        "refresh_token": refresh_token,
+        "expires_at": expires_at,
+        "consumed_at": None,
+    }
+    return code, expires_at
+
+
+def _consume_handoff_code(app: FastAPI, code: str) -> dict[str, Any]:
+    if not code.strip():
+        raise HTTPException(status_code=400, detail="Handoff code is required")
+    _prune_handoff_codes(app)
+    payload = app.state.handoff_codes.get(code)
+    if payload is None:
+        raise HTTPException(status_code=401, detail="Invalid or expired handoff code")
+    now = time.time()
+    expires_at = float(payload.get("expires_at") or 0)
+    if expires_at <= now:
+        app.state.handoff_codes.pop(code, None)
+        raise HTTPException(status_code=410, detail="Handoff code expired")
+    consumed_at = payload.get("consumed_at")
+    if consumed_at is None:
+        payload["consumed_at"] = now
+    elif now - float(consumed_at) > _handoff_replay_window_seconds():
+        app.state.handoff_codes.pop(code, None)
+        raise HTTPException(status_code=410, detail="Handoff code already used")
+    username = str(payload.get("username") or "").strip()
+    access_token = str(payload.get("access_token") or "").strip()
+    if not username or not access_token:
+        app.state.handoff_codes.pop(code, None)
+        raise HTTPException(status_code=401, detail="Invalid handoff payload")
+    return {
+        "access_token": access_token,
+        "refresh_token": str(payload.get("refresh_token") or ""),
+        "token_type": "bearer",
+        "user_id": username,
+        "username": username,
+        "role": "owner",
+    }
+
+
+def _require_web_user(app: FastAPI, authorization: str | None) -> str:
+    if not authorization:
+        raise HTTPException(status_code=401, detail="Missing Authorization header")
+    prefix = "bearer "
+    if not authorization.lower().startswith(prefix):
+        raise HTTPException(status_code=401, detail="Invalid Authorization header")
+    token = authorization[len(prefix):].strip()
+    if not token:
+        raise HTTPException(status_code=401, detail="Invalid token")
+    username = app.state.auth_tokens.get(token)
+    if not username:
+        raise HTTPException(status_code=401, detail="Invalid or expired token")
+    return username


 def _backend_connection_view(request: Request) -> dict[str, Any]:
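
Review note: the handoff codes introduced in this hunk are single-use with a short replay window, so a client that retries within a few seconds of the first exchange still succeeds. The sketch below models that lifecycle without FastAPI. `HandoffStore`, the injected clock, and the `LookupError` exceptions are assumptions made for illustration (the patch uses `app.state.handoff_codes` and `HTTPException`), and pruning is left out.

```python
import secrets
from typing import Any, Callable

TTL_SECONDS = 90            # default of BEAVER_HANDOFF_CODE_TTL_SECONDS
REPLAY_WINDOW_SECONDS = 15  # default of BEAVER_HANDOFF_REPLAY_WINDOW_SECONDS


class HandoffStore:
    def __init__(self, clock: Callable[[], float]) -> None:
        self._clock = clock
        self._codes: dict[str, dict[str, Any]] = {}

    def issue(self, username: str, access_token: str) -> str:
        code = secrets.token_urlsafe(24)
        self._codes[code] = {
            "username": username,
            "access_token": access_token,
            "expires_at": self._clock() + TTL_SECONDS,
            "consumed_at": None,
        }
        return code

    def consume(self, code: str) -> str:
        payload = self._codes.get(code)
        if payload is None:
            raise LookupError("invalid or expired handoff code")
        now = self._clock()
        if payload["expires_at"] <= now:
            self._codes.pop(code, None)
            raise LookupError("handoff code expired")
        if payload["consumed_at"] is None:
            payload["consumed_at"] = now  # first exchange starts the replay window
        elif now - payload["consumed_at"] > REPLAY_WINDOW_SECONDS:
            self._codes.pop(code, None)
            raise LookupError("handoff code already used")
        return payload["access_token"]


now = [1000.0]  # mutable fake clock so the example is deterministic
store = HandoffStore(clock=lambda: now[0])
code = store.issue("alice", "token-abc")
first = store.consume(code)   # first use marks the code consumed
now[0] += 5
retry = store.consume(code)   # retry inside the replay window still succeeds
now[0] += 20
try:
    store.consume(code)
    late_error = ""
except LookupError as exc:
    late_error = str(exc)
print(first, retry, late_error)
```

Injecting the clock is only a testing convenience here; the patch reads `time.time()` directly.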
@ -3328,6 +3777,39 @@ def _skill_detail_payload(loaded: Any, name: str, version: str | None) -> dict[s
    }


+def _skill_learning_candidate_payload(loaded: Any, candidate: Any) -> dict[str, Any]:
+    payload = candidate.to_dict()
+    evidence = dict(payload.get("evidence") or {})
+    task_text = _skill_learning_candidate_task_text(loaded, candidate)
+    if task_text:
+        evidence["task_text"] = task_text
+        evidence["theme"] = SkillLearningService._task_theme(task_text)
+        payload["evidence"] = evidence
+        if candidate.kind == "new_skill":
+            payload["evidence_summary"] = f"Theme: {evidence['theme']}"
+    return payload
+
+
+def _skill_learning_candidate_task_text(loaded: Any, candidate: Any) -> str:
+    evidence = candidate.evidence if isinstance(candidate.evidence, dict) else {}
+    task_id = str(evidence.get("task_id") or "").strip()
+    source_run_ids = set(candidate.source_run_ids or [])
+    try:
+        run_store = loaded.skill_learning_pipeline.learning_service.run_store
+        runs = run_store.list_runs()
+    except Exception:
+        return str(evidence.get("task_text") or "").strip()
+
+    if task_id:
+        task_runs = [record for record in runs if record.task_id == task_id]
+        if task_runs:
+            return SkillLearningService._representative_task_text(task_runs)
+    source_runs = [record for record in runs if record.run_id in source_run_ids]
+    if source_runs:
+        return SkillLearningService._representative_task_text(source_runs)
+    return str(evidence.get("task_text") or "").strip()
+
+
 def _skill_draft_payload(loaded: Any, skill_name: str, draft_id: str, *, include_reviews: bool = False) -> dict[str, Any]:
    draft = loaded.skill_learning_pipeline.get_draft(skill_name, draft_id)  # type: ignore[union-attr]
    safety = loaded.skill_learning_pipeline.get_safety_report(skill_name, draft_id)  # type: ignore[union-attr]
@ -3336,6 +3818,8 @@ def _skill_draft_payload(loaded: Any, skill_name: str, draft_id: str, *, include
        **draft.to_dict(),
        "safety_report": safety.to_dict() if safety is not None else None,
        "eval_report": eval_report.to_dict() if eval_report is not None else None,
+        "target_version": _skill_draft_target_version(loaded, draft.skill_name, draft.proposal_kind),
+        "base_skill": _skill_draft_base_skill_payload(loaded, draft),
    }
    if include_reviews:
        payload["reviews"] = [
@ -3345,6 +3829,45 @@ def _skill_draft_payload(loaded: Any, skill_name: str, draft_id: str, *, include
    return payload


+def _skill_draft_base_skill_payload(loaded: Any, draft: Any) -> dict[str, Any] | None:
+    if draft.proposal_kind == "new_skill" or not draft.base_version:
+        return None
+    store = loaded.skill_learning_pipeline.publisher.store  # type: ignore[union-attr]
+    loaded_version = store.read_published_skill(draft.skill_name, draft.base_version)
+    if loaded_version is None:
+        return None
+    version = loaded_version.version
+    return {
+        "skill_name": version.skill_name,
+        "version": version.version,
+        "frontmatter": dict(version.frontmatter),
+        "content": loaded_version.content,
+        "summary": version.summary,
+        "tool_hints": list(version.tool_hints),
+    }
+
+
+def _skill_draft_target_version(loaded: Any, skill_name: str, proposal_kind: str) -> str | None:
+    if proposal_kind == "retire_skill":
+        return None
+    versions = [
+        item
+        for item in loaded.skill_learning_pipeline.publisher.store.list_versions(skill_name)  # type: ignore[union-attr]
+        if isinstance(item, str) and item.startswith("v") and item[1:].isdigit()
+    ]
+    if not versions:
+        return "v0001"
+    latest = max(int(item[1:]) for item in versions)
+    return f"v{latest + 1:04d}"
+
+
+def _skill_learning_candidate_id_for_draft(loaded: Any, skill_name: str, draft_id: str) -> str | None:
+    for candidate in loaded.skill_learning_pipeline.list_candidates():  # type: ignore[union-attr]
+        if candidate.draft_skill_name == skill_name and candidate.draft_id == draft_id:
+            return candidate.candidate_id
+    return None
+
+
 def _skill_versions_payload(loaded: Any, record: Any) -> list[dict[str, Any]]:
    if record.source != "workspace":
        return [
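
Review note: `_skill_draft_target_version` in this hunk derives the next version label from the published versions. The sketch below isolates that rule with a plain list in place of the publisher store. `next_version` is an illustrative name, not a function from the patch.

```python
def next_version(existing: list[object]) -> str:
    """Return the label that follows the highest vNNNN entry."""
    numbered = [
        item
        for item in existing
        if isinstance(item, str) and item.startswith("v") and item[1:].isdigit()
    ]
    if not numbered:
        return "v0001"
    latest = max(int(item[1:]) for item in numbered)
    # :04d pads to at least four digits and grows beyond that when needed.
    return f"v{latest + 1:04d}"


print(next_version([]))
print(next_version(["v0001", "v0003", "draft", None]))
print(next_version(["v9999"]))
```

Gaps in the sequence are not reused: the result always follows the highest existing number, and entries that do not look like `vNNNN` are ignored.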
--- a/app-instance/backend/beaver/interfaces/web/keycloak_auth.py
+++ b/app-instance/backend/beaver/interfaces/web/keycloak_auth.py
@ -1,152 +0,0 @@
-"""Keycloak OIDC token verification for the Beaver web app."""
-
-from __future__ import annotations
-
-from dataclasses import dataclass
-import os
-import time
-from typing import Any
-
-import jwt
-from jwt import PyJWKClient
-
-try:
-    from fastapi import HTTPException
-except ModuleNotFoundError:  # pragma: no cover
-    class HTTPException(Exception):  # type: ignore[override]
-        def __init__(self, status_code: int, detail: str) -> None:
-            super().__init__(detail)
-            self.status_code = status_code
-            self.detail = detail
-
-
-DEFAULT_KEYCLOAK_ISSUER = "https://keycloak.bwgdi.com/realms/beaver"
-DEFAULT_KEYCLOAK_CLIENT_ID = "beaver-agnet"
-
-
-@dataclass(frozen=True, slots=True)
-class KeycloakAuthConfig:
-    issuer: str
-    client_id: str
-    token_url: str
-    jwks_url: str
-
-    @classmethod
-    def from_env(cls) -> "KeycloakAuthConfig":
-        issuer = _clean_base_url(os.getenv("BEAVER_KEYCLOAK_ISSUER") or DEFAULT_KEYCLOAK_ISSUER)
-        client_id = (os.getenv("BEAVER_KEYCLOAK_CLIENT_ID") or DEFAULT_KEYCLOAK_CLIENT_ID).strip()
-        token_url = (
-            os.getenv("BEAVER_KEYCLOAK_TOKEN_URL", "").strip()
-            or f"{issuer}/protocol/openid-connect/token"
-        )
-        jwks_url = (
-            os.getenv("BEAVER_KEYCLOAK_JWKS_URL", "").strip()
-            or f"{issuer}/protocol/openid-connect/certs"
-        )
-        return cls(issuer=issuer, client_id=client_id, token_url=token_url, jwks_url=jwks_url)
-
-
-@dataclass(frozen=True, slots=True)
-class KeycloakIdentity:
-    user_id: str
-    username: str
-    email: str = ""
-    name: str = ""
-    realm_roles: tuple[str, ...] = ()
-    client_roles: tuple[str, ...] = ()
-
-
-def extract_bearer_token(authorization: str | None) -> str:
-    if not authorization:
-        raise HTTPException(status_code=401, detail="Missing Authorization header")
-    prefix = "bearer "
-    if not authorization.lower().startswith(prefix):
-        raise HTTPException(status_code=401, detail="Invalid Authorization header")
-    token = authorization[len(prefix):].strip()
-    if not token:
-        raise HTTPException(status_code=401, detail="Invalid token")
-    return token
-
-
-class KeycloakTokenVerifier:
-    def __init__(self, *, config: KeycloakAuthConfig) -> None:
-        self.config = config
-        self._jwks_client = PyJWKClient(config.jwks_url)
-
-    def verify(self, token: str, *, expected_nonce: str | None = None) -> KeycloakIdentity:
-        try:
-            signing_key = self._jwks_client.get_signing_key_from_jwt(token).key
-            claims = jwt.decode(
-                token,
-                signing_key,
-                algorithms=["RS256"],
-                issuer=self.config.issuer,
-                options={
-                    "require": ["exp", "iat", "iss"],
-                    "verify_aud": False,
-                },
-            )
-        except Exception as exc:  # noqa: BLE001 - normalize JWT/JWKS failures for HTTP callers
-            raise HTTPException(status_code=401, detail=f"Invalid token: {exc}") from exc
-        return self.validate_claims(claims, expected_nonce=expected_nonce)
-
-    def validate_claims(self, claims: dict[str, Any], *, expected_nonce: str | None = None) -> KeycloakIdentity:
-        now = int(time.time())
-        issuer = str(claims.get("iss") or "")
-        if issuer != self.config.issuer:
-            raise HTTPException(status_code=401, detail="Invalid token issuer")
-        exp = _int_claim(claims, "exp")
-        iat = _int_claim(claims, "iat")
-        if exp <= now:
-            raise HTTPException(status_code=401, detail="Token expired")
-        if iat > now + 120:
-            raise HTTPException(status_code=401, detail="Token issued in the future")
-        if not _matches_client(claims.get("aud"), self.config.client_id) and claims.get("azp") != self.config.client_id:
-            raise HTTPException(status_code=401, detail="Invalid token audience")
-        if expected_nonce is not None and claims.get("nonce") != expected_nonce:
-            raise HTTPException(status_code=401, detail="Invalid token nonce")
-
-        user_id = str(claims.get("sub") or "").strip()
-        if not user_id:
-            raise HTTPException(status_code=401, detail="Token subject is required")
-        username = (
-            str(claims.get("preferred_username") or "").strip()
-            or str(claims.get("email") or "").strip()
-            or user_id
-        )
-        return KeycloakIdentity(
-            user_id=user_id,
-            username=username,
-            email=str(claims.get("email") or "").strip(),
-            name=str(claims.get("name") or "").strip(),
-            realm_roles=_roles_from(claims.get("realm_access")),
-            client_roles=_roles_from((claims.get("resource_access") or {}).get(self.config.client_id) if isinstance(claims.get("resource_access"), dict) else None),
-        )
-
-
-def _clean_base_url(value: str) -> str:
-    return value.strip().rstrip("/")
-
-
-def _int_claim(claims: dict[str, Any], key: str) -> int:
-    try:
-        return int(claims[key])
-    except (KeyError, TypeError, ValueError) as exc:
-        raise HTTPException(status_code=401, detail=f"Token {key} claim is required") from exc
-
-
-def _matches_client(audience: Any, client_id: str) -> bool:
-    if isinstance(audience, str):
-        return audience == client_id
-    if isinstance(audience, list):
-        return client_id in {str(item) for item in audience}
-    return False
-
-
-def _roles_from(value: Any) -> tuple[str, ...]:
-    if not isinstance(value, dict):
-        return ()
-    roles = value.get("roles")
-    if not isinstance(roles, list):
-        return ()
-    return tuple(str(role) for role in roles if str(role).strip())
--- a/app-instance/backend/beaver/memory/skills/models.py
+++ b/app-instance/backend/beaver/memory/skills/models.py
@ -235,6 +235,12 @@ class SkillDraftEvalReport:
    confidence: str = "low"
    case_reports: list[dict[str, Any]] = field(default_factory=list)
    tool_mode_summary: dict[str, Any] = field(default_factory=dict)
+    ability_score_summary: dict[str, Any] = field(default_factory=dict)
+    tool_execution_summary: dict[str, Any] = field(default_factory=dict)
+    case_selection_summary: dict[str, Any] = field(default_factory=dict)
+    real_score_avg: float | None = None
+    synthetic_score_avg: float | None = None
+    overall_score_avg: float | None = None
    preservation_report: dict[str, Any] | None = None

    def to_dict(self) -> dict[str, Any]:
@ -261,6 +267,12 @@ class SkillDraftEvalReport:
            "confidence": self.confidence,
            "case_reports": [dict(item) for item in self.case_reports],
            "tool_mode_summary": dict(self.tool_mode_summary),
+            "ability_score_summary": dict(self.ability_score_summary),
+            "tool_execution_summary": dict(self.tool_execution_summary),
+            "case_selection_summary": dict(self.case_selection_summary),
+            "real_score_avg": self.real_score_avg,
+            "synthetic_score_avg": self.synthetic_score_avg,
+            "overall_score_avg": self.overall_score_avg,
            "preservation_report": (
                dict(self.preservation_report) if self.preservation_report is not None else None
            ),
@ -295,6 +307,12 @@ class SkillDraftEvalReport:
                if isinstance(item, dict)
            ],
            tool_mode_summary=dict(payload.get("tool_mode_summary") or {}),
+            ability_score_summary=dict(payload.get("ability_score_summary") or {}),
+            tool_execution_summary=dict(payload.get("tool_execution_summary") or {}),
+            case_selection_summary=dict(payload.get("case_selection_summary") or {}),
+            real_score_avg=_optional_bounded_float(payload.get("real_score_avg")),
+            synthetic_score_avg=_optional_bounded_float(payload.get("synthetic_score_avg")),
+            overall_score_avg=_optional_bounded_float(payload.get("overall_score_avg")),
            preservation_report=(
                dict(payload["preservation_report"])
                if isinstance(payload.get("preservation_report"), dict)
@ -309,6 +327,12 @@ def _optional_str(value: Any) -> str | None:
    return str(value)


+def _optional_bounded_float(value: Any) -> float | None:
+    if value in (None, ""):
+        return None
+    return _bounded_float(value, default=0.0)
+
+
 def _bounded_float(value: Any, *, default: float = 0.0) -> float:
    if value in (None, ""):
        return default
--- a/app-instance/backend/beaver/skills/authoring/__init__.py
+++ b/app-instance/backend/beaver/skills/authoring/__init__.py
@ -0,0 +1,19 @@
+"""Skill authoring helpers."""
+
+from .format import (
+    CANONICAL_SKILL_SECTION_HEADINGS,
+    canonical_skill_format_instructions,
+    canonicalize_skill_body,
+    ensure_canonical_skill_body,
+    is_canonical_skill_body,
+    normalize_skill_frontmatter,
+)
+
+__all__ = [
+    "CANONICAL_SKILL_SECTION_HEADINGS",
+    "canonical_skill_format_instructions",
+    "canonicalize_skill_body",
+    "ensure_canonical_skill_body",
+    "is_canonical_skill_body",
+    "normalize_skill_frontmatter",
+]
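
Review note: of the helpers re-exported by the `beaver.skills.authoring` package, `is_canonical_skill_body` decides whether an uploaded body is kept as written or wrapped in the canonical template. The sketch below reproduces its ordered-heading scan from `format.py` as a standalone function. `is_canonical` and the sample bodies are illustrative, and the heading tuple is copied from `CANONICAL_SKILL_SECTION_HEADINGS`.

```python
import re

HEADINGS = (
    "## Overview",
    "## When to Use",
    "## Required Tools",
    "## Workflow",
    "## Validation",
    "## Boundaries",
    "## Anti-Patterns",
)


def is_canonical(body: str) -> bool:
    """True when body has an H1 and every heading in the required order."""
    text = body.strip()
    if not re.search(r"^#\s+\S", text, flags=re.MULTILINE):
        return False
    position = 0
    for heading in HEADINGS:
        # Search only past the previous match, so order is enforced.
        found = text.find(heading, position)
        if found < 0:
            return False
        position = found + len(heading)
    return True


sections = "\n\n".join(f"{heading}\n\ntext" for heading in HEADINGS)
good = "# Demo Skill\n\n" + sections
swapped = good.replace("## When to Use", "## TMP").replace(
    "## Required Tools", "## When to Use"
).replace("## TMP", "## Required Tools")
print(is_canonical(good), is_canonical(swapped), is_canonical(sections))
```

The scan uses plain substring search, so extra sections between the required headings are tolerated as long as the required ones appear in order.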
--- a/app-instance/backend/beaver/skills/authoring/format.py
+++ b/app-instance/backend/beaver/skills/authoring/format.py
@ -0,0 +1,250 @@
+"""Canonical Beaver skill authoring format."""
+
+from __future__ import annotations
+
+import json
+import re
+from typing import Any
+
+from beaver.skills.catalog.utils import extract_required_tool_names
+
+
+CANONICAL_SKILL_SECTION_HEADINGS: tuple[str, ...] = (
+    "## Overview",
+    "## When to Use",
+    "## Required Tools",
+    "## Workflow",
+    "## Validation",
+    "## Boundaries",
+    "## Anti-Patterns",
+)
+
+
+def canonical_skill_format_instructions() -> str:
+    headings = "\n".join(f"- {heading}" for heading in CANONICAL_SKILL_SECTION_HEADINGS)
+    return (
+        "Canonical Beaver SKILL.md format:\n"
+        "1. Return a frontmatter object with `name`, `description`, and `tools`.\n"
+        "2. `name` must be lowercase kebab-case. `description` must explain when the skill should be used.\n"
+        "3. `tools` must be an explicit JSON array of exact runtime tool names. Use [] only if no tool is required.\n"
+        "4. The Markdown content must start with one H1 title and include these H2 sections in this exact order:\n"
+        f"{headings}\n"
+        "5. Write concrete operational guidance, not a story about a past task.\n"
+        "6. Include validation steps and anti-patterns so future runs know how to avoid false completion."
+    )
+
+
+def normalize_skill_frontmatter(frontmatter: dict[str, Any] | None, *, skill_name: str) -> dict[str, Any]:
+    raw = dict(frontmatter or {})
+    name = _slug(str(raw.get("name") or skill_name))
+    description = str(raw.get("description") or f"Use when {name} guidance is needed.").strip()
+    tools = _coerce_string_list(raw.get("tools"))
+    normalized = {}
+    for key, value in raw.items():
+        if key in {"name", "description", "tools"}:
+            continue
+        if key in {"always", "internal"} and isinstance(value, str):
+            normalized[key] = value.strip().lower() in {"1", "true", "yes", "on"}
+            continue
+        normalized[key] = value
+    return {
+        "name": name,
+        "description": description,
+        "tools": tools,
+        **normalized,
+    }
+
+
+def is_canonical_skill_body(body: str) -> bool:
+    text = body.strip()
+    if not re.search(r"^#\s+\S", text, flags=re.MULTILINE):
+        return False
+    position = 0
+    for heading in CANONICAL_SKILL_SECTION_HEADINGS:
+        found = text.find(heading, position)
+        if found < 0:
+            return False
+        position = found + len(heading)
+    return True
+
+
+def ensure_canonical_skill_body(
+    body: str,
+    *,
+    title: str,
+    description: str = "",
+    tools: list[str] | None = None,
+) -> str:
+    if is_canonical_skill_body(body):
+        normalized = body.strip()
+        if tools:
+            normalized = _replace_required_tools_section(normalized, tools)
+        return normalized + "\n"
+    source = _compact_source_guidance(body)
+    overview = description or source or f"Use this skill for {title}."
+    return canonicalize_skill_body(
+        title=title,
+        overview=overview,
+        tools=list(tools or []),
+        workflow=[
+            "Identify whether the user's request matches the skill's trigger conditions.",
+            "Read the relevant source guidance below and apply only the steps that fit the current task.",
+            "Use the required tools deliberately and keep tool output tied to the user's goal.",
+        ],
+        validation=[
+            "Verify the requested outcome with the most direct available check.",
+            "Report any skipped step, unavailable dependency, or remaining uncertainty explicitly.",
+        ],
+        boundaries=[
+            "Do not broaden the task beyond the user's request.",
+            "Do not use tools that are not listed or clearly available in the current runtime.",
+        ],
+        anti_patterns=[
+            "Do not summarize the skill instead of applying it.",
+            "Do not claim completion without validation evidence.",
+        ],
+        source_guidance=source,
+    )
+
+
+def canonicalize_skill_body(
+    *,
+    title: str,
+    overview: str,
+    tools: list[str] | None = None,
+    workflow: list[str] | None = None,
+    validation: list[str] | None = None,
+    boundaries: list[str] | None = None,
+    anti_patterns: list[str] | None = None,
+    when_to_use: list[str] | None = None,
+    source_guidance: str = "",
+) -> str:
+    cleaned_title = _title(title)
+    tool_lines = _tool_lines(tools or [])
+    workflow_lines = _bullet_lines(workflow or ["Follow the workflow described by the current task and evidence."])
+    validation_lines = _bullet_lines(validation or ["Validate the result before reporting completion."])
+    boundary_lines = _bullet_lines(boundaries or ["Stay within the current task and workspace boundaries."])
+    anti_pattern_lines = _bullet_lines(anti_patterns or ["Do not skip validation."])
+    when_lines = _bullet_lines(when_to_use or [f"Use when the task requires {cleaned_title} guidance."])
+    source_section = f"\n\n### Source Guidance\n\n{source_guidance.strip()}" if source_guidance.strip() else ""
+    return (
+        f"# {cleaned_title}\n\n"
+        "## Overview\n\n"
+        f"{overview.strip() or f'Use this skill for {cleaned_title}.'}\n\n"
+        "## When to Use\n\n"
+        f"{when_lines}\n\n"
+        "## Required Tools\n\n"
+        f"{tool_lines}\n\n"
+        "## Workflow\n\n"
+        f"{workflow_lines}{source_section}\n\n"
+        "## Validation\n\n"
+        f"{validation_lines}\n\n"
+        "## Boundaries\n\n"
+        f"{boundary_lines}\n\n"
+        "## Anti-Patterns\n\n"
+        f"{anti_pattern_lines}\n"
+    )
+
+
+def parse_skill_rewrite_json(content: str, *, skill_name: str) -> dict[str, Any] | None:
+    cleaned = content.strip()
+    if cleaned.startswith("```"):
+        lines = cleaned.splitlines()
+        if len(lines) >= 3 and lines[0].startswith("```") and lines[-1].startswith("```"):
+            cleaned = "\n".join(lines[1:-1]).strip()
+    try:
+        payload = json.loads(cleaned)
+    except json.JSONDecodeError:
+        return None
+    if not isinstance(payload, dict):
+        return None
+    frontmatter = payload.get("frontmatter")
+    body = payload.get("content")
+    if not isinstance(frontmatter, dict) or not isinstance(body, str):
+        return None
+    normalized = normalize_skill_frontmatter(frontmatter, skill_name=skill_name)
+    normalized["tools"] = _merge_string_lists(
+        normalized.get("tools"),
+        extract_required_tool_names(body),
+    )
+    normalized_body = ensure_canonical_skill_body(
+        body,
+        title=normalized["name"],
+        description=normalized["description"],
+        tools=normalized["tools"],
+    )
+    return {
+        "frontmatter": normalized,
+        "content": normalized_body,
+        "change_reason": str(payload.get("change_reason") or ""),
+    }
+
+
+def _compact_source_guidance(body: str, *, max_chars: int = 20000) -> str:
+    text = body.strip()
+    if not text:
+        return ""
+    text = re.sub(r"^---\n.*?\n---\n?", "", text, flags=re.DOTALL).strip()  # drop leading YAML frontmatter
+    text = re.sub(r"\n{3,}", "\n\n", text)  # collapse runs of blank lines
+    text = re.sub(r"^(#{1,4})\s+", r"##\1 ", text, flags=re.MULTILINE)  # demote h1-h4 by two levels
+    return text[:max_chars].rstrip()
+
+
+def _tool_lines(tools: list[str]) -> str:
+    if not tools:
+        return "- No dedicated tools are required."
+    return "\n".join(f"- `{tool}`" for tool in tools)
+
+
+def _bullet_lines(items: list[str]) -> str:
+    cleaned = [str(item).strip() for item in items if str(item).strip()]
+    if not cleaned:
+        return "- No additional guidance."
+    return "\n".join(f"- {item}" for item in cleaned)
+
+
+def _coerce_string_list(value: Any) -> list[str]:
+    if isinstance(value, list):
+        raw_items = value
+    elif isinstance(value, str):
+        raw_items = value.split(",")
+    else:
+        raw_items = []
+    result: list[str] = []
+    for item in raw_items:
+        cleaned = str(item).strip()
+        if cleaned and cleaned not in result:
+            result.append(cleaned)
+    return result
+
+
+def _merge_string_lists(*values: Any) -> list[str]:
+    result: list[str] = []
+    for value in values:
+        for item in _coerce_string_list(value):
+            if item not in result:
+                result.append(item)
+    return result
+
+
+def _replace_required_tools_section(body: str, tools: list[str]) -> str:
+    replacement = "## Required Tools\n\n" + _tool_lines(tools)
+    updated, count = re.subn(
+        r"(?ms)^##\s+Required\s+Tools\s*\n.*?(?=^##\s+|\Z)",
+        replacement + "\n\n",
+        body.strip(),
+        count=1,
+    )
+    return updated.strip() if count else body.strip()
+
+
+def _slug(value: str) -> str:
+    text = value.strip().lower()
+    text = re.sub(r"[^a-z0-9-]+", "-", text)
+    text = re.sub(r"-{2,}", "-", text).strip("-")
+    return text or "generated-skill"
+
+
+def _title(value: str) -> str:
+    cleaned = str(value or "").strip().replace("-", " ")
+    return cleaned.title() if cleaned else "Generated Skill"
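
Note on the fence handling in `parse_skill_rewrite_json` above: a minimal, standalone sketch of the same strip-then-parse step, assuming the model reply is either bare JSON or JSON wrapped in one Markdown code fence. The helper name `parse_fenced_json` and the sample payload are hypothetical and are not part of this change.

```python
from __future__ import annotations

import json
from typing import Any

FENCE = "`" * 3  # built at runtime so the sample stays readable


def parse_fenced_json(content: str) -> dict[str, Any] | None:
    """Hypothetical helper: strip one surrounding code fence, then parse a JSON object."""
    cleaned = content.strip()
    if cleaned.startswith(FENCE):
        lines = cleaned.splitlines()
        # Only unwrap when both an opening and a closing fence line are present.
        if len(lines) >= 3 and lines[-1].startswith(FENCE):
            cleaned = "\n".join(lines[1:-1]).strip()
    try:
        payload = json.loads(cleaned)
    except json.JSONDecodeError:
        return None
    # Arrays and scalars are rejected; only a JSON object is accepted.
    return payload if isinstance(payload, dict) else None


fenced = FENCE + 'json\n{"frontmatter": {"name": "demo"}, "content": "# Demo"}\n' + FENCE
print(parse_fenced_json(fenced))
print(parse_fenced_json("[1, 2, 3]"))
print(parse_fenced_json("not json"))
```

Returning `None` instead of raising leaves the decision about what to do with an unusable reply to the caller.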
--- a/app-instance/backend/beaver/skills/builtin/intent-agent-router/SKILL.md
+++ b/app-instance/backend/beaver/skills/builtin/intent-agent-router/SKILL.md
@ -28,12 +28,13 @@ Choose `new_task` when the user asks for anything that needs the main Task agent

 The Intent Agent has no tools. If a request needs a tool, do not apologize and do not say you cannot access it. Route it to Task mode so the main agent can use tools.

-When there is an active task, do not force every new user message into that task. Use the active task and recent conversation to decide:
+When there is an active task, do not force every new user message into that task. A Session is the durable conversation/device/group context; a Task is one unit of work inside that Session. Use the active task and recent conversation to decide:

 - Choose `revise_task` when the user asks to change, correct, refine, expand, reformat, or redo the latest active task result.
-- Choose `continue_task` for neutral follow-up questions or additional next steps that still belong to the active task.
+- Choose `continue_task` for neutral follow-up questions or additional next steps that explicitly depend on or extend the active task's latest result.
 - Choose `simple_chat` for unrelated lightweight conversation. This starts a new topic and the previous task will be accepted automatically.
 - Choose `new_task` when the user asks for clearly unrelated work that needs Task capabilities. This starts a new topic and the previous task will be accepted automatically.
+- Choose `new_task` for a standalone tool-dependent request even when it resembles the active task. Repeating "珠海天气怎么样" later is a fresh task unless the user clearly says to continue or revise the old result.
 - Choose `close_task` when the user says the task is satisfactory or finished, such as "可以了", "就这样", or "that's good".
 - Choose `abandon_task` when the user says to stop, cancel, or no longer do the active task.

@ -46,6 +47,7 @@ Examples with an active weather task:
 - "再详细一点" -> `revise_task`
 - "加上明后天穿衣建议" -> `revise_task`
 - "顺便查一下深圳" -> `continue_task`
+- "珠海天气怎么样" -> `new_task` when asked as a standalone later request
 - "帮我写一个采购合同" -> `new_task`
 - "吃饭没" -> `simple_chat`
 - "我在冰岛" -> `simple_chat`
--- a/app-instance/backend/beaver/skills/catalog/loader.py
+++ b/app-instance/backend/beaver/skills/catalog/loader.py
@ -27,6 +27,7 @@ from beaver.skills.specs.storage import SkillSpecStore
 from .utils import (
    check_requirements,
    escape_xml,
+    extract_required_tool_names,
    get_missing_requirements,
    parse_frontmatter,
    parse_skill_metadata_blob,
@ -111,13 +112,19 @@ class SkillsLoader:
                if not include_internal and _truthy(frontmatter.get("internal")):
                    continue
                normalized_frontmatter = dict(frontmatter)
+                meta_blob = parse_skill_metadata_blob(frontmatter.get("metadata", ""))
                record = SkillRecord(
                    name=name,
                    path=skill_file,
                    source=source,
                    version="legacy",
                    source_kind=source,
-                    tool_hints=self._coerce_tool_names(frontmatter.get("tools")),
+                    tool_hints=self._merge_tool_names(
+                        self._coerce_tool_names(frontmatter.get("tools")),
+                        self._coerce_tool_names(meta_blob.get("tools")),
+                        self._coerce_tool_names(meta_blob.get("required_tools")),
+                        extract_required_tool_names(body),
+                    ),
                    frontmatter=normalized_frontmatter,
                    description=str(frontmatter.get("description") or summarize_body(body) or name),
                )
@ -138,6 +145,7 @@ class SkillsLoader:
                path = self.workspace_skills / name / "SKILL.md"
            else:
                path = self.workspace_skills / name / "versions" / loaded.version.version / "SKILL.md"
+            _frontmatter, body = parse_frontmatter(loaded.content)
            record = SkillRecord(
                name=name,
                path=path,
@ -146,7 +154,10 @@ class SkillsLoader:
                content_hash=loaded.version.content_hash,
                source_kind=str(loaded.version.provenance.get("source_kind") or "workspace"),
                status=str(loaded.version.review_state or "published"),
-                tool_hints=list(loaded.version.tool_hints),
+                tool_hints=self._merge_tool_names(
+                    loaded.version.tool_hints,
+                    extract_required_tool_names(body),
+                ),
                frontmatter=dict(loaded.version.frontmatter),
                description=str(loaded.version.frontmatter.get("description") or loaded.version.summary or name),
            )
@ -201,23 +212,32 @@ class SkillsLoader:
            - read_file
            - search_files
         - Also accepts `tools` from the metadata JSON blob
+        - Also accepts the `## Required Tools` section of a canonical skill body
        """

        record = self._find_record(name)
        if record is not None and record.tool_hints:
            return list(record.tool_hints)

-        frontmatter = self.get_skill_metadata(name) or {}
+        content = self.load_published_skill(name) or self.load_skill(name) or ""
+        frontmatter, body = parse_frontmatter(content)
+        frontmatter = frontmatter or self.get_skill_metadata(name) or {}
        meta_blob = parse_skill_metadata_blob(frontmatter.get("metadata", ""))
-        names = [
-            *self._coerce_tool_names(frontmatter.get("tools")),
-            *self._coerce_tool_names(meta_blob.get("tools")),
-            *self._coerce_tool_names(meta_blob.get("required_tools")),
-        ]
+        names = self._merge_tool_names(
+            self._coerce_tool_names(frontmatter.get("tools")),
+            self._coerce_tool_names(meta_blob.get("tools")),
+            self._coerce_tool_names(meta_blob.get("required_tools")),
+            extract_required_tool_names(body),
+        )
+        return names
+
+    @staticmethod
+    def _merge_tool_names(*groups: Any) -> list[str]:
        result: list[str] = []
-        for item in names:
-            if item and item not in result:
-                result.append(item)
+        for group in groups:
+            for item in SkillsLoader._coerce_tool_names(group):
+                if item and item not in result:
+                    result.append(item)
        return result

    def load_skills_for_context(self, skill_names: list[str]) -> str:
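
Note on the tool-hint merge above: a simplified sketch of how names from a `## Required Tools` section might be combined with frontmatter tools while keeping first-seen order. The function names `required_tools` and `merge_unique` are illustrative stand-ins, and only backtick-quoted bullet items are handled here, which is narrower than `extract_required_tool_names` in this change.

```python
from __future__ import annotations

import re

# Same section pattern as the diff: the heading line, then everything up to the next "## " heading.
SECTION_RE = re.compile(r"(?ims)^##\s+Required\s+Tools\s*$\n(?P<section>.*?)(?=^##\s+|\Z)")


def required_tools(body: str) -> list[str]:
    """Illustrative: collect backtick-quoted names from bullets in the section."""
    match = SECTION_RE.search(body)
    if match is None:
        return []
    names: list[str] = []
    for line in match.group("section").splitlines():
        stripped = line.strip()
        if not stripped.startswith(("-", "*")):
            continue  # ignore prose lines inside the section
        for name in re.findall(r"`([^`]+)`", stripped):
            if name not in names:
                names.append(name)
    return names


def merge_unique(*groups: list[str]) -> list[str]:
    """Illustrative: order-preserving union of several name lists."""
    result: list[str] = []
    for group in groups:
        for item in group:
            if item and item not in result:
                result.append(item)
    return result


body = (
    "# Demo\n\n## Required Tools\n\n- `read_file`\n- `search_files`\n\n"
    "## Workflow\n\n- Use `web_fetch` here.\n"
)
merged = merge_unique(["search_files"], required_tools(body))
print(merged)
```

Because the section match stops at the next `## ` heading, the `web_fetch` mention under Workflow is not picked up as a tool hint.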
--- a/app-instance/backend/beaver/skills/catalog/utils.py
+++ b/app-instance/backend/beaver/skills/catalog/utils.py
@ -84,6 +84,41 @@ def strip_frontmatter(content: str) -> str:
    return body


+def extract_required_tool_names(body: str) -> list[str]:
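+    """Extract tool names from the `## Required Tools` section of a canonical skill body.
+
+    A fault-tolerant supplement to frontmatter `tools`; it never guesses tools from arbitrary text.
+    Only the explicitly named Required Tools section is read; common bullet/code formats are supported.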
+    """
+
+    if not body:
+        return []
+
+    match = re.search(
+        r"(?ims)^##\s+Required\s+Tools\s*$\n(?P<section>.*?)(?=^##\s+|\Z)",
+        body,
+    )
+    if match is None:
+        return []
+
+    names: list[str] = []
+    for line in match.group("section").splitlines():
+        stripped = line.strip()
+        if not stripped or not stripped.startswith(("-", "*")):
+            continue
+        candidate = stripped[1:].strip()
+        code_matches = re.findall(r"`([^`]+)`", candidate)
+        raw_items = code_matches or re.split(r"[,，]", candidate)
+        for raw_item in raw_items:
+            name = raw_item.strip().strip("`\"' ")
+            if not name:
+                continue
+            token = name.split()[0].strip("`\"' :：-")
+            if re.fullmatch(r"[A-Za-z0-9_.:-]+", token) and token not in names:
+                names.append(token)
+    return names
+
+
 def parse_skill_metadata_blob(raw: str) -> dict[str, Any]:
     """Parse the JSON extension config stored in the metadata field.

--- a/app-instance/backend/beaver/skills/learning/eval.py
+++ b/app-instance/backend/beaver/skills/learning/eval.py
@ -2,6 +2,8 @@

 from __future__ import annotations

+import json
+from typing import Any
 from uuid import uuid4

 from beaver.engine.context import SkillContext
@ -39,7 +41,16 @@ class SkillDraftEvaluator:
            return self._skipped(candidate, draft)

        runs = self.run_store.list_runs()
-        replay_cases = select_replay_cases(candidate, runs)
+        if replay_runner is not None:
+            replay_cases, case_selection_meta = await _prepare_eval_cases(
+                candidate=candidate,
+                draft=draft,
+                historical_cases=select_replay_cases(candidate, runs),
+                provider_bundle=provider_bundle,
+            )
+        else:
+            replay_cases = []
+            case_selection_meta = {}
        if replay_runner is not None and replay_cases:
            return await self._evaluate_replay(
                candidate=candidate,
@ -47,6 +58,7 @@ class SkillDraftEvaluator:
                replay_cases=replay_cases,
                provider_bundle=provider_bundle,
                replay_runner=replay_runner,
+                case_selection_meta=case_selection_meta,
            )
        return self._evaluate_heuristic(candidate, draft, runs)

@ -58,7 +70,7 @@ class SkillDraftEvaluator:
    ) -> SkillDraftEvalReport:
        runs_by_id = {record.run_id: record for record in runs}
        cases: list[dict] = []
-        for run_id in candidate.source_run_ids[:8]:
+        for run_id in candidate.source_run_ids[:10]:
            record = runs_by_id.get(run_id)
            if record is None:
                continue
@ -116,6 +128,7 @@ class SkillDraftEvaluator:
        replay_cases: list[dict],
        provider_bundle: ProviderBundle,
        replay_runner: ReplayRunner,
+        case_selection_meta: dict[str, Any] | None = None,
    ) -> SkillDraftEvalReport:
        case_reports: list[dict] = []
        legacy_cases: list[dict] = []
@ -147,17 +160,43 @@ class SkillDraftEvaluator:
                baseline=baseline,
                candidate=candidate_arm,
            )
-            baseline_score = surrogate["baseline_score"]
-            candidate_score = surrogate["candidate_score"]
+            baseline_ability = _ability_score(
+                case=case,
+                arm=baseline,
+                arm_name="baseline",
+            )
+            candidate_ability = _ability_score(
+                case=case,
+                arm=candidate_arm,
+                arm_name="candidate",
+            )
+            baseline_score = baseline_ability["final_score"]
+            candidate_score = candidate_ability["final_score"]
+            tool_execution_score = {
+                "baseline_score": surrogate["baseline_score"],
+                "candidate_score": surrogate["candidate_score"],
+                "delta": round(surrogate["candidate_score"] - surrogate["baseline_score"], 4),
+                "score_role": "diagnostic_only",
+            }
            case_report = {
                "run_id": case["run_id"],
                "task_id": case.get("task_id"),
                "session_id": case.get("session_id"),
+                "task_text": case.get("task_text"),
+                "synthetic": bool(case.get("synthetic")),
+                "tier": case.get("tier") or ("bronze" if case.get("synthetic") else "gold"),
+                "validator": case.get("validator"),
                "baseline": baseline,
                "candidate": candidate_arm,
                "baseline_score": baseline_score,
                "candidate_score": candidate_score,
                "delta": round(candidate_score - baseline_score, 4),
+                "ability_score": {
+                    "baseline": baseline_ability,
+                    "candidate": candidate_ability,
+                    "delta": round(candidate_score - baseline_score, 4),
+                },
+                "tool_execution_score": tool_execution_score,
                "execution_coverage": _arm_mode_coverage(baseline, candidate_arm, "executed"),
                "surrogate_coverage": _arm_mode_coverage(baseline, candidate_arm, "surrogate"),
                "blocked_tool_count": _arm_mode_count(baseline, candidate_arm, "blocked"),
@ -172,13 +211,23 @@ class SkillDraftEvaluator:
                {
                    "run_id": case["run_id"],
                    "session_id": case.get("session_id") or "",
+                    "task_text": case.get("task_text") or "",
+                    "synthetic": bool(case.get("synthetic")),
+                    "tier": case.get("tier") or ("bronze" if case.get("synthetic") else "gold"),
                    "baseline_score": baseline_score,
                    "candidate_score": candidate_score,
                    "delta": round(candidate_score - baseline_score, 4),
                }
            )
        preservation_report = _preservation_report(candidate, draft)
-        return _report_from_case_reports(candidate, draft, case_reports, legacy_cases, preservation_report)
+        return _report_from_case_reports(
+            candidate,
+            draft,
+            case_reports,
+            legacy_cases,
+            preservation_report,
+            case_selection_meta or {},
+        )

    def _skipped(self, candidate: SkillLearningCandidate, draft: SkillDraft) -> SkillDraftEvalReport:
        return SkillDraftEvalReport(
@ -238,22 +287,400 @@ def _preservation_report(candidate: SkillLearningCandidate, draft: SkillDraft) -
    return check_preservation(base_content=base_content, draft_content=draft.proposed_content)


+async def _prepare_eval_cases(
+    *,
+    candidate: SkillLearningCandidate,
+    draft: SkillDraft,
+    historical_cases: list[dict[str, Any]],
+    provider_bundle: ProviderBundle,
+) -> tuple[list[dict[str, Any]], dict[str, Any]]:
+    explicit_cases = _explicit_eval_cases(candidate)
+    merged = _dedupe_cases([*explicit_cases, *historical_cases])
+    usable, excluded = _filter_unscorable_cases(merged)
+    missing = max(0, 10 - len(usable))
+    generated: list[dict[str, Any]] = []
+    if missing:
+        generated = await _generate_synthetic_cases(
+            candidate=candidate,
+            draft=draft,
+            historical_cases=usable,
+            provider_bundle=provider_bundle,
+            count=missing,
+        )
+        generated, generated_excluded = _filter_unscorable_cases(generated)
+        excluded["synthetic_without_validator"] += generated_excluded["synthetic_without_validator"]
+        if len(generated) < missing:
+            generated.extend(
+                _fallback_synthetic_cases(
+                    candidate=candidate,
+                    historical_cases=usable,
+                    start_index=len(generated) + 1,
+                    count=missing - len(generated),
+                )
+            )
+    prepared = [*usable, *generated]
+    return prepared[:10], {
+        "requested_case_count": 10,
+        "historical_case_count": len(historical_cases),
+        "explicit_case_count": len(explicit_cases),
+        "generated_synthetic_count": sum(1 for item in prepared if item.get("synthetic")),
+        "excluded_synthetic_without_validator": excluded["synthetic_without_validator"],
+    }
+
+
+def _explicit_eval_cases(candidate: SkillLearningCandidate) -> list[dict[str, Any]]:
+    raw_cases = candidate.evidence.get("eval_cases") if isinstance(candidate.evidence, dict) else None
+    if not isinstance(raw_cases, list):
+        return []
+    result: list[dict[str, Any]] = []
+    for index, raw in enumerate(raw_cases, start=1):
+        if not isinstance(raw, dict):
+            continue
+        task_text = str(raw.get("task_text") or "").strip()
+        if not task_text:
+            continue
+        case = {
+            "run_id": str(raw.get("run_id") or f"explicit:{candidate.candidate_id}:{index:02d}"),
+            "task_id": raw.get("task_id") or f"explicit-{index:02d}",
+            "session_id": raw.get("session_id") or "explicit-eval",
+            "task_text": task_text,
+            "baseline_skill_names": list(raw.get("baseline_skill_names") or _baseline_skill_names(candidate)),
+            "candidate_skill_name": raw.get("candidate_skill_name") or candidate.draft_skill_name,
+            "accepted_score": _bounded_score(raw.get("accepted_score"), default=0.75),
+            "synthetic": bool(raw.get("synthetic")),
+            "tier": raw.get("tier") or ("bronze" if raw.get("synthetic") else "gold"),
+        }
+        if isinstance(raw.get("validator"), dict):
+            case["validator"] = dict(raw["validator"])
+        result.append(case)
+    return result
+
+
+def _dedupe_cases(cases: list[dict[str, Any]]) -> list[dict[str, Any]]:
+    result: list[dict[str, Any]] = []
+    seen: set[str] = set()
+    for case in cases:
+        run_id = str(case.get("run_id") or "")
+        task_text = str(case.get("task_text") or "")
+        key = run_id or task_text
+        if not key or key in seen:
+            continue
+        seen.add(key)
+        result.append(case)
+    return result
+
+
+def _filter_unscorable_cases(cases: list[dict[str, Any]]) -> tuple[list[dict[str, Any]], dict[str, int]]:
+    result: list[dict[str, Any]] = []
+    excluded = {"synthetic_without_validator": 0}
+    for case in cases:
+        if case.get("synthetic") and not isinstance(case.get("validator"), dict):
+            excluded["synthetic_without_validator"] += 1
+            continue
+        result.append(case)
+    return result, excluded
+
+
+async def _generate_synthetic_cases(
+    *,
+    candidate: SkillLearningCandidate,
+    draft: SkillDraft,
+    historical_cases: list[dict[str, Any]],
+    provider_bundle: ProviderBundle,
+    count: int,
+) -> list[dict[str, Any]]:
+    provider = provider_bundle.auxiliary_provider or provider_bundle.main_provider
+    runtime = provider_bundle.auxiliary_runtime or provider_bundle.main_runtime
+    model = getattr(runtime, "model", None)
+    try:
+        response = await provider.chat(
+            messages=[
+                {
+                    "role": "system",
+                    "content": (
+                        "You generate validator-first Beaver skill evaluation cases. "
+                        "Return only JSON with key cases. Each case must include task_text and validator. "
+                        "Validator type should be final_answer_contains with required_terms and optional forbidden_terms."
+                    ),
+                },
+                {
+                    "role": "user",
+                    "content": _synthetic_case_prompt(
+                        candidate=candidate,
+                        draft=draft,
+                        historical_cases=historical_cases,
+                        count=count,
+                    ),
+                },
+            ],
+            model=model,
+            max_tokens=2200,
+            temperature=0.4,
+        )
+    except Exception:
+        return []
+    payload = _parse_json_payload(response.content or "")
+    raw_cases = payload.get("cases") if isinstance(payload, dict) else None
+    if not isinstance(raw_cases, list):
+        return []
+    return _synthetic_case_payloads(candidate, raw_cases, start_index=1, limit=count)
+
+
+def _synthetic_case_prompt(
+    *,
+    candidate: SkillLearningCandidate,
+    draft: SkillDraft,
+    historical_cases: list[dict[str, Any]],
+    count: int,
+) -> str:
+    historical = [
+        {
+            "run_id": item.get("run_id"),
+            "task_text": item.get("task_text"),
+            "validator": item.get("validator"),
+        }
+        for item in historical_cases
+    ]
+    return (
+        f"Generate {count} synthetic evaluation cases for this skill draft.\n\n"
+        f"Candidate kind: {candidate.kind}\n"
+        f"Candidate reason: {candidate.reason}\n"
+        f"Draft skill name: {draft.skill_name}\n"
+        f"Related skills: {candidate.related_skill_names}\n"
+        f"Historical cases:\n{json.dumps(historical, ensure_ascii=False)}\n\n"
+        "Every synthetic case must be validator-first. Return exactly:\n"
+        '{"cases":[{"task_text":"...","validator":{"type":"final_answer_contains",'
+        '"required_terms":["..."],"forbidden_terms":["..."]},"tier":"bronze"}]}'
+    )
+
+
+def _parse_json_payload(content: str) -> dict[str, Any]:
+    cleaned = content.strip()
+    if cleaned.startswith("```"):
+        cleaned = cleaned.strip("`")
+        if cleaned.startswith("json"):
+            cleaned = cleaned[4:]
+    try:
+        payload = json.loads(cleaned)
+    except json.JSONDecodeError:
+        start = cleaned.find("{")
+        end = cleaned.rfind("}")
+        if start < 0 or end <= start:
+            return {}
+        try:
+            payload = json.loads(cleaned[start : end + 1])
+        except json.JSONDecodeError:
+            return {}
+    return payload if isinstance(payload, dict) else {}
+
+
+def _synthetic_case_payloads(
+    candidate: SkillLearningCandidate,
+    raw_cases: list[Any],
+    *,
+    start_index: int,
+    limit: int,
+) -> list[dict[str, Any]]:
+    result: list[dict[str, Any]] = []
+    for raw in raw_cases:
+        if not isinstance(raw, dict):
+            continue
+        task_text = str(raw.get("task_text") or "").strip()
+        validator = raw.get("validator")
+        if not task_text or not isinstance(validator, dict):
+            continue
+        result.append(
+            _synthetic_case_payload(
+                candidate,
+                task_text,
+                start_index + len(result),
+                validator=dict(validator),
+                tier=str(raw.get("tier") or "bronze"),
+            )
+        )
+        if len(result) >= limit:
+            break
+    return result
+
+
+def _fallback_synthetic_cases(
+    *,
+    candidate: SkillLearningCandidate,
+    historical_cases: list[dict[str, Any]],
+    start_index: int,
+    count: int,
+) -> list[dict[str, Any]]:
+    seed_text = ""
+    if historical_cases:
+        seed_text = str(historical_cases[(start_index - 1) % len(historical_cases)].get("task_text") or "")
+    if not seed_text:
+        seed_text = candidate.reason or candidate.draft_skill_name or "the candidate skill"
+    required_terms = _terms(seed_text)[:2] or ["done"]
+    return [
+        _synthetic_case_payload(
+            candidate,
+            f"Complete a realistic task related to {seed_text}. Scenario {index}.",
+            index,
+            validator={"type": "final_answer_contains", "required_terms": required_terms, "forbidden_terms": []},
+            tier="bronze",
+        )
+        for index in range(start_index, start_index + count)
+    ]
+
+
+def _synthetic_case_payload(
+    candidate: SkillLearningCandidate,
+    task_text: str,
+    index: int,
+    *,
+    validator: dict[str, Any],
+    tier: str,
+) -> dict[str, Any]:
+    return {
+        "run_id": f"synthetic:{candidate.candidate_id}:{index:02d}",
+        "task_id": f"synthetic-{index:02d}",
+        "session_id": "synthetic-eval",
+        "task_text": task_text,
+        "baseline_skill_names": _baseline_skill_names(candidate),
+        "candidate_skill_name": candidate.draft_skill_name,
+        "accepted_score": 0.75,
+        "synthetic": True,
+        "tier": tier,
+        "validator": validator,
+    }
+
+
+def _baseline_skill_names(candidate: SkillLearningCandidate) -> list[str]:
+    if candidate.kind == "revise_skill":
+        return list(candidate.related_skill_names[:1])
+    if candidate.kind == "merge_skills":
+        return list(candidate.related_skill_names)
+    return []
+
+
+def _ability_score(*, case: dict[str, Any], arm: dict[str, Any], arm_name: str) -> dict[str, Any]:
+    validator = case.get("validator") if isinstance(case.get("validator"), dict) else None
+    if validator is not None:
+        return _ability_from_validator(validator, arm)
+    if not case.get("synthetic"):
+        score = _bounded_score(case.get("accepted_score"), default=0.75) if arm_name == "baseline" else _ability_from_output(arm)["final_score"]
+        return _ability_breakdown(score=score, source="user_feedback" if arm_name == "baseline" else "llm_judge")
+    return _ability_breakdown(score=0.0, source="unscored", notes=["Synthetic cases require a validator."])
+
+
+def _ability_from_validator(validator: dict[str, Any], arm: dict[str, Any]) -> dict[str, Any]:
+    validator_type = str(validator.get("type") or "")
+    if validator_type != "final_answer_contains":
+        return _ability_from_output(arm, source="llm_judge", notes=[f"Unsupported validator type: {validator_type}"])
+
+    answer = str(arm.get("final_answer") or "").lower()
+    required_terms = [str(item).lower() for item in validator.get("required_terms") or [] if str(item).strip()]
+    forbidden_terms = [str(item).lower() for item in validator.get("forbidden_terms") or [] if str(item).strip()]
+    matched = sum(1 for term in required_terms if term in answer)
+    outcome = matched / max(1, len(required_terms))
+    unsafe = any(term in answer for term in forbidden_terms)
+    safety = 0.0 if unsafe else 1.0
+    final_score = (
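+        0.40 * outcome  # outcome_correctness
+        + 0.25 * outcome  # artifact_correctness (proxied by the same term match)
+        + 0.15 * _process_validity(arm)
+        + 0.10 * safety  # safety_no_regression
+        + 0.10 * _path_efficiency(arm, outcome)  # weights sum to 1.0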
+    )
+    return {
+        **_ability_breakdown(score=final_score, source="auto_validator"),
+        "outcome_correctness": round(outcome, 4),
+        "artifact_correctness": round(outcome, 4),
+        "safety_no_regression": round(safety, 4),
+        "validator_type": validator_type,
+    }
+
+
+def _ability_from_output(arm: dict[str, Any], *, source: str = "llm_judge", notes: list[str] | None = None) -> dict[str, Any]:
+    answer = str(arm.get("final_answer") or "").strip()
+    score = 0.7 if answer and arm.get("finish_reason") != "error" else 0.3  # coarse heuristic; no judge model is called here
+    return _ability_breakdown(score=score, source=source, notes=notes)
+
+
+def _ability_breakdown(*, score: float, source: str, notes: list[str] | None = None) -> dict[str, Any]:
+    bounded = _bounded_score(score, default=0.0)
+    return {
+        "outcome_correctness": bounded,
+        "artifact_correctness": bounded,
+        "process_validity": bounded,
+        "safety_no_regression": bounded,
+        "path_efficiency": bounded,
+        "final_score": round(bounded, 4),
+        "source": source,
+        "notes": list(notes or []),
+    }
+
+
+def _process_validity(arm: dict[str, Any]) -> float:
+    if arm.get("finish_reason") == "error":
+        return 0.2
+    return 0.8 if arm.get("tool_calls") else 0.6
+
+
+def _path_efficiency(arm: dict[str, Any], outcome: float) -> float:
+    if outcome < 0.5:
+        return 0.3
+    call_count = len([item for item in arm.get("tool_calls") or [] if isinstance(item, dict)])
+    if call_count <= 3:
+        return 1.0
+    if call_count <= 6:
+        return 0.7
+    return 0.4
+
+
+def _bounded_score(value: Any, *, default: float) -> float:
+    try:
+        return max(0.0, min(1.0, float(value)))
+    except (TypeError, ValueError):
+        return default
+
+
+def _terms(text: str) -> list[str]:
+    return [part.strip(".,:;!?()[]{}").lower() for part in text.split() if len(part.strip(".,:;!?()[]{}")) > 3]
+
+
 def _report_from_case_reports(
    candidate: SkillLearningCandidate,
    draft: SkillDraft,
    case_reports: list[dict],
    legacy_cases: list[dict],
    preservation_report: dict | None,
+    case_selection_meta: dict[str, Any] | None = None,
 ) -> SkillDraftEvalReport:
    baseline_avg = sum(item["baseline_score"] for item in legacy_cases) / len(legacy_cases)
    candidate_avg = sum(item["candidate_score"] for item in legacy_cases) / len(legacy_cases)
    regressions = [item for item in legacy_cases if item["candidate_score"] < item["baseline_score"]]
    improved = [item for item in legacy_cases if item["candidate_score"] > item["baseline_score"]]
    unchanged = len(legacy_cases) - len(regressions) - len(improved)
+    real_cases = [item for item in legacy_cases if not item.get("synthetic")]
+    synthetic_cases = [item for item in legacy_cases if item.get("synthetic")]
    execution, surrogate, blocked = _coverage(case_reports)
    confidence = _confidence(execution, surrogate, blocked, [item.get("confidence") for item in case_reports])
    score_delta = candidate_avg - baseline_avg
    passed = candidate_avg >= 0.75 and not (regressions and score_delta <= 0) and blocked < 1.0
+    selection_meta = dict(case_selection_meta or {})
+    real_score_avg = _avg([item["candidate_score"] for item in real_cases])
+    synthetic_score_avg = _avg([item["candidate_score"] for item in synthetic_cases])
+    overall_score_avg = round(candidate_avg, 4)
+    ability_summary = {
+        "score_role": "primary",
+        "real_case_count": len(real_cases),
+        "synthetic_case_count": len(synthetic_cases),
+        "real_score_avg": real_score_avg,
+        "synthetic_score_avg": synthetic_score_avg,
+        "overall_score_avg": overall_score_avg,
+    }
+    tool_execution_summary = {
+        "score_role": "diagnostic_only",
+        "executed": execution,
+        "surrogate": surrogate,
+        "blocked": blocked,
+    }
    return SkillDraftEvalReport(
        report_id=uuid4().hex,
        skill_name=draft.skill_name,
@ -276,11 +703,34 @@ def _report_from_case_reports(
        blocked_coverage=blocked,
        confidence=confidence,
        case_reports=case_reports,
-        tool_mode_summary={"executed": execution, "surrogate": surrogate, "blocked": blocked},
+        tool_mode_summary={
+            "executed": execution,
+            "surrogate": surrogate,
+            "blocked": blocked,
+            "score_role": "diagnostic_only",
+            "real_case_count": len(real_cases),
+            "synthetic_case_count": len(synthetic_cases),
+            "real_score_avg": real_score_avg,
+            "synthetic_score_avg": synthetic_score_avg,
+            "overall_score_avg": overall_score_avg,
+            **selection_meta,
+        },
+        ability_score_summary=ability_summary,
+        tool_execution_summary=tool_execution_summary,
+        case_selection_summary=selection_meta,
+        real_score_avg=real_score_avg,
+        synthetic_score_avg=synthetic_score_avg,
+        overall_score_avg=overall_score_avg,
        preservation_report=preservation_report,
    )


+def _avg(values: list[float]) -> float | None:
+    if not values:
+        return None
+    return round(sum(values) / len(values), 4)
+
+
 def _coverage(case_reports: list[dict]) -> tuple[float, float, float]:
    counts = {"executed": 0, "surrogate": 0, "blocked": 0}
    for report in case_reports:
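Reviewer note on the evaluator.py hunks above: a minimal standalone sketch of how the new per-source averages behave, assuming case dicts carry `candidate_score` and an optional `synthetic` flag as they do in the diff. `avg` mirrors the added `_avg` helper; `split_scores` is a hypothetical name used only for illustration and does not exist in the codebase.

```python
from __future__ import annotations


def avg(values: list[float]) -> float | None:
    """Mirror of the added _avg helper: None for an empty list, else a 4-decimal mean."""
    if not values:
        return None
    return round(sum(values) / len(values), 4)


def split_scores(cases: list[dict]) -> dict:
    """Hypothetical helper: partition cases the way _report_from_case_reports does."""
    real = [c for c in cases if not c.get("synthetic")]
    synthetic = [c for c in cases if c.get("synthetic")]
    return {
        "real_case_count": len(real),
        "synthetic_case_count": len(synthetic),
        "real_score_avg": avg([c["candidate_score"] for c in real]),
        "synthetic_score_avg": avg([c["candidate_score"] for c in synthetic]),
        "overall_score_avg": avg([c["candidate_score"] for c in cases]),
    }


cases = [
    {"candidate_score": 0.8},
    {"candidate_score": 0.9, "synthetic": True},
]
summary = split_scores(cases)
only_synthetic = split_scores([{"candidate_score": 0.6, "synthetic": True}])
```

When every case is synthetic, `real_score_avg` is `None`, so consumers of `ability_score_summary` probably need to handle a missing real average. Note also that the diff computes `baseline_avg` and `candidate_avg` without an empty-list guard; whether callers always pass a non-empty `legacy_cases` is not visible in this hunk.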
--- a/app-instance/backend/beaver/skills/learning/pipeline.py
+++ b/app-instance/backend/beaver/skills/learning/pipeline.py
@ -323,8 +323,8 @@ class SkillLearningPipelineService:

    def _validate_publish_gates(self, draft: SkillDraft, *, confirm_high_risk: bool) -> None:
        reviews = self.reviews_for_draft(draft.skill_name, draft.draft_id)
-        if not any(review.status == SkillReviewState.APPROVED.value for review in reviews):
-            raise ValueError("Draft must have an approved review before publish")
+        if not any(review.status in {SkillReviewState.IN_REVIEW.value, SkillReviewState.APPROVED.value} for review in reviews):
+            raise ValueError("Draft must be submitted for review before publish")
        safety = self.get_safety_report(draft.skill_name, draft.draft_id)
        if safety is None:
            raise ValueError("Draft requires a passing safety report before publish")
--- a/app-instance/backend/beaver/skills/learning/replay.py
+++ b/app-instance/backend/beaver/skills/learning/replay.py
@ -162,18 +162,23 @@ class ReplayRunner:
            registry=loaded.tool_registry,
            policy=self.policy,
        )
-        result = await self.agent_loop.process_direct(
-            request.task_text,
-            provider_bundle=request.provider_bundle,
-            include_skill_assembly=False,
-            include_tools=True,
-            pinned_skill_names=request.pinned_skill_names,
-            pinned_skill_contexts=request.pinned_skill_contexts,
-            max_tool_iterations=int(request.model_settings.get("max_tool_iterations") or 4),
-            temperature=float(request.model_settings.get("temperature") or 0.0),
-            source="skill_replay_eval",
-            tool_executor_override=replay_executor,
-        )
+        direct_kwargs = {
+            "provider_bundle": request.provider_bundle,
+            "include_skill_assembly": False,
+            "include_tools": True,
+            "pinned_skill_names": request.pinned_skill_names,
+            "pinned_skill_contexts": request.pinned_skill_contexts,
+            "max_tool_iterations": int(request.model_settings.get("max_tool_iterations") or 4),
+            "temperature": float(request.model_settings.get("temperature") or 0.0),
+            "source": "skill_replay_eval",
+            "tool_executor_override": replay_executor,
+        }
+        try:
+            result = await self.agent_loop.process_direct(request.task_text, **direct_kwargs)
+        except RuntimeError as exc:
+            if not _is_process_direct_disabled_while_running(exc) or not hasattr(self.agent_loop, "submit_direct"):
+                raise
+            result = await self.agent_loop.submit_direct(request.task_text, **direct_kwargs)
        return {
            "case_id": request.case_id,
            "arm": request.arm,
@ -188,6 +193,14 @@ class ReplayRunner:
        }


+def _is_process_direct_disabled_while_running(exc: RuntimeError) -> bool:
+    message = str(exc)
+    return (
+        "AgentLoop.process_direct() is disabled while run() is active" in message
+        and "submit tasks via submit_direct() instead" in message
+    )
+
+
 def _side_effects_from_traces(traces: list[dict[str, Any]]) -> list[dict[str, Any]]:
    effects: list[dict[str, Any]] = []
    for trace in traces:
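Reviewer note on the replay.py hunk above: a small sketch of the fallback pattern it introduces, using fake agent loops. The two substrings checked are taken from the diff; the full message text, the `Fake*` classes, and `run_direct` are assumptions made for illustration only.

```python
import asyncio

# Hypothetical full message; the diff only checks for the two substrings below.
DISABLED_MESSAGE = (
    "AgentLoop.process_direct() is disabled while run() is active; "
    "submit tasks via submit_direct() instead"
)


def is_disabled_while_running(exc: RuntimeError) -> bool:
    message = str(exc)
    return (
        "AgentLoop.process_direct() is disabled while run() is active" in message
        and "submit tasks via submit_direct() instead" in message
    )


class FakeBusyLoop:
    """Stands in for an agent loop whose run() is already active."""

    async def process_direct(self, task_text, **kwargs):
        raise RuntimeError(DISABLED_MESSAGE)

    async def submit_direct(self, task_text, **kwargs):
        return {"via": "submit_direct", "task": task_text}


class FakeBrokenLoop(FakeBusyLoop):
    """Raises an unrelated RuntimeError that must not trigger the fallback."""

    async def process_direct(self, task_text, **kwargs):
        raise RuntimeError("boom")


async def run_direct(agent_loop, task_text, **kwargs):
    try:
        return await agent_loop.process_direct(task_text, **kwargs)
    except RuntimeError as exc:
        # Fall back only for the specific "disabled while running" error.
        if not is_disabled_while_running(exc) or not hasattr(agent_loop, "submit_direct"):
            raise
        return await agent_loop.submit_direct(task_text, **kwargs)


def unrelated_error_propagates() -> bool:
    try:
        asyncio.run(run_direct(FakeBrokenLoop(), "demo task"))
    except RuntimeError:
        return True
    return False


result = asyncio.run(run_direct(FakeBusyLoop(), "demo task"))
```

Matching on message text works but couples the runner to the exact wording of the error. If `AgentLoop` can be changed, a dedicated exception subclass might be a sturdier signal.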
--- a/app-instance/backend/beaver/skills/learning/service.py
+++ b/app-instance/backend/beaver/skills/learning/service.py
@ -99,6 +99,7 @@ class SkillLearningService:
        ]
        source_run_ids = [record.run_id for record in source_runs]
        source_session_ids = list(dict.fromkeys(record.session_id for record in source_runs))
+        representative_task_text = self._representative_task_text(source_runs, fallback=final_run.task_text)

        if not published_receipts:
            candidates.append(
@ -113,7 +114,8 @@ class SkillLearningService:
                        "task_id": task_id,
                        "final_accepted_run_id": final_accepted_run_id,
                        "source_run_ids": source_run_ids,
-                        "theme": self._task_theme(final_run.task_text),
+                        "task_text": representative_task_text,
+                        "theme": self._task_theme(representative_task_text),
                    },
                    status="open",
                    priority=1,
@ -329,8 +331,14 @@ class SkillLearningService:

    def _build_new_skill_candidates(self) -> list[SkillLearningCandidate]:
        groups: dict[str, list[RunRecord]] = {}
-        for record in self.run_store.list_runs():
-            key = self._task_theme(record.task_text)
+        all_runs = self.run_store.list_runs()
+        runs_by_task: dict[str, list[RunRecord]] = {}
+        for record in all_runs:
+            if record.task_id:
+                runs_by_task.setdefault(record.task_id, []).append(record)
+        for record in all_runs:
+            task_runs = runs_by_task.get(record.task_id, [record])
+            key = self._task_theme(self._representative_task_text(task_runs, fallback=record.task_text))
            if not key:
                continue
            groups.setdefault(key, []).append(record)
@ -443,12 +451,24 @@ class SkillLearningService:

    @staticmethod
    def _task_theme(task_text: str) -> str:
-        cleaned = re.sub(r"\s+", " ", task_text.strip().lower())
+        cleaned = re.sub(r"\s+", " ", task_text.strip())
        if not cleaned:
            return ""
-        words = cleaned.split(" ")
+        first_sentence = re.split(r"[。！？.!?]", cleaned, maxsplit=1)[0].strip()
+        if not first_sentence:
+            first_sentence = cleaned
+        words = first_sentence.split(" ")
        return " ".join(words[:8]).strip()

+    @staticmethod
+    def _representative_task_text(runs: list[RunRecord], *, fallback: str = "") -> str:
+        ordered = sorted(runs, key=lambda item: (item.attempt_index, item.started_at, item.run_id))
+        for record in ordered:
+            text = record.task_text.strip()
+            if text:
+                return text
+        return fallback.strip()
+
    @staticmethod
    def _suggest_skill_name(
        candidate: SkillLearningCandidate,
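Reviewer note on the service.py hunks above: a standalone copy of the updated `_task_theme` logic, shown here only to make its edge cases concrete. The function name `task_theme` and the sample inputs are illustrative.

```python
import re


def task_theme(task_text: str) -> str:
    """Standalone copy of the updated _task_theme logic from the hunk above."""
    cleaned = re.sub(r"\s+", " ", task_text.strip())
    if not cleaned:
        return ""
    # Keep only the text before the first sentence terminator (CJK or ASCII).
    first_sentence = re.split(r"[。！？.!?]", cleaned, maxsplit=1)[0].strip()
    if not first_sentence:
        first_sentence = cleaned
    words = first_sentence.split(" ")
    return " ".join(words[:8]).strip()


cjk_theme = task_theme("查询珠海天气。然后整理成表格")
dotted_theme = task_theme("Fix app.py crash on startup")
upper_theme = task_theme("Check Weather")
lower_theme = task_theme("check weather")
```

Two consequences follow from the code as written. The ASCII `.` in the split pattern also cuts at file names and version numbers, so `"Fix app.py crash on startup"` yields `"Fix app"`. And because lowercasing was dropped, `"Check Weather"` and `"check weather"` now produce different grouping keys.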
--- a/app-instance/backend/beaver/skills/learning/surrogate.py
+++ b/app-instance/backend/beaver/skills/learning/surrogate.py
@ -15,12 +15,15 @@ class SurrogateToolEvaluator:
        return {
            "baseline_score": baseline_score,
            "candidate_score": candidate_score,
+            "baseline_tool_execution_score": baseline_score,
+            "candidate_tool_execution_score": candidate_score,
            "delta": round(candidate_score - baseline_score, 4),
            "surrogate_tool_count": surrogate_count,
            "blocked_tool_count": blocked_count,
+            "score_role": "diagnostic_only",
            "confidence": confidence,
            "notes": [
-                "Surrogate score is based on intended tool calls, schemas, arguments, and task relevance.",
+                "Tool execution score is diagnostic only and is not the main task ability score.",
            ],
        }

--- a/app-instance/backend/beaver/skills/learning/synthesizer.py
+++ b/app-instance/backend/beaver/skills/learning/synthesizer.py
@ -6,6 +6,7 @@ import json
 from typing import Any

 from beaver.engine.providers.base import LLMProvider
+from beaver.skills.authoring import canonical_skill_format_instructions, ensure_canonical_skill_body, normalize_skill_frontmatter
 from beaver.skills.learning.evidence import EvidencePacket
 from beaver.memory.skills.models import SkillLearningCandidate

@ -58,7 +59,8 @@ class SkillDraftSynthesizer:
                    "content": (
                        "You synthesize Beaver skill drafts from execution evidence. "
                        "Return only JSON with keys: frontmatter, content, change_reason, "
-                        "preserved_sections, changed_sections, dropped_sections."
+                        "preserved_sections, changed_sections, dropped_sections. "
+                        "The content must follow the Canonical Beaver SKILL.md format."
                    ),
                },
                {"role": "user", "content": prompt},
@ -113,6 +115,7 @@ class SkillDraftSynthesizer:
            + "\n- tools: an explicit JSON array of exact tool names this skill needs. "
            + "Prefer called tool names when the workflow depends on them; use run-selected tool names only when clearly required. "
            + "Use [] only when no tool is required."
+            + "\n\n" + canonical_skill_format_instructions()
            + "\nThe JSON may include preserved_sections, changed_sections, and dropped_sections arrays."
        )

@ -144,14 +147,23 @@ class SkillDraftSynthesizer:

    @staticmethod
    def _normalize_payload(payload: dict[str, Any], evidence_packet: EvidencePacket) -> dict[str, Any]:
-        frontmatter = dict(payload.get("frontmatter") or {})
+        frontmatter = normalize_skill_frontmatter(
+            dict(payload.get("frontmatter") or {}),
+            skill_name=str((payload.get("frontmatter") or {}).get("name") or "generated-skill"),
+        )
        tool_hints = _coerce_string_list(frontmatter.get("tools"))
        if not tool_hints:
            tool_hints = _coerce_string_list(evidence_packet.metadata.get("tool_names"))
        frontmatter["tools"] = tool_hints
+        content = ensure_canonical_skill_body(
+            str(payload.get("content") or "").strip(),
+            title=str(frontmatter.get("name") or "generated-skill"),
+            description=str(frontmatter.get("description") or ""),
+            tools=tool_hints,
+        )
        return {
            "frontmatter": frontmatter,
-            "content": str(payload.get("content") or "").strip(),
+            "content": content,
            "change_reason": str(payload.get("change_reason") or ""),
            "preserved_sections": _coerce_string_list(payload.get("preserved_sections")),
            "changed_sections": _coerce_string_list(payload.get("changed_sections")),
@ -162,13 +174,20 @@ class SkillDraftSynthesizer:
    def _fallback_payload(candidate: SkillLearningCandidate, evidence_packet: EvidencePacket, action: str) -> dict[str, Any]:
        related = candidate.related_skill_names[0] if candidate.related_skill_names else "generated-skill"
        title = related.replace("_", "-")
-        content = "\n".join(f"- {item}" for item in evidence_packet.task_summaries[:5]) or "- No evidence captured."
+        tools = _coerce_string_list(evidence_packet.metadata.get("tool_names"))
+        content = ensure_canonical_skill_body(
+            "\n".join(f"- {item}" for item in evidence_packet.task_summaries[:5]) or "- No evidence captured.",
+            title=title,
+            description=candidate.reason or f"Auto-generated {action} draft for {title}.",
+            tools=tools,
+        )
        return {
            "frontmatter": {
+                "name": title,
                "description": candidate.reason or f"Auto-generated {action} draft for {title}.",
-                "tools": _coerce_string_list(evidence_packet.metadata.get("tool_names")),
+                "tools": tools,
            },
-            "content": f"# {title}\n\n## Evidence\n\n{content}\n",
+            "content": content,
            "change_reason": candidate.reason or f"Fallback {action} synthesis.",
            "preserved_sections": [],
            "changed_sections": [],
--- a/app-instance/backend/beaver/skills/learning/worker.py
+++ b/app-instance/backend/beaver/skills/learning/worker.py
@ -10,6 +10,7 @@ from typing import Callable
 from beaver.engine.providers import ProviderBundle
 from beaver.memory.skills import SkillLearningCandidate
 from beaver.skills.learning.pipeline import SkillLearningPipelineService
+from beaver.skills.learning.replay import ReplayRunner


 @dataclass(slots=True)
@ -57,10 +58,12 @@ class SkillLearningWorker:
        *,
        pipeline: SkillLearningPipelineService,
        provider_bundle_factory: Callable[[], ProviderBundle],
+        replay_runner_factory: Callable[[], ReplayRunner] | None = None,
        config: SkillLearningWorkerConfig | None = None,
    ) -> None:
        self.pipeline = pipeline
        self.provider_bundle_factory = provider_bundle_factory
+        self.replay_runner_factory = replay_runner_factory
        self.config = config or SkillLearningWorkerConfig.from_env()
        self._running = False
        self._lock = asyncio.Lock()
@ -126,6 +129,7 @@ class SkillLearningWorker:
            draft.skill_name,
            draft.draft_id,
            provider_bundle=self.provider_bundle_factory(),
+            replay_runner=self.replay_runner_factory() if self.replay_runner_factory is not None else None,
        )
        return True

--- a/app-instance/backend/beaver/skills/publisher/service.py
+++ b/app-instance/backend/beaver/skills/publisher/service.py
@ -16,8 +16,8 @@ class SkillPublisher:

    def publish(self, skill_name: str, draft_id: str, publisher: str, notes: str = "") -> SkillVersion:
        draft = self._require_draft(skill_name, draft_id)
-        if draft.status != SkillReviewState.APPROVED.value:
-            raise ValueError("Draft must be approved before publish")
+        if draft.status not in {SkillReviewState.IN_REVIEW.value, SkillReviewState.APPROVED.value}:
+            raise ValueError("Draft must be submitted for review before publish")
        if draft.proposal_kind == "retire_skill":
            raise ValueError("Retire proposals must be applied through apply_retire_proposal")

@ -81,8 +81,8 @@ class SkillPublisher:

    def apply_retire_proposal(self, skill_name: str, draft_id: str, actor: str, notes: str = "") -> SkillSpec:
        draft = self._require_draft(skill_name, draft_id)
-        if draft.status != SkillReviewState.APPROVED.value:
-            raise ValueError("Retire proposal must be approved before apply")
+        if draft.status not in {SkillReviewState.IN_REVIEW.value, SkillReviewState.APPROVED.value}:
+            raise ValueError("Retire proposal must be submitted for review before apply")
        if draft.proposal_kind != "retire_skill":
            raise ValueError("Only retire_skill proposals can be applied as retire proposals")

--- a/app-instance/backend/beaver/tasks/router.py
+++ b/app-instance/backend/beaver/tasks/router.py
@ -25,7 +25,11 @@ class MainAgentRouter:
        timeout_seconds: float = 8.0,
    ) -> MainAgentDecision:
        if provider is None:
-            return self._fallback(active_task=active_task, reason="router_provider_unavailable")
+            return self._apply_active_task_boundary(
+                self._fallback(active_task=active_task, reason="router_provider_unavailable"),
+                message=message,
+                active_task=active_task,
+            )
        chat_kwargs: dict[str, Any] = {
            "messages": [
                {
@ -58,10 +62,18 @@ class MainAgentRouter:
        for attempt_timeout in (timeout_seconds, 12.0):
            try:
                response = await asyncio.wait_for(provider.chat(**chat_kwargs), timeout=attempt_timeout)
-                return self.from_json(response.content or "", active_task=active_task)
+                return self._apply_active_task_boundary(
+                    self.from_json(response.content or "", active_task=active_task),
+                    message=message,
+                    active_task=active_task,
+                )
            except Exception as exc:
                last_error = exc
-        return self._fallback(active_task=active_task, reason=f"router_failed: {last_error}")
+        return self._apply_active_task_boundary(
+            self._fallback(active_task=active_task, reason=f"router_failed: {last_error}"),
+            message=message,
+            active_task=active_task,
+        )

    def from_json(self, text: str, *, active_task: TaskRecord | None = None) -> MainAgentDecision:
        payload = self._parse_json_object(text)
@ -121,6 +133,31 @@ class MainAgentRouter:
            return MainAgentDecision(mode="task", reason=reason, action="continue_task")
        return MainAgentDecision(mode="simple", reason=reason, action="simple_chat")

+    def _apply_active_task_boundary(
+        self,
+        decision: MainAgentDecision,
+        *,
+        message: str,
+        active_task: TaskRecord | None,
+    ) -> MainAgentDecision:
+        if active_task is None or decision.action != "continue_task":
+            return decision
+        if not _looks_like_fresh_task_request(message):
+            return decision
+        if _looks_like_explicit_task_followup(message):
+            return decision
+        title = decision.short_title or active_task.metadata.get("short_title")
+        return MainAgentDecision(
+            mode="task",
+            reason=(
+                "fresh standalone task request in the same session; "
+                "do not attach it to the active task without explicit follow-up wording"
+            ),
+            starts_new_task=True,
+            short_title=title,
+            action="create_task",
+        )
+
    @staticmethod
    def _prompt(
        *,
@ -159,15 +196,19 @@ class MainAgentRouter:
            "- close_task: user explicitly says the active Task is done/satisfactory/finished.\n"
            "- abandon_task: user explicitly says to stop, cancel, abandon, or no longer do the active Task.\n\n"
            "Critical policy:\n"
-            "- If there is an active Task, choose continue_task or revise_task unless the user's topic is completely unrelated "
-            "to that Task or the user explicitly closes/abandons it.\n"
+            "- A Session is the durable conversation/device/group context. A Task is one unit of work inside that Session. "
+            "Do not use an active Task as a reason to merge every later message into the same work item.\n"
+            "- If there is an active Task, choose continue_task only when the current message explicitly depends on, extends, "
+            "or asks a direct follow-up about that active Task's latest result.\n"
            "- With an active Task, choose simple_chat for unrelated lightweight conversation and new_task for unrelated work "
            "that needs Task capabilities. Either decision starts a new topic.\n"
            "- An unrelated lightweight conversation must not be classified as revise_task merely because the active Task is awaiting acceptance.\n"
            "- Choose revise_task when the active Task is awaiting feedback or needs revision and the user asks for changes "
            "such as '改一下', '加上', '删除', '换成', '再详细点', '格式改成', '不要', or equivalent wording.\n"
-            "- Choose continue_task for neutral follow-up questions or additional next steps that do not imply dissatisfaction with the previous result.\n"
-            "- Use new_task only when the user clearly asks to start a different task.\n"
+            "- Choose continue_task for neutral follow-up questions or additional next steps that refer to the previous result, "
+            "for example '顺便查一下深圳', '这个也加上', or '继续'.\n"
+            "- A standalone tool-dependent request such as a fresh weather/search/file/run/test request is new_task even when it is "
+            "similar to the active Task. Repeating '珠海天气怎么样' later is a new Task unless the user says to revise or continue the old result.\n"
            "- If there is no active Task, choose new_task only for work that requires execution, iteration, tools, files, "
            "implementation, validation, or multi-step completion. Otherwise choose simple_chat.\n"
            "- Requests that need current, real-time, external, user-private, local-file, web, weather, price, news, "
@ -203,3 +244,99 @@ def _clean_short_title(value: Any) -> str | None:
        return None
    title = " ".join(str(value).strip().split())
    return title[:40] or None
+
+
+def _looks_like_explicit_task_followup(message: str) -> bool:
+    text = _compact_text(message)
+    if not text:
+        return False
+    markers = (
+        "继续",
+        "接着",
+        "上面",
+        "刚才",
+        "前面",
+        "这个",
+        "那个",
+        "它",
+        "结果",
+        "再",
+        "也",
+        "顺便",
+        "补充",
+        "加上",
+        "加入",
+        "删除",
+        "去掉",
+        "改",
+        "换成",
+        "重做",
+        "详细",
+        "展开",
+        "格式",
+        "continue",
+        "same task",
+        "previous",
+        "above",
+        "that result",
+        "revise",
+        "update it",
+        "add",
+        "remove",
+        "change",
+        "also",
+    )
+    return any(marker in text for marker in markers)
+
+
+def _looks_like_fresh_task_request(message: str) -> bool:
+    text = _compact_text(message)
+    if not text:
+        return False
+    markers = (
+        "天气",
+        "气温",
+        "下雨",
+        "降雨",
+        "空气质量",
+        "预报",
+        "查一下",
+        "帮我查",
+        "搜索",
+        "搜一下",
+        "看看最新",
+        "最新",
+        "今天",
+        "明天",
+        "上传",
+        "下载",
+        "文件",
+        "运行",
+        "执行",
+        "测试",
+        "构建",
+        "部署",
+        "修复",
+        "weather",
+        "forecast",
+        "temperature",
+        "search",
+        "look up",
+        "latest",
+        "today",
+        "tomorrow",
+        "upload",
+        "download",
+        "file",
+        "run",
+        "execute",
+        "test",
+        "build",
+        "deploy",
+        "fix",
+    )
+    return any(marker in text for marker in markers)
+
+
+def _compact_text(message: str) -> str:
+    return " ".join(str(message or "").strip().lower().split())
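Reviewer note on the router.py marker helpers above: both use plain substring checks, so short English markers can match inside longer words (for example `file` inside `profile`, or `add` inside `address`). The sketch below contrasts that with one possible alternative that requires ASCII markers to stand alone. It uses a small illustrative subset of the markers, and `substring_match` / `boundary_match` are hypothetical names, not functions in the diff.

```python
import re

# Small illustrative subset of the markers in the diff, not the full list.
ASCII_MARKERS = ("run", "test", "fix", "file")
CJK_MARKERS = ("天气", "运行")

# ASCII markers must not be glued to other ASCII letters, digits, or underscores.
_ASCII_PATTERN = re.compile(
    r"(?<![a-z0-9_])(?:" + "|".join(re.escape(m) for m in ASCII_MARKERS) + r")(?![a-z0-9_])"
)


def compact(message: str) -> str:
    """Same normalisation as _compact_text in the diff."""
    return " ".join(str(message or "").strip().lower().split())


def substring_match(message: str) -> bool:
    """Behaviour of the diff's helpers: any marker anywhere in the text."""
    text = compact(message)
    return any(marker in text for marker in ASCII_MARKERS + CJK_MARKERS)


def boundary_match(message: str) -> bool:
    """Alternative: standalone ASCII markers, plain substring for CJK markers."""
    text = compact(message)
    if _ASCII_PATTERN.search(text):
        return True
    return any(marker in text for marker in CJK_MARKERS)
```

The trade-off is that the stricter check no longer matches inflected forms such as `tests` or `running`, so the marker list would need those variants added explicitly. CJK markers keep substring matching because Chinese text has no spaces to anchor on.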
--- a/app-instance/backend/docs/architecture/project-comparison.html
+++ b/app-instance/backend/docs/architecture/project-comparison.html
@ -886,9 +886,11 @@ tool_hints:
          <div class="panel">
            <h3>Beaver platform deployment layer</h3>
            <ul>
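-              <li><span class="code">Keycloak OIDC</span>: user entry point, SSO login and logout.</li>
-              <li><span class="code">app-instance</span>: monolithic application container with built-in frontend, backend, and Nginx.</li>
-              <li>An external orchestration system handles deployment and the access entry point; the application currently does not maintain an instance registry.</li>
+              <li><span class="code">auth-portal</span>: user entry point, login and registration.</li>
+              <li><span class="code">authz-service</span>: authorization service and backend identity.</li>
+              <li><span class="code">deploy-control</span>: creates and manages single-user instances.</li>
+              <li><span class="code">router-proxy</span>: forwards requests to instances by Host.</li>
+              <li>One <span class="code">app-instance</span> per user, with Docker container isolation and a per-instance workspace.</li>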
            </ul>
          </div>
          <div class="panel">
--- a/app-instance/backend/pyproject.toml
+++ b/app-instance/backend/pyproject.toml
@ -13,7 +13,6 @@ dependencies = [
    "litellm>=1.79.0,<2.0.0",
    "minio>=7.2.0,<8.0.0",
    "openai>=1.79.0,<2.0.0",
-    "PyJWT[crypto]>=2.9.0,<3.0.0",
    "pydantic>=2.12.0,<3.0.0",
    "python-multipart>=0.0.20,<1.0.0",
    "typer>=0.20.0,<1.0.0",
--- a/app-instance/backend/tests/unit/test_initial_skill_tool_hints.py
+++ b/app-instance/backend/tests/unit/test_initial_skill_tool_hints.py
@ -4,6 +4,7 @@ import json
 from pathlib import Path

 from beaver.engine import EngineLoader
+from beaver.skills.authoring.format import is_canonical_skill_body
 from beaver.skills.catalog.utils import parse_frontmatter


@ -69,6 +70,16 @@ def test_skill_authoring_admin_is_seeded_but_not_initial() -> None:
        assert version["tool_hints"] == expected_tools


+def test_seeded_skill_bodies_use_canonical_format() -> None:
+    for index_name in ("published", "disabled"):
+        index = json.loads((REPO_ROOT / "skills" / "_index" / f"{index_name}.json").read_text(encoding="utf-8"))
+        for skill_name in index["items"]:
+            skill_dir = REPO_ROOT / "skills" / skill_name / "versions" / "v0001"
+            _frontmatter, body = parse_frontmatter((skill_dir / "SKILL.md").read_text(encoding="utf-8"))
+
+            assert is_canonical_skill_body(body), skill_name
+
+
 def test_default_runtime_registers_skill_view_tool(tmp_path: Path) -> None:
    loaded = EngineLoader(workspace=tmp_path).load()
    try:
--- a/app-instance/backend/tests/unit/test_keycloak_auth.py
+++ b/app-instance/backend/tests/unit/test_keycloak_auth.py
@ -1,110 +0,0 @@
-from __future__ import annotations
-
-import time
-
-import jwt
-import pytest
-from fastapi import HTTPException
-
-from beaver.interfaces.web.keycloak_auth import (
-    KeycloakAuthConfig,
-    KeycloakIdentity,
-    KeycloakTokenVerifier,
-    extract_bearer_token,
-)
-
-
-def _verifier() -> KeycloakTokenVerifier:
-    return KeycloakTokenVerifier(
-        config=KeycloakAuthConfig(
-            issuer="https://keycloak.bwgdi.com/realms/beaver",
-            client_id="beaver-agnet",
-            token_url="https://keycloak.bwgdi.com/realms/beaver/protocol/openid-connect/token",
-            jwks_url="https://keycloak.bwgdi.com/realms/beaver/protocol/openid-connect/certs",
-        )
-    )
-
-
-def _claims(**overrides):
-    now = int(time.time())
-    payload = {
-        "sub": "user-123",
-        "preferred_username": "alice",
-        "email": "alice@example.com",
-        "name": "Alice Example",
-        "iss": "https://keycloak.bwgdi.com/realms/beaver",
-        "aud": "beaver-agnet",
-        "azp": "beaver-agnet",
-        "iat": now,
-        "exp": now + 300,
-        "nonce": "nonce-1",
-        "realm_access": {"roles": ["user", "admin"]},
-        "resource_access": {"beaver-agnet": {"roles": ["agent-user"]}},
-    }
-    payload.update(overrides)
-    return payload
-
-
-def test_extract_bearer_token_accepts_case_insensitive_prefix() -> None:
-    assert extract_bearer_token("Bearer abc.def") == "abc.def"
-    assert extract_bearer_token("bearer xyz") == "xyz"
-
-
-def test_extract_bearer_token_rejects_missing_or_invalid_header() -> None:
-    with pytest.raises(HTTPException) as missing:
-        extract_bearer_token(None)
-    with pytest.raises(HTTPException) as invalid:
-        extract_bearer_token("Basic abc")
-
-    assert missing.value.status_code == 401
-    assert invalid.value.status_code == 401
-
-
-def test_validate_claims_accepts_audience_and_extracts_roles() -> None:
-    identity = _verifier().validate_claims(_claims(), expected_nonce="nonce-1")
-
-    assert identity == KeycloakIdentity(
-        user_id="user-123",
-        username="alice",
-        email="alice@example.com",
-        name="Alice Example",
-        realm_roles=("user", "admin"),
-        client_roles=("agent-user",),
-    )
-
-
-def test_validate_claims_accepts_azp_when_audience_differs() -> None:
-    identity = _verifier().validate_claims(_claims(aud="account", azp="beaver-agnet"))
-
-    assert identity.user_id == "user-123"
-
-
-def test_validate_claims_rejects_wrong_nonce() -> None:
-    with pytest.raises(HTTPException) as exc:
-        _verifier().validate_claims(_claims(), expected_nonce="different")
-
-    assert exc.value.status_code == 401
-    assert "nonce" in exc.value.detail.lower()
-
-
-def test_validate_claims_rejects_wrong_audience_and_azp() -> None:
-    with pytest.raises(HTTPException) as exc:
-        _verifier().validate_claims(_claims(aud="account", azp="other-client"))
-
-    assert exc.value.status_code == 401
-    assert "audience" in exc.value.detail.lower()
-
-
-def test_verify_raises_http_exception_for_bad_jwt(monkeypatch) -> None:
-    verifier = _verifier()
-
-    def fake_decode(*args, **kwargs):
-        raise jwt.InvalidTokenError("bad token")
-
-    monkeypatch.setattr(jwt, "decode", fake_decode)
-
-    with pytest.raises(HTTPException) as exc:
-        verifier.verify("bad-token")
-
-    assert exc.value.status_code == 401
-    assert "invalid token" in exc.value.detail.lower()
--- a/app-instance/backend/tests/unit/test_main_agent_router.py
+++ b/app-instance/backend/tests/unit/test_main_agent_router.py
@ -87,6 +87,14 @@ def _task() -> TaskRecord:
    )


+def _weather_task() -> TaskRecord:
+    task = _task()
+    task.description = "珠海天气怎样"
+    task.goal = "珠海天气怎样"
+    task.metadata["short_title"] = "查询珠海天气"
+    return task
+
+
 def test_router_continues_active_task_from_llm_decision() -> None:
    provider = RouterProvider('{"action":"continue_task","reason":"related","short_title":"任务连续性"}')
    decision = asyncio.run(
@ -103,6 +111,35 @@ def test_router_continues_active_task_from_llm_decision() -> None:
    assert provider.calls[0]["max_tokens"] == 256


+def test_router_keeps_same_session_but_starts_new_task_for_standalone_weather_repeat() -> None:
+    decision = asyncio.run(
+        MainAgentRouter().classify(
+            "珠海天气怎么样",
+            active_task=_weather_task(),
+            provider=RouterProvider('{"action":"continue_task","reason":"neutral follow-up","short_title":"查询珠海天气"}'),
+        )
+    )
+
+    assert decision.is_task
+    assert decision.action == "create_task"
+    assert decision.starts_new_task is True
+    assert "fresh standalone task request" in decision.reason
+
+
+def test_router_allows_explicit_followup_to_continue_active_weather_task() -> None:
+    decision = asyncio.run(
+        MainAgentRouter().classify(
+            "顺便查一下深圳",
+            active_task=_weather_task(),
+            provider=RouterProvider('{"action":"continue_task","reason":"related follow-up","short_title":"查询珠海天气"}'),
+        )
+    )
+
+    assert decision.is_task
+    assert decision.action == "continue_task"
+    assert decision.starts_new_task is False
+
+
 def test_router_marks_revision_from_llm_decision() -> None:
    decision = asyncio.run(
        MainAgentRouter().classify(
@ -163,6 +200,8 @@ def test_router_prompt_treats_unrelated_lightweight_conversation_as_new_topic()
    prompt = provider.calls[0]["messages"][1]["content"]
    assert "unrelated lightweight conversation" in prompt
    assert "must not be classified as revise_task merely because the active Task is awaiting acceptance" in prompt
+    assert "A Session is the durable conversation/device/group context" in prompt
+    assert "Repeating '珠海天气怎么样' later is a new Task" in prompt


 def test_router_closes_active_task_from_llm_decision() -> None:
--- a/app-instance/backend/tests/unit/test_marketplace_and_mcp.py
+++ b/app-instance/backend/tests/unit/test_marketplace_and_mcp.py
@ -5,13 +5,40 @@ from types import SimpleNamespace

 import pytest

-from beaver.interfaces.web.app import _create_skill_upload_draft
+from beaver.engine.providers.base import LLMProvider, LLMResponse
+from beaver.interfaces.web.app import _create_skill_upload_draft, _rewrite_uploaded_skill_draft_with_llm
 from beaver.services.skillhub_service import SkillHubService
+from beaver.skills.authoring.format import is_canonical_skill_body
+from beaver.skills.catalog.utils import extract_required_tool_names
 from beaver.skills.drafts import DraftService
 from beaver.skills.specs import SkillSpecStore
 from beaver.tools.mcp.wrapper import MCPToolWrapper


+class RewriteProvider(LLMProvider):
+    def __init__(self) -> None:
+        super().__init__()
+        self.messages = []
+
+    async def chat(self, messages, tools=None, model=None, max_tokens=None, temperature=0.7, thinking_enabled=None):
+        self.messages = messages
+        return LLMResponse(
+            content="""{
+              "frontmatter": {
+                "name": "skill",
+                "description": "Use when uploaded skill guidance needs QA formatting.",
+                "tools": ["read_file"]
+              },
+              "content": "# Skill\\n\\n## Overview\\n\\nLLM rewritten overview.\\n\\n## When to Use\\n\\n- Use when testing upload rewrite.\\n\\n## Required Tools\\n\\n- `read_file`\\n\\n## Workflow\\n\\n- Follow the rewritten workflow.\\n\\n## Validation\\n\\n- Verify the result.\\n\\n## Boundaries\\n\\n- Stay in scope.\\n\\n## Anti-Patterns\\n\\n- Do not skip rewrite validation.\\n",
+              "change_reason": "normalized upload"
+            }""",
+            model=model,
+        )
+
+    def get_default_model(self):
+        return "rewrite-model"
+
+
 class FakeSkillHubService(SkillHubService):
    async def _get_json(self, path, *, params=None):
        if path == "/skills":
@ -99,6 +126,106 @@ def test_upload_skill_zip_keeps_supporting_files_on_draft(tmp_path):
    assert upload_dir.endswith(draft["draft_id"])


+def test_upload_skill_zip_canonicalizes_uploaded_skill_body(tmp_path):
+    store = SkillSpecStore(tmp_path)
+    loaded = SimpleNamespace(skill_spec_store=store, draft_service=DraftService(store))
+    buffer = io.BytesIO()
+    with zipfile.ZipFile(buffer, "w") as archive:
+        archive.writestr(
+            "skill/SKILL.md",
+            "---\nname: skill\ndescription: raw upload\ntools:\n  - read_file\n---\nBody without our format.\n",
+        )
+
+    draft = _create_skill_upload_draft(loaded, "skill.zip", buffer.getvalue())
+
+    assert draft["proposed_frontmatter"]["name"] == "skill"
+    assert draft["proposed_frontmatter"]["tools"] == ["read_file"]
+    assert is_canonical_skill_body(draft["proposed_content"])
+
+
+def test_upload_skill_zip_infers_weather_web_tools_from_content(tmp_path):
+    store = SkillSpecStore(tmp_path)
+    loaded = SimpleNamespace(skill_spec_store=store, draft_service=DraftService(store))
+    buffer = io.BytesIO()
+    with zipfile.ZipFile(buffer, "w") as archive:
+        archive.writestr(
+            "weather_search/skills.md",
+            "---\nname: weather-search\ndescription: weather lookup\n---\nLook up current weather and forecast for a city online.\n",
+        )
+
+    draft = _create_skill_upload_draft(loaded, "weather_search.zip", buffer.getvalue())
+
+    assert draft["proposed_frontmatter"]["tools"] == ["web_fetch", "web_search"]
+    assert extract_required_tool_names(draft["proposed_content"]) == ["web_fetch", "web_search"]
+    assert is_canonical_skill_body(draft["proposed_content"])
+
+
+def test_upload_skill_llm_rewrite_updates_draft(tmp_path):
+    store = SkillSpecStore(tmp_path)
+    draft_service = DraftService(store)
+    draft = draft_service.create_new_skill_draft(
+        skill_name="skill",
+        proposed_content="# Skill\n\n## Overview\n\nFallback.",
+        proposed_frontmatter={"name": "skill", "description": "fallback", "tools": ["read_file"]},
+        created_by="test",
+        reason="upload",
+    )
+    provider = RewriteProvider()
+    agent_service = SimpleNamespace(
+        _make_provider_bundle_for_task=lambda _loaded, _kwargs: SimpleNamespace(
+            main_provider=provider,
+            main_runtime=SimpleNamespace(model="rewrite-model"),
+        )
+    )
+    loaded = SimpleNamespace(skill_spec_store=store, draft_service=draft_service)
+
+    asyncio.run(_rewrite_uploaded_skill_draft_with_llm(agent_service, loaded, draft, filename="skill.zip"))
+    rewritten = draft_service.get_draft("skill", draft.draft_id)
+
+    assert rewritten is not None
+    assert "LLM rewritten overview" in rewritten.proposed_content
+    assert is_canonical_skill_body(rewritten.proposed_content)
+    assert "Canonical Beaver SKILL.md format" in provider.messages[1]["content"]
+    assert "Available runtime tool names" in provider.messages[1]["content"]
+
+
+def test_upload_skill_zip_accepts_nested_single_skill_directory(tmp_path):
+    store = SkillSpecStore(tmp_path)
+    loaded = SimpleNamespace(skill_spec_store=store, draft_service=DraftService(store))
+    buffer = io.BytesIO()
+    with zipfile.ZipFile(buffer, "w") as archive:
+        archive.writestr(
+            "plugin/skills/nested-skill/SKILL.md",
+            "---\nname: nested-skill\ndescription: nested\n---\nBody\n",
+        )
+        archive.writestr("plugin/skills/nested-skill/references/a.txt", "context")
+        archive.writestr("plugin/README.md", "ignore package file")
+
+    draft = _create_skill_upload_draft(loaded, "plugin.zip", buffer.getvalue())
+
+    assert draft["skill_name"] == "nested-skill"
+    upload_dir = draft["evidence_refs"][0]["supporting_upload_dir"]
+    assert (tmp_path / "skills" / "nested-skill" / "draft_uploads" / draft["draft_id"] / "references" / "a.txt").read_text() == "context"
+    assert "README.md" not in draft["evidence_refs"][0]["files"]
+
+
+def test_upload_skill_zip_accepts_common_skill_markdown_name_aliases(tmp_path):
+    store = SkillSpecStore(tmp_path)
+    loaded = SimpleNamespace(skill_spec_store=store, draft_service=DraftService(store))
+    buffer = io.BytesIO()
+    with zipfile.ZipFile(buffer, "w") as archive:
+        archive.writestr(
+            "weather_search/skills.md",
+            "---\nname: weather-search\ndescription: weather lookup\n---\nBody\n",
+        )
+
+    draft = _create_skill_upload_draft(loaded, "weather_search.zip", buffer.getvalue())
+
+    assert draft["skill_name"] == "weather-search"
+    assert draft["proposed_frontmatter"]["name"] == "weather-search"
+    assert is_canonical_skill_body(draft["proposed_content"])
+
+
 def test_mcp_wrapper_metadata_preserves_server_id_with_underscores():
    tool_def = SimpleNamespace(name="auth_status", description="Auth", inputSchema={"type": "object", "properties": {}})

--- a/app-instance/backend/tests/unit/test_phase5_skills_runtime.py
+++ b/app-instance/backend/tests/unit/test_phase5_skills_runtime.py
@ -184,7 +184,7 @@ def test_skill_lifecycle_publish_revision_and_rollback(tmp_path: Path) -> None:
    assert published.version == "v0002"
    assert store.get_current_version("release-checklist") == "v0002"

-    with pytest.raises(ValueError, match="approved"):
+    with pytest.raises(ValueError, match="submitted for review"):
        publisher.publish("release-checklist", revision.draft_id, publisher="reviewer", notes="duplicate")

    rolled_back = publisher.rollback("release-checklist", "v0001", actor="reviewer", reason="regression")
@ -529,6 +529,66 @@ def test_skill_learning_service_generates_new_skill_for_task_without_published_s
    assert candidates[0].source_run_ids == ["task-run-1"]


+def test_skill_learning_service_uses_original_task_text_for_new_skill_theme(tmp_path: Path) -> None:
+    store = SkillSpecStore(tmp_path)
+    run_store = RunMemoryStore(tmp_path / "memory" / "runs")
+    learning_store = SkillLearningStore(tmp_path / "memory" / "skills")
+    service = SkillLearningService(
+        run_store=run_store,
+        learning_store=learning_store,
+        draft_service=DraftService(store),
+        evidence_selector=EvidenceSelector(run_store),
+    )
+    now = datetime.now(timezone.utc).isoformat()
+    run_store.append_run_record(
+        RunRecord(
+            run_id="task-run-1",
+            session_id="session-task",
+            task_id="task-1",
+            attempt_index=1,
+            task_text="Compare direct production restart with staging rollout",
+            started_at=now,
+            ended_at=now,
+            success=False,
+            finish_reason="stop",
+            feedback={"feedback_type": "revise", "comment": "I do not see the docs"},
+            activated_skills=[],
+            validation_result=None,
+        )
+    )
+    run_store.append_run_record(
+        RunRecord(
+            run_id="task-run-2",
+            session_id="session-task",
+            task_id="task-1",
+            attempt_index=2,
+            task_text="I do not see the docs",
+            started_at=now,
+            ended_at=now,
+            success=True,
+            finish_reason="stop",
+            feedback={"feedback_type": "satisfied", "acceptance_type": "accept"},
+            activated_skills=[],
+            validation_result={"accepted": True, "score": 0.9},
+        )
+    )
+
+    candidates = service.build_learning_candidates_for_task("task-1", trigger_run_id="task-run-2")
+
+    assert [candidate.candidate_id for candidate in candidates] == ["new:task:task-1"]
+    assert candidates[0].evidence["theme"] == "Compare direct production restart with staging rollout"
+    assert candidates[0].evidence["task_text"] == "Compare direct production restart with staging rollout"
+
+
+def test_task_theme_uses_first_sentence_for_chinese_text() -> None:
+    assert (
+        SkillLearningService._task_theme(
+            "帮我比较两种发布流程的风险：A 是直接重启线上容器，B 是先部署 staging 再切 production。请给出推荐方案、原因、验证步骤和回滚策略。"
+        )
+        == "帮我比较两种发布流程的风险：A 是直接重启线上容器，B 是先部署 staging 再切 production"
+    )
+
+
 def test_agent_loop_records_skill_receipts_and_effects(tmp_path: Path) -> None:
    skill = SkillContext(
        name="docker-debug",
--- a/app-instance/backend/tests/unit/test_skill_authoring_format.py
+++ b/app-instance/backend/tests/unit/test_skill_authoring_format.py
@ -0,0 +1,54 @@
+from __future__ import annotations
+
+from beaver.skills.authoring.format import (
+    CANONICAL_SKILL_SECTION_HEADINGS,
+    canonical_skill_format_instructions,
+    canonicalize_skill_body,
+    is_canonical_skill_body,
+    parse_skill_rewrite_json,
+)
+
+
+def test_canonical_skill_body_contains_required_sections() -> None:
+    body = canonicalize_skill_body(
+        title="Filesystem Operation",
+        overview="Read and update project files safely.",
+        tools=["read_file", "write_file"],
+        workflow=["Inspect the file before editing.", "Use the smallest safe edit."],
+        validation=["Re-read changed files before reporting completion."],
+        boundaries=["Do not edit files outside the workspace."],
+        anti_patterns=["Do not overwrite files without reading them first."],
+    )
+
+    assert is_canonical_skill_body(body)
+    for heading in CANONICAL_SKILL_SECTION_HEADINGS:
+        assert heading in body
+
+
+def test_canonical_skill_format_instructions_are_prompt_ready() -> None:
+    instructions = canonical_skill_format_instructions()
+
+    assert "Canonical Beaver SKILL.md format" in instructions
+    assert "frontmatter" in instructions
+    assert "name" in instructions
+    assert "description" in instructions
+    assert "tools" in instructions
+    for heading in CANONICAL_SKILL_SECTION_HEADINGS:
+        assert heading in instructions
+
+
+def test_parse_skill_rewrite_json_backfills_frontmatter_tools_from_required_tools_section() -> None:
+    payload = parse_skill_rewrite_json(
+        """{
+          "frontmatter": {
+            "name": "weather-search",
+            "description": "weather lookup",
+            "tools": []
+          },
+          "content": "# Weather Search\\n\\n## Overview\\n\\nLook up weather.\\n\\n## When to Use\\n\\n- Weather requests.\\n\\n## Required Tools\\n\\n- `web_fetch`\\n- `web_search`\\n\\n## Workflow\\n\\n- Fetch current weather.\\n\\n## Validation\\n\\n- Check source freshness.\\n\\n## Boundaries\\n\\n- Do not guess.\\n\\n## Anti-Patterns\\n\\n- Do not fabricate data.\\n"
+        }""",
+        skill_name="weather-search",
+    )
+
+    assert payload is not None
+    assert payload["frontmatter"]["tools"] == ["web_fetch", "web_search"]
--- a/app-instance/backend/tests/unit/test_skill_learning_eval.py
+++ b/app-instance/backend/tests/unit/test_skill_learning_eval.py
@ -19,8 +19,22 @@ from beaver.skills.specs import SkillSpecStore


 class StubProvider(LLMProvider):
-    async def chat(self, messages: list[dict], tools: list[dict] | None = None, model: str | None = None, max_tokens: int = 4096, temperature: float = 0.7) -> LLMResponse:
-        return LLMResponse(content="ok")
+    def __init__(self, content: str = "ok") -> None:
+        super().__init__()
+        self.content = content
+        self.calls: list[dict] = []
+
+    async def chat(
+        self,
+        messages: list[dict],
+        tools: list[dict] | None = None,
+        model: str | None = None,
+        max_tokens: int = 4096,
+        temperature: float = 0.7,
+        thinking_enabled: bool | None = None,
+    ) -> LLMResponse:
+        self.calls.append({"messages": messages, "model": model, "max_tokens": max_tokens, "temperature": temperature})
+        return LLMResponse(content=self.content)

    def get_default_model(self) -> str:
        return "stub"
@ -92,7 +106,6 @@ def test_eval_pass_allows_publish_after_safety_and_review(tmp_path: Path) -> Non
    report = asyncio.run(pipeline.evaluate_draft("candidate-1", draft.skill_name, draft.draft_id, provider_bundle=_bundle()))
    safety = pipeline.check_safety(draft.skill_name, draft.draft_id)
    pipeline.submit_review(draft.skill_name, draft.draft_id, requested_by="tester")
-    pipeline.approve(draft.skill_name, draft.draft_id, reviewer="tester")
    published = pipeline.publish(draft.skill_name, draft.draft_id, publisher="tester")

    assert report.passed is True
@ -114,7 +127,6 @@ def test_eval_regression_blocks_publish(tmp_path: Path) -> None:
    report = asyncio.run(pipeline.evaluate_draft("candidate-1", draft.skill_name, draft.draft_id, provider_bundle=_bundle()))
    pipeline.check_safety(draft.skill_name, draft.draft_id)
    pipeline.submit_review(draft.skill_name, draft.draft_id, requested_by="tester")
-    pipeline.approve(draft.skill_name, draft.draft_id, reviewer="tester")

    assert report.passed is False
    assert pipeline.get_candidate("candidate-1").status == "eval_failed"
@ -160,7 +172,14 @@ def test_eval_does_not_clear_safety_failed_status(tmp_path: Path) -> None:


 class FakeReplayRunner:
+    def __init__(self, *, baseline_answer: str = "done", candidate_answer: str = "done") -> None:
+        self.baseline_answer = baseline_answer
+        self.candidate_answer = candidate_answer
+        self.requests = []
+
    async def run_arm(self, request):
+        self.requests.append(request)
+        final_answer = self.candidate_answer if request.arm == "candidate" else self.baseline_answer
        return {
            "case_id": request.case_id,
            "arm": request.arm,
@ -168,7 +187,7 @@ class FakeReplayRunner:
            "run_id": f"{request.arm}-run",
            "task_text": request.task_text,
            "finish_reason": "stop",
-            "final_answer": "done",
+            "final_answer": final_answer,
            "tool_calls": [
                {
                    "tool_name": "write_file",
@ -213,3 +232,102 @@ def test_eval_report_includes_replay_case_and_coverage(tmp_path: Path) -> None:
    assert 0.0 <= report.execution_coverage <= 1.0
    assert 0.0 <= report.surrogate_coverage <= 1.0
    assert report.confidence in {"low", "medium", "high"}
+    assert "ability_score" in report.case_reports[0]
+    assert "tool_execution_score" in report.case_reports[0]
+    assert report.ability_score_summary["score_role"] == "primary"
+    assert report.tool_execution_summary["score_role"] == "diagnostic_only"
+
+
+def test_replay_main_score_uses_validator_not_tool_success(tmp_path: Path) -> None:
+    pipeline = _pipeline(tmp_path)
+    pipeline.learning_store.update_learning_candidate(
+        "candidate-1",
+        evidence={
+            "eval_cases": [
+                {
+                    "run_id": "validator-case",
+                    "task_id": "validator-case",
+                    "session_id": "eval",
+                    "task_text": "Write the release verdict.",
+                    "validator": {
+                        "type": "final_answer_contains",
+                        "required_terms": ["ship"],
+                        "forbidden_terms": ["do not ship"],
+                    },
+                    "accepted_score": 0.5,
+                }
+            ]
+        },
+    )
+    draft = pipeline.draft_service.create_new_skill_draft(
+        skill_name="release-checklist",
+        proposed_content="# Release\n\nRun tests.",
+        proposed_frontmatter={"description": "release", "tools": []},
+        created_by="test",
+        reason="test",
+    )
+    pipeline.learning_store.update_learning_candidate("candidate-1", draft_skill_name=draft.skill_name, draft_id=draft.draft_id)
+
+    report = asyncio.run(
+        pipeline.evaluate_draft(
+            "candidate-1",
+            draft.skill_name,
+            draft.draft_id,
+            provider_bundle=_bundle(),
+            replay_runner=FakeReplayRunner(
+                baseline_answer="Do not ship. Tests are failing.",
+                candidate_answer="Ship after smoke tests pass.",
+            ),
+        )
+    )
+
+    case = report.case_reports[0]
+    assert case["tool_execution_score"]["baseline_score"] == 0.85
+    assert case["tool_execution_score"]["candidate_score"] == 0.85
+    assert case["baseline_score"] < case["candidate_score"]
+    assert report.tool_mode_summary["score_role"] == "diagnostic_only"
+    assert report.ability_score_summary["score_role"] == "primary"
+    assert report.real_score_avg is not None
+    assert report.synthetic_score_avg is not None
+
+
+def test_synthetic_cases_without_validator_are_not_replay_scored(tmp_path: Path) -> None:
+    pipeline = _pipeline(tmp_path)
+    pipeline.learning_store.update_learning_candidate(
+        "candidate-1",
+        evidence={
+            "eval_cases": [
+                {
+                    "run_id": "synthetic:no-validator",
+                    "task_id": "synthetic-no-validator",
+                    "session_id": "synthetic-eval",
+                    "task_text": "Synthetic task without an oracle.",
+                    "synthetic": True,
+                    "accepted_score": 0.75,
+                }
+            ]
+        },
+    )
+    draft = pipeline.draft_service.create_new_skill_draft(
+        skill_name="release-checklist",
+        proposed_content="# Release\n\nRun tests.",
+        proposed_frontmatter={"description": "release", "tools": []},
+        created_by="test",
+        reason="test",
+    )
+    pipeline.learning_store.update_learning_candidate("candidate-1", draft_skill_name=draft.skill_name, draft_id=draft.draft_id)
+    replay_runner = FakeReplayRunner()
+
+    report = asyncio.run(
+        pipeline.evaluate_draft(
+            "candidate-1",
+            draft.skill_name,
+            draft.draft_id,
+            provider_bundle=_bundle(),
+            replay_runner=replay_runner,
+        )
+    )
+
+    assert "synthetic:no-validator" not in {case["run_id"] for case in report.case_reports}
+    assert all("synthetic:no-validator" not in request.case_id for request in replay_runner.requests)
+    assert report.case_selection_summary["excluded_synthetic_without_validator"] == 1
--- a/app-instance/backend/tests/unit/test_skill_learning_eval_report_model.py
+++ b/app-instance/backend/tests/unit/test_skill_learning_eval_report_model.py
@ -31,6 +31,12 @@ def test_eval_report_defaults_preserve_legacy_payload_shape() -> None:
    assert payload["confidence"] == "low"
    assert payload["case_reports"] == []
    assert payload["tool_mode_summary"] == {}
+    assert payload["ability_score_summary"] == {}
+    assert payload["tool_execution_summary"] == {}
+    assert payload["case_selection_summary"] == {}
+    assert payload["real_score_avg"] is None
+    assert payload["synthetic_score_avg"] is None
+    assert payload["overall_score_avg"] is None
    assert payload["preservation_report"] is None
    assert payload["cases"] == [{"run_id": "run-1"}]

@ -59,3 +65,37 @@ def test_eval_report_reads_legacy_payload_without_replay_fields() -> None:
    assert report.mode == "heuristic"
    assert report.confidence == "low"
    assert report.case_reports == []
+
+
+def test_eval_report_persists_ability_and_case_split_fields() -> None:
+    report = SkillDraftEvalReport(
+        report_id="eval-replay",
+        skill_name="debug",
+        draft_id="draft-1",
+        candidate_id="candidate-1",
+        passed=True,
+        baseline_score_avg=0.5,
+        candidate_score_avg=0.8,
+        score_delta=0.3,
+        regression_count=0,
+        improved_count=1,
+        unchanged_count=0,
+        mode="replay",
+        eval_version="replay-v2",
+        real_score_avg=0.9,
+        synthetic_score_avg=0.6,
+        overall_score_avg=0.8,
+        ability_score_summary={"score_role": "primary", "real_case_count": 1},
+        tool_execution_summary={"score_role": "diagnostic_only", "executed": 1.0},
+        case_selection_summary={"excluded_synthetic_without_validator": 2},
+    )
+
+    payload = report.to_dict()
+    restored = SkillDraftEvalReport.from_dict(payload)
+
+    assert payload["real_score_avg"] == 0.9
+    assert payload["synthetic_score_avg"] == 0.6
+    assert payload["overall_score_avg"] == 0.8
+    assert restored.ability_score_summary == {"score_role": "primary", "real_case_count": 1}
+    assert restored.tool_execution_summary == {"score_role": "diagnostic_only", "executed": 1.0}
+    assert restored.case_selection_summary == {"excluded_synthetic_without_validator": 2}
--- a/app-instance/backend/tests/unit/test_skill_learning_pipeline.py
+++ b/app-instance/backend/tests/unit/test_skill_learning_pipeline.py
@ -55,14 +55,12 @@ def test_pipeline_lists_candidates_and_moves_draft_through_review(tmp_path: Path
        reason="test",
    )

-    review = pipeline.submit_review(draft.skill_name, draft.draft_id, requested_by="tester")
-    approved = pipeline.approve(draft.skill_name, draft.draft_id, reviewer="tester")
    safety = pipeline.check_safety(draft.skill_name, draft.draft_id)
+    review = pipeline.submit_review(draft.skill_name, draft.draft_id, requested_by="tester")
    version = pipeline.publish(draft.skill_name, draft.draft_id, publisher="tester")

    assert pipeline.list_candidates()[0].candidate_id == "candidate-1"
    assert review.status == SkillReviewState.IN_REVIEW.value
-    assert approved.status == SkillReviewState.APPROVED.value
    assert safety.passed is True
    assert version.skill_name == "new-skill"
    assert pipeline.get_draft(draft.skill_name, draft.draft_id).status == SkillReviewState.PUBLISHED.value
@ -93,7 +91,6 @@ def test_pipeline_does_not_resubmit_terminal_draft(tmp_path: Path) -> None:
    )

    pipeline.submit_review(draft.skill_name, draft.draft_id, requested_by="tester")
-    pipeline.approve(draft.skill_name, draft.draft_id, reviewer="tester")
    pipeline.check_safety(draft.skill_name, draft.draft_id)
    pipeline.publish(draft.skill_name, draft.draft_id, publisher="tester")

@ -165,7 +162,6 @@ def test_publish_blocks_low_confidence_replay_report(tmp_path: Path) -> None:
        )
    )
    pipeline.submit_review(draft.skill_name, draft.draft_id, requested_by="tester")
-    pipeline.approve(draft.skill_name, draft.draft_id, reviewer="tester")
    pipeline.check_safety(draft.skill_name, draft.draft_id)

    with pytest.raises(ValueError, match="low confidence"):
@ -201,7 +197,6 @@ def test_publish_blocks_failed_preservation_report(tmp_path: Path) -> None:
        )
    )
    pipeline.submit_review(draft.skill_name, draft.draft_id, requested_by="tester")
-    pipeline.approve(draft.skill_name, draft.draft_id, reviewer="tester")
    pipeline.check_safety(draft.skill_name, draft.draft_id)

    with pytest.raises(ValueError, match="preservation"):
--- a/app-instance/backend/tests/unit/test_skill_learning_replay_runner.py
+++ b/app-instance/backend/tests/unit/test_skill_learning_replay_runner.py
@ -16,6 +16,25 @@ class FakeAgentLoop:
        return SimpleNamespace(session_id="session-replay", run_id="run-replay", output_text="done", finish_reason="stop")


+class FakeRunningAgentLoop(FakeAgentLoop):
+    def __init__(self) -> None:
+        self.process_direct_calls = 0
+        self.submit_direct_calls: list[tuple[str, dict]] = []
+
+    async def process_direct(self, task: str, **kwargs):
+        self.process_direct_calls += 1
+        raise RuntimeError(
+            "AgentLoop.process_direct() is disabled while run() is active; "
+            "submit tasks via submit_direct() instead."
+        )
+
+    async def submit_direct(self, task: str, **kwargs):
+        self.submit_direct_calls.append((task, kwargs))
+        executor = kwargs["tool_executor_override"]
+        await executor.execute("mcp_outlook_send_email", {"to": "ada@example.com"})
+        return SimpleNamespace(session_id="session-queued", run_id="run-queued", output_text="queued done", finish_reason="stop")
+
+
 def test_replay_runner_returns_arm_report_with_tool_trace() -> None:
    runner = ReplayRunner(agent_loop=FakeAgentLoop())
    request = ReplayArmRequest(
@ -34,3 +53,33 @@ def test_replay_runner_returns_arm_report_with_tool_trace() -> None:
    assert report["arm"] == "candidate"
    assert report["finish_reason"] == "stop"
    assert report["tool_calls"][0]["tool_name"] == "mcp_outlook_send_email"
+
+
+def test_replay_runner_queues_arm_when_agent_loop_is_running() -> None:
+    agent_loop = FakeRunningAgentLoop()
+    runner = ReplayRunner(agent_loop=agent_loop)
+    request = ReplayArmRequest(
+        case_id="case-queued",
+        arm="baseline",
+        task_text="Send a status email to Ada.",
+        pinned_skill_names=["filesystem-operation"],
+        pinned_skill_contexts=[{"name": "filesystem-operation"}],
+        provider_bundle=object(),
+        model_settings={"max_tool_iterations": 3, "temperature": 0.1},
+    )
+
+    report = asyncio.run(runner.run_arm(request))
+
+    assert agent_loop.process_direct_calls == 1
+    assert len(agent_loop.submit_direct_calls) == 1
+    queued_task, queued_kwargs = agent_loop.submit_direct_calls[0]
+    assert queued_task == "Send a status email to Ada."
+    assert queued_kwargs["source"] == "skill_replay_eval"
+    assert queued_kwargs["include_skill_assembly"] is False
+    assert queued_kwargs["include_tools"] is True
+    assert queued_kwargs["pinned_skill_names"] == ["filesystem-operation"]
+    assert queued_kwargs["max_tool_iterations"] == 3
+    assert queued_kwargs["temperature"] == 0.1
+    assert report["session_id"] == "session-queued"
+    assert report["run_id"] == "run-queued"
+    assert report["tool_calls"][0]["tool_name"] == "mcp_outlook_send_email"
--- a/app-instance/backend/tests/unit/test_skill_learning_safety.py
+++ b/app-instance/backend/tests/unit/test_skill_learning_safety.py
@ -74,7 +74,6 @@ def test_safety_marks_dangerous_tools_high_and_requires_confirm(tmp_path: Path)

    report = pipeline.check_safety(draft.skill_name, draft.draft_id)
    pipeline.submit_review(draft.skill_name, draft.draft_id, requested_by="tester")
-    pipeline.approve(draft.skill_name, draft.draft_id, reviewer="tester")

    assert report.passed is True
    assert report.risk_level == "high"
@ -94,7 +93,6 @@ def test_publish_requires_safety_report(tmp_path: Path) -> None:
        reason="test",
    )
    pipeline.submit_review(draft.skill_name, draft.draft_id, requested_by="tester")
-    pipeline.approve(draft.skill_name, draft.draft_id, reviewer="tester")

    with pytest.raises(ValueError, match="safety report"):
        pipeline.publish(draft.skill_name, draft.draft_id, publisher="tester")
--- a/app-instance/backend/tests/unit/test_skill_learning_synthesizer_preservation.py
+++ b/app-instance/backend/tests/unit/test_skill_learning_synthesizer_preservation.py
@ -1,6 +1,7 @@
 from __future__ import annotations

 from beaver.memory.skills import SkillLearningCandidate
+from beaver.skills.authoring.format import CANONICAL_SKILL_SECTION_HEADINGS
 from beaver.skills.learning.evidence import EvidencePacket
 from beaver.skills.learning.synthesizer import SkillDraftSynthesizer

@ -39,3 +40,6 @@ def test_revision_prompt_includes_base_skill_snapshot() -> None:
    assert "Do not delete files." in prompt
    assert "preserved_sections" in prompt
    assert "dropped_sections" in prompt
+    assert "Canonical Beaver SKILL.md format" in prompt
+    for heading in CANONICAL_SKILL_SECTION_HEADINGS:
+        assert heading in prompt
--- a/app-instance/backend/tests/unit/test_skill_learning_web_api.py
+++ b/app-instance/backend/tests/unit/test_skill_learning_web_api.py
@ -1,12 +1,37 @@
 from __future__ import annotations

 from pathlib import Path
+from types import SimpleNamespace

 from fastapi.testclient import TestClient

+from beaver.memory.runs import RunRecord
 from beaver.interfaces.web.app import create_app
-from beaver.memory.skills import SkillLearningCandidate
+from beaver.memory.skills import SkillDraftEvalReport, SkillLearningCandidate
 from beaver.services.agent_service import AgentService
+from beaver.skills.specs import SkillVersion
+
+
+class StubEvaluator:
+    def __init__(self) -> None:
+        self.calls = 0
+
+    async def evaluate(self, *, candidate, draft, provider_bundle, replay_runner=None):
+        self.calls += 1
+        return SkillDraftEvalReport(
+            report_id="eval-existing",
+            skill_name=draft.skill_name,
+            draft_id=draft.draft_id,
+            candidate_id=candidate.candidate_id,
+            passed=True,
+            baseline_score_avg=0.5,
+            candidate_score_avg=0.8,
+            score_delta=0.3,
+            regression_count=0,
+            improved_count=1,
+            unchanged_count=0,
+            status="completed",
+        )


 def test_skill_learning_candidates_and_run_once_api(tmp_path: Path) -> None:
@ -31,3 +56,191 @@ def test_skill_learning_candidates_and_run_once_api(tmp_path: Path) -> None:
    assert candidates[0]["candidate_id"] == "candidate-1"
    assert "risk_level" in candidates[0]
    assert run_once["processed"] >= 0
+
+
+def test_skill_learning_candidates_payload_prefers_original_task_text(tmp_path: Path) -> None:
+    service = AgentService(workspace=tmp_path)
+    loaded = service.create_loop().boot()
+    now = "2026-06-11T00:00:00+00:00"
+    loaded.skill_learning_service.run_store.append_run_record(  # type: ignore[union-attr]
+        RunRecord(
+            run_id="run-original",
+            session_id="session-task",
+            task_id="task-1",
+            attempt_index=1,
+            task_text="Compare direct production restart with staging rollout",
+            started_at=now,
+            ended_at=now,
+            success=False,
+            finish_reason="stop",
+            feedback={"feedback_type": "revise", "comment": "I do not see the docs"},
+            activated_skills=[],
+            validation_result=None,
+        )
+    )
+    loaded.skill_learning_service.run_store.append_run_record(  # type: ignore[union-attr]
+        RunRecord(
+            run_id="run-final",
+            session_id="session-task",
+            task_id="task-1",
+            attempt_index=2,
+            task_text="I do not see the docs",
+            started_at=now,
+            ended_at=now,
+            success=True,
+            finish_reason="stop",
+            feedback={"feedback_type": "satisfied", "acceptance_type": "accept"},
+            activated_skills=[],
+            validation_result={"accepted": True, "score": 0.9},
+        )
+    )
+    loaded.skill_learning_store.record_learning_candidate(  # type: ignore[union-attr]
+        SkillLearningCandidate(
+            candidate_id="new:task:task-1",
+            kind="new_skill",
+            source_run_ids=["run-original", "run-final"],
+            source_session_ids=["session-task"],
+            related_skill_names=[],
+            reason="test",
+            evidence={"task_id": "task-1", "theme": "i do not see the docs"},
+        )
+    )
+    app = create_app(service=service, manage_service_lifecycle=False)
+
+    with TestClient(app) as client:
+        candidates = client.get("/api/skills/candidates").json()
+
+    payload = next(item for item in candidates if item["candidate_id"] == "new:task:task-1")
+    assert payload["evidence"]["theme"] == "Compare direct production restart with staging rollout"
+    assert payload["evidence"]["task_text"] == "Compare direct production restart with staging rollout"
+
+
+def test_generate_draft_does_not_run_review_checks(tmp_path: Path, monkeypatch) -> None:
+    service = AgentService(workspace=tmp_path)
+    loaded = service.create_loop().boot()
+    draft = loaded.skill_learning_pipeline.draft_service.create_new_skill_draft(  # type: ignore[union-attr]
+        skill_name="filesystem-operation",
+        proposed_content="# Filesystem Operation\n\nUse files safely.",
+        proposed_frontmatter={"description": "filesystem", "tools": []},
+        created_by="test",
+        reason="test",
+    )
+    loaded.skill_learning_store.record_learning_candidate(  # type: ignore[union-attr]
+        SkillLearningCandidate(
+            candidate_id="candidate-existing",
+            kind="revise_skill",
+            source_run_ids=["run-1"],
+            source_session_ids=["session-1"],
+            related_skill_names=["filesystem-operation"],
+            reason="revise",
+            status="draft_ready",
+            draft_skill_name=draft.skill_name,
+            draft_id=draft.draft_id,
+        )
+    )
+    evaluator = StubEvaluator()
+    loaded.skill_learning_pipeline.evaluator = evaluator  # type: ignore[union-attr]
+    monkeypatch.setattr(
+        service,
+        "_make_provider_bundle_for_task",
+        lambda loaded, kwargs: SimpleNamespace(main_provider=object()),
+    )
+    app = create_app(service=service, manage_service_lifecycle=False)
+
+    with TestClient(app) as client:
+        response = client.post("/api/skills/candidates/candidate-existing/draft")
+
+    assert response.status_code == 200
+    payload = response.json()
+    assert evaluator.calls == 0
+    assert payload["draft_id"] == draft.draft_id
+    assert payload["safety_report"] is None
+    assert payload["eval_report"] is None
+    assert loaded.skill_learning_pipeline.get_eval_report(draft.skill_name, draft.draft_id) is None  # type: ignore[union-attr]
+
+
+def test_submit_draft_runs_safety_and_eval(tmp_path: Path, monkeypatch) -> None:
+    service = AgentService(workspace=tmp_path)
+    loaded = service.create_loop().boot()
+    draft = loaded.skill_learning_pipeline.draft_service.create_new_skill_draft(  # type: ignore[union-attr]
+        skill_name="filesystem-operation",
+        proposed_content="# Filesystem Operation\n\nUse files safely.",
+        proposed_frontmatter={"description": "filesystem", "tools": []},
+        created_by="test",
+        reason="test",
+    )
+    loaded.skill_learning_store.record_learning_candidate(  # type: ignore[union-attr]
+        SkillLearningCandidate(
+            candidate_id="candidate-existing",
+            kind="revise_skill",
+            source_run_ids=["run-1"],
+            source_session_ids=["session-1"],
+            related_skill_names=["filesystem-operation"],
+            reason="revise",
+            status="draft_ready",
+            draft_skill_name=draft.skill_name,
+            draft_id=draft.draft_id,
+        )
+    )
+    evaluator = StubEvaluator()
+    loaded.skill_learning_pipeline.evaluator = evaluator  # type: ignore[union-attr]
+    monkeypatch.setattr(
+        service,
+        "_make_provider_bundle_for_task",
+        lambda loaded, kwargs: SimpleNamespace(main_provider=object()),
+    )
+    app = create_app(service=service, manage_service_lifecycle=False)
+
+    with TestClient(app) as client:
+        response = client.post(f"/api/skills/{draft.skill_name}/drafts/{draft.draft_id}/submit")
+
+    assert response.status_code == 200
+    payload = response.json()
+    assert evaluator.calls == 1
+    assert payload["status"] == "in_review"
+    assert payload["safety_report"]["passed"] is True
+    assert payload["eval_report"]["report_id"] == "eval-existing"
+
+
+def test_draft_payload_includes_target_version_for_revision(tmp_path: Path) -> None:
+    service = AgentService(workspace=tmp_path)
+    loaded = service.create_loop().boot()
+    loaded.skill_spec_store.write_skill_version(  # type: ignore[union-attr]
+        SkillVersion(
+            skill_name="filesystem-operation",
+            version="v0001",
+            content_hash="hash-v1",
+            summary_hash="summary-v1",
+            created_at="2026-06-01T00:00:00+00:00",
+            created_by="test",
+            change_reason="initial",
+            parent_version=None,
+            review_state="published",
+            frontmatter={"description": "filesystem", "name": "filesystem-operation", "tools": []},
+            summary="filesystem",
+            tool_hints=[],
+        ),
+        "# Filesystem Operation\n\nUse files.",
+    )
+    loaded.skill_spec_store.set_current_version("filesystem-operation", "v0001")  # type: ignore[union-attr]
+    draft = loaded.skill_learning_pipeline.draft_service.create_revision_draft(  # type: ignore[union-attr]
+        skill_name="filesystem-operation",
+        base_version="v0001",
+        proposed_content="# Filesystem Operation\n\nUse files better.",
+        proposed_frontmatter={"description": "filesystem", "name": "filesystem-operation", "tools": []},
+        created_by="test",
+        reason="revise",
+    )
+    app = create_app(service=service, manage_service_lifecycle=False)
+
+    with TestClient(app) as client:
+        response = client.get("/api/skills/drafts")
+
+    assert response.status_code == 200
+    payload = next(item for item in response.json() if item["draft_id"] == draft.draft_id)
+    assert payload["proposal_kind"] == "revise_skill"
+    assert payload["base_version"] == "v0001"
+    assert payload["target_version"] == "v0002"
+    assert payload["base_skill"]["version"] == "v0001"
+    assert payload["base_skill"]["content"] == "# Filesystem Operation\n\nUse files."
+    assert payload["base_skill"]["frontmatter"]["name"] == "filesystem-operation"