feat(app): remove built-in agents, add CORS support, and improve skill upload

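Removed all built-in agent configurations from agents/registry.json, leaving the agents array empty. Added CORS middleware to the web app so that the specified frontend origins can make cross-origin requests. Refactored skill upload to add an LLM rewrite step that automatically normalizes the format of uploaded skills. Added tool-name extraction logic that automatically identifies the Required Tools section in the skill body. Updated the payload structures for skill-learning candidates and drafts to include evaluation report statistics. Revised the description of the intent-routing skill and improved the task state management logic.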
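
The tool-name extraction mentioned above could look roughly like the sketch below. It is illustrative only and is not taken from this commit: the function name `extract_required_tools`, the assumption that skills are Markdown with a `Required Tools` heading, and the bullet-list layout of that section are all hypothetical.

```python
import re


def extract_required_tools(skill_body: str) -> list[str]:
    """Collect tool names from bullets under a 'Required Tools' heading.

    Assumes (hypothetically) a Markdown skill body in which tools are
    listed as bullets, optionally wrapped in backticks.
    """
    tools: list[str] = []
    in_section = False
    for line in skill_body.splitlines():
        heading = re.match(r"^#{1,6}\s+(.*\S)\s*$", line)
        if heading:
            # Any heading opens or closes the section of interest.
            in_section = heading.group(1).strip().lower() == "required tools"
            continue
        if not in_section:
            continue
        item = re.match(r"^\s*[-*]\s+`?([A-Za-z0-9_.\-]+)`?", line)
        if item and item.group(1) not in tools:
            tools.append(item.group(1))  # keep first-seen order, skip repeats
    return tools
```

Bullets outside the `Required Tools` section are ignored, so a later `Steps` list does not leak into the result.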
2026-06-12 13:25:20 +08:00
parent fc9fd93c36
commit 8aeb97a5fc
76 changed files with 3382 additions and 553 deletions
--- a/docs/product-discovery/beaver/product-discovery-report.md
+++ b/docs/product-discovery/beaver/product-discovery-report.md
@@ -87,7 +87,6 @@ For product pilots:
 | Connector maturity varies by channel | Customer demos must avoid overpromising |
 | Multi-instance deployment is powerful but operationally sensitive | Pilot success depends on stable setup and clear runbooks |
 | Skill learning needs strong governance | Reuse can become risk if publishing is weak |
-| Metrics are not yet productized | Hard to prove pilot value without baseline and target |
 | Customer research is not yet captured | Current roadmap is inferred from implementation and product judgment |

 ## User Segments
@@ -345,51 +344,6 @@ Opportunity 3: I need successful work to become reusable.
 | Production writes through connectors without review | Trust risk |
 | Complex enterprise RBAC before pilot validation | May overbuild before segment clarity |

-## Metrics Dashboard
-
-### North Star Metric
-
-Accepted Agent Workflows:
-
-> Number of AI-assisted tasks or scheduled workflows accepted by users per active pilot team per week.
-
-Why this metric: it captures real delivered value better than messages sent, tokens used, or model calls.
-
-### Input Metrics
-
-| Metric | Definition | Target For Pilot |
-| --- | --- | --- |
-| Task Creation Rate | Tasks created / active users / week | Increasing weekly |
-| Acceptance Rate | Accepted task runs / completed task runs | >=60% in pilot |
-| Revision Rate | Runs needing revision / completed runs | Track down over time |
-| Evidence Coverage | Task runs with timeline/tool/artifact evidence / task runs | >=90% |
-| Skill Candidate Rate | Accepted tasks producing candidates / accepted tasks | >=20% after week 2 |
-| Skill Reuse Rate | Runs activating published pilot skills / task runs | >=15% after skills exist |
-| Scheduled Success Rate | Accepted scheduled outputs / scheduled runs | >=50% for selected workflows |
-| Deployment Success Time | Fresh deployment time to first working user | <2 hours for pilot |
-
-### Guardrail Metrics
-
-| Metric | Alert |
-| --- | --- |
-| Critical tool/security incident | Any occurrence |
-| Instance creation failure rate | >10% in pilot |
-| Provider configuration failure rate | >20% |
-| Task run failure rate | >20% for 2 consecutive days |
-| Connector side-effect incident | Any unintended external write |
-| User file permission/storage incident | Any cross-user or cross-instance leak |
-| p95 task completion latency | Exceeds pilot workflow tolerance |
-
-### Business Metrics
-
-- Pilot activation: teams reaching first accepted task.
-- Time to first accepted task.
-- Weekly active task users.
-- Repeated workflow count.
-- Skill reuse per team.
-- Customer-reported time saved.
-- Pilot conversion intent.
-
 ## Customer Research Plan

 No direct interview transcripts were provided. Research should start immediately before locking roadmap.
@@ -454,7 +408,7 @@ We are studying how teams move AI from chat into real work. We are not asking wh

 1. Pick 2-3 pilot workflows: project brief, weekly report, document review, support triage, or file processing.
 2. Run fresh deployment rehearsal from README/deployment guide and record gaps.
-3. Define pilot metrics and instrument accepted tasks, revisions, skill candidates, skill reuse, and run failures.
+3. Define pilot learning questions and instrument the events needed to answer them.
 4. Create a task evidence narrative prototype on top of existing timeline data.
 5. Package pilot workflow templates as skills or documented demos.
 6. Validate provider onboarding with 3 non-engineer users.