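feat(app): remove built-in agents, add CORS support, and improve skill upload

Removed all built-in agent configurations from agents/registry.json, leaving the agents array empty.
Added CORS middleware to the web app so that the specified frontend origins can make cross-origin requests.
Refactored skill upload and added an LLM rewrite step that automatically normalizes the format of uploaded skills.
Added tool-name extraction logic that automatically identifies the Required Tools section in the skill body.
Updated the payload structure of skill-learning candidates and drafts to include evaluation report statistics.
Revised the intent-routing skill description and improved the task state management logic.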
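
For readers unfamiliar with the Required Tools extraction mentioned in the message, the idea can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function name `extract_required_tools`, the assumption that the section is a Markdown heading followed by a bullet list, and the rule of preferring a backticked name are all hypothetical.

```python
import re

# Assumption: the skill body is Markdown and the section is introduced by a
# heading whose text is exactly "Required Tools" (case-insensitive).
_HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")
_BULLET = re.compile(r"^\s*[-*+]\s+(.*)$")
_CODE_SPAN = re.compile(r"`([^`]+)`")


def extract_required_tools(skill_body: str) -> list[str]:
    """Return tool names listed as bullets under a 'Required Tools' heading."""
    tools: list[str] = []
    in_section = False
    for line in skill_body.splitlines():
        heading = _HEADING.match(line)
        if heading:
            # Any heading starts a new section; only one of them is ours.
            in_section = heading.group(2).strip().lower() == "required tools"
            continue
        if not in_section:
            continue
        bullet = _BULLET.match(line)
        if not bullet:
            continue
        item = bullet.group(1).strip()
        code = _CODE_SPAN.search(item)
        # Prefer a backticked name; otherwise take the first word of the bullet.
        name = code.group(1) if code else item.split()[0] if item.split() else ""
        name = name.strip().rstrip(":,")
        if name and name not in tools:
            tools.append(name)  # keep first-seen order, drop duplicates
    return tools
```

Calling `extract_required_tools(body)` on an uploaded skill's text would yield an ordered, de-duplicated list of names, which could then be checked against whatever tool registry the application actually uses.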
2026-06-12 13:25:20 +08:00
parent fc9fd93c36
commit 8aeb97a5fc
76 changed files with 3382 additions and 553 deletions

@@ -87,7 +87,6 @@ For product pilots:
| Connector maturity varies by channel | Customer demos must avoid overpromising |
| Multi-instance deployment is powerful but operationally sensitive | Pilot success depends on stable setup and clear runbooks |
| Skill learning needs strong governance | Reuse can become risk if publishing is weak |
| Metrics are not yet productized | Hard to prove pilot value without baseline and target |
| Customer research is not yet captured | Current roadmap is inferred from implementation and product judgment |
## User Segments
@@ -345,51 +344,6 @@ Opportunity 3: I need successful work to become reusable.
| Production writes through connectors without review | Trust risk |
| Complex enterprise RBAC before pilot validation | May overbuild before segment clarity |
## Metrics Dashboard
### North Star Metric
Accepted Agent Workflows:
> Number of AI-assisted tasks or scheduled workflows accepted by users per active pilot team per week.
Why this metric: it captures real delivered value better than messages sent, tokens used, or model calls.
### Input Metrics
| Metric | Definition | Target For Pilot |
| --- | --- | --- |
| Task Creation Rate | Tasks created / active users / week | Increasing weekly |
| Acceptance Rate | Accepted task runs / completed task runs | >=60% in pilot |
| Revision Rate | Runs needing revision / completed runs | Track down over time |
| Evidence Coverage | Task runs with timeline/tool/artifact evidence / task runs | >=90% |
| Skill Candidate Rate | Accepted tasks producing candidates / accepted tasks | >=20% after week 2 |
| Skill Reuse Rate | Runs activating published pilot skills / task runs | >=15% after skills exist |
| Scheduled Success Rate | Accepted scheduled outputs / scheduled runs | >=50% for selected workflows |
| Deployment Success Time | Fresh deployment time to first working user | <2 hours for pilot |
### Guardrail Metrics
| Metric | Alert |
| --- | --- |
| Critical tool/security incident | Any occurrence |
| Instance creation failure rate | >10% in pilot |
| Provider configuration failure rate | >20% |
| Task run failure rate | >20% for 2 consecutive days |
| Connector side-effect incident | Any unintended external write |
| User file permission/storage incident | Any cross-user or cross-instance leak |
| p95 task completion latency | Exceeds pilot workflow tolerance |
### Business Metrics
- Pilot activation: teams reaching first accepted task.
- Time to first accepted task.
- Weekly active task users.
- Repeated workflow count.
- Skill reuse per team.
- Customer-reported time saved.
- Pilot conversion intent.
## Customer Research Plan
No direct interview transcripts were provided. Research should start immediately before locking roadmap.
@@ -454,7 +408,7 @@ We are studying how teams move AI from chat into real work. We are not asking wh
1. Pick 2-3 pilot workflows: project brief, weekly report, document review, support triage, or file processing.
2. Run fresh deployment rehearsal from README/deployment guide and record gaps.
3. Define pilot metrics and instrument accepted tasks, revisions, skill candidates, skill reuse, and run failures.
3. Define pilot learning questions and instrument the events needed to answer them.
4. Create a task evidence narrative prototype on top of existing timeline data.
5. Package pilot workflow templates as skills or documented demos.
6. Validate provider onboarding with 3 non-engineer users.