feat(coordinator): 添加团队节点默认最大工具迭代次数配置

添加 DEFAULT_TEAM_NODE_MAX_TOOL_ITERATIONS 配置项以控制团队节点的最大工具迭代次数,
并修改 LocalAgentRunner 中的逻辑来使用此默认值当 envelope 中未指定时。

fix(runtime): 修复团队节点运行成功判断逻辑

更新运行成功判断条件,将 finish_reason 为 "max_tool_iterations_finalized" 的情况
视为运行失败,并添加对原始工具调用输出的检测,避免将其误判为成功完成。

feat(mcp): 添加团队工作流MCP工具类别支持

增加新的本地MCP工具类别 "team_workflow" 及其对应的工具创建功能,
为团队工作流提供本地工具支持。

refactor(engine): 调整AgentLoop最大工具迭代次数设置

将 AgentProfile 中的默认 max_tool_iterations 从 30 增加到 100,
同时移除 TaskExecutionPlanner 构造函数中的重复参数传递。

perf(mcp): 优化MCP连接管理避免重复连接

添加 mcp_connected 标志来跟踪MCP连接状态,确保 connect_all 只执行一次,
提高性能并避免不必要的重复连接。

refactor(skills): 移除技能团队模板相关功能

移除与技能团队模板相关的代码,包括解析、存储和处理逻辑,
简化技能记录结构和加载流程。

feat(process): 增强会话过程投影器功能

添加技能激活快照事件处理,改进团队运行完成消息显示,
并增强技能激活事件的时间戳记录功能。

refactor(tasks): 简化任务尝试编排器团队执行逻辑

移除团队执行相关代码,将所有任务统一按单步执行处理,
简化任务编排器的复杂度并提升执行效率。

fix(evidence): 修复节点证据评估中需求验证逻辑

更新节点证据评估逻辑,跳过自然语言证据需求的确定性验证,
只执行机器可读的需求验证,避免因自然语言需求导致的节点失败。
This commit is contained in:
2026-06-26 16:36:29 +08:00
parent 53b13e8eac
commit 520a21a027
360 changed files with 13271 additions and 1848 deletions

View File

@ -0,0 +1,150 @@
# Template-Guided Team Routing Design
## Goal
Make an activated Skill with a valid `beaver-team-template` a first-class execution option for the Main Agent. The Main Agent makes the execution-mode choice during its first normal model turn: it either calls `run_agent_team` or proceeds as a single agent. No new model round, natural-language decision reason, fixed-role agent, parallel Team model, or nested Team is introduced.
## Problem
The MGM/Galaxy financial-report task activated `mgm-galaxy-financial-chart-report-safe`, whose template describes source collection, metric extraction, validation, and report generation. The initial task planner nevertheless returned `planner_skipped_simple_task`, because its keyword prefilter did not recognize the request. The Main Agent had `run_agent_team` in its tool schema and the template in its Skill guidance, but the existing prompt only said that it "may" or should "prefer" the Team tool. It selected `web_search` directly and the run stopped at the web-search low-quality budget before any Team was created.
The issue is not missing tool registration or a broken Team runtime. The decision to use an active template is currently an unconstrained, implicit LLM preference.
## Scope
In scope:
- Main-Agent first-turn routing when one or more activated Skills provide a valid Team template.
- A compact, structured representation of the selected primary template in Main-Agent guidance.
- Explicit first-turn mode semantics inferred from normal execution:
- calling `run_agent_team` selects Team mode;
- calling another tool, or replying without a tool call, selects single-agent mode.
- Structured lifecycle observability for the chosen mode without an LLM-authored reason.
- Tests for template-present Team routing, single-agent opt-out, and legacy no-template behavior.
Out of scope:
- A separate mode-selection provider call or a `select_execution_mode` tool.
- Requiring a natural-language explanation for single-agent execution.
- Changing `ExecutionGraph`, `ExecutionNode`, `LocalAgentRunner`, scheduler/evidence/synthesis semantics, node Skill binding, or tool scopes.
- Fixed-role agents, nested Teams, a parallel Team runtime, frontend work, or chart-renderer tools.
- Changing the existing pre-execution `TaskExecutionPlanner` heuristic; this design makes Main-Agent routing reliable even when that planner returns `single`.
## Existing Runtime Boundary
```text
TaskAttemptOrchestrator
→ TaskExecutionPlanner.plan()
→ Main Agent AgentLoop.process_direct()
→ Skill activation + ToolAssembler
→ provider first response
→ selected tool execution
```
Today, `AgentLoop` registers `run_agent_team` for a root Task run and adds `TASK_AGENT_TEAM_CAPABILITY_PROMPT`. `SkillContext.team_template` is already populated from a Skill's `beaver-team-template` block. Therefore the implementation belongs at the Main-Agent prompt/tool boundary, not in a new Team runtime.
## Design
### 1. Template Eligibility
The normal activated Skill list remains authoritative. The routing helper considers only Skills where `SkillContext.team_template` is a non-empty mapping with a non-empty `nodes` list.
At most one template is primary. Selection is deterministic and preserves activation order: the first eligible activated Skill is primary; later eligible templates are guidance only. The primary Skill name and ignored template Skill names are included in structured runtime metadata, not in an LLM decision response.
No template is synthesized from Skill prose. Invalid/missing templates retain existing single-agent behavior.
### 2. First-Turn Routing Contract
When a root Task run has an eligible primary template and `run_agent_team` is exposed, the system guidance tells the Main Agent:
```text
Before beginning ordinary work, choose one execution path in this first response.
- For staged collection, extraction, validation, comparison, research, or reporting represented by the active template, call run_agent_team now using task-only nodes derived from the primary template.
- Choose single-agent execution only when the user's request is plainly a one-step request, explicitly asks not to delegate, or the template does not fit the immediate request.
- Do not call ordinary tools before this choice.
- If you choose single-agent execution, call ordinary tools or answer normally; do not explain the routing choice.
```
The template is included as compact JSON in this guidance, along with its Skill name. This prevents the model having to reconstruct the graph from prose. The existing template and task-only schema restrictions remain authoritative: no `agent` or `role` fields, no nested Team.
This is a prompting constraint, not a second classifier or a forced Team decision. A model can intentionally opt out by beginning normal single-agent work.
### 3. Mode Lock and Fallback
The first provider response locks the root run's mode:
| First response | Mode | Subsequent behavior |
| --- | --- | --- |
| contains `run_agent_team` | `team` | Execute the Team tool and retain existing Team outcome/synthesis behavior. |
| contains another tool call | `single` | Execute the ordinary tool call. Hide/reject later `run_agent_team` calls for this root run. |
| contains no tool call | `single` | Return the answer normally. |
| Team tool returns an error | `team` | Preserve its tool result and allow the Main Agent's normal post-tool response; do not silently start a second Team. |
The lock prevents a half-completed single-agent search from later creating a Team with overlapping evidence and ambiguous timeline ownership. A new Task attempt starts a fresh first-turn routing decision.
The existing pre-execution planner may still create a Team before Main Agent execution. If it does, no Main-Agent route decision is needed: its mode is already `team`. The new behavior applies to the current `single` pre-plan path, where the Main Agent otherwise owns execution.
### 4. Observability
Persist a machine-readable event immediately after the first provider response is classified:
```json
{
"execution_mode": "team",
"routing_source": "main_agent_first_turn",
"primary_template_skill": "mgm-galaxy-financial-chart-report-safe",
"ignored_template_skills": []
}
```
For a no-template run, no new event is necessary. For a template run that selects single mode, the same event is written with `execution_mode: "single"`. There is deliberately no `reason` field and no user-visible decision text.
The event makes future task detail/process projections able to distinguish “Team was not available” from “Team was available and the Main Agent selected single execution,” without inflating token usage.
### 5. Failure Semantics
- Template parse/selection failure: retain existing tool selection and single-agent execution.
- `run_agent_team` unavailable: do not claim Team routing; retain normal single-agent execution.
- Invalid Team arguments: current Team tool error path remains intact; it is visible as a tool result.
- First response tries both `run_agent_team` and ordinary tools: reject/defer ordinary tools for that turn and execute only Team, because Team mode owns the execution plan. This is a defensive runtime rule to preserve the no-mixed-mode invariant.
## Test Strategy
Unit tests cover the smallest boundaries:
1. Eligible activated template produces the first-turn route guidance containing the primary template and task-only Team instruction.
2. A first response containing `run_agent_team` records `team` mode and still follows the current Team tool call path.
3. A first response containing `web_search` records `single` mode; a later `run_agent_team` call is not exposed/executed in that run.
4. A valid template does not make ordinary no-template runs change their tool list or behavior.
5. Multiple templates select the first activated template and report the others as ignored metadata.
6. Existing agent-loop and Task/Team regression tests stay green.
## Acceptance Criteria
For a request that activates the MGM/Galaxy Skill:
```text
Skill activation
→ primary template is available to Main Agent first turn
→ Main Agent calls run_agent_team before web_search
→ Team graph and child nodes are created
```
For a plainly one-step request with an activated but non-fitting template:
```text
Skill activation
→ Main Agent begins normal execution directly
→ execution_mode=single is recorded
→ no additional model round and no reason text are generated
```
All existing semantics for generic workers, node tool scopes, Skill-based tool assembly, evidence gates, final synthesis, and no nested Teams remain unchanged.
## Risks
- This relies on the Main Agent following a stronger first-turn instruction; it is materially more reliable than `may/prefer`, but still model-mediated rather than a hard classifier.
- A template can be activated for a broad Skill while not fitting a narrow user follow-up. The explicit single-agent route is retained for that case.
- Hiding `run_agent_team` after first-turn single selection changes a root run's available tool list over iterations. The event and tests must make that state transition explicit.
- Existing pre-planner keyword routing remains a separate heuristic. It can still choose a Team early for known complex tasks; it is no longer the sole mechanism for template-driven Team execution.