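feat(coordinator): add default max tool iterations setting for team nodes

Add the DEFAULT_TEAM_NODE_MAX_TOOL_ITERATIONS setting to control the maximum number of
tool iterations for team nodes, and update LocalAgentRunner to fall back to this default
when the envelope does not specify a value.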

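fix(runtime): fix success detection for team node runs

Update the success condition so that a finish_reason of "max_tool_iterations_finalized"
counts as a failed run, and add detection of raw tool-call output so that it is not
mistaken for a successful completion.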

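feat(mcp): add support for the team workflow MCP tool category

Add a new local MCP tool category, "team_workflow", and its tool creation function,
providing local tool support for team workflows.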

refactor(engine): adjust the AgentLoop max tool iterations setting

Raise the default max_tool_iterations in AgentProfile from 30 to 100,
and remove the duplicate argument passed in the TaskExecutionPlanner constructor.

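perf(mcp): optimize MCP connection management to avoid duplicate connections

Add an mcp_connected flag to track MCP connection state so that connect_all runs only once,
improving performance and avoiding unnecessary reconnection.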

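refactor(skills): remove skill team template functionality

Remove code related to skill team templates, including parsing, storage, and handling logic,
simplifying the skill record structure and the loading flow.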

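feat(process): enhance the session process projector

Add handling of skill activation snapshot events, improve the display of team run
completion messages, and record timestamps for skill activation events.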

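refactor(tasks): simplify team execution logic in the task attempt orchestrator

Remove team execution code and handle all tasks uniformly as single-step execution,
reducing the orchestrator's complexity and improving execution efficiency.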

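fix(evidence): fix requirement validation in node evidence evaluation

Update node evidence evaluation to skip deterministic validation of natural-language
evidence requirements and run only machine-readable requirement validation, so that
natural-language requirements no longer cause node failures.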
2026-06-26 16:36:29 +08:00
parent 53b13e8eac
commit 520a21a027
360 changed files with 13271 additions and 1848 deletions


@@ -0,0 +1,257 @@
# Steven MGM/Galaxy Team Template Demo
## Target
Install `mgm-galaxy-financial-chart-report-safe` into Steven's Beaver workspace, then run one task that exercises:
```text
Planner
→ Skill Template selection
→ ExecutionGraph / ExecutionNode adaptation
→ Node Skill Binding
→ Team execution
→ Tool scope filtering
→ Evidence gate
→ Final synthesis complete/incomplete outcome
```
## Current environment status observed by Codex
The repository contains Steven's instance metadata:
```text
instance_id: steven
container_name: app-instance-steven
beaver_home: app-instance/runtime/instances/steven/beaver-home
workspace: app-instance/runtime/instances/steven/beaver-home/workspace
public_url: http://steven.172.19.0.245.nip.io:8088
```
Codex could not directly apply the skill to the live Steven instance in this session because:
```text
docker API: permission denied while connecting to /var/run/docker.sock
Steven workspace/skills parent dir: owned by nobody:nogroup and not writable by current user
local backend .venv: incomplete after uv environment rebuild; missing test/runtime dependencies
```
This runbook is therefore the artifact to apply from a shell that has Docker access or write permission on Steven's workspace.
## Install Skill into Steven workspace
From repository root, run as a user that can write Steven's workspace:
```bash
SKILL_NAME=mgm-galaxy-financial-chart-report-safe
WORKSPACE=app-instance/runtime/instances/steven/beaver-home/workspace
SKILL_DIR="$WORKSPACE/skills/$SKILL_NAME"
mkdir -p "$SKILL_DIR/versions/v0001"
cp docs/superpowers/examples/mgm-galaxy-financial-chart-report-safe.SKILL.md \
  "$SKILL_DIR/versions/v0001/SKILL.md"
python3 - <<'PY'
import hashlib
import json
from pathlib import Path
from datetime import datetime, timezone
skill_name = "mgm-galaxy-financial-chart-report-safe"
workspace = Path("app-instance/runtime/instances/steven/beaver-home/workspace")
skill_dir = workspace / "skills" / skill_name
skill_md = skill_dir / "versions" / "v0001" / "SKILL.md"
content = skill_md.read_text(encoding="utf-8")
digest = "sha256:" + hashlib.sha256(content.encode("utf-8")).hexdigest()
now = datetime.now(timezone.utc).isoformat()
(skill_dir / "current.json").write_text(
    json.dumps({"current_version": "v0001"}, indent=2, ensure_ascii=False) + "\n",
    encoding="utf-8",
)
(skill_dir / "skill.json").write_text(
    json.dumps(
        {
            "name": skill_name,
            "display_name": "MGM/Galaxy Financial Chart Report Safe",
            "description": "Compare MGM China and Galaxy Entertainment using official financial sources, produce chart-ready Markdown, and avoid claiming generated chart image/file artifacts.",
            "created_at": now,
            "updated_at": now,
            "current_version": "v0001",
            "status": "active",
            "tags": ["finance", "research", "report", "chart-ready-data", "mgm", "galaxy"],
            "owners": ["steven"],
            "source_kind": "workspace",
            "lineage": [],
        },
        indent=2,
        ensure_ascii=False,
    )
    + "\n",
    encoding="utf-8",
)
(skill_dir / "versions" / "v0001" / "version.json").write_text(
    json.dumps(
        {
            "skill_name": skill_name,
            "version": "v0001",
            "content_hash": digest,
            "summary_hash": digest,
            "created_at": now,
            "created_by": "steven",
            "change_reason": "Add real Skill Team Template example for MGM/Galaxy finance report demo",
            "parent_version": None,
            "review_state": "published",
            "frontmatter": {
                "name": skill_name,
                "description": "Compare MGM China and Galaxy Entertainment using official financial sources, produce chart-ready Markdown, and avoid claiming generated chart image/file artifacts.",
                "tools": ["web_search", "web_fetch"],
            },
            "summary": "MGM/Galaxy finance report skill with a task-only Beaver team template for official source collection, metric extraction, validation, and Markdown chart-ready reporting.",
            "tool_hints": ["web_search", "web_fetch"],
            "provenance": {"source_kind": "manual_demo", "target_instance": "steven"},
            "tree_hash": "",
        },
        indent=2,
        ensure_ascii=False,
    )
    + "\n",
    encoding="utf-8",
)
# Register the skill in the published index, creating the index if it does not exist yet.
index_path = workspace / "skills" / "_index" / "published.json"
index_path.parent.mkdir(parents=True, exist_ok=True)
try:
    payload = json.loads(index_path.read_text(encoding="utf-8"))
except FileNotFoundError:
    payload = {"items": []}
items = [str(item) for item in payload.get("items", [])]
if skill_name not in items:
    items.append(skill_name)
index_path.write_text(json.dumps({"items": items}, indent=2, ensure_ascii=False) + "\n", encoding="utf-8")
PY
```
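Before restarting, it may be worth confirming that the files written above are mutually consistent. The following is a minimal sketch and not part of the Beaver codebase: `check_skill_install` is a hypothetical helper that only re-checks what the install script wrote (the `current.json` pointer, the `content_hash` in `version.json`, and the entry in `_index/published.json`). It makes no claim about how Beaver itself validates skills.

```python
import hashlib
import json
from pathlib import Path


def check_skill_install(workspace: Path, skill_name: str) -> list[str]:
    """Return a list of problems found; an empty list means the layout is consistent.

    Hypothetical helper: it mirrors the layout written by the install script
    and is not derived from Beaver's own loader.
    """
    problems: list[str] = []
    skill_dir = workspace / "skills" / skill_name

    try:
        current = json.loads((skill_dir / "current.json").read_text(encoding="utf-8"))
    except FileNotFoundError:
        return [f"missing {skill_dir / 'current.json'}"]
    version_dir = skill_dir / "versions" / current.get("current_version", "")

    skill_md = version_dir / "SKILL.md"
    version_json = version_dir / "version.json"
    if not skill_md.is_file() or not version_json.is_file():
        return [f"missing SKILL.md or version.json under {version_dir}"]

    # Recompute the digest the same way the install script does.
    content = skill_md.read_text(encoding="utf-8")
    digest = "sha256:" + hashlib.sha256(content.encode("utf-8")).hexdigest()
    recorded = json.loads(version_json.read_text(encoding="utf-8")).get("content_hash")
    if recorded != digest:
        problems.append(f"content_hash mismatch: recorded {recorded}, computed {digest}")

    # The skill must also be listed in the published index.
    index_path = workspace / "skills" / "_index" / "published.json"
    try:
        items = json.loads(index_path.read_text(encoding="utf-8")).get("items", [])
    except FileNotFoundError:
        items = []
    if skill_name not in items:
        problems.append(f"{skill_name} is not listed in {index_path}")
    return problems
```

Run from the repository root, `check_skill_install(Path("app-instance/runtime/instances/steven/beaver-home/workspace"), "mgm-galaxy-financial-chart-report-safe")` should return an empty list after a successful install.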
## Restart or start Steven container
If the container already exists:
```bash
docker restart app-instance-steven
```
If it does not exist, use the existing instance metadata and project scripts rather than creating a new instance id.
## Demo task prompt
Send this as Steven's user task:
```text
Use the MGM/Galaxy finance report skill to compare MGM China Holdings and Galaxy Entertainment using official public financial disclosures. Produce a concise board-style Markdown report with source URLs, a comparison table, chart-ready data, and a text/Mermaid chart section. Do not claim a generated image or saved chart file.
```
## Expected planning shape
The planner should produce a team DAG with these task nodes:
```text
collect_official_sources
→ extract_financial_metrics
→ validate_metrics
→ generate_chart_report
```
Expected node constraints:
```text
collect_official_sources.allowed_tool_names = ["web_search", "web_fetch"]
extract_financial_metrics.allowed_tool_names = ["web_fetch"]
validate_metrics.allowed_tool_names = []
generate_chart_report.allowed_tool_names = []
```
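Tool scope filtering against these constraints can be pictured as a simple allow-list check. The sketch below is illustrative only: `ALLOWED_TOOL_NAMES` copies the expected values above, while `filter_tool_calls` and the `(tool_name, arguments)` tuple shape are assumptions made for this example, not Beaver's actual runtime API.

```python
# Expected per-node allow-lists, copied from the node constraints above.
ALLOWED_TOOL_NAMES: dict[str, list[str]] = {
    "collect_official_sources": ["web_search", "web_fetch"],
    "extract_financial_metrics": ["web_fetch"],
    "validate_metrics": [],
    "generate_chart_report": [],
}

# (tool_name, arguments); a hypothetical shape chosen for illustration.
ToolCall = tuple[str, dict]


def filter_tool_calls(
    node_id: str, calls: list[ToolCall]
) -> tuple[list[ToolCall], list[ToolCall]]:
    """Split requested calls into (permitted, rejected) for one node.

    Unknown node ids get an empty allow-list, so every call is rejected.
    """
    allowed = set(ALLOWED_TOOL_NAMES.get(node_id, []))
    permitted = [call for call in calls if call[0] in allowed]
    rejected = [call for call in calls if call[0] not in allowed]
    return permitted, rejected
```

Under these assumptions, a `web_search` call requested by `extract_financial_metrics` lands in the rejected list, and every tool call requested by `validate_metrics` or `generate_chart_report` is rejected.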
The created workers should remain generic:
```text
node.agent.role = ""
node.agent.metadata.sub_agent_kind = "generic_skill_worker"
```
## Expected complete outcome
If source collection and extraction produce required evidence:
```text
Planner
→ TeamRunResult with required nodes completion_status=succeeded
→ task_outcome=complete
→ tool-free final synthesis
→ final Markdown report
```
The final output may include:
```text
comparison table
chart-ready data
Mermaid
Markdown chart section
text bar chart fallback
final textual report
```
It must not claim:
```text
generated chart image
generated chart file
saved chart artifact
```
## Expected incomplete outcome
If official-source evidence is missing or web tools fail:
```text
collect_official_sources.completion_status=partial
→ evidence_gaps populated
→ because block_downstream_on_partial=true, downstream nodes are blocked
→ task_outcome=incomplete
→ tool-free final synthesis still runs
→ final answer is prefixed with an incomplete notice
```
The final response should explain which required evidence was missing instead of fabricating metrics.
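The blocking behaviour in the incomplete case can be sketched as a walk over the linear DAG. This is a minimal illustration under stated assumptions: only `succeeded` and `partial` appear in this runbook, so the `failed` and `blocked` labels, the treatment of unreported nodes as `failed`, and the `resolve_outcome` function itself are hypothetical. It is not the coordinator's actual implementation.

```python
# Linear DAG from "Expected planning shape".
NODE_ORDER = [
    "collect_official_sources",
    "extract_financial_metrics",
    "validate_metrics",
    "generate_chart_report",
]


def resolve_outcome(
    reported: dict[str, str],
    block_downstream_on_partial: bool = True,
) -> tuple[str, dict[str, str], list[str]]:
    """Return (task_outcome, final_statuses, incomplete_node_ids).

    `reported` maps node id -> status reported by the worker. Nodes with no
    reported status are treated as "failed" (an assumption of this sketch).
    """
    final: dict[str, str] = {}
    blocked = False
    for node_id in NODE_ORDER:
        if blocked:
            # An upstream node stopped the chain, so this node never runs.
            final[node_id] = "blocked"
            continue
        status = reported.get(node_id, "failed")
        final[node_id] = status
        if status == "failed" or (status == "partial" and block_downstream_on_partial):
            blocked = True
    incomplete = [node_id for node_id in NODE_ORDER if final[node_id] != "succeeded"]
    outcome = "complete" if not incomplete else "incomplete"
    return outcome, final, incomplete
```

With `reported = {"collect_official_sources": "partial"}`, the three downstream nodes come back as `blocked` and the outcome is `incomplete`, which matches the flow listed above.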
## Verification queries
After running the task, inspect Steven's event log:
```bash
WORKSPACE=app-instance/runtime/instances/steven/beaver-home/workspace
tail -n 200 "$WORKSPACE/tasks/events.jsonl"
```
Look for:
```text
task_execution_planned
task_team_run_completed or task_team_run_failed
task_synthesis_completed
```
For `task_execution_planned`, verify:
```text
planner_adaptation.template_used = true
planner_adaptation.selected_template = mgm-galaxy-financial-chart-report-safe
node_ids include collect_official_sources/extract_financial_metrics/validate_metrics/generate_chart_report
```
For `task_synthesis_completed`, verify:
```text
task_outcome = complete | incomplete
incomplete_node_ids = [] for complete, otherwise populated
```
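The event check can also be scripted. The sketch below assumes that every non-empty line of `events.jsonl` is a JSON object and that the event name is stored under a key called `event_type`; this runbook does not document the log schema, so that key name is a guess and is exposed as a parameter. `missing_event_groups` is a hypothetical helper.

```python
import json
from collections.abc import Iterable

# One entry per "Look for" line above; any name in a group satisfies it.
EXPECTED_ANY_OF = [
    {"task_execution_planned"},
    {"task_team_run_completed", "task_team_run_failed"},
    {"task_synthesis_completed"},
]


def missing_event_groups(
    lines: Iterable[str], type_key: str = "event_type"
) -> list[set[str]]:
    """Return the expected groups for which no matching event was seen.

    ASSUMPTION: the event name lives under `type_key`. Adjust it to the
    real schema of events.jsonl before relying on the result.
    """
    seen: set[str] = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if isinstance(record, dict) and isinstance(record.get(type_key), str):
            seen.add(record[type_key])
    return [group for group in EXPECTED_ANY_OF if not (group & seen)]
```

Passing the open log file (for example `open(path, encoding="utf-8")`, with `path` pointing at `tasks/events.jsonl`) should yield an empty list once planning, the team run, and synthesis have all been logged.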