
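steven_li 520a21a027 feat(coordinator): add default max tool iterations setting for team nodes

Add the DEFAULT_TEAM_NODE_MAX_TOOL_ITERATIONS setting to control the maximum number of tool iterations for team nodes,
and update the logic in LocalAgentRunner to use this default when the envelope does not specify a value.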

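fix(runtime): fix the team node run success check

Update the run success condition so that a finish_reason of "max_tool_iterations_finalized"
is treated as a failed run, and add detection of raw tool-call output so it is not mistaken for a successful completion.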

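feat(mcp): add support for the team workflow MCP tool category

Add a new local MCP tool category, "team_workflow", and its corresponding tool creation function,
providing local tool support for team workflows.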

refactor(engine): adjust the AgentLoop max tool iterations setting

Raise the default max_tool_iterations in AgentProfile from 30 to 100,
and remove the duplicate argument passed to the TaskExecutionPlanner constructor.

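perf(mcp): optimize MCP connection management to avoid repeated connections

Add an mcp_connected flag to track MCP connection state and ensure connect_all runs only once,
improving performance and avoiding unnecessary reconnections.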

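refactor(skills): remove skill team template functionality

Remove code related to skill team templates, including parsing, storage, and processing logic,
simplifying the skill record structure and loading flow.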

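feat(process): enhance the session process projector

Add handling for skill activation snapshot events, improve how team run completion messages are displayed,
and improve timestamp recording for skill activation events.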

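refactor(tasks): simplify team execution logic in the task attempt orchestrator

Remove team execution code and handle all tasks uniformly as single-step execution,
reducing the orchestrator's complexity and improving execution efficiency.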

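fix(evidence): fix requirement validation logic in node evidence evaluation

Update node evidence evaluation to skip deterministic validation of natural-language evidence requirements
and run only machine-readable requirement validation, avoiding node failures caused by natural-language requirements.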

2026-06-26 16:36:29 +08:00

7.7 KiB


Steven MGM/Galaxy Team Template Demo

Target

Install mgm-galaxy-financial-chart-report-safe into Steven's Beaver workspace, then run one task that exercises:

Planner
→ Skill Template selection
→ ExecutionGraph / ExecutionNode adaptation
→ Node Skill Binding
→ Team execution
→ Tool scope filtering
→ Evidence gate
→ Final synthesis complete/incomplete outcome

Current environment status observed by Codex

The repository contains Steven's instance metadata:

instance_id: steven
container_name: app-instance-steven
beaver_home: app-instance/runtime/instances/steven/beaver-home
workspace: app-instance/runtime/instances/steven/beaver-home/workspace
public_url: http://steven.172.19.0.245.nip.io:8088

Codex could not directly apply the skill to the live Steven instance in this session because:

docker API: permission denied while connecting to /var/run/docker.sock
Steven workspace/skills parent dir: owned by nobody:nogroup and not writable by current user
local backend .venv: incomplete after uv environment rebuild; missing test/runtime dependencies

This runbook is therefore the exact artifact to apply from a shell that has Docker access or write permission on Steven's workspace.

Install Skill into Steven workspace

From the repository root, run as a user that can write to Steven's workspace:

SKILL_NAME=mgm-galaxy-financial-chart-report-safe
WORKSPACE=app-instance/runtime/instances/steven/beaver-home/workspace
SKILL_DIR="$WORKSPACE/skills/$SKILL_NAME"

mkdir -p "$SKILL_DIR/versions/v0001"
cp docs/superpowers/examples/mgm-galaxy-financial-chart-report-safe.SKILL.md \
  "$SKILL_DIR/versions/v0001/SKILL.md"

python3 - <<'PY'
import hashlib
import json
from pathlib import Path
from datetime import datetime, timezone

skill_name = "mgm-galaxy-financial-chart-report-safe"
workspace = Path("app-instance/runtime/instances/steven/beaver-home/workspace")
skill_dir = workspace / "skills" / skill_name
skill_md = skill_dir / "versions" / "v0001" / "SKILL.md"
content = skill_md.read_text(encoding="utf-8")
digest = "sha256:" + hashlib.sha256(content.encode("utf-8")).hexdigest()
now = datetime.now(timezone.utc).isoformat()

(skill_dir / "current.json").write_text(
    json.dumps({"current_version": "v0001"}, indent=2, ensure_ascii=False) + "\n",
    encoding="utf-8",
)

(skill_dir / "skill.json").write_text(
    json.dumps(
        {
            "name": skill_name,
            "display_name": "MGM/Galaxy Financial Chart Report Safe",
            "description": "Compare MGM China and Galaxy Entertainment using official financial sources, produce chart-ready Markdown, and avoid claiming generated chart image/file artifacts.",
            "created_at": now,
            "updated_at": now,
            "current_version": "v0001",
            "status": "active",
            "tags": ["finance", "research", "report", "chart-ready-data", "mgm", "galaxy"],
            "owners": ["steven"],
            "source_kind": "workspace",
            "lineage": [],
        },
        indent=2,
        ensure_ascii=False,
    )
    + "\n",
    encoding="utf-8",
)

(skill_dir / "versions" / "v0001" / "version.json").write_text(
    json.dumps(
        {
            "skill_name": skill_name,
            "version": "v0001",
            "content_hash": digest,
            "summary_hash": digest,
            "created_at": now,
            "created_by": "steven",
            "change_reason": "Add real Skill Team Template example for MGM/Galaxy finance report demo",
            "parent_version": None,
            "review_state": "published",
            "frontmatter": {
                "name": skill_name,
                "description": "Compare MGM China and Galaxy Entertainment using official financial sources, produce chart-ready Markdown, and avoid claiming generated chart image/file artifacts.",
                "tools": ["web_search", "web_fetch"],
            },
            "summary": "MGM/Galaxy finance report skill with a task-only Beaver team template for official source collection, metric extraction, validation, and Markdown chart-ready reporting.",
            "tool_hints": ["web_search", "web_fetch"],
            "provenance": {"source_kind": "manual_demo", "target_instance": "steven"},
            "tree_hash": "",
        },
        indent=2,
        ensure_ascii=False,
    )
    + "\n",
    encoding="utf-8",
)

index_path = workspace / "skills" / "_index" / "published.json"
index_path.parent.mkdir(parents=True, exist_ok=True)
try:
    payload = json.loads(index_path.read_text(encoding="utf-8"))
except FileNotFoundError:
    payload = {"items": []}
items = [str(item) for item in payload.get("items", [])]
if skill_name not in items:
    items.append(skill_name)
index_path.write_text(json.dumps({"items": items}, indent=2, ensure_ascii=False) + "\n", encoding="utf-8")
PY
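
Before restarting the container, it may be worth confirming that the files written above agree with each other. The following is a minimal sketch, not part of the project: `verify_skill_install` is a hypothetical helper that re-reads `current.json`, `version.json`, and `_index/published.json` and recomputes the digest the same way the install script does. It assumes only the file layout and field names used in that script.

```python
import hashlib
import json
from pathlib import Path


def verify_skill_install(workspace: Path, skill_name: str) -> list[str]:
    """Return a list of problems; an empty list means the layout is consistent.

    Hypothetical helper: it checks only what the install script writes.
    """
    problems: list[str] = []
    skill_dir = workspace / "skills" / skill_name

    current = json.loads((skill_dir / "current.json").read_text(encoding="utf-8"))
    version = current["current_version"]
    version_dir = skill_dir / "versions" / version

    # Recompute the digest exactly as the install script does.
    content = (version_dir / "SKILL.md").read_text(encoding="utf-8")
    digest = "sha256:" + hashlib.sha256(content.encode("utf-8")).hexdigest()

    meta = json.loads((version_dir / "version.json").read_text(encoding="utf-8"))
    if meta.get("content_hash") != digest:
        problems.append("content_hash does not match SKILL.md")
    if meta.get("version") != version:
        problems.append("version.json disagrees with current.json")

    index_path = workspace / "skills" / "_index" / "published.json"
    index = json.loads(index_path.read_text(encoding="utf-8"))
    if skill_name not in index.get("items", []):
        problems.append("skill missing from published index")
    return problems


if __name__ == "__main__":
    ws = Path("app-instance/runtime/instances/steven/beaver-home/workspace")
    for problem in verify_skill_install(ws, "mgm-galaxy-financial-chart-report-safe"):
        print("PROBLEM:", problem)
```

An empty result only shows that the on-disk files are mutually consistent; whether Beaver accepts the skill at load time is not something this check can confirm.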

Restart or start Steven container

If the container already exists:

docker restart app-instance-steven

If it does not exist, use the existing instance metadata and project scripts rather than creating a new instance id.

Demo task prompt

Send this as Steven's user task:

Use the MGM/Galaxy finance report skill to compare MGM China Holdings and Galaxy Entertainment using official public financial disclosures. Produce a concise board-style Markdown report with source URLs, a comparison table, chart-ready data, and a text/Mermaid chart section. Do not claim a generated image or saved chart file.

Expected planning shape

The planner should produce a team DAG with these task nodes:

collect_official_sources
→ extract_financial_metrics
→ validate_metrics
→ generate_chart_report

Expected node constraints:

collect_official_sources.allowed_tool_names = ["web_search", "web_fetch"]
extract_financial_metrics.allowed_tool_names = ["web_fetch"]
validate_metrics.allowed_tool_names = []
generate_chart_report.allowed_tool_names = []
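
The node constraints above amount to a per-node allow-list. The sketch below illustrates what tool scope filtering means for these four nodes; `filter_tools` is a hypothetical name, and Beaver's actual filtering code is not shown in this runbook and may differ.

```python
# Allow-lists copied from the expected node constraints in this runbook.
ALLOWED_TOOL_NAMES = {
    "collect_official_sources": ["web_search", "web_fetch"],
    "extract_financial_metrics": ["web_fetch"],
    "validate_metrics": [],
    "generate_chart_report": [],
}


def filter_tools(node_id: str, available: list[str]) -> list[str]:
    """Keep only the tools a node is allowed to call (hypothetical helper).

    Unknown nodes get no tools, which is the conservative default assumed here.
    """
    allowed = set(ALLOWED_TOOL_NAMES.get(node_id, []))
    return [name for name in available if name in allowed]


if __name__ == "__main__":
    registry = ["web_search", "web_fetch", "shell"]
    for node_id in ALLOWED_TOOL_NAMES:
        print(node_id, filter_tools(node_id, registry))
```

Under this reading, `validate_metrics` and `generate_chart_report` run tool-free, so any tool call recorded for those nodes in the event log would indicate that scope filtering was not applied.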

The created workers should remain generic:

node.agent.role = ""
node.agent.metadata.sub_agent_kind = "generic_skill_worker"

Expected complete outcome

If source collection and extraction produce required evidence:

Planner
→ TeamRunResult with required nodes completion_status=succeeded
→ task_outcome=complete
→ tool-free final synthesis
→ final Markdown report

The final output may include:

comparison table
chart-ready data
Mermaid
Markdown chart section
text bar chart fallback
final textual report

It must not claim:

generated chart image
generated chart file
saved chart artifact

Expected incomplete outcome

If official-source evidence is missing or web tools fail:

collect_official_sources.completion_status=partial
→ evidence_gaps populated
→ because block_downstream_on_partial=true, downstream nodes are blocked
→ task_outcome=incomplete
→ tool-free final synthesis still runs
→ final answer is prefixed with an incomplete notice

The final response should explain which required evidence was missing instead of fabricating metrics.
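
The complete and incomplete paths can be summarized as a small function over the linear node chain. This is a sketch under stated assumptions: `derive_outcome` is hypothetical, and the status strings "blocked" and "missing" are illustrative labels; only `succeeded`, `partial`, `task_outcome`, `incomplete_node_ids`, and `block_downstream_on_partial` come from this runbook.

```python
# Node order from the expected planning shape (a linear chain).
NODE_ORDER = [
    "collect_official_sources",
    "extract_financial_metrics",
    "validate_metrics",
    "generate_chart_report",
]


def derive_outcome(reported: dict, block_downstream_on_partial: bool = True) -> dict:
    """Resolve per-node status and the overall task outcome (hypothetical sketch)."""
    resolved = {}
    blocked = False
    for node_id in NODE_ORDER:
        if blocked:
            # Downstream of a non-succeeded node: never runs.
            status = "blocked"
        else:
            status = reported.get(node_id, "missing")
        resolved[node_id] = status
        if status != "succeeded" and block_downstream_on_partial:
            blocked = True
    incomplete = [n for n in NODE_ORDER if resolved[n] != "succeeded"]
    return {
        "task_outcome": "incomplete" if incomplete else "complete",
        "incomplete_node_ids": incomplete,
        "node_status": resolved,
    }


if __name__ == "__main__":
    print(derive_outcome({"collect_official_sources": "partial"}))
```

In both branches the final synthesis step still runs; the outcome only decides whether the answer carries the incomplete notice.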

Verification queries

After running the task, inspect Steven's event log:

WORKSPACE=app-instance/runtime/instances/steven/beaver-home/workspace
tail -n 200 "$WORKSPACE/tasks/events.jsonl"

Look for:

task_execution_planned
task_team_run_completed or task_team_run_failed
task_synthesis_completed

For task_execution_planned, verify:

planner_adaptation.template_used = true
planner_adaptation.selected_template = mgm-galaxy-financial-chart-report-safe
node_ids include collect_official_sources/extract_financial_metrics/validate_metrics/generate_chart_report

For task_synthesis_completed, verify:

task_outcome = complete | incomplete
incomplete_node_ids = [] for complete, otherwise populated
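
The checks above can be scripted instead of read off `tail` output. The sketch below makes several assumptions that this runbook does not confirm: each line of `events.jsonl` is one JSON object, the event name is stored under a top-level key (the default `event_type` is a guess; pass `name_key` to change it), and the fields to verify appear somewhere inside the event, so they are located by a recursive key search rather than a fixed path.

```python
import json
from pathlib import Path

SKILL_NAME = "mgm-galaxy-financial-chart-report-safe"
EXPECTED_NODE_IDS = {
    "collect_official_sources",
    "extract_financial_metrics",
    "validate_metrics",
    "generate_chart_report",
}


def load_events(path: Path) -> list:
    """Parse a JSON Lines file, skipping blank lines."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]


def find_key(obj, key):
    """Depth-first search for `key` in nested dicts and lists; None if absent."""
    if isinstance(obj, dict):
        if key in obj:
            return obj[key]
        children = list(obj.values())
    elif isinstance(obj, list):
        children = obj
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def check_events(events: list, name_key: str = "event_type") -> list[str]:
    """Return a list of problems; name_key is an assumed field name."""
    problems: list[str] = []
    # If the log holds several tasks, the last event of each name wins.
    by_name = {e.get(name_key): e for e in events if isinstance(e, dict)}

    planned = by_name.get("task_execution_planned")
    if planned is None:
        problems.append("no task_execution_planned event")
    else:
        if find_key(planned, "template_used") is not True:
            problems.append("template_used is not true")
        if find_key(planned, "selected_template") != SKILL_NAME:
            problems.append("unexpected selected_template")
        if not EXPECTED_NODE_IDS <= set(find_key(planned, "node_ids") or []):
            problems.append("node_ids missing expected nodes")

    synth = by_name.get("task_synthesis_completed")
    if synth is None:
        problems.append("no task_synthesis_completed event")
    else:
        outcome = find_key(synth, "task_outcome")
        incomplete = find_key(synth, "incomplete_node_ids") or []
        if outcome not in ("complete", "incomplete"):
            problems.append("unexpected task_outcome")
        elif (outcome == "complete") != (not incomplete):
            problems.append("incomplete_node_ids inconsistent with task_outcome")
    return problems
```

A typical call would be `check_events(load_events(Path(WORKSPACE) / "tasks" / "events.jsonl"))`; if the real schema names the event field differently, every check will report a missing event until `name_key` is corrected.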
