<!DOCTYPE html>
<html lang="zh-CN">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Beaver Backend Module Blueprint</title>
  <style>
    :root {
      --c-bg: #f8fafc;
      --c-canvas: #ffffff;
      --c-border: #cbd5e1;
      --c-border-strong: #94a3b8;
      --c-text-main: #0f172a;
      --c-text-sub: #64748b;
      --c-text-soft: #475569;
      --c-accent: #111827;
      --c-risk: #b91c1c;
      --c-ok: #166534;
      --font-ui: Inter, Helvetica, Arial, sans-serif;
      --font-mono: "JetBrains Mono", Consolas, "Liberation Mono", monospace;
    }

    * {
      box-sizing: border-box;
    }

    body {
      margin: 0;
      min-height: 100vh;
      background: var(--c-bg);
      color: var(--c-text-main);
      font-family: var(--font-ui);
      line-height: 1.55;
    }

    a {
      color: inherit;
      text-decoration: underline;
      text-decoration-thickness: 1px;
      text-underline-offset: 2px;
    }

    .page {
      width: min(1500px, 100%);
      margin: 0 auto;
      padding: 32px;
    }

    .diagram-canvas {
      background: var(--c-canvas);
      border: 1px solid var(--c-border);
      padding: 32px;
    }

    .diagram-header {
      display: grid;
      grid-template-columns: minmax(0, 1fr) auto;
      gap: 24px;
      align-items: start;
      border-bottom: 1px solid var(--c-border);
      padding-bottom: 18px;
      margin-bottom: 24px;
    }

    .diagram-title {
      margin: 0 0 6px;
      font-size: 24px;
      font-weight: 700;
      letter-spacing: 0;
    }

    .diagram-subtitle,
    .meta-line,
    .kicker {
      font-family: var(--font-mono);
      font-size: 11px;
      color: var(--c-text-sub);
      text-transform: uppercase;
      letter-spacing: 0.05em;
    }

    .meta-box {
      border: 1px solid var(--c-border);
      padding: 10px 12px;
      min-width: 280px;
      font-family: var(--font-mono);
      font-size: 12px;
      color: var(--c-text-soft);
    }

    .summary {
      display: grid;
      grid-template-columns: 1.15fr 0.85fr;
      gap: 16px;
      margin-bottom: 18px;
    }

    .panel,
    .module,
    .flow-box,
    .note,
    .table-wrap {
      border: 1px solid var(--c-border);
      background: var(--c-canvas);
    }

    .panel {
      padding: 16px;
    }

    .panel h2,
    .section h2 {
      margin: 0 0 10px;
      font-size: 17px;
      letter-spacing: 0;
    }

    .panel p,
    .module p,
    .note p {
      margin: 0;
      color: var(--c-text-soft);
      font-size: 13px;
    }

    .badge-row {
      display: flex;
      flex-wrap: wrap;
      gap: 6px;
      margin-top: 12px;
    }

    .badge {
      display: inline-block;
      border: 1px solid var(--c-border);
      padding: 2px 6px;
      font-family: var(--font-mono);
      font-size: 10px;
      color: var(--c-text-sub);
      white-space: nowrap;
    }

    .badge-solid {
      border-color: var(--c-accent);
      background: var(--c-accent);
      color: var(--c-canvas);
    }

    .section {
      margin-top: 24px;
      border-top: 1px solid var(--c-border);
      padding-top: 24px;
    }

    .section-head {
      display: grid;
      grid-template-columns: minmax(0, 1fr) auto;
      gap: 16px;
      align-items: end;
      margin-bottom: 14px;
    }

    .section-head p {
      margin: 4px 0 0;
      max-width: 980px;
      color: var(--c-text-sub);
      font-size: 13px;
    }

    .module-grid {
      display: grid;
      grid-template-columns: repeat(3, minmax(0, 1fr));
      gap: 12px;
    }

    .module {
      padding: 14px;
      display: flex;
      flex-direction: column;
      gap: 10px;
      min-height: 260px;
    }

    .module h3 {
      margin: 0;
      font-size: 15px;
      letter-spacing: 0;
    }

    .module-label {
      font-family: var(--font-mono);
      font-size: 11px;
      color: var(--c-text-sub);
    }

    .file-list,
    .bullets,
    .checks {
      margin: 0;
      padding-left: 18px;
      color: var(--c-text-soft);
      font-size: 13px;
    }

    .file-list {
      font-family: var(--font-mono);
      font-size: 11px;
      line-height: 1.55;
    }

    .flow {
      display: grid;
      grid-template-columns: repeat(7, minmax(0, 1fr));
      gap: 10px;
      align-items: stretch;
    }

    .flow-box {
      min-height: 118px;
      padding: 12px;
      position: relative;
    }

    .flow-box::after {
      content: "";
      position: absolute;
      top: 50%;
      right: -10px;
      width: 10px;
      border-top: 1px solid var(--c-border-strong);
    }

    .flow-box:last-child::after {
      display: none;
    }

    .flow-box h3 {
      margin: 0 0 8px;
      font-size: 13px;
    }

    .flow-box p {
      margin: 0;
      color: var(--c-text-soft);
      font-size: 12px;
    }

    .matrix {
      display: grid;
      grid-template-columns: 280px minmax(0, 1fr);
      border-top: 1px solid var(--c-border);
      border-left: 1px solid var(--c-border);
    }

    .matrix > div {
      border-right: 1px solid var(--c-border);
      border-bottom: 1px solid var(--c-border);
      padding: 10px 12px;
      font-size: 13px;
    }

    .matrix .key {
      font-family: var(--font-mono);
      color: var(--c-text-main);
      background: #f8fafc;
    }

    .table-wrap {
      overflow-x: auto;
    }

    table {
      width: 100%;
      min-width: 980px;
      border-collapse: collapse;
      font-size: 13px;
    }

    th,
    td {
      border-bottom: 1px solid var(--c-border);
      border-right: 1px solid var(--c-border);
      padding: 10px 12px;
      vertical-align: top;
      text-align: left;
    }

    th {
      font-family: var(--font-mono);
      font-size: 11px;
      text-transform: uppercase;
      letter-spacing: 0.05em;
      color: var(--c-text-sub);
      background: #f8fafc;
    }

    tr:last-child td {
      border-bottom: 0;
    }

    th:last-child,
    td:last-child {
      border-right: 0;
    }

    code,
    .mono {
      font-family: var(--font-mono);
      font-size: 0.92em;
      color: var(--c-text-main);
    }

    .risk {
      border-color: #fecaca;
    }

    .risk h3,
    .risk .module-label {
      color: var(--c-risk);
    }

    .ok {
      color: var(--c-ok);
      font-weight: 600;
    }

    .cols-2 {
      display: grid;
      grid-template-columns: repeat(2, minmax(0, 1fr));
      gap: 12px;
    }

    .cols-4 {
      display: grid;
      grid-template-columns: repeat(4, minmax(0, 1fr));
      gap: 12px;
    }

    .note {
      padding: 14px;
    }

    @media (max-width: 1180px) {
      .module-grid,
      .summary,
      .cols-4 {
        grid-template-columns: 1fr 1fr;
      }

      .flow {
        grid-template-columns: 1fr 1fr;
      }

      .flow-box::after {
        display: none;
      }
    }

    @media (max-width: 760px) {
      .page {
        padding: 12px;
      }

      .diagram-canvas {
        padding: 16px;
      }

      .diagram-header,
      .summary,
      .section-head,
      .module-grid,
      .cols-2,
      .cols-4,
      .flow,
      .matrix {
        grid-template-columns: 1fr;
      }

      .meta-box {
        min-width: 0;
      }
    }
  </style>
</head>
<body>
  <main class="page">
    <article class="diagram-canvas">
      <header class="diagram-header">
        <div>
          <div class="diagram-title">Beaver Backend Module Blueprint</div>
          <div class="diagram-subtitle">Flat Engineering Blueprint / app-instance/backend / 2026-05-25</div>
        </div>
        <div class="meta-box">
          SOURCE: <span class="mono">app-instance/backend</span><br>
          STYLE: <span class="mono">projcet_review/blueprinter.md</span><br>
          SCOPE: <span class="mono">backend code + tests + architecture docs</span><br>
          MULTI-PAGE: <a href="backend_blueprint/index.html">backend_blueprint/index.html</a>
        </div>
      </header>

      <section class="summary">
        <div class="panel">
          <h2>What the project does</h2>
          <p>
            The Beaver backend is an agent runtime built around user tasks. It accepts requests from Web, WebSocket, CLI, Gateway, Cron, or MCP,
            and uses the Main Agent to decide whether a request is a Task that needs tracking: simple questions are answered directly, while complex work enters Task mode. Task mode plans single-agent
            or team execution, runs the unified <code>AgentLoop</code>, selects skills and tools, calls the model, records factual evidence, and waits for the user to accept, revise, or abandon the result.
            Only Task evidence that the user has accepted is distilled into learnable skill candidates.
          </p>
          <div class="badge-row">
            <span class="badge-solid badge">UNIFIED ENGINE</span>
            <span class="badge">TASK MODE</span>
            <span class="badge">TEAM COORDINATOR</span>
            <span class="badge">SKILL LEARNING</span>
            <span class="badge">MCP TOOLS</span>
            <span class="badge">SCHEDULED TASKS</span>
          </div>
        </div>
        <div class="panel">
          <h2>The key architectural decision</h2>
          <p>
            The main agent, team nodes, and sub-agents do not each implement their own runtime; they all converge on the same <code>beaver.engine.AgentLoop</code>.
            Before making later changes, first confirm three things: that the entry layer is still thin, that the service layer only orchestrates, and that the real tool loop / prompt / provider / session logic is still consolidated inside the engine.
          </p>
          <div class="badge-row">
            <span class="badge">interfaces -> services</span>
            <span class="badge">services -> engine</span>
            <span class="badge">engine -> skills/tools/memory</span>
          </div>
        </div>
      </section>

      <section class="section">
        <div class="section-head">
          <div>
            <h2>Main execution flow</h2>
            <p>This is the most important path in the backend; module-by-module documentation changes should be aligned with this chain first.</p>
          </div>
          <div class="kicker">CHAT / TASK / ACCEPTANCE / LEARNING</div>
        </div>
        <div class="flow">
          <div class="flow-box">
            <h3>1. Entry intake</h3>
            <p><code>/api/chat</code>, WebSocket, CLI, Gateway, or Cron hands the user message to <code>AgentService</code>.</p>
          </div>
          <div class="flow-box">
            <h3>2. Intent routing</h3>
            <p><code>MainAgentRouter</code> combines the active task with the recent conversation to classify the message as simple / new_task / continue / revise / close / abandon.</p>
          </div>
          <div class="flow-box">
            <h3>3. Task modeling</h3>
            <p><code>TaskService</code> writes <code>tasks.json</code> and <code>events.jsonl</code>, and maintains the open/running/awaiting_acceptance/closed states.</p>
          </div>
          <div class="flow-box">
            <h3>4. Execution planning</h3>
            <p><code>TaskExecutionPlanner</code> has the auxiliary model choose single or team, and generates sequence / parallel / DAG nodes for a team.</p>
          </div>
          <div class="flow-box">
            <h3>5. Unified run</h3>
            <p><code>AgentLoop</code> freezes memory, selects skills, selects tools, builds the prompt, calls the provider, and runs the tool loop.</p>
          </div>
          <div class="flow-box">
            <h3>6. Factual evidence</h3>
            <p><code>EvidenceBuilder</code> aggregates run/team/tool evidence. Evidence only records facts; it does not judge, score, or gate.</p>
          </div>
          <div class="flow-box">
            <h3>7. Acceptance and learning</h3>
            <p>After the user accepts a Task, accepted task evidence and learning candidates are generated; the worker may generate a draft but never approves or publishes automatically.</p>
          </div>
        </div>
      </section>

      <section class="section">
        <div class="section-head">
          <div>
            <h2>Module overview</h2>
            <p>Each module below states its responsibility, its logic, how it works in practice, and its key files.</p>
          </div>
          <div class="kicker">MODULE RESPONSIBILITY MAP</div>
        </div>

        <div class="module-grid">
          <section class="module">
            <div class="module-label">foundation</div>
            <h3>Base configuration, events, and shared models</h3>
            <p>Loads instance-level configuration, defines the provider/MCP/AuthZ/backend identity schemas, and provides the message bus and the cron data models.</p>
            <ul class="bullets">
              <li>Configuration sources, in priority order: <code>BEAVER_CONFIG_PATH</code>, <code>BEAVER_HOME/config.json</code>, and <code>.beaver/config.json</code> under the workspace.</li>
              <li><code>BeaverConfig.resolve_provider_target()</code> derives the run target from the default model, an explicit provider, and the configured credentials.</li>
              <li><code>MessageBus</code> uses async queues to carry gateway inbound/outbound traffic.</li>
              <li><code>CronSchedule/CronJob/CronRunRecord</code> are the persistence models for scheduled tasks.</li>
            </ul>
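<p>The configuration source priority listed above can be sketched as a small lookup. This is a minimal illustration, not the loader in <code>beaver/foundation/config/loader.py</code>: the function name <code>resolve_config_path</code> is hypothetical, and it assumes that the list is ordered highest priority first and that a location whose file is missing simply falls through to the next one.</p>

```python
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def resolve_config_path(env: Mapping[str, str], workspace: Path) -> Optional[Path]:
    """Return the first config file that exists, checked in priority order.

    Hypothetical helper: the fall-through behaviour is an assumption.
    """
    candidates = []
    if env.get("BEAVER_CONFIG_PATH"):
        candidates.append(Path(env["BEAVER_CONFIG_PATH"]))
    if env.get("BEAVER_HOME"):
        candidates.append(Path(env["BEAVER_HOME"]) / "config.json")
    candidates.append(workspace / ".beaver" / "config.json")
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


# Demo in a throwaway directory: the workspace file is used only when
# no environment variable points at an existing file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    home = root / "home"
    workspace = root / "ws"
    (workspace / ".beaver").mkdir(parents=True)
    home.mkdir()
    (workspace / ".beaver" / "config.json").write_text("{}")
    (home / "config.json").write_text("{}")

    from_workspace = resolve_config_path({}, workspace)
    from_home = resolve_config_path({"BEAVER_HOME": str(home)}, workspace)
    workspace_used = from_workspace == workspace / ".beaver" / "config.json"
    home_used = from_home == home / "config.json"
```

<p>Passing the environment in as a mapping, rather than reading <code>os.environ</code> directly, keeps a lookup like this easy to test.</p>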
            <ul class="file-list">
              <li>beaver/foundation/config/schema.py</li>
              <li>beaver/foundation/config/loader.py</li>
              <li>beaver/foundation/events/message_bus.py</li>
              <li>beaver/foundation/models/cron.py</li>
              <li>beaver/foundation/embedding.py</li>
            </ul>
          </section>

          <section class="module">
            <div class="module-label">interfaces</div>
            <h3>Thin entry layer</h3>
            <p>Translates input from HTTP, WebSocket, CLI, Gateway, and the MCP server into service-layer calls; it should not hold core execution logic.</p>
            <ul class="bullets">
              <li>The web app lifespan starts <code>AgentService</code> in running mode, <code>CronService</code>, and the optional skill learning worker.</li>
              <li><code>/api/chat</code> and <code>/ws/{session_id}</code> both delegate to <code>_run_web_direct()</code> / <code>AgentService</code>.</li>
              <li>The file API has two parts: chat attachments under <code>workspace/files/&lt;id&gt;</code>, and workspace browse/upload/preview.</li>
              <li>The MCP interface exposes the memory/tools servers; the Gateway bridges channels through <code>MessageBus</code>.</li>
            </ul>
            <ul class="file-list">
              <li>beaver/interfaces/web/app.py</li>
              <li>beaver/interfaces/web/files.py</li>
              <li>beaver/interfaces/cli/main.py</li>
              <li>beaver/interfaces/gateway/main.py</li>
              <li>beaver/interfaces/mcp/*.py</li>
              <li>beaver/interfaces/channels/*.py</li>
            </ul>
          </section>

          <section class="module">
            <div class="module-label">services</div>
            <h3>Application service orchestration layer</h3>
            <p>Turns entry requests into internal flows: agent runs, task mode, cron, team, memory, skill hub, and process projection.</p>
            <ul class="bullets">
              <li><code>AgentService</code> is the main entry point and distinguishes direct mode from running mode.</li>
              <li><code>_process_with_main_agent()</code> classifies intent first, then decides whether to enter a Task.</li>
              <li><code>_run_task_mode()</code> manages task planning, team execution, main agent synthesis, evidence recording, and the user acceptance state.</li>
              <li><code>CronService</code> persists scheduled tasks, computes the next run, and records history.</li>
              <li><code>SessionProcessProjector</code> projects hidden task/team events into the frontend process view.</li>
            </ul>
            <ul class="file-list">
              <li>beaver/services/agent_service.py</li>
              <li>beaver/services/team_service.py</li>
              <li>beaver/services/cron_service.py</li>
              <li>beaver/services/process_service.py</li>
              <li>beaver/services/skillhub_service.py</li>
            </ul>
          </section>

          <section class="module">
            <div class="module-label">engine</div>
            <h3>Unified agent runtime core</h3>
            <p>This is the core shared by the main agent and delegated agents. It assembles the runtime, builds context, selects skills and tools, drives the provider/tool loop, and records every run event.</p>
            <ul class="bullets">
              <li><code>EngineLoader</code> assembles the session, memory, run store, skill store, tool registry, MCP manager, and task/evidence services.</li>
              <li><code>AgentLoop.process_direct()</code> is the main chain for a single run; in running mode, runs can only go through the queue via <code>submit_direct()</code>.</li>
              <li>Each run captures its own frozen memory snapshot, so parallel team runs do not share a snapshot and contaminate one another.</li>
              <li>During a run it writes events such as <code>run_started</code>, skill activation, tool selection, LLM request, tool result, and run completed/failed.</li>
            </ul>
            <ul class="file-list">
              <li>beaver/engine/loader.py</li>
              <li>beaver/engine/loop.py</li>
              <li>beaver/engine/context/builder.py</li>
              <li>beaver/engine/providers/*.py</li>
              <li>beaver/engine/session/*.py</li>
            </ul>
          </section>

          <section class="module">
            <div class="module-label">providers</div>
            <h3>Model provider abstraction and routing</h3>
            <p>Unifies different model gateways behind <code>LLMProvider.chat()</code>, which returns a uniform <code>LLMResponse</code> and <code>ToolCallRequest</code>.</p>
            <ul class="bullets">
              <li><code>ProviderRuntime</code> describes the resolved provider, model, api mode, credentials, headers, and routing.</li>
              <li><code>ProviderBundle</code> holds the main, fallback, auxiliary, and embedding runtimes together.</li>
              <li><code>FallbackProviderChain</code> switches to the fallback for a single call when the main provider returns an error or raises an exception.</li>
              <li>Implementations include LiteLLM, Anthropic, the OpenAI Codex API, and OpenAI-compatible custom providers.</li>
            </ul>
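<p>The per-call fallback described for <code>FallbackProviderChain</code> can be sketched as follows. The <code>Response</code> class, the callable-based provider type, and <code>chat_with_fallback</code> are hypothetical stand-ins, not Beaver's <code>LLMResponse</code> or <code>LLMProvider</code> API; the sketch only assumes that a failure shows up either as a raised exception or as an error field on the response.</p>

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Response:
    """Hypothetical stand-in for LLMResponse: text plus an optional error."""
    content: str = ""
    error: Optional[str] = None


# A provider is modeled as a plain callable that takes the prompt.
Provider = Callable[[str], Response]


def chat_with_fallback(main: Provider, fallback: Provider, prompt: str) -> Response:
    """Try main; use fallback for this one call if main raises or errors."""
    try:
        response = main(prompt)
    except Exception:
        return fallback(prompt)
    if response.error is not None:
        return fallback(prompt)
    return response


def broken(prompt: str) -> Response:
    raise RuntimeError("gateway unreachable")


def rate_limited(prompt: str) -> Response:
    return Response(error="rate limited")


def healthy(prompt: str) -> Response:
    return Response(content="echo: " + prompt)


after_exception = chat_with_fallback(broken, healthy, "hi")
after_error = chat_with_fallback(rate_limited, healthy, "hi")
```

<p>In this sketch the decision is made inside a single call and no state is kept, so the next call starts with the main provider again.</p>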
            <ul class="file-list">
              <li>beaver/engine/providers/base.py</li>
              <li>beaver/engine/providers/runtime.py</li>
              <li>beaver/engine/providers/factory.py</li>
              <li>beaver/engine/providers/registry.py</li>
              <li>beaver/engine/providers/litellm.py</li>
              <li>beaver/engine/providers/anthropic.py</li>
              <li>beaver/engine/providers/codex.py</li>
            </ul>
          </section>

          <section class="module">
            <div class="module-label">tasks</div>
            <h3>Internal Tasks, factual evidence, and user acceptance</h3>
            <p>Turns user requests that need execution and tracking into Tasks that can be persisted, retried, and accepted.</p>
            <ul class="bullets">
              <li><code>MainAgentRouter</code> uses an LLM JSON decision to distinguish simple/task/continue/revise/close/abandon.</li>
              <li><code>TaskExecutionPlanner</code> has the auxiliary model choose single or team, and caps a team at 6 nodes.</li>
              <li><code>TaskSkillResolver</code> matches a published skill to each team node; when nothing matches, it generates one-run ephemeral guidance.</li>
              <li><code>EvidenceBuilder</code> only records factual evidence; whether a Task is complete is decided solely by user acceptance.</li>
            </ul>
            <ul class="file-list">
              <li>beaver/tasks/models.py</li>
              <li>beaver/tasks/service.py</li>
              <li>beaver/tasks/router.py</li>
              <li>beaver/tasks/planner.py</li>
              <li>beaver/tasks/skill_resolver.py</li>
              <li>beaver/tasks/evidence.py</li>
              <li>beaver/tasks/store.py</li>
            </ul>
          </section>

          <section class="module">
            <div class="module-label">coordinator</div>
            <h3>Multi-agent / team orchestration</h3>
            <p>Turns the team execution graph into multiple delegated runs. What v1 actually implements is sequence, parallel, and DAG; other strategies are reserved but not implemented yet.</p>
            <ul class="bullets">
              <li><code>ExecutionGraph.validate()</code> checks that nodes are unique, dependencies exist, the graph is acyclic, and the strategy is implemented.</li>
              <li><code>TeamGraphScheduler</code> runs nodes according to the strategy; a failed dependency marks the downstream nodes as blocked.</li>
              <li><code>LocalAgentRunner</code> creates a child session for each node and still calls the same <code>AgentLoop</code>.</li>
              <li>The agent registry and LocalSubagentStore support managing specialists/subagents, but the current Task main path mostly uses the generic skill worker.</li>
            </ul>
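<p>The structural checks listed for <code>ExecutionGraph.validate()</code> (unique nodes, existing dependencies, no cycles) can be sketched with plain dictionaries. The node shape with <code>id</code> and <code>depends_on</code> keys and the function <code>validate_graph</code> are assumptions made for illustration; the strategy check is left out, and the real method may report problems differently.</p>

```python
from typing import Dict, List


def validate_graph(nodes: List[dict]) -> List[str]:
    """Return a list of problems; an empty list means the graph is valid.

    Hypothetical sketch: nodes are dicts with "id" and optional "depends_on".
    """
    problems: List[str] = []
    ids = [node["id"] for node in nodes]
    if len(ids) != len(set(ids)):
        problems.append("duplicate node id")
    deps: Dict[str, List[str]] = {
        node["id"]: list(node.get("depends_on", [])) for node in nodes
    }
    for node_id, needed in deps.items():
        for dep in needed:
            if dep not in deps:
                problems.append(f"{node_id} depends on unknown node {dep}")
    if problems:
        return problems
    # Kahn-style peel: repeatedly remove nodes whose dependencies are all done.
    # If nodes remain but none is ready, the remainder contains a cycle.
    done = set()
    remaining = dict(deps)
    while remaining:
        ready = [n for n, needed in remaining.items() if all(d in done for d in needed)]
        if not ready:
            problems.append("cycle detected")
            break
        for n in ready:
            done.add(n)
            del remaining[n]
    return problems


dag = [
    {"id": "a"},
    {"id": "b", "depends_on": ["a"]},
    {"id": "c", "depends_on": ["a", "b"]},
]
cyclic = [{"id": "a", "depends_on": ["b"]}, {"id": "b", "depends_on": ["a"]}]
dag_problems = validate_graph(dag)
cyclic_problems = validate_graph(cyclic)
```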
            <ul class="file-list">
              <li>beaver/coordinator/models.py</li>
              <li>beaver/coordinator/execution/scheduler.py</li>
              <li>beaver/coordinator/local.py</li>
              <li>beaver/coordinator/registry/*.py</li>
              <li>beaver/coordinator/subagents.py</li>
            </ul>
          </section>

          <section class="module">
            <div class="module-label">tools</div>
            <h3>Tool contracts, selection, and execution</h3>
            <p>Exposes builtin tools and MCP tools uniformly as provider function schemas, and executes the calls returned by the model inside the tool loop.</p>
            <ul class="bullets">
              <li><code>ToolSpec</code> is the source of truth for tool metadata and schema; it can export an MCP descriptor and a provider schema.</li>
              <li><code>ToolAssembler</code> selects the tools for the current turn from always tools, skill tool hints, and embedding retrieval.</li>
              <li><code>ToolExecutor</code> accepts both <code>ToolCallRequest</code> and OpenAI-style dicts, parses the arguments, and calls the registry.</li>
              <li>Builtin tools cover memory, session search, filesystem, web fetch/search, terminal/process/code, cron, skill admin, and delegation utilities.</li>
            </ul>
            <ul class="file-list">
              <li>beaver/tools/base.py</li>
              <li>beaver/tools/registry/tool_registry.py</li>
              <li>beaver/tools/runtime/executor.py</li>
              <li>beaver/tools/assembler/task_assembler.py</li>
              <li>beaver/tools/builtins/*.py</li>
              <li>beaver/tools/mcp/wrapper.py</li>
            </ul>
          </section>

          <section class="module">
            <div class="module-label">skills</div>
            <h3>Skill catalog, selection, lifecycle, and learning</h3>
            <p>Discovers, selects, injects, versions, reviews, publishes, and automatically learns Beaver skills.</p>
            <ul class="bullets">
              <li><code>SkillsLoader</code> reads workspace published skills, plugin/extra dirs, and builtin skills, and parses frontmatter and tool hints.</li>
              <li><code>SkillAssembler</code> recalls candidates by embedding, then uses the LLM for shortlist/final selection, and returns a <code>SkillContext</code>.</li>
              <li><code>SkillSpecStore</code> manages <code>skill.json</code>, <code>current.json</code>, versions, drafts, and reviews.</li>
              <li><code>SkillLearningPipelineService</code> coordinates candidate -> draft -> safety/eval -> review -> approve -> publish.</li>
            </ul>
            <ul class="file-list">
              <li>beaver/skills/catalog/*.py</li>
              <li>beaver/skills/assembler/*.py</li>
              <li>beaver/skills/specs/*.py</li>
              <li>beaver/skills/drafts/service.py</li>
              <li>beaver/skills/reviews/service.py</li>
              <li>beaver/skills/publisher/service.py</li>
              <li>beaver/skills/learning/*.py</li>
            </ul>
          </section>

          <section class="module">
            <div class="module-label">memory</div>
            <h3>Sessions, long-term memory, run memory, and learning state</h3>
            <p>Stores conversation events, long-term memory, run receipts, skill effects, skill learning candidates, and safety/eval reports.</p>
            <ul class="bullets">
              <li>Sessions are stored in SQLite, including <code>sessions</code>, <code>messages</code>, and the FTS5 <code>messages_fts</code>.</li>
              <li>Long-term memory has only two buckets, <code>MEMORY.md</code> and <code>USER.md</code>; content is scanned for prompt injection / secret exfiltration risk before it is written.</li>
              <li>Run memory stores <code>RunRecord</code> and <code>SkillEffectRecord</code> as JSONL.</li>
              <li>The skill learning store maintains candidate state, performance snapshots, safety reports, and eval reports.</li>
            </ul>
            <ul class="file-list">
              <li>beaver/engine/session/*.py</li>
              <li>beaver/memory/curated/*.py</li>
              <li>beaver/memory/runs/*.py</li>
              <li>beaver/memory/skills/*.py</li>
              <li>beaver/memory/search/transcript_store.py</li>
            </ul>
          </section>

          <section class="module">
            <div class="module-label">integrations</div>
            <h3>External system and protocol integrations</h3>
            <p>Connects AuthZ, MCP, and Outlook. The WhatsApp, A2A, and providers directories are currently mostly placeholders.</p>
            <ul class="bullets">
              <li><code>MCPConnectionManager</code> supports stdio and streamable HTTP MCP servers, and registers remote tools as <code>mcp_{server}_{tool}</code>.</li>
              <li>Remote MCP can use the AuthZ backend token mode, exchanging the backend identity for a bearer token.</li>
              <li>The Outlook integration connects through AuthZ or direct credentials, maintains workspace meta, and provides status/overview/messages/events/detail.</li>
              <li><code>AuthzClient</code> handles user/backend registration, permission queries, and token issuance.</li>
            </ul>
            <ul class="file-list">
              <li>beaver/integrations/mcp/connection.py</li>
              <li>beaver/integrations/authz/client.py</li>
              <li>beaver/integrations/outlook/__init__.py</li>
              <li>beaver/integrations/a2a/__init__.py</li>
              <li>beaver/integrations/whatsapp/__init__.py</li>
            </ul>
          </section>

          <section class="module risk">
            <div class="module-label">permissions</div>
            <h3>Permissions and governance layer</h3>
            <p>The directory exists but is currently an almost empty skeleton. Actual permission constraints are scattered across individual tools, workspace path validation, memory safety, and skill draft safety.</p>
            <ul class="bullets">
              <li><code>permissions/guards</code>, <code>policies</code>, and <code>profiles</code> contain only docstrings.</li>
              <li><code>ToolsConfig.restrict_to_workspace</code> already exists in the config schema, but each tool needs to be checked to see whether it actually enforces it.</li>
              <li>If capability governance is added later, tool execution, MCP sensitive flags, and provider/terminal/file operations should all be wired in here.</li>
            </ul>
            <ul class="file-list">
              <li>beaver/permissions/__init__.py</li>
              <li>beaver/permissions/guards/__init__.py</li>
              <li>beaver/permissions/policies/__init__.py</li>
              <li>beaver/permissions/profiles/__init__.py</li>
            </ul>
          </section>
        </div>
      </section>

      <section class="section">
        <div class="section-head">
          <div>
            <h2>Where core data lands</h2>
            <p>These files and databases are the most important sources of truth after a run. When checking whether behavior matches expectations, look here first.</p>
          </div>
          <div class="kicker">PERSISTENCE MAP</div>
        </div>
        <div class="table-wrap">
          <table>
            <thead>
              <tr>
                <th>Data</th>
                <th>Location</th>
                <th>Writer</th>
                <th>Purpose</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>Session / transcript event stream</td>
                <td><code>&lt;workspace&gt;/sessions/state.db</code></td>
                <td><code>SessionManager</code> / <code>AgentLoop</code></td>
                <td>Stores the visible conversation, hidden system snapshots, tool calls/results, run lifecycle, usage, and FTS search.</td>
              </tr>
              <tr>
                <td>Task records</td>
                <td><code>&lt;workspace&gt;/tasks/tasks.json</code></td>
                <td><code>TaskService</code></td>
                <td>Stores task goal/status/run_ids/skill_names/acceptance history.</td>
              </tr>
              <tr>
                <td>Task events</td>
                <td><code>&lt;workspace&gt;/tasks/events.jsonl</code></td>
                <td><code>TaskService</code></td>
                <td>Stores created/run_started/run_completed/evidence_recorded/accepted/revised/closed/abandoned events.</td>
              </tr>
              <tr>
                <td>Curated memory</td>
                <td><code>&lt;workspace&gt;/memory/curated/MEMORY.md</code>, <code>USER.md</code></td>
                <td><code>MemoryTool</code> / <code>MemoryStore</code></td>
                <td>Stable facts injected into the prompt over the long term; each run freezes a snapshot.</td>
              </tr>
              <tr>
                <td>Run receipts / skill effects</td>
                <td><code>&lt;workspace&gt;/memory/runs/*.jsonl</code></td>
                <td><code>AgentLoop</code> / <code>AgentService</code> user acceptance entry point</td>
                <td>Raw execution evidence for skill learning, user acceptance events, and the final accepted run marker.</td>
              </tr>
              <tr>
                <td>Skills lifecycle</td>
                <td><code>&lt;workspace&gt;/skills/&lt;name&gt;/...</code></td>
                <td><code>SkillSpecStore</code> / draft/review/publisher services</td>
                <td>Published versions, drafts, reviews, current version, supporting files.</td>
              </tr>
              <tr>
                <td>Skill learning state</td>
                <td><code>&lt;workspace&gt;/memory/skills/...</code></td>
                <td><code>SkillLearningStore</code></td>
                <td>Candidates, performance snapshots, safety reports, eval reports.</td>
              </tr>
              <tr>
                <td>Cron jobs and runs</td>
                <td><code>&lt;workspace&gt;/cron/jobs.json</code></td>
                <td><code>CronService</code></td>
                <td>Scheduled task configuration, next_run, history, notification/task linkage.</td>
              </tr>
              <tr>
                <td>Agent registry / subagents</td>
                <td><code>&lt;workspace&gt;/agents/registry.json</code>, <code>*_agent/AGENTS.json</code></td>
                <td><code>AgentRegistry</code> / <code>LocalSubagentStore</code></td>
                <td>Manages builtin/workspace/learned agents and local sub-agent workspaces.</td>
              </tr>
            </tbody>
          </table>
        </div>
      </section>
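<p>As a rough illustration of the append-only shape of <code>tasks/events.jsonl</code>, the sketch below writes one JSON object per line and reads the history back. The field names <code>task_id</code> and <code>type</code> and both helper functions are hypothetical; the table above only names the file, its writer, and the event kinds.</p>

```python
import json
import tempfile
from pathlib import Path
from typing import Iterator


def append_event(path: Path, task_id: str, kind: str) -> None:
    """Append one event as a single JSON line (hypothetical record shape)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps({"task_id": task_id, "type": kind}) + "\n")


def read_events(path: Path) -> Iterator[dict]:
    """Yield events in the order they were written, skipping blank lines."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)


# Replay a task history in a throwaway workspace directory.
with tempfile.TemporaryDirectory() as tmp:
    events_path = Path(tmp) / "tasks" / "events.jsonl"
    kinds = ("created", "run_started", "run_completed", "evidence_recorded", "accepted")
    for kind in kinds:
        append_event(events_path, "task-1", kind)
    history = [event["type"] for event in read_events(events_path)]
```

<p>Because each line is an independent JSON document, a reader can rebuild the task history by replaying the file from the top.</p>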

      <section class="section">
        <div class="section-head">
          <div>
            <h2>Key flows in detail</h2>
            <p>These flows are where module-by-module changes are most likely to drift.</p>
          </div>
          <div class="kicker">CONTROL FLOWS</div>
        </div>
        <div class="cols-2">
          <div class="panel">
            <h2>Simple chat</h2>
            <ul class="checks">
              <li>The entry point calls <code>AgentService._process_with_main_agent()</code>.</li>
              <li><code>MainAgentRouter</code> returns a non-task decision.</li>
              <li>Skill assembly and tools are turned off: <code>include_skill_assembly=False</code>, <code>include_tools=False</code>.</li>
              <li>Session/run events are still written through <code>AgentLoop</code>, but no Task is created.</li>
            </ul>
          </div>
          <div class="panel">
            <h2>Task mode single</h2>
            <ul class="checks">
              <li>Create or reuse an open task.</li>
              <li>The planner returns single, and the main agent runs directly.</li>
              <li>A <code>TaskEvidencePacket</code> is built after the run.</li>
              <li>After the run the status becomes <code>awaiting_acceptance</code>; the user's accept/revise/abandon decides whether the task is closed, revised, or abandoned.</li>
            </ul>
          </div>
          <div class="panel">
            <h2>Task mode team</h2>
            <ul class="checks">
              <li>The planner generates an <code>ExecutionGraph</code>.</li>
              <li><code>TaskSkillResolver</code> binds a published skill or ephemeral guidance to each node.</li>
              <li><code>TeamService</code> runs the nodes; each node still calls <code>AgentLoop</code>.</li>
              <li>Main agent synthesis uses the team evidence and usually disables tool calls, to avoid repeating work the sub-agents have already done.</li>
            </ul>
          </div>
          <div class="panel">
            <h2>Skill learning</h2>
            <ul class="checks">
              <li>Each run records the activated skill receipt and effect.</li>
              <li>Candidates are generated only after the user accepts the task; the evidence includes all runs of the task and marks final_accepted_run_id.</li>
              <li>The worker only generates drafts and runs safety/eval; it does not approve or publish automatically.</li>
              <li>Publishing requires an approved review, passing safety, and no failed eval; high risk additionally requires explicit confirmation.</li>
            </ul>
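<p>The publish conditions in the last item of the list above can be written as one gate function. <code>DraftState</code>, its fields, and <code>publish_blockers</code> are hypothetical names chosen for this sketch; only the four conditions themselves come from the text.</p>

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DraftState:
    """Hypothetical snapshot of the gates a draft passes before publishing."""
    review_status: str = "pending"
    safety_passed: bool = False
    eval_results: List[bool] = field(default_factory=list)
    high_risk: bool = False
    explicitly_confirmed: bool = False


def publish_blockers(draft: DraftState) -> List[str]:
    """Return every reason the draft cannot be published yet."""
    blockers = []
    if draft.review_status != "approved":
        blockers.append("review not approved")
    if not draft.safety_passed:
        blockers.append("safety not passing")
    if not all(draft.eval_results):
        blockers.append("failed eval")
    if draft.high_risk and not draft.explicitly_confirmed:
        blockers.append("high risk needs explicit confirmation")
    return blockers


ready = DraftState("approved", True, [True, True])
risky = DraftState("approved", True, [True], high_risk=True)
ready_blockers = publish_blockers(ready)
risky_blockers = publish_blockers(risky)
```

<p>Returning the list of blockers instead of a single boolean makes it clear which gate is still open; an empty list means publishing may proceed.</p>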
          </div>
        </div>
      </section>

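      <section class="section">
        <div class="section-head">
          <div>
            <h2>Follow-up review checklist</h2>
            <p>These questions are suited to brainstorming / grill-me sessions that check, module by module, whether your ideas match the existing project.</p>
          </div>
          <div class="kicker">REVIEW PROMPTS</div>
        </div>
        <div class="matrix">
          <div class="key">Product goal</div>
          <div>This backend currently looks more like a task-oriented agent runtime than an ordinary chat backend. Do you want to keep the Task runtime's Plan -> Run -> Evidence -> User Acceptance flow as the main experience, or demote it to an optional advanced mode?</div>
          <div class="key">Main entry boundary</div>
          <div><code>interfaces/web/app.py</code> already exceeds 3000 lines and contains auth, files, skills, cron, chat, and more. Should the routes be split into modules later, or should it stay a single file for now to reduce migration risk?</div>
          <div class="key">Degree of Task automation</div>
          <div>The Main Agent currently turns complex requests into Tasks automatically. Do you accept the model's classification error? Should creating a Task require explicit user confirmation?</div>
          <div class="key">Team execution strategy</div>
          <div>Only sequence / parallel / DAG are actually implemented; the other strategies are just reserved enum values. Should more coordinators be supported, or should v1 stick to three stable strategies?</div>
          <div class="key">Role of the agent registry</div>
          <div>The registry/search/target resolver already exist, but the Task main path mostly binds skills rather than specialist agents. Do you want team nodes to look for a specialist agent first, or keep using the generic skill worker?</div>
          <div class="key">Permission governance</div>
          <div>The permissions directory is currently a skeleton. Do terminal, filesystem, web, MCP, Outlook, and similar capabilities need a unified policy gate?</div>
          <div class="key">Skill learning loop</div>
          <div>Candidate generation should depend on task accepted. Do you want to learn only from task evidence the user accepted, or allow candidates to be created manually from abandoned/revised history?</div>
          <div class="key">External integrations</div>
          <div>Outlook/AuthZ/MCP are already fairly concrete; A2A/WhatsApp are still placeholders. Should the next priority be filling in protocols, or tightening permissions and error handling in the existing integrations?</div>
        </div>
      </section>
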
      <section class="section">
        <div class="section-head">
          <div>
            <h2>Code observations and risks</h2>
            <p>These are not final conclusions or change recommendations, only deviations noticed while reading the code that are worth checking one by one later.</p>
          </div>
          <div class="kicker">OPEN RISKS</div>
        </div>
        <div class="cols-2">
          <div class="note risk">
            <h3>The scheduled Task path has an obvious variable error</h3>
            <p>
              When it updates the assistant event payload at its end, <code>AgentService.run_scheduled_task()</code> references <code>job.id</code>, <code>run.scheduled_run_id</code>, and <code>job.name</code>,
              but the function's parameters are only <code>cron_job_id</code>, <code>cron_job_name</code>, and <code>scheduled_run_id</code>. If execution reaches this point, the path raises a <code>NameError</code>.
            </p>
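<p>The failure mode is easy to reproduce in isolation. The sketch below is not the body of <code>AgentService.run_scheduled_task()</code>; it is a hypothetical reduction that keeps only the three parameter names from the description, with made-up payload keys, to show why the reference fails at call time and one way the payload could be built instead.</p>

```python
def run_scheduled_task_buggy(cron_job_id: str, cron_job_name: str, scheduled_run_id: str) -> dict:
    # Mirrors the reported mistake: `job` and `run` are never defined here,
    # so Python only notices when this line actually executes.
    return {
        "cron_job_id": job.id,
        "scheduled_run_id": run.scheduled_run_id,
        "cron_job_name": job.name,
    }


def run_scheduled_task_fixed(cron_job_id: str, cron_job_name: str, scheduled_run_id: str) -> dict:
    # One possible fix: build the payload from the parameters that do exist.
    # The payload keys are illustrative, not the real event schema.
    return {
        "cron_job_id": cron_job_id,
        "scheduled_run_id": scheduled_run_id,
        "cron_job_name": cron_job_name,
    }


try:
    run_scheduled_task_buggy("job-1", "nightly", "run-1")
    raised = False
except NameError:
    raised = True

payload = run_scheduled_task_fixed("job-1", "nightly", "run-1")
```

<p>Since the error only surfaces when the line runs, a test that drives the scheduled path to completion is the kind of check that would catch it.</p>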
          </div>
          <div class="note risk">
            <h3>The permissions layer is not yet a real execution gate</h3>
            <p>
              The <code>permissions</code> directory is an empty skeleton; actual protection is scattered across tool implementations and path validation. If terminal, filesystem, and MCP sensitive tools are opened up later, a unified pre-execution policy is needed.
            </p>
          </div>
          <div class="note risk">
            <h3>Web auth is local, single-user style</h3>
            <p>
              The local auth file is read and written with username/password fields, and token/handoff code is used for the frontend session. If the target is a multi-user or public-facing backend, password storage, token lifetime, and permission boundaries need to be re-evaluated.
            </p>
          </div>
          <div class="note risk">
            <h3>Skill eval is currently a lightweight heuristic</h3>
            <p>
              <code>SkillDraftEvaluator</code> produces a bounded report from historical accepted task evidence and the draft's length/content; it is not a true replay. It belongs only to skill draft governance, not to the Task runtime.
            </p>
          </div>
          <div class="note risk">
            <h3>The interface layer is too large</h3>
            <p>
              <code>interfaces/web/app.py</code> carries the app factory, lifespan, auth, provider config, sessions, files, agents, MCP, Outlook, skills, cron, chat, and helper functions all at once.
              Later changes can easily cause cross-feature regressions.
            </p>
          </div>
          <div class="note">
            <h3>Stable points already in place</h3>
            <p>
              <span class="ok">Dependable:</span> the unified <code>AgentLoop</code>, session event stream, Task evidence/acceptance states, team graph v1, skill lifecycle gates, MCP wrapper, and workspace path containment.
            </p>
          </div>
        </div>
      </section>

      <section class="section">
        <div class="section-head">
          <div>
            <h2>Test coverage signals</h2>
            <p>Unit tests cover most of the backend's current key behaviors and can serve as a regression index when revising the documentation later.</p>
          </div>
          <div class="kicker">TEST INDEX</div>
        </div>
        <div class="cols-4">
          <div class="panel">
            <h2>Task / acceptance</h2>
            <p><code>test_task_mode_feedback.py</code>, <code>test_task_evidence.py</code>, <code>test_task_execution_planner.py</code>, <code>test_task_skill_resolver.py</code></p>
          </div>
          <div class="panel">
            <h2>Engine / providers</h2>
            <p><code>test_websocket_chat.py</code>, <code>test_main_agent_router.py</code>, <code>test_litellm_thinking_mode.py</code>, <code>test_imports.py</code></p>
          </div>
          <div class="panel">
            <h2>Team / process</h2>
            <p><code>test_agent_team_v1.py</code>, <code>test_agent_registry_resolver.py</code>, <code>test_process_projection.py</code></p>
          </div>
          <div class="panel">
            <h2>Skills / tools / web</h2>
            <p><code>test_phase5_skills_runtime.py</code>, <code>test_skill_learning_*.py</code>, <code>test_tool_assembler.py</code>, <code>test_web_files_api.py</code></p>
          </div>
        </div>
      </section>
    </article>
  </main>
</body>
</html>