steven_li 16347caf5e Add task detail live execution design

2026-05-26 10:55:16 +08:00

Task Detail Live Execution Design

Purpose

Task detail should be a live execution surface for ordinary users. It should answer "what is Beaver doing now?", "what has already happened?", "what changed because of a tool or agent result?", and "what can I inspect or accept?" without forcing the user to wait for a final answer.

This page is not primarily a developer audit view. It should expose enough execution detail to create confidence, while keeping raw payloads, long tool output, and debug metadata behind progressive disclosure.

User Experience Principles

Show progress as a chronological card feed that grows while the task runs.
Prefer user-facing explanations over raw internal event names.
Show skill selection, tool usage, tool result, agent team activity, artifacts, and final result as first-class cards.
Do not expose hidden chain-of-thought. Use brief action summaries such as "Beaver found the relevant files and will now inspect the API response shape."
Keep the user oriented with a persistent task header and clear current status.
Stop live updates once the task reaches a terminal state, while still allowing manual refresh.

Page Layout

Persistent Header

The top header remains visible while scrolling and contains:

task title
task status: open, running, awaiting acceptance, needs revision, closed, abandoned, error, or cancelled
current stage label
elapsed time
compact progress summary
link back to task list
link to source conversation
acceptance entry point when a run is ready for review

Main Timeline

The main column is a chronological card feed. Cards append as execution events arrive.

Expected card sequence:

task created
planning started or completed
skill selected
tool call started
tool call finished
next step
agent team created
sub-agent started
sub-agent progress
agent handoff
sub-agent finished
artifact created
result ready
acceptance recorded

Cards should visually appear in order and keep enough prior context visible so the page feels like a live work log rather than a static report.

Side Rail

The side rail contains compact, always-accessible context:

agent team map
currently active agent or tool
artifacts list
latest warning or blocked state
acceptance state

On small screens, the side rail collapses below the header or into tabs.

Card Types

Task Created Card

Shows that Beaver recognized the user message as a task.

Fields:

task goal
source session
created time
initial status

Plan Card

Shows the execution approach.

Fields:

mode: single agent or agent team
planned stages
attempt index
strategy summary

Skill Card

Shows which skill Beaver selected and why it matters.

Fields:

skill name
skill version if available
user-facing reason
capabilities or method guidance summary

If multiple skills are selected, render one grouped card with individual rows.

Tool Call Card

Shows that Beaver is using a tool.

Fields:

tool name
action summary
actor name
status: running, done, or failed
started time
duration if completed

Raw tool arguments are hidden by default.

Tool Result Card

Shows what the tool found or produced.

Fields:

success or failure
result summary
error message if any
links to artifact or output
expandable raw result

Next Step Card

Shows Beaver's next user-visible action after interpreting a result.

Fields:

short action explanation
related prior card or run
expected next event type when known

This card must not contain private reasoning traces.

Agent Team Card

Shows that Beaver created a multi-agent team.

Fields:

team strategy
agent count
dependency shape
agent names and assigned tasks

Sub-Agent Card

Shows progress from an individual agent.

Fields:

agent name
assigned task
current status
progress text
latest output summary

Agent Handoff Card

Shows interaction between agents.

Fields:

source agent
target agent
handoff reason
summary of transferred result

Artifact Card

Shows an output created during execution.

Fields:

artifact title
artifact type
source agent or run
created time
open or download action
summary or preview where safe

Error or Blocked Card

Shows that execution hit a problem.

Fields:

problem summary
affected stage or tool
whether Beaver can continue automatically
action required from user if any

Final Result Card

Shows the result that the user can review.

Fields:

final answer or result summary
important artifacts
validation or evidence status when available
accept, revise, and abandon actions

Realtime Behavior

Live Updates

The page should subscribe to task-related process events while the task is active. The following updates should append or update cards in real time:

skill selected
tool call started
tool call finished
agent team created
sub-agent started
sub-agent progress
sub-agent finished
agent handoff
artifact created
task result ready
task error or blocked state
acceptance recorded
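
The append-or-update behavior can be sketched as a small pure function. This is a minimal sketch, assuming each card carries a stable id so that a later event for the same tool call or agent updates the existing card instead of adding a duplicate. The names LiveFeedCard and upsertCard are hypothetical and not part of the existing code.

```typescript
// Hypothetical minimal card shape for this sketch; the real view model has more fields.
type LiveFeedCard = {
  id: string;
  status?: string;
  summary?: string;
  createdAt: string;
};

// Append a card whose id is new, or merge the incoming fields into the
// existing card with the same id. A new array is always returned so that
// UI state comparisons by reference notice the change.
function upsertCard(
  feed: readonly LiveFeedCard[],
  incoming: LiveFeedCard,
): LiveFeedCard[] {
  const exists = feed.some((card) => card.id === incoming.id);
  if (!exists) {
    return [...feed, incoming];
  }
  return feed.map((card) =>
    card.id === incoming.id ? { ...card, ...incoming } : card,
  );
}
```

Updating in place keeps a running tool call card at its original position in the feed, so the chronological order does not shift when the call finishes.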

Initial Load

On page load, call GET /api/tasks/{task_id} and hydrate:

task metadata
lifecycle events
process runs
process events
process artifacts
readable run messages
existing feedback

The frontend should build the initial card feed from these persisted records so a refreshed page reconstructs the same execution timeline.
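
For a refreshed page to reconstruct the same timeline, persisted events need a deterministic order. The following is a minimal sketch, assuming each persisted event has the event_id and created_at fields recommended in the backend event contract and that created_at is an ISO 8601 string. The names PersistedTaskEvent and orderPersistedEvents are hypothetical.

```typescript
// Hypothetical subset of a persisted process event.
type PersistedTaskEvent = {
  event_id: string;
  kind: string;
  created_at: string; // assumed to be an ISO 8601 timestamp
};

// Drop duplicate event ids (first occurrence wins), then sort by creation
// time. Ties are broken by event_id so repeated loads give the same order.
function orderPersistedEvents(
  events: readonly PersistedTaskEvent[],
): PersistedTaskEvent[] {
  const seen: Record<string, boolean> = {};
  const unique = events.filter((event) => {
    if (seen[event.event_id]) {
      return false;
    }
    seen[event.event_id] = true;
    return true;
  });
  return unique.sort((a, b) => {
    const delta = Date.parse(a.created_at) - Date.parse(b.created_at);
    return delta !== 0 ? delta : a.event_id.localeCompare(b.event_id);
  });
}
```

The same function could be reused when live events arrive for records that were already hydrated, since duplicates are removed by id.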

Fallback Polling

If WebSocket updates are unavailable, the page should poll GET /api/tasks/{task_id} every 3 to 5 seconds while the task is active.

Polling stops when the task reaches a terminal state:

closed
abandoned
cancelled
error

Manual refresh remains available.
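
The polling rule can be expressed as one decision function. This is a minimal sketch, assuming the status strings match the terminal states listed above. The names isTerminalStatus and nextPollDelayMs, and the 4000 ms default, are assumptions chosen for illustration.

```typescript
// Terminal states as listed in this section.
const TERMINAL_STATUSES: readonly string[] = [
  'closed',
  'abandoned',
  'cancelled',
  'error',
];

function isTerminalStatus(status: string): boolean {
  return TERMINAL_STATUSES.indexOf(status) !== -1;
}

// Returns the delay before the next poll in milliseconds, or null when the
// page should not poll (the WebSocket is connected or the task is finished).
// The requested interval is clamped to the 3 to 5 second range.
function nextPollDelayMs(
  status: string,
  socketConnected: boolean,
  requestedMs: number = 4000,
): number | null {
  if (socketConnected || isTerminalStatus(status)) {
    return null;
  }
  return Math.min(5000, Math.max(3000, requestedMs));
}
```

A caller would schedule the next GET only when the result is not null, which makes the stop condition explicit in one place.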

Large Content Loading

The following content should not be loaded or expanded by default:

raw tool arguments
full tool output
raw process event payloads
full transcript
memory retrieval trace
debug metadata

These belong behind "show details" controls or a later advanced view.

Backend Event Contract

The existing task detail API already exposes useful primitives:

process_runs
process_events
process_artifacts
runs
events
skill_names
task metadata and feedback

For a reliable user-facing timeline, backend events should become more explicit. Recommended event kinds:

task_created
task_planned
skill_selected
tool_call_started
tool_call_finished
agent_team_created
agent_started
agent_progress
agent_handoff
agent_finished
artifact_created
task_result_ready
task_acceptance_recorded
task_error

Each event should include:

event_id
task_id
run_id when applicable
parent_run_id when applicable
actor_type
actor_name
kind
status
text
created_at
compact metadata

Metadata should contain structured fields for rendering, not only raw provider or tool payloads.
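
The recommended event shape could be typed on the frontend roughly as follows. This is a sketch under stated assumptions: the field names follow the list above, but the concrete types (for example, strings for actor_type and status, and a generic record for metadata) are guesses, and the names TaskEventKind, TaskProcessEvent, and isTaskEventKind are hypothetical.

```typescript
// Event kinds recommended in this section.
const TASK_EVENT_KINDS = [
  'task_created',
  'task_planned',
  'skill_selected',
  'tool_call_started',
  'tool_call_finished',
  'agent_team_created',
  'agent_started',
  'agent_progress',
  'agent_handoff',
  'agent_finished',
  'artifact_created',
  'task_result_ready',
  'task_acceptance_recorded',
  'task_error',
] as const;

type TaskEventKind = (typeof TASK_EVENT_KINDS)[number];

// Assumed wire shape; the real API may differ in types and nullability.
interface TaskProcessEvent {
  event_id: string;
  task_id: string;
  run_id?: string | null;
  parent_run_id?: string | null;
  actor_type: string;
  actor_name: string;
  kind: TaskEventKind;
  status: string;
  text: string;
  created_at: string;
  metadata: Record<string, unknown>;
}

// Type guard so unknown kinds from newer backends can be skipped safely.
function isTaskEventKind(value: string): value is TaskEventKind {
  return (TASK_EVENT_KINDS as readonly string[]).indexOf(value) !== -1;
}
```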

Frontend Rendering Model

The frontend should normalize events into a TaskTimelineCard view model.

Recommended fields:

type TaskTimelineCard = {
  id: string;
  taskId: string;
  runId?: string | null;
  parentRunId?: string | null;
  type:
    | 'task_created'
    | 'plan'
    | 'skill'
    | 'tool_call'
    | 'tool_result'
    | 'next_step'
    | 'agent_team'
    | 'agent_progress'
    | 'agent_handoff'
    | 'artifact'
    | 'error'
    | 'result'
    | 'acceptance';
  title: string;
  summary?: string;
  actorName?: string;
  status?: string;
  createdAt: string;
  relatedArtifactIds?: string[];
  details?: Record<string, unknown>;
};

Components render only this view model, so rendering stays stable even if backend event payloads evolve.
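
One way to normalize events into cards is a lookup from event kind to card type. This is a minimal sketch, not the existing buildTaskTimelineCards implementation: the mapping below is an assumption (for example, all three sub-agent event kinds map to 'agent_progress'), the title derivation is a placeholder, and the names RawTaskEvent, TimelineCardSketch, and buildTimelineCardsSketch are hypothetical. The 'next_step' card type has no matching event kind in the recommended list, so it is not produced here.

```typescript
type TimelineCardType =
  | 'task_created' | 'plan' | 'skill' | 'tool_call' | 'tool_result'
  | 'next_step' | 'agent_team' | 'agent_progress' | 'agent_handoff'
  | 'artifact' | 'error' | 'result' | 'acceptance';

// Hypothetical subset of the backend event used by this sketch.
type RawTaskEvent = {
  event_id: string;
  task_id: string;
  kind: string;
  text: string;
  created_at: string;
};

type TimelineCardSketch = {
  id: string;
  taskId: string;
  type: TimelineCardType;
  title: string;
  summary: string;
  createdAt: string;
};

// Assumed mapping from event kind to card type.
const KIND_TO_CARD_TYPE: Record<string, TimelineCardType> = {
  task_created: 'task_created',
  task_planned: 'plan',
  skill_selected: 'skill',
  tool_call_started: 'tool_call',
  tool_call_finished: 'tool_result',
  agent_team_created: 'agent_team',
  agent_started: 'agent_progress',
  agent_progress: 'agent_progress',
  agent_finished: 'agent_progress',
  agent_handoff: 'agent_handoff',
  artifact_created: 'artifact',
  task_result_ready: 'result',
  task_acceptance_recorded: 'acceptance',
  task_error: 'error',
};

// Convert events to cards, skipping kinds the frontend does not know.
function buildTimelineCardsSketch(
  events: readonly RawTaskEvent[],
): TimelineCardSketch[] {
  const cards: TimelineCardSketch[] = [];
  for (const event of events) {
    // Own-property check avoids matching inherited keys such as "constructor".
    if (!Object.prototype.hasOwnProperty.call(KIND_TO_CARD_TYPE, event.kind)) {
      continue;
    }
    const type = KIND_TO_CARD_TYPE[event.kind] as TimelineCardType;
    cards.push({
      id: event.event_id,
      taskId: event.task_id,
      type,
      title: type.replace(/_/g, ' '), // placeholder title for the sketch
      summary: event.text,
      createdAt: event.created_at,
    });
  }
  return cards;
}
```

Skipping unknown kinds instead of throwing lets the page keep rendering if the backend starts emitting event kinds the frontend has not been updated for.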

Empty, Loading, and Error States

No Events Yet

Show a task created card and a running placeholder:

"Beaver is preparing the first step."

Waiting on Tool

Show the active tool call card with a spinner and elapsed time.

Waiting on Agent

Show the active agent card with its assigned task and current status.

Failed Tool

Show an error card with a concise reason and whether Beaver is retrying or changing approach.

Lost Connection

Keep existing cards visible and show a small reconnecting indicator. If reconnect fails, fall back to polling.
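
The lost-connection behavior can be modeled as a small state transition function. This is a minimal sketch, assuming three connection modes and a hypothetical limit of three reconnect attempts before falling back to polling. The signal names and the names LiveConnectionState and nextConnectionState are assumptions, not existing code.

```typescript
type LiveConnectionState = {
  mode: 'live' | 'reconnecting' | 'polling';
  failedAttempts: number;
};

type LiveConnectionSignal = 'socket_open' | 'socket_closed' | 'reconnect_failed';

// Hypothetical limit; the design does not specify how many retries to allow.
const MAX_RECONNECT_ATTEMPTS = 3;

function nextConnectionState(
  state: LiveConnectionState,
  signal: LiveConnectionSignal,
): LiveConnectionState {
  switch (signal) {
    case 'socket_open':
      // Any successful connection returns the page to live updates.
      return { mode: 'live', failedAttempts: 0 };
    case 'socket_closed':
      // Only a live connection starts a reconnect cycle; cards stay visible.
      return state.mode === 'live'
        ? { mode: 'reconnecting', failedAttempts: 0 }
        : state;
    case 'reconnect_failed': {
      const failedAttempts = state.failedAttempts + 1;
      return failedAttempts >= MAX_RECONNECT_ATTEMPTS
        ? { mode: 'polling', failedAttempts }
        : { mode: 'reconnecting', failedAttempts };
    }
  }
}
```

The reconnecting indicator would be shown while mode is 'reconnecting', and the polling fallback would start once mode becomes 'polling'.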

Acceptance Flow

The final result card is the primary acceptance surface.

Actions:

Accept: closes the task and can trigger skill learning.
Needs revision: requires a comment, appends a new revision card, and starts another attempt in the same timeline.
Abandon: closes the task as abandoned and preserves the execution history.

After any acceptance action, the page should immediately update local UI state and refetch the task detail.
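
The immediate local update can be sketched as a function that validates the action and returns the status to show before the refetch completes. This is a sketch under stated assumptions: the action identifiers and the names AcceptanceAction and applyAcceptanceLocally are hypothetical, and the status strings are taken from the task status list in the persistent header.

```typescript
type AcceptanceAction = 'accept' | 'needs_revision' | 'abandon';

type AcceptanceOutcome =
  | { ok: true; optimisticStatus: string }
  | { ok: false; reason: string };

// Compute the status to display right away, before refetching task detail.
// "Needs revision" is rejected locally when no comment is provided.
function applyAcceptanceLocally(
  action: AcceptanceAction,
  comment: string = '',
): AcceptanceOutcome {
  switch (action) {
    case 'accept':
      return { ok: true, optimisticStatus: 'closed' };
    case 'needs_revision':
      if (comment.trim().length === 0) {
        return { ok: false, reason: 'A comment is required for a revision request.' };
      }
      return { ok: true, optimisticStatus: 'needs revision' };
    case 'abandon':
      return { ok: true, optimisticStatus: 'abandoned' };
  }
}
```

The refetch remains the source of truth; the optimistic status only covers the gap until the server response arrives.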

V1 Scope

V1 includes:

persistent task header
live chronological card feed
skill cards
tool call and result cards
agent team card
sub-agent progress cards
artifact cards
final result and acceptance card
WebSocket-first updates with polling fallback
collapsed raw details

V1 excludes:

full administrator audit mode
memory retrieval graph visualization
raw provider request/response viewer
advanced event payload debugger
editable task graph

Implementation Notes

The existing tasks/[taskId]/page.tsx already has useful pieces, but the main hierarchy should shift from phase groups and selected node detail to a timeline-first experience.

Likely frontend modules:

TaskLiveHeader
TaskTimeline
TaskTimelineCard
TaskSideRail
TaskAcceptanceCard
buildTaskTimelineCards

Likely backend work:

emit explicit process events for skill selection and tool calls
include user-facing text summaries in event metadata
ensure task detail reconstruction uses persisted events
expose enough run and actor metadata for agent team rendering

Self-Review

No placeholders remain.
The design is scoped to ordinary-user task detail, not admin audit.
Realtime requirements distinguish live updates from expandable heavy details.
Backend event requirements are explicit enough for frontend implementation.
V1 scope avoids memory graph and debug payload work.
