# Task Detail Live Execution Design

## Purpose

Task detail should be a live execution surface for ordinary users. It should answer "what is Beaver doing now?", "what has already happened?", "what changed because of a tool or agent result?", and "what can I inspect or accept?" without forcing the user to wait for a final answer.

This page is not primarily a developer audit view. It should expose enough execution detail to create confidence, while keeping raw payloads, long tool output, and debug metadata behind progressive disclosure.

## User Experience Principles

- Show progress as a chronological card feed that grows while the task runs.
- Prefer user-facing explanations over raw internal event names.
- Show skill selection, tool usage, tool result, agent team activity, artifacts, and final result as first-class cards.
- Do not expose hidden chain-of-thought. Use brief action summaries such as "Beaver found the relevant files and will now inspect the API response shape."
- Keep the user oriented with a persistent task header and clear current status.
- Stop live updates once the task reaches a terminal state, while still allowing manual refresh.

## Page Layout

### Persistent Header

The top header remains visible while scrolling and contains:

- task title
- task status: open, running, awaiting acceptance, needs revision, closed, abandoned, error, or cancelled
- current stage label
- elapsed time
- compact progress summary
- link back to task list
- link to source conversation
- acceptance entry point when a run is ready for review

### Main Timeline

The main column is a chronological card feed. Cards append as execution events arrive.

Expected card sequence:

1. task created
2. planning started or completed
3. skill selected
4. tool call started
5. tool call finished
6. model next step
7. agent team created
8. sub-agent started
9. sub-agent progress
10. agent handoff
11. sub-agent finished
12. artifact created
13. result ready
14. acceptance recorded

Cards should visually appear in order and keep enough prior context visible so the page feels like a live work log rather than a static report.

### Side Rail

The side rail contains compact, always-accessible context:

- agent team map
- currently active agent or tool
- artifacts list
- latest warning or blocked state
- acceptance state

On small screens, the side rail collapses below the header or into tabs.

## Card Types

### Task Created Card

Shows that Beaver recognized the user message as a task.

Fields:

- task goal
- source session
- created time
- initial status

### Plan Card

Shows the execution approach.

Fields:

- mode: single agent or agent team
- planned stages
- attempt index
- strategy summary

### Skill Card

Shows which skill Beaver selected and why it matters.

Fields:

- skill name
- skill version if available
- user-facing reason
- capabilities or method guidance summary

If multiple skills are selected, render one grouped card with individual rows.

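The grouping rule for skills could be sketched as follows. This is only an illustration: `SkillRow`, `SkillCardModel`, and `groupSkills` are hypothetical names, and the title wording is a placeholder rather than agreed copy.

```ts
// Hypothetical row shape derived from the Skill Card field list.
type SkillRow = {
  name: string;
  version?: string; // shown only if available
  reason: string; // user-facing reason
};

// Hypothetical grouped card model: one card, one row per selected skill.
type SkillCardModel = {
  type: 'skill';
  title: string;
  rows: SkillRow[];
};

// Returns a single card no matter how many skills were selected,
// or null when no skill was selected at all.
function groupSkills(rows: readonly SkillRow[]): SkillCardModel | null {
  if (rows.length === 0) return null;
  const title =
    rows.length === 1
      ? `Skill: ${rows[0].name}`
      : `${rows.length} skills selected`;
  return { type: 'skill', title, rows: rows.slice() };
}
```

Returning `null` for an empty selection lets the timeline skip the card entirely instead of rendering an empty group.
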
### Tool Call Card

Shows that Beaver is using a tool.

Fields:

- tool name
- action summary
- actor name
- status: running, done, failed
- started time
- duration if completed

Raw tool arguments are hidden by default.

### Tool Result Card

Shows what the tool found or produced.

Fields:

- success or failure
- result summary
- error message if any
- links to artifact or output
- expandable raw result

### Next Step Card

Shows Beaver's next user-visible action after interpreting a result.

Fields:

- short action explanation
- related prior card or run
- expected next event type when known

This card must not contain private reasoning traces.

### Agent Team Card

Shows that Beaver created a multi-agent team.

Fields:

- team strategy
- agent count
- dependency shape
- agent names and assigned tasks

### Sub-Agent Card

Shows progress from an individual agent.

Fields:

- agent name
- assigned task
- current status
- progress text
- latest output summary

### Agent Handoff Card

Shows interaction between agents.

Fields:

- source agent
- target agent
- handoff reason
- summary of transferred result

### Artifact Card

Shows an output created during execution.

Fields:

- artifact title
- artifact type
- source agent or run
- created time
- open or download action
- summary or preview where safe

### Error or Blocked Card

Shows that execution hit a problem.

Fields:

- problem summary
- affected stage or tool
- whether Beaver can continue automatically
- action required from user if any

### Final Result Card

Shows the result that the user can review.

Fields:

- final answer or result summary
- important artifacts
- validation or evidence status when available
- accept, revise, and abandon actions

## Realtime Behavior

### Live Updates

The page should subscribe to task-related process events while the task is active. The following updates should append or update cards in real time:

- skill selected
- tool call started
- tool call finished
- agent team created
- sub-agent started
- sub-agent progress
- sub-agent finished
- agent handoff
- artifact created
- task result ready
- task error or blocked state
- acceptance recorded

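One way to implement "append or update" is an upsert keyed by card id. The sketch below is a minimal illustration, not the actual implementation: `FeedCard` is a reduced, hypothetical card shape, and it assumes timestamps are ISO 8601 strings in a single uniform format so that plain string comparison matches chronological order.

```ts
// Reduced, hypothetical card shape; the real view model carries more fields.
type FeedCard = {
  id: string;
  type: string;
  status?: string;
  summary?: string;
  createdAt: string; // assumed ISO 8601, same format for every card
};

// Compare two timestamps as strings (valid only under the assumption above).
function byCreatedAt(a: FeedCard, b: FeedCard): number {
  if (a.createdAt < b.createdAt) return -1;
  if (a.createdAt > b.createdAt) return 1;
  return 0;
}

// Append a new card, or merge an update into the existing card with the same id.
// An updated card keeps its original position and createdAt so the feed does not jump.
function upsertCard(cards: readonly FeedCard[], incoming: FeedCard): FeedCard[] {
  const index = cards.findIndex((card) => card.id === incoming.id);
  if (index >= 0) {
    const next = cards.slice();
    next[index] = { ...cards[index], ...incoming, createdAt: cards[index].createdAt };
    return next;
  }
  // New cards are placed chronologically in case events arrive out of order.
  return [...cards, incoming].sort(byCreatedAt);
}
```

Because the function returns a new array and never mutates its input, it can be used directly as a state updater in most UI frameworks.
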
### Initial Load

On page load, call `GET /api/tasks/{task_id}` and hydrate:

- task metadata
- lifecycle events
- process runs
- process events
- process artifacts
- readable run messages
- existing feedback

The frontend should build the initial card feed from these persisted records so a refreshed page reconstructs the same execution timeline.

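To make the reconstruction deterministic, persisted records can be merged, de-duplicated, and sorted before cards are built. The following is a rough sketch under assumptions: it presumes that both the lifecycle `events` and the `process_events` arrays carry `event_id` and `created_at`, which this document recommends but does not confirm for the current API. `PersistedEvent`, `TaskDetailPayload`, and `orderedEvents` are hypothetical names.

```ts
// Hypothetical minimal record shape; assumes both arrays share these fields.
type PersistedEvent = {
  event_id: string;
  kind: string;
  created_at: string; // assumed ISO 8601, same format for every record
};

// Hypothetical subset of the task detail response.
type TaskDetailPayload = {
  events: PersistedEvent[];
  process_events: PersistedEvent[];
};

// Merge both event sources into one ordered list.
// Records are de-duplicated by event_id; ties on created_at fall back to
// event_id so that every reload yields exactly the same order.
function orderedEvents(detail: TaskDetailPayload): PersistedEvent[] {
  const byId = new Map<string, PersistedEvent>();
  for (const event of detail.events.concat(detail.process_events)) {
    byId.set(event.event_id, event);
  }
  return Array.from(byId.values()).sort((a, b) => {
    if (a.created_at !== b.created_at) return a.created_at < b.created_at ? -1 : 1;
    if (a.event_id !== b.event_id) return a.event_id < b.event_id ? -1 : 1;
    return 0;
  });
}
```

The explicit tie-breaker matters because two events can share a timestamp; without it, the refreshed timeline could differ from the live one.
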
### Fallback Polling

If WebSocket updates are unavailable, active tasks should poll `GET /api/tasks/{task_id}` every 3 to 5 seconds.

Polling stops when the task reaches a terminal state:

- closed
- abandoned
- cancelled
- error

Manual refresh remains available.

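The polling rule above could look roughly like this. It is a sketch, not a prescribed implementation: the snake_case status identifiers are assumed spellings of the status labels listed in the header section, `fetchStatus` stands in for whatever wraps `GET /api/tasks/{task_id}`, and the 4000 ms default is simply a value inside the 3 to 5 second range.

```ts
// Assumed identifiers for the statuses named in this document.
type TaskStatus =
  | 'open'
  | 'running'
  | 'awaiting_acceptance'
  | 'needs_revision'
  | 'closed'
  | 'abandoned'
  | 'error'
  | 'cancelled';

const TERMINAL_STATUSES = new Set<TaskStatus>(['closed', 'abandoned', 'cancelled', 'error']);

function isTerminal(status: TaskStatus): boolean {
  return TERMINAL_STATUSES.has(status);
}

// Polls until a terminal status is seen. Returns a function that stops polling early.
// fetchStatus is a caller-supplied wrapper around the task detail request.
function startPolling(
  fetchStatus: () => Promise<TaskStatus>,
  onStatus: (status: TaskStatus) => void,
  intervalMs: number = 4000,
): () => void {
  let stopped = false;
  let timer: ReturnType<typeof setTimeout> | undefined;

  const tick = async (): Promise<void> => {
    if (stopped) return;
    try {
      const status = await fetchStatus();
      if (stopped) return;
      onStatus(status);
      if (isTerminal(status)) return; // terminal state: do not schedule another poll
    } catch {
      // Treat request failures as transient and keep polling.
    }
    if (!stopped) timer = setTimeout(tick, intervalMs);
  };

  void tick();
  return () => {
    stopped = true;
    if (timer !== undefined) clearTimeout(timer);
  };
}
```

Scheduling the next request only after the previous one settles avoids overlapping requests when the API is slow, which a fixed `setInterval` would not guarantee.
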
### Large Content Loading

The following content should not be loaded or expanded by default:

- raw tool arguments
- full tool output
- raw process event payloads
- full transcript
- memory retrieval trace
- debug metadata

These belong behind "show details" controls or a later advanced view.

## Backend Event Contract

The existing task detail API already exposes useful primitives:

- `process_runs`
- `process_events`
- `process_artifacts`
- `runs`
- `events`
- `skill_names`
- task metadata and feedback

For a reliable user-facing timeline, backend events should become more explicit. Recommended event kinds:

- `task_created`
- `task_planned`
- `skill_selected`
- `tool_call_started`
- `tool_call_finished`
- `agent_team_created`
- `agent_started`
- `agent_progress`
- `agent_handoff`
- `agent_finished`
- `artifact_created`
- `task_result_ready`
- `task_acceptance_recorded`
- `task_error`

Each event should include:

- `event_id`
- `task_id`
- `run_id` when applicable
- `parent_run_id` when applicable
- `actor_type`
- `actor_name`
- `kind`
- `status`
- `text`
- `created_at`
- compact `metadata`

Metadata should contain structured fields for rendering, not only raw provider or tool payloads.

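On the frontend, the contract above might be mirrored by a type plus a runtime guard, so that malformed events are dropped before rendering. This is a sketch under assumptions: `TaskProcessEvent` and `isTaskProcessEvent` are hypothetical names, all required fields are assumed to be strings, and `actor_type` and `status` are left as plain strings because their allowed values are not specified here.

```ts
// Event kinds recommended in this document.
const EVENT_KINDS = [
  'task_created', 'task_planned', 'skill_selected',
  'tool_call_started', 'tool_call_finished',
  'agent_team_created', 'agent_started', 'agent_progress',
  'agent_handoff', 'agent_finished',
  'artifact_created', 'task_result_ready',
  'task_acceptance_recorded', 'task_error',
] as const;

type EventKind = typeof EVENT_KINDS[number];

type TaskProcessEvent = {
  event_id: string;
  task_id: string;
  run_id?: string | null; // when applicable
  parent_run_id?: string | null; // when applicable
  actor_type: string; // allowed values not specified in this document
  actor_name: string;
  kind: EventKind;
  status: string;
  text: string;
  created_at: string;
  metadata?: Record<string, unknown>; // compact, structured rendering fields
};

const REQUIRED_STRING_FIELDS = [
  'event_id', 'task_id', 'actor_type', 'actor_name',
  'kind', 'status', 'text', 'created_at',
];

// Runtime check for payloads arriving over the network.
function isTaskProcessEvent(value: unknown): value is TaskProcessEvent {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  const hasStrings = REQUIRED_STRING_FIELDS.every((key) => typeof record[key] === 'string');
  if (!hasStrings) return false;
  return (EVENT_KINDS as readonly string[]).indexOf(record['kind'] as string) !== -1;
}
```

Unknown kinds are rejected rather than rendered generically; a more lenient variant could accept them and fall back to a plain text card.
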
## Frontend Rendering Model

The frontend should normalize events into a `TaskTimelineCard` view model.

Recommended fields:

```ts
type TaskTimelineCard = {
  id: string;
  taskId: string;
  runId?: string | null;
  parentRunId?: string | null;
  type:
    | 'task_created'
    | 'plan'
    | 'skill'
    | 'tool_call'
    | 'tool_result'
    | 'next_step'
    | 'agent_team'
    | 'agent_progress'
    | 'agent_handoff'
    | 'artifact'
    | 'error'
    | 'result'
    | 'acceptance';
  title: string;
  summary?: string;
  actorName?: string;
  status?: string;
  createdAt: string;
  relatedArtifactIds?: string[];
  details?: Record<string, unknown>;
};
```

This keeps rendering stable even if backend event payloads evolve.

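A possible shape for the normalization step is sketched below. The kind-to-card mapping is an assumption inferred from the names in this document, not a confirmed contract: the three agent lifecycle kinds are folded into `agent_progress`, `next_step` has no matching backend kind in the recommended list and is therefore never produced here, and the title is a crude placeholder derived from the card type. `EventLike` and `CardLike` are reduced, hypothetical stand-ins for the full types.

```ts
type CardType =
  | 'task_created' | 'plan' | 'skill' | 'tool_call' | 'tool_result'
  | 'next_step' | 'agent_team' | 'agent_progress' | 'agent_handoff'
  | 'artifact' | 'error' | 'result' | 'acceptance';

// Reduced stand-ins for the backend event and the card view model.
type EventLike = {
  event_id: string;
  task_id: string;
  run_id?: string | null;
  actor_name?: string;
  kind: string;
  status?: string;
  text?: string;
  created_at: string;
};

type CardLike = {
  id: string;
  taskId: string;
  runId: string | null;
  type: CardType;
  title: string;
  summary?: string;
  actorName?: string;
  status?: string;
  createdAt: string;
};

// Assumed mapping from backend event kind to card type.
const KIND_TO_CARD = new Map<string, CardType>([
  ['task_created', 'task_created'],
  ['task_planned', 'plan'],
  ['skill_selected', 'skill'],
  ['tool_call_started', 'tool_call'],
  ['tool_call_finished', 'tool_result'],
  ['agent_team_created', 'agent_team'],
  ['agent_started', 'agent_progress'],
  ['agent_progress', 'agent_progress'],
  ['agent_finished', 'agent_progress'],
  ['agent_handoff', 'agent_handoff'],
  ['artifact_created', 'artifact'],
  ['task_result_ready', 'result'],
  ['task_acceptance_recorded', 'acceptance'],
  ['task_error', 'error'],
]);

// Converts events to cards, skipping kinds the frontend does not recognize.
function buildTaskTimelineCards(events: readonly EventLike[]): CardLike[] {
  const cards: CardLike[] = [];
  for (const event of events) {
    const type = KIND_TO_CARD.get(event.kind);
    if (type === undefined) continue;
    cards.push({
      id: event.event_id,
      taskId: event.task_id,
      runId: event.run_id ?? null,
      type,
      title: type.replace(/_/g, ' '), // placeholder title, not final copy
      summary: event.text,
      actorName: event.actor_name,
      status: event.status,
      createdAt: event.created_at,
    });
  }
  return cards;
}
```

Keeping the mapping in one table means a renamed or newly added backend kind only requires a one-line change, which is the stability the view model is meant to provide.
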
## Empty, Loading, and Error States

### No Events Yet

Show a task created card and a running placeholder:

"Beaver is preparing the first step."

### Waiting on Tool

Show the active tool call card with a spinner and elapsed time.

### Waiting on Agent

Show the active agent card with its assigned task and current status.

### Failed Tool

Show an error card with a concise reason and whether Beaver is retrying or changing approach.

### Lost Connection

Keep existing cards visible and show a small reconnecting indicator. If reconnect fails, fall back to polling.

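The lost-connection behavior can be modeled as a small state machine that is independent of the transport. The sketch below is illustrative only: the mode and event names are hypothetical, and the limit of three reconnect attempts is an assumption, since this document does not say how many failures count as "reconnect fails".

```ts
type ConnectionMode = 'live' | 'reconnecting' | 'polling';

type ConnectionState = {
  mode: ConnectionMode;
  failedAttempts: number;
};

type ConnectionEvent = 'socket_open' | 'socket_closed' | 'reconnect_failed';

// Assumed limit; tune to the product's tolerance for stale data.
const MAX_RECONNECT_ATTEMPTS = 3;

// Pure transition function: cards are never touched here, only the connection mode.
function nextConnectionState(state: ConnectionState, event: ConnectionEvent): ConnectionState {
  switch (event) {
    case 'socket_open':
      // Live updates are back: leave polling or reconnecting and reset the counter.
      return { mode: 'live', failedAttempts: 0 };
    case 'socket_closed':
      // Already polling: stay there. Otherwise show the reconnecting indicator.
      return state.mode === 'polling'
        ? state
        : { mode: 'reconnecting', failedAttempts: state.failedAttempts };
    case 'reconnect_failed': {
      const failedAttempts = state.failedAttempts + 1;
      return failedAttempts >= MAX_RECONNECT_ATTEMPTS
        ? { mode: 'polling', failedAttempts }
        : { mode: 'reconnecting', failedAttempts };
    }
  }
}
```

The UI would render the reconnecting indicator while `mode` is `'reconnecting'` and start the polling loop when it becomes `'polling'`; existing cards stay visible in every mode.
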
## Acceptance Flow

The final result card is the primary acceptance surface.

Actions:

- Accept: closes the task and can trigger skill learning.
- Needs revision: requires a comment, appends a new revision card, and starts another attempt in the same timeline.
- Abandon: closes the task as abandoned and preserves the execution history.

After any acceptance action, the page should immediately update local UI state and refetch the task detail.

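The "update local state immediately, then refetch" step might start from a pure function like the one below. It is a sketch under assumptions: the action and status identifiers are assumed spellings of the labels used in this document, and mapping a revision request to `needs_revision` is inferred from the header status list rather than stated outright.

```ts
// Hypothetical action payloads for the three acceptance actions.
type AcceptanceAction =
  | { kind: 'accept' }
  | { kind: 'revise'; comment: string }
  | { kind: 'abandon' };

type OptimisticResult =
  | { ok: true; status: 'closed' | 'needs_revision' | 'abandoned' }
  | { ok: false; error: string };

// Computes the status to show immediately, before the server confirms.
// The caller should still refetch task detail afterwards and trust the server's answer.
function optimisticStatus(action: AcceptanceAction): OptimisticResult {
  switch (action.kind) {
    case 'accept':
      return { ok: true, status: 'closed' };
    case 'revise':
      // A revision request requires a non-empty comment.
      if (action.comment.trim().length === 0) {
        return { ok: false, error: 'A comment is required to request a revision.' };
      }
      return { ok: true, status: 'needs_revision' };
    case 'abandon':
      return { ok: true, status: 'abandoned' };
  }
}
```

Validating the comment before sending anything keeps the "requires a comment" rule enforceable on the client, although the backend should enforce it as well.
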
## V1 Scope

V1 includes:

- persistent task header
- live chronological card feed
- skill cards
- tool call and result cards
- agent team card
- sub-agent progress cards
- artifact cards
- final result and acceptance card
- WebSocket-first updates with polling fallback
- collapsed raw details

V1 excludes:

- full administrator audit mode
- memory retrieval graph visualization
- raw provider request/response viewer
- advanced event payload debugger
- editable task graph

## Implementation Notes

The existing `tasks/[taskId]/page.tsx` already has useful pieces, but the main hierarchy should shift from phase groups and selected node detail to a timeline-first experience.

Likely frontend modules:

- `TaskLiveHeader`
- `TaskTimeline`
- `TaskTimelineCard`
- `TaskSideRail`
- `TaskAcceptanceCard`
- `buildTaskTimelineCards`

Likely backend work:

- emit explicit process events for skill selection and tool calls
- include user-facing text summaries in event metadata
- ensure task detail reconstruction uses persisted events
- expose enough run and actor metadata for agent team rendering

## Self-Review

- No placeholders remain.
- The design is scoped to ordinary-user task detail, not admin audit.
- Realtime requirements distinguish live updates from expandable heavy details.
- Backend event requirements are explicit enough for frontend implementation.
- V1 scope avoids memory graph and debug payload work.