Add task detail live execution design
@ -0,0 +1,440 @@
# Task Detail Live Execution Design

## Purpose

Task detail should be a live execution surface for ordinary users. It should answer "what is Beaver doing now?", "what has already happened?", "what changed because of a tool or agent result?", and "what can I inspect or accept?" without forcing the user to wait for a final answer.

This page is not primarily a developer audit view. It should expose enough execution detail to create confidence, while keeping raw payloads, long tool output, and debug metadata behind progressive disclosure.

## User Experience Principles

- Show progress as a chronological card feed that grows while the task runs.
- Prefer user-facing explanations over raw internal event names.
- Show skill selection, tool usage, tool result, agent team activity, artifacts, and final result as first-class cards.
- Do not expose hidden chain-of-thought. Use brief action summaries such as "Beaver found the relevant files and will now inspect the API response shape."
- Keep the user oriented with a persistent task header and clear current status.
- Stop live updates once the task reaches a terminal state, while still allowing manual refresh.

## Page Layout

### Persistent Header

The top header remains visible while scrolling and contains:

- task title
- task status: open, running, awaiting acceptance, needs revision, closed, abandoned, error, or cancelled
- current stage label
- elapsed time
- compact progress summary
- link back to task list
- link to source conversation
- acceptance entry point when a run is ready for review

### Main Timeline

The main column is a chronological card feed. Cards append as execution events arrive.

Expected card sequence:

1. task created
2. planning started or completed
3. skill selected
4. tool call started
5. tool call finished
6. model next step
7. agent team created
8. sub-agent started
9. sub-agent progress
10. agent handoff
11. sub-agent finished
12. artifact created
13. result ready
14. acceptance recorded

Cards should visually appear in order and keep enough prior context visible so the page feels like a live work log rather than a static report.

### Side Rail

The side rail contains compact, always-accessible context:

- agent team map
- currently active agent or tool
- artifacts list
- latest warning or blocked state
- acceptance state

On small screens, the side rail collapses below the header or into tabs.

## Card Types

### Task Created Card

Shows that Beaver recognized the user message as a task.

Fields:

- task goal
- source session
- created time
- initial status

### Plan Card

Shows the execution approach.

Fields:

- mode: single agent or agent team
- planned stages
- attempt index
- strategy summary

### Skill Card

Shows which skill Beaver selected and why it matters.

Fields:

- skill name
- skill version if available
- user-facing reason
- capabilities or method guidance summary

If multiple skills are selected, render one grouped card with individual rows.

### Tool Call Card

Shows that Beaver is using a tool.

Fields:

- tool name
- action summary
- actor name
- status: running, done, failed
- started time
- duration if completed

Raw tool arguments are hidden by default.

### Tool Result Card

Shows what the tool found or produced.

Fields:

- success or failure
- result summary
- error message if any
- links to artifact or output
- expandable raw result

### Next Step Card

Shows Beaver's next user-visible action after interpreting a result.

Fields:

- short action explanation
- related prior card or run
- expected next event type when known

This card must not contain private reasoning traces.

### Agent Team Card

Shows that Beaver created a multi-agent team.

Fields:

- team strategy
- agent count
- dependency shape
- agent names and assigned tasks

### Sub-Agent Card

Shows progress from an individual agent.

Fields:

- agent name
- assigned task
- current status
- progress text
- latest output summary

### Agent Handoff Card

Shows interaction between agents.

Fields:

- source agent
- target agent
- handoff reason
- summary of transferred result

### Artifact Card

Shows an output created during execution.

Fields:

- artifact title
- artifact type
- source agent or run
- created time
- open or download action
- summary or preview where safe

### Error or Blocked Card

Shows that execution hit a problem.

Fields:

- problem summary
- affected stage or tool
- whether Beaver can continue automatically
- action required from user if any

### Final Result Card

Shows the result that the user can review.

Fields:

- final answer or result summary
- important artifacts
- validation or evidence status when available
- accept, revise, and abandon actions

## Realtime Behavior

### Live Updates

The page should subscribe to task-related process events while the task is active. The following updates should append or update cards in real time:

- skill selected
- tool call started
- tool call finished
- agent team created
- sub-agent started
- sub-agent progress
- sub-agent finished
- agent handoff
- artifact created
- task result ready
- task error or blocked state
- acceptance recorded

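The "append or update" behavior above could be sketched as a small pure function. This is only an illustration: the `LiveCard` shape, the `cardKey` helper, and the choice to collapse all sub-agent events for one run into a single card are assumptions, not part of the design.

```ts
// Minimal card shape for this sketch; the real view model has more fields.
type LiveCard = {
  id: string;
  type: string;
  runId?: string | null;
  status?: string;
  summary?: string;
  createdAt: string;
};

// Assumption: sub-agent events for the same run update one card;
// every other card is identified by its own id.
function cardKey(card: LiveCard): string {
  return card.type === 'agent_progress' && card.runId
    ? 'agent:' + card.runId
    : 'card:' + card.id;
}

// Returns a new array: the incoming card replaces a card with the same key,
// or is appended to the end of the feed when no such card exists.
function applyCardUpdate(cards: LiveCard[], incoming: LiveCard): LiveCard[] {
  const key = cardKey(incoming);
  let index = -1;
  for (let i = 0; i < cards.length; i++) {
    if (cardKey(cards[i]) === key) {
      index = i;
      break;
    }
  }
  if (index === -1) {
    return cards.concat([incoming]);
  }
  const next = cards.slice();
  // Keep the original id and creation time so the feed does not reorder.
  next[index] = { ...incoming, id: cards[index].id, createdAt: cards[index].createdAt };
  return next;
}
```

Returning a new array instead of mutating keeps the function easy to use from a React-style state setter, and updating in place preserves the chronological position of a card that is still in progress.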
### Initial Load

On page load, call `GET /api/tasks/{task_id}` and hydrate:

- task metadata
- lifecycle events
- process runs
- process events
- process artifacts
- readable run messages
- existing feedback

The frontend should build the initial card feed from these persisted records so a refreshed page reconstructs the same execution timeline.

### Fallback Polling

If WebSocket updates are unavailable, active tasks should poll `GET /api/tasks/{task_id}` every 3 to 5 seconds.

Polling stops when the task reaches a terminal state:

- closed
- abandoned
- cancelled
- error

Manual refresh remains available.

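A minimal sketch of the polling rule, assuming the caller starts it only when WebSocket updates are unavailable. The names `isTerminalStatus`, `nextPollDelayMs`, and `startPolling`, the 4-second default, and the injected `fetchTask` function (which would wrap `GET /api/tasks/{task_id}`) are all hypothetical.

```ts
// Terminal states listed in this section.
const TERMINAL_STATUSES = ['closed', 'abandoned', 'cancelled', 'error'];

function isTerminalStatus(status: string): boolean {
  return TERMINAL_STATUSES.indexOf(status) !== -1;
}

// Delay before the next poll, clamped to the 3 to 5 second window,
// or null when the task is terminal and polling should stop.
function nextPollDelayMs(status: string, intervalMs: number = 4000): number | null {
  if (isTerminalStatus(status)) return null;
  return Math.min(5000, Math.max(3000, intervalMs));
}

type TaskDetail = { status: string };

// Hypothetical driver: fetchTask is injected so the loop stays testable.
// Returns a function that cancels any pending poll.
function startPolling(
  fetchTask: () => Promise<TaskDetail>,
  onUpdate: (task: TaskDetail) => void,
  initialStatus: string,
): () => void {
  let stopped = false;
  let timer: ReturnType<typeof setTimeout> | undefined;

  const schedule = (status: string): void => {
    const delay = nextPollDelayMs(status);
    if (stopped || delay === null) return;
    timer = setTimeout(async () => {
      let latestStatus = status;
      try {
        const task = await fetchTask();
        if (stopped) return;
        onUpdate(task);
        latestStatus = task.status;
      } catch {
        // Keep the last known status and retry on the next tick.
      }
      schedule(latestStatus);
    }, delay);
  };

  schedule(initialStatus);
  return () => {
    stopped = true;
    if (timer !== undefined) clearTimeout(timer);
  };
}
```

Scheduling the next poll only after the previous request settles avoids overlapping requests when the API is slow. Manual refresh can call the same `fetchTask` directly, independent of the timer.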
### Large Content Loading

The following content should not be loaded or expanded by default:

- raw tool arguments
- full tool output
- raw process event payloads
- full transcript
- memory retrieval trace
- debug metadata

These belong behind "show details" controls or a later advanced view.

## Backend Event Contract

The existing task detail API already exposes useful primitives:

- `process_runs`
- `process_events`
- `process_artifacts`
- `runs`
- `events`
- `skill_names`
- task metadata and feedback

For a reliable user-facing timeline, backend events should become more explicit. Recommended event kinds:

- `task_created`
- `task_planned`
- `skill_selected`
- `tool_call_started`
- `tool_call_finished`
- `agent_team_created`
- `agent_started`
- `agent_progress`
- `agent_handoff`
- `agent_finished`
- `artifact_created`
- `task_result_ready`
- `task_acceptance_recorded`
- `task_error`

Each event should include:

- `event_id`
- `task_id`
- `run_id` when applicable
- `parent_run_id` when applicable
- `actor_type`
- `actor_name`
- `kind`
- `status`
- `text`
- `created_at`
- compact `metadata`

Metadata should contain structured fields for rendering, not only raw provider or tool payloads.

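As a rough illustration, the contract above could be typed on the frontend as follows. The field names and event kinds come from the lists in this section; the `TaskProcessEvent` and `TaskEventKind` names, the example `actor_type` values, the ISO 8601 timestamp format, and the `isTaskProcessEvent` guard are assumptions.

```ts
// Event kinds recommended in this section.
const EVENT_KINDS = [
  'task_created',
  'task_planned',
  'skill_selected',
  'tool_call_started',
  'tool_call_finished',
  'agent_team_created',
  'agent_started',
  'agent_progress',
  'agent_handoff',
  'agent_finished',
  'artifact_created',
  'task_result_ready',
  'task_acceptance_recorded',
  'task_error',
] as const;

type TaskEventKind = (typeof EVENT_KINDS)[number];

type TaskProcessEvent = {
  event_id: string;
  task_id: string;
  run_id?: string | null;
  parent_run_id?: string | null;
  actor_type: string; // assumption: values such as 'system', 'agent', 'tool', 'user'
  actor_name: string;
  kind: TaskEventKind;
  status: string;
  text: string;
  created_at: string; // assumption: ISO 8601 timestamp
  metadata: Record<string, unknown>;
};

// Hypothetical runtime guard for payloads arriving over the API or WebSocket.
function isTaskProcessEvent(value: unknown): value is TaskProcessEvent {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  const requiredStrings = [
    'event_id', 'task_id', 'actor_type', 'actor_name', 'status', 'text', 'created_at',
  ];
  const kind = v['kind'];
  const metadata = v['metadata'];
  return (
    requiredStrings.every((key) => typeof v[key] === 'string') &&
    typeof kind === 'string' &&
    (EVENT_KINDS as readonly string[]).indexOf(kind) !== -1 &&
    typeof metadata === 'object' &&
    metadata !== null
  );
}
```

Rejecting unknown kinds at the boundary lets the timeline ignore events it cannot render instead of showing raw internal names.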
## Frontend Rendering Model

The frontend should normalize events into a `TaskTimelineCard` view model.

Recommended fields:

```ts
type TaskTimelineCard = {
  id: string;
  taskId: string;
  runId?: string | null;
  parentRunId?: string | null;
  type:
    | 'task_created'
    | 'plan'
    | 'skill'
    | 'tool_call'
    | 'tool_result'
    | 'next_step'
    | 'agent_team'
    | 'agent_progress'
    | 'agent_handoff'
    | 'artifact'
    | 'error'
    | 'result'
    | 'acceptance';
  title: string;
  summary?: string;
  actorName?: string;
  status?: string;
  createdAt: string;
  relatedArtifactIds?: string[];
  details?: Record<string, unknown>;
};
```

This keeps rendering stable even if backend event payloads evolve.

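One way the normalization step might look is sketched below. The mapping from event kind to card type and the placeholder titles are assumptions; the design does not specify them. Note that the recommended event kinds include nothing that maps to `next_step`, so this sketch omits it. The `Card` type here is a trimmed copy of `TaskTimelineCard` so the example stands alone.

```ts
type CardType =
  | 'task_created' | 'plan' | 'skill' | 'tool_call' | 'tool_result'
  | 'agent_team' | 'agent_progress' | 'agent_handoff'
  | 'artifact' | 'error' | 'result' | 'acceptance';

// Only the event fields this sketch reads.
type BackendEvent = {
  event_id: string;
  task_id: string;
  run_id?: string | null;
  actor_name: string;
  kind: string;
  status: string;
  text: string;
  created_at: string;
};

type Card = {
  id: string;
  taskId: string;
  runId?: string | null;
  type: CardType;
  title: string;
  summary?: string;
  actorName?: string;
  status?: string;
  createdAt: string;
};

// Assumed mapping; titles are placeholder copy.
const KIND_TO_CARD: { [kind: string]: { type: CardType; title: string } } = {
  task_created: { type: 'task_created', title: 'Task created' },
  task_planned: { type: 'plan', title: 'Plan ready' },
  skill_selected: { type: 'skill', title: 'Skill selected' },
  tool_call_started: { type: 'tool_call', title: 'Using a tool' },
  tool_call_finished: { type: 'tool_result', title: 'Tool finished' },
  agent_team_created: { type: 'agent_team', title: 'Agent team created' },
  agent_started: { type: 'agent_progress', title: 'Agent started' },
  agent_progress: { type: 'agent_progress', title: 'Agent progress' },
  agent_finished: { type: 'agent_progress', title: 'Agent finished' },
  agent_handoff: { type: 'agent_handoff', title: 'Agent handoff' },
  artifact_created: { type: 'artifact', title: 'Artifact created' },
  task_result_ready: { type: 'result', title: 'Result ready' },
  task_acceptance_recorded: { type: 'acceptance', title: 'Acceptance recorded' },
  task_error: { type: 'error', title: 'Something went wrong' },
};

// Drops unknown kinds and duplicate event ids, then orders cards by time.
// Assumes created_at values are parseable by Date.parse.
function buildTaskTimelineCards(events: BackendEvent[]): Card[] {
  const seen: { [key: string]: true } = {};
  const cards: Card[] = [];
  for (const event of events) {
    const known = Object.prototype.hasOwnProperty.call(KIND_TO_CARD, event.kind);
    const seenKey = 'id:' + event.event_id;
    if (!known || seen[seenKey]) continue;
    seen[seenKey] = true;
    const { type, title } = KIND_TO_CARD[event.kind];
    cards.push({
      id: event.event_id,
      taskId: event.task_id,
      runId: event.run_id ?? null,
      type,
      title,
      summary: event.text,
      actorName: event.actor_name,
      status: event.status,
      createdAt: event.created_at,
    });
  }
  return cards.sort((a, b) => Date.parse(a.createdAt) - Date.parse(b.createdAt));
}
```

Because the function is pure and deterministic, running it over persisted events after a refresh should reproduce the same feed as the one built incrementally from live events.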
## Empty, Loading, and Error States

### No Events Yet

Show a task created card and a running placeholder:

"Beaver is preparing the first step."

### Waiting on Tool

Show the active tool call card with a spinner and elapsed time.

### Waiting on Agent

Show the active agent card with its assigned task and current status.

### Failed Tool

Show an error card with a concise reason and whether Beaver is retrying or changing approach.

### Lost Connection

Keep existing cards visible and show a small reconnecting indicator. If reconnect fails, fall back to polling.

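The lost-connection behavior can be modeled as a small state machine. The sketch below is one possible shape: the state and event names, and the limit of three reconnect attempts before falling back to polling, are assumptions rather than requirements from this design.

```ts
type ConnectionMode = 'live' | 'reconnecting' | 'polling';
type ConnectionState = { mode: ConnectionMode; failedAttempts: number };
type ConnectionEvent = 'socket_open' | 'socket_closed' | 'reconnect_failed';

// Assumption: give up on the WebSocket after this many failed reconnects.
const MAX_RECONNECT_ATTEMPTS = 3;

function nextConnectionState(
  state: ConnectionState,
  event: ConnectionEvent,
): ConnectionState {
  if (event === 'socket_open') {
    // A working socket always wins, even if polling had already started.
    return { mode: 'live', failedAttempts: 0 };
  }
  if (event === 'socket_closed') {
    // Only a live connection can be lost; otherwise nothing changes.
    return state.mode === 'live' ? { mode: 'reconnecting', failedAttempts: 0 } : state;
  }
  // event === 'reconnect_failed'
  if (state.mode !== 'reconnecting') return state;
  const failedAttempts = state.failedAttempts + 1;
  return {
    mode: failedAttempts >= MAX_RECONNECT_ATTEMPTS ? 'polling' : 'reconnecting',
    failedAttempts,
  };
}

// The small indicator is shown only while a reconnect is in progress.
function showsReconnectingIndicator(state: ConnectionState): boolean {
  return state.mode === 'reconnecting';
}
```

The card list is deliberately not part of this state, which matches the requirement that existing cards stay visible regardless of connection changes.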
## Acceptance Flow

The final result card is the primary acceptance surface.

Actions:

- Accept: closes the task and can trigger skill learning.
- Needs revision: requires a comment, appends a new revision card, and starts another attempt in the same timeline.
- Abandon: closes the task as abandoned and preserves the execution history.

After any acceptance action, the page should immediately update local UI state and refetch the task detail.

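A minimal sketch of the client-side rules for these actions, assuming a discriminated union for the action and status strings taken from the header's status list. The `AcceptanceAction` type, both function names, and the error message are hypothetical.

```ts
type AcceptanceAction =
  | { kind: 'accept' }
  | { kind: 'needs_revision'; comment: string }
  | { kind: 'abandon' };

// Returns an error message when the action cannot be submitted, else null.
// Only "needs revision" requires a non-empty comment.
function validateAcceptance(action: AcceptanceAction): string | null {
  if (action.kind === 'needs_revision' && action.comment.trim() === '') {
    return 'A comment is required when requesting a revision.';
  }
  return null;
}

// Status shown immediately, before the refetch confirms the server state.
// Assumption: status strings match the header's status list.
function optimisticStatus(action: AcceptanceAction): string {
  if (action.kind === 'accept') return 'closed';
  if (action.kind === 'abandon') return 'abandoned';
  return 'needs revision';
}
```

After applying the optimistic status, the page would refetch `GET /api/tasks/{task_id}` so that the revision card and any new attempt come from persisted records rather than local guesses.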
## V1 Scope

V1 includes:

- persistent task header
- live chronological card feed
- skill cards
- tool call and result cards
- agent team card
- sub-agent progress cards
- artifact cards
- final result and acceptance card
- WebSocket-first updates with polling fallback
- collapsed raw details

V1 excludes:

- full administrator audit mode
- memory retrieval graph visualization
- raw provider request/response viewer
- advanced event payload debugger
- editable task graph

## Implementation Notes

The existing `tasks/[taskId]/page.tsx` already has useful pieces, but the main hierarchy should shift from phase groups and selected node detail to a timeline-first experience.

Likely frontend modules:

- `TaskLiveHeader`
- `TaskTimeline`
- `TaskTimelineCard`
- `TaskSideRail`
- `TaskAcceptanceCard`
- `buildTaskTimelineCards`

Likely backend work:

- emit explicit process events for skill selection and tool calls
- include user-facing text summaries in event metadata
- ensure task detail reconstruction uses persisted events
- expose enough run and actor metadata for agent team rendering

## Self-Review

- No placeholders remain.
- The design is scoped to ordinary-user task detail, not admin audit.
- Realtime requirements distinguish live updates from expandable heavy details.
- Backend event requirements are explicit enough for frontend implementation.
- V1 scope avoids memory graph and debug payload work.