diff --git a/docs/superpowers/specs/2026-06-08-skill-replay-eval-design.md b/docs/superpowers/specs/2026-06-08-skill-replay-eval-design.md
index 4faa972..7426bab 100644
--- a/docs/superpowers/specs/2026-06-08-skill-replay-eval-design.md
+++ b/docs/superpowers/specs/2026-06-08-skill-replay-eval-design.md
@@ -23,7 +23,7 @@ This design also fixes revision draft generation dropping important content from
 
 ## Evaluation Model
 
-Each draft eval selects 3 to 5 historical cases.
+Each draft eval selects up to 10 historical cases. If fewer than 10 eligible cases exist, use as many as available. If more than 10 exist, select the 10 most relevant cases.
 
 For `revise_skill`, select accepted historical runs that activated the target skill/version. Prefer recent accepted runs, then diversify by task and session.
 
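Reviewer note: the selection rule in the added line can be read as the following minimal sketch. It is an illustration, not the spec's implementation. The `Run` record, its field names, and `select_cases` are hypothetical, and the spec does not define "most relevant"; here it is assumed to mean the `revise_skill` ordering (most recent accepted runs first, with one run per distinct task/session pair taken before any repeats).

```python
from dataclasses import dataclass
from datetime import datetime

MAX_CASES = 10  # cap stated in the revised spec line


@dataclass(frozen=True)
class Run:
    """Hypothetical record of one historical run; field names are assumptions."""
    run_id: str
    task: str
    session: str
    accepted: bool
    skill_version: str  # skill/version the run activated
    finished_at: datetime


def select_cases(runs, target_skill_version, limit=MAX_CASES):
    """Pick up to `limit` accepted runs that activated the target skill/version.

    Assumed ordering: newest first, preferring runs whose (task, session)
    pair has not been picked yet, then filling with the remaining runs.
    """
    eligible = [
        r for r in runs
        if r.accepted and r.skill_version == target_skill_version
    ]
    eligible.sort(key=lambda r: r.finished_at, reverse=True)

    diverse, repeats, seen = [], [], set()
    for run in eligible:
        key = (run.task, run.session)
        if key in seen:
            repeats.append(run)   # same task/session already represented
        else:
            seen.add(key)
            diverse.append(run)   # first (newest) run for this pair

    # Slicing returns everything when fewer than `limit` runs are eligible.
    return (diverse + repeats)[:limit]
```

With fewer than 10 eligible runs the function returns all of them, which matches the "use as many as available" clause. If the spec intends a different relevance measure, only the sort key and the grouping key would need to change.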