Title

Replay New Council Review Skill Gap-Fill Cases With Sub-Agents

Status

Execute the five newly added docs/tests/council-review-skill/ gap-fill cases with real sub-agents and bundled skill assets.
Capture concrete pass/fail evidence for each case and record the outcome in the workstream trace.

Run the five new council-review-skill case docs with sub-agents rather than direct CLI replay alone.
Use skills/council-review/ for leader roles and skills/inbox/ for reviewer roles where the case requires reviewer completion.
Validate outcomes from the main thread with bundled CLI commands and temp-path evidence.

docs/tests/council-review-skill/README.md
docs/tests/council-review-skill/council-report-show-all-includes-minority-through-bundled-cli.md
docs/tests/council-review-skill/council-report-rejects-invalid-show-through-bundled-cli.md
docs/tests/council-review-skill/council-tally-strict-keeps-distinct-proposals-through-bundled-cli.md
docs/tests/council-review-skill/council-reviewer-output-invalid-json-fails-tally-through-bundled-cli.md
docs/tests/council-review-skill/council-start-with-target-file-through-bundled-cli.md
docs/roadmaps/archive/council-review-skill-gap-fill-real-forward-test.md

Use sub-agents as the execution surface because the user explicitly asked for sub-agent-based testing.
Group the five cases into a few parallel runners to balance throughput against coordination overhead.
Prefer the documented forward-test model first; use main-thread validation commands to independently confirm the reported outcome.

initial double-case runners were too broad: leader sub-agents spent time on repository process discovery instead of immediately running the documented bundled-CLI steps
nested role-agent shell startup needed the narrower codex exec --dangerously-bypass-approvals-and-sandbox workaround before the local bundled CLI commands could start reliably

Commit or otherwise preserve the recorded real-forward evidence if the user wants the updated case docs saved in Git history.

All five newly added council-review-skill cases passed under real sub-agent execution with isolated temp DBs and bundled skill assets.
Main-thread validation independently confirmed the critical assertions for target-file, show all, invalid --show, strict tally semantics, and malformed-reviewer JSON failure at tally time.
Added Recorded Real Forward Run sections to the five case docs with concrete temp paths, run ids, thread ids, and validation summaries.
The final successful runs used narrower role prompts that explicitly forbade repo discovery or roadmap work before executing the bundled CLI workflow steps.