Title

Replay New Council Review Skill Gap-Fill Cases With Sub-Agents

Status

  • completed

Owner

  • Codex main agent

Started At

  • 2026-03-19

Goal

  • Execute the five newly added docs/tests/council-review-skill/ gap-fill cases with real sub-agents and bundled skill assets.
  • Capture concrete pass/fail evidence for each case and record the outcome in the workstream trace.

Scope

  • Run the five new council-review-skill case docs with sub-agents rather than direct CLI replay alone.
  • Use skills/council-review/ for leader roles and skills/inbox/ for reviewer roles where the case requires reviewer completion.
  • Validate outcomes from the main thread with bundled CLI commands and temp-path evidence.

Checklist

  • Review the relevant roadmap and case docs before execution.
  • Launch sub-agent runners for the five new council-review skill cases.
  • Collect final evidence and determine pass/fail for each case.
  • Update docs or recorded evidence as needed and archive this execution roadmap.

Files

  • docs/tests/council-review-skill/README.md
  • docs/tests/council-review-skill/council-report-show-all-includes-minority-through-bundled-cli.md
  • docs/tests/council-review-skill/council-report-rejects-invalid-show-through-bundled-cli.md
  • docs/tests/council-review-skill/council-tally-strict-keeps-distinct-proposals-through-bundled-cli.md
  • docs/tests/council-review-skill/council-reviewer-output-invalid-json-fails-tally-through-bundled-cli.md
  • docs/tests/council-review-skill/council-start-with-target-file-through-bundled-cli.md
  • docs/roadmaps/archive/council-review-skill-gap-fill-real-forward-test.md

Decisions

  • Use sub-agents as the execution surface because the user explicitly asked for sub-agent-based testing.
  • Group the five cases into a few parallel runners to balance throughput against coordination overhead.
  • Prefer the documented forward-test model first; use main-thread validation commands to independently confirm the reported outcome.

Blockers

  • Initial double-case runners were too broad: leader sub-agents spent time on repository process discovery instead of immediately running the documented bundled-CLI steps.
  • Nested role-agent shell startup needed the narrower codex exec --dangerously-bypass-approvals-and-sandbox workaround before the local bundled CLI commands could start reliably.
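
As a rough sketch of the workaround above, the nested role-agent invocation could be assembled as an argument list before being handed to a process launcher. The `codex exec` command and its flag are quoted from the blocker note; the `build_role_agent_command` helper and the prompt wording are assumptions made for illustration, not the prompts actually used.

```python
from typing import List


def build_role_agent_command(case_doc: str) -> List[str]:
    """Build the argv for one narrowly scoped nested role agent.

    Hypothetical helper: only the command name and flag come from the roadmap.
    """
    # Assumed prompt wording. It mirrors the narrowing described in the
    # Completion Summary: no repo discovery or roadmap work before the steps.
    prompt = (
        f"Run only the documented bundled-CLI steps in {case_doc}. "
        "Do not perform repository discovery or roadmap work first."
    )
    return [
        "codex",
        "exec",
        "--dangerously-bypass-approvals-and-sandbox",
        prompt,
    ]


if __name__ == "__main__":
    command = build_role_agent_command(
        "docs/tests/council-review-skill/council-start-with-target-file-through-bundled-cli.md"
    )
    # Printed rather than executed: the flag disables approvals and sandboxing.
    print(command)
```

Passing the command as a list rather than a shell string avoids quoting problems in the prompt. Because the flag bypasses approvals and the sandbox, this shape is only reasonable in a disposable environment such as the isolated temp paths used for these runs.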

Next Step

  • Commit or otherwise preserve the recorded real-forward evidence if the user wants the updated case docs saved in Git history.

Completion Summary

  • All five newly added council-review-skill cases passed under real sub-agent execution with isolated temp DBs and bundled skill assets.
  • Main-thread validation independently confirmed the critical assertions for each case: target-file start, show all, invalid --show rejection, strict tally semantics, and tally-time failure on malformed reviewer JSON.
  • Added Recorded Real Forward Run sections to the five case docs with concrete temp paths, run ids, thread ids, and validation summaries.
  • The final successful runs used narrower role prompts that explicitly forbade repo discovery or roadmap work before executing the bundled CLI workflow steps.
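
The evidence fields named above (temp paths, run ids, thread ids, validation summaries) could be kept in a small record type so that pass/fail is computed rather than hand-tallied. This is a minimal sketch under stated assumptions: the `ForwardRunEvidence` type, its field names, and the rendered layout are hypothetical, and the values in the usage example are placeholders, not recorded results.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ForwardRunEvidence:
    """One recorded real forward run (hypothetical structure)."""

    case_doc: str
    temp_path: str
    run_id: str
    thread_id: str
    validation_summary: str
    passed: bool


def render_section(evidence: ForwardRunEvidence) -> str:
    """Render a plain-text section; the layout is an assumption."""
    result = "pass" if evidence.passed else "fail"
    lines = [
        "Recorded Real Forward Run",
        "",
        f"  • Result: {result}",
        f"  • Temp path: {evidence.temp_path}",
        f"  • Run id: {evidence.run_id}",
        f"  • Thread id: {evidence.thread_id}",
        f"  • Validation: {evidence.validation_summary}",
    ]
    return "\n".join(lines)


def failed_cases(records: List[ForwardRunEvidence]) -> List[str]:
    """Return the case docs whose recorded run did not pass."""
    return [record.case_doc for record in records if not record.passed]


if __name__ == "__main__":
    # Placeholder values only; real ids and paths live in the case docs.
    example = ForwardRunEvidence(
        case_doc="council-start-with-target-file-through-bundled-cli.md",
        temp_path="<temp path>",
        run_id="<run id>",
        thread_id="<thread id>",
        validation_summary="<validation summary>",
        passed=True,
    )
    print(render_section(example))
    print(failed_cases([example]))
```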