Files
ai-workflow-skill/docs/roadmaps/archive/council-review-skill-direct-replay.md
T

3.0 KiB

Title

Direct Replay Of Council Review Skill Forward Tests

Status

  • completed

Owner

  • Codex main agent

Started At

  • 2026-03-19

Goal

  • Execute the documented docs/tests/council-review-skill/ forward-test scenarios with real subagents and bundled skill assets.
  • Collect pass/fail outcomes and concrete evidence for the current skill bundle behavior.

Scope

  • Run the current council-review skill test-plan cases against isolated temp DBs.
  • Use skills/council-review/ for the leader and skills/inbox/ for reviewers where the case requires reviewer completion.
  • Validate outcomes from the main thread with bundled CLI commands.

Checklist

  • Review the council-review skill test-plan directory and choose execution order.
  • Run council-report-rejects-before-tally-through-bundled-cli.
  • Run council-wait-timeout-through-bundled-cli.
  • Run council-brainstorm-end-to-end-through-bundled-cli.
  • Run council-unanimous-only-default-report-through-bundled-cli.
  • Summarize results and archive this execution roadmap.

Files

  • docs/tests/council-review-skill/README.md
  • docs/tests/council-review-skill/*.md
  • docs/roadmaps/archive/council-review-skill-direct-replay.md

Decisions

  • Start with the single-agent error/timeout cases to verify the leader skill behavior before spending time on four-agent end-to-end runs.
  • Keep each case in its own temp directory and DB for isolation.

Blockers

  • none

Next Step

  • If desired, append Recorded Example Run sections to the council-review skill case docs using the captured run ids and temp paths from this replay.

Completion Summary

  • council-report-rejects-before-tally-through-bundled-cli: passed on /tmp/council-skill-report-before-tally.AXZn2p/coord.db; main-thread replay returned exit code 30 with invalid_state and the expected “run council tally first” message.
  • council-wait-timeout-through-bundled-cli: passed on /tmp/council-skill-wait-timeout.csirvt/coord.db; main-thread replay returned woke == false, all_complete == false, and three visible reviewer statuses while orch status showed the run still running.
  • council-brainstorm-end-to-end-through-bundled-cli: passed on /tmp/council-skill-e2e.DLaTj6/coord.db; main-thread validation confirmed run.status == done, three reviewer tasks done, default report show == ["consensus","majority"], summary counts 1/1/1, and markdown artifact /tmp/council-skill-e2e.DLaTj6/.orch/reports/council_skill_001.md.
  • council-unanimous-only-default-report-through-bundled-cli: passed on /tmp/council-skill-unanimous.MzF1lp/coord.db; main-thread validation confirmed run.status == done, default report show == ["consensus"], preserved summary counts 1/1/1, and markdown artifact /tmp/council-skill-unanimous.MzF1lp/.orch/reports/council_skill_002.md.
  • One reviewer agent in the unanimous-only run had an initial thread-id parsing misstep, but it retried through the bundled inbox CLI and finished successfully; the case still passed under independent main-thread validation.