# Case: `council-brainstorm-end-to-end-through-bundled-cli` ## Test Type This is a `forward-test` and a high-level council workflow validation. The goal is to verify that a leader using the packaged `council-review` skill can drive `council start -> wait -> tally -> report` while three real reviewer agents return structured outputs through the packaged inbox skill. ## Purpose Validate that all of the following can be true at the same time: - the leader can use the bundled `./assets/orch` CLI through the council-review skill - three reviewer agents can claim and complete their fixed-role inbox tasks - the leader can wait, tally, and report after all reviewer outputs arrive - the final report defaults to `consensus,majority` - a markdown report artifact is written ## Preconditions - council-review skill path exists: `COUNCIL_SKILL_PATH=skills/council-review` - inbox skill path exists: `INBOX_SKILL_PATH=skills/inbox` - bundled CLI executables exist at `COUNCIL_SKILL_PATH/assets/orch` and `INBOX_SKILL_PATH/assets/inbox` - use an empty temporary directory `TMPDIR` - initialize `TMPDIR/coord.db` before launching role agents through `INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json init` ## Agent Topology - `leader` - `architecture-reviewer` - `implementation-reviewer` - `risk-reviewer` ## Inputs ### Leader Prompt ```text Use $council-review at COUNCIL_SKILL_PATH to act as leader on the already initialized SQLite DB TMPDIR/coord.db. Only coordinate through the bundled orch CLI from the skill. Workflow: 1) start council run council_skill_001 with a short architecture review prompt, 2) wait until all three reviewers complete, 3) tally with normal similarity, 4) report with default settings, 5) stop after reporting RUN_ID and REPORT_PATH. Do not use ordinary chat to coordinate with the reviewers. ``` ### Architecture Reviewer Prompt ```text Use $inbox at INBOX_SKILL_PATH to act as architecture-reviewer on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) fetch and claim your assigned council task, 2) complete it with done using this exact JSON body: {"reviewer_role":"architecture-reviewer","findings":[{"title":"Split contracts","summary":"Transport contracts are mixed into UI code.","proposal":"Move API contract definitions into a dedicated module.","rationale":"This lowers coupling.","confidence":"high","tags":["architecture"],"target_refs":{"repo_path":"."}},{"title":"Share helpers","summary":"Council report rendering paths are repeated.","proposal":"Introduce shared council coordinator helpers for report rendering.","rationale":"This keeps report assembly consistent.","confidence":"medium","tags":["reporting"],"target_refs":{"repo_path":"."}}]}, 3) stop after reporting THREAD_ID. Do not use ordinary chat to coordinate with the leader. ``` ### Implementation Reviewer Prompt ```text Use $inbox at INBOX_SKILL_PATH to act as implementation-reviewer on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) fetch and claim your assigned council task, 2) complete it with done using this exact JSON body: {"reviewer_role":"implementation-reviewer","findings":[{"title":"Extract contracts","summary":"Shared transport shapes are duplicated.","proposal":"Move API contract definitions into dedicated module","rationale":"This reduces duplication.","confidence":"high","tags":["maintainability"],"target_refs":{"repo_path":"."}},{"title":"Reuse report helpers","summary":"Formatting logic should stay shared.","proposal":"Introduce shared council coordinator helpers for report rendering","rationale":"This avoids formatter drift.","confidence":"medium","tags":["reporting"],"target_refs":{"repo_path":"."}}]}, 3) stop after reporting THREAD_ID. Do not use ordinary chat to coordinate with the leader. ``` ### Risk Reviewer Prompt ```text Use $inbox at INBOX_SKILL_PATH to act as risk-reviewer on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) fetch and claim your assigned council task, 2) complete it with done using this exact JSON body: {"reviewer_role":"risk-reviewer","findings":[{"title":"Lock contracts","summary":"Contract drift becomes risky over time.","proposal":"Move API contract definitions into a dedicated module.","rationale":"This reduces integration regressions.","confidence":"high","tags":["risk"],"target_refs":{"repo_path":"."}},{"title":"Cover JSON output","summary":"The council report response should stay stable.","proposal":"Add regression tests for council report JSON output.","rationale":"This catches contract regressions earlier.","confidence":"high","tags":["testing"],"target_refs":{"repo_path":"."}}]}, 3) stop after reporting THREAD_ID. Do not use ordinary chat to coordinate with the leader. ``` ## Execution Parameters - use the shared execution contract from [README.md](./README.md) - use the shared timeout defaults from [README.md](./README.md) - do not override the default cleanup policy ## Execution Steps 1. Initialize `TMPDIR/coord.db` once through the bundled inbox CLI before launching agents 2. Inject `skills/council-review/` into `leader` 3. Inject `skills/inbox/` into the three reviewer agents 4. Point all agents at the same database path `TMPDIR/coord.db` 5. Launch `leader`, `architecture-reviewer`, `implementation-reviewer`, and `risk-reviewer` in parallel 6. Wait for all agents to finish 7. Resolve `RUN_ID=council_skill_001`, reviewer `THREAD_ID`s, and `REPORT_PATH` from the agent outputs 8. Independently run the validation commands from the main thread ## Validation Commands ```bash COUNCIL_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json status --run council_skill_001 COUNCIL_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json council report --run council_skill_001 test -f REPORT_PATH ``` ## Expected Outcomes - the leader successfully starts `council_skill_001` - all three reviewers complete their fixed-role tasks - `council wait` returns `all_complete == true` - `council tally` returns one `consensus`, one `majority`, and one `minority` - `council report` defaults to showing `consensus,majority` - a markdown report artifact exists on disk ## Assertions - `status.data.run.status == "done"` - `status.data.tasks` contains exactly three reviewer tasks and all are `done` - `report.data.show == ["consensus","majority"]` - `report.data.summary.consensus == 1` - `report.data.summary.majority == 1` - `report.data.summary.minority == 1` - `report.data.grouped_recommendations` length is `2` - `REPORT_PATH` exists ## Cleanup - use the default cleanup policy from [README.md](./README.md) - if the run fails, retain `TMPDIR` and `coord.db` for replay and manual inspection