Case: council-report-rejects-invalid-show-through-bundled-cli
Test Type
This is a forward-test and an invalid-input report-filter validation.
The goal is to verify that a leader using the packaged council-review skill reaches the stable invalid_input error contract when it asks council report for an unsupported bucket list.
Purpose
Validate that all of the following can be true at the same time:
- the leader can drive a real council run through start -> wait -> tally
- three reviewer agents can complete their tasks through the packaged inbox skill
- the leader can attempt council report --show consensus,invalid
- the skill surfaces the stable invalid_input error instead of silently dropping the bad bucket
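The difference between rejecting an unsupported bucket and silently dropping it can be sketched as follows. This is an illustration only, not the orch implementation: the names parse_show and InvalidInput are hypothetical, and the accepted values and message wording are copied from the assertions and the recorded run in this case.

```python
# Hypothetical sketch of strict --show validation; not the actual orch code.
ACCEPTED_BUCKETS = ("consensus", "majority", "minority", "all")


class InvalidInput(ValueError):
    """Stands in for the invalid_input error contract (name is an assumption)."""


def parse_show(raw: str) -> list[str]:
    """Split a comma-separated --show value and reject any unknown bucket."""
    buckets = [part.strip() for part in raw.split(",") if part.strip()]
    unknown = [b for b in buckets if b not in ACCEPTED_BUCKETS]
    if unknown or not buckets:
        # Fail loudly instead of filtering out the unknown entries.
        raise InvalidInput("show must contain consensus, majority, minority, or all")
    return buckets
```

A lenient parser would return ["consensus"] for consensus,invalid and hide the caller's mistake; the strict form above raises instead, which is the behavior this case is meant to pin down.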
Preconditions
- council-review skill path exists: COUNCIL_SKILL_PATH=skills/council-review
- inbox skill path exists: INBOX_SKILL_PATH=skills/inbox
- bundled CLI executables exist at COUNCIL_SKILL_PATH/assets/orch and INBOX_SKILL_PATH/assets/inbox
- use an empty temporary directory TMPDIR
- initialize TMPDIR/coord.db before launching role agents through INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json init
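The preconditions above could be checked from a harness before any agent is launched. The following is a minimal sketch under stated assumptions: the helper names precondition_problems and init_command are hypothetical, and only the paths and flags listed in this section are used.

```python
import os


def precondition_problems(council_skill_path: str, inbox_skill_path: str, tmpdir: str) -> list[str]:
    """Return a list of unmet preconditions; an empty list means ready to init."""
    problems = []
    # Bundled CLI locations as listed in the preconditions.
    for cli in (
        os.path.join(council_skill_path, "assets", "orch"),
        os.path.join(inbox_skill_path, "assets", "inbox"),
    ):
        if not (os.path.isfile(cli) and os.access(cli, os.X_OK)):
            problems.append(f"missing or non-executable bundled CLI: {cli}")
    if not os.path.isdir(tmpdir):
        problems.append(f"TMPDIR is not a directory: {tmpdir}")
    elif os.listdir(tmpdir):
        problems.append(f"TMPDIR is not empty: {tmpdir}")
    return problems


def init_command(inbox_skill_path: str, tmpdir: str) -> list[str]:
    """Build the argv for the one-time database initialization."""
    return [
        os.path.join(inbox_skill_path, "assets", "inbox"),
        "--db", os.path.join(tmpdir, "coord.db"),
        "--json", "init",
    ]
```

Returning a list of problems rather than raising on the first one lets a failed setup report every missing piece in a single pass.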
Agent Topology
- leader
- architecture-reviewer
- implementation-reviewer
- risk-reviewer
Inputs
Leader Prompt
Use $council-review at COUNCIL_SKILL_PATH to act as leader on the already initialized SQLite DB TMPDIR/coord.db. Only coordinate through the bundled orch CLI from the skill. Workflow: 1) start council run council_skill_006 with a short architecture review prompt, 2) wait until all three reviewers complete, 3) tally with normal similarity, 4) attempt council report with --show consensus,invalid, 5) stop after reporting RUN_ID, exit code, and the error payload you observed. Do not use ordinary chat to coordinate with the reviewers.
Reviewer Prompts
- Reuse the same reviewer body JSON and inbox-only workflow as in council-brainstorm-end-to-end-through-bundled-cli.md, but target run council_skill_006.
Execution Parameters
- use the shared execution contract from README.md
- use the shared timeout defaults from README.md
- do not override the default cleanup policy
Execution Steps
- Initialize TMPDIR/coord.db once through the bundled inbox CLI before launching agents
- Inject skills/council-review/ into leader
- Inject skills/inbox/ into the three reviewer agents
- Point all agents at the same database path TMPDIR/coord.db
- Launch leader, architecture-reviewer, implementation-reviewer, and risk-reviewer in parallel
- Wait for all agents to finish
- Independently run the validation commands from the main thread
Validation Commands
COUNCIL_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json council report --run council_skill_006 --show consensus,invalid
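For a scripted main-thread check, the validation command can be assembled and run as sketched below. The helper names report_command and run_report are hypothetical; the flags are exactly those in the command above. This case does not state whether the error payload is written to stdout or stderr, so both streams are captured.

```python
import subprocess


def report_command(council_skill_path: str, db_path: str, run_id: str, show: str) -> list[str]:
    """Build the argv for the council report validation command."""
    return [
        f"{council_skill_path}/assets/orch",
        "--db", db_path,
        "--json",
        "council", "report",
        "--run", run_id,
        "--show", show,
    ]


def run_report(argv: list[str]) -> tuple[int, str, str]:
    """Run the command and return (exit code, stdout, stderr).

    check=False keeps a non-zero exit from raising, because a non-zero
    exit code is the expected result in this case.
    """
    done = subprocess.run(argv, capture_output=True, text=True, check=False)
    return done.returncode, done.stdout, done.stderr
```

Passing the arguments as a list avoids shell quoting issues with the comma-separated --show value.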
Expected Outcomes
- the leader successfully starts council_skill_006
- reviewer completion and tally both succeed before the invalid report attempt
- the report command exits with the stable invalid-input contract
- the error message names the accepted bucket values
Assertions
- command exit code is 30
- error code is invalid_input
- the error message mentions consensus
- the error message mentions majority
- the error message mentions minority
- the error message mentions all
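The assertions above can be expressed as one check over values already extracted from the command result. This is a sketch: the function name assertion_failures is hypothetical, and how the error code and message are pulled out of the JSON payload is left to the caller because this case does not spell out the payload layout.

```python
EXPECTED_EXIT_CODE = 30
EXPECTED_ERROR_CODE = "invalid_input"
REQUIRED_MENTIONS = ("consensus", "majority", "minority", "all")


def assertion_failures(exit_code: int, error_code: str, message: str) -> list[str]:
    """Return one entry per failed assertion; an empty list means the case passes."""
    failures = []
    if exit_code != EXPECTED_EXIT_CODE:
        failures.append(f"exit code was {exit_code}, expected {EXPECTED_EXIT_CODE}")
    if error_code != EXPECTED_ERROR_CODE:
        failures.append(f"error code was {error_code!r}, expected {EXPECTED_ERROR_CODE!r}")
    for bucket in REQUIRED_MENTIONS:
        # Plain substring check, matching the "mentions" wording of the assertions.
        if bucket not in message:
            failures.append(f"error message does not mention {bucket}")
    return failures
```

For example, assertion_failures(30, "invalid_input", "show must contain consensus, majority, minority, or all") returns an empty list.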
Cleanup
- use the default cleanup policy from README.md
- if the run fails, retain TMPDIR and coord.db for replay and manual inspection
Recorded Real Forward Run
- recorded on: 2026-03-19
- execution mode: real_subagent_forward_test
- result: pass
- evidence root: /tmp/council-skill-invalid-show-narrow.Sw6so6
- observed run id: council_skill_006
- observed thread ids:
  - architecture-reviewer: thr_7fad634dd9d245239d4fbd2287992d54
  - implementation-reviewer: thr_fc76cff125f04fc491064b828a18ff69
  - risk-reviewer: thr_f421bf49fa1240beb5c7a2d5f38aab6b
- evidence summary:
  - main-thread status --run council_skill_006 --json returned run.status == "done" and task_counts.done == 3
  - main-thread council report --run council_skill_006 --show consensus,invalid --json exited with code 30
  - the returned error payload was invalid_input with message "show must contain consensus, majority, minority, or all"
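The status evidence recorded above can be re-checked on replay with a small helper. This is a sketch: the function name run_is_complete is hypothetical, and it relies only on the two fields named in the evidence summary (run.status and task_counts.done); any other fields in the status payload are unknown here.

```python
import json


def run_is_complete(status_json: str, expected_done: int = 3) -> bool:
    """Check that the run is done and that every reviewer task finished."""
    payload = json.loads(status_json)
    run = payload.get("run", {})
    counts = payload.get("task_counts", {})
    return run.get("status") == "done" and counts.get("done") == expected_done
```

The default of 3 corresponds to the three reviewer agents in this case.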