Record council-review skill test evidence
@@ -84,3 +84,19 @@ COUNCIL_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json council report --run
- use the default cleanup policy from [README.md](./README.md)
- if the run fails, retain `TMPDIR` and `coord.db` for replay and manual inspection

## Recorded Real Forward Run

- recorded on: `2026-03-19`
- execution mode: `real_subagent_forward_test`
- result: `pass`
- evidence root: `/tmp/council-skill-invalid-show-narrow.Sw6so6`
- observed run id: `council_skill_006`
- observed thread ids:
  - `architecture-reviewer`: `thr_7fad634dd9d245239d4fbd2287992d54`
  - `implementation-reviewer`: `thr_fc76cff125f04fc491064b828a18ff69`
  - `risk-reviewer`: `thr_f421bf49fa1240beb5c7a2d5f38aab6b`
- evidence summary:
  - main-thread `status --run council_skill_006 --json` returned `run.status == "done"` and `task_counts.done == 3`
  - main-thread `council report --run council_skill_006 --show consensus,invalid --json` exited with code `30`
  - the returned error payload was `invalid_input` with message `show must contain consensus, majority, minority, or all`

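The rejected `--show consensus,invalid` call recorded above can be illustrated with a minimal Python sketch. It is not the bundled CLI's implementation: `parse_show`, `run`, `InvalidInput`, the comma-splitting rule, the expansion of `all`, and the shape of the returned payload dictionary are all assumptions made for illustration. Only the exit code `30`, the `invalid_input` error name, and the message text come from the recorded run.

```python
ALLOWED_SHOW = ("consensus", "majority", "minority")
EXIT_INVALID_INPUT = 30  # exit code observed in the recorded run


class InvalidInput(ValueError):
    """Hypothetical error type standing in for the `invalid_input` payload."""


def parse_show(raw: str) -> list:
    """Split a comma-separated --show value and reject unknown tokens."""
    tokens = [t.strip() for t in raw.split(",") if t.strip()]
    if not tokens or any(t != "all" and t not in ALLOWED_SHOW for t in tokens):
        raise InvalidInput("show must contain consensus, majority, minority, or all")
    if "all" in tokens:
        # Assumption: `all` expands to every bucket.
        return list(ALLOWED_SHOW)
    # Keep a canonical order and drop duplicates.
    return [s for s in ALLOWED_SHOW if s in tokens]


def run(raw: str) -> tuple:
    """Return (exit_code, payload) the way a CLI wrapper might (assumed shape)."""
    try:
        return 0, {"show": parse_show(raw)}
    except InvalidInput as exc:
        return EXIT_INVALID_INPUT, {"error": "invalid_input", "message": str(exc)}
```

Under these assumptions, `run("consensus,invalid")` yields exit code `30` with the recorded message, because one unknown token is enough to reject the whole value.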
@@ -88,3 +88,20 @@ test -f REPORT_PATH
- use the default cleanup policy from [README.md](./README.md)
- if the run fails, retain `TMPDIR` and `coord.db` for replay and manual inspection

## Recorded Real Forward Run

- recorded on: `2026-03-19`
- execution mode: `real_subagent_forward_test`
- result: `pass`
- evidence root: `/tmp/council-skill-show-all-narrow.Uk0ThB`
- observed run id: `council_skill_005`
- observed thread ids:
  - `architecture-reviewer`: `thr_c4cb0a9a5dd142619e854fc0f3864ea8`
  - `implementation-reviewer`: `thr_3a54f2e1bc6945f38627958f7f6b4728`
  - `risk-reviewer`: `thr_16765453dedf45b4a6ccf4ecfab710db`
- observed report path: `/tmp/council-skill-show-all-narrow.Uk0ThB/.orch/reports/council_skill_005.md`
- evidence summary:
  - main-thread `status --run council_skill_005 --json` returned `run.status == "done"` and `task_counts.done == 3`
  - main-thread `council report --run council_skill_005 --show all --json` returned `show == ["consensus","majority","minority"]`, summary counts `1/1/1`, and `grouped_recommendations` length `3`
  - the returned groups included a `minority` bucket and the markdown artifact existed on disk

@@ -107,3 +107,20 @@ COUNCIL_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json council tally --run c
- use the default cleanup policy from [README.md](./README.md)
- if the run fails, retain `TMPDIR` and `coord.db` for replay and manual inspection

## Recorded Real Forward Run

- recorded on: `2026-03-19`
- execution mode: `real_subagent_forward_test`
- result: `pass`
- evidence root: `/tmp/council-reviewer-output-invalid-json-fails-tally-through-bundled-cli.narrow1.i6ZP98`
- observed run id: `council_skill_008`
- observed thread ids:
  - `architecture-reviewer`: `thr_350c43fdf8a449228b8611ce5114326d`
  - `implementation-reviewer`: `thr_db858b530cb044a7bceeaa417f1cea75`
  - `risk-reviewer`: `thr_1c93381b070c47c49e312039b8343655`
- evidence summary:
  - main-thread `council wait --run council_skill_008 --timeout-seconds 2 --json` returned `woke == true` and `all_complete == true`
  - main-thread `council tally --run council_skill_008 --similarity normal --json` exited with code `30`
  - the returned error payload was `invalid_input` with message `reviewer output must be valid JSON`
  - this run confirmed the negative path where reviewer tasks are all `done` but tally still fails on stored reviewer-output validation

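The negative path recorded above, where every reviewer task is `done` yet the tally still fails, can be sketched as a validation pass over the stored outputs. This is a hedged illustration only: the `tally` function, its dictionary input, and the success payload are hypothetical, and the real CLI may validate more than JSON syntax. The exit code `30` and the error message are taken from the recorded run.

```python
import json

EXIT_INVALID_INPUT = 30  # exit code observed in the recorded run


def tally(reviewer_outputs: dict) -> tuple:
    """Parse each stored reviewer output; fail the whole tally on the first bad one.

    `reviewer_outputs` maps a reviewer name to its raw stored text (assumed shape).
    """
    parsed = {}
    for reviewer, raw in reviewer_outputs.items():
        try:
            parsed[reviewer] = json.loads(raw)
        except json.JSONDecodeError:
            return EXIT_INVALID_INPUT, {
                "error": "invalid_input",
                "message": "reviewer output must be valid JSON",
            }
    # Hypothetical success payload: just the reviewers whose output parsed.
    return 0, {"reviewers": sorted(parsed)}
```

The point of the sketch is that task completion and output validity are checked at different stages, so a run can report all tasks `done` and still be rejected at tally time.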
@@ -95,3 +95,19 @@ sqlite3 TMPDIR/coord.db "SELECT acceptance_json FROM tasks WHERE run_id = 'counc
- use the default cleanup policy from [README.md](./README.md)
- if the run fails, retain `TMPDIR`, `brief.md`, and `coord.db` for replay and manual inspection

## Recorded Real Forward Run

- recorded on: `2026-03-19`
- execution mode: `real_subagent_forward_test`
- result: `pass`
- evidence root: `/tmp/council-skill-target-file.ikPOLP`
- observed run id: `council_skill_009`
- observed thread ids:
  - `CR1`: `thr_32df58f9b55945b899257f583708b7ef`
  - `CR2`: `thr_c5f8c552cb1240649546df8386be3668`
  - `CR3`: `thr_172eabff13eb48ed9af2deee928a9438`
- evidence summary:
  - main-thread `status --run council_skill_009 --json` returned three `dispatched` council tasks and a non-terminal run
  - main-thread `sqlite3` validation showed `council_inputs.target_file == "/tmp/council-skill-target-file.ikPOLP/brief.md"` with empty `prompt`, `repo_path`, and `target_task_id`
  - main-thread `sqlite3` validation of `CR1` acceptance JSON showed the same `target_file` persisted into the council task payload

@@ -102,3 +102,19 @@ COUNCIL_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json council tally --run c
- use the default cleanup policy from [README.md](./README.md)
- if the run fails, retain `TMPDIR` and `coord.db` for replay and manual inspection

## Recorded Real Forward Run

- recorded on: `2026-03-19`
- execution mode: `real_subagent_forward_test`
- result: `pass`
- evidence root: `/tmp/council-tally-strict-keeps-distinct-proposals-through-bundled-cli.narrow4.UCbqOc`
- observed run id: `council_skill_007`
- observed thread ids:
  - `architecture-reviewer`: `thr_9e153f61692b4475a55f5c3068842ea5`
  - `implementation-reviewer`: `thr_abbd9a2961374b13b3d3e27720fe27ab`
  - `risk-reviewer`: `thr_3f2d64211f274f64b606bd8b8c6be5f7`
- evidence summary:
  - main-thread `council wait --run council_skill_007 --timeout-seconds 2 --json` returned `woke == true` and `all_complete == true`
  - main-thread `council tally --run council_skill_007 --similarity strict --json` returned `similarity == "strict"` and `counts.minority == 3`
  - the returned proposal set preserved all three distinct values, including both `Move API contract definitions into a dedicated module.` and `Move API contract definitions into dedicated module`

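The `strict` result recorded above can be illustrated with a small Python sketch of how exact-text grouping keeps near-identical proposals apart. Everything here is an assumption made for illustration: the `normalize` and `tally_counts` helpers, the bucket thresholds (all reviewers, more than half, otherwise minority), the looser `normal` rule shown for contrast, and the third proposal string. Only the two quoted proposals and the `counts.minority == 3` outcome under `strict` come from the recorded run.

```python
import re
from collections import Counter

PROPOSALS = [
    "Move API contract definitions into a dedicated module.",
    "Move API contract definitions into dedicated module",
    "Add contract tests for the API.",  # hypothetical third proposal
]


def normalize(text: str, similarity: str) -> str:
    """Return the grouping key for a proposal under the given similarity mode."""
    if similarity == "strict":
        return text  # exact text only, so punctuation and articles matter
    # Hypothetical "normal" rule: ignore case, punctuation, and articles.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(w for w in words if w not in {"a", "an", "the"})


def tally_counts(proposals: list, similarity: str) -> dict:
    """Bucket grouped proposals by how many reviewers backed each one."""
    total = len(proposals)  # assumes one proposal per reviewer
    groups = Counter(normalize(p, similarity) for p in proposals)
    counts = {"consensus": 0, "majority": 0, "minority": 0}
    for votes in groups.values():
        if votes == total:
            counts["consensus"] += 1
        elif votes * 2 > total:
            counts["majority"] += 1
        else:
            counts["minority"] += 1
    return counts
```

With these inputs, `tally_counts(PROPOSALS, "strict")` places all three proposals in the `minority` bucket, matching the recorded count, while the hypothetical looser rule would merge the first two into one group.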