From a20bec1cacd909d6dd701dc5c79b83c0ce7a3e71 Mon Sep 17 00:00:00 2001 From: kurihada Date: Thu, 19 Mar 2026 16:27:28 +0800 Subject: [PATCH] Author orch Markdown test plan --- docs/implementation-roadmap.md | 14 +- .../archive/orch-markdown-test-cases.md | 65 ++++++ .../archive/orch-markdown-test-gap-fill.md | 71 ++++++ .../archive/orch-markdown-test-review.md | 59 +++++ docs/tests/orch/ROADMAP.md | 209 ++++++++++++------ docs/tests/orch/_shared/README.md | 16 ++ docs/tests/orch/answer/README.md | 10 +- ...nswer-accepts-payload-json-without-body.md | 37 ++++ .../answer-appends-answer-to-active-thread.md | 36 +++ .../answer-rejects-empty-body-and-payload.md | 29 +++ docs/tests/orch/blocked/README.md | 8 +- ...-lists-latest-question-for-blocked-task.md | 38 ++++ docs/tests/orch/cancel/README.md | 9 +- .../orch/cancel/cancel-cancels-entire-run.md | 30 +++ .../orch/cancel/cancel-cancels-single-task.md | 32 +++ docs/tests/orch/cleanup/README.md | 10 +- .../cleanup-rejects-attempt-without-task.md | 29 +++ .../cleanup-removes-completed-worktree.md | 37 ++++ ...urns-no-matching-work-when-filters-miss.md | 33 +++ docs/tests/orch/council-report/README.md | 13 +- ...port-defaults-to-consensus-and-majority.md | 70 ++++++ ...to-consensus-when-run-is-only-unanimous.md | 73 ++++++ .../council-report-json-shape-is-stable.md | 41 ++++ .../council-report-rejects-before-tally.md | 32 +++ .../council-report-rejects-invalid-show.md | 29 +++ ...uncil-report-show-all-includes-minority.md | 29 +++ docs/tests/orch/council-start/README.md | 8 +- ...ouncil-start-dispatches-three-reviewers.md | 46 ++++ docs/tests/orch/council-tally/README.md | 9 +- ...groups-reviewer-findings-in-normal-mode.md | 67 ++++++ ...keeps-distinct-proposals-in-strict-mode.md | 61 +++++ docs/tests/orch/council-wait/README.md | 9 +- ...ait-times-out-when-reviewers-incomplete.md | 35 +++ ...-wait-wakes-when-all-reviewers-complete.md | 53 +++++ docs/tests/orch/dep-add/README.md | 8 +- 
...ndent-task-until-prerequisite-completes.md | 38 ++++ docs/tests/orch/dispatch/README.md | 14 +- ...-allows-explicit-base-ref-on-dirty-repo.md | 31 +++ ...uto-enables-worktree-for-code-like-task.md | 34 +++ ...eates-attempt-and-thread-for-ready-task.md | 38 ++++ .../dispatch-creates-strict-worktree.md | 39 ++++ ...tch-rejects-dirty-repo-without-base-ref.md | 30 +++ .../dispatch-rejects-non-ready-task.md | 34 +++ ...h-skips-auto-worktree-for-non-code-task.md | 35 +++ docs/tests/orch/ready/README.md | 9 +- .../ready/ready-lists-only-eligible-tasks.md | 38 ++++ ...y-orders-by-priority-and-respects-limit.md | 39 ++++ docs/tests/orch/reassign/README.md | 8 +- ...s-old-thread-and-dispatches-new-attempt.md | 39 ++++ docs/tests/orch/reconcile/README.md | 9 +- ...laimed-or-in-progress-thread-to-running.md | 34 +++ ...or-failed-thread-to-terminal-task-state.md | 34 +++ docs/tests/orch/retry/README.md | 8 +- ...try-creates-new-attempt-for-failed-task.md | 40 ++++ docs/tests/orch/run-init/README.md | 8 +- .../orch/run-init/run-init-creates-new-run.md | 34 +++ docs/tests/orch/run-show/README.md | 8 +- ...how-returns-run-summary-and-task-counts.md | 35 +++ docs/tests/orch/status/README.md | 8 +- ...tatus-returns-run-summary-and-task-list.md | 37 ++++ docs/tests/orch/task-add/README.md | 10 +- .../task-add-creates-ready-root-task.md | 36 +++ ...ask-add-rejects-invalid-acceptance-json.md | 27 +++ .../task-add-rejects-invalid-priority.md | 27 +++ docs/tests/orch/wait/README.md | 9 +- .../wait-times-out-without-matching-event.md | 33 +++ .../wait/wait-wakes-on-matching-run-event.md | 40 ++++ docs/tests/orch/workflows/README.md | 167 +++++++++++++- 68 files changed, 2225 insertions(+), 160 deletions(-) create mode 100644 docs/roadmaps/archive/orch-markdown-test-cases.md create mode 100644 docs/roadmaps/archive/orch-markdown-test-gap-fill.md create mode 100644 docs/roadmaps/archive/orch-markdown-test-review.md create mode 100644 
docs/tests/orch/answer/answer-accepts-payload-json-without-body.md
 create mode 100644 docs/tests/orch/answer/answer-appends-answer-to-active-thread.md
 create mode 100644 docs/tests/orch/answer/answer-rejects-empty-body-and-payload.md
 create mode 100644 docs/tests/orch/blocked/blocked-lists-latest-question-for-blocked-task.md
 create mode 100644 docs/tests/orch/cancel/cancel-cancels-entire-run.md
 create mode 100644 docs/tests/orch/cancel/cancel-cancels-single-task.md
 create mode 100644 docs/tests/orch/cleanup/cleanup-rejects-attempt-without-task.md
 create mode 100644 docs/tests/orch/cleanup/cleanup-removes-completed-worktree.md
 create mode 100644 docs/tests/orch/cleanup/cleanup-returns-no-matching-work-when-filters-miss.md
 create mode 100644 docs/tests/orch/council-report/council-report-defaults-to-consensus-and-majority.md
 create mode 100644 docs/tests/orch/council-report/council-report-defaults-to-consensus-when-run-is-only-unanimous.md
 create mode 100644 docs/tests/orch/council-report/council-report-json-shape-is-stable.md
 create mode 100644 docs/tests/orch/council-report/council-report-rejects-before-tally.md
 create mode 100644 docs/tests/orch/council-report/council-report-rejects-invalid-show.md
 create mode 100644 docs/tests/orch/council-report/council-report-show-all-includes-minority.md
 create mode 100644 docs/tests/orch/council-start/council-start-dispatches-three-reviewers.md
 create mode 100644 docs/tests/orch/council-tally/council-tally-groups-reviewer-findings-in-normal-mode.md
 create mode 100644 docs/tests/orch/council-tally/council-tally-keeps-distinct-proposals-in-strict-mode.md
 create mode 100644 docs/tests/orch/council-wait/council-wait-times-out-when-reviewers-incomplete.md
 create mode 100644 docs/tests/orch/council-wait/council-wait-wakes-when-all-reviewers-complete.md
 create mode 100644 docs/tests/orch/dep-add/dep-add-blocks-dependent-task-until-prerequisite-completes.md
 create mode 100644
docs/tests/orch/dispatch/dispatch-allows-explicit-base-ref-on-dirty-repo.md
 create mode 100644 docs/tests/orch/dispatch/dispatch-auto-enables-worktree-for-code-like-task.md
 create mode 100644 docs/tests/orch/dispatch/dispatch-creates-attempt-and-thread-for-ready-task.md
 create mode 100644 docs/tests/orch/dispatch/dispatch-creates-strict-worktree.md
 create mode 100644 docs/tests/orch/dispatch/dispatch-rejects-dirty-repo-without-base-ref.md
 create mode 100644 docs/tests/orch/dispatch/dispatch-rejects-non-ready-task.md
 create mode 100644 docs/tests/orch/dispatch/dispatch-skips-auto-worktree-for-non-code-task.md
 create mode 100644 docs/tests/orch/ready/ready-lists-only-eligible-tasks.md
 create mode 100644 docs/tests/orch/ready/ready-orders-by-priority-and-respects-limit.md
 create mode 100644 docs/tests/orch/reassign/reassign-cancels-old-thread-and-dispatches-new-attempt.md
 create mode 100644 docs/tests/orch/reconcile/reconcile-maps-claimed-or-in-progress-thread-to-running.md
 create mode 100644 docs/tests/orch/reconcile/reconcile-maps-done-or-failed-thread-to-terminal-task-state.md
 create mode 100644 docs/tests/orch/retry/retry-creates-new-attempt-for-failed-task.md
 create mode 100644 docs/tests/orch/run-init/run-init-creates-new-run.md
 create mode 100644 docs/tests/orch/run-show/run-show-returns-run-summary-and-task-counts.md
 create mode 100644 docs/tests/orch/status/status-returns-run-summary-and-task-list.md
 create mode 100644 docs/tests/orch/task-add/task-add-creates-ready-root-task.md
 create mode 100644 docs/tests/orch/task-add/task-add-rejects-invalid-acceptance-json.md
 create mode 100644 docs/tests/orch/task-add/task-add-rejects-invalid-priority.md
 create mode 100644 docs/tests/orch/wait/wait-times-out-without-matching-event.md
 create mode 100644 docs/tests/orch/wait/wait-wakes-on-matching-run-event.md
diff --git a/docs/implementation-roadmap.md b/docs/implementation-roadmap.md
index 53eab8c..5fbd8c2 100644
--- a/docs/implementation-roadmap.md
+++
b/docs/implementation-roadmap.md @@ -21,7 +21,7 @@ As of now: - `inbox` supports blocking waits, lease renewal, unread fetches backed by per-agent read cursors, `--body-file`, artifact attachments, and structured JSON errors with stable exit codes - integration tests now cover each implemented inbox command, plus the main inbox workflows, wait/watch flows, artifact persistence, unread behavior, and JSON error contracts - a human-readable inbox command test-plan set has been authored under `docs/tests/inbox/` -- an inbox-style human-readable `orch` test-plan skeleton now exists under `docs/tests/orch/`, with a `ROADMAP.md`, shared conventions, workflow entrypoint, and one index `README.md` per implemented leaf command +- a human-readable `orch` test-plan set has now been authored under `docs/tests/orch/`, with a `ROADMAP.md`, shared conventions, workflow scenarios, per-command indexes, and concrete case documents aligned to the current CLI surface, including supplemental coverage for key flag validation, ordering/limit behavior, payload-only answers, cleanup errors, and council report default/error contracts - a reusable Codex skill package for `inbox` now exists under `skills/inbox/`, with a formal `SKILL.md`, `agents/openai.yaml`, and a bundled CLI binary asset - an inbox skill forward-test plan directory now exists under `docs/tests/inbox-skill/`, with a shared execution template and multiple scenario cases - an execution-roadmap workflow now exists under `docs/roadmaps/active/` and `docs/roadmaps/archive/` for agent-level work traces and completion archives @@ -480,9 +480,10 @@ Definition of done: Status: -- initial skeleton created under `docs/tests/orch/` -- global conventions, shared fixtures, workflow entrypoint, and per-command index documents now exist -- command-case Markdown files are still pending and should be authored against the backlog in `docs/tests/orch/ROADMAP.md` +- current planned `orch` Markdown test-plan set is authored under 
`docs/tests/orch/` +- global conventions, shared fixtures, workflow scenarios, per-command indexes, and concrete case documents now exist +- `docs/tests/orch/ROADMAP.md` now tracks authored counts, document progress, and future additions in the same style used for `docs/tests/inbox/ROADMAP.md` +- supplemental command-visible cases now cover high-value gaps in `task add`, `ready`, `answer`, `cleanup`, and `council report` Goal: @@ -495,17 +496,18 @@ Directory layout: - `docs/tests/orch/_shared/README.md` - `docs/tests/orch/workflows/README.md` - `docs/tests/orch//README.md` +- `docs/tests/orch//.md` Current document model: - one folder per implemented leaf command - each command folder uses `README.md` as an index only - workflow cases live in `docs/tests/orch/workflows/README.md` -- detailed case backlog is tracked centrally in `docs/tests/orch/ROADMAP.md` +- detailed case backlog and authored-case register are tracked centrally in `docs/tests/orch/ROADMAP.md` Next documentation step: -- start authoring the highest-value scheduler and workflow cases from `docs/tests/orch/ROADMAP.md`, beginning with the happy-path lifecycle, dependency-blocked-answer flow, and strict worktree dispatch cases +- keep `docs/tests/orch/ROADMAP.md` synchronized when new `orch` CLI behavior or workflow cases are added, removed, or materially revised ## Out Of Scope For First Pass diff --git a/docs/roadmaps/archive/orch-markdown-test-cases.md b/docs/roadmaps/archive/orch-markdown-test-cases.md new file mode 100644 index 0000000..d2d42fd --- /dev/null +++ b/docs/roadmaps/archive/orch-markdown-test-cases.md @@ -0,0 +1,65 @@ +# Orch Markdown Test Cases + +## Status + +- `completed` + +## Owner + +- codex + +## Started At + +- `2026-03-19` + +## Goal + +- author the first complete wave of human-readable Markdown test-plan cases for `orch`, using the existing `docs/tests/inbox/` structure and writing style as the reference model + +## Scope + +- add concrete command-case Markdown files under 
`docs/tests/orch//` +- add concrete workflow cases under `docs/tests/orch/workflows/README.md` +- update each affected command index `README.md` +- keep `docs/tests/orch/ROADMAP.md` synchronized with authored files, authored-case counts, and remaining backlog +- update `docs/implementation-roadmap.md` if the documentation progress materially changes + +## Checklist + +- [x] create or adopt an execution roadmap for the workstream +- [x] inspect representative `inbox` Markdown case files and map `orch` cases from current backlog +- [x] author core scheduler command cases and update their indexes +- [x] author interactive leader command cases and update their indexes +- [x] author council command cases and update their indexes +- [x] author workflow cases and synchronize `docs/tests/orch/ROADMAP.md` +- [x] update `docs/implementation-roadmap.md` +- [x] archive this roadmap with a completion summary + +## Files + +- `docs/roadmaps/archive/orch-markdown-test-cases.md` +- `docs/implementation-roadmap.md` +- `docs/tests/orch/ROADMAP.md` +- `docs/tests/orch/workflows/README.md` +- `docs/tests/orch//README.md` +- `docs/tests/orch//.md` + +## Decisions + +- use the existing `docs/tests/inbox/` case structure and tone as the immediate template for `orch` +- keep shared index/count integration in the main thread, while delegating disjoint command-directory write scopes to sub-agents +- keep `docs/tests/orch/ROADMAP.md` aligned with the richer `docs/tests/inbox/ROADMAP.md` format by tracking individual case files in `Document Progress` and `Authored Case Register` + +## Blockers + +- none + +## Next Step + +- no further step in this workstream; future changes should start from `docs/tests/orch/ROADMAP.md` if new cases are added + +## Completion Summary + +- authored the current planned `orch` Markdown test-plan set across workflow, core scheduler, interactive control, and council command surfaces +- synchronized `docs/tests/orch/ROADMAP.md` with real files on disk, 
authored-case counts, and a completed authored-case register +- updated `docs/implementation-roadmap.md` so future agents see `docs/tests/orch/` as an authored test-plan set rather than only a skeleton diff --git a/docs/roadmaps/archive/orch-markdown-test-gap-fill.md b/docs/roadmaps/archive/orch-markdown-test-gap-fill.md new file mode 100644 index 0000000..2082999 --- /dev/null +++ b/docs/roadmaps/archive/orch-markdown-test-gap-fill.md @@ -0,0 +1,71 @@ +# Orch Markdown Test Gap Fill + +## Status + +- `completed` + +## Owner + +- codex + +## Started At + +- `2026-03-19` + +## Goal + +- fill the five highest-priority Markdown test-plan gaps in `docs/tests/orch/` so the current `orch` CLI has better coverage for key flag validation, non-default output behavior, and error contracts + +## Scope + +- add or update `docs/tests/orch/task-add/` for invalid `acceptance-json` and invalid `priority` +- add or update `docs/tests/orch/ready/` for priority ordering and `--limit` +- add or update `docs/tests/orch/answer/` for payload-only or empty-input validation +- add or update `docs/tests/orch/cleanup/` for invalid-input or no-matching-work cleanup behavior +- add or update `docs/tests/orch/council-report/` for pre-tally/invalid-show/only-unanimous report behavior as needed +- keep `docs/tests/orch/ROADMAP.md` and `docs/implementation-roadmap.md` synchronized + +## Checklist + +- [x] create an execution roadmap for the workstream +- [x] add missing `task add` and `ready` cases and update indexes +- [x] add missing `answer` and `cleanup` cases and update indexes +- [x] add missing `council report` cases and update indexes +- [x] synchronize `docs/tests/orch/ROADMAP.md` +- [x] update `docs/implementation-roadmap.md` +- [x] archive this roadmap with a completion summary + +## Files + +- `docs/roadmaps/archive/orch-markdown-test-gap-fill.md` +- `docs/implementation-roadmap.md` +- `docs/tests/orch/ROADMAP.md` +- `docs/tests/orch/task-add/README.md` +- 
`docs/tests/orch/task-add/*.md`
+- `docs/tests/orch/ready/README.md`
+- `docs/tests/orch/ready/*.md`
+- `docs/tests/orch/answer/README.md`
+- `docs/tests/orch/answer/*.md`
+- `docs/tests/orch/cleanup/README.md`
+- `docs/tests/orch/cleanup/*.md`
+- `docs/tests/orch/council-report/README.md`
+- `docs/tests/orch/council-report/*.md`
+
+## Decisions
+
+- use focused follow-up case files instead of broad rewrites to keep the current authored set stable
+- keep shared tracking files in the main thread and delegate only disjoint command directories
+
+## Blockers
+
+- none
+
+## Next Step
+
+- no further step in this workstream; future follow-up should start from the updated `docs/tests/orch/ROADMAP.md`
+
+## Completion Summary
+
+- added 10 supplemental `orch` Markdown test-plan cases covering high-priority gaps in `task add`, `ready`, `answer`, `cleanup`, and `council report`
+- synchronized the relevant command index `README.md` files, the shared `docs/tests/orch/ROADMAP.md` ledger, and the project-level implementation roadmap
+- used sub-agents for three disjoint directory slices while keeping shared tracking files in the main thread
diff --git a/docs/roadmaps/archive/orch-markdown-test-review.md b/docs/roadmaps/archive/orch-markdown-test-review.md
new file mode 100644
index 0000000..88eaa76
--- /dev/null
+++ b/docs/roadmaps/archive/orch-markdown-test-review.md
@@ -0,0 +1,59 @@
+# Orch Markdown Test Review
+
+## Status
+
+- `completed`
+
+## Owner
+
+- codex
+
+## Started At
+
+- `2026-03-19`
+
+## Goal
+
+- review the authored `docs/tests/orch/` command directories and case files for completeness against the current `orch` CLI and identify any missing or weakly specified test-plan coverage
+
+## Scope
+
+- inspect each `docs/tests/orch//` directory
+- compare current Markdown cases against the implemented CLI and existing automated integration tests
+- identify missing command-visible contracts, edge cases, or unclear assertions
+- summarize findings without
expanding scope into unrelated implementation changes + +## Checklist + +- [x] create an execution roadmap for the review +- [x] inspect current `docs/tests/orch/` files and split the review by command area +- [x] collect findings for core scheduler command directories +- [x] collect findings for interactive leader/control command directories +- [x] collect findings for council command directories +- [x] summarize whether any command directories need supplemental cases or edits +- [x] archive this roadmap with a completion summary + +## Files + +- `docs/roadmaps/archive/orch-markdown-test-review.md` +- `docs/tests/orch/README.md` +- `docs/tests/orch/ROADMAP.md` +- `docs/tests/orch//README.md` +- `docs/tests/orch//.md` + +## Decisions + +- treat this as a review task first: surface gaps before proposing edits + +## Blockers + +- none + +## Next Step + +- none; future follow-up should start from the review findings and add only the missing case files or constraints that were called out + +## Completion Summary + +- completed a read-only coverage review of the authored `docs/tests/orch/` command directories against the current `orch` CLI and automated integration tests +- identified the highest-value missing or weakly specified command-visible cases so follow-up authoring can target concrete gaps instead of re-auditing the whole tree diff --git a/docs/tests/orch/ROADMAP.md b/docs/tests/orch/ROADMAP.md index fa06470..4aa2cae 100644 --- a/docs/tests/orch/ROADMAP.md +++ b/docs/tests/orch/ROADMAP.md @@ -25,15 +25,16 @@ Current state: - `orch` CLI is implemented for the current scheduler, strict worktree, wait, and council review surfaces - automated Go integration tests already cover the main scheduler lifecycle, dependency gating, blocked-answer flow, worktree dispatch behavior, waits, retries, reassignments, cleanup, and council start/wait/tally/report flows - this roadmap now exists under `docs/tests/orch/ROADMAP.md` -- the initial `docs/tests/orch/` skeleton now 
exists with shared, workflow, and per-command index documents -- command-case Markdown files have not been authored yet; the next work is filling the backlog in a stable order +- all planned global, shared, workflow, command-index, and command-case Markdown documents in the current `orch` test-plan set have been authored +- every implemented `orch` leaf-command folder now uses `README.md` as an index plus one Markdown file per planned case +- workflow cases now exist in `docs/tests/orch/workflows/README.md`, and command-case coverage is aligned to the current automated integration suite Progress summary for planned test-plan documents, excluding `ROADMAP.md`: -- planned document files: `22` -- authored document files: `22` -- planned case slugs in this roadmap: `35` -- authored case slugs in this roadmap: `0` +- planned document files: `64` +- authored document files: `64` +- planned case slugs in this roadmap: `46` +- authored case slugs in this roadmap: `46` ## Scope @@ -203,26 +204,68 @@ docs/tests/orch/ | --- | --- | ---: | ---: | --- | | `docs/tests/orch/README.md` | Global testing conventions and glossary | 0 | 0 | done | | `docs/tests/orch/_shared/README.md` | Shared fixtures, JSON assertions, exit-code rules, and worktree conventions | 0 | 0 | done | -| `docs/tests/orch/workflows/README.md` | Cross-command scenarios | 4 | 0 | done | -| `docs/tests/orch/run-init/README.md` | `run init` command case index | 1 | 0 | done | -| `docs/tests/orch/run-show/README.md` | `run show` command case index | 1 | 0 | done | -| `docs/tests/orch/task-add/README.md` | `task add` command case index | 1 | 0 | done | -| `docs/tests/orch/dep-add/README.md` | `dep add` command case index | 1 | 0 | done | -| `docs/tests/orch/ready/README.md` | `ready` command case index | 1 | 0 | done | -| `docs/tests/orch/dispatch/README.md` | `dispatch` command case index | 6 | 0 | done | -| `docs/tests/orch/reconcile/README.md` | `reconcile` command case index | 2 | 0 | done | -| 
`docs/tests/orch/wait/README.md` | `wait` command case index | 2 | 0 | done | -| `docs/tests/orch/blocked/README.md` | `blocked` command case index | 1 | 0 | done | -| `docs/tests/orch/answer/README.md` | `answer` command case index | 1 | 0 | done | -| `docs/tests/orch/retry/README.md` | `retry` command case index | 1 | 0 | done | -| `docs/tests/orch/reassign/README.md` | `reassign` command case index | 1 | 0 | done | -| `docs/tests/orch/cancel/README.md` | `cancel` command case index | 2 | 0 | done | -| `docs/tests/orch/cleanup/README.md` | `cleanup` command case index | 1 | 0 | done | -| `docs/tests/orch/status/README.md` | `status` command case index | 1 | 0 | done | -| `docs/tests/orch/council-start/README.md` | `council start` command case index | 1 | 0 | done | -| `docs/tests/orch/council-wait/README.md` | `council wait` command case index | 2 | 0 | done | -| `docs/tests/orch/council-tally/README.md` | `council tally` command case index | 2 | 0 | done | -| `docs/tests/orch/council-report/README.md` | `council report` command case index | 3 | 0 | done | +| `docs/tests/orch/workflows/README.md` | Cross-command scenarios | 4 | 4 | done | +| `docs/tests/orch/run-init/README.md` | `run init` command case index | 0 | 0 | done | +| `docs/tests/orch/run-init/run-init-creates-new-run.md` | `run init` command case | 1 | 1 | done | +| `docs/tests/orch/run-show/README.md` | `run show` command case index | 0 | 0 | done | +| `docs/tests/orch/run-show/run-show-returns-run-summary-and-task-counts.md` | `run show` command case | 1 | 1 | done | +| `docs/tests/orch/task-add/README.md` | `task add` command case index | 0 | 0 | done | +| `docs/tests/orch/task-add/task-add-creates-ready-root-task.md` | `task add` command case | 1 | 1 | done | +| `docs/tests/orch/task-add/task-add-rejects-invalid-acceptance-json.md` | `task add` command case | 1 | 1 | done | +| `docs/tests/orch/task-add/task-add-rejects-invalid-priority.md` | `task add` command case | 1 | 1 | done | +| 
`docs/tests/orch/dep-add/README.md` | `dep add` command case index | 0 | 0 | done | +| `docs/tests/orch/dep-add/dep-add-blocks-dependent-task-until-prerequisite-completes.md` | `dep add` command case | 1 | 1 | done | +| `docs/tests/orch/ready/README.md` | `ready` command case index | 0 | 0 | done | +| `docs/tests/orch/ready/ready-lists-only-eligible-tasks.md` | `ready` command case | 1 | 1 | done | +| `docs/tests/orch/ready/ready-orders-by-priority-and-respects-limit.md` | `ready` command case | 1 | 1 | done | +| `docs/tests/orch/dispatch/README.md` | `dispatch` command case index | 0 | 0 | done | +| `docs/tests/orch/dispatch/dispatch-creates-attempt-and-thread-for-ready-task.md` | `dispatch` command case | 1 | 1 | done | +| `docs/tests/orch/dispatch/dispatch-rejects-non-ready-task.md` | `dispatch` command case | 1 | 1 | done | +| `docs/tests/orch/dispatch/dispatch-creates-strict-worktree.md` | `dispatch` command case | 1 | 1 | done | +| `docs/tests/orch/dispatch/dispatch-rejects-dirty-repo-without-base-ref.md` | `dispatch` command case | 1 | 1 | done | +| `docs/tests/orch/dispatch/dispatch-allows-explicit-base-ref-on-dirty-repo.md` | `dispatch` command case | 1 | 1 | done | +| `docs/tests/orch/dispatch/dispatch-auto-enables-worktree-for-code-like-task.md` | `dispatch` command case | 1 | 1 | done | +| `docs/tests/orch/dispatch/dispatch-skips-auto-worktree-for-non-code-task.md` | `dispatch` command case | 1 | 1 | done | +| `docs/tests/orch/reconcile/README.md` | `reconcile` command case index | 0 | 0 | done | +| `docs/tests/orch/reconcile/reconcile-maps-claimed-or-in-progress-thread-to-running.md` | `reconcile` command case | 1 | 1 | done | +| `docs/tests/orch/reconcile/reconcile-maps-done-or-failed-thread-to-terminal-task-state.md` | `reconcile` command case | 1 | 1 | done | +| `docs/tests/orch/wait/README.md` | `wait` command case index | 0 | 0 | done | +| `docs/tests/orch/wait/wait-wakes-on-matching-run-event.md` | `wait` command case | 1 | 1 | done | +| 
`docs/tests/orch/wait/wait-times-out-without-matching-event.md` | `wait` command case | 1 | 1 | done | +| `docs/tests/orch/blocked/README.md` | `blocked` command case index | 0 | 0 | done | +| `docs/tests/orch/blocked/blocked-lists-latest-question-for-blocked-task.md` | `blocked` command case | 1 | 1 | done | +| `docs/tests/orch/answer/README.md` | `answer` command case index | 0 | 0 | done | +| `docs/tests/orch/answer/answer-appends-answer-to-active-thread.md` | `answer` command case | 1 | 1 | done | +| `docs/tests/orch/answer/answer-accepts-payload-json-without-body.md` | `answer` command case | 1 | 1 | done | +| `docs/tests/orch/answer/answer-rejects-empty-body-and-payload.md` | `answer` command case | 1 | 1 | done | +| `docs/tests/orch/retry/README.md` | `retry` command case index | 0 | 0 | done | +| `docs/tests/orch/retry/retry-creates-new-attempt-for-failed-task.md` | `retry` command case | 1 | 1 | done | +| `docs/tests/orch/reassign/README.md` | `reassign` command case index | 0 | 0 | done | +| `docs/tests/orch/reassign/reassign-cancels-old-thread-and-dispatches-new-attempt.md` | `reassign` command case | 1 | 1 | done | +| `docs/tests/orch/cancel/README.md` | `cancel` command case index | 0 | 0 | done | +| `docs/tests/orch/cancel/cancel-cancels-single-task.md` | `cancel` command case | 1 | 1 | done | +| `docs/tests/orch/cancel/cancel-cancels-entire-run.md` | `cancel` command case | 1 | 1 | done | +| `docs/tests/orch/cleanup/README.md` | `cleanup` command case index | 0 | 0 | done | +| `docs/tests/orch/cleanup/cleanup-removes-completed-worktree.md` | `cleanup` command case | 1 | 1 | done | +| `docs/tests/orch/cleanup/cleanup-rejects-attempt-without-task.md` | `cleanup` command case | 1 | 1 | done | +| `docs/tests/orch/cleanup/cleanup-returns-no-matching-work-when-filters-miss.md` | `cleanup` command case | 1 | 1 | done | +| `docs/tests/orch/status/README.md` | `status` command case index | 0 | 0 | done | +| 
`docs/tests/orch/status/status-returns-run-summary-and-task-list.md` | `status` command case | 1 | 1 | done | +| `docs/tests/orch/council-start/README.md` | `council start` command case index | 0 | 0 | done | +| `docs/tests/orch/council-start/council-start-dispatches-three-reviewers.md` | `council start` command case | 1 | 1 | done | +| `docs/tests/orch/council-wait/README.md` | `council wait` command case index | 0 | 0 | done | +| `docs/tests/orch/council-wait/council-wait-wakes-when-all-reviewers-complete.md` | `council wait` command case | 1 | 1 | done | +| `docs/tests/orch/council-wait/council-wait-times-out-when-reviewers-incomplete.md` | `council wait` command case | 1 | 1 | done | +| `docs/tests/orch/council-tally/README.md` | `council tally` command case index | 0 | 0 | done | +| `docs/tests/orch/council-tally/council-tally-groups-reviewer-findings-in-normal-mode.md` | `council tally` command case | 1 | 1 | done | +| `docs/tests/orch/council-tally/council-tally-keeps-distinct-proposals-in-strict-mode.md` | `council tally` command case | 1 | 1 | done | +| `docs/tests/orch/council-report/README.md` | `council report` command case index | 0 | 0 | done | +| `docs/tests/orch/council-report/council-report-defaults-to-consensus-and-majority.md` | `council report` command case | 1 | 1 | done | +| `docs/tests/orch/council-report/council-report-show-all-includes-minority.md` | `council report` command case | 1 | 1 | done | +| `docs/tests/orch/council-report/council-report-json-shape-is-stable.md` | `council report` command case | 1 | 1 | done | +| `docs/tests/orch/council-report/council-report-rejects-before-tally.md` | `council report` command case | 1 | 1 | done | +| `docs/tests/orch/council-report/council-report-rejects-invalid-show.md` | `council report` command case | 1 | 1 | done | +| `docs/tests/orch/council-report/council-report-defaults-to-consensus-when-run-is-only-unanimous.md` | `council report` command case | 1 | 1 | done | ## Authoring Order @@ -233,50 
+276,74 @@ docs/tests/orch/ 5. interactive leader command docs: `wait`, `blocked`, `answer`, `retry`, `reassign`, `cancel`, `cleanup` 6. council workflow docs: `council-start`, `council-wait`, `council-tally`, `council-report` -## Pending Case Backlog - -### Workflows - -- `docs/tests/orch/workflows/README.md :: run-dispatch-reconcile-status-happy-path` - end-to-end happy path from run creation through final `status` -- `docs/tests/orch/workflows/README.md :: dependency-blocked-answer-resume-flow` - dependency gating plus blocked question and answer flow -- `docs/tests/orch/workflows/README.md :: strict-worktree-dispatch-to-cleanup` - worktree creation through cleanup of a completed attempt -- `docs/tests/orch/workflows/README.md :: council-review-end-to-end` - council start, wait, tally, and report sequence - -### Command Cases - -- `docs/tests/orch/run-init/run-init-creates-new-run.md` - create a run with goal and optional summary -- `docs/tests/orch/run-show/run-show-returns-run-summary-and-task-counts.md` - show aggregate run metadata after activity -- `docs/tests/orch/task-add/task-add-creates-ready-root-task.md` - adding a dependency-free task becomes `ready` -- `docs/tests/orch/dep-add/dep-add-blocks-dependent-task-until-prerequisite-completes.md` - dependency edge prevents immediate readiness -- `docs/tests/orch/ready/ready-lists-only-eligible-tasks.md` - ready list excludes blocked by dependency tasks -- `docs/tests/orch/dispatch/dispatch-creates-attempt-and-thread-for-ready-task.md` - happy-path dispatch with new attempt metadata -- `docs/tests/orch/dispatch/dispatch-rejects-non-ready-task.md` - invalid-state rejection when dispatching a gated task -- `docs/tests/orch/dispatch/dispatch-creates-strict-worktree.md` - explicit strict worktree dispatch persists workspace metadata -- `docs/tests/orch/dispatch/dispatch-rejects-dirty-repo-without-base-ref.md` - strict worktree guard on dirty repository state -- 
`docs/tests/orch/dispatch/dispatch-allows-explicit-base-ref-on-dirty-repo.md` - explicit base ref bypasses dirty working tree rejection -- `docs/tests/orch/dispatch/dispatch-auto-enables-worktree-for-code-like-task.md` - inferred worktree mode for code-like tasks -- `docs/tests/orch/dispatch/dispatch-skips-auto-worktree-for-non-code-task.md` - non-code tasks stay on the normal dispatch path -- `docs/tests/orch/reconcile/reconcile-maps-claimed-or-in-progress-thread-to-running.md` - reconcile turns claimed/in-progress inbox state into running task state -- `docs/tests/orch/reconcile/reconcile-maps-done-or-failed-thread-to-terminal-task-state.md` - reconcile turns terminal inbox state into task terminal state -- `docs/tests/orch/wait/wait-wakes-on-matching-run-event.md` - wait wakes on a later matching task event -- `docs/tests/orch/wait/wait-times-out-without-matching-event.md` - wait returns a non-woken timeout result -- `docs/tests/orch/blocked/blocked-lists-latest-question-for-blocked-task.md` - blocked list includes the active question payload -- `docs/tests/orch/answer/answer-appends-answer-to-active-thread.md` - answer writes an inbox answer message for the blocked attempt -- `docs/tests/orch/retry/retry-creates-new-attempt-for-failed-task.md` - retry creates a successor attempt after failure -- `docs/tests/orch/reassign/reassign-cancels-old-thread-and-dispatches-new-attempt.md` - reassign moves work to a new worker and attempt -- `docs/tests/orch/cancel/cancel-cancels-single-task.md` - cancel transitions one task to `cancelled` -- `docs/tests/orch/cancel/cancel-cancels-entire-run.md` - cancel cascades terminal state across the run -- `docs/tests/orch/cleanup/cleanup-removes-completed-worktree.md` - cleanup removes a completed attempt worktree -- `docs/tests/orch/status/status-returns-run-summary-and-task-list.md` - status returns run aggregate state and per-task statuses -- `docs/tests/orch/council-start/council-start-dispatches-three-reviewers.md` - council 
start creates reviewer tasks and metadata -- `docs/tests/orch/council-wait/council-wait-wakes-when-all-reviewers-complete.md` - council wait completes successfully when all reviewers finish -- `docs/tests/orch/council-wait/council-wait-times-out-when-reviewers-incomplete.md` - council wait timeout behavior -- `docs/tests/orch/council-tally/council-tally-groups-reviewer-findings-in-normal-mode.md` - normal similarity grouping behavior -- `docs/tests/orch/council-tally/council-tally-keeps-distinct-proposals-in-strict-mode.md` - strict similarity grouping behavior -- `docs/tests/orch/council-report/council-report-defaults-to-consensus-and-majority.md` - default report buckets and markdown artifact behavior -- `docs/tests/orch/council-report/council-report-show-all-includes-minority.md` - `--show all` includes minority output -- `docs/tests/orch/council-report/council-report-json-shape-is-stable.md` - JSON output shape and artifact metadata contract - ## Authored Case Register -- none yet +| Path | Case Slug | Coverage Note | Status | +| --- | --- | --- | --- | +| `docs/tests/orch/workflows/README.md` | `run-dispatch-reconcile-status-happy-path` | end-to-end happy path from run creation through final status | done | +| `docs/tests/orch/workflows/README.md` | `dependency-blocked-answer-resume-flow` | dependency gating plus blocked question and answer recovery | done | +| `docs/tests/orch/workflows/README.md` | `strict-worktree-dispatch-to-cleanup` | worktree-backed code task flows from dispatch through cleanup | done | +| `docs/tests/orch/workflows/README.md` | `council-review-end-to-end` | council workflow runs from reviewer dispatch through final report | done | +| `docs/tests/orch/run-init/run-init-creates-new-run.md` | `run-init-creates-new-run` | creates a run with goal and optional summary | done | +| `docs/tests/orch/run-show/run-show-returns-run-summary-and-task-counts.md` | `run-show-returns-run-summary-and-task-counts` | shows aggregate run metadata after 
activity | done | +| `docs/tests/orch/task-add/task-add-creates-ready-root-task.md` | `task-add-creates-ready-root-task` | dependency-free task becomes ready immediately | done | +| `docs/tests/orch/task-add/task-add-rejects-invalid-acceptance-json.md` | `task-add-rejects-invalid-acceptance-json` | malformed `--acceptance-json` returns stable invalid_input | done | +| `docs/tests/orch/task-add/task-add-rejects-invalid-priority.md` | `task-add-rejects-invalid-priority` | unsupported priorities are rejected with invalid_input | done | +| `docs/tests/orch/dep-add/dep-add-blocks-dependent-task-until-prerequisite-completes.md` | `dep-add-blocks-dependent-task-until-prerequisite-completes` | dependency edge prevents immediate readiness | done | +| `docs/tests/orch/ready/ready-lists-only-eligible-tasks.md` | `ready-lists-only-eligible-tasks` | ready list excludes dependency-gated tasks | done | +| `docs/tests/orch/ready/ready-orders-by-priority-and-respects-limit.md` | `ready-orders-by-priority-and-respects-limit` | ready output orders by priority and applies explicit limit truncation | done | +| `docs/tests/orch/dispatch/dispatch-creates-attempt-and-thread-for-ready-task.md` | `dispatch-creates-attempt-and-thread-for-ready-task` | ready task dispatch creates attempt, thread, and task message | done | +| `docs/tests/orch/dispatch/dispatch-rejects-non-ready-task.md` | `dispatch-rejects-non-ready-task` | dispatch on gated task returns invalid_state | done | +| `docs/tests/orch/dispatch/dispatch-creates-strict-worktree.md` | `dispatch-creates-strict-worktree` | explicit strict worktree dispatch provisions isolated workspace metadata | done | +| `docs/tests/orch/dispatch/dispatch-rejects-dirty-repo-without-base-ref.md` | `dispatch-rejects-dirty-repo-without-base-ref` | dirty repository without explicit base ref is rejected in strict mode | done | +| `docs/tests/orch/dispatch/dispatch-allows-explicit-base-ref-on-dirty-repo.md` | 
`dispatch-allows-explicit-base-ref-on-dirty-repo` | explicit base ref allows dispatch from a dirty repository | done | +| `docs/tests/orch/dispatch/dispatch-auto-enables-worktree-for-code-like-task.md` | `dispatch-auto-enables-worktree-for-code-like-task` | code-like task metadata auto-enables worktree mode | done | +| `docs/tests/orch/dispatch/dispatch-skips-auto-worktree-for-non-code-task.md` | `dispatch-skips-auto-worktree-for-non-code-task` | clearly non-code tasks stay on normal dispatch path | done | +| `docs/tests/orch/reconcile/reconcile-maps-claimed-or-in-progress-thread-to-running.md` | `reconcile-maps-claimed-or-in-progress-thread-to-running` | reconcile maps active inbox execution to running task state | done | +| `docs/tests/orch/reconcile/reconcile-maps-done-or-failed-thread-to-terminal-task-state.md` | `reconcile-maps-done-or-failed-thread-to-terminal-task-state` | reconcile maps terminal inbox states to terminal task states | done | +| `docs/tests/orch/wait/wait-wakes-on-matching-run-event.md` | `wait-wakes-on-matching-run-event` | wait wakes on a later matching run-scoped event | done | +| `docs/tests/orch/wait/wait-times-out-without-matching-event.md` | `wait-times-out-without-matching-event` | wait timeout returns a normal non-woken result | done | +| `docs/tests/orch/blocked/blocked-lists-latest-question-for-blocked-task.md` | `blocked-lists-latest-question-for-blocked-task` | blocked view includes latest question payload for the task | done | +| `docs/tests/orch/answer/answer-appends-answer-to-active-thread.md` | `answer-appends-answer-to-active-thread` | answer appends an inbox answer message to the blocked attempt thread | done | +| `docs/tests/orch/answer/answer-accepts-payload-json-without-body.md` | `answer-accepts-payload-json-without-body` | payload-only answers stay valid and machine-readable | done | +| `docs/tests/orch/answer/answer-rejects-empty-body-and-payload.md` | `answer-rejects-empty-body-and-payload` | empty answer requests 
fail with invalid_input | done | +| `docs/tests/orch/retry/retry-creates-new-attempt-for-failed-task.md` | `retry-creates-new-attempt-for-failed-task` | retry dispatches a successor attempt after failure | done | +| `docs/tests/orch/reassign/reassign-cancels-old-thread-and-dispatches-new-attempt.md` | `reassign-cancels-old-thread-and-dispatches-new-attempt` | reassign cancels old execution and opens a new attempt | done | +| `docs/tests/orch/cancel/cancel-cancels-single-task.md` | `cancel-cancels-single-task` | single-task cancel moves only the targeted task to cancelled | done | +| `docs/tests/orch/cancel/cancel-cancels-entire-run.md` | `cancel-cancels-entire-run` | run cancel cascades terminal state across the run | done | +| `docs/tests/orch/cleanup/cleanup-removes-completed-worktree.md` | `cleanup-removes-completed-worktree` | cleanup removes completed attempt worktree artifacts | done | +| `docs/tests/orch/cleanup/cleanup-rejects-attempt-without-task.md` | `cleanup-rejects-attempt-without-task` | cleanup enforces `--task` when `--attempt` is specified | done | +| `docs/tests/orch/cleanup/cleanup-returns-no-matching-work-when-filters-miss.md` | `cleanup-returns-no-matching-work-when-filters-miss` | cleanup returns no_matching_work when selectors find no candidates | done | +| `docs/tests/orch/status/status-returns-run-summary-and-task-list.md` | `status-returns-run-summary-and-task-list` | status reports aggregate run state and per-task statuses | done | +| `docs/tests/orch/council-start/council-start-dispatches-three-reviewers.md` | `council-start-dispatches-three-reviewers` | council start creates and dispatches three fixed reviewer tasks | done | +| `docs/tests/orch/council-wait/council-wait-wakes-when-all-reviewers-complete.md` | `council-wait-wakes-when-all-reviewers-complete` | council wait wakes when all reviewer tasks complete | done | +| `docs/tests/orch/council-wait/council-wait-times-out-when-reviewers-incomplete.md` | 
`council-wait-times-out-when-reviewers-incomplete` | council wait timeout stays machine-readable | done | +| `docs/tests/orch/council-tally/council-tally-groups-reviewer-findings-in-normal-mode.md` | `council-tally-groups-reviewer-findings-in-normal-mode` | normal similarity groups semantically aligned reviewer findings | done | +| `docs/tests/orch/council-tally/council-tally-keeps-distinct-proposals-in-strict-mode.md` | `council-tally-keeps-distinct-proposals-in-strict-mode` | strict similarity preserves wording-level proposal separation | done | +| `docs/tests/orch/council-report/council-report-defaults-to-consensus-and-majority.md` | `council-report-defaults-to-consensus-and-majority` | default report keeps main output on consensus and majority buckets | done | +| `docs/tests/orch/council-report/council-report-show-all-includes-minority.md` | `council-report-show-all-includes-minority` | `--show all` includes minority recommendations in final report | done | +| `docs/tests/orch/council-report/council-report-json-shape-is-stable.md` | `council-report-json-shape-is-stable` | JSON response shape and report artifact metadata remain stable | done | +| `docs/tests/orch/council-report/council-report-rejects-before-tally.md` | `council-report-rejects-before-tally` | report generation before tally fails with invalid_state | done | +| `docs/tests/orch/council-report/council-report-rejects-invalid-show.md` | `council-report-rejects-invalid-show` | unsupported `--show` values return invalid_input | done | +| `docs/tests/orch/council-report/council-report-defaults-to-consensus-when-run-is-only-unanimous.md` | `council-report-defaults-to-consensus-when-run-is-only-unanimous` | omitted `--show` collapses to consensus for only-unanimous runs | done | + +## Pending Case Backlog + +No pending case slugs remain in the current plan. + +When a new `orch` CLI contract or workflow needs coverage: + +1. 
if it is a command case, create a new `.md` file under the relevant leaf-command folder and add it to that folder's `README.md` index +2. if it is a workflow case, add it to `docs/tests/orch/workflows/README.md` +3. add the new slug to `Authored Case Register` +4. update `Current Snapshot` and `Document Progress` + +## Definition Of Done + +This roadmap is complete only when all of the following are true: + +- every implemented `orch` leaf command has a corresponding document folder +- each planned command index and case document exists +- each pending case slug has been either authored or explicitly deferred +- the authored-case register matches the actual Markdown files on disk +- a new agent can pick any future case and know exactly where it should be written diff --git a/docs/tests/orch/_shared/README.md b/docs/tests/orch/_shared/README.md index d8170b1..074d770 100644 --- a/docs/tests/orch/_shared/README.md +++ b/docs/tests/orch/_shared/README.md @@ -116,6 +116,22 @@ When worktree behavior is under test, assert at least: - attempt `worktree_path` - attempt `workspace_status` +## Direct DB Inspection + +Most `orch` cases should stay at the CLI contract level, but a few manual reproduction flows need direct SQL reads to recover attempt-to-thread mappings that the current `orch` CLI does not print in a standalone query command. 
+ +When a case truly needs that mapping: + +- use a read-only `sqlite3` query against `TMPDIR/coord.db` +- prefer querying `task_attempts` by stable keys such as `run_id`, `task_id`, and `attempt_no` +- treat the SQL read as fixture setup for the next CLI command, not as the main assertion target + +Typical example: + +```bash +sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_001' AND task_id = 'CR1' AND attempt_no = 1;" +``` + ## Workflow Authoring Rule If a case spans multiple `orch` commands, place the end-to-end narrative in `workflows/README.md` first, then add narrower command-level cases only when they are easier to reason about in isolation. diff --git a/docs/tests/orch/answer/README.md b/docs/tests/orch/answer/README.md index 39d3f06..ac48dbe 100644 --- a/docs/tests/orch/answer/README.md +++ b/docs/tests/orch/answer/README.md @@ -1,7 +1,9 @@ # Orch `answer` Test Plan Index -## Status +## Case Files -No command case files are authored yet. - -Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress. 
+| Case Slug | File | Coverage Note | +| --- | --- | --- | +| `answer-appends-answer-to-active-thread` | [answer-appends-answer-to-active-thread.md](./answer-appends-answer-to-active-thread.md) | appends an inbox answer message onto the active blocked attempt thread | +| `answer-accepts-payload-json-without-body` | [answer-accepts-payload-json-without-body.md](./answer-accepts-payload-json-without-body.md) | accepts structured `--payload-json` input even when no body text is provided | +| `answer-rejects-empty-body-and-payload` | [answer-rejects-empty-body-and-payload.md](./answer-rejects-empty-body-and-payload.md) | rejects an answer request that provides neither body text nor payload JSON | diff --git a/docs/tests/orch/answer/answer-accepts-payload-json-without-body.md b/docs/tests/orch/answer/answer-accepts-payload-json-without-body.md new file mode 100644 index 0000000..b64f3d3 --- /dev/null +++ b/docs/tests/orch/answer/answer-accepts-payload-json-without-body.md @@ -0,0 +1,37 @@ +# Case: `answer-accepts-payload-json-without-body` + +## Purpose + +Verify that `answer`, when `--body` is not provided, can still write a structured decision back to the current blocked attempt using `--payload-json` alone. 
+ +## Preconditions + +- Task `T2` in run `run_blog_002` is already `blocked` +- `T2` is visible in the `blocked` list +- The thread for the blocked attempt is known as `THREAD_ID` + +## Input + +```bash +orch --db TMPDIR/coord.db --json answer --run run_blog_002 --task T2 --payload-json '{"decision":"stdout","source":"leader"}' +inbox --db TMPDIR/coord.db --json show --thread THREAD_ID +``` + +## Expected Output + +- `answer` exits with code `0` +- `answer.data.message.kind == "answer"` +- `answer.data.message.payload_json.decision == "stdout"` +- `answer.data.message.payload_json.source == "leader"` +- A new message with `kind=answer` is appended to the end of `show.data.messages` +- The last message has `payload_json.decision == "stdout"` + +## Assertion Conclusions + +- `answer` does not require the leader to provide a plain-text body; a structured payload alone constitutes a valid answer +- The worker can read the structured decision from the same `answer` message without relying on an informal body-format convention + +## Additional Constraints + +- `--payload-json` must be valid JSON; invalid values should return `invalid_input` +- `--body` and `--body-file` remain mutually exclusive even though this case uses neither diff --git 
a/docs/tests/orch/answer/answer-appends-answer-to-active-thread.md b/docs/tests/orch/answer/answer-appends-answer-to-active-thread.md new file mode 100644 index 0000000..ee7b615 --- /dev/null +++ b/docs/tests/orch/answer/answer-appends-answer-to-active-thread.md @@ -0,0 +1,36 @@ +# Case: `answer-appends-answer-to-active-thread` + +## Purpose + +Verify that `answer` writes the leader's reply back to the inbox thread of the current blocked attempt, as an `answer` message that the worker can go on to consume. + +## Preconditions + +- Task `T2` in run `run_blog_002` is already `blocked` +- `T2` is visible in the `blocked` list +- The thread for the blocked attempt is known as `THREAD_ID` + +## Input + +```bash +orch --db TMPDIR/coord.db --json answer --run run_blog_002 --task T2 --body "Use stdout for MVP." 
+inbox --db TMPDIR/coord.db --json show --thread THREAD_ID +``` + +## Expected Output + +- `answer` exits with code `0` +- `answer.data.message.kind == "answer"` +- `answer.data.task.task_id == "T2"` +- A new message with `kind=answer` is appended to the end of `show.data.messages` +- The last message has `body == "Use stdout for MVP."` + +## Assertion Conclusions + +- `answer` fundamentally appends a leader decision message to the active thread; it does not modify task state directly +- The worker still has to advance task state through `inbox` or a later `reconcile` + +## Additional Constraints + +- `answer` supports `--body-file` and `--payload-json` +- `--body` and `--body-file` are mutually exclusive; if both are empty, at least `--payload-json` must be provided diff --git a/docs/tests/orch/answer/answer-rejects-empty-body-and-payload.md b/docs/tests/orch/answer/answer-rejects-empty-body-and-payload.md new file mode 100644 index 0000000..aebe648 --- /dev/null +++ b/docs/tests/orch/answer/answer-rejects-empty-body-and-payload.md @@ -0,0 +1,29 @@ +# Case: `answer-rejects-empty-body-and-payload` + +## Purpose + +Verify that `answer` returns a stable input error when given neither a body nor a structured payload, rather than writing an empty answer message. 
+ +## Preconditions + +- Task `T2` in run `run_blog_002` is already `blocked` + +## Input + +```bash +orch --db TMPDIR/coord.db --json answer --run run_blog_002 --task T2 +``` + +## Expected Output + +- Exit code is `30` +- JSON error code is `invalid_input` + +## Assertion Conclusions + +- `answer` requires at least one valid input payload: a body or `payload-json` +- An empty answer is rejected before anything is written to the thread, rather than producing an `answer` message with unclear meaning + +## Additional Constraints + +- If both `--body` and `--body-file` are passed, `invalid_input` should also be returned diff --git a/docs/tests/orch/blocked/README.md 
b/docs/tests/orch/blocked/README.md index a698ed7..e092694 100644 --- a/docs/tests/orch/blocked/README.md +++ b/docs/tests/orch/blocked/README.md @@ -1,7 +1,7 @@ # Orch `blocked` Test Plan Index -## Status +## Case Files -No command case files are authored yet. - -Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress. +| Case Slug | File | Coverage Note | +| --- | --- | --- | +| `blocked-lists-latest-question-for-blocked-task` | [blocked-lists-latest-question-for-blocked-task.md](./blocked-lists-latest-question-for-blocked-task.md) | lists blocked tasks together with the latest worker question payload | diff --git a/docs/tests/orch/blocked/blocked-lists-latest-question-for-blocked-task.md b/docs/tests/orch/blocked/blocked-lists-latest-question-for-blocked-task.md new file mode 100644 index 0000000..2a79071 --- /dev/null +++ b/docs/tests/orch/blocked/blocked-lists-latest-question-for-blocked-task.md @@ -0,0 +1,38 @@ +# Case: `blocked-lists-latest-question-for-blocked-task` + +## Purpose + +Verify that `blocked` lists the currently blocked tasks together with the latest question message, so the leader can decide directly. 
+ +## Preconditions + +- Run `run_blog_002` has been created +- Tasks `T1` and `T2` have been created, and `T2` depends on `T1` +- `T1` has completed and been advanced by `reconcile`, making `T2` `ready` +- `dispatch` has completed for `T2` +- `worker-b` has run `claim` on the thread for `T2` and written a question via `inbox update --status blocked` +- The most recent `reconcile` has already run + +## Input + +```bash +orch --db TMPDIR/coord.db --json blocked --run run_blog_002 +``` + +## Expected Output + +- Exit code is `0` +- `blocked.data.blocked` has length `1` +- The only entry has `task.task_id == "T2"` +- `question.kind == "question"` +- `question.summary == "Need logging decision"` +- `question.payload_json.question == "stdout or stderr?"` + +## Assertion Conclusions + +- `blocked` returns more than task status; it also attaches the question message the leader actually needs to answer +- The command works as the entry point to the leader's "pending answers queue", not merely as a status listing + +## Additional Constraints + +- If there are no blocked tasks, non-JSON output prints `no blocked tasks` diff --git a/docs/tests/orch/cancel/README.md b/docs/tests/orch/cancel/README.md index 853020c..5dbd0cd 100644 --- a/docs/tests/orch/cancel/README.md +++ b/docs/tests/orch/cancel/README.md @@ -1,7 +1,8 @@ # 
Orch `cancel` Test Plan Index -## Status +## Case Files -No command case files are authored yet. - -Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress. +| Case Slug | File | Coverage Note | +| --- | --- | --- | +| `cancel-cancels-single-task` | [cancel-cancels-single-task.md](./cancel-cancels-single-task.md) | cancels one task without implicitly cancelling unrelated tasks in the same run | +| `cancel-cancels-entire-run` | [cancel-cancels-entire-run.md](./cancel-cancels-entire-run.md) | cancels the run and forces every task into the cancelled terminal state | diff --git a/docs/tests/orch/cancel/cancel-cancels-entire-run.md b/docs/tests/orch/cancel/cancel-cancels-entire-run.md new file mode 100644 index 0000000..d97b166 --- /dev/null +++ b/docs/tests/orch/cancel/cancel-cancels-entire-run.md @@ -0,0 +1,30 @@ +# Case: `cancel-cancels-entire-run` + +## Purpose + +Verify that `cancel` without `--task` cancels the entire run and moves every task to `cancelled`. + +## Preconditions + +- Run `run_blog_cancel_001` already exists +- The run has at least two tasks, `T1` and `T2` +- A single-task cancel may already have happened before this case runs + +## Input + +```bash +orch --db TMPDIR/coord.db --json cancel --run run_blog_cancel_001 --reason "Stop the run." 
+orch --db TMPDIR/coord.db --json status --run run_blog_cancel_001 +``` + +## Expected Output + +- `cancel` exits with code `0` +- `cancel.data.run.status == "cancelled"` +- `status.data.run.status == "cancelled"` +- Every task in `status.data.tasks` has `status` `cancelled` + +## Assertion Conclusions + +- Run-level cancel cascades termination to every task in the run +- The command is the leader's main entry point for actively stopping the whole orchestration, not merely a marker diff --git a/docs/tests/orch/cancel/cancel-cancels-single-task.md b/docs/tests/orch/cancel/cancel-cancels-single-task.md new file mode 100644 index 0000000..9bb045c --- /dev/null +++ b/docs/tests/orch/cancel/cancel-cancels-single-task.md @@ -0,0 +1,32 @@ +# Case: `cancel-cancels-single-task` + +## Purpose + +Verify that `cancel --task` cancels only the specified task and does not implicitly cancel other tasks in the same run. + +## Preconditions + +- Run `run_blog_cancel_001` has been created +- Tasks `T1` and `T2` have been created +- `dispatch` has completed for `T1` +- The thread for `T1` is known as `THREAD_ID` + +## Input + +```bash +orch --db TMPDIR/coord.db --json cancel --run run_blog_cancel_001 --task T1 --reason "Task is no longer needed." 
+orch --db TMPDIR/coord.db --json status --run run_blog_cancel_001 +inbox --db TMPDIR/coord.db --json show --thread THREAD_ID +``` + +## Expected Output + +- `cancel` exits with code `0` +- In `status`, `T1.status == "cancelled"` +- In `status`, `T2` remains in a non-`cancelled` state +- `show.data.thread.status == "cancelled"` for the original thread of `T1` + +## Assertion Conclusions + +- Single-task cancel is a local control action and does not terminate the run as a whole +- For a task that has already been dispatched, cancel also terminates the corresponding inbox thread so the worker does not keep executing diff --git a/docs/tests/orch/cleanup/README.md b/docs/tests/orch/cleanup/README.md index 8e13b99..465cd59 100644 --- a/docs/tests/orch/cleanup/README.md +++ b/docs/tests/orch/cleanup/README.md @@ -1,7 +1,9 @@ # Orch `cleanup` Test Plan Index -## Status +## Case Files -No command case files are authored yet. - -Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress. 
+| Case Slug | File | Coverage Note | +| --- | --- | --- | +| `cleanup-removes-completed-worktree` | [cleanup-removes-completed-worktree.md](./cleanup-removes-completed-worktree.md) | removes a completed attempt worktree and records the cleanup result | +| `cleanup-rejects-attempt-without-task` | [cleanup-rejects-attempt-without-task.md](./cleanup-rejects-attempt-without-task.md) | rejects `--attempt` when no matching `--task` selector is provided | +| `cleanup-returns-no-matching-work-when-filters-miss` | [cleanup-returns-no-matching-work-when-filters-miss.md](./cleanup-returns-no-matching-work-when-filters-miss.md) | returns the stable no-matching-work contract when cleanup filters yield no candidates | diff --git a/docs/tests/orch/cleanup/cleanup-rejects-attempt-without-task.md b/docs/tests/orch/cleanup/cleanup-rejects-attempt-without-task.md new file mode 100644 index 0000000..6b3aaeb --- /dev/null +++ b/docs/tests/orch/cleanup/cleanup-rejects-attempt-without-task.md @@ -0,0 +1,29 @@ +# Case: `cleanup-rejects-attempt-without-task` + +## Purpose + +Verify that when `cleanup` uses `--attempt` to select an exact attempt, it requires `--task` as well, avoiding ambiguity over run-level attempt numbers. 
+ +## Preconditions + +- Run `run_blog_cleanup_002` has been created + +## Input + +```bash +orch --db TMPDIR/coord.db --json cleanup --run run_blog_cleanup_002 --attempt 1 +``` + +## Expected Output + +- Exit code is `30` +- JSON error code is `invalid_input` + +## Assertion Conclusions + +- `cleanup` performs basic input validation on the selector combination before querying +- `--attempt` is not a standalone run-level filter; it must be attached to a specific `task` + +## Additional Constraints + +- If none of `--task`, `--attempt`, or `--all-completed` is provided, `invalid_input` should likewise be returned diff --git a/docs/tests/orch/cleanup/cleanup-removes-completed-worktree.md b/docs/tests/orch/cleanup/cleanup-removes-completed-worktree.md new file mode 100644 index 0000000..e2c2c94 --- /dev/null +++ b/docs/tests/orch/cleanup/cleanup-removes-completed-worktree.md @@ -0,0 +1,37 @@ +# Case: `cleanup-removes-completed-worktree` + +## Purpose + +Verify that `cleanup` removes the worktree of a completed attempt and returns the cleanup result to the leader. 
+ +## Preconditions + +- Run `run_blog_cleanup_001` has been created +- Task `T1` has been created +- `T1` has completed `dispatch` in strict worktree mode
+- `worker-a` has completed `claim` and moved the thread to `done` via `inbox done` +- The most recent `reconcile` has run, syncing the task status to `done` +- The worktree path of the current attempt is known as `WORKTREE_PATH` + +## Input + +```bash +orch --db TMPDIR/coord.db --json cleanup --run run_blog_cleanup_001 --task T1 +``` + +## Expected Output + +- Exit code is `0` +- `cleanup.data.cleaned` has length `1` +- The only record corresponds to the completed attempt of `T1` +- `WORKTREE_PATH` no longer exists on the filesystem + +## Assertion Conclusions + +- `cleanup` targets attempt workspace resources and does not change the task's completion result +- After a successful cleanup, the leader can safely reclaim the worktree held by an attempt in a terminal state + +## Additional Constraints + +- `cleanup` supports selecting scope by `--task`, `--attempt`, or `--all-completed` +- `--force` is for non-routine cleanup; this case verifies the routine completed-state cleanup path diff --git a/docs/tests/orch/cleanup/cleanup-returns-no-matching-work-when-filters-miss.md b/docs/tests/orch/cleanup/cleanup-returns-no-matching-work-when-filters-miss.md new file mode 100644 index 0000000..9142beb --- /dev/null +++ b/docs/tests/orch/cleanup/cleanup-returns-no-matching-work-when-filters-miss.md @@ -0,0 +1,33 @@ +# Case: `cleanup-returns-no-matching-work-when-filters-miss` + +## Purpose + +Verify that when the filters match no cleanable worktree, `cleanup` returns the stable "no matching work" contract rather than a successful empty list. 
+ +## Preconditions + +- Run `run_blog_cleanup_003` has been created +- Task `T1` has been created +- The current run has no worktree attempt whose `workspace_status` is `completed` or `abandoned` + +## Input + +```bash +orch --db TMPDIR/coord.db --json run init --run run_blog_cleanup_003 --goal "Validate cleanup empty result" +orch --db TMPDIR/coord.db --json task add --run run_blog_cleanup_003 --task T1 --title "Prepare cleanup target" +orch --db TMPDIR/coord.db --json cleanup --run run_blog_cleanup_003 --task T1 +``` + +## Expected Output + +- `cleanup` exits with code `10` +- JSON error code is `no_matching_work` + +## Assertion Conclusions + +- `cleanup` uses an explicit no-matching-work signal for an empty filter result instead of returning a successful empty array +- The leader or a script can use this to distinguish "no candidate worktree" from "cleanup completed successfully" + +## Additional Constraints + +- The same contract applies when `--all-completed` or `--attempt` is used and no candidates are found diff --git a/docs/tests/orch/council-report/README.md b/docs/tests/orch/council-report/README.md index cb0b11c..0b40252 100644 --- a/docs/tests/orch/council-report/README.md +++ b/docs/tests/orch/council-report/README.md @@ -1,7 +1,12 @@ # Orch 
`council report` Test Plan Index -## Status +## Case Files -No command case files are authored yet. - -Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress. +| Case Slug | File | Coverage Note | +| --- | --- | --- | +| `council-report-defaults-to-consensus-and-majority` | [council-report-defaults-to-consensus-and-majority.md](./council-report-defaults-to-consensus-and-majority.md) | renders a markdown report that omits minority recommendations by default and writes a markdown artifact | +| `council-report-show-all-includes-minority` | [council-report-show-all-includes-minority.md](./council-report-show-all-includes-minority.md) | includes minority recommendations when `--show all` is requested | +| `council-report-json-shape-is-stable` | [council-report-json-shape-is-stable.md](./council-report-json-shape-is-stable.md) | returns the stable JSON report contract with summary, filtered groups, and artifact metadata | +| `council-report-rejects-before-tally` | [council-report-rejects-before-tally.md](./council-report-rejects-before-tally.md) | rejects report generation with `invalid_state` when grouped recommendations have not been tallied yet | +| `council-report-rejects-invalid-show` | [council-report-rejects-invalid-show.md](./council-report-rejects-invalid-show.md) | rejects unsupported `--show` bucket values with `invalid_input` | +| `council-report-defaults-to-consensus-when-run-is-only-unanimous` | [council-report-defaults-to-consensus-when-run-is-only-unanimous.md](./council-report-defaults-to-consensus-when-run-is-only-unanimous.md) | defaults omitted `--show` to `consensus` when the run was started with `--only-unanimous` | diff --git a/docs/tests/orch/council-report/council-report-defaults-to-consensus-and-majority.md b/docs/tests/orch/council-report/council-report-defaults-to-consensus-and-majority.md new file mode 100644 index 0000000..5746d02 --- /dev/null +++ 
b/docs/tests/orch/council-report/council-report-defaults-to-consensus-and-majority.md @@ -0,0 +1,70 @@ +# Case: `council-report-defaults-to-consensus-and-majority` + +## Purpose + +Verify that `council report` shows only the `consensus` and `majority` buckets by default while also producing a markdown artifact. + +## Preconditions + +- Use an isolated temporary directory `TMPDIR` +- Reviewer output JSON that yields all three recommendation classes, `consensus`, `majority`, and `minority`, has been prepared +- `sqlite3` is available locally to read reviewer thread IDs from `task_attempts` + +## Input + +```bash +cat <<'EOF' > TMPDIR/architecture-review.json +{"reviewer_role":"architecture-reviewer","findings":[{"title":"Split contracts","summary":"Transport contracts are mixed into UI code.","proposal":"Move API contract definitions into a dedicated module.","rationale":"This lowers coupling.","confidence":"high","tags":["architecture"],"target_refs":{"repo_path":"."}},{"title":"Share helpers","summary":"Council report rendering paths are repeated.","proposal":"Introduce shared council coordinator helpers for report rendering.","rationale":"This keeps report assembly consistent.","confidence":"medium","tags":["reporting"],"target_refs":{"repo_path":"."}}]} +EOF + +cat <<'EOF' > TMPDIR/implementation-review.json +{"reviewer_role":"implementation-reviewer","findings":[{"title":"Extract contracts","summary":"Shared transport shapes are duplicated.","proposal":"Move API contract definitions into dedicated module","rationale":"This reduces duplication.","confidence":"high","tags":["maintainability"],"target_refs":{"repo_path":"."}},{"title":"Reuse report helpers","summary":"Formatting logic should stay shared.","proposal":"Introduce shared council coordinator helpers for report rendering","rationale":"This avoids formatter drift.","confidence":"medium","tags":["reporting"],"target_refs":{"repo_path":"."}}]} +EOF + +cat <<'EOF' > TMPDIR/risk-review.json 
+{"reviewer_role":"risk-reviewer","findings":[{"title":"Lock contracts","summary":"Contract drift becomes risky over time.","proposal":"Move API contract definitions into a 
dedicated module.","rationale":"This reduces integration regressions.","confidence":"high","tags":["risk"],"target_refs":{"repo_path":"."}},{"title":"Cover JSON output","summary":"The council report response should stay stable.","proposal":"Add regression tests for council report JSON output.","rationale":"This catches contract regressions earlier.","confidence":"high","tags":["testing"],"target_refs":{"repo_path":"."}}]} +EOF + +orch --db TMPDIR/coord.db --json council start \ + --run council_blog_report_001 \ + --target "Review the council reporting flow." + +THREAD_ID_CR1=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_report_001' AND task_id = 'CR1' AND attempt_no = 1;") +THREAD_ID_CR2=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_report_001' AND task_id = 'CR2' AND attempt_no = 1;") +THREAD_ID_CR3=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_report_001' AND task_id = 'CR3' AND attempt_no = 1;") + +inbox --db TMPDIR/coord.db --json claim --agent architecture-reviewer --thread "$THREAD_ID_CR1" +inbox --db TMPDIR/coord.db --json done --agent architecture-reviewer --thread "$THREAD_ID_CR1" --summary "Review complete" --body-file TMPDIR/architecture-review.json + +inbox --db TMPDIR/coord.db --json claim --agent implementation-reviewer --thread "$THREAD_ID_CR2" +inbox --db TMPDIR/coord.db --json done --agent implementation-reviewer --thread "$THREAD_ID_CR2" --summary "Review complete" --body-file TMPDIR/implementation-review.json + +inbox --db TMPDIR/coord.db --json claim --agent risk-reviewer --thread "$THREAD_ID_CR3" +inbox --db TMPDIR/coord.db --json done --agent risk-reviewer --thread "$THREAD_ID_CR3" --summary "Review complete" --body-file TMPDIR/risk-review.json + +orch --db TMPDIR/coord.db --json council tally \ + --run council_blog_report_001 \ + --similarity normal + +orch --db TMPDIR/coord.db council report \ + --run 
council_blog_report_001 +``` + +## Expected Output + +- `council report` exits with code `0` +- stdout is markdown, not JSON +- The report body contains `# Council Review Report` +- The report body contains `## Consensus` +- The report body contains `## Majority` +- The report body does not contain `## Minority` +- `TMPDIR/.orch/reports/council_blog_report_001.md` is created and its content matches stdout + +## Assertion Conclusions + +- The default presentation policy of `council report` is "the main report shows consensus + majority and hides minority" +- The command both renders the report and produces the artifact + +## Additional Constraints + +- The default bucket behavior is affected by the `only_unanimous` setting of `council run`; on the current regular path the default is still `consensus,majority` diff --git a/docs/tests/orch/council-report/council-report-defaults-to-consensus-when-run-is-only-unanimous.md b/docs/tests/orch/council-report/council-report-defaults-to-consensus-when-run-is-only-unanimous.md new file mode 100644 index 0000000..698d4c4 --- /dev/null +++ b/docs/tests/orch/council-report/council-report-defaults-to-consensus-when-run-is-only-unanimous.md @@ -0,0 +1,73 @@ +# Case: `council-report-defaults-to-consensus-when-run-is-only-unanimous` + +## Purpose + +Verify that when the council run is started with `--only-unanimous`, `council report --json` with `--show` omitted returns only the `consensus` bucket by default. 
+ +## Preconditions + +- Use an isolated temporary directory `TMPDIR` +- The same 3 reviewer output JSON files as in `council-report-defaults-to-consensus-and-majority` have been prepared +- `sqlite3` is available locally to read reviewer thread IDs from `task_attempts` + +## Input + +```bash +cat <<'EOF' > TMPDIR/architecture-review.json +{"reviewer_role":"architecture-reviewer","findings":[{"title":"Split contracts","summary":"Transport contracts are mixed into UI code.","proposal":"Move API contract definitions into a dedicated module.","rationale":"This lowers coupling.","confidence":"high","tags":["architecture"],"target_refs":{"repo_path":"."}},{"title":"Share helpers","summary":"Council report rendering paths are repeated.","proposal":"Introduce shared council coordinator helpers for report rendering.","rationale":"This keeps report assembly consistent.","confidence":"medium","tags":["reporting"],"target_refs":{"repo_path":"."}}]} +EOF + +cat <<'EOF' > TMPDIR/implementation-review.json 
+{"reviewer_role":"implementation-reviewer","findings":[{"title":"Extract contracts","summary":"Shared transport shapes are duplicated.","proposal":"Move API contract definitions into dedicated module","rationale":"This reduces duplication.","confidence":"high","tags":["maintainability"],"target_refs":{"repo_path":"."}},{"title":"Reuse report helpers","summary":"Formatting logic should stay shared.","proposal":"Introduce shared council coordinator helpers for report rendering","rationale":"This avoids formatter drift.","confidence":"medium","tags":["reporting"],"target_refs":{"repo_path":"."}}]} +EOF + +cat <<'EOF' > TMPDIR/risk-review.json +{"reviewer_role":"risk-reviewer","findings":[{"title":"Lock contracts","summary":"Contract drift becomes risky over time.","proposal":"Move API contract definitions into a dedicated module.","rationale":"This reduces integration regressions.","confidence":"high","tags":["risk"],"target_refs":{"repo_path":"."}},{"title":"Cover JSON output","summary":"The council report response should stay stable.","proposal":"Add regression tests for council report JSON output.","rationale":"This catches contract regressions earlier.","confidence":"high","tags":["testing"],"target_refs":{"repo_path":"."}}]} +EOF + +orch --db TMPDIR/coord.db --json council start \ + --run council_blog_report_011 \ + --target "Review the council reporting flow." 
\ + --only-unanimous + +THREAD_ID_CR1=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_report_011' AND task_id = 'CR1' AND attempt_no = 1;") +THREAD_ID_CR2=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_report_011' AND task_id = 'CR2' AND attempt_no = 1;") +THREAD_ID_CR3=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_report_011' AND task_id = 'CR3' AND attempt_no = 1;") + +inbox --db TMPDIR/coord.db --json claim --agent architecture-reviewer --thread "$THREAD_ID_CR1" +inbox --db TMPDIR/coord.db --json done --agent architecture-reviewer --thread "$THREAD_ID_CR1" --summary "Review complete" --body-file TMPDIR/architecture-review.json + +inbox --db TMPDIR/coord.db --json claim --agent implementation-reviewer --thread "$THREAD_ID_CR2" +inbox --db TMPDIR/coord.db --json done --agent implementation-reviewer --thread "$THREAD_ID_CR2" --summary "Review complete" --body-file TMPDIR/implementation-review.json + +inbox --db TMPDIR/coord.db --json claim --agent risk-reviewer --thread "$THREAD_ID_CR3" +inbox --db TMPDIR/coord.db --json done --agent risk-reviewer --thread "$THREAD_ID_CR3" --summary "Review complete" --body-file TMPDIR/risk-review.json + +orch --db TMPDIR/coord.db --json council tally \ + --run council_blog_report_011 \ + --similarity normal + +orch --db TMPDIR/coord.db --json council report \ + --run council_blog_report_011 +``` + +## Expected Output + +- The final `council report` command exits with code `0` +- `ok == true` +- `data.run_id == "council_blog_report_011"` +- `data.show == ["consensus"]` +- `data.summary.consensus == 1` +- `data.summary.majority == 1` +- `data.summary.minority == 1` +- `data.grouped_recommendations` has length `1` +- The only returned recommendation has `bucket == "consensus"` + +## Conclusions + +- `--only-unanimous` does not delete the persisted `majority` or `minority` data, but it changes the default output policy when `--show` is omitted +- If the leader still wants to see `majority` in a unanimous-only run, it must pass `--show` explicitly + +## Additional Constraints + +- 
Even though this case uses `--json` to assert the default `show` value, the command still writes the markdown artifact diff --git a/docs/tests/orch/council-report/council-report-json-shape-is-stable.md b/docs/tests/orch/council-report/council-report-json-shape-is-stable.md new file mode 100644 index 0000000..699c80a --- /dev/null +++ b/docs/tests/orch/council-report/council-report-json-shape-is-stable.md @@ -0,0 +1,41 @@ +# Case: `council-report-json-shape-is-stable` + +## Purpose + +Verify that `council report --json` returns a stable JSON contract that includes `show`, `summary`, the filtered grouped recommendations, and report artifact metadata. + +## Preconditions + +- Following the setup flow of `council-report-defaults-to-consensus-and-majority`, the run has completed reviewer outputs and `council tally` +- The run ID is `council_blog_report_003` + +## Input + +```bash +orch --db TMPDIR/coord.db --json council report \ + --run council_blog_report_003 +``` + +## Expected Output + +- Exit code is `0` +- `ok == true` +- `command == "council report"` +- `data.run_id == "council_blog_report_003"` +- `data.show == ["consensus","majority"]` +- `data.summary.consensus == 1` +- `data.summary.majority == 1` +- `data.summary.minority == 1` +- `data.report_artifacts` has length `1` +- The first artifact has `kind == "markdown"` +- `data.grouped_recommendations` has length `2` +- The first recommendation group has `bucket == "consensus"` + +## Conclusions + +- `--json` mode returns a stable machine-readable contract that the leader can keep consuming +- The default JSON output returns only the recommendations that pass the current `show` filter, while `summary` still carries counts for every bucket + +## Additional Constraints + +- Even though `--json` mode returns the artifact path, the markdown artifact must still actually be written to disk diff --git a/docs/tests/orch/council-report/council-report-rejects-before-tally.md b/docs/tests/orch/council-report/council-report-rejects-before-tally.md new file mode 100644 index 0000000..5f3d881 --- /dev/null +++ b/docs/tests/orch/council-report/council-report-rejects-before-tally.md @@ -0,0 +1,32 @@ +# Case: `council-report-rejects-before-tally` + +## Purpose + +Verify that `council report` returns the stable `invalid_state` contract when no grouped recommendations have been persisted yet, instead of generating an empty report. + +## Preconditions + +- Use an isolated temporary directory `TMPDIR` +- `council tally` has not yet been run for this council run in the current database + +## Input + +```bash +orch --db 
TMPDIR/coord.db --json council start \ + --run council_blog_report_010 \ + --target "Review the council reporting flow." + +orch --db TMPDIR/coord.db --json council report \ + --run council_blog_report_010 +``` + +## Expected Output + +- The second command, `council report`, exits with code `30` +- The JSON error code is `invalid_state` +- The error message states that grouped recommendations are not yet available and that `council tally` must be run first + +## Conclusions + +- `council report` is not a command that aggregates reviewer outputs on the fly while reading them +- The report stage depends on the persisted `council_groups`, so the `tally -> report` ordering is a stable CLI contract diff --git a/docs/tests/orch/council-report/council-report-rejects-invalid-show.md b/docs/tests/orch/council-report/council-report-rejects-invalid-show.md new file mode 100644 index 0000000..f2f084e --- /dev/null +++ b/docs/tests/orch/council-report/council-report-rejects-invalid-show.md @@ -0,0 +1,29 @@ +# Case: `council-report-rejects-invalid-show` + +## Purpose + +Verify that `council report --show` returns a stable `invalid_input` for an invalid bucket value, so the leader does not mistakenly assume that unknown buckets are silently ignored. + +## Preconditions + +- Following the setup flow of `council-report-defaults-to-consensus-and-majority`, the run has completed reviewer outputs and `council tally` +- The run ID is `council_blog_report_001` + +## Input + +```bash +orch --db TMPDIR/coord.db --json council report \ + --run council_blog_report_001 \ + --show consensus,invalid +``` + +## Expected Output + +- Exit code is `30` +- The JSON error code is `invalid_input` +- The error message explains that `--show` accepts only `consensus`, `majority`, `minority`, or `all` + +## Conclusions + +- `--show` is not a lenient filter; an unknown bucket triggers an explicit input error +- Leader-side scripts can rely on this to catch misconfiguration early instead of debugging an empty report after the fact diff --git a/docs/tests/orch/council-report/council-report-show-all-includes-minority.md b/docs/tests/orch/council-report/council-report-show-all-includes-minority.md new file mode 100644 index 0000000..334e3de --- /dev/null +++ b/docs/tests/orch/council-report/council-report-show-all-includes-minority.md @@ -0,0 +1,29 @@ +# Case: `council-report-show-all-includes-minority` + +## Purpose + +Verify that `council report --show all` also displays the `minority` recommendations that are omitted by default. + +## Preconditions + +- Following the setup flow of `council-report-defaults-to-consensus-and-majority`, the run has completed reviewer outputs and 
`council tally` +- The run ID is `council_blog_report_002` + +## Input + +```bash +orch --db TMPDIR/coord.db council report \ + --run council_blog_report_002 \ + --show all +``` + +## Expected Output + +- `council report` exits with code `0` +- The stdout markdown contains `## Consensus`, `## Majority`, and `## Minority` together +- The markdown includes a minority proposal, for example `Add regression tests for council report JSON output.` + +## Conclusions + +- `--show all` overrides the default bucket filtering policy +- `minority` recommendations are kept in the persisted data; they are only left out of the main report by default diff --git a/docs/tests/orch/council-start/README.md b/docs/tests/orch/council-start/README.md index c6aa1a7..a960dc3 100644 --- a/docs/tests/orch/council-start/README.md +++ b/docs/tests/orch/council-start/README.md @@ -1,7 +1,7 @@ # Orch `council start` Test Plan Index -## Status +## Case Files -No command case files are authored yet. - -Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress. +| Case Slug | File | Coverage Note | +| --- | --- | --- | +| `council-start-dispatches-three-reviewers` | [council-start-dispatches-three-reviewers.md](./council-start-dispatches-three-reviewers.md) | creates the council run, dispatches the fixed three reviewer roles, and exposes the expected default metadata | diff --git a/docs/tests/orch/council-start/council-start-dispatches-three-reviewers.md b/docs/tests/orch/council-start/council-start-dispatches-three-reviewers.md new file mode 100644 index 0000000..618afa4 --- /dev/null +++ b/docs/tests/orch/council-start/council-start-dispatches-three-reviewers.md @@ -0,0 +1,46 @@ +# Case: `council-start-dispatches-three-reviewers` + +## Purpose + +Verify that `council start` creates a new council run and immediately dispatches the fixed three reviewers: `architecture-reviewer`, `implementation-reviewer`, and `risk-reviewer`. + +## Preconditions + +- Use an isolated temporary directory `TMPDIR` +- The target database `TMPDIR/coord.db` does not exist yet + +## Input + +```bash +orch --db TMPDIR/coord.db --json council start \ + --run council_blog_001 \ + --target "Review the current blog architecture and propose optimizations." 
\ + --target-type mixed \ + --output both + +orch --db TMPDIR/coord.db --json status --run council_blog_001 +``` + +## Expected Output + +- `council start` exits with code `0` +- `start.data.run_id == "council_blog_001"` +- `start.data.mode == "brainstorm"` +- `start.data.target_type == "mixed"` +- `start.data.output == "both"` +- `start.data.only_unanimous == false` +- `start.data.reviewers` has length `3` +- The three reviewers have `reviewer_role` values `architecture-reviewer`, `implementation-reviewer`, and `risk-reviewer` respectively +- All three reviewers have `status` `dispatched` +- The subsequent `status` call returns `3` tasks, and the run is active rather than terminal + +## Conclusions + +- `council start` does more than create run metadata; it also creates and dispatches the reviewer tasks directly +- The v1 reviewer set is a fixed set of three roles, not something the user specifies dynamically + +## Additional Constraints + +- When `--mode` is not passed explicitly, it falls back to `brainstorm` +- When `--only-unanimous` is not passed explicitly, the default is `false` +- In the current implementation, council reviewer tasks should not automatically request a code worktree diff --git a/docs/tests/orch/council-tally/README.md b/docs/tests/orch/council-tally/README.md index 74a8ddd..b104715 100644 --- a/docs/tests/orch/council-tally/README.md +++ b/docs/tests/orch/council-tally/README.md @@ -1,7 +1,8 @@ # Orch `council tally` Test Plan Index -## Status +## Case Files -No command case files are authored yet. - -Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress. 
+| Case Slug | File | Coverage Note | +| --- | --- | --- | +| `council-tally-groups-reviewer-findings-in-normal-mode` | [council-tally-groups-reviewer-findings-in-normal-mode.md](./council-tally-groups-reviewer-findings-in-normal-mode.md) | groups semantically similar reviewer outputs into majority and minority buckets in `normal` mode | +| `council-tally-keeps-distinct-proposals-in-strict-mode` | [council-tally-keeps-distinct-proposals-in-strict-mode.md](./council-tally-keeps-distinct-proposals-in-strict-mode.md) | preserves wording differences as separate minority groups in `strict` mode | diff --git a/docs/tests/orch/council-tally/council-tally-groups-reviewer-findings-in-normal-mode.md b/docs/tests/orch/council-tally/council-tally-groups-reviewer-findings-in-normal-mode.md new file mode 100644 index 0000000..200bf2d --- /dev/null +++ b/docs/tests/orch/council-tally/council-tally-groups-reviewer-findings-in-normal-mode.md @@ -0,0 +1,67 @@ +# Case: `council-tally-groups-reviewer-findings-in-normal-mode` + +## Purpose + +Verify that `council tally --similarity normal` merges semantically similar reviewer proposals into the same group and produces the `majority` / `minority` buckets. + +## Preconditions + +- Use an isolated temporary directory `TMPDIR` +- `sqlite3` is available locally to read reviewer thread IDs from `task_attempts` +- Three reviewer output JSON files are prepared; the architecture and implementation proposals are semantically similar, and the risk proposal is independent + +## Input + +```bash +cat <<'EOF' > TMPDIR/architecture-review.json +{"reviewer_role":"architecture-reviewer","findings":[{"title":"Split contracts","summary":"Transport contracts are mixed into UI code.","proposal":"Move API contract definitions into a dedicated module.","rationale":"This lowers coupling.","confidence":"high","tags":["architecture","coupling"],"target_refs":{"repo_path":"."}}]} +EOF + +cat <<'EOF' > TMPDIR/implementation-review.json +{"reviewer_role":"implementation-reviewer","findings":[{"title":"Extract API contracts","summary":"Shared transport shapes are duplicated.","proposal":"Move API contract definitions into dedicated 
module","rationale":"This reduces duplication.","confidence":"medium","tags":["maintainability"],"target_refs":{"repo_path":"."}}]} +EOF + +cat <<'EOF' > TMPDIR/risk-review.json +{"reviewer_role":"risk-reviewer","findings":[{"title":"Add auth integration tests","summary":"Login regressions are hard to catch.","proposal":"Add integration tests for auth flows.","rationale":"This catches regressions earlier.","confidence":"high","tags":["risk","testing"],"target_refs":{"repo_path":"."}}]} +EOF + +orch --db TMPDIR/coord.db --json council start \ + --run council_blog_tally_001 \ + --target "Review the current blog architecture." + +THREAD_ID_CR1=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_tally_001' AND task_id = 'CR1' AND attempt_no = 1;") +THREAD_ID_CR2=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_tally_001' AND task_id = 'CR2' AND attempt_no = 1;") +THREAD_ID_CR3=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_tally_001' AND task_id = 'CR3' AND attempt_no = 1;") + +inbox --db TMPDIR/coord.db --json claim --agent architecture-reviewer --thread "$THREAD_ID_CR1" +inbox --db TMPDIR/coord.db --json done --agent architecture-reviewer --thread "$THREAD_ID_CR1" --summary "Review complete" --body-file TMPDIR/architecture-review.json + +inbox --db TMPDIR/coord.db --json claim --agent implementation-reviewer --thread "$THREAD_ID_CR2" +inbox --db TMPDIR/coord.db --json done --agent implementation-reviewer --thread "$THREAD_ID_CR2" --summary "Review complete" --body-file TMPDIR/implementation-review.json + +inbox --db TMPDIR/coord.db --json claim --agent risk-reviewer --thread "$THREAD_ID_CR3" +inbox --db TMPDIR/coord.db --json done --agent risk-reviewer --thread "$THREAD_ID_CR3" --summary "Review complete" --body-file TMPDIR/risk-review.json + +orch --db TMPDIR/coord.db --json council tally \ + --run council_blog_tally_001 \ + --similarity 
normal +``` + +## Expected Output + +- `council tally` exits with code `0` +- `tally.data.similarity == "normal"` +- `tally.data.counts.majority == 1` +- `tally.data.counts.minority == 1` +- `tally.data.grouped_recommendations` has length `2` +- The first recommendation group has `bucket == "majority"` +- The first recommendation group has `support_count == 2` + +## Conclusions + +- `normal` mode merges proposals by normalized intent first rather than comparing them literally +- The tally output returns the grouped recommendation details in addition to the summary counts + +## Additional Constraints + +- The reviewer `done` message body must be structured JSON; invalid JSON or a missing `reviewer_role`/`proposal` makes tally return `invalid_input` diff --git a/docs/tests/orch/council-tally/council-tally-keeps-distinct-proposals-in-strict-mode.md b/docs/tests/orch/council-tally/council-tally-keeps-distinct-proposals-in-strict-mode.md new file mode 100644 index 0000000..aa63933 --- /dev/null +++ b/docs/tests/orch/council-tally/council-tally-keeps-distinct-proposals-in-strict-mode.md @@ -0,0 +1,61 @@ +# Case: `council-tally-keeps-distinct-proposals-in-strict-mode` + +## Purpose + +Verify that `council tally --similarity strict` does not merge proposals with different wording; even when they are semantically close, they remain separate recommendations. + +## Preconditions + +- Use an isolated temporary directory `TMPDIR` +- `sqlite3` is available locally to read reviewer thread IDs from `task_attempts` +- Three reviewer output JSON files are prepared; the architecture and implementation proposals are semantically close but worded differently + +## Input + +```bash +cat <<'EOF' > TMPDIR/architecture-review.json +{"reviewer_role":"architecture-reviewer","findings":[{"title":"Split contracts","summary":"Transport contracts are mixed into UI code.","proposal":"Move API contract definitions into a dedicated module.","rationale":"This lowers coupling.","confidence":"high","tags":["architecture"],"target_refs":{"repo_path":"."}}]} +EOF + +cat <<'EOF' > TMPDIR/implementation-review.json +{"reviewer_role":"implementation-reviewer","findings":[{"title":"Extract API contracts","summary":"Shared transport shapes are duplicated.","proposal":"Move API contract definitions into dedicated module","rationale":"This reduces 
duplication.","confidence":"medium","tags":["maintainability"],"target_refs":{"repo_path":"."}}]} +EOF + +cat <<'EOF' > TMPDIR/risk-review.json +{"reviewer_role":"risk-reviewer","findings":[{"title":"Add auth integration tests","summary":"Login regressions are hard to catch.","proposal":"Add integration tests for auth flows.","rationale":"This catches regressions earlier.","confidence":"high","tags":["risk"],"target_refs":{"repo_path":"."}}]} +EOF + +orch --db TMPDIR/coord.db --json council start \ + --run council_blog_tally_002 \ + --target "Review the current blog architecture." + +THREAD_ID_CR1=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_tally_002' AND task_id = 'CR1' AND attempt_no = 1;") +THREAD_ID_CR2=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_tally_002' AND task_id = 'CR2' AND attempt_no = 1;") +THREAD_ID_CR3=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_tally_002' AND task_id = 'CR3' AND attempt_no = 1;") + +inbox --db TMPDIR/coord.db --json claim --agent architecture-reviewer --thread "$THREAD_ID_CR1" +inbox --db TMPDIR/coord.db --json done --agent architecture-reviewer --thread "$THREAD_ID_CR1" --summary "Review complete" --body-file TMPDIR/architecture-review.json + +inbox --db TMPDIR/coord.db --json claim --agent implementation-reviewer --thread "$THREAD_ID_CR2" +inbox --db TMPDIR/coord.db --json done --agent implementation-reviewer --thread "$THREAD_ID_CR2" --summary "Review complete" --body-file TMPDIR/implementation-review.json + +inbox --db TMPDIR/coord.db --json claim --agent risk-reviewer --thread "$THREAD_ID_CR3" +inbox --db TMPDIR/coord.db --json done --agent risk-reviewer --thread "$THREAD_ID_CR3" --summary "Review complete" --body-file TMPDIR/risk-review.json + +orch --db TMPDIR/coord.db --json council tally \ + --run council_blog_tally_002 \ + --similarity strict +``` + +## Expected Output + +- `council tally` 
exits with code `0` +- `tally.data.similarity == "strict"` +- `tally.data.counts.minority == 3` +- `tally.data.grouped_recommendations` has length `3` +- All three recommendation groups fall into `minority` + +## Conclusions + +- The goal of `strict` mode is to preserve the literal differences between proposals rather than merge them loosely +- When no proposals are merged, each support count degenerates to a single supporting reviewer diff --git a/docs/tests/orch/council-wait/README.md b/docs/tests/orch/council-wait/README.md index e76363e..11c8037 100644 --- a/docs/tests/orch/council-wait/README.md +++ b/docs/tests/orch/council-wait/README.md @@ -1,7 +1,8 @@ # Orch `council wait` Test Plan Index -## Status +## Case Files -No command case files are authored yet. - -Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress. +| Case Slug | File | Coverage Note | +| --- | --- | --- | +| `council-wait-wakes-when-all-reviewers-complete` | [council-wait-wakes-when-all-reviewers-complete.md](./council-wait-wakes-when-all-reviewers-complete.md) | wakes successfully once all three reviewer threads reach terminal success | +| `council-wait-times-out-when-reviewers-incomplete` | [council-wait-times-out-when-reviewers-incomplete.md](./council-wait-times-out-when-reviewers-incomplete.md) | returns a stable timeout result while reviewer work remains incomplete | diff --git a/docs/tests/orch/council-wait/council-wait-times-out-when-reviewers-incomplete.md b/docs/tests/orch/council-wait/council-wait-times-out-when-reviewers-incomplete.md new file mode 100644 index 0000000..91489d6 --- /dev/null +++ b/docs/tests/orch/council-wait/council-wait-times-out-when-reviewers-incomplete.md @@ -0,0 +1,35 @@ +# Case: `council-wait-times-out-when-reviewers-incomplete` + +## Purpose + +Verify that `council wait` returns a stable timeout result when not all reviewers have finished, rather than misreporting a successful wake-up. + +## Preconditions + +- Use an isolated temporary directory `TMPDIR` +- The target database `TMPDIR/coord.db` does not exist yet + +## Input + +```bash +orch --db TMPDIR/coord.db --json council start \ + --run council_blog_wait_002 \ + --target "Review the current blog architecture." 
+ +orch --db TMPDIR/coord.db --json council wait \ + --run council_blog_wait_002 \ + --timeout-seconds 1 +``` + +## Expected Output + +- `council wait` exits with code `0` +- `wait.data.woke == false` +- `wait.data.all_complete == false` +- `wait.data.reviewers` has length `3` +- The returned reviewer status set does not require every reviewer to be complete + +## Conclusions + +- The timeout result of `council wait` is an explicit "not woken" state, not an error exit +- The leader can handle both the wake-up and timeout paths with the same response structure diff --git a/docs/tests/orch/council-wait/council-wait-wakes-when-all-reviewers-complete.md b/docs/tests/orch/council-wait/council-wait-wakes-when-all-reviewers-complete.md new file mode 100644 index 0000000..0b3ac9c --- /dev/null +++ b/docs/tests/orch/council-wait/council-wait-wakes-when-all-reviewers-complete.md @@ -0,0 +1,53 @@ +# Case: `council-wait-wakes-when-all-reviewers-complete` + +## Purpose + +Verify that `council wait` wakes once all three reviewers have finished and returns the complete reviewer status set. + +## Preconditions + +- Use an isolated temporary directory `TMPDIR` +- Run `council_blog_wait_001` has been created via `council start` +- `sqlite3` is available locally to read reviewer thread IDs from `task_attempts`, which are used to drive `inbox` into the completed state + +## Input + +```bash +orch --db TMPDIR/coord.db --json council start \ + --run council_blog_wait_001 \ + --target "Review the current blog architecture." 
+ +THREAD_ID_CR1=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_wait_001' AND task_id = 'CR1' AND attempt_no = 1;") +THREAD_ID_CR2=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_wait_001' AND task_id = 'CR2' AND attempt_no = 1;") +THREAD_ID_CR3=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_wait_001' AND task_id = 'CR3' AND attempt_no = 1;") + +inbox --db TMPDIR/coord.db --json claim --agent architecture-reviewer --thread "$THREAD_ID_CR1" +inbox --db TMPDIR/coord.db --json done --agent architecture-reviewer --thread "$THREAD_ID_CR1" --summary "Review complete" + +inbox --db TMPDIR/coord.db --json claim --agent implementation-reviewer --thread "$THREAD_ID_CR2" +inbox --db TMPDIR/coord.db --json done --agent implementation-reviewer --thread "$THREAD_ID_CR2" --summary "Review complete" + +inbox --db TMPDIR/coord.db --json claim --agent risk-reviewer --thread "$THREAD_ID_CR3" +inbox --db TMPDIR/coord.db --json done --agent risk-reviewer --thread "$THREAD_ID_CR3" --summary "Review complete" + +orch --db TMPDIR/coord.db --json council wait \ + --run council_blog_wait_001 \ + --timeout-seconds 2 +``` + +## Expected Output + +- `council wait` exits with code `0` +- `wait.data.woke == true` +- `wait.data.all_complete == true` +- `wait.data.reviewers` has length `3` +- All three reviewers have `status` `done` + +## Conclusions + +- The wake-up condition for `council wait` is "all three reviewers have reached terminal success" +- The result not only reports the wake-up but also carries a complete snapshot of reviewer statuses, so the leader can proceed to tally/report + +## Additional Constraints + +- The current manual reproduction has to extract each reviewer `thread_id` from `task_attempts`, because the `orch` CLI does not yet expose the attempt-thread mapping directly diff --git a/docs/tests/orch/dep-add/README.md b/docs/tests/orch/dep-add/README.md index 849602e..a690d56 100644 --- a/docs/tests/orch/dep-add/README.md +++ b/docs/tests/orch/dep-add/README.md @@ -1,7 +1,7 @@ # Orch `dep add` Test Plan Index -## Status +## Case Files -No command case files are authored yet. 
- -Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress. +| Case Slug | File | Coverage Note | +| --- | --- | --- | +| `dep-add-blocks-dependent-task-until-prerequisite-completes` | [dep-add-blocks-dependent-task-until-prerequisite-completes.md](./dep-add-blocks-dependent-task-until-prerequisite-completes.md) | adds a dependency edge that keeps the dependent task out of the ready set | diff --git a/docs/tests/orch/dep-add/dep-add-blocks-dependent-task-until-prerequisite-completes.md b/docs/tests/orch/dep-add/dep-add-blocks-dependent-task-until-prerequisite-completes.md new file mode 100644 index 0000000..0e7d94b --- /dev/null +++ b/docs/tests/orch/dep-add/dep-add-blocks-dependent-task-until-prerequisite-completes.md @@ -0,0 +1,38 @@ +# Case: `dep-add-blocks-dependent-task-until-prerequisite-completes` + +## Purpose + +Verify that `dep add` creates a dependency edge and keeps the dependent task unschedulable until its prerequisite task completes. + +## Preconditions + +- Run `run_blog_002` exists +- Tasks `T1` and `T2` exist under the run + +## Input + +```bash +orch --db TMPDIR/coord.db --json run init --run run_blog_002 --goal "Build dependency-aware workflow" +orch --db TMPDIR/coord.db --json task add --run run_blog_002 --task T1 --title "Build backend" --default-to worker-a +orch --db TMPDIR/coord.db --json task add --run run_blog_002 --task T2 --title "Build frontend" --default-to worker-b +orch --db TMPDIR/coord.db --json dep add --run run_blog_002 --task T2 --depends-on T1 +orch --db TMPDIR/coord.db --json ready --run run_blog_002 +``` + +## Expected Output + +- `dep add` exits with code `0` +- `data.dependency.task_id == "T2"` +- `data.dependency.depends_on_task_id == "T1"` +- The subsequent `ready` call returns only `T1` +- `T2` does not appear in `ready.data.tasks` + +## Conclusions + +- `dep add` takes effect on the ready computation immediately +- Dependencies are scheduling gates, not display-only metadata + +## Additional Constraints + +- `--task` cannot depend on itself; a self-dependency should return `invalid_input` +- Adding the same dependency edge twice should return `invalid_state` diff --git a/docs/tests/orch/dispatch/README.md b/docs/tests/orch/dispatch/README.md index c28697c..88ef75d 100644 --- a/docs/tests/orch/dispatch/README.md +++ 
b/docs/tests/orch/dispatch/README.md @@ -1,7 +1,13 @@ # Orch `dispatch` Test Plan Index -## Status +## Case Files -No command case files are authored yet. - -Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress. +| Case Slug | File | Coverage Note | +| --- | --- | --- | +| `dispatch-creates-attempt-and-thread-for-ready-task` | [dispatch-creates-attempt-and-thread-for-ready-task.md](./dispatch-creates-attempt-and-thread-for-ready-task.md) | dispatches a ready task into a new attempt, inbox thread, and initial task message | +| `dispatch-rejects-non-ready-task` | [dispatch-rejects-non-ready-task.md](./dispatch-rejects-non-ready-task.md) | rejects dispatch when the task is still gated by dependencies | +| `dispatch-creates-strict-worktree` | [dispatch-creates-strict-worktree.md](./dispatch-creates-strict-worktree.md) | provisions a strict worktree and writes workspace metadata into the attempt and payload | +| `dispatch-rejects-dirty-repo-without-base-ref` | [dispatch-rejects-dirty-repo-without-base-ref.md](./dispatch-rejects-dirty-repo-without-base-ref.md) | blocks strict worktree dispatch from a dirty repository without an explicit base ref | +| `dispatch-allows-explicit-base-ref-on-dirty-repo` | [dispatch-allows-explicit-base-ref-on-dirty-repo.md](./dispatch-allows-explicit-base-ref-on-dirty-repo.md) | accepts dirty repository state when `--base-ref` resolves to a concrete commit | +| `dispatch-auto-enables-worktree-for-code-like-task` | [dispatch-auto-enables-worktree-for-code-like-task.md](./dispatch-auto-enables-worktree-for-code-like-task.md) | auto-enables worktree mode for code-like tasks when no explicit worktree flags are supplied | +| `dispatch-skips-auto-worktree-for-non-code-task` | [dispatch-skips-auto-worktree-for-non-code-task.md](./dispatch-skips-auto-worktree-for-non-code-task.md) | keeps clearly non-code tasks on the normal non-worktree dispatch path | diff --git 
a/docs/tests/orch/dispatch/dispatch-allows-explicit-base-ref-on-dirty-repo.md b/docs/tests/orch/dispatch/dispatch-allows-explicit-base-ref-on-dirty-repo.md new file mode 100644 index 0000000..59cff56 --- /dev/null +++ b/docs/tests/orch/dispatch/dispatch-allows-explicit-base-ref-on-dirty-repo.md @@ -0,0 +1,31 @@ +# Case: `dispatch-allows-explicit-base-ref-on-dirty-repo` + +## Purpose + +Verify that strict worktree dispatch can still create an attempt on a dirty repository as long as a resolvable `--base-ref` is given explicitly. + +## Preconditions + +- `TMPDIR/repo` is a Git repository +- The working tree has uncommitted changes +- `HEAD` still points to a valid commit +- Run `run_blog_worktree_003` and task `T1` exist + +## Input + +```bash +orch --db TMPDIR/coord.db --json run init --run run_blog_worktree_003 --goal "Validate explicit base ref on dirty repo" +orch --db TMPDIR/coord.db --json task add --run run_blog_worktree_003 --task T1 --title "Implement backend" --default-to worker-a +orch --db TMPDIR/coord.db --json dispatch --run run_blog_worktree_003 --task T1 --repo-path TMPDIR/repo --workspace-root .orch/worktrees --strict-worktree --base-ref HEAD +``` + +## Expected Output + +- `dispatch` exits with code `0` +- `data.attempt.base_ref == "HEAD"` +- `data.attempt.base_commit` equals the commit that `HEAD` resolves to, as it stood before the working tree became dirty + +## Conclusions + +- `--base-ref` is the explicit unlock condition for strict dispatch on a dirty repo +- The worktree baseline comes from the commit, not from the current uncommitted working tree content diff --git a/docs/tests/orch/dispatch/dispatch-auto-enables-worktree-for-code-like-task.md b/docs/tests/orch/dispatch/dispatch-auto-enables-worktree-for-code-like-task.md new file mode 100644 index 0000000..4e92e6c --- /dev/null +++ b/docs/tests/orch/dispatch/dispatch-auto-enables-worktree-for-code-like-task.md @@ -0,0 +1,34 @@ +# Case: `dispatch-auto-enables-worktree-for-code-like-task` + +## Purpose + +Verify that `dispatch` automatically enables the worktree flow for code-like tasks when no worktree flags are passed explicitly. + +## Preconditions + +- `TMPDIR/repo` is a clean Git repository +- A code-like task `T1` exists + +## Input + +```bash +orch --db TMPDIR/coord.db --json run init --run run_blog_auto_worktree_001 --goal "Validate auto worktree detection" +orch --db TMPDIR/coord.db --json task add 
--run run_blog_auto_worktree_001 --task T1 --title "Implement backend API" --default-to backend-worker +orch --db TMPDIR/coord.db --json dispatch --run run_blog_auto_worktree_001 --task T1 --repo-path TMPDIR/repo +``` + +## Expected Output + +- `dispatch` exits with code `0` +- `data.attempt.worktree_path` is non-empty +- `data.attempt.workspace_status == "created"` +- The returned worktree path exists on disk + +## Conclusions + +- `dispatch` has automatic worktree inference, so the leader does not have to pass `--strict-worktree` every time + +## Additional Constraints + +- The inference currently relies mainly on the task role and code-like markers in the acceptance JSON +- When `--workspace-root` is not specified, auto worktree mode writes to `.orch/worktrees` under the repository by default diff --git a/docs/tests/orch/dispatch/dispatch-creates-attempt-and-thread-for-ready-task.md b/docs/tests/orch/dispatch/dispatch-creates-attempt-and-thread-for-ready-task.md new file mode 100644 index 0000000..8e1f8e4 --- /dev/null +++ b/docs/tests/orch/dispatch/dispatch-creates-attempt-and-thread-for-ready-task.md @@ -0,0 +1,38 @@ +# Case: `dispatch-creates-attempt-and-thread-for-ready-task` + +## Purpose + +Verify that when a task is `ready`, `dispatch` creates an attempt, maps an inbox thread, and writes the first task message. + +## Preconditions + +- Run `run_blog_001` exists +- A task `T1` with no dependencies exists + +## Input + +```bash +orch --db TMPDIR/coord.db --json run init --run run_blog_001 --goal "Build blog MVP" --summary "Public blog plus admin CRUD" +orch --db TMPDIR/coord.db --json task add --run run_blog_001 --task T1 --title "Implement retry policy" --summary "Add retry policy to HTTP client" --default-to worker-a +orch --db TMPDIR/coord.db --json dispatch --run run_blog_001 --task T1 --body "Implement retry handling for the HTTP client." 
+``` + +## Expected Output + +- `dispatch` exits with code `0` +- `data.task.status == "dispatched"` +- `data.attempt.attempt_no == 1` +- `data.attempt.thread_id` is returned +- `data.attempt.assigned_to == "worker-a"` +- `data.thread.thread_id` matches `data.attempt.thread_id` +- `data.message.kind == "task"` + +## Conclusions + +- `dispatch` is the command that materializes scheduling intent into one attempt and one inbox thread +- Once the task is `dispatched`, the leader can use the thread mapping to wait for progress on the worker side + +## Additional Constraints + +- When `--to` is not passed explicitly, it falls back to the task's `default_to` +- `--body` and `--body-file` are mutually exclusive; an unreadable `--body-file` should return `invalid_input` diff --git a/docs/tests/orch/dispatch/dispatch-creates-strict-worktree.md b/docs/tests/orch/dispatch/dispatch-creates-strict-worktree.md new file mode 100644 index 0000000..37777cd --- /dev/null +++ b/docs/tests/orch/dispatch/dispatch-creates-strict-worktree.md @@ -0,0 +1,39 @@ +# Case: `dispatch-creates-strict-worktree` + +## Purpose + +Verify that an explicit `--strict-worktree` dispatch creates an isolated worktree and persists the workspace metadata into both the attempt and the task payload. + +## Preconditions + +- `TMPDIR/repo` is a clean Git repository +- The repository contains at least one committed file +- Run `run_blog_worktree_001` and task `T1` exist + +## Input + +```bash +orch --db TMPDIR/coord.db --json run init --run run_blog_worktree_001 --goal "Validate strict worktree dispatch" +orch --db TMPDIR/coord.db --json task add --run run_blog_worktree_001 --task T1 --title "Implement backend" --default-to worker-a +orch --db TMPDIR/coord.db --json dispatch --run run_blog_worktree_001 --task T1 --repo-path TMPDIR/repo --workspace-root .orch/worktrees --strict-worktree --body "Implement inside isolated worktree." 
+```
+
+## Expected Output
+
+- `dispatch` exits with code `0`
+- `data.attempt.base_ref == "HEAD"`
+- `data.attempt.base_commit` equals the repository's current `HEAD` commit
+- `data.attempt.branch_name == "orch/run-blog-worktree-001/T1/attempt-1"`
+- A non-empty `data.attempt.worktree_path` is returned
+- `data.attempt.workspace_status == "created"`
+- `data.message.payload_json.worktree_path` matches the path in the attempt
+
+## Asserted Conclusions
+
+- Strict worktree dispatch creates a real isolated working directory rather than merely recording string metadata
+- A worker reading the task payload gets the same worktree path
+
+## Additional Constraints
+
+- When `--base-ref` is not passed explicitly and the repository is clean, it falls back to `HEAD`
+- A relative `--workspace-root` is resolved relative to the repository root
diff --git a/docs/tests/orch/dispatch/dispatch-rejects-dirty-repo-without-base-ref.md b/docs/tests/orch/dispatch/dispatch-rejects-dirty-repo-without-base-ref.md
new file mode 100644
index 0000000..672f5d1
--- /dev/null
+++ b/docs/tests/orch/dispatch/dispatch-rejects-dirty-repo-without-base-ref.md
@@ -0,0 +1,30 @@
+# Case: `dispatch-rejects-dirty-repo-without-base-ref`
+
+## Purpose
+
+Verify that strict worktree dispatch refuses to proceed when the repository has uncommitted changes and `--base-ref` is not specified explicitly.
+
+## Preconditions
+
+- `TMPDIR/repo` is a Git repository
+- The working tree has uncommitted changes
+- Run `run_blog_worktree_002` and task `T1` already exist
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json run init --run run_blog_worktree_002 --goal "Validate dirty repo rejection"
+orch --db TMPDIR/coord.db --json task add --run run_blog_worktree_002 --task T1 --title "Implement backend" --default-to worker-a
+orch --db TMPDIR/coord.db --json dispatch --run run_blog_worktree_002 --task T1 --repo-path TMPDIR/repo --workspace-root .orch/worktrees --strict-worktree
+```
+
+## Expected Output
+
+- `dispatch` exits with code `30`
+- The JSON error code is `invalid_state`
+- `.orch/worktrees/run_blog_worktree_002/T1/attempt-1` must not be created
+
+## Asserted Conclusions
+
+- Strict mode does not silently swallow uncommitted working-tree state
+- When the leader depends on a dirty working tree, it must pass `--base-ref` explicitly
diff --git a/docs/tests/orch/dispatch/dispatch-rejects-non-ready-task.md b/docs/tests/orch/dispatch/dispatch-rejects-non-ready-task.md
new file mode 100644
index 0000000..7f424d5
--- /dev/null
+++ b/docs/tests/orch/dispatch/dispatch-rejects-non-ready-task.md
@@ -0,0 +1,34 @@
+# Case: `dispatch-rejects-non-ready-task`
+
+## Purpose
+
+Verify that when a task is still blocked by a dependency, `dispatch` returns the stable `invalid_state` contract instead of silently creating an attempt.
+
+## Preconditions
+
+- Run `run_blog_003` already exists
+- Task `T2` depends on `T1`
+- `T1` is not yet complete
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json run init --run run_blog_003 --goal "Validate ready gating"
+orch --db TMPDIR/coord.db --json task add --run run_blog_003 --task T1 --title "Backend"
+orch --db TMPDIR/coord.db --json task add --run run_blog_003 --task T2 --title "Frontend"
+orch --db TMPDIR/coord.db --json dep add --run run_blog_003 --task T2 --depends-on T1
+orch --db TMPDIR/coord.db --json dispatch --run run_blog_003 --task T2
+```
+
+## Expected Output
+
+- `dispatch` exits with code `30`
+- The JSON error code is `invalid_state`
+
+## Asserted Conclusions
+
+- When dependencies are unmet, `dispatch` fails early and does not bypass the ready gate
+
+## Additional Constraints
+
+- This is a scheduling-state error, not `not_found`
diff --git a/docs/tests/orch/dispatch/dispatch-skips-auto-worktree-for-non-code-task.md b/docs/tests/orch/dispatch/dispatch-skips-auto-worktree-for-non-code-task.md
new file mode 100644
index 0000000..8f2644d
--- /dev/null
+++ b/docs/tests/orch/dispatch/dispatch-skips-auto-worktree-for-non-code-task.md
@@ -0,0 +1,35 @@
+# Case: `dispatch-skips-auto-worktree-for-non-code-task`
+
+## Purpose
+
+Verify that when no worktree flags are passed explicitly, `dispatch` does not wrongly push an obviously non-code task onto the worktree execution path.
+
+## Preconditions
+
+- `TMPDIR/repo` is a clean Git repository
+- Run `run_blog_auto_worktree_002` already exists
+- Non-code-like task `T1` already exists
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json run init --run run_blog_auto_worktree_002 --goal "Validate non-code dispatch fallback"
+orch --db TMPDIR/coord.db --json task add --run run_blog_auto_worktree_002 --task T1 --title "Review QA findings" --summary "Summarize test failures and next steps" --default-to qa-worker
+orch --db TMPDIR/coord.db --json dispatch --run run_blog_auto_worktree_002 --task T1 --repo-path TMPDIR/repo
+```
+
+## Expected Output
+
+- `dispatch` exits with code `0`
+- `data.attempt.worktree_path == ""`
+- `data.attempt.workspace_status == ""`
+- `thread_id` and the first task message are still returned normally
+
+## Asserted Conclusions
+
+- Automatic worktree inference does not mean "create a worktree whenever a repository is present"
+- Non-code tasks still take the standard dispatch path and do not gratuitously introduce a branch and working directory
+
+## Additional Constraints
+
+- Currently a task is usually judged non-code when its title, summary, role, and acceptance information all lack code-like signals
diff --git a/docs/tests/orch/ready/README.md b/docs/tests/orch/ready/README.md
index 2069960..2ba734e 100644
--- a/docs/tests/orch/ready/README.md
+++ b/docs/tests/orch/ready/README.md
@@ -1,7 +1,8 @@
 # Orch `ready` Test Plan Index
 
-## Status
+## Case Files
 
-No command case files are authored yet.
-
-Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress.
+| Case Slug | File | Coverage Note |
+| --- | --- | --- |
+| `ready-lists-only-eligible-tasks` | [ready-lists-only-eligible-tasks.md](./ready-lists-only-eligible-tasks.md) | returns only dependency-cleared tasks in priority-aware ready order |
+| `ready-orders-by-priority-and-respects-limit` | [ready-orders-by-priority-and-respects-limit.md](./ready-orders-by-priority-and-respects-limit.md) | sorts ready tasks by priority and applies `--limit` after ordering |
diff --git a/docs/tests/orch/ready/ready-lists-only-eligible-tasks.md b/docs/tests/orch/ready/ready-lists-only-eligible-tasks.md
new file mode 100644
index 0000000..b19fc20
--- /dev/null
+++ b/docs/tests/orch/ready/ready-lists-only-eligible-tasks.md
@@ -0,0 +1,38 @@
+# Case: `ready-lists-only-eligible-tasks`
+
+## Purpose
+
+Verify that `ready` returns only the tasks that are actually schedulable right now and does not mix in tasks still blocked by dependencies.
+
+## Preconditions
+
+- Run `run_blog_002` already exists
+- `T1` and `T2` have been created
+- `T2` depends on `T1` via `dep add`
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json run init --run run_blog_002 --goal "Build dependency-aware workflow"
+orch --db TMPDIR/coord.db --json task add --run run_blog_002 --task T1 --title "Build backend" --default-to worker-a
+orch --db TMPDIR/coord.db --json task add --run run_blog_002 --task T2 --title "Build frontend" --default-to worker-b
+orch --db TMPDIR/coord.db --json dep add --run run_blog_002 --task T2 --depends-on T1
+orch --db TMPDIR/coord.db --json ready --run run_blog_002
+```
+
+## Expected Output
+
+- `ready` exits with code `0`
+- `data.tasks` has length `1`
+- The only returned item is `T1`
+- The returned task's status is `ready`
+
+## Asserted Conclusions
+
+- `ready` is the result of dependency and status filtering, not a plain list of "all unfinished tasks"
+- A new agent can rely on this command to decide which work can be dispatched immediately
+
+## Additional Constraints
+
+- When `--limit` is not passed explicitly, the default cap is `20`
+- When `--run` points to a nonexistent run, it should return `not_found`
diff --git a/docs/tests/orch/ready/ready-orders-by-priority-and-respects-limit.md b/docs/tests/orch/ready/ready-orders-by-priority-and-respects-limit.md
new file mode 100644
index 0000000..6643701
--- /dev/null
+++ b/docs/tests/orch/ready/ready-orders-by-priority-and-respects-limit.md
@@ -0,0 +1,39 @@
+# Case: `ready-orders-by-priority-and-respects-limit`
+
+## Purpose
+
+Verify that `ready` first sorts schedulable tasks by priority and then applies `--limit` to truncate the result, rather than trimming directly in creation order.
+
+## Preconditions
+
+- Run `run_blog_005` already exists
+- The run has at least three dependency-free tasks in the `ready` state
+- The three tasks have priorities `high`, `normal`, and `low`
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json run init --run run_blog_005 --goal "Validate ready ordering"
+orch --db TMPDIR/coord.db --json task add --run run_blog_005 --task T1 --title "Low priority task" --priority low
+orch --db TMPDIR/coord.db --json task add --run run_blog_005 --task T2 --title "Normal priority task" --priority normal
+orch --db TMPDIR/coord.db --json task add --run run_blog_005 --task T3 --title "High priority task" --priority high
+orch --db TMPDIR/coord.db --json ready --run run_blog_005 --limit 2
+```
+
+## Expected Output
+
+- `ready` exits with code `0`
+- `data.tasks` has length `2`
+- The first returned item is the high-priority task `T3`
+- The second returned item is the normal-priority task `T2`
+- The low-priority task `T1` does not appear in this result
+
+## Asserted Conclusions
+
+- The user-visible order of `ready` is `high -> normal -> low`
+- `--limit` truncation happens after priority sorting, so the leader can rely on this command to see the more important schedulable tasks first
+
+## Additional Constraints
+
+- When multiple ready tasks share the same priority, the current implementation returns them stably in ascending creation-time order
+- When `--limit` is not passed explicitly, the default cap is still `20`
diff --git a/docs/tests/orch/reassign/README.md b/docs/tests/orch/reassign/README.md
index 79c52ff..5b34e2c 100644
--- a/docs/tests/orch/reassign/README.md
+++ b/docs/tests/orch/reassign/README.md
@@ -1,7 +1,7 @@
 # Orch `reassign` Test Plan Index
 
-## Status
+## Case Files
 
-No command case files are authored yet.
-
-Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress.
+| Case Slug | File | Coverage Note |
+| --- | --- | --- |
+| `reassign-cancels-old-thread-and-dispatches-new-attempt` | [reassign-cancels-old-thread-and-dispatches-new-attempt.md](./reassign-cancels-old-thread-and-dispatches-new-attempt.md) | cancels the old blocked thread and creates a new attempt for another worker |
diff --git a/docs/tests/orch/reassign/reassign-cancels-old-thread-and-dispatches-new-attempt.md b/docs/tests/orch/reassign/reassign-cancels-old-thread-and-dispatches-new-attempt.md
new file mode 100644
index 0000000..673da7d
--- /dev/null
+++ b/docs/tests/orch/reassign/reassign-cancels-old-thread-and-dispatches-new-attempt.md
@@ -0,0 +1,39 @@
+# Case: `reassign-cancels-old-thread-and-dispatches-new-attempt`
+
+## Purpose
+
+Verify that `reassign` cancels the old blocked thread and creates a new attempt and thread for the new worker.
+
+## Preconditions
+
+- Run `run_blog_reassign_001` has been created
+- Task `T1` has been created
+- `T1` has completed its first `dispatch` in strict worktree mode
+- `worker-a` has claimed the first attempt's thread via `claim` and written a question via `inbox update --status blocked`
+- The most recent `reconcile` has run, moving the task to `blocked`
+- The old thread is known as `OLD_THREAD_ID`
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json reassign --run run_blog_reassign_001 --task T1 --to worker-b --reason "Try another worker with clearer ownership."
+inbox --db TMPDIR/coord.db --json show --thread OLD_THREAD_ID
+```
+
+## Expected Output
+
+- `reassign` exits with code `0`
+- `reassign.data.attempt.assigned_to == "worker-b"`
+- `reassign.data.attempt.attempt_no == 2`
+- `reassign.data.attempt.thread_id != OLD_THREAD_ID`
+- `show.data.thread.status == "cancelled"`, referring to the old thread
+
+## Asserted Conclusions
+
+- `reassign` does not simply modify the `assigned_to` field; it explicitly terminates the old attempt and derives a new one
+- Once the old thread is cancelled, the worker side will not keep executing on stale context
+
+## Additional Constraints
+
+- `reassign` accepts only `blocked` or `failed` tasks
+- `--to` is required; always filling in `--reason` is recommended to aid auditing and manual troubleshooting
diff --git a/docs/tests/orch/reconcile/README.md b/docs/tests/orch/reconcile/README.md
index 417a0a0..f0dd235 100644
--- a/docs/tests/orch/reconcile/README.md
+++ b/docs/tests/orch/reconcile/README.md
@@ -1,7 +1,8 @@
 # Orch `reconcile` Test Plan Index
 
-## Status
+## Case Files
 
-No command case files are authored yet.
-
-Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress.
+| Case Slug | File | Coverage Note |
+| --- | --- | --- |
+| `reconcile-maps-claimed-or-in-progress-thread-to-running` | [reconcile-maps-claimed-or-in-progress-thread-to-running.md](./reconcile-maps-claimed-or-in-progress-thread-to-running.md) | maps worker claim or in-progress inbox state back into a running orch task |
+| `reconcile-maps-done-or-failed-thread-to-terminal-task-state` | [reconcile-maps-done-or-failed-thread-to-terminal-task-state.md](./reconcile-maps-done-or-failed-thread-to-terminal-task-state.md) | maps terminal inbox states into terminal task states and updates run aggregates |
diff --git a/docs/tests/orch/reconcile/reconcile-maps-claimed-or-in-progress-thread-to-running.md b/docs/tests/orch/reconcile/reconcile-maps-claimed-or-in-progress-thread-to-running.md
new file mode 100644
index 0000000..a7fda85
--- /dev/null
+++ b/docs/tests/orch/reconcile/reconcile-maps-claimed-or-in-progress-thread-to-running.md
@@ -0,0 +1,34 @@
+# Case: `reconcile-maps-claimed-or-in-progress-thread-to-running`
+
+## Purpose
+
+Verify that `reconcile` syncs worker-side `claim` / `in_progress` progress back into `orch`, advancing the task to `running`.
+
+## Preconditions
+
+- Run `run_blog_001` already exists
+- Task `T1` has had an attempt and thread created via `dispatch`
+- The worker has completed `claim` on the thread and optionally appended an `in_progress` update
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json run init --run run_blog_001 --goal "Build blog MVP"
+orch --db TMPDIR/coord.db --json task add --run run_blog_001 --task T1 --title "Implement retry policy" --default-to worker-a
+orch --db TMPDIR/coord.db --json dispatch --run run_blog_001 --task T1 --body "Implement retry handling for the HTTP client."
+inbox --db TMPDIR/coord.db --json claim --agent worker-a --thread THREAD_ID
+inbox --db TMPDIR/coord.db --json update --agent worker-a --thread THREAD_ID --status in_progress --summary "Implementation started"
+orch --db TMPDIR/coord.db --json reconcile --run run_blog_001
+```
+
+## Expected Output
+
+- `reconcile` exits with code `0`
+- `data.updated_tasks` has length `1`
+- The only updated task has `status == "running"`
+- `data.run.run_id == "run_blog_001"`
+
+## Asserted Conclusions
+
+- `reconcile` is the key leader-side sync point that projects inbox execution state back into the scheduler state machine
+- Worker claim and in-progress signals do not stay confined to the inbox layer
diff --git a/docs/tests/orch/reconcile/reconcile-maps-done-or-failed-thread-to-terminal-task-state.md b/docs/tests/orch/reconcile/reconcile-maps-done-or-failed-thread-to-terminal-task-state.md
new file mode 100644
index 0000000..4130322
--- /dev/null
+++ b/docs/tests/orch/reconcile/reconcile-maps-done-or-failed-thread-to-terminal-task-state.md
@@ -0,0 +1,34 @@
+# Case: `reconcile-maps-done-or-failed-thread-to-terminal-task-state`
+
+## Purpose
+
+Verify that `reconcile` syncs the terminal state of the worker-side thread to the `orch` task and refreshes the run's aggregate status.
+
+## Preconditions
+
+- A run and a dispatched task already exist
+- The worker has completed `done` or `fail` on the thread
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json run init --run run_blog_001 --goal "Build blog MVP"
+orch --db TMPDIR/coord.db --json task add --run run_blog_001 --task T1 --title "Implement retry policy" --default-to worker-a
+orch --db TMPDIR/coord.db --json dispatch --run run_blog_001 --task T1 --body "Implement retry handling for the HTTP client."
+inbox --db TMPDIR/coord.db --json claim --agent worker-a --thread THREAD_ID
+inbox --db TMPDIR/coord.db --json done --agent worker-a --thread THREAD_ID --summary "Retry policy implemented" --body "The HTTP client now retries transient failures."
+orch --db TMPDIR/coord.db --json reconcile --run run_blog_001
+orch --db TMPDIR/coord.db --json status --run run_blog_001
+```
+
+## Expected Output
+
+- `reconcile` exits with code `0`
+- `data.updated_tasks` contains `T1`
+- `T1.status == "done"`; if the input is `fail`, it should be `failed`
+- The subsequent `status.data.run.status` is consistent with the aggregate of the terminal tasks
+
+## Asserted Conclusions
+
+- A task's terminal state reaches `orch` through `reconcile`, rather than by the worker rewriting the task table directly
+- The run-level aggregate status is refreshed together with the terminal tasks
diff --git a/docs/tests/orch/retry/README.md b/docs/tests/orch/retry/README.md
index de4a2c0..6b46360 100644
--- a/docs/tests/orch/retry/README.md
+++ b/docs/tests/orch/retry/README.md
@@ -1,7 +1,7 @@
 # Orch `retry` Test Plan Index
 
-## Status
+## Case Files
 
-No command case files are authored yet.
-
-Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress.
+| Case Slug | File | Coverage Note |
+| --- | --- | --- |
+| `retry-creates-new-attempt-for-failed-task` | [retry-creates-new-attempt-for-failed-task.md](./retry-creates-new-attempt-for-failed-task.md) | creates a successor attempt, thread, and worktree after a failed attempt |
diff --git a/docs/tests/orch/retry/retry-creates-new-attempt-for-failed-task.md b/docs/tests/orch/retry/retry-creates-new-attempt-for-failed-task.md
new file mode 100644
index 0000000..d7fab47
--- /dev/null
+++ b/docs/tests/orch/retry/retry-creates-new-attempt-for-failed-task.md
@@ -0,0 +1,40 @@
+# Case: `retry-creates-new-attempt-for-failed-task`
+
+## Purpose
+
+Verify that `retry` creates a new attempt record on a failed task rather than reusing the old thread or old worktree.
+
+## Preconditions
+
+- Run `run_blog_retry_001` has been created
+- Task `T1` has been created
+- `T1` has completed its first `dispatch` in strict worktree mode
+- `worker-a` has claimed the first attempt's thread via `claim` and moved it to `failed` via `inbox fail`
+- The most recent `reconcile` has run, syncing the task status to `failed`
+- The first attempt's thread is known as `OLD_THREAD_ID` and its worktree as `OLD_WORKTREE_PATH`
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json retry --run run_blog_retry_001 --task T1 --body "Retry after fixing the failure."
+```
+
+## Expected Output
+
+- The exit code is `0`
+- `retry.data.task.status == "dispatched"`
+- `retry.data.attempt.attempt_no == 2`
+- `retry.data.attempt.thread_id != OLD_THREAD_ID`
+- `retry.data.attempt.worktree_path != OLD_WORKTREE_PATH`
+- The new worktree path exists on the filesystem
+- `retry.data.previous_attempt.attempt_no == 1`
+
+## Asserted Conclusions
+
+- `retry` generates a new execution attempt for the failed task rather than reopening the old attempt
+- For code tasks, a retry allocates a new worktree so that the old failed environment does not contaminate the next execution
+
+## Additional Constraints
+
+- `--to` is optional; when it is not passed explicitly, the existing assignment of the current task/attempt is reused by default
+- `retry` supports `--body-file` and honors its mutual-exclusion rule with `--body`
diff --git a/docs/tests/orch/run-init/README.md b/docs/tests/orch/run-init/README.md
index a87ba3e..c16eaed 100644
--- a/docs/tests/orch/run-init/README.md
+++ b/docs/tests/orch/run-init/README.md
@@ -1,7 +1,7 @@
 # Orch `run init` Test Plan Index
 
-## Status
+## Case Files
 
-No command case files are authored yet.
-
-Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress.
+| Case Slug | File | Coverage Note |
+| --- | --- | --- |
+| `run-init-creates-new-run` | [run-init-creates-new-run.md](./run-init-creates-new-run.md) | creates a new active run and persists goal and summary |
diff --git a/docs/tests/orch/run-init/run-init-creates-new-run.md b/docs/tests/orch/run-init/run-init-creates-new-run.md
new file mode 100644
index 0000000..f63a64a
--- /dev/null
+++ b/docs/tests/orch/run-init/run-init-creates-new-run.md
@@ -0,0 +1,34 @@
+# Case: `run-init-creates-new-run`
+
+## Purpose
+
+Verify that `run init` creates a new orchestration run and returns stable run metadata.
+
+## Preconditions
+
+- `TMPDIR/coord.db` does not exist yet or is an empty path
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json run init --run run_blog_001 --goal "Build blog MVP" --summary "Public blog plus admin CRUD"
+```
+
+## Expected Output
+
+- The command exits with code `0`
+- `data.run.run_id == "run_blog_001"`
+- `data.run.goal == "Build blog MVP"`
+- `data.run.summary == "Public blog plus admin CRUD"`
+- `data.run.status == "active"`
+- `created_at` and `updated_at` are returned
+
+## Asserted Conclusions
+
+- `run init` creates a run record rather than only performing in-memory initialization
+- A newly created run defaults to status `active`
+
+## Additional Constraints
+
+- `--run` and `--goal` are required; omitting either should return `invalid_input`
+- When the same `run_id` already exists, it should return `invalid_state` rather than overwrite the old run
diff --git a/docs/tests/orch/run-show/README.md b/docs/tests/orch/run-show/README.md
index dbb785f..ebb01c0 100644
--- a/docs/tests/orch/run-show/README.md
+++ b/docs/tests/orch/run-show/README.md
@@ -1,7 +1,7 @@
 # Orch `run show` Test Plan Index
 
-## Status
+## Case Files
 
-No command case files are authored yet.
-
-Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress.
+| Case Slug | File | Coverage Note |
+| --- | --- | --- |
+| `run-show-returns-run-summary-and-task-counts` | [run-show-returns-run-summary-and-task-counts.md](./run-show-returns-run-summary-and-task-counts.md) | returns aggregate run metadata and task-count summary without listing task rows |
diff --git a/docs/tests/orch/run-show/run-show-returns-run-summary-and-task-counts.md b/docs/tests/orch/run-show/run-show-returns-run-summary-and-task-counts.md
new file mode 100644
index 0000000..a7fa78b
--- /dev/null
+++ b/docs/tests/orch/run-show/run-show-returns-run-summary-and-task-counts.md
@@ -0,0 +1,35 @@
+# Case: `run-show-returns-run-summary-and-task-counts`
+
+## Purpose
+
+Verify that `run show` returns run metadata and aggregate task counts, making it suitable as a lightweight overview command on the leader side.
+
+## Preconditions
+
+- `run_blog_001` already exists
+- The run has at least one task, so that `task_counts` is non-empty
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json run init --run run_blog_001 --goal "Build blog MVP" --summary "Public blog plus admin CRUD"
+orch --db TMPDIR/coord.db --json task add --run run_blog_001 --task T1 --title "Implement retry policy" --summary "Add retry policy to HTTP client" --default-to worker-a
+orch --db TMPDIR/coord.db --json run show --run run_blog_001
+```
+
+## Expected Output
+
+- `run show` exits with code `0`
+- `data.run.run_id == "run_blog_001"`
+- `data.run.status == "active"`
+- `data.task_counts.ready >= 1`
+- The return value does not contain a `tasks` array
+
+## Asserted Conclusions
+
+- `run show` provides an aggregate view, not the full task detail
+- Run-level status and task counts can be read without calling `status`
+
+## Additional Constraints
+
+- When `--run` points to a nonexistent run, it should return `not_found`
diff --git a/docs/tests/orch/status/README.md b/docs/tests/orch/status/README.md
index e961fcc..4965a2a 100644
--- a/docs/tests/orch/status/README.md
+++ b/docs/tests/orch/status/README.md
@@ -1,7 +1,7 @@
 # Orch `status` Test Plan Index
 
-## Status
+## Case Files
 
-No command case files are authored yet.
-
-Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress.
+| Case Slug | File | Coverage Note |
+| --- | --- | --- |
+| `status-returns-run-summary-and-task-list` | [status-returns-run-summary-and-task-list.md](./status-returns-run-summary-and-task-list.md) | returns aggregate run status plus the per-task status list |
diff --git a/docs/tests/orch/status/status-returns-run-summary-and-task-list.md b/docs/tests/orch/status/status-returns-run-summary-and-task-list.md
new file mode 100644
index 0000000..72e8586
--- /dev/null
+++ b/docs/tests/orch/status/status-returns-run-summary-and-task-list.md
@@ -0,0 +1,37 @@
+# Case: `status-returns-run-summary-and-task-list`
+
+## Purpose
+
+Verify that `status` returns the aggregate run view plus the detailed task list, making it the leader-side entry point for a full status check.
+
+## Preconditions
+
+- Run `run_blog_001` already exists
+- Task `T1` has gone through the full dispatch -> worker done -> reconcile flow
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json run init --run run_blog_001 --goal "Build blog MVP"
+orch --db TMPDIR/coord.db --json task add --run run_blog_001 --task T1 --title "Implement retry policy" --default-to worker-a
+orch --db TMPDIR/coord.db --json dispatch --run run_blog_001 --task T1 --body "Implement retry handling for the HTTP client."
+inbox --db TMPDIR/coord.db --json claim --agent worker-a --thread THREAD_ID
+inbox --db TMPDIR/coord.db --json done --agent worker-a --thread THREAD_ID --summary "Retry policy implemented" --body "The HTTP client now retries transient failures."
+orch --db TMPDIR/coord.db --json reconcile --run run_blog_001
+orch --db TMPDIR/coord.db --json status --run run_blog_001
+```
+
+## Expected Output
+
+- `status` exits with code `0`
+- `data.run.run_id == "run_blog_001"`
+- `data.run.status == "done"`
+- `data.task_counts` is returned
+- A `data.tasks` array is returned
+- `data.tasks[0].task_id == "T1"`
+- `data.tasks[0].status == "done"`
+
+## Asserted Conclusions
+
+- `status` is more complete than `run show` and is suited to run-level close-out checks
+- The task list and the run aggregate status should stay consistent; a result where the run is done while a task still shows a stale status should not occur
diff --git a/docs/tests/orch/task-add/README.md b/docs/tests/orch/task-add/README.md
index 30a9eba..cf89533 100644
--- a/docs/tests/orch/task-add/README.md
+++ b/docs/tests/orch/task-add/README.md
@@ -1,7 +1,9 @@
 # Orch `task add` Test Plan Index
 
-## Status
+## Case Files
 
-No command case files are authored yet.
-
-Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress.
+| Case Slug | File | Coverage Note |
+| --- | --- | --- |
+| `task-add-creates-ready-root-task` | [task-add-creates-ready-root-task.md](./task-add-creates-ready-root-task.md) | creates a dependency-free task that becomes ready immediately |
+| `task-add-rejects-invalid-acceptance-json` | [task-add-rejects-invalid-acceptance-json.md](./task-add-rejects-invalid-acceptance-json.md) | rejects malformed `--acceptance-json` with `invalid_input` |
+| `task-add-rejects-invalid-priority` | [task-add-rejects-invalid-priority.md](./task-add-rejects-invalid-priority.md) | rejects priorities other than `low`, `normal`, or `high` |
diff --git a/docs/tests/orch/task-add/task-add-creates-ready-root-task.md b/docs/tests/orch/task-add/task-add-creates-ready-root-task.md
new file mode 100644
index 0000000..c6d26a7
--- /dev/null
+++ b/docs/tests/orch/task-add/task-add-creates-ready-root-task.md
@@ -0,0 +1,36 @@
+# Case: `task-add-creates-ready-root-task`
+
+## Purpose
+
+Verify that when `task add` creates a record for a dependency-free task, it advances the task to `ready` within the same transaction.
+
+## Preconditions
+
+- Run `run_blog_001` already exists
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json run init --run run_blog_001 --goal "Build blog MVP"
+orch --db TMPDIR/coord.db --json task add --run run_blog_001 --task T1 --title "Implement retry policy" --summary "Add retry policy to HTTP client" --default-to worker-a
+```
+
+## Expected Output
+
+- `task add` exits with code `0`
+- `data.task.task_id == "T1"`
+- `data.task.title == "Implement retry policy"`
+- `data.task.status == "ready"`
+- `data.task.default_to == "worker-a"`
+- `data.task.priority == "normal"`
+
+## Asserted Conclusions
+
+- `task add` does not merely insert a `planned` task; a dependency-free task is immediately refreshed to `ready`
+- The default priority reliably falls back to `normal`
+
+## Additional Constraints
+
+- `--run`, `--task`, and `--title` are required
+- When `--acceptance-json` is not passed explicitly, it falls back to a valid JSON default rather than an empty string
+- A duplicate `task_id` within the same run should return `invalid_state`
diff --git a/docs/tests/orch/task-add/task-add-rejects-invalid-acceptance-json.md b/docs/tests/orch/task-add/task-add-rejects-invalid-acceptance-json.md
new file mode 100644
index 0000000..22a4758
--- /dev/null
+++ b/docs/tests/orch/task-add/task-add-rejects-invalid-acceptance-json.md
@@ -0,0 +1,27 @@
+# Case: `task-add-rejects-invalid-acceptance-json`
+
+## Purpose
+
+Verify that `task add` rejects a malformed `--acceptance-json` and returns the stable `invalid_input` error contract.
+
+## Preconditions
+
+- Run `run_blog_003` already exists
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json run init --run run_blog_003 --goal "Validate task add input guards"
+orch --db TMPDIR/coord.db --json task add --run run_blog_003 --task T1 --title "Implement retry policy" --acceptance-json '{"done":true'
+```
+
+## Expected Output
+
+- `task add` exits with code `30`
+- The JSON error code is `invalid_input`
+- The error message states that `acceptance-json` must be valid JSON
+
+## Asserted Conclusions
+
+- `task add` does not silently write malformed acceptance criteria into the database
+- Validation of `--acceptance-json` is part of the stable CLI input contract, not an incidental storage-layer failure
diff --git a/docs/tests/orch/task-add/task-add-rejects-invalid-priority.md b/docs/tests/orch/task-add/task-add-rejects-invalid-priority.md
new file mode 100644
index 0000000..f817d88
--- /dev/null
+++ b/docs/tests/orch/task-add/task-add-rejects-invalid-priority.md
@@ -0,0 +1,27 @@
+# Case: `task-add-rejects-invalid-priority`
+
+## Purpose
+
+Verify that `task add` accepts only the three priority values `low|normal|high` and returns a stable error for any other input.
+
+## Preconditions
+
+- Run `run_blog_004` already exists
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json run init --run run_blog_004 --goal "Validate task priority input"
+orch --db TMPDIR/coord.db --json task add --run run_blog_004 --task T1 --title "Implement retry policy" --priority urgent
+```
+
+## Expected Output
+
+- `task add` exits with code `30`
+- The JSON error code is `invalid_input`
+- The error message states that `priority` must be `low`, `normal`, or `high`
+
+## Asserted Conclusions
+
+- The priority enum of `task add` is an explicit and stable CLI contract
+- An invalid priority is rejected before the task is written, rather than falling back to the default or being silently accepted
diff --git a/docs/tests/orch/wait/README.md b/docs/tests/orch/wait/README.md
index 65b44f2..435bb4f 100644
--- a/docs/tests/orch/wait/README.md
+++ b/docs/tests/orch/wait/README.md
@@ -1,7 +1,8 @@
 # Orch `wait` Test Plan Index
 
-## Status
+## Case Files
 
-No command case files are authored yet.
-
-Use [../ROADMAP.md](../ROADMAP.md) for planned case slugs and document progress.
+| Case Slug | File | Coverage Note |
+| --- | --- | --- |
+| `wait-wakes-on-matching-run-event` | [wait-wakes-on-matching-run-event.md](./wait-wakes-on-matching-run-event.md) | wakes on a later matching task event and returns that event payload |
+| `wait-times-out-without-matching-event` | [wait-times-out-without-matching-event.md](./wait-times-out-without-matching-event.md) | returns a stable timeout payload when no later matching event appears |
diff --git a/docs/tests/orch/wait/wait-times-out-without-matching-event.md b/docs/tests/orch/wait/wait-times-out-without-matching-event.md
new file mode 100644
index 0000000..b02ddcb
--- /dev/null
+++ b/docs/tests/orch/wait/wait-times-out-without-matching-event.md
@@ -0,0 +1,33 @@
+# Case: `wait-times-out-without-matching-event`
+
+## Purpose
+
+Verify that `wait` returns a stable timeout result when no later matching event arrives, rather than treating the timeout as a command failure.
+
+## Preconditions
+
+- An empty database has been initialized
+- Run `run_blog_wait_002` has been created
+- No later event that would produce `task_done` currently exists
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json wait --run run_blog_wait_002 --for task_done --after-event 0 --timeout-seconds 1
+```
+
+## Expected Output
+
+- The exit code is `0`
+- `wait.data.woke == false`
+- `wait.data.next_event_id == 0`
+- `wait.data.events` is empty or omitted
+
+## Asserted Conclusions
+
+- A `wait` timeout is a normal, consumable result, not an error state
+- Based on `woke=false`, the leader can decide to keep polling, switch the filter, or exit the current control loop
+
+## Additional Constraints
+
+- This case focuses on the timeout contract and does not require any task to exist in the system
diff --git a/docs/tests/orch/wait/wait-wakes-on-matching-run-event.md b/docs/tests/orch/wait/wait-wakes-on-matching-run-event.md
new file mode 100644
index 0000000..f188696
--- /dev/null
+++ b/docs/tests/orch/wait/wait-wakes-on-matching-run-event.md
@@ -0,0 +1,40 @@
+# Case: `wait-wakes-on-matching-run-event`
+
+## Purpose
+
+Verify that `wait` is woken when a later matching event appears and returns a stable event payload.
+
+## Preconditions
+
+- An empty database has been initialized
+- Run `run_blog_wait_001` has been created
+- Task `T1` has been added and dispatched once
+- The current attempt's thread is known as `THREAD_ID`
+- `wait` is started before the worker writes the blocked status
+
+## Input
+
+```bash
+orch --db TMPDIR/coord.db --json wait --run run_blog_wait_001 --for task_blocked --after-event 0 --timeout-seconds 2
+inbox --db TMPDIR/coord.db --json claim --agent worker-a --thread THREAD_ID
+inbox --db TMPDIR/coord.db --json update --agent worker-a --thread THREAD_ID --status blocked --summary "Need logging decision" --payload-json '{"question":"stdout or stderr?"}'
+```
+
+## Expected Output
+
+- `wait` exits with code `0`
+- `wait.data.woke == true`
+- `wait.data.events` has length `1`
+- The only event has `type == "task_blocked"`
+- The event has `summary == "Need logging decision"`
+- The event has `payload.question == "stdout or stderr?"`
+
+## Asserted Conclusions
+
+- `wait` is not a simple sleep; it is a blocking read interface over the run event stream
+- The `task_blocked` event exposes the worker's question summary and structured payload to the leader
+
+## Additional Constraints
+
+- `--for` supports a comma-separated list of event types; this case verifies single-event filtering
+- On success, `wait` also returns `next_event_id` to support subsequent incremental waits
diff --git a/docs/tests/orch/workflows/README.md b/docs/tests/orch/workflows/README.md
index d8b59c7..c829a74 100644
--- a/docs/tests/orch/workflows/README.md
+++ b/docs/tests/orch/workflows/README.md
@@ -10,13 +10,166 @@ All examples assume:
 - `orch --db TMPDIR/coord.db --json` is used consistently
 - assertions follow the shared rules in [../_shared/README.md](../_shared/README.md)
 
-## Current Status
+## case: run-dispatch-reconcile-status-happy-path
 
-No workflow case documents are authored yet.
+### Purpose
 
-Planned first workflow cases live in [../ROADMAP.md](../ROADMAP.md), starting with:
+Verify that the main `orch` leader flow works end to end: create a run, add a task, check ready, dispatch, advance worker state through `inbox`, reconcile, and then observe the final completed state with `status`.
 
-- `run-dispatch-reconcile-status-happy-path`
-- `dependency-blocked-answer-resume-flow`
-- `strict-worktree-dispatch-to-cleanup`
-- `council-review-end-to-end`
+### Preconditions
+
+- Empty database path `TMPDIR/coord.db`
+- The executor is `worker-a`
+
+### Input
+
+```bash
+orch --db TMPDIR/coord.db --json run init --run run_blog_001 --goal "Build blog MVP" --summary "Public blog plus admin CRUD"
+orch --db TMPDIR/coord.db --json task add --run run_blog_001 --task T1 --title "Implement retry policy" --summary "Add retry policy to HTTP client" --default-to worker-a
+orch --db TMPDIR/coord.db --json ready --run run_blog_001
+orch --db TMPDIR/coord.db --json dispatch --run run_blog_001 --task T1 --body "Implement retry handling for the HTTP client."
+inbox --db TMPDIR/coord.db --json claim --agent worker-a --thread THREAD_ID
+inbox --db TMPDIR/coord.db --json update --agent worker-a --thread THREAD_ID --status in_progress --summary "Implementation started"
+orch --db TMPDIR/coord.db --json reconcile --run run_blog_001
+inbox --db TMPDIR/coord.db --json done --agent worker-a --thread THREAD_ID --summary "Retry policy implemented" --body "The HTTP client now retries transient failures."
+orch --db TMPDIR/coord.db --json reconcile --run run_blog_001
+orch --db TMPDIR/coord.db --json status --run run_blog_001
+```
+
+### Expected Output
+
+- `run init` successfully creates `run_blog_001`
+- The new task returned by `task add` has initial status `ready`
+- `ready` returns only `T1`
+- `dispatch` creates an attempt and an inbox thread and advances the task to `dispatched`
+- After the first `reconcile`, the task status becomes `running`
+- After the second `reconcile`, the task status becomes `done`
+- `status` returns `run.status == "done"`
+
+### Asserted Conclusions
+
+- The main `orch` happy path is not single-command behavior but a closed loop completed jointly by `orch` and `inbox`
+- `reconcile` is the key step that folds worker-side thread state back into leader-side task state
+
+## case: dependency-blocked-answer-resume-flow
+
+### Purpose
+
+Verify the full interaction chain: dependency gating, the blocked list, `answer` feedback, and the eventual recovery to the completed state.
+
+### Preconditions
+
+- Empty database path `TMPDIR/coord.db`
+- `worker-a` is responsible for `T1`
+- `worker-b` is responsible for `T2`
+
+### Input
+
+```bash
+orch --db TMPDIR/coord.db --json run init --run run_blog_002 --goal "Build dependency-aware workflow"
+orch --db TMPDIR/coord.db --json task add --run run_blog_002 --task T1 --title "Build backend" --summary "Implement backend APIs" --default-to worker-a
+orch --db TMPDIR/coord.db --json task add --run run_blog_002 --task T2 --title "Build frontend" --summary "Implement frontend flows" --default-to worker-b
+orch --db TMPDIR/coord.db --json dep add --run run_blog_002 --task T2 --depends-on T1
+orch --db TMPDIR/coord.db --json ready --run run_blog_002
+orch --db TMPDIR/coord.db --json dispatch --run run_blog_002 --task T1
+inbox --db TMPDIR/coord.db --json claim --agent worker-a --thread THREAD_BACKEND
+inbox --db TMPDIR/coord.db --json done --agent worker-a --thread THREAD_BACKEND --summary "Backend complete"
+orch --db TMPDIR/coord.db --json reconcile --run run_blog_002
+orch --db TMPDIR/coord.db --json ready --run run_blog_002
+orch --db TMPDIR/coord.db --json dispatch --run run_blog_002 --task T2
+inbox --db TMPDIR/coord.db --json claim --agent worker-b --thread THREAD_FRONTEND
+inbox --db TMPDIR/coord.db --json update --agent worker-b --thread THREAD_FRONTEND --status blocked --summary "Need logging decision" --payload-json '{"question":"stdout or stderr?"}'
+orch --db TMPDIR/coord.db --json reconcile --run run_blog_002
+orch --db TMPDIR/coord.db --json blocked --run run_blog_002
+orch --db TMPDIR/coord.db --json answer --run run_blog_002 --task T2 --body "Use stdout for MVP."
+inbox --db TMPDIR/coord.db --json update --agent worker-b --thread THREAD_FRONTEND --status in_progress --summary "Decision applied"
+inbox --db TMPDIR/coord.db --json done --agent worker-b --thread THREAD_FRONTEND --summary "Frontend complete"
+orch --db TMPDIR/coord.db --json reconcile --run run_blog_002
+orch --db TMPDIR/coord.db --json status --run run_blog_002
+```
+
+### Expected Output
+
+- The initial `ready` contains only `T1`
+- `T2` appears in `ready` only after `T1` is done and `reconcile` has run
+- `blocked` returns `T2` with the latest question
+- `answer` appends a `kind=answer` message to the active thread
+- In the final `status`, the run reaches `done`
+
+### Asserted Conclusions
+
+- Dependency gating and the blocked-answer mechanism can be chained sequentially within the same run
+- `answer` does not change the task status directly; actual recovery still depends on the worker continuing to advance the thread and on `reconcile` picking that up
+
+## case: strict-worktree-dispatch-to-cleanup
+
+### Purpose
+
+Verify that the strict worktree path for a code task runs all the way from dispatch to cleanup, ensuring the isolated workspace is both created and removed after completion.
+
+### Preconditions
+
+- `TMPDIR/repo` is a Git repository with initial content committed
+- `worker-a` is responsible for the code task
+
+### Input
+
+```bash
+orch --db TMPDIR/coord.db --json run init --run run_blog_worktree_001 --goal "Validate strict worktree dispatch"
+orch --db TMPDIR/coord.db --json task add --run run_blog_worktree_001 --task T1 --title "Implement backend" --default-to worker-a
+orch --db TMPDIR/coord.db --json dispatch --run run_blog_worktree_001 --task T1 --repo-path TMPDIR/repo --workspace-root .orch/worktrees --strict-worktree --body "Implement inside isolated worktree."
+inbox --db TMPDIR/coord.db --json claim --agent worker-a --thread THREAD_ID
+inbox --db TMPDIR/coord.db --json done --agent worker-a --thread THREAD_ID --summary "Backend complete"
+orch --db TMPDIR/coord.db --json reconcile --run run_blog_worktree_001
+orch --db TMPDIR/coord.db --json cleanup --run run_blog_worktree_001 --task T1 --attempt 1
+```
+
+### Expected output
+
+- `dispatch` returns non-empty `attempt.base_ref`, `attempt.base_commit`, `attempt.branch_name`, and `attempt.worktree_path`
+- `attempt.workspace_status == "created"`
+- `cleanup` returns the cleaned-up attempt record
+- After cleanup, `worktree_path` no longer exists on the file system
+
+### Asserted conclusions
+
+- Strict worktree is not a one-off dispatch detail but part of the full attempt lifecycle
+- `cleanup` targets completed or abandoned workspaces and must not delete execution directories that are still active
+
+## case: council-review-end-to-end
+
+### Purpose
+
+Verify that the high-level `orch council` workflow runs from reviewer dispatch through to the final report, and that the grouped recommendations stay consistent with the final output.
+
+### Preconditions
+
+- Empty database path `TMPDIR/coord.db`
+- The three fixed reviewers are `architecture-reviewer`, `implementation-reviewer`, and `risk-reviewer`
+
+### Input
+
+```bash
+orch --db TMPDIR/coord.db --json council start --run council_blog_001 --target "Review the current blog architecture."
+inbox --db TMPDIR/coord.db --json claim --agent architecture-reviewer --thread THREAD_CR1
+inbox --db TMPDIR/coord.db --json done --agent architecture-reviewer --thread THREAD_CR1 --summary "Review complete" --body '{"reviewer_role":"architecture-reviewer","findings":[...]}'
+inbox --db TMPDIR/coord.db --json claim --agent implementation-reviewer --thread THREAD_CR2
+inbox --db TMPDIR/coord.db --json done --agent implementation-reviewer --thread THREAD_CR2 --summary "Review complete" --body '{"reviewer_role":"implementation-reviewer","findings":[...]}'
+inbox --db TMPDIR/coord.db --json claim --agent risk-reviewer --thread THREAD_CR3
+inbox --db TMPDIR/coord.db --json done --agent risk-reviewer --thread THREAD_CR3 --summary "Review complete" --body '{"reviewer_role":"risk-reviewer","findings":[...]}'
+orch --db TMPDIR/coord.db --json council wait --run council_blog_001 --timeout-seconds 2
+orch --db TMPDIR/coord.db --json council tally --run council_blog_001 --similarity normal
+orch --db TMPDIR/coord.db --json council report --run council_blog_001
+```
+
+### Expected output
+
+- `council start` creates 3 reviewer tasks and dispatches them
+- `council wait` returns `all_complete == true` once all 3 reviewers have finished
+- `council tally` returns grouped recommendations, bucketed as `consensus|majority|minority`
+- `council report` returns the default `show == ["consensus","majority"]` and produces a markdown artifact
+
+### Asserted conclusions
+
+- The council workflow is a high-level flow built on top of the `orch` scheduling plane, not standalone infrastructure
+- The final report depends on persisted grouped recommendations, so the contract between `tally` and `report` must stay continuous
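
Reviewer note: the workflow cases in this patch all lean on `reconcile` folding a worker-side thread status into a leader-side task state. The sketch below restates that expectation as a small shell function, purely as a reading aid. It is not taken from `orch`; the function name `thread_to_task_state` is hypothetical, and the fallback branch (an unclaimed thread leaves the task at `dispatched`) is an assumption inferred from the happy-path case rather than documented behavior.

```bash
#!/usr/bin/env bash
# Hypothetical helper, NOT part of orch: maps an inbox thread status to the
# task state that the cases in this patch expect to see after `orch reconcile`.
thread_to_task_state() {
  case "$1" in
    claimed|in_progress) echo "running" ;;  # worker has picked up the thread
    blocked)             echo "blocked" ;;  # worker asked a question
    "done")              echo "done" ;;     # quoted because `done` is a shell keyword
    failed)              echo "failed" ;;   # terminal failure
    *)                   echo "dispatched" ;;  # assumption: nothing to fold yet
  esac
}
```

For example, `thread_to_task_state in_progress` prints `running`, which matches the expected state after the first `reconcile` in the happy-path case.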