diff --git a/api/events.md b/api/events.md index 7e7a516..84345cd 100644 --- a/api/events.md +++ b/api/events.md @@ -47,7 +47,9 @@ Each SSE message should carry one JSON object with this shape: - `task_dispatched` - `task_running` - `task_blocked` +- `task_verifying` - `task_answered` +- `task_verification_recorded` - `task_done` - `task_failed` - `thread_claim` diff --git a/docs/architecture.md b/docs/architecture.md index 2e827da..5b96843 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -8,6 +8,7 @@ The design target is a local, file-portable agent coordination stack: - `inbox`: durable communication bus - `orch`: task graph and scheduling control plane +- spec-aware tasks and verification gates owned by `orch` - worktree-backed task execution for code-writing workers - optional user-facing council review workflow on top of `orch` - shared SQLite database file @@ -25,7 +26,7 @@ If `inbox` is reduced to pure chat storage, the scheduler must reconstruct state ## Role Model - `user`: talks only to the leader -- `leader`: owns the overall goal, task graph, acceptance criteria, and final integration +- `leader`: owns the overall goal, task graph, task specs, acceptance criteria, verification policy, and final integration - `worker`: executes one assigned task at a time and reports through `inbox` - `inbox`: durable thread/message/lease/artifact store - `orch`: run/task/dependency/dispatch state machine built on top of `inbox` @@ -45,7 +46,7 @@ If `inbox` is reduced to pure chat storage, the scheduler must reconstruct state Both CLIs should point at the same SQLite file. - `inbox` owns communication tables such as threads, messages, leases, and artifacts. -- `orch` owns scheduling tables such as runs, tasks, dependencies, and attempts. +- `orch` owns scheduling tables such as runs, tasks, dependencies, attempts, task specs, and verification check results. - both layers append to a shared event stream for blocking waits - `orch dispatch` creates or updates `inbox` threads. - `orch reconcile` reads `inbox` state and updates task state. @@ -135,9 +136,11 @@ If packaging later favors a single binary, the same model can be exposed as comm - runs - task graph and dependencies +- task spec snapshots and task-level policy metadata - ready queue calculation - dispatch decisions - task-attempt worktree allocation +- verification gates and check-result aggregation - blocked queue review for the leader - retries, reassignment, and cancellation - mapping task attempts to inbox threads diff --git a/docs/orch-cli.md b/docs/orch-cli.md index d96547c..9e82279 100644 --- a/docs/orch-cli.md +++ b/docs/orch-cli.md @@ -2,7 +2,7 @@ ## Purpose -`orch` is the leader-facing scheduler and control plane. It owns the run, task graph, dependencies, ready queue, dispatch decisions, retries, and reassignment logic. +`orch` is the leader-facing scheduler and control plane. It owns the run, task graph, task specs, dependencies, verification gates, ready queue, dispatch decisions, retries, and reassignment logic. `orch` does not replace `inbox`. It uses `inbox` as the durable transport and execution record. @@ -20,10 +20,12 @@ In normal operation: - creating a run for one user request or project - defining tasks and dependencies +- snapshotting task specs and per-task verification policy - calculating which tasks are ready - dispatching ready tasks to workers - tracking attempts and mapping them to inbox threads - allocating attempt worktrees for code tasks +- aggregating post-implementation check results into a task verification gate - surfacing blocked tasks to the leader - sending answers back into the active inbox thread - reconciling thread state into task state @@ -71,6 +73,7 @@ See [worktree-execution.md](/home/kurihada/project/ai-workflow-skill/docs/worktr - `dispatched`: an inbox thread exists but the worker has not started yet - `running`: the task has been claimed and is actively executing - `blocked`: the active attempt needs clarification or an external dependency +- `verifying`: the worker reported completion, but required verification checks have not all passed yet - `done`: task completed and passed its current acceptance gate - `failed`: task completed unsuccessfully - `cancelled`: task was cancelled and should not continue @@ -82,7 +85,10 @@ Suggested transitions: - `dispatched -> running` - `running -> blocked` - `blocked -> running` -- `running -> done` +- `running -> verifying` when the worker reports `done` and the task has required checks +- `running -> done` when the worker reports `done` and the task has no required checks +- `verifying -> done` +- `verifying -> failed` - `running -> failed` - `failed -> ready` through explicit retry - `* -> cancelled` by leader action @@ -97,12 +103,13 @@ The normal leader loop is: 4. inspect `ready` 5. `dispatch` tasks 6. arrange or launch a separate worker runtime that consumes the assigned inbox threads -7. use `status` for the current operational view; it reconciles first and includes latest attempt and message context -8. inspect `blocked` -9. answer blocked questions -10. if nothing is actionable, call `wait` -11. retry or reassign failures when needed -12. finish when all required tasks are `done` +7. use `status` for the current operational view; it reconciles first and includes latest attempt, message, and gate context +8. if a task enters `verifying`, record check results with `verify record` and inspect gate state with `verify status` +9. inspect `blocked` +10. answer blocked questions +11. if nothing is actionable, call `wait` +12. retry or reassign failures when needed +13. finish when all required tasks are `done` The leader should block on `orch wait`, not on ad hoc `sleep`. @@ -162,6 +169,13 @@ Suggested flags: - `--default-to AGENT` - `--acceptance-json STRING` - `--priority low|normal|high` +- `--spec-file PATH` +- `--spec-sha SHA256` +- `--check-profile NAME` +- `--required-check NAME` repeatable +- `--allowed-path PATH` repeatable +- `--blocked-path PATH` repeatable +- `--metadata-json STRING` ### `orch dep add` @@ -207,7 +221,7 @@ Behavior: - in `code` mode, resolves a committed base revision - in `code` mode, creates a branch and worktree for the attempt - creates or links an `inbox` thread -- writes `execution_mode` into the inbox task payload and writes workspace metadata for code tasks into attempt storage and task payload +- writes `execution_mode` into the inbox task payload, includes the task spec snapshot and verification policy in the dispatch payload, and writes workspace metadata for code tasks into attempt storage and task payload - moves the task to `dispatched` - does not start a worker runtime on its own @@ -235,7 +249,8 @@ Behavior: - maps inbox `claimed` or `in_progress` to `running` - maps inbox `blocked` to `blocked` -- maps inbox `done` to `done` +- maps inbox `done` to `verifying` when the task has required checks +- maps inbox `done` to `done` when the task has no required checks - maps inbox `failed` to `failed` ### `orch blocked` @@ -246,6 +261,47 @@ Suggested flags: - `--run RUN_ID` +### `orch verify record` + +Record or update one verification check result for the latest task attempt. + +Suggested flags: + +- `--run RUN_ID` +- `--task TASK_ID` +- `--attempt N` optional; defaults to latest +- `--check NAME` +- `--status passed|failed|skipped` +- `--summary TEXT` +- `--body TEXT` +- `--body-file PATH` +- `--metadata-json STRING` +- `--recorded-by NAME` + +Behavior: + +- upserts one named check result for the selected attempt +- emits a verification-recorded event +- recomputes the gate for the task +- keeps the task in `verifying` while required checks are still pending +- moves the task to `done` when all required checks pass +- moves the task to `failed` when one or more required checks fail + +### `orch verify status` + +Show the current verification state for one task. + +Suggested flags: + +- `--run RUN_ID` +- `--task TASK_ID` +- `--attempt N` optional; defaults to latest + +Behavior: + +- returns the task, selected attempt, task spec snapshot, and current gate state +- helps the leader inspect which required checks are still pending or failed + ### `orch wait` Block until one or more run-scoped events become available. diff --git a/docs/tests/orch/README.md b/docs/tests/orch/README.md index 67c538f..c073a4c 100644 --- a/docs/tests/orch/README.md +++ b/docs/tests/orch/README.md @@ -51,6 +51,7 @@ Unless a case says otherwise: - `_shared/README.md`: reusable fixtures, JSON assertions, exit-code rules, and worktree conventions - `workflows/README.md`: cross-command end-to-end scenarios - per-command folders: one leaf-command directory per implemented `orch` command surface +- `verify/`: verification-gate command cases ## Glossary @@ -60,6 +61,8 @@ Unless a case says otherwise: - `attempt`: one execution try for a task - `dispatch`: the act of materializing a task into an inbox thread - `workspace`: the branch and worktree assigned to a code-writing attempt +- `verification gate`: the check aggregation state between worker `done` and final task completion +- `verifying`: the task state used while required checks are still pending or being recorded - `blocked task`: a task whose active attempt requires clarification or another external decision - `council review`: a higher-level workflow built on top of `orch` that dispatches fixed reviewer roles and tallies recommendations diff --git a/docs/tests/orch/ROADMAP.md b/docs/tests/orch/ROADMAP.md index d16258d..6907ce5 100644 --- a/docs/tests/orch/ROADMAP.md +++ b/docs/tests/orch/ROADMAP.md @@ -18,13 +18,13 @@ It is not a replacement for automated Go tests. Snapshot date: -- `2026-03-20` +- `2026-03-23` Current state: -- `orch` CLI is implemented for the current scheduler, explicit execution-mode dispatch, wait, and council review surfaces -- automated Go tests now cover every currently documented `orch` command case and workflow case, combining the original integration suite with focused contract tests for run/task/ready/dispatch/blocked/answer/cleanup/status/reconcile/workflow/council-report edges -- `status` coverage now also documents the richer leader view: auto-reconcile plus latest attempt, latest message, and blocked-question context +- `orch` CLI now covers scheduler control, explicit execution-mode dispatch, verification gates, wait, and council review surfaces +- automated Go tests now cover every currently documented `orch` command case and workflow case, combining the original integration suite with focused contract tests for run/task/ready/dispatch/verify/blocked/answer/cleanup/status/reconcile/workflow/council-report edges +- `status` coverage now also documents the richer leader view: auto-reconcile plus latest attempt, latest message, blocked-question context, and task gate context - this roadmap now exists under `docs/tests/orch/ROADMAP.md` - all planned global, shared, workflow, command-index, and command-case Markdown documents in the current `orch` test-plan set have been authored - every implemented `orch` leaf-command folder now uses `README.md` as an index plus one Markdown file per planned case @@ -32,10 +32,10 @@ Current state: Progress summary for planned test-plan documents, excluding `ROADMAP.md`: -- planned document files: `65` -- authored document files: `65` -- planned case slugs in this roadmap: `47` -- authored case slugs in this roadmap: `47` +- planned document files: `71` +- authored document files: `71` +- planned case slugs in this roadmap: `52` +- authored case slugs in this roadmap: `52` ## Scope @@ -48,6 +48,7 @@ In scope: - `orch ready` - `orch dispatch` - `orch reconcile` +- `orch verify` - `orch wait` - `orch blocked` - `orch answer` @@ -150,7 +151,10 @@ The Markdown test-plan set starts at zero, but these automated tests already exi - [command_contracts_core_test.go](../../../packages/orch-runtime/internal/cli/orch/command_contracts_core_test.go) `TestOrchRunShowRejectsMissingRun` - [command_contracts_core_test.go](../../../packages/orch-runtime/internal/cli/orch/command_contracts_core_test.go) `TestOrchTaskAddRejectsInvalidAcceptanceJSON` - [command_contracts_core_test.go](../../../packages/orch-runtime/internal/cli/orch/command_contracts_core_test.go) `TestOrchTaskAddRejectsInvalidPriority` +- [command_contracts_core_test.go](../../../packages/orch-runtime/internal/cli/orch/command_contracts_core_test.go#L147) `TestOrchTaskAddSnapshotsSpecAndVerificationPolicy` +- [command_contracts_core_test.go](../../../packages/orch-runtime/internal/cli/orch/command_contracts_core_test.go#L203) `TestOrchTaskAddRejectsSpecSHAMismatch` - [command_contracts_core_test.go](../../../packages/orch-runtime/internal/cli/orch/command_contracts_core_test.go) `TestOrchReadyOrdersByPriorityAndRespectsLimit` +- [integration_test.go](../../../packages/orch-runtime/internal/cli/orch/integration_test.go#L185) `TestOrchVerificationGateLifecycle` - [command_contracts_edges_test.go](../../../packages/orch-runtime/internal/cli/orch/command_contracts_edges_test.go) `TestOrchAnswerAcceptsPayloadJSONWithoutBody` - [command_contracts_edges_test.go](../../../packages/orch-runtime/internal/cli/orch/command_contracts_edges_test.go) `TestOrchAnswerRejectsEmptyBodyAndPayload` - [command_contracts_edges_test.go](../../../packages/orch-runtime/internal/cli/orch/command_contracts_edges_test.go) `TestOrchCleanupRejectsAttemptWithoutTask` @@ -192,6 +196,8 @@ docs/tests/orch/ README.md reconcile/ README.md + verify/ + README.md wait/ README.md blocked/ @@ -233,6 +239,8 @@ docs/tests/orch/ | `docs/tests/orch/task-add/task-add-creates-ready-root-task.md` | `task add` command case | 1 | 1 | done | | `docs/tests/orch/task-add/task-add-rejects-invalid-acceptance-json.md` | `task add` command case | 1 | 1 | done | | `docs/tests/orch/task-add/task-add-rejects-invalid-priority.md` | `task add` command case | 1 | 1 | done | +| `docs/tests/orch/task-add/task-add-snapshots-spec-and-verification-policy.md` | `task add` command case | 1 | 1 | done | +| `docs/tests/orch/task-add/task-add-rejects-spec-sha-mismatch.md` | `task add` command case | 1 | 1 | done | | `docs/tests/orch/dep-add/README.md` | `dep add` command case index | 0 | 0 | done | | `docs/tests/orch/dep-add/dep-add-blocks-dependent-task-until-prerequisite-completes.md` | `dep add` command case | 1 | 1 | done | | `docs/tests/orch/ready/README.md` | `ready` command case index | 0 | 0 | done | @@ -249,6 +257,10 @@ docs/tests/orch/ | `docs/tests/orch/reconcile/README.md` | `reconcile` command case index | 0 | 0 | done | | `docs/tests/orch/reconcile/reconcile-maps-claimed-or-in-progress-thread-to-running.md` | `reconcile` command case | 1 | 1 | done | | `docs/tests/orch/reconcile/reconcile-maps-done-or-failed-thread-to-terminal-task-state.md` | `reconcile` command case | 1 | 1 | done | +| `docs/tests/orch/reconcile/reconcile-maps-done-thread-to-verifying-when-task-has-required-checks.md` | `reconcile` command case | 1 | 1 | done | +| `docs/tests/orch/verify/README.md` | `verify` command case index | 0 | 0 | done | +| `docs/tests/orch/verify/verify-status-returns-spec-and-gate-for-task.md` | `verify` command case | 1 | 1 | done | +| `docs/tests/orch/verify/verify-record-updates-gate-and-marks-task-done-when-required-checks-pass.md` | `verify` command case | 1 | 1 | done | | `docs/tests/orch/wait/README.md` | `wait` command case index | 0 | 0 | done | | `docs/tests/orch/wait/wait-wakes-on-matching-run-event.md` | `wait` command case | 1 | 1 | done | | `docs/tests/orch/wait/wait-times-out-without-matching-event.md` | `wait` command case | 1 | 1 | done | @@ -294,8 +306,9 @@ docs/tests/orch/ 2. shared fixtures and assertion helpers in `docs/tests/orch/_shared/README.md` 3. workflow cases in `docs/tests/orch/workflows/README.md` 4. core scheduler command docs: `run-init`, `task-add`, `dep-add`, `ready`, `dispatch`, `reconcile`, `status` -5. interactive leader command docs: `wait`, `blocked`, `answer`, `retry`, `reassign`, `cancel`, `cleanup` -6. council workflow docs: `council-start`, `council-wait`, `council-tally`, `council-report` +5. verification command docs: `verify` +6. interactive leader command docs: `wait`, `blocked`, `answer`, `retry`, `reassign`, `cancel`, `cleanup` +7. council workflow docs: `council-start`, `council-wait`, `council-tally`, `council-report` ## Authored Case Register @@ -310,6 +323,8 @@ docs/tests/orch/ | `docs/tests/orch/task-add/task-add-creates-ready-root-task.md` | `task-add-creates-ready-root-task` | dependency-free task becomes ready immediately | done | | `docs/tests/orch/task-add/task-add-rejects-invalid-acceptance-json.md` | `task-add-rejects-invalid-acceptance-json` | malformed `--acceptance-json` returns stable invalid_input | done | | `docs/tests/orch/task-add/task-add-rejects-invalid-priority.md` | `task-add-rejects-invalid-priority` | unsupported priorities are rejected with invalid_input | done | +| `docs/tests/orch/task-add/task-add-snapshots-spec-and-verification-policy.md` | `task-add-snapshots-spec-and-verification-policy` | task add snapshots spec content, verification profile, and scope policy onto the task | done | +| `docs/tests/orch/task-add/task-add-rejects-spec-sha-mismatch.md` | `task-add-rejects-spec-sha-mismatch` | explicit spec hash mismatch returns invalid_input | done | | `docs/tests/orch/dep-add/dep-add-blocks-dependent-task-until-prerequisite-completes.md` | `dep-add-blocks-dependent-task-until-prerequisite-completes` | dependency edge prevents immediate readiness | done | | `docs/tests/orch/ready/ready-lists-only-eligible-tasks.md` | `ready-lists-only-eligible-tasks` | ready list excludes dependency-gated tasks | done | | `docs/tests/orch/ready/ready-orders-by-priority-and-respects-limit.md` | `ready-orders-by-priority-and-respects-limit` | ready output orders by priority and applies explicit limit truncation | done | @@ -322,6 +337,9 @@ docs/tests/orch/ | `docs/tests/orch/dispatch/dispatch-analysis-mode-skips-worktree.md` | `dispatch-analysis-mode-skips-worktree` | analysis mode stays on the normal non-worktree path | done | | `docs/tests/orch/reconcile/reconcile-maps-claimed-or-in-progress-thread-to-running.md` | `reconcile-maps-claimed-or-in-progress-thread-to-running` | reconcile maps active inbox execution to running task state | done | | `docs/tests/orch/reconcile/reconcile-maps-done-or-failed-thread-to-terminal-task-state.md` | `reconcile-maps-done-or-failed-thread-to-terminal-task-state` | reconcile maps terminal inbox states to terminal task states | done | +| `docs/tests/orch/reconcile/reconcile-maps-done-thread-to-verifying-when-task-has-required-checks.md` | `reconcile-maps-done-thread-to-verifying-when-task-has-required-checks` | reconcile routes worker done into verifying when the task has required checks | done | +| `docs/tests/orch/verify/verify-status-returns-spec-and-gate-for-task.md` | `verify-status-returns-spec-and-gate-for-task` | verify status returns the task spec snapshot, selected attempt, and current gate state | done | +| `docs/tests/orch/verify/verify-record-updates-gate-and-marks-task-done-when-required-checks-pass.md` | `verify-record-updates-gate-and-marks-task-done-when-required-checks-pass` | verify record recomputes the gate and promotes the task to done when all required checks pass | done | | `docs/tests/orch/wait/wait-wakes-on-matching-run-event.md` | `wait-wakes-on-matching-run-event` | wait wakes on a later matching run-scoped event | done | | `docs/tests/orch/wait/wait-times-out-without-matching-event.md` | `wait-times-out-without-matching-event` | wait timeout returns a normal non-woken result | done | | `docs/tests/orch/blocked/blocked-lists-latest-question-for-blocked-task.md` | `blocked-lists-latest-question-for-blocked-task` | blocked view includes latest question payload for the task | done | diff --git a/docs/tests/orch/reconcile/README.md b/docs/tests/orch/reconcile/README.md index f0dd235..a378f82 100644 --- a/docs/tests/orch/reconcile/README.md +++ b/docs/tests/orch/reconcile/README.md @@ -5,4 +5,5 @@ | Case Slug | File | Coverage Note | | --- | --- | --- | | `reconcile-maps-claimed-or-in-progress-thread-to-running` | [reconcile-maps-claimed-or-in-progress-thread-to-running.md](./reconcile-maps-claimed-or-in-progress-thread-to-running.md) | maps worker claim or in-progress inbox state back into a running orch task | -| `reconcile-maps-done-or-failed-thread-to-terminal-task-state` | [reconcile-maps-done-or-failed-thread-to-terminal-task-state.md](./reconcile-maps-done-or-failed-thread-to-terminal-task-state.md) | maps terminal inbox states into terminal task states and updates run aggregates | +| `reconcile-maps-done-or-failed-thread-to-terminal-task-state` | [reconcile-maps-done-or-failed-thread-to-terminal-task-state.md](./reconcile-maps-done-or-failed-thread-to-terminal-task-state.md) | maps `done` without a gate or `failed` inbox states into terminal task states and updates run aggregates | +| `reconcile-maps-done-thread-to-verifying-when-task-has-required-checks` | [reconcile-maps-done-thread-to-verifying-when-task-has-required-checks.md](./reconcile-maps-done-thread-to-verifying-when-task-has-required-checks.md) | routes worker `done` into `verifying` when the task has required checks | diff --git a/docs/tests/orch/reconcile/reconcile-maps-done-or-failed-thread-to-terminal-task-state.md b/docs/tests/orch/reconcile/reconcile-maps-done-or-failed-thread-to-terminal-task-state.md index aec49bf..1a9c350 100644 --- a/docs/tests/orch/reconcile/reconcile-maps-done-or-failed-thread-to-terminal-task-state.md +++ b/docs/tests/orch/reconcile/reconcile-maps-done-or-failed-thread-to-terminal-task-state.md @@ -3,10 +3,15 @@ ## 用例意义 验证 `reconcile` 会把 worker 侧 thread 的终态同步到 `orch` 任务,并刷新 run 聚合状态。 +这个 case 只覆盖两类终态: + +- worker `done` 且 task 没有 required checks +- worker `fail` ## 前置条件 - 已存在 run 和已 dispatch 的任务 +- 该任务没有 configured verification gate,或者输入使用的是 `fail` - worker 已对该 thread 完成 `done` 或 `fail` ## 输入 @@ -32,3 +37,7 @@ orch --db TMPDIR/coord.db --json status --run run_blog_001 - 任务终态依赖 `reconcile` 落回 `orch`,而不是由 worker 直接改写 task 表 - run 级聚合状态会随终态任务一并刷新 + +## 补充约束 + +- 如果 task 声明了 required checks,worker `done` 不应再直接进入 `done`;那条分支由 `reconcile-maps-done-thread-to-verifying-when-task-has-required-checks.md` 覆盖 diff --git a/docs/tests/orch/reconcile/reconcile-maps-done-thread-to-verifying-when-task-has-required-checks.md b/docs/tests/orch/reconcile/reconcile-maps-done-thread-to-verifying-when-task-has-required-checks.md new file mode 100644 index 0000000..c4d96bf --- /dev/null +++ b/docs/tests/orch/reconcile/reconcile-maps-done-thread-to-verifying-when-task-has-required-checks.md @@ -0,0 +1,42 @@ +# Case: `reconcile-maps-done-thread-to-verifying-when-task-has-required-checks` + +## 用例意义 + +验证 `reconcile` 在 worker 报 `done` 之后,如果任务声明了 required checks,不会直接把 task 置为 `done`,而是先推进到 `verifying`。 + +## 前置条件 + +- 已存在带 required checks 的任务 +- 该任务已经 dispatch 并被 worker claim +- worker 已对该 thread 执行 `done` + +## 输入 + +```bash +orch --db TMPDIR/coord.db --json run init --run run_verify_001 --goal "Exercise verification gates" +orch --db TMPDIR/coord.db --json task add \ + --run run_verify_001 \ + --task T1 \ + --title "Implement verifier-backed task" \ + --default-to worker-a \ + --spec-file TMPDIR/task.md \ + --check-profile cadence_component \ + --required-check lint \ + --required-check test +orch --db TMPDIR/coord.db --json dispatch --run run_verify_001 --task T1 --execution-mode analysis --body "Implement the gated task." +inbox --db TMPDIR/coord.db --json claim --agent worker-a --thread THREAD_ID +inbox --db TMPDIR/coord.db --json done --agent worker-a --thread THREAD_ID --summary "Implementation finished" --body "Ready for verification." +orch --db TMPDIR/coord.db --json reconcile --run run_verify_001 +``` + +## 预期输出 + +- `reconcile` 退出码为 `0` +- `data.updated_tasks` 包含 `T1` +- `T1.status == "verifying"` +- 后续 `orch verify status --run run_verify_001 --task T1` 返回 `data.gate.status == "pending"` + +## 断言结论 + +- worker 的 `done` 不再自动等同于 task `done` +- 一旦 task 定义了 required checks,`reconcile` 的职责是把它送入验证门,而不是直接宣布完成 diff --git a/docs/tests/orch/task-add/README.md b/docs/tests/orch/task-add/README.md index cf89533..c7b4285 100644 --- a/docs/tests/orch/task-add/README.md +++ b/docs/tests/orch/task-add/README.md @@ -7,3 +7,5 @@ | `task-add-creates-ready-root-task` | [task-add-creates-ready-root-task.md](./task-add-creates-ready-root-task.md) | creates a dependency-free task that becomes ready immediately | | `task-add-rejects-invalid-acceptance-json` | [task-add-rejects-invalid-acceptance-json.md](./task-add-rejects-invalid-acceptance-json.md) | rejects malformed `--acceptance-json` with `invalid_input` | | `task-add-rejects-invalid-priority` | [task-add-rejects-invalid-priority.md](./task-add-rejects-invalid-priority.md) | rejects priorities outside `low|normal|high` | +| `task-add-snapshots-spec-and-verification-policy` | [task-add-snapshots-spec-and-verification-policy.md](./task-add-snapshots-spec-and-verification-policy.md) | snapshots spec file content, verification profile, and scope policy onto the task | +| `task-add-rejects-spec-sha-mismatch` | [task-add-rejects-spec-sha-mismatch.md](./task-add-rejects-spec-sha-mismatch.md) | rejects explicit spec hashes that do not match the task spec file content | diff --git a/docs/tests/orch/task-add/task-add-rejects-spec-sha-mismatch.md b/docs/tests/orch/task-add/task-add-rejects-spec-sha-mismatch.md new file mode 100644 index 0000000..4cfcc83 --- /dev/null +++ b/docs/tests/orch/task-add/task-add-rejects-spec-sha-mismatch.md @@ -0,0 +1,34 @@ +# Case: `task-add-rejects-spec-sha-mismatch` + +## 用例意义 + +验证 `task add` 在接收 `--spec-file` 与 `--spec-sha` 时,会拒绝内容摘要不匹配的任务定义,避免 task spec 漂移。 + +## 前置条件 + +- 已存在 run `run_blog_007` +- 临时目录内存在可读取的 spec 文件 `TMPDIR/task.md` +- 调用时传入的 `--spec-sha` 与文件实际 SHA256 不一致 + +## 输入 + +```bash +orch --db TMPDIR/coord.db --json run init --run run_blog_007 --goal "Validate spec sha mismatch" +orch --db TMPDIR/coord.db --json task add \ + --run run_blog_007 \ + --task T1 \ + --title "Implement verifier" \ + --spec-file TMPDIR/task.md \ + --spec-sha deadbeef +``` + +## 预期输出 + +- `task add` 退出码为 `30` +- JSON error payload 的 `error.code == "invalid_input"` +- `error.message` 包含 `spec-sha does not match spec-file contents` + +## 断言结论 + +- task spec 快照不是“尽力而为”的附带字段;当显式声明 SHA 时,CLI 会把它当成契约校验 +- leader 不能在 spec 内容与预期摘要不一致时继续创建 task diff --git a/docs/tests/orch/task-add/task-add-snapshots-spec-and-verification-policy.md b/docs/tests/orch/task-add/task-add-snapshots-spec-and-verification-policy.md new file mode 100644 index 0000000..0640376 --- /dev/null +++ b/docs/tests/orch/task-add/task-add-snapshots-spec-and-verification-policy.md @@ -0,0 +1,49 @@ +# Case: `task-add-snapshots-spec-and-verification-policy` + +## 用例意义 + +验证 `task add` 在创建任务时,不只是写入基础调度字段,还会快照 task spec 与验证策略。 + +## 前置条件 + +- 已存在 run `run_blog_006` +- 临时目录内存在可读取的 spec 文件 `TMPDIR/task.md` + +## 输入 + +```bash +orch --db TMPDIR/coord.db --json run init --run run_blog_006 --goal "Validate spec-aware task add" +orch --db TMPDIR/coord.db --json task add \ + --run run_blog_006 \ + --task T1 \ + --title "Implement verifier" \ + --spec-file TMPDIR/task.md \ + --check-profile cadence_component \ + --required-check lint \ + --required-check test \ + --allowed-path packages/ui \ + --blocked-path scripts/release-metadata.mjs \ + --metadata-json '{"repo":"cadence-ui"}' +``` + +## 预期输出 + +- `task add` 退出码为 `0` +- `data.task.status == "ready"` +- `data.task.spec.spec_file == "TMPDIR/task.md"` +- `data.task.spec.check_profile == "cadence_component"` +- `data.task.spec.required_checks` 包含 `lint` 与 `test` +- `data.task.spec.allowed_paths` 包含 `packages/ui` +- `data.task.spec.blocked_paths` 包含 `scripts/release-metadata.mjs` +- `data.task.gate.status == "pending"` +- `data.task.gate.required_checks` 与 spec 中的 required checks 一致 + +## 断言结论 + +- `task add` 现在会把任务说明和验证策略一起固化到 task spec,而不是只保存 `title`/`summary` +- required checks 一旦存在,task 会立即带上 `pending` gate,而不是等到 worker 完成后才临时推断 + +## 补充约束 + +- `spec_file` 对应内容应作为快照随 task 保存,而不是只保存路径引用 +- `check_profile` 目前只是任务策略名,后续 profile/adapter 机制会负责把它解释成真正的执行计划 diff --git a/docs/tests/orch/verify/README.md b/docs/tests/orch/verify/README.md new file mode 100644 index 0000000..2e10ae9 --- /dev/null +++ b/docs/tests/orch/verify/README.md @@ -0,0 +1,8 @@ +# Orch `verify` Test Plan Index + +## Case Files + +| Case Slug | File | Coverage Note | +| --- | --- | --- | +| `verify-status-returns-spec-and-gate-for-task` | [verify-status-returns-spec-and-gate-for-task.md](./verify-status-returns-spec-and-gate-for-task.md) | returns the selected task, latest attempt, spec snapshot, and current gate state | +| `verify-record-updates-gate-and-marks-task-done-when-required-checks-pass` | [verify-record-updates-gate-and-marks-task-done-when-required-checks-pass.md](./verify-record-updates-gate-and-marks-task-done-when-required-checks-pass.md) | records named checks, recomputes the gate, and promotes the task to `done` when all required checks pass | diff --git a/docs/tests/orch/verify/verify-record-updates-gate-and-marks-task-done-when-required-checks-pass.md b/docs/tests/orch/verify/verify-record-updates-gate-and-marks-task-done-when-required-checks-pass.md new file mode 100644 index 0000000..e71a7a6 --- /dev/null +++ b/docs/tests/orch/verify/verify-record-updates-gate-and-marks-task-done-when-required-checks-pass.md @@ -0,0 +1,35 @@ +# Case: `verify-record-updates-gate-and-marks-task-done-when-required-checks-pass` + +## 用例意义 + +验证 `verify record` 在逐个记录 required checks 后,会重新计算 gate,并在所有必过项通过时把 task 从 `verifying` 推进到 `done`。 + +## 前置条件 + +- 已存在处于 `verifying` 的任务 `T1` +- 该任务的 required checks 为 `lint` 与 `test` + +## 输入 + +```bash +orch --db TMPDIR/coord.db --json verify record --run run_verify_001 --task T1 --check lint --status passed --summary "lint clean" +orch --db TMPDIR/coord.db --json verify record --run run_verify_001 --task T1 --check test --status passed --summary "tests clean" +orch --db TMPDIR/coord.db --json status --run run_verify_001 +``` + +## 预期输出 + +- 第一次 `verify record` 后: + - `data.task.status == "verifying"` + - `data.gate.status == "pending"` + - `data.gate.pending_checks` 仍包含 `test` +- 第二次 `verify record` 后: + - `data.task.status == "done"` + - `data.gate.status == "passed"` + - `data.gate.pending_checks` 为空 +- 后续 `status.data.run.status == "done"` + +## 断言结论 + +- `verify record` 不是单纯写一条 check 日志;它会驱动 gate 和 task 状态机前进 +- 只有所有 required checks 通过,task 才会真正完成 diff --git a/docs/tests/orch/verify/verify-status-returns-spec-and-gate-for-task.md b/docs/tests/orch/verify/verify-status-returns-spec-and-gate-for-task.md new file mode 100644 index 0000000..cbc1e39 --- /dev/null +++ b/docs/tests/orch/verify/verify-status-returns-spec-and-gate-for-task.md @@ -0,0 +1,32 @@ +# Case: `verify-status-returns-spec-and-gate-for-task` + +## 用例意义 + +验证 `verify status` 能把 task 的验证上下文一次性展示出来,而不是要求 leader 手工拼装 task、attempt、spec 与 check 结果。 + +## 前置条件 + +- 已存在带 required checks 的任务 +- 该任务已经经过 `reconcile` 进入 `verifying` + +## 输入 + +```bash +orch --db TMPDIR/coord.db --json verify status --run run_verify_001 --task T1 +``` + +## 预期输出 + +- `verify status` 退出码为 `0` +- `data.task.task_id == "T1"` +- `data.attempt.attempt_no == 1` +- `data.spec.spec_file` 非空 +- `data.spec.check_profile == "cadence_component"` +- `data.gate.status == "pending"` +- `data.gate.required_checks` 包含 `lint` 与 `test` +- `data.gate.pending_checks` 在首次查询时仍包含所有未通过检查 + +## 断言结论 + +- `verify status` 是 leader 查看 gate 的主入口,而不是只返回 task 表里的裸状态 +- gate 是否仍在等待检查、已经失败、还是已经通过,都应在一个响应里可见 diff --git a/packages/coord-core/db/schema/008_orch_harness.sql b/packages/coord-core/db/schema/008_orch_harness.sql new file mode 100644 index 0000000..c60e34f --- /dev/null +++ b/packages/coord-core/db/schema/008_orch_harness.sql @@ -0,0 +1,35 @@ +CREATE TABLE IF NOT EXISTS task_specs ( + run_id TEXT NOT NULL, + task_id TEXT NOT NULL, + spec_file TEXT NOT NULL DEFAULT '', + spec_sha TEXT NOT NULL DEFAULT '', + spec_body TEXT NOT NULL DEFAULT '', + check_profile TEXT NOT NULL DEFAULT '', + required_checks_json TEXT NOT NULL DEFAULT '[]', + allowed_paths_json TEXT NOT NULL DEFAULT '[]', + blocked_paths_json TEXT NOT NULL DEFAULT '[]', + metadata_json TEXT NOT NULL DEFAULT '{}', + created_at TEXT NOT NULL, + updated_at TEXT NOT NULL, + PRIMARY KEY (run_id, task_id), + FOREIGN KEY (run_id, task_id) REFERENCES tasks(run_id, task_id) +); + +CREATE TABLE IF NOT EXISTS check_runs ( + run_id TEXT NOT NULL, + task_id TEXT NOT NULL, + attempt_no INTEGER NOT NULL, + check_name TEXT NOT NULL, + status TEXT NOT NULL, + summary TEXT NOT NULL DEFAULT '', + body TEXT NOT NULL DEFAULT '', + metadata_json TEXT NOT NULL DEFAULT '{}', + recorded_by TEXT NOT NULL DEFAULT '', + created_at TEXT NOT NULL, + updated_at TEXT NOT NULL, + PRIMARY KEY (run_id, task_id, attempt_no, check_name), + FOREIGN KEY (run_id, task_id, attempt_no) REFERENCES task_attempts(run_id, task_id, attempt_no) +); + +CREATE INDEX IF NOT EXISTS idx_check_runs_task_attempt + ON check_runs(run_id, task_id, attempt_no, status, check_name); diff --git a/packages/coord-core/store/orch.go b/packages/coord-core/store/orch.go index 2a399f4..9497949 100644 --- a/packages/coord-core/store/orch.go +++ b/packages/coord-core/store/orch.go @@ -30,17 +30,58 @@ type Run struct { } type Task struct { - RunID string `json:"run_id"` - TaskID string `json:"task_id"` - Title string `json:"title"` - Summary string `json:"summary"` - Status string `json:"status"` - DefaultTo string `json:"default_to,omitempty"` - Priority string `json:"priority"` - AcceptanceJSON json.RawMessage `json:"acceptance_json"` - LatestAttemptNo int `json:"latest_attempt_no,omitempty"` - CreatedAt time.Time `json:"created_at"` - UpdatedAt time.Time `json:"updated_at"` + RunID string `json:"run_id"` + TaskID string `json:"task_id"` + Title string `json:"title"` + Summary string `json:"summary"` + Status string `json:"status"` + DefaultTo string `json:"default_to,omitempty"` + Priority string `json:"priority"` + AcceptanceJSON json.RawMessage `json:"acceptance_json"` + LatestAttemptNo int `json:"latest_attempt_no,omitempty"` + Spec *TaskSpec `json:"spec,omitempty"` + Gate *VerificationGate `json:"gate,omitempty"` + CreatedAt time.Time `json:"created_at"` + UpdatedAt time.Time `json:"updated_at"` +} + +type TaskSpec struct { + RunID string `json:"run_id"` + TaskID string `json:"task_id"` + SpecFile string `json:"spec_file,omitempty"` + SpecSHA string `json:"spec_sha,omitempty"` + SpecBody string `json:"spec_body,omitempty"` + CheckProfile string `json:"check_profile,omitempty"` + RequiredChecks []string `json:"required_checks,omitempty"` + AllowedPaths []string `json:"allowed_paths,omitempty"` + BlockedPaths []string `json:"blocked_paths,omitempty"` + MetadataJSON json.RawMessage `json:"metadata_json,omitempty"` + CreatedAt time.Time `json:"created_at"` + UpdatedAt time.Time `json:"updated_at"` +} + +type TaskCheckRun struct { + RunID string `json:"run_id"` + TaskID string `json:"task_id"` + AttemptNo int `json:"attempt_no"` + CheckName string `json:"check_name"` + Status string `json:"status"` + Summary string `json:"summary"` + Body string `json:"body,omitempty"` + MetadataJSON json.RawMessage `json:"metadata_json,omitempty"` + RecordedBy string `json:"recorded_by,omitempty"` + CreatedAt time.Time `json:"created_at"` + UpdatedAt time.Time `json:"updated_at"` +} + +type VerificationGate struct { + Status string `json:"status"` + AttemptNo int `json:"attempt_no,omitempty"` + CheckProfile string `json:"check_profile,omitempty"` + RequiredChecks []string `json:"required_checks,omitempty"` + PendingChecks []string `json:"pending_checks,omitempty"` + FailedChecks []string `json:"failed_checks,omitempty"` + Checks []TaskCheckRun `json:"checks,omitempty"` } type TaskDependency struct { @@ -99,6 +140,14 @@ type AddTaskInput struct { DefaultTo string AcceptanceJSON string Priority string + SpecFile string + SpecSHA string + SpecBody string + CheckProfile string + RequiredChecks []string + AllowedPaths []string + BlockedPaths []string + MetadataJSON string } type AddDependencyInput struct { @@ -189,6 +238,38 @@ type AnswerResult struct { Message Message `json:"message"` } +type VerifyRecordInput struct { + RunID string + TaskID string + AttemptNo int + CheckName string + Status string + Summary string + Body string + MetadataJSON string + RecordedBy string +} + +type VerifyRecordResult struct { + Task Task `json:"task"` + Attempt TaskAttempt `json:"attempt"` + Check TaskCheckRun `json:"check"` + Gate *VerificationGate `json:"gate,omitempty"` +} + +type VerificationStatusInput struct { + RunID string + TaskID string + AttemptNo int +} + +type VerificationStatusResult struct { + Task Task `json:"task"` + Attempt *TaskAttempt `json:"attempt,omitempty"` + Spec *TaskSpec `json:"spec,omitempty"` + Gate *VerificationGate `json:"gate,omitempty"` +} + type RetryInput struct { RunID string TaskID string @@ -338,6 +419,25 @@ func (s *OrchStore) AddTask(ctx context.Context, input AddTaskInput) (Task, erro if err != nil { return Task{}, err } + specMetadataJSON, err := validateAndNormalizeJSON("metadata-json", input.MetadataJSON) + if err != nil { + return Task{}, err + } + requiredChecks := normalizeStringList(input.RequiredChecks) + allowedPaths := normalizeStringList(input.AllowedPaths) + blockedPaths := normalizeStringList(input.BlockedPaths) + requiredChecksJSON, err := marshalStringList("required-check", requiredChecks) + if err != nil { + return Task{}, err + } + allowedPathsJSON, err := marshalStringList("allowed-path", allowedPaths) + if err != nil { + return Task{}, err + } + blockedPathsJSON, err := marshalStringList("blocked-path", blockedPaths) + if err != nil { + return Task{}, err + } now := nowUTC() tx, err := s.db.BeginTx(ctx, nil) @@ -375,17 +475,48 @@ func (s *OrchStore) AddTask(ctx context.Context, input AddTaskInput) (Task, erro } if err := insertEvent(ctx, tx, eventInput{ - RunID: input.RunID, - TaskID: input.TaskID, - Source: "orch", - EventType: "task_added", - Summary: input.Title, - PayloadJSON: marshalJSON(map[string]any{"title": input.Title, "priority": priority}), - CreatedAt: now, + RunID: input.RunID, + TaskID: input.TaskID, + Source: "orch", + EventType: "task_added", + Summary: input.Title, + PayloadJSON: marshalJSON(map[string]any{ + "title": input.Title, + "priority": priority, + "spec_file": strings.TrimSpace(input.SpecFile), + "check_profile": strings.TrimSpace(input.CheckProfile), + }), + CreatedAt: now, }); err != nil { return Task{}, err } + if shouldPersistTaskSpec(input, specMetadataJSON, requiredChecks, allowedPaths, blockedPaths) { + _, err = tx.ExecContext( + ctx, + `INSERT INTO task_specs ( + run_id, task_id, spec_file, spec_sha, spec_body, check_profile, + required_checks_json, allowed_paths_json, blocked_paths_json, + metadata_json, created_at, updated_at + ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`, + input.RunID, + input.TaskID, + strings.TrimSpace(input.SpecFile), + strings.TrimSpace(input.SpecSHA), + input.SpecBody, + strings.TrimSpace(input.CheckProfile), + requiredChecksJSON, + allowedPathsJSON, + blockedPathsJSON, + specMetadataJSON, + formatTime(now), + formatTime(now), + ) + if err != nil { + return Task{}, fmt.Errorf("insert task spec: %w", err) + } + } + if err := refreshReadyStates(ctx, tx, input.RunID, now); err != nil { return Task{}, err } @@ -397,6 +528,9 @@ func (s *OrchStore) AddTask(ctx context.Context, input AddTaskInput) (Task, erro if err != nil { return Task{}, err } + if err := attachTaskHarnessData(ctx, tx, &task, false); err != nil { + return Task{}, err + } if err := tx.Commit(); err != nil { return Task{}, fmt.Errorf("commit add task transaction: %w", err) @@ -534,6 +668,9 @@ func (s *OrchStore) ListReadyTasks(ctx context.Context, input ListReadyInput) ([ if err != nil { return nil, err } + if err := attachTaskHarnessData(ctx, tx, &task, false); err != nil { + return nil, err + } tasks = append(tasks, task) } if err := rows.Err(); err != nil { @@ -602,6 +739,9 @@ func (s *OrchStore) GetTaskWithLatestAttempt(ctx context.Context, runID, taskID if err != nil { return Task{}, nil, err } + if err := attachTaskHarnessData(ctx, s.db, &task, false); err != nil { + return Task{}, nil, err + } if task.LatestAttemptNo == 0 { return task, nil, nil } @@ -1118,6 +1258,11 @@ func (s *OrchStore) dispatchTaskTx( threadID := newID("thr") messageID := newID("msg") + spec, err := selectTaskSpec(ctx, tx, task.RunID, task.TaskID, true) + if err != nil { + return DispatchResult{}, finalizeWorkspace, err + } + task.Spec = spec payloadJSON := buildDispatchPayload(task, attemptNo, workspace) thread := Thread{ ThreadID: threadID, @@ -1133,7 +1278,7 @@ func (s *OrchStore) dispatchTaskTx( UpdatedAt: now, } - _, err := tx.ExecContext( + _, err = tx.ExecContext( ctx, `INSERT INTO threads ( thread_id, run_id, task_id, subject, created_by, assigned_to, status, @@ -1441,7 +1586,10 @@ func (s *OrchStore) ReconcileRun(ctx context.Context, runID string) (ReconcileRe return ReconcileResult{}, fmt.Errorf("scan reconcile candidate: %w", err) } - nextStatus := reconcileTaskStatus(threadStatus) + nextStatus, err := reconcileTaskStatus(ctx, tx, runID, taskID, attemptNo, taskStatus, threadStatus) + if err != nil { + return ReconcileResult{}, err + } if nextStatus == "" { continue } @@ -1535,6 +1683,9 @@ func (s *OrchStore) ReconcileRun(ctx context.Context, runID string) (ReconcileRe if err != nil { return ReconcileResult{}, err } + if err := attachTaskHarnessData(ctx, tx, &task, false); err != nil { + return ReconcileResult{}, err + } updatedTasks = append(updatedTasks, task) } @@ -1593,6 +1744,9 @@ func (s *OrchStore) ListBlockedTasks(ctx context.Context, runID string) ([]Block if err != nil { return nil, err } + if err := attachTaskHarnessData(ctx, tx, &task, false); err != nil { + return nil, err + } question, err := selectLatestQuestionMessage(ctx, tx, attempt.ThreadID) if err != nil { return nil, err @@ -1750,6 +1904,268 @@ func (s *OrchStore) AnswerTask(ctx context.Context, input AnswerInput) (AnswerRe }, nil } +func (s *OrchStore) RecordCheck(ctx context.Context, input VerifyRecordInput) (VerifyRecordResult, error) { + if strings.TrimSpace(input.RunID) == "" { + return VerifyRecordResult{}, fmt.Errorf("%w: run id is required", ErrInvalidInput) + } + if strings.TrimSpace(input.TaskID) == "" { + return VerifyRecordResult{}, fmt.Errorf("%w: task id is required", ErrInvalidInput) + } + checkName := strings.TrimSpace(input.CheckName) + if checkName == "" { + return VerifyRecordResult{}, fmt.Errorf("%w: check name is required", ErrInvalidInput) + } + + checkStatus, err := normalizeCheckStatus(input.Status) + if err != nil { + return VerifyRecordResult{}, err + } + metadataJSON, err := validateAndNormalizeJSON("metadata-json", input.MetadataJSON) + if err != nil { + return VerifyRecordResult{}, err + } + + now := nowUTC() + tx, err := s.db.BeginTx(ctx, nil) + if err != nil { + return VerifyRecordResult{}, fmt.Errorf("begin record check transaction: %w", err) + } + defer tx.Rollback() + + task, err := selectTask(ctx, tx, input.RunID, input.TaskID) + if err != nil { + return VerifyRecordResult{}, err + } + if task.LatestAttemptNo == 0 { + return VerifyRecordResult{}, fmt.Errorf("%w: task %s has no attempt to verify", ErrInvalidState, task.TaskID) + } + if task.Status != "verifying" && task.Status != "failed" && task.Status != "done" { + return VerifyRecordResult{}, fmt.Errorf("%w: task %s is not ready for verification recording", ErrInvalidState, task.TaskID) + } + + attemptNo := input.AttemptNo + if attemptNo == 0 { + attemptNo = task.LatestAttemptNo + } + if attemptNo != task.LatestAttemptNo { + return VerifyRecordResult{}, fmt.Errorf("%w: can only record verification for the latest attempt", ErrInvalidState) + } + attempt, err := selectAttempt(ctx, tx, input.RunID, input.TaskID, attemptNo) + if err != nil { + return VerifyRecordResult{}, err + } + + checkRun := TaskCheckRun{ + RunID: input.RunID, + TaskID: input.TaskID, + AttemptNo: attempt.AttemptNo, + CheckName: checkName, + Status: checkStatus, + Summary: strings.TrimSpace(input.Summary), + Body: input.Body, + MetadataJSON: json.RawMessage(metadataJSON), + RecordedBy: defaultString(strings.TrimSpace(input.RecordedBy), "orch"), + CreatedAt: now, + UpdatedAt: now, + } + + _, err = tx.ExecContext( + ctx, + `INSERT INTO check_runs ( + run_id, task_id, attempt_no, check_name, status, summary, body, + metadata_json, recorded_by, created_at, updated_at + ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) + ON CONFLICT(run_id, task_id, attempt_no, check_name) DO UPDATE SET + status = excluded.status, + summary = excluded.summary, + body = excluded.body, + metadata_json = excluded.metadata_json, + recorded_by = excluded.recorded_by, + updated_at = excluded.updated_at`, + checkRun.RunID, + checkRun.TaskID, + checkRun.AttemptNo, + checkRun.CheckName, + checkRun.Status, + checkRun.Summary, + checkRun.Body, + string(checkRun.MetadataJSON), + checkRun.RecordedBy, + formatTime(checkRun.CreatedAt), + formatTime(checkRun.UpdatedAt), + ) + if err != nil { + return VerifyRecordResult{}, fmt.Errorf("upsert check run: %w", err) + } + + checkRun, err = selectCheckRun(ctx, tx, checkRun.RunID, checkRun.TaskID, checkRun.AttemptNo, checkRun.CheckName) + if err != nil { + return VerifyRecordResult{}, err + } + + if err := insertEvent(ctx, tx, eventInput{ + RunID: task.RunID, + TaskID: task.TaskID, + ThreadID: attempt.ThreadID, + Source: "orch", + EventType: "task_verification_recorded", + Summary: defaultString(checkRun.Summary, fmt.Sprintf("%s: %s", checkRun.CheckName, checkRun.Status)), + PayloadJSON: marshalJSON(map[string]any{ + "attempt_no": attempt.AttemptNo, + "check_name": checkRun.CheckName, + "status": checkRun.Status, + }), + CreatedAt: now, + }); err != nil { + return VerifyRecordResult{}, err + } + + gate, err := buildVerificationGate(ctx, tx, task.RunID, task.TaskID, attempt.AttemptNo) + if err != nil { + return VerifyRecordResult{}, err + } + + nextStatus := task.Status + switch { + case gate == nil: + nextStatus = task.Status + case gate.Status == "failed": + nextStatus = "failed" + case gate.Status == "passed": + nextStatus = "done" + default: + nextStatus = "verifying" + } + + if nextStatus != task.Status { + _, err = tx.ExecContext( + ctx, + `UPDATE tasks + SET status = ?, updated_at = ? + WHERE run_id = ? AND task_id = ?`, + nextStatus, + formatTime(now), + task.RunID, + task.TaskID, + ) + if err != nil { + return VerifyRecordResult{}, fmt.Errorf("update verified task status: %w", err) + } + if err := insertEvent(ctx, tx, eventInput{ + RunID: task.RunID, + TaskID: task.TaskID, + ThreadID: attempt.ThreadID, + Source: "orch", + EventType: "task_" + nextStatus, + Summary: verificationSummary(nextStatus, gate, checkRun), + PayloadJSON: marshalJSON(map[string]any{ + "attempt_no": attempt.AttemptNo, + "check_name": checkRun.CheckName, + "status": checkRun.Status, + }), + CreatedAt: now, + }); err != nil { + return VerifyRecordResult{}, err + } + task.Status = nextStatus + task.UpdatedAt = now + } + + if nextStatus != attempt.Status { + _, err = tx.ExecContext( + ctx, + `UPDATE task_attempts + SET status = ?, updated_at = ? + WHERE run_id = ? AND task_id = ? AND attempt_no = ?`, + nextStatus, + formatTime(now), + attempt.RunID, + attempt.TaskID, + attempt.AttemptNo, + ) + if err != nil { + return VerifyRecordResult{}, fmt.Errorf("update verified attempt status: %w", err) + } + attempt.Status = nextStatus + attempt.UpdatedAt = now + } + + if err := updateRunAggregateStatus(ctx, tx, task.RunID, now); err != nil { + return VerifyRecordResult{}, err + } + + if err := attachTaskHarnessData(ctx, tx, &task, false); err != nil { + return VerifyRecordResult{}, err + } + if task.Gate != nil { + gate = task.Gate + } + + if err := tx.Commit(); err != nil { + return VerifyRecordResult{}, fmt.Errorf("commit record check transaction: %w", err) + } + + return VerifyRecordResult{ + Task: task, + Attempt: attempt, + Check: checkRun, + Gate: gate, + }, nil +} + +func (s *OrchStore) GetVerificationStatus(ctx context.Context, input VerificationStatusInput) (VerificationStatusResult, error) { + if strings.TrimSpace(input.RunID) == "" { + return VerificationStatusResult{}, fmt.Errorf("%w: run id is required", ErrInvalidInput) + } + if strings.TrimSpace(input.TaskID) == "" { + return VerificationStatusResult{}, fmt.Errorf("%w: task id is required", ErrInvalidInput) + } + + task, err := selectTask(ctx, s.db, input.RunID, input.TaskID) + if err != nil { + return VerificationStatusResult{}, err + } + spec, err := selectTaskSpec(ctx, s.db, input.RunID, input.TaskID, false) + if err != nil { + return VerificationStatusResult{}, err + } + task.Spec = spec + + var attempt *TaskAttempt + var gate *VerificationGate + if task.LatestAttemptNo > 0 { + attemptNo := input.AttemptNo + if attemptNo == 0 { + attemptNo = task.LatestAttemptNo + } + record, err := selectAttempt(ctx, s.db, input.RunID, input.TaskID, attemptNo) + if err != nil { + return VerificationStatusResult{}, err + } + attempt = &record + gate, err = buildVerificationGate(ctx, s.db, input.RunID, input.TaskID, attemptNo) + if err != nil { + return VerificationStatusResult{}, err + } + } + if gate == nil && spec != nil && len(spec.RequiredChecks) > 0 { + gate = &VerificationGate{ + Status: "pending", + CheckProfile: spec.CheckProfile, + RequiredChecks: append([]string(nil), spec.RequiredChecks...), + PendingChecks: append([]string(nil), spec.RequiredChecks...), + } + } + task.Gate = gate + + return VerificationStatusResult{ + Task: task, + Attempt: attempt, + Spec: spec, + Gate: gate, + }, nil +} + func (s *OrchStore) GetRunOverview(ctx context.Context, runID string) (RunOverview, error) { if strings.TrimSpace(runID) == "" { return RunOverview{}, fmt.Errorf("%w: run id is required", ErrInvalidInput) @@ -1924,7 +2340,7 @@ func (s *OrchStore) WaitForEvents(ctx context.Context, input WaitInput) (WaitRes } } -func listTasksForRun(ctx context.Context, db queryRowsContexter, runID string) ([]Task, error) { +func listTasksForRun(ctx context.Context, db queryRowsAndRower, runID string) ([]Task, error) { rows, err := db.QueryContext( ctx, `SELECT @@ -1946,6 +2362,9 @@ func listTasksForRun(ctx context.Context, db queryRowsContexter, runID string) ( if err != nil { return nil, err } + if err := attachTaskHarnessData(ctx, db, &task, false); err != nil { + return nil, err + } tasks = append(tasks, task) } if err := rows.Err(); err != nil { @@ -2147,6 +2566,82 @@ func scanAttempt(scanner threadScanner) (TaskAttempt, error) { return attempt, nil } +func scanTaskSpec(scanner threadScanner) (TaskSpec, error) { + var ( + spec TaskSpec + requiredChecksJSON, allowedPathsJSON string + blockedPathsJSON, metadataJSON string + createdAt, updatedAt string + ) + + if err := scanner.Scan( + &spec.RunID, + &spec.TaskID, + &spec.SpecFile, + &spec.SpecSHA, + &spec.SpecBody, + &spec.CheckProfile, + &requiredChecksJSON, + &allowedPathsJSON, + &blockedPathsJSON, + &metadataJSON, + &createdAt, + &updatedAt, + ); err != nil { + return TaskSpec{}, fmt.Errorf("scan task spec: %w", err) + } + + requiredChecks, err := unmarshalStringList(requiredChecksJSON) + if err != nil { + return TaskSpec{}, err + } + allowedPaths, err := unmarshalStringList(allowedPathsJSON) + if err != nil { + return TaskSpec{}, err + } + blockedPaths, err := unmarshalStringList(blockedPathsJSON) + if err != nil { + return TaskSpec{}, err + } + + spec.RequiredChecks = requiredChecks + spec.AllowedPaths = allowedPaths + spec.BlockedPaths = blockedPaths + spec.MetadataJSON = json.RawMessage(metadataJSON) + spec.CreatedAt = parseTime(createdAt) + spec.UpdatedAt = parseTime(updatedAt) + return spec, nil +} + +func scanTaskCheckRun(scanner threadScanner) (TaskCheckRun, error) { + var ( + check TaskCheckRun + metadataJSON string + createdAt, updatedAt string + ) + + if err := scanner.Scan( + &check.RunID, + &check.TaskID, + &check.AttemptNo, + &check.CheckName, + &check.Status, + &check.Summary, + &check.Body, + &metadataJSON, + &check.RecordedBy, + &createdAt, + &updatedAt, + ); err != nil { + return TaskCheckRun{}, fmt.Errorf("scan task check run: %w", err) + } + + check.MetadataJSON = json.RawMessage(metadataJSON) + check.CreatedAt = parseTime(createdAt) + check.UpdatedAt = parseTime(updatedAt) + return check, nil +} + func scanTaskAndAttempt(scanner threadScanner) (Task, TaskAttempt, error) { var ( task Task @@ -2269,6 +2764,85 @@ func selectAttempt(ctx context.Context, db queryRower, runID, taskID string, att return attempt, err } +func selectTaskSpec(ctx context.Context, db queryRower, runID, taskID string, includeBody bool) (*TaskSpec, error) { + columns := `run_id, task_id, spec_file, spec_sha, spec_body, check_profile, + required_checks_json, allowed_paths_json, blocked_paths_json, metadata_json, + created_at, updated_at` + if !includeBody { + columns = `run_id, task_id, spec_file, spec_sha, '' AS spec_body, check_profile, + required_checks_json, allowed_paths_json, blocked_paths_json, metadata_json, + created_at, updated_at` + } + row := db.QueryRowContext( + ctx, + `SELECT `+columns+` + FROM task_specs + WHERE run_id = ? AND task_id = ?`, + runID, + taskID, + ) + spec, err := scanTaskSpec(row) + if errors.Is(err, sql.ErrNoRows) { + return nil, nil + } + if err != nil { + return nil, err + } + return &spec, nil +} + +func selectCheckRuns(ctx context.Context, db queryRowsContexter, runID, taskID string, attemptNo int) ([]TaskCheckRun, error) { + rows, err := db.QueryContext( + ctx, + `SELECT + run_id, task_id, attempt_no, check_name, status, summary, body, + metadata_json, recorded_by, created_at, updated_at + FROM check_runs + WHERE run_id = ? AND task_id = ? AND attempt_no = ? + ORDER BY check_name ASC`, + runID, + taskID, + attemptNo, + ) + if err != nil { + return nil, fmt.Errorf("query check runs: %w", err) + } + defer rows.Close() + + var checks []TaskCheckRun + for rows.Next() { + check, err := scanTaskCheckRun(rows) + if err != nil { + return nil, err + } + checks = append(checks, check) + } + if err := rows.Err(); err != nil { + return nil, fmt.Errorf("iterate check runs: %w", err) + } + return checks, nil +} + +func selectCheckRun(ctx context.Context, db queryRower, runID, taskID string, attemptNo int, checkName string) (TaskCheckRun, error) { + row := db.QueryRowContext( + ctx, + `SELECT + run_id, task_id, attempt_no, check_name, status, summary, body, + metadata_json, recorded_by, created_at, updated_at + FROM check_runs + WHERE run_id = ? AND task_id = ? AND attempt_no = ? AND check_name = ?`, + runID, + taskID, + attemptNo, + checkName, + ) + check, err := scanTaskCheckRun(row) + if errors.Is(err, sql.ErrNoRows) { + return TaskCheckRun{}, fmt.Errorf("%w: check %s for %s/%s attempt %d not found", ErrInvalidState, checkName, runID, taskID, attemptNo) + } + return check, err +} + func selectLatestQuestionMessage(ctx context.Context, db queryRowsAndRower, threadID string) (Message, error) { row := db.QueryRowContext( ctx, @@ -2368,6 +2942,93 @@ func loadArtifactsForMessageIDsFromQueryer(ctx context.Context, db queryRowsCont return result, nil } +func attachTaskHarnessData(ctx context.Context, db queryRowsAndRower, task *Task, includeSpecBody bool) error { + if task == nil { + return nil + } + + spec, err := selectTaskSpec(ctx, db, task.RunID, task.TaskID, includeSpecBody) + if err != nil { + return err + } + task.Spec = spec + + gate, err := buildVerificationGate(ctx, db, task.RunID, task.TaskID, task.LatestAttemptNo) + if err != nil { + return err + } + if gate == nil && spec != nil && len(spec.RequiredChecks) > 0 { + gate = &VerificationGate{ + Status: "pending", + CheckProfile: spec.CheckProfile, + RequiredChecks: append([]string(nil), spec.RequiredChecks...), + PendingChecks: append([]string(nil), spec.RequiredChecks...), + } + } + task.Gate = gate + return nil +} + +func buildVerificationGate(ctx context.Context, db queryRowsAndRower, runID, taskID string, attemptNo int) (*VerificationGate, error) { + spec, err := selectTaskSpec(ctx, db, runID, taskID, false) + if err != nil { + return nil, err + } + if spec == nil || len(spec.RequiredChecks) == 0 { + return nil, nil + } + + gate := &VerificationGate{ + Status: "pending", + AttemptNo: attemptNo, + CheckProfile: spec.CheckProfile, + RequiredChecks: append([]string(nil), spec.RequiredChecks...), + PendingChecks: append([]string(nil), spec.RequiredChecks...), + } + if attemptNo == 0 { + return gate, nil + } + + checks, err := selectCheckRuns(ctx, db, runID, taskID, attemptNo) + if err != nil { + return nil, err + } + gate.Checks = checks + + checkByName := make(map[string]TaskCheckRun, len(checks)) + for _, check := range checks { + checkByName[check.CheckName] = check + } + + pending := make([]string, 0, len(spec.RequiredChecks)) + failed := make([]string, 0) + for _, checkName := range spec.RequiredChecks { + check, ok := checkByName[checkName] + if !ok || check.Status == "skipped" { + pending = append(pending, checkName) + continue + } + if check.Status == "failed" { + failed = append(failed, checkName) + } + if check.Status != "passed" && check.Status != "failed" { + pending = append(pending, checkName) + } + } + + gate.PendingChecks = pending + gate.FailedChecks = failed + switch { + case len(failed) > 0: + gate.Status = "failed" + case len(pending) == 0: + gate.Status = "passed" + default: + gate.Status = "pending" + } + return gate, nil +} + func refreshReadyStates(ctx context.Context, tx *sql.Tx, runID string, now time.Time) error { rows, err := tx.QueryContext( ctx, @@ -2538,6 +3199,9 @@ func deriveRunStatus(counts map[string]int) string { if counts["running"] > 0 || counts["dispatched"] > 0 { return "running" } + if counts["verifying"] > 0 { + return "verifying" + } if counts["ready"] > 0 { return "ready" } @@ -2553,22 +3217,34 @@ func deriveRunStatus(counts map[string]int) string { return "active" } -func reconcileTaskStatus(threadStatus string) string { +func reconcileTaskStatus(ctx context.Context, db queryRowsAndRower, runID, taskID string, attemptNo int, currentTaskStatus, threadStatus string) (string, error) { switch threadStatus { case "pending": - return "dispatched" + return "dispatched", nil case "claimed", "in_progress": - return "running" + return "running", nil case "blocked": - return "blocked" + return "blocked", nil case "done": - return "done" + gate, err := buildVerificationGate(ctx, db, runID, taskID, attemptNo) + if err != nil { + return "", err + } + if gate != nil { + switch currentTaskStatus { + case "done", "failed": + return currentTaskStatus, nil + default: + return "verifying", nil + } + } + return "done", nil case "failed": - return "failed" + return "failed", nil case "cancelled": - return "cancelled" + return "cancelled", nil default: - return "" + return "", nil } } @@ -2622,6 +3298,83 @@ func validateAndNormalizeJSONDefault(fieldName, value, defaultValue string) (str return compact.String(), nil } +func normalizeStringList(values []string) []string { + normalized := make([]string, 0, len(values)) + seen := make(map[string]struct{}, len(values)) + for _, value := range values { + value = strings.TrimSpace(value) + if value == "" { + continue + } + if _, ok := seen[value]; ok { + continue + } + seen[value] = struct{}{} + normalized = append(normalized, value) + } + return normalized +} + +func marshalStringList(fieldName string, values []string) (string, error) { + encoded, err := json.Marshal(values) + if err != nil { + return "", fmt.Errorf("%w: %s must be serializable", ErrInvalidInput, fieldName) + } + return string(encoded), nil +} + +func unmarshalStringList(raw string) ([]string, error) { + raw = strings.TrimSpace(raw) + if raw == "" { + raw = "[]" + } + var values []string + if err := json.Unmarshal([]byte(raw), &values); err != nil { + return nil, fmt.Errorf("%w: invalid string list JSON", ErrInvalidInput) + } + return normalizeStringList(values), nil +} + +func shouldPersistTaskSpec(input AddTaskInput, metadataJSON string, requiredChecks, allowedPaths, blockedPaths []string) bool { + return strings.TrimSpace(input.SpecFile) != "" || + strings.TrimSpace(input.SpecSHA) != "" || + strings.TrimSpace(input.SpecBody) != "" || + strings.TrimSpace(input.CheckProfile) != "" || + len(requiredChecks) > 0 || + len(allowedPaths) > 0 || + len(blockedPaths) > 0 || + strings.TrimSpace(metadataJSON) != "" && strings.TrimSpace(metadataJSON) != "{}" +} + +func normalizeCheckStatus(status string) (string, error) { + status = strings.TrimSpace(status) + switch status { + case "passed", "failed", "skipped": + return status, nil + default: + return "", fmt.Errorf("%w: check status must be one of passed, failed, skipped", ErrInvalidInput) + } +} + +func verificationSummary(nextStatus string, gate *VerificationGate, check TaskCheckRun) string { + switch nextStatus { + case "done": + return fmt.Sprintf("verification passed after %s", check.CheckName) + case "failed": + if gate != nil && len(gate.FailedChecks) > 0 { + return fmt.Sprintf("verification failed: %s", strings.Join(gate.FailedChecks, ", ")) + } + return fmt.Sprintf("verification failed after %s", check.CheckName) + case "verifying": + if gate != nil && len(gate.PendingChecks) > 0 { + return fmt.Sprintf("waiting on checks: %s", strings.Join(gate.PendingChecks, ", ")) + } + return fmt.Sprintf("recorded verification for %s", check.CheckName) + default: + return defaultString(check.Summary, fmt.Sprintf("%s: %s", check.CheckName, check.Status)) + } +} + func buildDispatchPayload(task Task, attemptNo int, workspace DispatchWorkspace) string { payload := map[string]any{ "run_id": task.RunID, @@ -2638,6 +3391,26 @@ func buildDispatchPayload(task Task, attemptNo int, workspace DispatchWorkspace) payload["acceptance"] = acceptance } } + if task.Spec != nil { + specPayload := map[string]any{ + "file": task.Spec.SpecFile, + "sha": task.Spec.SpecSHA, + "check_profile": task.Spec.CheckProfile, + "required_checks": task.Spec.RequiredChecks, + "allowed_paths": task.Spec.AllowedPaths, + "blocked_paths": task.Spec.BlockedPaths, + } + if strings.TrimSpace(task.Spec.SpecBody) != "" { + specPayload["body"] = task.Spec.SpecBody + } + if len(task.Spec.MetadataJSON) > 0 { + var metadata any + if err := json.Unmarshal(task.Spec.MetadataJSON, &metadata); err == nil { + specPayload["metadata"] = metadata + } + } + payload["spec"] = specPayload + } if strings.TrimSpace(workspace.ExecutionMode) != "" { payload["execution_mode"] = strings.TrimSpace(workspace.ExecutionMode) } diff --git a/packages/orch-runtime/internal/cli/orch/command_contracts_core_test.go b/packages/orch-runtime/internal/cli/orch/command_contracts_core_test.go index f32523e..d3099ce 100644 --- a/packages/orch-runtime/internal/cli/orch/command_contracts_core_test.go +++ b/packages/orch-runtime/internal/cli/orch/command_contracts_core_test.go @@ -1,6 +1,7 @@ package orch import ( + "os" "path/filepath" "strings" "testing" @@ -143,6 +144,98 @@ func TestOrchTaskAddRejectsInvalidPriority(t *testing.T) { assertErrorMessageContains(t, stdout, "priority must be one of low, normal, high") } +func TestOrchTaskAddSnapshotsSpecAndVerificationPolicy(t *testing.T) { + t.Parallel() + + tempDir := t.TempDir() + dbPath := filepath.Join(tempDir, "coord.db") + specFile := filepath.Join(tempDir, "task.md") + if err := os.WriteFile(specFile, []byte("# Task\n\nImplement the first verifier slice.\n"), 0o644); err != nil { + t.Fatalf("write spec file: %v", err) + } + + runOrchCommand( + t, + "--db", dbPath, + "--json", + "run", "init", + "--run", "run_blog_006", + "--goal", "Validate spec-aware task add", + ) + + taskOut := runOrchCommand( + t, + "--db", dbPath, + "--json", + "task", "add", + "--run", "run_blog_006", + "--task", "T1", + "--title", "Implement verifier", + "--spec-file", specFile, + "--check-profile", "cadence_component", + "--required-check", "lint", + "--required-check", "test", + "--allowed-path", "packages/ui", + "--blocked-path", "scripts/release-metadata.mjs", + "--metadata-json", `{"repo":"cadence-ui"}`, + ) + + var taskResp map[string]any + mustDecodeJSON(t, taskOut, &taskResp) + task := nestedValue(t, taskResp, "data", "task").(map[string]any) + spec := nestedValue(t, task, "spec").(map[string]any) + gate := nestedValue(t, task, "gate").(map[string]any) + if got, _ := spec["spec_file"].(string); got != specFile { + t.Fatalf("expected spec_file %q, got %#v", specFile, spec["spec_file"]) + } + if got, _ := spec["check_profile"].(string); got != "cadence_component" { + t.Fatalf("expected check_profile cadence_component, got %#v", spec["check_profile"]) + } + requiredChecks, ok := spec["required_checks"].([]any) + if !ok || len(requiredChecks) != 2 { + t.Fatalf("expected required_checks array, got %#v", spec["required_checks"]) + } + if got, _ := gate["status"].(string); got != "pending" { + t.Fatalf("expected pending gate, got %#v", gate["status"]) + } +} + +func TestOrchTaskAddRejectsSpecSHAMismatch(t *testing.T) { + t.Parallel() + + tempDir := t.TempDir() + dbPath := filepath.Join(tempDir, "coord.db") + specFile := filepath.Join(tempDir, "task.md") + if err := os.WriteFile(specFile, []byte("hello spec\n"), 0o644); err != nil { + t.Fatalf("write spec file: %v", err) + } + + runOrchCommand( + t, + "--db", dbPath, + "--json", + "run", "init", + "--run", "run_blog_007", + "--goal", "Validate spec sha mismatch", + ) + + stdout, _, exitCode := executeOrchCommand( + "--db", dbPath, + "--json", + "task", "add", + "--run", "run_blog_007", + "--task", "T1", + "--title", "Implement verifier", + "--spec-file", specFile, + "--spec-sha", "deadbeef", + ) + if exitCode != 30 { + t.Fatalf("expected invalid_input exit code 30, got %d\nstdout:\n%s", exitCode, stdout) + } + assertErrorJSON(t, stdout, "invalid_input") + assertErrorMessageContains(t, stdout, "spec-sha does not match spec-file contents") +} + func TestOrchReadyOrdersByPriorityAndRespectsLimit(t *testing.T) { t.Parallel() diff --git a/packages/orch-runtime/internal/cli/orch/help_contracts_test.go b/packages/orch-runtime/internal/cli/orch/help_contracts_test.go index 4c966de..c6d0d2b 100644 --- a/packages/orch-runtime/internal/cli/orch/help_contracts_test.go +++ b/packages/orch-runtime/internal/cli/orch/help_contracts_test.go @@ -98,3 +98,20 @@ func TestOrchCleanupHelpExplainsScopeFlags(t *testing.T) { t.Fatalf("expected cleanup help to include exact-attempt example, got:\n%s", combined) } } + +func TestOrchVerifyHelpExplainsGateWorkflow(t *testing.T) { + t.Parallel() + + stdout, stderr, exitCode := executeOrchCommand("verify", "--help") + if exitCode != 0 { + t.Fatalf("expected help exit 0, got %d\nstderr:\n%s\nstdout:\n%s", exitCode, stderr, stdout) + } + + combined := stdout + stderr + if !strings.Contains(combined, "Verification gate commands") { + t.Fatalf("expected verify help to explain purpose, got:\n%s", combined) + } + if !strings.Contains(combined, "before the required checks pass") { + t.Fatalf("expected verify help to mention required checks, got:\n%s", combined) + } +} diff --git a/packages/orch-runtime/internal/cli/orch/integration_test.go b/packages/orch-runtime/internal/cli/orch/integration_test.go index 4ac9b97..9982355 100644 --- a/packages/orch-runtime/internal/cli/orch/integration_test.go +++ b/packages/orch-runtime/internal/cli/orch/integration_test.go @@ -182,6 +182,183 @@ func TestOrchRunDispatchReconcileLifecycle(t *testing.T) { } } +func TestOrchVerificationGateLifecycle(t *testing.T) { + t.Parallel() + + tempDir := t.TempDir() + dbPath := filepath.Join(tempDir, "coord.db") + specFile := filepath.Join(tempDir, "task.md") + if err := os.WriteFile(specFile, []byte("# Task\n\nShip the verifier-backed component change.\n"), 0o644); err != nil { + t.Fatalf("write spec file: %v", err) + } + + runOrchCommand( + t, + "--db", dbPath, + "--json", + "run", "init", + "--run", "run_verify_001", + "--goal", "Exercise verification gates", + ) + + taskOut := runOrchCommand( + t, + "--db", dbPath, + "--json", + "task", "add", + "--run", "run_verify_001", + "--task", "T1", + "--title", "Implement verifier-backed task", + "--default-to", "worker-a", + "--spec-file", specFile, + "--check-profile", "cadence_component", + "--required-check", "lint", + "--required-check", "test", + ) + + var taskResp map[string]any + mustDecodeJSON(t, taskOut, &taskResp) + if got := nestedString(t, taskResp, "data", "task", "status"); got != "ready" { + t.Fatalf("expected new task ready, got %q", got) + } + + dispatchOut := runOrchCommand( + t, + "--db", dbPath, + "--json", + "dispatch", + "--run", "run_verify_001", + "--task", "T1", + "--execution-mode", "analysis", + "--body", "Implement the gated task.", + ) + + var dispatchResp map[string]any + mustDecodeJSON(t, dispatchOut, &dispatchResp) + threadID := nestedString(t, dispatchResp, "data", "attempt", "thread_id") + + runInboxCommand( + t, + "--db", dbPath, + "--json", + "claim", + "--agent", "worker-a", + "--thread", threadID, + ) + runInboxCommand( + t, + "--db", dbPath, + "--json", + "update", + "--agent", "worker-a", + "--thread", threadID, + "--status", "in_progress", + "--summary", "Implementation started", + ) + runOrchCommand( + t, + "--db", dbPath, + "--json", + "reconcile", + "--run", "run_verify_001", + ) + + runInboxCommand( + t, + "--db", dbPath, + "--json", + "done", + "--agent", "worker-a", + "--thread", threadID, + "--summary", "Implementation finished", + "--body", "Ready for verification.", + ) + + reconcileOut := runOrchCommand( + t, + "--db", dbPath, + "--json", + "reconcile", + "--run", "run_verify_001", + ) + + var reconcileResp map[string]any + mustDecodeJSON(t, reconcileOut, &reconcileResp) + updatedTasks := nestedArray(t, reconcileResp, "data", "updated_tasks") + task := updatedTasks[0].(map[string]any) + if got, _ := task["status"].(string); got != "verifying" { + t.Fatalf("expected task verifying after done reconcile, got %#v", task["status"]) + } + + verifyStatusOut := runOrchCommand( + t, + "--db", dbPath, + "--json", + "verify", "status", + "--run", "run_verify_001", + "--task", "T1", + ) + + var verifyStatusResp map[string]any + mustDecodeJSON(t, verifyStatusOut, &verifyStatusResp) + if got := nestedString(t, verifyStatusResp, "data", "gate", "status"); got != "pending" { + t.Fatalf("expected pending gate, got %q", got) + } + + verifyLintOut := runOrchCommand( + t, + "--db", dbPath, + "--json", + "verify", "record", + "--run", "run_verify_001", + "--task", "T1", + "--check", "lint", + "--status", "passed", + "--summary", "lint clean", + ) + + var verifyLintResp map[string]any + mustDecodeJSON(t, verifyLintOut, &verifyLintResp) + if got := nestedString(t, verifyLintResp, "data", "task", "status"); got != "verifying" { + t.Fatalf("expected task to stay verifying after first passed check, got %q", got) + } + + verifyTestOut := runOrchCommand( + t, + "--db", dbPath, + "--json", + "verify", "record", + "--run", "run_verify_001", + "--task", "T1", + "--check", "test", + "--status", "passed", + "--summary", "tests clean", + ) + + var verifyTestResp map[string]any + mustDecodeJSON(t, verifyTestOut, &verifyTestResp) + if got := nestedString(t, verifyTestResp, "data", "task", "status"); got != "done" { + t.Fatalf("expected task done after all required checks pass, got %q", got) + } + if got := nestedString(t, verifyTestResp, "data", "gate", "status"); got != "passed" { + t.Fatalf("expected passed gate after all checks pass, got %q", got) + } + + statusOut := runOrchCommand( + t, + "--db", dbPath, + "--json", + "status", + "--run", "run_verify_001", + ) + + var statusResp map[string]any + mustDecodeJSON(t, statusOut, &statusResp) + if got := nestedString(t, statusResp, "data", "run", "status"); got != "done" { + t.Fatalf("expected run status done after gate passes, got %q", got) + } +} + func TestOrchDependencyBlockedAndAnswerFlow(t *testing.T) { t.Parallel() diff --git a/packages/orch-runtime/internal/cli/orch/root.go b/packages/orch-runtime/internal/cli/orch/root.go index 59c866f..8dfaee2 100644 --- a/packages/orch-runtime/internal/cli/orch/root.go +++ b/packages/orch-runtime/internal/cli/orch/root.go @@ -16,7 +16,7 @@ func NewRootCmd() *cobra.Command { Use: "orch", Short: "Leader-facing scheduler and control plane", Long: helpLong( - "Use orch to manage leader-side scheduling for runs, tasks, dependencies, dispatch, retries, reassignment, blocked-task answers, and worktree-backed code attempts.", + "Use orch to manage leader-side scheduling for runs, tasks, dependencies, dispatch, retries, reassignment, verification gates, blocked-task answers, and worktree-backed code attempts.", "orch is the control plane; it creates durable handoff state in inbox but does not launch workers by itself.", "After dispatch, a separate worker runtime or worker agent should claim the assigned inbox thread.", "Use execution-mode analysis for thread-only work and execution-mode code for worktree-backed repository changes.", @@ -48,6 +48,7 @@ func NewRootCmd() *cobra.Command { cmd.AddCommand(newBlockedCmd(opts)) cmd.AddCommand(newAnswerCmd(opts)) cmd.AddCommand(newStatusCmd(opts)) + cmd.AddCommand(newVerifyCmd(opts)) return cmd } diff --git a/packages/orch-runtime/internal/cli/orch/task.go b/packages/orch-runtime/internal/cli/orch/task.go index a15a7d9..7e13cce 100644 --- a/packages/orch-runtime/internal/cli/orch/task.go +++ b/packages/orch-runtime/internal/cli/orch/task.go @@ -1,7 +1,11 @@ package orch import ( + "crypto/sha256" + "encoding/hex" "fmt" + "os" + "strings" "ai-workflow-skill/packages/coord-core/protocol" "ai-workflow-skill/packages/coord-core/store" @@ -17,6 +21,13 @@ type taskAddOptions struct { defaultTo string acceptanceJSON string priority string + specFile string + specSHA string + checkProfile string + requiredChecks []string + allowedPaths []string + blockedPaths []string + metadataJSON string } func newTaskCmd(root *rootOptions) *cobra.Command { @@ -42,10 +53,11 @@ func newTaskAddCmd(root *rootOptions) *cobra.Command { Short: "Add a task to a run", Long: helpLong( "Use task add to register one schedulable task inside a run.", - "Tasks may include a default worker target, priority, and optional acceptance JSON that downstream tooling can inspect.", + "Tasks may include a default worker target, priority, optional acceptance JSON, and a task-spec snapshot with verification policy.", "A task must belong to an existing run before it can become ready or be dispatched.", ), - Example: ` orch --db .agents/coord.db task add --run blog_mvp_001 --task T1 --title "Implement backend" --summary "Ship the first API slice" --default-to backend-worker --priority high`, + Example: ` orch --db .agents/coord.db task add --run blog_mvp_001 --task T1 --title "Implement backend" --summary "Ship the first API slice" --default-to backend-worker --priority high + orch --db .agents/coord.db task add --run blog_mvp_001 --task T2 --title "Polish release flow" --spec-file ./tasks/t2.md --check-profile cadence_component --required-check lint --required-check test:e2e`, RunE: func(cmd *cobra.Command, args []string) error { ctx := cmd.Context() @@ -55,6 +67,11 @@ func newTaskAddCmd(root *rootOptions) *cobra.Command { } defer sqlDB.Close() + specBody, computedSpecSHA, err := loadTaskSpecSnapshot(opts.specFile, opts.specSHA) + if err != nil { + return err + } + task, err := store.NewOrchStore(sqlDB).AddTask(ctx, store.AddTaskInput{ RunID: opts.runID, TaskID: opts.taskID, @@ -63,6 +80,14 @@ func newTaskAddCmd(root *rootOptions) *cobra.Command { DefaultTo: opts.defaultTo, AcceptanceJSON: opts.acceptanceJSON, Priority: opts.priority, + SpecFile: strings.TrimSpace(opts.specFile), + SpecSHA: computedSpecSHA, + SpecBody: specBody, + CheckProfile: strings.TrimSpace(opts.checkProfile), + RequiredChecks: opts.requiredChecks, + AllowedPaths: opts.allowedPaths, + BlockedPaths: opts.blockedPaths, + MetadataJSON: opts.metadataJSON, }) if err != nil { return err @@ -91,9 +116,40 @@ func newTaskAddCmd(root *rootOptions) *cobra.Command { cmd.Flags().StringVar(&opts.defaultTo, "default-to", "", "Default worker agent") cmd.Flags().StringVar(&opts.acceptanceJSON, "acceptance-json", "", "Acceptance criteria JSON") cmd.Flags().StringVar(&opts.priority, "priority", "normal", "Task priority") + cmd.Flags().StringVar(&opts.specFile, "spec-file", "", "Path to the task spec file to snapshot") + cmd.Flags().StringVar(&opts.specSHA, "spec-sha", "", "Optional expected SHA256 for --spec-file") + cmd.Flags().StringVar(&opts.checkProfile, "check-profile", "", "Verification check profile name") + cmd.Flags().StringSliceVar(&opts.requiredChecks, "required-check", nil, "Required verification check name; repeat for multiple checks") + cmd.Flags().StringSliceVar(&opts.allowedPaths, "allowed-path", nil, "Allowed path prefix for task scope; repeat for multiple paths") + cmd.Flags().StringSliceVar(&opts.blockedPaths, "blocked-path", nil, "Blocked path prefix for task scope; repeat for multiple paths") + cmd.Flags().StringVar(&opts.metadataJSON, "metadata-json", "", "Structured metadata JSON for task policy") _ = cmd.MarkFlagRequired("run") _ = cmd.MarkFlagRequired("task") _ = cmd.MarkFlagRequired("title") return cmd } + +func loadTaskSpecSnapshot(specFile, expectedSHA string) (string, string, error) { + specFile = strings.TrimSpace(specFile) + expectedSHA = strings.TrimSpace(expectedSHA) + if specFile == "" { + if expectedSHA != "" { + return "", "", fmt.Errorf("%w: spec-sha requires spec-file", store.ErrInvalidInput) + } + return "", "", nil + } + + body, err := os.ReadFile(specFile) + if err != nil { + return "", "", protocol.InvalidInput("failed to read spec-file", err) + } + + sum := sha256.Sum256(body) + computed := hex.EncodeToString(sum[:]) + if expectedSHA != "" && !strings.EqualFold(expectedSHA, computed) { + return "", "", fmt.Errorf("%w: spec-sha does not match spec-file contents", store.ErrInvalidInput) + } + + return string(body), computed, nil +} diff --git a/packages/orch-runtime/internal/cli/orch/verify.go b/packages/orch-runtime/internal/cli/orch/verify.go new file mode 100644 index 0000000..a4da25b --- /dev/null +++ b/packages/orch-runtime/internal/cli/orch/verify.go @@ -0,0 +1,195 @@ +package orch + +import ( + "fmt" + + "ai-workflow-skill/packages/coord-core/protocol" + "ai-workflow-skill/packages/coord-core/store" + + "github.com/spf13/cobra" +) + +type verifyRecordOptions struct { + runID string + taskID string + attemptNo int + checkName string + status string + summary string + body string + bodyFile string + metadataJSON string + recordedBy string +} + +type verifyStatusOptions struct { + runID string + taskID string + attemptNo int +} + +func newVerifyCmd(root *rootOptions) *cobra.Command { + cmd := &cobra.Command{ + Use: "verify", + Short: "Verification gate commands", + Long: helpLong( + "Verification gate commands record post-implementation checks and inspect task verification state.", + "Verification gates keep orch from treating a worker's done signal as final completion before the required checks pass.", + ), + Example: ` orch --db .agents/coord.db verify record --run blog_mvp_001 --task T1 --check lint --status passed --summary "lint clean" + orch --db .agents/coord.db verify status --run blog_mvp_001 --task T1`, + } + + cmd.AddCommand(newVerifyRecordCmd(root)) + cmd.AddCommand(newVerifyStatusCmd(root)) + return cmd +} + +func newVerifyRecordCmd(root *rootOptions) *cobra.Command { + opts := &verifyRecordOptions{} + + cmd := &cobra.Command{ + Use: "record", + Short: "Record one verification check result for a task attempt", + Long: helpLong( + "Use verify record after reconcile has moved a task into verifying, or when a prior verification failure needs an updated check result.", + "record upserts one named check result for the selected attempt, then recomputes the task gate and task status.", + ), + Example: ` orch --db .agents/coord.db verify record --run blog_mvp_001 --task T1 --check lint --status passed --summary "lint clean" + orch --db .agents/coord.db verify record --run blog_mvp_001 --task T1 --check package-consumer --status failed --summary "consumer smoke failed" --body-file ./artifacts/package-consumer.txt`, + RunE: func(cmd *cobra.Command, args []string) error { + ctx := cmd.Context() + + sqlDB, err := openOrchDB(ctx, root.dbPath) + if err != nil { + return err + } + defer sqlDB.Close() + + body, err := resolveBodyValue(opts.body, opts.bodyFile) + if err != nil { + return err + } + + result, err := store.NewOrchStore(sqlDB).RecordCheck(ctx, store.VerifyRecordInput{ + RunID: opts.runID, + TaskID: opts.taskID, + AttemptNo: opts.attemptNo, + CheckName: opts.checkName, + Status: opts.status, + Summary: opts.summary, + Body: body, + MetadataJSON: opts.metadataJSON, + RecordedBy: opts.recordedBy, + }) + if err != nil { + return err + } + + resp := protocol.Success{ + OK: true, + Command: "verify record", + Data: map[string]any{ + "task": result.Task, + "attempt": result.Attempt, + "check": result.Check, + "gate": result.Gate, + }, + } + if root.json { + return protocol.WriteJSON(cmd.OutOrStdout(), resp) + } + + _, err = fmt.Fprintf( + cmd.OutOrStdout(), + "recorded check %s=%s for %s/%s attempt %d\n", + result.Check.CheckName, + result.Check.Status, + result.Task.RunID, + result.Task.TaskID, + result.Attempt.AttemptNo, + ) + return err + }, + } + + cmd.Flags().StringVar(&opts.runID, "run", "", "Run ID") + cmd.Flags().StringVar(&opts.taskID, "task", "", "Task ID") + cmd.Flags().IntVar(&opts.attemptNo, "attempt", 0, "Attempt number; defaults to the latest attempt") + cmd.Flags().StringVar(&opts.checkName, "check", "", "Check name") + cmd.Flags().StringVar(&opts.status, "status", "", "Check status: passed, failed, or skipped") + cmd.Flags().StringVar(&opts.summary, "summary", "", "Short verification summary") + cmd.Flags().StringVar(&opts.body, "body", "", "Optional verification details") + cmd.Flags().StringVar(&opts.bodyFile, "body-file", "", "Optional path to a verification details file") + cmd.Flags().StringVar(&opts.metadataJSON, "metadata-json", "", "Structured metadata JSON for the check result") + cmd.Flags().StringVar(&opts.recordedBy, "recorded-by", "orch", "Recorder identity for the check result") + _ = cmd.MarkFlagRequired("run") + _ = cmd.MarkFlagRequired("task") + _ = cmd.MarkFlagRequired("check") + _ = cmd.MarkFlagRequired("status") + + return cmd +} + +func newVerifyStatusCmd(root *rootOptions) *cobra.Command { + opts := &verifyStatusOptions{} + + cmd := &cobra.Command{ + Use: "status", + Short: "Show the current verification state for one task", + Long: helpLong( + "Use verify status to inspect the task spec snapshot, the selected attempt, and the current gate state in one response.", + "Prefer this when a task is stuck in verifying or failed because of gate results.", + ), + Example: ` orch --db .agents/coord.db verify status --run blog_mvp_001 --task T1 + orch --db .agents/coord.db --json verify status --run blog_mvp_001 --task T1 --attempt 2`, + RunE: func(cmd *cobra.Command, args []string) error { + ctx := cmd.Context() + + sqlDB, err := openOrchDB(ctx, root.dbPath) + if err != nil { + return err + } + defer sqlDB.Close() + + result, err := store.NewOrchStore(sqlDB).GetVerificationStatus(ctx, store.VerificationStatusInput{ + RunID: opts.runID, + TaskID: opts.taskID, + AttemptNo: opts.attemptNo, + }) + if err != nil { + return err + } + + resp := protocol.Success{ + OK: true, + Command: "verify status", + Data: map[string]any{ + "task": result.Task, + "attempt": result.Attempt, + "spec": result.Spec, + "gate": result.Gate, + }, + } + if root.json { + return protocol.WriteJSON(cmd.OutOrStdout(), resp) + } + + if result.Gate == nil { + _, err = fmt.Fprintf(cmd.OutOrStdout(), "%s/%s has no verification gate\n", result.Task.RunID, result.Task.TaskID) + return err + } + + _, err = fmt.Fprintf(cmd.OutOrStdout(), "%s/%s verification status %s\n", result.Task.RunID, result.Task.TaskID, result.Gate.Status) + return err + }, + } + + cmd.Flags().StringVar(&opts.runID, "run", "", "Run ID") + cmd.Flags().StringVar(&opts.taskID, "task", "", "Task ID") + cmd.Flags().IntVar(&opts.attemptNo, "attempt", 0, "Attempt number; defaults to the latest attempt") + _ = cmd.MarkFlagRequired("run") + _ = cmd.MarkFlagRequired("task") + + return cmd +}