# Orch CLI
## Purpose
`orch` is the leader-facing scheduler and control plane. It owns the run, task graph, task specs, dependencies, verification gates, ready queue, dispatch decisions, retries, and reassignment logic.
`orch` does not replace `inbox`. It uses `inbox` as the durable transport and execution record.
In normal operation:
- leaders use `orch`
- `orch` creates and monitors `inbox` threads
- workers continue using `inbox`
- a separate worker runtime or worker agent must still consume the assigned inbox thread after `dispatch`
- Codex-specific launch bridges may sit above `orch`, but they should consume dispatch output rather than change the CLI contract
## Responsibilities
`orch` is responsible for:
- creating a run for one user request or project
- defining tasks and dependencies
- snapshotting task specs and per-task verification policy
- calculating which tasks are ready
- dispatching ready tasks to workers
- tracking attempts and mapping them to inbox threads
- allocating attempt worktrees for code tasks
- aggregating post-implementation check results into a task verification gate
- surfacing blocked tasks to the leader
- sending answers back into the active inbox thread
- reconciling thread state into task state
- blocking until actionable events arrive for the leader
- retrying, reassigning, cancelling, or adding follow-up tasks
## Non-Responsibilities
`orch` should not implement:
- worker claiming
- direct worker polling
- automatic worker-runtime launch
- raw message append storage
- low-level thread history management
Those belong to `inbox`.
## Core Objects
- `run`: one coordinated execution for a user request
- `task`: one schedulable unit of work
- `dependency`: an edge between tasks
- `attempt`: one execution try for a task
- `dispatch`: the act of materializing a task into an inbox thread
- `workspace`: the branch and worktree assigned to one code-writing attempt
## Workspace Model
For `--execution-mode code`, `orch` should allocate one Git worktree per attempt.
Strict policy:
- dispatch from a concrete committed `base_ref`
- fail dispatch if the leader is implicitly relying on uncommitted state
- create a fresh worktree for every retry
- do not let workers edit the user's primary checkout
See [worktree-execution.md](worktree-execution.md) for the full execution model.
## Task State Model
- `planned`: task exists but is not yet eligible for dispatch
- `ready`: dependencies are satisfied and it can be dispatched
- `dispatched`: an inbox thread exists but the worker has not started yet
- `running`: the task has been claimed and is actively executing
- `blocked`: the active attempt needs clarification or an external dependency
- `verifying`: the worker reported completion, but required verification checks have not all passed yet
- `done`: task completed and passed its current acceptance gate
- `failed`: task completed unsuccessfully
- `cancelled`: task was cancelled and should not continue
Suggested transitions:
- `planned -> ready`
- `ready -> dispatched`
- `dispatched -> running`
- `running -> blocked`
- `blocked -> running`
- `running -> verifying` when the worker reports `done` and the task has required checks
- `running -> done` when the worker reports `done` and the task has no required checks
- `verifying -> done`
- `verifying -> failed`
- `running -> failed`
- `failed -> ready` through explicit retry
- `* -> cancelled` by leader action
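The transition list above can be read as a small lookup table. The following is a minimal sketch, not the `orch` implementation; the names `ALLOWED` and `can_transition` are hypothetical, and treating `cancelled -> cancelled` as a non-transition is an assumption.

```python
# Allowed task state transitions, transcribed from the list above.
ALL_STATES = {
    "planned", "ready", "dispatched", "running", "blocked",
    "verifying", "done", "failed", "cancelled",
}

ALLOWED = {
    "planned": {"ready"},
    "ready": {"dispatched"},
    "dispatched": {"running"},
    "running": {"blocked", "verifying", "done", "failed"},
    "blocked": {"running"},
    "verifying": {"done", "failed"},
    "failed": {"ready"},  # only through an explicit retry
    "done": set(),
    "cancelled": set(),
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the transition list permits current -> target."""
    if current not in ALL_STATES or target not in ALL_STATES:
        raise ValueError(f"unknown state: {current!r} -> {target!r}")
    if target == "cancelled":
        # `* -> cancelled` by leader action.
        # Assumption: cancelling an already cancelled task is not a transition.
        return current != "cancelled"
    return target in ALLOWED[current]
```

A rejected transition would presumably map to exit code `30` (invalid state transition), though this sketch only reports a boolean.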
## Leader Workflow
The normal leader loop is:
1. create a run
2. add tasks
3. add dependencies
4. inspect `ready`
5. `dispatch` tasks
6. arrange or launch a separate worker runtime that consumes the assigned inbox threads
7. use `status` for the current operational view; it reconciles first and includes latest attempt, message, and gate context
8. if a task enters `verifying`, record check results with `verify record` and inspect gate state with `verify status`
9. inspect `blocked`
10. answer blocked questions
11. if nothing is actionable, call `wait`
12. retry or reassign failures when needed
13. finish when all required tasks are `done`
The leader should block on `orch wait`, not on ad hoc `sleep`.
## CLI Surface
The binary name is `orch`.
Built-in help should be sufficient for first use:
- root help should explain the leader role and the normal run -> task -> dep -> ready -> dispatch -> wait/status loop
- command help should explain the scheduling contract, not just list flags
- `dispatch` help should explicitly explain `--execution-mode analysis|code` and which flags only apply to code mode
- high-frequency commands should include concrete examples that can be copied directly
- `status` help should explain that it is the main operational dashboard command
- `blocked` help should explain that it is the compact queue to inspect before `answer`
- `cleanup` help should explain how `--task`, `--attempt`, and `--all-completed` change cleanup scope
### Global Flags
- `--db PATH`
- `--json`
### `orch run init`
Create a new run.
Suggested flags:
- `--run RUN_ID`
- `--goal TEXT`
- `--summary TEXT`
Example:
```bash
orch run init --db .agents/coord.db --run blog_mvp_001 --goal "Build blog MVP" --summary "Public blog plus admin CRUD"
```
### `orch run show`
Show run metadata and current aggregate status.
Suggested flags:
- `--run RUN_ID`
### `orch task add`
Add a task to a run.
Suggested flags:
- `--run RUN_ID`
- `--task TASK_ID`
- `--title TEXT`
- `--summary TEXT`
- `--default-to AGENT`
- `--acceptance-json STRING`
- `--priority low|normal|high`
- `--spec-file PATH`
- `--spec-sha SHA256`
- `--check-profile NAME`
- `--required-check NAME` repeatable
- `--allowed-path PATH` repeatable
- `--blocked-path PATH` repeatable
- `--metadata-json STRING`
### `orch dep add`
Add a dependency edge.
Suggested flags:
- `--run RUN_ID`
- `--task TASK_ID`
- `--depends-on TASK_ID`
### `orch ready`
List tasks ready for dispatch.
Suggested flags:
- `--run RUN_ID`
- `--limit N`
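As an illustration of how readiness might be calculated, here is a minimal sketch. It assumes a dependency counts as satisfied only when it is `done`, and it sorts by task id for determinism; the function name `ready_tasks` and both assumptions are hypothetical, and a real implementation would likely also order by priority.

```python
from typing import Dict, List, Set


def ready_tasks(status: Dict[str, str], deps: Dict[str, Set[str]]) -> List[str]:
    """Return planned tasks whose dependencies are all `done`, sorted by id.

    status maps task_id -> task state.
    deps maps task_id -> the set of task ids it depends on.
    """
    ready = []
    for task_id, state in status.items():
        if state != "planned":
            continue  # only planned tasks can become ready
        if all(status.get(dep) == "done" for dep in deps.get(task_id, set())):
            ready.append(task_id)
    return sorted(ready)


status = {"T1": "done", "T2": "planned", "T3": "planned", "T4": "planned"}
deps = {"T2": {"T1"}, "T3": {"T1", "T2"}}
print(ready_tasks(status, deps))  # T2 and T4 qualify; T3 still waits on T2
```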
### `orch dispatch`
Dispatch a ready task to a worker by creating an inbox thread and the first task message.
Suggested flags:
- `--run RUN_ID`
- `--task TASK_ID`
- `--execution-mode analysis|code`
- `--to AGENT`
- `--repo-path PATH`
- `--base-ref REF`
- `--workspace-root PATH`
- `--body TEXT`
- `--body-file PATH`
Behavior:
- creates a new attempt
- requires the caller to choose `--execution-mode analysis|code`
- in `analysis` mode, stays thread-only and does not allocate a worktree
- in `code` mode, resolves the source repository from `--repo-path` or the current working directory
- in `code` mode, resolves a committed base revision
- in `code` mode, creates a branch and worktree for the attempt
- creates or links an `inbox` thread
- writes `execution_mode` into the inbox task payload
- includes the task spec snapshot and verification policy in the dispatch payload
- for code tasks, writes workspace metadata into attempt storage and the task payload
- moves the task to `dispatched`
- does not start a worker runtime on its own
Integration note:
- a higher-level Codex bridge may save this JSON output, render a worker brief, and then spawn a worker sub-agent
- that bridge should remain outside the core `orch` runtime so the scheduling contract stays portable
Code-mode recommendation:
- if `--base-ref` is omitted and the repository is clean, default to `HEAD`
- if `--base-ref` is omitted and the repository is dirty, fail dispatch
- if `--base-ref` is provided, resolve it to a commit and use it exactly
- if `--workspace-root` is omitted in `code` mode, default to `.orch/worktrees` under the source repository
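The code-mode defaults above can be sketched as two small pure functions. This is only an illustration: `choose_base_ref`, `attempt_workspace`, and `DispatchError` are hypothetical names, the dirty flag is passed in rather than read from Git, and the branch and path layout is inferred from the JSON example later in this document rather than from a confirmed naming rule.

```python
from pathlib import PurePosixPath
from typing import Optional, Tuple


class DispatchError(Exception):
    """Raised when code-mode dispatch preconditions are not met."""


def choose_base_ref(base_ref: Optional[str], repo_is_dirty: bool) -> str:
    """Pick the ref to dispatch from, following the recommendation above."""
    if base_ref:
        return base_ref  # provided explicitly: use it exactly
    if repo_is_dirty:
        raise DispatchError("repository is dirty; commit first or pass --base-ref")
    return "HEAD"


def attempt_workspace(
    run_id: str, task_id: str, attempt_no: int, workspace_root: Optional[str] = None
) -> Tuple[str, str]:
    """Return (branch_name, worktree_path) for one attempt.

    Assumption: the layout mirrors the JSON example in this document.
    """
    root = PurePosixPath(workspace_root or ".orch/worktrees")
    suffix = f"{run_id}/{task_id}/attempt-{attempt_no}"
    return f"orch/{suffix}", str(root / suffix)
```

Because every retry gets a new `attempt_no`, this layout would give each retry a fresh worktree, which matches the strict policy in the workspace model.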
### `orch reconcile`
Read inbox state and update run/task state.
Suggested flags:
- `--run RUN_ID`
Behavior:
- maps inbox `claimed` or `in_progress` to `running`
- maps inbox `blocked` to `blocked`
- maps inbox `done` to `verifying` when the task has required checks
- maps inbox `done` to `done` when the task has no required checks
- maps inbox `failed` to `failed`
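The mapping above is small enough to sketch as one function. The name `map_inbox_status` is hypothetical, and returning `None` for inbox statuses that the list does not mention is an assumption.

```python
from typing import Optional


def map_inbox_status(inbox_status: str, has_required_checks: bool) -> Optional[str]:
    """Translate an inbox thread status into a task state, or None if unmapped."""
    if inbox_status in ("claimed", "in_progress"):
        return "running"
    if inbox_status == "blocked":
        return "blocked"
    if inbox_status == "done":
        # Completion only counts as `done` when no verification gate applies.
        return "verifying" if has_required_checks else "done"
    if inbox_status == "failed":
        return "failed"
    # Assumption: any other inbox status leaves the task state unchanged.
    return None
```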
### `orch blocked`
List blocked tasks and their latest question.
Suggested flags:
- `--run RUN_ID`
### `orch verify record`
Record or update one verification check result for a task attempt (the latest attempt by default).
Suggested flags:
- `--run RUN_ID`
- `--task TASK_ID`
- `--attempt N` optional; defaults to latest
- `--check NAME`
- `--status passed|failed|skipped`
- `--summary TEXT`
- `--body TEXT`
- `--body-file PATH`
- `--metadata-json STRING`
- `--recorded-by NAME`
Behavior:
- upserts one named check result for the selected attempt
- emits a verification-recorded event
- recomputes the gate for the task
- keeps the task in `verifying` while required checks are still pending
- moves the task to `done` when all required checks pass
- moves the task to `failed` when one or more required checks fail
- refreshes dependent readiness immediately when the task enters or leaves `done`, so newly unblocked work emits `task_ready` in the same flow
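The gate recomputation described above might look like the following sketch. The function name `compute_gate` is hypothetical. How a `skipped` result on a required check is treated is not specified here, so this sketch assumes it does not count as passed and the task stays in `verifying`.

```python
from typing import Dict, Iterable


def compute_gate(required: Iterable[str], results: Dict[str, str]) -> str:
    """Return the task state implied by the recorded check results.

    required lists the required check names.
    results maps check name -> "passed" | "failed" | "skipped".
    """
    required = list(required)
    if any(results.get(name) == "failed" for name in required):
        return "failed"  # one or more required checks failed
    if all(results.get(name) == "passed" for name in required):
        return "done"  # every required check passed (or none are required)
    # Assumption: missing or skipped required checks keep the gate open.
    return "verifying"
```

Results for checks that are not in the required list are ignored by this sketch, so an optional check that fails would not fail the task.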
### `orch verify status`
Show the current verification state for one task.
Suggested flags:
- `--run RUN_ID`
- `--task TASK_ID`
- `--attempt N` optional; defaults to latest
Behavior:
- returns the task, selected attempt, task spec snapshot, and current gate state
- helps the leader inspect which required checks are still pending or failed
### `orch wait`
Block until one or more run-scoped events become available.
This is the normal wait primitive for the interactive leader.
Suggested flags:
- `--run RUN_ID`
- `--for task_ready,task_blocked,task_verifying,task_done,task_failed`
- `--after-event EVENT_ID`
- `--timeout-seconds N`
Behavior:
- blocks until a later matching event exists
- reconciles inbox state while polling so worker thread transitions can surface as `task_*` events
- returns a cursor for the next wait
- lets the leader wait for worker activity without manual sleep loops
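The cursor behavior above can be sketched as a loop that feeds each returned cursor into the next call. This is an illustration only: `build_wait_argv`, `run_wait`, and `follow_events` are hypothetical helpers, the field names come from the suggested `wait` wake shape in the JSON contract, and the assumption that a timeout returns `"woke": false` is not confirmed by this document.

```python
import json
import subprocess
from typing import Callable, Iterator, List


def build_wait_argv(run_id: str, kinds: List[str], after_event: int,
                    timeout_seconds: int) -> List[str]:
    """Build the `orch wait` command line from the flags listed above."""
    return [
        "orch", "wait", "--run", run_id,
        "--for", ",".join(kinds),
        "--after-event", str(after_event),
        "--timeout-seconds", str(timeout_seconds),
        "--json",
    ]


def run_wait(argv: List[str]) -> dict:
    """Run the command and parse its JSON output (needs `orch` on PATH)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return json.loads(proc.stdout)


def follow_events(wait_once: Callable[[int], dict], after_event: int = 0,
                  max_polls: int = 3) -> Iterator[dict]:
    """Yield events from repeated waits, advancing the cursor each time."""
    cursor = after_event
    for _ in range(max_polls):
        reply = wait_once(cursor)
        if not reply.get("woke"):
            continue  # assumption: a timeout returns woke=false and no events
        for event in reply.get("events", []):
            yield event
        cursor = reply.get("next_event_id", cursor)
```

In real use, `wait_once` could be `lambda cursor: run_wait(build_wait_argv(run_id, kinds, cursor, 900))`; passing it in as a callable keeps the loop testable without the binary.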
### `orch answer`
Answer the active blocked question for a task by writing into the mapped inbox thread.
Suggested flags:
- `--run RUN_ID`
- `--task TASK_ID`
- `--body TEXT`
- `--body-file PATH`
- `--payload-json STRING`
### `orch retry`
Explicitly retry a failed task.
Suggested flags:
- `--run RUN_ID`
- `--task TASK_ID`
- `--to AGENT`
- `--body TEXT`
- `--body-file PATH`
Behavior:
- creates a new attempt
- links the retry to the prior failed attempt
- dispatches a new inbox thread or fresh task message
### `orch reassign`
Move a blocked or failed task to another worker.
Suggested flags:
- `--run RUN_ID`
- `--task TASK_ID`
- `--to AGENT`
- `--reason TEXT`
### `orch cancel`
Cancel a task or an entire run.
Suggested flags:
- `--run RUN_ID`
- `--task TASK_ID`
- `--reason TEXT`
### `orch cleanup`
Remove completed or abandoned attempt worktrees that are no longer needed.
Suggested flags:
- `--run RUN_ID`
- `--task TASK_ID`
- `--attempt N`
- `--all-completed`
- `--force`
### `orch status`
Show task state summary for the run.
Suggested flags:
- `--run RUN_ID`
Behavior:
- reconciles inbox thread state before returning the view
- returns run aggregate counts plus per-task detail
- includes the latest attempt for each task when one exists
- includes the latest thread message for each task when one exists
- includes the latest blocked question for blocked tasks so the leader can inspect the current issue without a separate `blocked` call in the common case
### `orch council start`
Start a three-reviewer council workflow for one target.
Suggested flags:
- `--run RUN_ID`
- `--target TEXT`
- `--target-file PATH`
- `--repo-path PATH`
- `--task-id TASK_ID`
- `--target-type text|repo|mixed`
- `--mode brainstorm|review`
- `--output markdown|json|both`
- `--only-unanimous`
Default behavior:
- fixed reviewer roles: `architecture-reviewer`, `implementation-reviewer`, `risk-reviewer`
- analysis only
- `--target-type mixed`
- `--output both`
- unanimous-only disabled unless requested
- reviewer count fixed at `3` in v1
### `orch council wait`
Block until the council has enough reviewer responses to continue.
Suggested flags:
- `--run RUN_ID`
- `--timeout-seconds N`
### `orch council tally`
Group similar reviewer suggestions and compute support counts.
Suggested flags:
- `--run RUN_ID`
- `--similarity strict|normal`
Behavior:
- groups semantically similar reviewer proposals
- assigns `consensus`, `majority`, or `minority`
- persists grouped recommendations in `orch` storage
Default behavior:
- `--similarity normal`
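The label assignment can be sketched once proposals are already grouped. The thresholds below are an assumption, since this document names the labels but not the cut-offs: all reviewers agreeing is `consensus`, more than half is `majority`, and anything else is `minority`. The name `classify_support` is hypothetical, and the semantic grouping step itself is out of scope for this sketch.

```python
def classify_support(supporters: int, reviewer_count: int = 3) -> str:
    """Label a grouped recommendation by how many reviewers support it."""
    if not 1 <= supporters <= reviewer_count:
        raise ValueError("supporters must be between 1 and reviewer_count")
    if supporters == reviewer_count:
        return "consensus"
    if supporters * 2 > reviewer_count:
        return "majority"
    return "minority"
```

With the fixed v1 reviewer count of `3`, this yields `consensus` for 3 supporters, `majority` for 2, and `minority` for 1.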
### `orch council report`
Render the final grouped council output.
Suggested flags:
- `--run RUN_ID`
- `--show consensus|majority|minority|all` (accepts a comma-separated list such as `consensus,majority`)
Default behavior:
- show `consensus,majority`
- preserve `minority` in persisted storage even if omitted from the main report
- support both markdown artifacts and JSON output
## Relationship To Inbox
`orch` should be implemented as a control plane on top of `inbox`.
- `orch dispatch` writes the first `task` message into `inbox`
- `orch dispatch` also writes `execution_mode` into the inbox payload and writes worktree metadata for code tasks into the attempt record and inbox payload
- workers claim and update status through `inbox`
- `orch reconcile` reads thread state and converts it into task state
- `orch answer` writes an inbox `answer` message to the active thread
The leader should not need to hand-write `inbox send` during normal dispatch.
Higher-level workflows such as council review should also run on top of `orch`, not as a separate infrastructure layer. See [council-review.md](council-review.md).
## Waiting Model
The leader does not receive worker output as an in-memory push. Instead:
- workers write updates into `inbox`
- `inbox` appends events
- `orch reconcile` converts thread state into task state
- `orch wait` blocks on the run-scoped event stream
This is still a single-leader model. `orch wait` is just the leader's blocking read primitive.
## JSON Contract
Every command should support `--json`.
Suggested success shape:
```json
{
  "ok": true,
  "command": "dispatch",
  "run_id": "blog_mvp_001",
  "task": {
    "task_id": "T4",
    "status": "dispatched",
    "assigned_to": "backend-worker"
  },
  "attempt": {
    "attempt_no": 1,
    "thread_id": "thr_987",
    "base_ref": "main",
    "base_commit": "abc1234",
    "branch_name": "orch/blog_mvp_001/T4/attempt-1",
    "worktree_path": ".orch/worktrees/blog_mvp_001/T4/attempt-1"
  },
  "message": {
    "kind": "task",
    "payload_json": {
      "execution_mode": "code"
    }
  }
}
```
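A bridge that consumes dispatch output might parse the success shape into a typed record. The sketch below assumes the suggested field names above and assumes that failures carry `"ok": false`, which this document does not spell out; `DispatchResult` and `parse_dispatch` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class DispatchResult:
    run_id: str
    task_id: str
    status: str
    thread_id: str
    execution_mode: str
    worktree_path: Optional[str]  # absent for analysis-mode dispatches


def parse_dispatch(raw: str) -> DispatchResult:
    """Parse `orch dispatch --json` output that follows the suggested shape."""
    doc = json.loads(raw)
    if not doc.get("ok"):
        # Assumption: a failed command reports ok=false.
        raise RuntimeError("dispatch did not succeed")
    attempt = doc["attempt"]
    return DispatchResult(
        run_id=doc["run_id"],
        task_id=doc["task"]["task_id"],
        status=doc["task"]["status"],
        thread_id=attempt["thread_id"],
        execution_mode=doc["message"]["payload_json"]["execution_mode"],
        worktree_path=attempt.get("worktree_path"),
    )
```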
Suggested `wait` wake shape:
```json
{
  "ok": true,
  "command": "wait",
  "woke": true,
  "next_event_id": 127,
  "events": [
    {
      "event_id": 127,
      "type": "task_blocked",
      "run_id": "blog_mvp_001",
      "task_id": "T5",
      "thread_id": "thr_t5_attempt1",
      "summary": "Editor choice undecided",
      "payload": {
        "question": "Should the admin editor use a rich text editor or plain textarea in MVP?"
      }
    }
  ]
}
```
## Exit Codes
- `0`: success
- `10`: no ready or matching tasks
- `20`: conflict
- `30`: invalid input or invalid state transition
- `40`: not found
- `50`: storage or internal error
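A caller scripting around `orch` could name these codes instead of comparing raw integers. The enum below is a sketch; the member names and the `describe_exit` helper are hypothetical, while the numeric values are the ones listed above.

```python
from enum import IntEnum


class OrchExit(IntEnum):
    SUCCESS = 0
    NO_MATCH = 10      # no ready or matching tasks
    CONFLICT = 20
    INVALID = 30       # invalid input or invalid state transition
    NOT_FOUND = 40
    INTERNAL = 50      # storage or internal error


def describe_exit(code: int) -> str:
    """Return the symbolic name for an exit code, or UNKNOWN if unlisted."""
    try:
        return OrchExit(code).name
    except ValueError:
        return "UNKNOWN"
```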
## SQLite Schema Draft
These tables should live in the same SQLite file as the inbox tables.
```sql
CREATE TABLE IF NOT EXISTS runs (
  run_id TEXT PRIMARY KEY,
  goal TEXT NOT NULL,
  summary TEXT NOT NULL DEFAULT '',
  status TEXT NOT NULL DEFAULT 'active',
  created_at TEXT NOT NULL,
  updated_at TEXT NOT NULL
);

CREATE TABLE IF NOT EXISTS tasks (
  run_id TEXT NOT NULL,
  task_id TEXT NOT NULL,
  title TEXT NOT NULL,
  summary TEXT NOT NULL DEFAULT '',
  status TEXT NOT NULL,
  default_to TEXT,
  priority TEXT NOT NULL DEFAULT 'normal',
  acceptance_json TEXT NOT NULL DEFAULT '[]',
  latest_attempt_no INTEGER,
  created_at TEXT NOT NULL,
  updated_at TEXT NOT NULL,
  PRIMARY KEY (run_id, task_id),
  FOREIGN KEY(run_id) REFERENCES runs(run_id)
);

CREATE TABLE IF NOT EXISTS task_dependencies (
  run_id TEXT NOT NULL,
  task_id TEXT NOT NULL,
  depends_on_task_id TEXT NOT NULL,
  PRIMARY KEY (run_id, task_id, depends_on_task_id)
);

CREATE TABLE IF NOT EXISTS task_attempts (
  run_id TEXT NOT NULL,
  task_id TEXT NOT NULL,
  attempt_no INTEGER NOT NULL,
  assigned_to TEXT NOT NULL,
  thread_id TEXT NOT NULL,
  base_ref TEXT,
  base_commit TEXT,
  branch_name TEXT,
  worktree_path TEXT,
  workspace_status TEXT,
  result_commit TEXT,
  status TEXT NOT NULL,
  created_at TEXT NOT NULL,
  updated_at TEXT NOT NULL,
  PRIMARY KEY (run_id, task_id, attempt_no)
);

CREATE INDEX IF NOT EXISTS idx_tasks_run_status
  ON tasks(run_id, status, priority, updated_at);

CREATE TABLE IF NOT EXISTS events (
  event_id INTEGER PRIMARY KEY AUTOINCREMENT,
  run_id TEXT NOT NULL,
  task_id TEXT NOT NULL,
  thread_id TEXT,
  source TEXT NOT NULL,
  event_type TEXT NOT NULL,
  message_id TEXT,
  summary TEXT NOT NULL DEFAULT '',
  payload_json TEXT NOT NULL DEFAULT '{}',
  created_at TEXT NOT NULL
);

CREATE INDEX IF NOT EXISTS idx_events_run_event
  ON events(run_id, event_id);
```
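To show how the `events` table could back the `wait` cursor, here is a minimal sketch using Python's standard `sqlite3` module. The query is an assumption about how `orch wait` might read the table, not a description of its implementation; `events_after` is a hypothetical helper, and the DDL is copied from the draft above.

```python
import sqlite3
from typing import List, Sequence, Tuple

EVENTS_DDL = """
CREATE TABLE IF NOT EXISTS events (
  event_id INTEGER PRIMARY KEY AUTOINCREMENT,
  run_id TEXT NOT NULL,
  task_id TEXT NOT NULL,
  thread_id TEXT,
  source TEXT NOT NULL,
  event_type TEXT NOT NULL,
  message_id TEXT,
  summary TEXT NOT NULL DEFAULT '',
  payload_json TEXT NOT NULL DEFAULT '{}',
  created_at TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_events_run_event
  ON events(run_id, event_id);
"""


def events_after(conn: sqlite3.Connection, run_id: str, after_event: int,
                 kinds: Sequence[str]) -> List[Tuple]:
    """Return matching events for one run with event_id > after_event."""
    if not kinds:
        return []
    placeholders = ",".join("?" for _ in kinds)
    sql = (
        "SELECT event_id, task_id, event_type, summary FROM events "
        "WHERE run_id = ? AND event_id > ? "
        f"AND event_type IN ({placeholders}) "
        "ORDER BY event_id"
    )
    return conn.execute(sql, (run_id, after_event, *kinds)).fetchall()
```

The `(run_id, event_id)` index lets this range scan stay cheap as the event log grows, and the largest returned `event_id` can serve as the next `--after-event` cursor.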
## Embedded Skill Draft
The following block is a draft `SKILL.md` for the leader-facing `orch` skill.
````markdown
```markdown
---
name: orch
description: Use this skill when the leader needs to plan and schedule work through the orch CLI. It is for creating runs, adding tasks and dependencies, finding ready work, dispatching tasks to workers, allocating task worktrees, reconciling inbox state, waiting for worker events, reviewing blocked tasks, answering them, retrying failures, reassigning work, and cleaning up attempt worktrees. Do not use this skill for worker-side claim or progress updates; use inbox for that.
---
# Orch
Use this skill when you are the leader and need to control the task graph through the `orch` CLI.
## When To Use
- you need to decompose a goal into tasks
- you need to record dependencies
- you need to know which tasks are ready
- you need to dispatch work to workers
- you need to allocate isolated worktrees for code-writing tasks
- you need to inspect blocked tasks and answer them
- you need to retry or reassign a failed task
## Rules
- Prefer `orch` over hand-written `inbox send` for normal leader operations.
- Reconcile inbox state before making new dispatch decisions.
- If nothing is actionable, use `orch wait` instead of manual sleep loops.
- For code tasks, dispatch from a committed base and allocate a fresh worktree per attempt.
- Choose `--execution-mode analysis` for read-only or review work and `--execution-mode code` for repository-writing work.
- Keep tasks small enough to be checkable and to minimize clarification loops.
- Use `inbox` directly only for inspection or manual repair.
- Keep user-facing discussion in the leader.
## Typical Commands
```bash
orch run init --run blog_mvp_001 --goal "Build blog MVP" --summary "Public blog plus admin CRUD" --json
orch task add --run blog_mvp_001 --task T1 --title "Project skeleton" --summary "Initialize app structure and database wiring" --default-to foundation-worker --json
orch dep add --run blog_mvp_001 --task T2 --depends-on T1 --json
orch ready --run blog_mvp_001 --json
orch dispatch --run blog_mvp_001 --task T1 --execution-mode code --to foundation-worker --base-ref main --workspace-root .orch/worktrees --body-file tasks/t1.md --json
orch reconcile --run blog_mvp_001 --json
orch wait --run blog_mvp_001 --for task_blocked,task_verifying,task_done,task_failed --after-event 0 --timeout-seconds 900 --json
orch blocked --run blog_mvp_001 --json
orch answer --run blog_mvp_001 --task T2 --body "MVP supports draft and published only." --json
orch retry --run blog_mvp_001 --task T7a --to backend-worker --body "Retry after fixing the contract mismatch." --json
orch cleanup --run blog_mvp_001 --all-completed --json
```
```
````