# Case: `leader-run-dispatch-reconcile-through-bundled-cli`

## Test Type

This is a `forward-test` and a leader-side happy-path skill validation.
The goal is to verify that a leader using the packaged `orch` skill can drive a complete run lifecycle while a worker uses the packaged `inbox` skill for thread progress.

## Purpose

Validate that all of the following can be true at the same time:

- the leader can use the bundled `./assets/orch` CLI through the skill
- the leader can create a run, add a task, dispatch it, reconcile worker progress, and inspect final status
- a worker using the bundled inbox skill can claim the dispatched thread and finish it
- the final orch run state and inbox thread state both reach `done`

## Preconditions

- orch skill path exists: `ORCH_SKILL_PATH=skills/orch`
- inbox skill path exists: `INBOX_SKILL_PATH=skills/inbox`
- bundled CLI executables exist at `ORCH_SKILL_PATH/assets/orch` and `INBOX_SKILL_PATH/assets/inbox`
- use an empty temporary directory `TMPDIR`
- initialize `TMPDIR/coord.db` before launching role agents through `INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json init`

## Agent Topology

- `leader`
- `worker-a`

## Inputs

### Leader Prompt

```text
Use $orch at ORCH_SKILL_PATH to act as leader on the already initialized SQLite DB TMPDIR/coord.db.
Only coordinate through the bundled orch CLI from the skill.
Workflow: 1) create run run_blog_skill_001, 2) add exactly one task T1 assigned to worker-a, 3) dispatch it with --execution-mode analysis, 4) wait or poll until the worker reports completion, 5) reconcile the run, 6) inspect final status, 7) stop after reporting RUN_ID and THREAD_ID.
Do not use ordinary chat to coordinate with the worker.
```

### Worker Prompt

```text
Use $inbox at INBOX_SKILL_PATH to act as worker-a on SQLite DB TMPDIR/coord.db.
Only coordinate through the bundled inbox CLI from the skill.
Workflow: 1) fetch pending work for worker-a, 2) claim it, 3) send one in_progress update, 4) finish it with done, 5) stop after reporting the THREAD_ID you handled.
Do not use ordinary chat to coordinate with the leader.
```

## Execution Parameters

- use the shared execution contract from [README.md](./README.md)
- use the shared timeout defaults from [README.md](./README.md)
- do not override the default cleanup policy

## Execution Steps

1. Initialize `TMPDIR/coord.db` once through the bundled inbox CLI before launching agents
2. Inject `skills/orch/` into `leader`
3. Inject `skills/inbox/` into `worker-a`
4. Point both agents at the same database path `TMPDIR/coord.db`
5. Launch `leader` and `worker-a` in parallel
6. Wait for both agents to finish
7. Resolve `RUN_ID=run_blog_skill_001` and `THREAD_ID` from the agent outputs
8. Independently run the validation commands from the main thread

## Validation Commands

```bash
ORCH_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json status --run run_blog_skill_001
INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json show --thread THREAD_ID
```

## Expected Outcomes

- `leader` successfully creates `run_blog_skill_001`
- `leader` successfully adds and dispatches `T1`
- `worker-a` successfully claims the dispatched thread
- `worker-a` emits at least one `in_progress` update
- `worker-a` completes the thread with `done`
- `leader` successfully reconciles and sees `run.status == "done"`

## Assertions

- `status.data.run.run_id == "run_blog_skill_001"`
- `status.data.run.status == "done"`
- `status.data.tasks` contains exactly one task `T1`
- `status.data.tasks[0].status == "done"`
- `show.data.thread.status == "done"`
- `show.data.messages[*].kind` includes `task`, `progress`, and `result`

## Cleanup

- use the default cleanup policy from [README.md](./README.md)
- if the run fails, retain `TMPDIR` and `coord.db` for replay and manual inspection

## Recorded Example Run

- recorded on: `2026-03-19`
- execution mode: `direct_cli_replay` via `scripts/run_orch_skill_forward_tests.sh`
- result: `pass`
- observed run id: `run_blog_skill_001`
- observed thread id: `thr_eced1b8cb1254065a7cd3aaff6dc0bcb`
- evidence summary:
  - final `orch status --run run_blog_skill_001 --json` returned `run.status == "done"` with a single task `T1` in state `done`
  - final `inbox show --thread thr_eced1b8cb1254065a7cd3aaff6dc0bcb --json` returned thread state `done` and message kinds `task`, `progress`, and `result`
  - the replay also observed `orch wait --for task_done` wake successfully before the final reconcile
- note: this recorded run exercised the packaged binaries directly in a temporary DB and did not spawn separate Codex role agents

## Recorded Real Forward Run

- recorded on: `2026-03-19`
- execution mode: `real_subagent_forward_test`
- result: `pass`
- evidence root: `/tmp/orch-skill-subagents.J1XWgs/leader-run-dispatch-reconcile-through-bundled-cli`
- observed run id: `run_blog_skill_001`
- observed thread id: `thr_7c64e75bbcce4143a7fc425242f7e7d3`
- evidence summary:
  - a real leader agent using `skills/orch/` completed `run init`, `task add`, `dispatch`, `wait`, `reconcile`, and `status`
  - a real worker agent using `skills/inbox/` completed `fetch`, `claim`, `update --status in_progress`, and `done`
  - main-thread validation confirmed `status.data.run.status == "done"`, `status.data.tasks[0].status == "done"`, and thread history kinds `task`, `progress`, and `result`
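
## Assertion Sketch

The checks listed under Assertions can be expressed as a small function over the two parsed JSON payloads. This is a minimal sketch, not part of the test harness: the function name `check_case` and the constant `TASK_ID_KEY` are hypothetical, and the `task_id` field name is an assumption, since this case only states that the task list contains exactly one task `T1` without naming the field that holds the identifier. The other paths (`data.run.run_id`, `data.run.status`, `data.tasks`, `data.thread.status`, `data.messages[*].kind`) are taken from the assertion list.

```python
from __future__ import annotations

from typing import Any

# Assumption: the key that carries the task identifier is not named in this
# case file; "task_id" is a placeholder and may need to be adjusted.
TASK_ID_KEY = "task_id"

REQUIRED_KINDS = {"task", "progress", "result"}


def check_case(
    status: dict[str, Any],
    show: dict[str, Any],
    run_id: str = "run_blog_skill_001",
) -> list[str]:
    """Return a list of failure messages; an empty list means all checks held."""
    failures: list[str] = []

    run = status.get("data", {}).get("run", {})
    tasks = status.get("data", {}).get("tasks", [])
    thread = show.get("data", {}).get("thread", {})
    messages = show.get("data", {}).get("messages", [])

    if run.get("run_id") != run_id:
        failures.append(f"run.run_id is {run.get('run_id')!r}, expected {run_id!r}")
    if run.get("status") != "done":
        failures.append(f"run.status is {run.get('status')!r}, expected 'done'")

    if len(tasks) != 1:
        failures.append(f"expected exactly one task, found {len(tasks)}")
    else:
        task = tasks[0]
        if task.get(TASK_ID_KEY) != "T1":
            failures.append(f"task id is {task.get(TASK_ID_KEY)!r}, expected 'T1'")
        if task.get("status") != "done":
            failures.append(f"tasks[0].status is {task.get('status')!r}, expected 'done'")

    if thread.get("status") != "done":
        failures.append(f"thread.status is {thread.get('status')!r}, expected 'done'")

    # Message kinds only need to include the required set; extra kinds are allowed.
    kinds = {message.get("kind") for message in messages}
    missing = REQUIRED_KINDS - kinds
    if missing:
        failures.append(f"missing message kinds: {sorted(missing)}")

    return failures
```

In practice the two arguments would be the results of `json.loads` applied to the standard output of the two validation commands, and the main thread would treat any non-empty return value as a failed case.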