Case: leader-run-dispatch-reconcile-through-bundled-cli

Test Type

This is a forward test and a leader-side happy-path skill validation.

The goal is to verify that a leader using the packaged orch skill can drive a complete run lifecycle while a worker uses the packaged inbox skill for thread progress.

Purpose

Validate that all of the following can be true at the same time:

  • the leader can use the bundled ./assets/orch CLI through the skill
  • the leader can create a run, add a task, dispatch it, reconcile worker progress, and inspect final status
  • a worker using the bundled inbox skill can claim the dispatched thread and finish it
  • the final orch run state and inbox thread state both reach done

Preconditions

  • orch skill path exists: ORCH_SKILL_PATH=skills/orch
  • inbox skill path exists: INBOX_SKILL_PATH=skills/inbox
  • bundled CLI executables exist at ORCH_SKILL_PATH/assets/orch and INBOX_SKILL_PATH/assets/inbox
  • use an empty temporary directory TMPDIR
  • before launching role agents, initialize TMPDIR/coord.db by running INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json init
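
The preconditions above can be checked mechanically before any agent is launched. The following is a minimal sketch, not part of the test harness: the helper names `check_preconditions` and `init_command` are hypothetical, and the only CLI flags used are the ones quoted in the list above. The empty-directory check is meant to run before the init command, since init creates coord.db inside TMPDIR.

```python
import os
from pathlib import Path


def check_preconditions(orch_skill_path, inbox_skill_path, tmpdir):
    """Return a list of problems; an empty list means the preconditions hold.

    Hypothetical helper: it only mirrors the bullet list above.
    """
    problems = []
    skills = (("orch", Path(orch_skill_path)), ("inbox", Path(inbox_skill_path)))
    for name, skill in skills:
        if not skill.is_dir():
            problems.append(f"{name} skill path missing: {skill}")
            continue
        # Bundled executables live at <skill>/assets/<name> per the preconditions.
        cli = skill / "assets" / name
        if not cli.is_file() or not os.access(cli, os.X_OK):
            problems.append(f"bundled {name} CLI missing or not executable: {cli}")

    tmp = Path(tmpdir)
    if not tmp.is_dir():
        problems.append(f"temporary directory missing: {tmp}")
    elif any(tmp.iterdir()):
        problems.append(f"temporary directory is not empty: {tmp}")
    return problems


def init_command(inbox_skill_path, tmpdir):
    """Build the argv for the one-time DB initialization quoted above."""
    return [
        str(Path(inbox_skill_path) / "assets" / "inbox"),
        "--db", str(Path(tmpdir) / "coord.db"),
        "--json", "init",
    ]
```

Returning a list of problems rather than raising on the first one lets a harness report every unmet precondition in a single pass.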

Agent Topology

  • leader
  • worker-a

Inputs

Leader Prompt

Use $orch at ORCH_SKILL_PATH to act as leader on the already initialized SQLite DB TMPDIR/coord.db. Only coordinate through the bundled orch CLI from the skill. Workflow: 1) create run run_blog_skill_001, 2) add exactly one task T1 assigned to worker-a, 3) dispatch it with --execution-mode analysis, 4) wait or poll until the worker reports completion, 5) reconcile the run, 6) inspect final status, 7) stop after reporting RUN_ID and THREAD_ID. Do not use ordinary chat to coordinate with the worker.

Worker Prompt

Use $inbox at INBOX_SKILL_PATH to act as worker-a on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) fetch pending work for worker-a, 2) claim it, 3) send one in_progress update, 4) finish it with done, 5) stop after reporting the THREAD_ID you handled. Do not use ordinary chat to coordinate with the leader.

Execution Parameters

  • use the shared execution contract from README.md
  • use the shared timeout defaults from README.md
  • do not override the default cleanup policy

Execution Steps

  1. Initialize TMPDIR/coord.db once through the bundled inbox CLI before launching agents
  2. Inject skills/orch/ into leader
  3. Inject skills/inbox/ into worker-a
  4. Point both agents at the same database path TMPDIR/coord.db
  5. Launch leader and worker-a in parallel
  6. Wait for both agents to finish
  7. Resolve RUN_ID=run_blog_skill_001 and THREAD_ID from the agent outputs
  8. Independently run the validation commands from the main thread
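
Steps 5 through 7 can be sketched as follows. This document does not specify how role agents are launched, so the sketch takes each role as a hypothetical zero-argument callable that returns the agent's final output text. The thread ID pattern (`thr_` followed by 32 lowercase hex characters) is an assumption inferred from the two IDs recorded later in this document, not a documented format.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Assumed ID shape, inferred from the recorded runs in this document.
THREAD_ID_RE = re.compile(r"\bthr_[0-9a-f]{32}\b")


def run_roles_in_parallel(roles):
    """Run every role callable concurrently and wait for all of them.

    `roles` maps a role name (e.g. "leader", "worker-a") to a hypothetical
    zero-argument callable that returns that agent's final output text.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(roles))) as pool:
        futures = {name: pool.submit(fn) for name, fn in roles.items()}
        # result() blocks until the role finishes and re-raises its exception.
        return {name: future.result() for name, future in futures.items()}


def resolve_thread_id(outputs):
    """Extract the single THREAD_ID that the agent outputs agree on."""
    found = set()
    for text in outputs.values():
        found.update(THREAD_ID_RE.findall(text))
    if len(found) != 1:
        raise ValueError(f"expected exactly one thread id, found: {sorted(found)}")
    return found.pop()
```

Requiring exactly one distinct ID across both outputs doubles as a cheap cross-check that leader and worker-a reported the same thread.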

Validation Commands

ORCH_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json status --run run_blog_skill_001
INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json show --thread THREAD_ID
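
For a scripted main thread, the two commands above could be wrapped as in this sketch. It assumes that each CLI prints a single JSON document on stdout and exits with status 0 on success when `--json` is passed, which this document implies but does not state. The helper names and the 60-second timeout are placeholders; the shared timeout defaults live in README.md and are not reproduced here.

```python
import json
import subprocess
from pathlib import Path


def validation_commands(orch_skill_path, inbox_skill_path, tmpdir, run_id, thread_id):
    """Build the argv lists for the two validation commands shown above."""
    db = str(Path(tmpdir) / "coord.db")
    orch = str(Path(orch_skill_path) / "assets" / "orch")
    inbox = str(Path(inbox_skill_path) / "assets" / "inbox")
    return {
        "status": [orch, "--db", db, "--json", "status", "--run", run_id],
        "show": [inbox, "--db", db, "--json", "show", "--thread", thread_id],
    }


def run_json(argv, timeout=60):
    """Run one command and parse its stdout as JSON.

    Assumption: the command emits exactly one JSON document on stdout.
    A non-zero exit raises subprocess.CalledProcessError.
    """
    proc = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=True
    )
    return json.loads(proc.stdout)
```

Passing argv as a list avoids shell quoting issues if TMPDIR contains spaces.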

Expected Outcomes

  • leader successfully creates run_blog_skill_001
  • leader successfully adds and dispatches T1
  • worker-a successfully claims the dispatched thread
  • worker-a emits at least one in_progress update
  • worker-a completes the thread with done
  • leader successfully reconciles and sees run.status == "done"

Assertions

  • status.data.run.run_id == "run_blog_skill_001"
  • status.data.run.status == "done"
  • status.data.tasks contains exactly one task T1
  • status.data.tasks[0].status == "done"
  • show.data.thread.status == "done"
  • show.data.messages[*].kind includes task, progress, and result
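
The assertions above translate directly into checks over the two parsed JSON payloads. This is a minimal sketch with a hypothetical function name; `status` and `show` are the decoded outputs of the two validation commands. The field that carries the task ID T1 is not named in this document, so the sketch checks only the task count and leaves the ID comparison to the reader.

```python
REQUIRED_KINDS = {"task", "progress", "result"}


def check_assertions(status, show, run_id="run_blog_skill_001"):
    """Return a list of failed assertions; an empty list means pass."""
    failures = []
    run = status["data"]["run"]
    tasks = status["data"]["tasks"]

    if run["run_id"] != run_id:
        failures.append(f"run_id is {run['run_id']!r}, expected {run_id!r}")
    if run["status"] != "done":
        failures.append(f"run.status is {run['status']!r}, expected 'done'")
    if len(tasks) != 1:
        failures.append(f"expected exactly one task, found {len(tasks)}")
    elif tasks[0]["status"] != "done":
        failures.append(f"tasks[0].status is {tasks[0]['status']!r}, expected 'done'")

    if show["data"]["thread"]["status"] != "done":
        failures.append("thread.status is not 'done'")
    # "includes" means a superset check: extra message kinds are allowed.
    kinds = {message["kind"] for message in show["data"]["messages"]}
    missing = REQUIRED_KINDS - kinds
    if missing:
        failures.append(f"missing message kinds: {sorted(missing)}")
    return failures
```

Collecting failures instead of stopping at the first one makes a failed replay easier to diagnose from a single report.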

Cleanup

  • use the default cleanup policy from README.md
  • if the run fails, retain TMPDIR and coord.db for replay and manual inspection
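
The retain-on-failure rule can be expressed as a small context manager, sketched below. The default cleanup policy is defined in README.md and is not reproduced in this document, so deleting the directory after a passing run is an assumption standing in for it. The name `retained_on_failure` and the directory prefix are hypothetical.

```python
import shutil
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def retained_on_failure(prefix="orch-skill-"):
    """Yield a fresh temporary directory; keep it only if the body raises."""
    path = tempfile.mkdtemp(prefix=prefix)
    try:
        yield path
    except BaseException:
        # Failure: leave TMPDIR and coord.db in place for replay and inspection.
        print(f"run failed; retaining {path}", file=sys.stderr)
        raise
    else:
        # Assumed default policy: remove the directory after a passing run.
        shutil.rmtree(path, ignore_errors=True)
```

A harness would run the whole case inside `with retained_on_failure() as tmpdir:` so that a failed assertion leaves the evidence on disk.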

Recorded Example Run

  • recorded on: 2026-03-19
  • execution mode: direct_cli_replay via scripts/run_orch_skill_forward_tests.sh
  • result: pass
  • observed run id: run_blog_skill_001
  • observed thread id: thr_eced1b8cb1254065a7cd3aaff6dc0bcb
  • evidence summary:
      • final orch status --run run_blog_skill_001 --json returned run.status == "done" with a single task T1 in state done
      • final inbox show --thread thr_eced1b8cb1254065a7cd3aaff6dc0bcb --json returned thread state done and message kinds task, progress, and result
      • the replay also observed orch wait --for task_done wake successfully before the final reconcile
  • note: this recorded run exercised the packaged binaries directly in a temporary DB and did not spawn separate Codex role agents

Recorded Real Forward Run

  • recorded on: 2026-03-19
  • execution mode: real_subagent_forward_test
  • result: pass
  • evidence root: /tmp/orch-skill-subagents.J1XWgs/leader-run-dispatch-reconcile-through-bundled-cli
  • observed run id: run_blog_skill_001
  • observed thread id: thr_7c64e75bbcce4143a7fc425242f7e7d3
  • evidence summary:
      • a real leader agent using skills/orch/ completed run init, task add, dispatch, wait, reconcile, and status
      • a real worker agent using skills/inbox/ completed fetch, claim, update --status in_progress, and done
      • main-thread validation confirmed status.data.run.status == "done", status.data.tasks[0].status == "done", and thread history kinds task, progress, and result