Case: strict-worktree-dispatch-to-cleanup-through-bundled-cli
Test Type
This is a forward-test and a worktree-lifecycle skill validation.
The goal is to verify that a leader using the packaged orch skill can allocate a code-mode worktree, reconcile completion, and clean that worktree up through the bundled CLI while a worker completes the task through inbox.
Purpose
Validate that all of the following can be true at the same time:
- the leader can dispatch a code task with --execution-mode code through the bundled orch skill
- the worker can complete the resulting attempt thread through inbox
- the leader can reconcile the finished task and clean the attempt worktree
- the final filesystem state matches the cleanup contract
Preconditions
- orch skill path exists: ORCH_SKILL_PATH=skills/orch
- inbox skill path exists: INBOX_SKILL_PATH=skills/inbox
- bundled CLI executables exist at ORCH_SKILL_PATH/assets/orch and INBOX_SKILL_PATH/assets/inbox
- use an empty temporary directory TMPDIR
- before launching role agents, initialize TMPDIR/coord.db through INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json init
- before launching role agents, create TMPDIR/repo as a Git repository with one committed file
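The preconditions above can be sketched as a small setup script. This is a minimal sketch, not the project's own harness: `make_repo_fixture`, the `WORK` variable (used in place of the TMPDIR placeholder so the real `TMPDIR` environment variable is left alone), and the `README.md` file name are assumptions made for illustration. Only the `inbox ... --json init` invocation is taken from this document, and it is skipped when the bundled binary is not present.

```shell
#!/bin/sh
# Hypothetical helper: create a Git repository with exactly one committed file.
make_repo_fixture() {
  repo="$1"
  mkdir -p "$repo" || return 1
  git -C "$repo" init -q || return 1
  # The file name is an assumption; the test case only requires one committed file.
  printf 'fixture\n' > "$repo/README.md"
  git -C "$repo" add README.md || return 1
  # Inline identity so the commit works without any global Git configuration.
  git -C "$repo" -c user.name=fixture -c user.email=fixture@example.invalid \
    commit -q -m "initial commit" || return 1
}

# WORK stands in for the TMPDIR placeholder used in this test case.
WORK="$(mktemp -d)"
make_repo_fixture "$WORK/repo"

# Initialize the coordination DB only when the bundled inbox CLI is available.
INBOX_BIN="${INBOX_SKILL_PATH:-skills/inbox}/assets/inbox"
if [ -x "$INBOX_BIN" ]; then
  "$INBOX_BIN" --db "$WORK/coord.db" --json init
fi
```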
Agent Topology
- leader
- worker-a
Inputs
Leader Prompt
Use $orch at ORCH_SKILL_PATH to act as leader on the already initialized SQLite DB TMPDIR/coord.db. Only coordinate through the bundled orch CLI from the skill. Workflow: 1) create run run_blog_skill_worktree_001, 2) add one code task T1 for worker-a, 3) dispatch it with --execution-mode code --repo-path TMPDIR/repo --workspace-root .orch/worktrees, 4) record the returned THREAD_ID and WORKTREE_PATH, 5) wait until the worker completes, 6) reconcile, 7) clean up attempt 1, 8) stop after reporting RUN_ID, THREAD_ID, and WORKTREE_PATH. Do not use ordinary chat to coordinate with the worker.
Worker Prompt
Use $inbox at INBOX_SKILL_PATH to act as worker-a on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) fetch and claim the assigned task, 2) inspect the task payload enough to confirm a worktree path was provided, 3) finish the task with done, 4) stop after reporting the THREAD_ID you handled and whether you observed a worktree path. Do not use ordinary chat to coordinate with the leader.
Execution Parameters
- use the shared execution contract from README.md
- use the shared timeout defaults from README.md
- do not override the default cleanup policy
Execution Steps
- Initialize TMPDIR/coord.db once through the bundled inbox CLI before launching agents
- Create TMPDIR/repo with an initial commit before launching agents
- Inject skills/orch/ into leader
- Inject skills/inbox/ into worker-a
- Point both agents at the same database path TMPDIR/coord.db
- Launch leader and worker-a in parallel
- Wait for both agents to finish
- Resolve THREAD_ID and WORKTREE_PATH from the agent outputs
- Independently run the validation commands from the main thread
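The launch-in-parallel and wait steps can be sketched with ordinary shell job control. How the role agents are actually started is not specified in this document, so `run_leader` and `run_worker` below are stand-in functions, and the log file names are assumptions; replace them with whatever starts each agent in your harness.

```shell
#!/bin/sh
# Stand-ins for the real agent launchers (hypothetical; not part of orch or inbox).
run_leader() { echo "leader finished"; }
run_worker() { echo "worker-a finished"; }

OUT="$(mktemp -d)"

# Start both roles in the background, capturing each one's output separately
# so THREAD_ID and WORKTREE_PATH can be resolved from the logs afterwards.
run_leader > "$OUT/leader.log" 2>&1 &
leader_pid=$!
run_worker > "$OUT/worker-a.log" 2>&1 &
worker_pid=$!

# Wait for each agent individually so a failure in either one is recorded.
failed=0
wait "$leader_pid" || failed=1
wait "$worker_pid" || failed=1
```

Waiting on each PID separately, rather than using a bare `wait`, keeps the exit status of each agent visible to the harness.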
Validation Commands
ORCH_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json status --run run_blog_skill_worktree_001
INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json show --thread THREAD_ID
test ! -d WORKTREE_PATH
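The filesystem check can be wrapped in a helper that reports why it failed. This is a sketch under assumptions: `assert_worktree_removed` is a hypothetical name, and the second check (that Git no longer registers the worktree) goes beyond the documented `test ! -d` command. Drop that check if the cleanup contract only covers the directory itself.

```shell
#!/bin/sh
# Hypothetical helper.
# $1 = repository path, $2 = worktree path reported by the leader.
assert_worktree_removed() {
  repo="$1"
  wt="$2"
  # Documented check: the directory must be gone.
  if [ -d "$wt" ]; then
    echo "FAIL: worktree directory still exists: $wt" >&2
    return 1
  fi
  # Extra check (an assumption, not part of this test case): Git should no
  # longer list the path as a registered worktree.
  if git -C "$repo" worktree list --porcelain | grep -Fqx "worktree $wt"; then
    echo "FAIL: Git still registers worktree: $wt" >&2
    return 1
  fi
  echo "OK: worktree removed: $wt"
}
```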
Expected Outcomes
- the leader reports a non-empty WORKTREE_PATH from dispatch
- the worker reports that the task payload exposed a worktree path
- the final run status is done
- the cleanup step removes the worktree directory
Assertions
- status.data.run.status == "done"
- status.data.tasks[0].status == "done"
- show.data.thread.status == "done"
- the task-side thread history includes a payload field or body content referencing the worktree path
- WORKTREE_PATH does not exist after cleanup
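The JSON assertions can be checked mechanically. The sketch below assumes that `jq` is installed and that the `status` and `show` commands print a single JSON object whose fields match the paths named in the assertions (`data.run.status`, `data.tasks[0].status`, `data.thread.status`). The helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; each reads the CLI's JSON output on stdin.
# jq -e exits non-zero when the expression evaluates to false or null.

# Passes only if the run and its first task are both "done".
check_run_done() {
  jq -e '.data.run.status == "done" and .data.tasks[0].status == "done"' >/dev/null
}

# Passes only if the thread is "done".
check_thread_done() {
  jq -e '.data.thread.status == "done"' >/dev/null
}

# Usage, with the placeholders from this test case substituted:
#   ORCH_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json status \
#     --run run_blog_skill_worktree_001 | check_run_done
#   INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json show \
#     --thread THREAD_ID | check_thread_done
```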
Cleanup
- use the default cleanup policy from README.md
- if the run fails, retain TMPDIR, coord.db, and the Git repo fixture for replay and manual inspection
Recorded Example Run
- recorded on: 2026-03-19
- execution mode: direct_cli_replay via scripts/run_orch_skill_forward_tests.sh
- result: pass
- observed run id: run_blog_skill_worktree_001
- observed thread id: thr_5743259fdccb41f9bb33dce0040b27a5
- observed worktree suffix: .orch/worktrees/run-blog-skill-worktree-001/T1/attempt-1
- evidence summary:
  - orch dispatch --execution-mode code returned base_ref == "HEAD", a concrete base_commit, branch orch/run-blog-skill-worktree-001/T1/attempt-1, and a non-empty worktree_path
  - the task payload stored on the worker thread exposed the same worktree_path
  - final orch status --run run_blog_skill_worktree_001 --json returned run.status == "done" and tasks[0].status == "done"
  - final orch cleanup --run run_blog_skill_worktree_001 --task T1 --json returned one cleaned attempt and the worktree directory no longer existed afterward
- note: this recorded run exercised the packaged binaries directly in a temporary DB and Git fixture and did not spawn separate Codex role agents
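In this recorded run, the run id run_blog_skill_worktree_001 appears in the worktree suffix and branch name as run-blog-skill-worktree-001, with underscores replaced by hyphens. The sketch below reproduces that observed pattern so a harness can predict the expected values. It is inferred from this single run only, and the actual naming rule inside orch may differ. Both helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers based on the naming observed in the recorded run.
# $1 = run id, $2 = task id, $3 = attempt number, $4 = workspace root (optional)
expected_worktree_suffix() {
  slug="$(printf '%s' "$1" | tr '_' '-')"
  printf '%s/%s/%s/attempt-%s\n' "${4:-.orch/worktrees}" "$slug" "$2" "$3"
}

# $1 = run id, $2 = task id, $3 = attempt number
expected_branch() {
  slug="$(printf '%s' "$1" | tr '_' '-')"
  printf 'orch/%s/%s/attempt-%s\n' "$slug" "$2" "$3"
}

expected_worktree_suffix run_blog_skill_worktree_001 T1 1
# prints .orch/worktrees/run-blog-skill-worktree-001/T1/attempt-1
expected_branch run_blog_skill_worktree_001 T1 1
# prints orch/run-blog-skill-worktree-001/T1/attempt-1
```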
Recorded Real Forward Run
- recorded on: 2026-03-19
- execution mode: real_subagent_forward_test
- result: pass
- evidence root: /tmp/orch-skill-subagents.J1XWgs/strict-worktree-dispatch-to-cleanup-through-bundled-cli
- observed run id: run_blog_skill_worktree_001
- observed thread id: thr_089527cd07f74b52a524ba07ed74c2e4
- observed worktree path: /private/tmp/orch-skill-subagents.J1XWgs/strict-worktree-dispatch-to-cleanup-through-bundled-cli/repo/.orch/worktrees/run-blog-skill-worktree-001/T1/attempt-1
- evidence summary:
  - a real leader agent using skills/orch/ completed code-mode dispatch, wait, reconcile, cleanup, and status
  - a real worker agent using skills/inbox/ claimed the thread and finished it with done
  - main-thread validation confirmed that the task payload did include the same worktree_path, even though the worker agent summary failed to notice it, and also confirmed that the worktree directory no longer existed after cleanup
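The evidence root above is recorded under /tmp while the observed worktree path begins with /private/tmp. That is consistent with a host where /tmp is a symbolic link to /private/tmp, as on macOS. When a harness compares such paths, resolving symlinks first avoids false mismatches. The helpers below are a hypothetical sketch, not part of orch or inbox. Because they `cd` into the directory, apply them to directories that still exist, such as the evidence root, rather than to a worktree that cleanup has already removed.

```shell
#!/bin/sh
# Hypothetical helper: print the physical (symlink-free) path of a directory.
canonical_dir() {
  # Run in a subshell so the caller's working directory is unchanged.
  ( CDPATH= cd -- "$1" 2>/dev/null && pwd -P )
}

# Hypothetical helper: succeed when two existing directories are the same
# location after symlinks are resolved.
same_dir() {
  a="$(canonical_dir "$1")" || return 1
  b="$(canonical_dir "$2")" || return 1
  [ "$a" = "$b" ]
}
```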