Case: leader-reassigns-blocked-task-through-bundled-cli
Test Type
This is a forward-test and a reassignment-path skill validation.
The goal is to verify that a leader using the packaged orch skill can observe a blocked task, reassign it from one worker to another, and drive the run to completion through the new attempt.
Purpose
Validate that all of the following can be true at the same time:
- the leader can use blocked, reassign, reconcile, and status through the bundled orch skill
- worker-a can claim the original attempt and block on a question
- worker-b can receive the reassigned attempt as a new thread
- the original thread is cancelled and the new thread reaches done
- the final run reaches done
Preconditions
- orch skill path exists: ORCH_SKILL_PATH=skills/orch
- inbox skill path exists: INBOX_SKILL_PATH=skills/inbox
- bundled CLI executables exist at ORCH_SKILL_PATH/assets/orch and INBOX_SKILL_PATH/assets/inbox
- use an empty temporary directory TMPDIR
- initialize TMPDIR/coord.db before launching role agents through INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json init
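The preconditions above can be checked before any agent is launched. The following is a minimal sketch, not part of the packaged skills: the helper names `missing_executables` and `init_command` are hypothetical, and only the paths and the init invocation come from the list above.

```python
import os
from pathlib import Path

# Paths taken from the preconditions; adjust if the skills live elsewhere.
ORCH_SKILL_PATH = Path("skills/orch")
INBOX_SKILL_PATH = Path("skills/inbox")


def missing_executables(orch_skill=ORCH_SKILL_PATH, inbox_skill=INBOX_SKILL_PATH):
    """Return the bundled CLI paths that are absent or not executable."""
    candidates = [
        Path(orch_skill) / "assets" / "orch",
        Path(inbox_skill) / "assets" / "inbox",
    ]
    return [p for p in candidates if not (p.is_file() and os.access(p, os.X_OK))]


def init_command(tmpdir, inbox_skill=INBOX_SKILL_PATH):
    """Build the argv that initializes TMPDIR/coord.db through the inbox CLI."""
    return [
        str(Path(inbox_skill) / "assets" / "inbox"),
        "--db", str(Path(tmpdir) / "coord.db"),
        "--json", "init",
    ]
```

A harness could refuse to start when `missing_executables()` is non-empty, and otherwise pass `init_command(tmpdir)` to `subprocess.run(..., check=True)` exactly once.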
Agent Topology
- leader
- worker-a
- worker-b
Inputs
Leader Prompt
Use $orch at ORCH_SKILL_PATH to act as leader on the already initialized SQLite DB TMPDIR/coord.db. Only coordinate through the bundled orch CLI from the skill. Workflow: 1) create run run_blog_skill_reassign_001, 2) add and dispatch one task T1 to worker-a with --execution-mode analysis, 3) wait until worker-a blocks, 4) inspect blocked tasks, 5) reassign T1 to worker-b with a short reason, 6) wait until worker-b completes the new attempt, 7) reconcile and inspect final status, 8) stop after reporting THREAD_ID_1 and THREAD_ID_2. Do not use ordinary chat to coordinate with the workers.
Worker A Prompt
Use $inbox at INBOX_SKILL_PATH to act as worker-a on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) fetch and claim the initial assigned thread, 2) send one blocked update with a precise question, 3) stop after reporting THREAD_ID_1 and the blocked summary you sent. Do not use ordinary chat to coordinate with the leader or worker-b.
Worker B Prompt
Use $inbox at INBOX_SKILL_PATH to act as worker-b on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) wait until reassigned work for worker-b appears, 2) fetch and claim it, 3) complete it with done, 4) stop after reporting THREAD_ID_2. Do not use ordinary chat to coordinate with the leader or worker-a.
Execution Parameters
- use the shared execution contract from README.md
- use the shared timeout defaults from README.md
- do not override the default cleanup policy
Execution Steps
- Initialize TMPDIR/coord.db once through the bundled inbox CLI before launching agents
- Inject skills/orch/ into leader
- Inject skills/inbox/ into worker-a and worker-b
- Point all agents at the same database path TMPDIR/coord.db
- Launch leader, worker-a, and worker-b in parallel
- Wait for all agents to finish
- Resolve THREAD_ID_1 and THREAD_ID_2 from the agent outputs
- Independently run the validation commands from the main thread
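Resolving the thread ids from free-form agent output can be done with a simple pattern match. This sketch assumes ids look like the ones in the recorded runs of this case (`thr_` followed by 32 lowercase hex characters); that format is inferred from those examples, not from a documented contract, and `extract_thread_ids` is a hypothetical helper.

```python
import re

# Assumed id shape, inferred from the recorded runs: "thr_" + 32 lowercase hex chars.
THREAD_ID_RE = re.compile(r"\bthr_[0-9a-f]{32}\b")


def extract_thread_ids(text):
    """Return the distinct thread ids found in text, in order of first appearance."""
    seen = []
    for match in THREAD_ID_RE.findall(text):
        if match not in seen:
            seen.append(match)
    return seen
```

Applied to the worker-a output this should yield THREAD_ID_1, and applied to the worker-b output THREAD_ID_2; an empty or multi-element result is a signal to inspect the agent output by hand.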
Validation Commands
ORCH_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json status --run run_blog_skill_reassign_001
INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json show --thread THREAD_ID_1
INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json show --thread THREAD_ID_2
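The three commands above can be assembled and run from a small script. This is a sketch under two assumptions: that `--json` makes each command print a single JSON document on stdout, and that a non-zero exit code means failure. The function names are hypothetical.

```python
import json
import subprocess


def validation_commands(tmpdir, run_id, thread_id_1, thread_id_2,
                        orch_skill="skills/orch", inbox_skill="skills/inbox"):
    """Build the argv lists for the status and show validation commands."""
    db = f"{tmpdir}/coord.db"
    orch = f"{orch_skill}/assets/orch"
    inbox = f"{inbox_skill}/assets/inbox"
    return [
        [orch, "--db", db, "--json", "status", "--run", run_id],
        [inbox, "--db", db, "--json", "show", "--thread", thread_id_1],
        [inbox, "--db", db, "--json", "show", "--thread", thread_id_2],
    ]


def run_json(argv):
    """Run one command and parse its stdout as JSON (assumed output format)."""
    completed = subprocess.run(argv, check=True, capture_output=True, text=True)
    return json.loads(completed.stdout)
```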
Expected Outcomes
- worker-a successfully claims the original thread and blocks it
- the leader successfully reassigns the task to worker-b
- the original thread reaches cancelled
- worker-b receives a distinct reassigned thread and completes it
- the final run reaches done
Assertions
- THREAD_ID_1 != THREAD_ID_2
- status.data.run.status == "done"
- status.data.tasks[0].status == "done"
- show THREAD_ID_1 reports a terminal cancelled thread state
- show THREAD_ID_2 reports a terminal done thread state
- the blocked question remains visible in the original thread history
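The assertions above can be evaluated mechanically once the three JSON documents are parsed. In this sketch the `status` paths follow the assertions as written; the `data.thread.status` path for `show` output is an assumption extrapolated from the `thread.status` field named in the recorded runs. The history check is left out because the message layout is not specified here. `dig` and `check_case` are hypothetical helpers.

```python
def dig(doc, *path):
    """Walk nested dicts and lists; return None when any step is missing."""
    current = doc
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return None
    return current


def check_case(status, show_1, show_2, thread_id_1, thread_id_2):
    """Return a list of failed assertions; an empty list means the case passed."""
    failures = []
    if thread_id_1 == thread_id_2:
        failures.append("THREAD_ID_1 and THREAD_ID_2 are identical")
    if dig(status, "data", "run", "status") != "done":
        failures.append("run status is not done")
    if dig(status, "data", "tasks", 0, "status") != "done":
        failures.append("task status is not done")
    # Assumed location of the thread state inside `show` output.
    if dig(show_1, "data", "thread", "status") != "cancelled":
        failures.append("original thread is not cancelled")
    if dig(show_2, "data", "thread", "status") != "done":
        failures.append("reassigned thread is not done")
    return failures
```

Collecting failures instead of raising on the first one keeps every broken assertion visible in a single report, which helps when TMPDIR is retained for replay.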
Cleanup
- use the default cleanup policy from README.md
- if the run fails, retain TMPDIR and coord.db for replay and manual inspection
Recorded Example Run
- recorded on: 2026-03-19
- execution mode: direct_cli_replay via scripts/run_orch_skill_forward_tests.sh
- result: pass
- observed run id: run_blog_skill_reassign_001
- observed original thread id: thr_0a61240412134de3b3d9ab219b6c8f19
- observed reassigned thread id: thr_12fbcf6d89d948548306198d013d77a5
- evidence summary:
  - orch wait --for task_blocked woke after worker-a posted a blocked question with payload "Proceed with v1 scope?"
  - orch reassign --run run_blog_skill_reassign_001 --task T1 --to worker-b --json returned attempt_no == 2 and assigned the new attempt to worker-b
  - final inbox show on the original thread returned thread.status == "cancelled" and preserved the blocked question message
  - final inbox show on the reassigned thread returned thread.status == "done"
  - final orch status --run run_blog_skill_reassign_001 --json returned run.status == "done" and tasks[0].status == "done"
- note: this recorded run exercised the packaged binaries directly in a temporary DB and did not spawn separate Codex role agents
Recorded Real Forward Run
- recorded on: 2026-03-19
- execution mode: real_subagent_forward_test
- result: pass
- evidence root: /tmp/orch-skill-subagents.J1XWgs/leader-reassigns-blocked-task-through-bundled-cli-phased
- observed run id: run_blog_skill_reassign_001
- observed original thread id: thr_7d43af5bc1f7467da98a39adb0de5808
- observed reassigned thread id: thr_eba253db8965423b855d0c784a29702c
- evidence summary:
  - the same real leader agent using skills/orch/ completed the case in three phases: initial run/task/dispatch, then wait --for task_blocked plus reassign, then final wait --for task_done plus status
  - a real worker-a agent using skills/inbox/ claimed the original thread and posted the blocked question "Proceed with v1 scope?"
  - a real worker-b agent using skills/inbox/ claimed the reassigned thread and completed it
  - main-thread validation confirmed the original thread finished cancelled, the reassigned thread finished done, and the original blocked question remained visible in thread history