Case: leader-cancels-active-task-through-bundled-cli
Test Type
This is a forward-test and a direct task-cancel skill validation.
The goal is to verify that a leader using the packaged orch skill can cancel an already active task attempt without cancelling unrelated ready work in the same run.
Purpose
Validate that all of the following can be true at the same time:
- the leader can use `dispatch`, `cancel`, `ready`, and `status` through the bundled orch skill
- `worker-a` can claim the original thread and report active progress through the bundled inbox skill
- the leader can cancel that active task through `orch cancel --task`
- the original thread reaches `cancelled`
- another task in the same run remains actionable instead of being implicitly cancelled
Preconditions
- orch skill path exists: `ORCH_SKILL_PATH=skills/orch`
- inbox skill path exists: `INBOX_SKILL_PATH=skills/inbox`
- bundled CLI executables exist at `ORCH_SKILL_PATH/assets/orch` and `INBOX_SKILL_PATH/assets/inbox`
- use an empty temporary directory `TMPDIR`
- initialize `TMPDIR/coord.db` before launching role agents through `INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json init`
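
The filesystem preconditions above can be checked mechanically before anything is initialized. The following is a minimal sketch, not part of the shipped tooling: `check_preconditions` is a hypothetical helper name, and it assumes "exists" means an executable regular file for the CLIs and a directory with no entries for the temporary directory.

```python
import os
from pathlib import Path


def check_preconditions(orch_skill: Path, inbox_skill: Path, tmpdir: Path) -> list[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    for skill in (orch_skill, inbox_skill):
        if not skill.is_dir():
            problems.append(f"skill path missing: {skill}")
    # Bundled CLIs are expected at <skill>/assets/<name>, per the preconditions.
    for cli in (orch_skill / "assets" / "orch", inbox_skill / "assets" / "inbox"):
        if not (cli.is_file() and os.access(cli, os.X_OK)):
            problems.append(f"bundled CLI missing or not executable: {cli}")
    # Assumption: "empty" means the directory exists and has no entries at all.
    if not tmpdir.is_dir():
        problems.append(f"temporary directory missing: {tmpdir}")
    elif any(tmpdir.iterdir()):
        problems.append(f"temporary directory is not empty: {tmpdir}")
    return problems
```

A check like this would have to run before the `init` step, since creating `coord.db` makes the temporary directory non-empty.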
Agent Topology
- `leader`
- `worker-a`
Inputs
Leader Prompt
Use $orch at ORCH_SKILL_PATH to act as leader on the already initialized SQLite DB TMPDIR/coord.db. Only coordinate through the bundled orch CLI from the skill. Workflow: 1) create run run_blog_skill_cancel_001, 2) add task T1 for worker-a and a second task T2 that should remain untouched, 3) dispatch T1, 4) wait until worker-a has claimed it or marked it in progress, 5) cancel T1 with a clear reason through orch, 6) inspect ready work and final run status, 7) stop after reporting THREAD_ID_1. Do not use ordinary chat to coordinate with the worker.
Worker Prompt
Use $inbox at INBOX_SKILL_PATH to act as worker-a on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) fetch and claim the assigned thread, 2) send one in_progress update, 3) stop after reporting THREAD_ID_1 and that the task became active. Do not use ordinary chat to coordinate with the leader.
Execution Parameters
- use the shared execution contract from README.md
- use the shared timeout defaults from README.md
- do not override the default cleanup policy
Execution Steps
- Initialize `TMPDIR/coord.db` once through the bundled inbox CLI before launching agents
- Inject `skills/orch/` into `leader`
- Inject `skills/inbox/` into `worker-a`
- Point both agents at the same database path `TMPDIR/coord.db`
- Launch `leader` and `worker-a` in parallel
- Wait for both agents to finish
- Resolve `THREAD_ID_1` from the agent outputs
- Independently run the validation commands from the main thread
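
The "resolve `THREAD_ID_1`" step can be sketched as a small parser over the two agent outputs. This is an illustration under stated assumptions: `resolve_thread_id` is a hypothetical helper, and the id pattern (`thr_` followed by 32 lowercase hex characters) is inferred only from the single thread id recorded in this case, not from any documented format.

```python
import re

# Assumption: thread ids look like "thr_" plus 32 lowercase hex characters.
# This is inferred from one recorded id; adjust if the real format differs.
THREAD_ID_RE = re.compile(r"\bthr_[0-9a-f]{32}\b")


def resolve_thread_id(leader_output: str, worker_output: str) -> str:
    """Return the one thread id both agents reported, or raise ValueError."""
    leader_ids = set(THREAD_ID_RE.findall(leader_output))
    worker_ids = set(THREAD_ID_RE.findall(worker_output))
    shared = leader_ids & worker_ids
    if len(shared) != 1:
        raise ValueError(f"expected exactly one shared thread id, got {sorted(shared)}")
    return shared.pop()
```

Requiring the id to appear in both outputs guards against validating a thread that only one agent ever saw.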
Validation Commands
ORCH_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json status --run run_blog_skill_cancel_001
ORCH_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json ready --run run_blog_skill_cancel_001
INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json show --thread THREAD_ID_1
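
For a scripted harness, the three validation commands above can be built as argument lists and their output parsed. The sketch below is hypothetical glue code: the function names are illustrative, the flags are copied from the commands above, and `run_json` assumes that `--json` makes each CLI print a single JSON document to stdout.

```python
import json
import subprocess


def validation_commands(orch: str, inbox: str, db: str,
                        run_id: str, thread_id: str) -> dict[str, list[str]]:
    """Build argv lists for the three validation commands, keyed by payload name."""
    return {
        "status": [orch, "--db", db, "--json", "status", "--run", run_id],
        "ready": [orch, "--db", db, "--json", "ready", "--run", run_id],
        "show": [inbox, "--db", db, "--json", "show", "--thread", thread_id],
    }


def run_json(argv: list[str]) -> dict:
    """Run one command and parse its stdout as JSON; raises if the command fails."""
    # Assumption: with --json the CLI writes exactly one JSON document to stdout.
    completed = subprocess.run(argv, check=True, capture_output=True, text=True)
    return json.loads(completed.stdout)
```

Passing argument lists rather than a shell string avoids quoting problems if the temporary directory path contains spaces.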
Expected Outcomes
- `worker-a` successfully claims the original thread and reports `in_progress`
- the leader successfully cancels `T1` through `orch cancel --task`
- the original thread reaches `cancelled`
- the untouched task `T2` remains available in the ready queue
- the run remains open rather than collapsing into a fully cancelled run
Assertions
- `status.data.tasks` contains `T1` with status `cancelled`
- `status.data.tasks` contains `T2` with status `ready`
- `status.data.run.status == "ready"`
- `ready.data.tasks` contains only `T2`
- `show.data.thread.status == "cancelled"`
- the thread history preserves the worker `progress` message before the cancel
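
The first five assertions can be expressed as a check over the three parsed JSON payloads. This is a sketch only: `check_cancel_assertions` is a hypothetical helper, and it assumes each task entry is an object with `id` and `status` keys, which this case does not spell out. The thread-history assertion is left out because the message shape is not specified here.

```python
def check_cancel_assertions(status: dict, ready: dict, show: dict) -> list[str]:
    """Return a list of failed assertions; an empty list means the case passed."""
    failures = []
    # Assumption: each task entry is an object with "id" and "status" keys.
    task_status = {t["id"]: t["status"] for t in status["data"]["tasks"]}
    if task_status.get("T1") != "cancelled":
        failures.append("T1 is not cancelled")
    if task_status.get("T2") != "ready":
        failures.append("T2 is not ready")
    if status["data"]["run"]["status"] != "ready":
        failures.append("run status is not ready")
    # "contains only T2": the ready queue must hold T2 and nothing else.
    if [t["id"] for t in ready["data"]["tasks"]] != ["T2"]:
        failures.append("ready queue is not exactly [T2]")
    if show["data"]["thread"]["status"] != "cancelled":
        failures.append("thread is not cancelled")
    return failures
```

Collecting failures instead of stopping at the first one makes a failed run easier to diagnose from a single report.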
Cleanup
- use the default cleanup policy from README.md
- if the run fails, retain `TMPDIR` and `coord.db` for replay and manual inspection
Recorded Example Run
- recorded on: `2026-03-19`
- execution mode: `direct_cli_replay` via `scripts/run_orch_skill_forward_tests.sh`
- result: `pass`
- observed run id: `run_blog_skill_cancel_001`
- observed thread id: `thr_175e00bca76549ea8529cb4c92d99fd4`
- evidence summary:
  - final `orch status --run run_blog_skill_cancel_001 --json` returned `run.status == "ready"` with task counts `cancelled: 1` and `ready: 1`
  - that same `status` output showed `T1.status == "cancelled"` while `T2.status == "ready"`
  - final `orch ready --run run_blog_skill_cancel_001 --json` returned only `T2`, confirming the untouched task remained dispatchable
  - final `inbox show --thread thr_175e00bca76549ea8529cb4c92d99fd4 --json` returned `thread.status == "cancelled"` and preserved the worker `progress` message before the cancel
- note: this recorded run exercised the packaged binaries directly in a temporary DB and did not spawn separate Codex role agents