Case: `leader-cancels-active-task-through-bundled-cli`

Test Type

This is both a forward test and a direct validation of the task-cancel skill.

The goal is to verify that a leader using the packaged orch skill can cancel an already active task attempt without cancelling unrelated ready work in the same run.

Purpose

Validate that all of the following can be true at the same time:

the leader can use dispatch, cancel, ready, and status through the bundled orch skill
worker-a can claim the original thread and report active progress through the bundled inbox skill
the leader can cancel that active task through orch cancel --task
the original thread reaches cancelled
another task in the same run remains actionable instead of being implicitly cancelled

Preconditions

orch skill path exists: ORCH_SKILL_PATH=skills/orch
inbox skill path exists: INBOX_SKILL_PATH=skills/inbox
bundled CLI executables exist at ORCH_SKILL_PATH/assets/orch and INBOX_SKILL_PATH/assets/inbox
use an empty temporary directory TMPDIR
before launching role agents, initialize TMPDIR/coord.db by running INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json init
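
The filesystem preconditions above could be checked before the init step with a small helper. This is a minimal sketch, not part of the test harness: the function name `check_preconditions` is hypothetical, and it only assumes the layout stated above (a CLI named after the skill under `assets/`). It must run before `init`, since `init` creates coord.db in TMPDIR.

```python
import os
from pathlib import Path


def check_preconditions(orch_skill_path, inbox_skill_path, tmpdir):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    skills = ((Path(orch_skill_path), "orch"), (Path(inbox_skill_path), "inbox"))
    for skill, name in skills:
        if not skill.is_dir():
            problems.append(f"missing skill path: {skill}")
            continue
        # The bundled CLI is expected at <skill>/assets/<name>.
        cli = skill / "assets" / name
        if not (cli.is_file() and os.access(cli, os.X_OK)):
            problems.append(f"missing or non-executable CLI: {cli}")
    tmp = Path(tmpdir)
    if not tmp.is_dir():
        problems.append(f"TMPDIR is not a directory: {tmp}")
    elif any(tmp.iterdir()):
        # The case requires an empty directory before coord.db is initialized.
        problems.append(f"TMPDIR is not empty: {tmp}")
    return problems
```

Returning a list instead of raising on the first problem lets a runner report every unmet precondition in one pass.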

Agent Topology

leader
worker-a

Inputs

Leader Prompt

Use $orch at ORCH_SKILL_PATH to act as leader on the already initialized SQLite DB TMPDIR/coord.db. Only coordinate through the bundled orch CLI from the skill. Workflow: 1) create run run_blog_skill_cancel_001, 2) add task T1 for worker-a and a second task T2 that should remain untouched, 3) dispatch T1 with --execution-mode analysis, 4) wait until worker-a has claimed it or marked it in progress, 5) cancel T1 with a clear reason through orch, 6) inspect ready work and final run status, 7) stop after reporting THREAD_ID_1. Do not use ordinary chat to coordinate with the worker.

Worker Prompt

Use $inbox at INBOX_SKILL_PATH to act as worker-a on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) fetch and claim the assigned thread, 2) send one in_progress update, 3) stop after reporting THREAD_ID_1 and that the task became active. Do not use ordinary chat to coordinate with the leader.

Execution Parameters

use the shared execution contract from README.md
use the shared timeout defaults from README.md
do not override the default cleanup policy

Execution Steps

Initialize TMPDIR/coord.db once through the bundled inbox CLI before launching agents
Inject skills/orch/ into leader
Inject skills/inbox/ into worker-a
Point both agents at the same database path TMPDIR/coord.db
Launch leader and worker-a in parallel
Wait for both agents to finish
Resolve THREAD_ID_1 from the agent outputs
Independently run the validation commands from the main thread
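
The "Resolve THREAD_ID_1" step can be sketched as a pattern match over the captured agent outputs. This is an illustration under two assumptions: thread IDs have the shape seen in the recorded run (`thr_` followed by 32 lowercase hex characters), and both agents report the same ID. The helper name `resolve_thread_id` is hypothetical.

```python
import re

# Assumption: thread IDs look like the one in the recorded run (thr_ + 32 hex chars).
THREAD_ID_RE = re.compile(r"\bthr_[0-9a-f]{32}\b")


def resolve_thread_id(leader_output, worker_output):
    """Return the single thread ID both agents reported, or raise ValueError."""
    leader_ids = set(THREAD_ID_RE.findall(leader_output))
    worker_ids = set(THREAD_ID_RE.findall(worker_output))
    common = leader_ids & worker_ids
    if len(common) != 1:
        raise ValueError(f"expected exactly one shared thread id, got {sorted(common)}")
    return common.pop()
```

Requiring exactly one ID shared by both outputs guards against validating a thread that only one of the agents touched.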

Validation Commands

ORCH_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json status --run run_blog_skill_cancel_001
ORCH_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json ready --run run_blog_skill_cancel_001
INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json show --thread THREAD_ID_1
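
For a scripted replay, the three commands above can be built and run from Python. This is a minimal sketch: the argv lists mirror the commands exactly as written above, while the helper names and the assumption that each command prints a single JSON document on stdout are illustrative.

```python
import json
import subprocess


def validation_commands(orch_skill_path, inbox_skill_path, tmpdir, run_id, thread_id):
    """Build the three validation command lines as argv lists, keyed by name."""
    db = f"{tmpdir}/coord.db"
    orch = f"{orch_skill_path}/assets/orch"
    inbox = f"{inbox_skill_path}/assets/inbox"
    return {
        "status": [orch, "--db", db, "--json", "status", "--run", run_id],
        "ready": [orch, "--db", db, "--json", "ready", "--run", run_id],
        "show": [inbox, "--db", db, "--json", "show", "--thread", thread_id],
    }


def run_validation(commands):
    """Run each command and parse its stdout as JSON.

    Assumption: --json makes each command emit one JSON document on stdout.
    """
    results = {}
    for name, argv in commands.items():
        # check=True raises CalledProcessError on a non-zero exit code.
        proc = subprocess.run(argv, capture_output=True, text=True, check=True)
        results[name] = json.loads(proc.stdout)
    return results
```

Keeping command construction separate from execution means the argv lists can be inspected or logged without touching the database.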

Expected Outcomes

worker-a successfully claims the original thread and reports in_progress
the leader successfully cancels T1 through orch cancel --task
the original thread reaches cancelled
the untouched task T2 remains available in the ready queue
the run remains open rather than collapsing into a fully cancelled run

Assertions

status.data.tasks contains T1 with status cancelled
status.data.tasks contains T2 with status ready
status.data.run.status == "ready"
ready.data.tasks contains only T2
show.data.thread.status == "cancelled"
the thread history preserves the worker progress message before the cancel
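
The structural assertions above can be expressed as a check over the parsed JSON output of the three validation commands. This is a sketch under a stated assumption: each task entry is a dict with a `status` field and an identifier field, here guessed as `id` and exposed as the `id_key` parameter because the source does not name it. The thread-history assertion is left out because the source does not name the field that holds the history.

```python
def check_assertions(status, ready, show, id_key="id"):
    """Return a list of failed assertions; an empty list means the case passed.

    Assumption: each task is a dict carrying `id_key` and "status" fields.
    """
    failures = []
    tasks = {t[id_key]: t["status"] for t in status["data"]["tasks"]}
    if tasks.get("T1") != "cancelled":
        failures.append("T1 is not cancelled in status")
    if tasks.get("T2") != "ready":
        failures.append("T2 is not ready in status")
    if status["data"]["run"]["status"] != "ready":
        failures.append("run status is not ready")
    # The ready queue must contain T2 and nothing else.
    if [t[id_key] for t in ready["data"]["tasks"]] != ["T2"]:
        failures.append("ready queue is not exactly [T2]")
    if show["data"]["thread"]["status"] != "cancelled":
        failures.append("thread is not cancelled")
    return failures
```

Collecting failures instead of stopping at the first one makes it easier to tell a partial cancel (T1 cancelled but T2 also gone) from a cancel that never happened.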

Cleanup

use the default cleanup policy from README.md
if the run fails, retain TMPDIR and coord.db for replay and manual inspection

Recorded Example Run

recorded on: 2026-03-19
execution mode: direct_cli_replay via scripts/run_orch_skill_forward_tests.sh
result: pass
observed run id: run_blog_skill_cancel_001
observed thread id: thr_175e00bca76549ea8529cb4c92d99fd4
evidence summary:
final orch status --run run_blog_skill_cancel_001 --json returned run.status == "ready" with task counts cancelled: 1 and ready: 1
that same status output showed T1.status == "cancelled" while T2.status == "ready"
final orch ready --run run_blog_skill_cancel_001 --json returned only T2, confirming the untouched task remained dispatchable
final inbox show --thread thr_175e00bca76549ea8529cb4c92d99fd4 --json returned thread.status == "cancelled" and preserved the worker progress message before the cancel
note: this recorded run exercised the packaged binaries directly against a temporary DB and did not spawn separate Codex role agents
