ai-workflow-skill/docs/tests/orch-skill/leader-cancels-active-task-through-bundled-cli.md

Case: leader-cancels-active-task-through-bundled-cli

Test Type

This is a forward-test and a direct task-cancel skill validation.

The goal is to verify that a leader using the packaged orch skill can cancel an already active task attempt without cancelling unrelated ready work in the same run.

Purpose

Validate that all of the following can be true at the same time:

  • the leader can use dispatch, cancel, ready, and status through the bundled orch skill
  • worker-a can claim the original thread and report active progress through the bundled inbox skill
  • the leader can cancel that active task through orch cancel --task
  • the original thread reaches cancelled
  • another task in the same run remains actionable instead of being implicitly cancelled

Preconditions

  • orch skill path exists: ORCH_SKILL_PATH=skills/orch
  • inbox skill path exists: INBOX_SKILL_PATH=skills/inbox
  • bundled CLI executables exist at ORCH_SKILL_PATH/assets/orch and INBOX_SKILL_PATH/assets/inbox
  • use an empty temporary directory TMPDIR
  • before launching role agents, initialize TMPDIR/coord.db by running INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json init
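
The preconditions above can be checked mechanically before any agent is launched. The following is a minimal sketch, not part of the test harness: the function names `missing_preconditions` and `init_command` are hypothetical, and the only CLI invocation it builds is the `init` command quoted in the preconditions.

```python
import os
from pathlib import Path


def missing_preconditions(orch_skill_path, inbox_skill_path, tmpdir):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    orch_cli = Path(orch_skill_path) / "assets" / "orch"
    inbox_cli = Path(inbox_skill_path) / "assets" / "inbox"
    for label, cli in (("orch", orch_cli), ("inbox", inbox_cli)):
        if not cli.is_file():
            problems.append(f"{label} CLI not found at {cli}")
        elif not os.access(cli, os.X_OK):
            problems.append(f"{label} CLI at {cli} is not executable")
    tmp = Path(tmpdir)
    if not tmp.is_dir():
        problems.append(f"TMPDIR {tmp} does not exist")
    elif any(tmp.iterdir()):
        problems.append(f"TMPDIR {tmp} is not empty")
    return problems


def init_command(inbox_skill_path, tmpdir):
    """Build the argv for the one-time DB initialization from the preconditions."""
    return [
        str(Path(inbox_skill_path) / "assets" / "inbox"),
        "--db", str(Path(tmpdir) / "coord.db"),
        "--json", "init",
    ]
```

A harness could refuse to start when `missing_preconditions(...)` is non-empty and otherwise pass the result of `init_command(...)` to `subprocess.run`.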

Agent Topology

  • leader
  • worker-a

Inputs

Leader Prompt

Use $orch at ORCH_SKILL_PATH to act as leader on the already initialized SQLite DB TMPDIR/coord.db. Only coordinate through the bundled orch CLI from the skill. Workflow: 1) create run run_blog_skill_cancel_001, 2) add task T1 for worker-a and a second task T2 that should remain untouched, 3) dispatch T1 with --execution-mode analysis, 4) wait until worker-a has claimed it or marked it in progress, 5) cancel T1 with a clear reason through orch, 6) inspect ready work and final run status, 7) stop after reporting THREAD_ID_1. Do not use ordinary chat to coordinate with the worker.

Worker Prompt

Use $inbox at INBOX_SKILL_PATH to act as worker-a on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) fetch and claim the assigned thread, 2) send one in_progress update, 3) stop after reporting THREAD_ID_1 and that the task became active. Do not use ordinary chat to coordinate with the leader.

Execution Parameters

  • use the shared execution contract from README.md
  • use the shared timeout defaults from README.md
  • do not override the default cleanup policy

Execution Steps

  1. Initialize TMPDIR/coord.db once through the bundled inbox CLI before launching agents
  2. Inject skills/orch/ into leader
  3. Inject skills/inbox/ into worker-a
  4. Point both agents at the same database path TMPDIR/coord.db
  5. Launch leader and worker-a in parallel
  6. Wait for both agents to finish
  7. Resolve THREAD_ID_1 from the agent outputs
  8. Independently run the validation commands from the main thread
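
Step 7 can be sketched as a small text search over the two agent outputs. This is an illustration under stated assumptions: the `thr_` prefix followed by 32 lowercase hex characters is inferred only from the thread id in the recorded example run, and `resolve_thread_id` is a hypothetical helper name.

```python
import re

# ASSUMPTION: thread ids look like the one in the recorded example run
# ("thr_" followed by 32 lowercase hex characters).
THREAD_ID_PATTERN = re.compile(r"\bthr_[0-9a-f]{32}\b")


def resolve_thread_id(leader_output, worker_output):
    """Return the single thread id that both agents reported."""
    leader_ids = set(THREAD_ID_PATTERN.findall(leader_output))
    worker_ids = set(THREAD_ID_PATTERN.findall(worker_output))
    common = leader_ids & worker_ids
    if len(common) != 1:
        raise ValueError(
            f"expected exactly one shared thread id, found {sorted(common)}"
        )
    return common.pop()
```

Requiring the same id in both outputs guards against validating a thread that only one side touched.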

Validation Commands

ORCH_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json status --run run_blog_skill_cancel_001
ORCH_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json ready --run run_blog_skill_cancel_001
INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json show --thread THREAD_ID_1
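
For a scripted replay, the three commands above could be assembled and run as follows. This is a sketch rather than the harness implementation: it assumes each command exits with status 0 and prints a single JSON document on stdout when `--json` is given, the helper names are hypothetical, and the 60 second timeout is a placeholder, not the README default.

```python
import json
import subprocess


def validation_commands(orch_cli, inbox_cli, db_path, run_id, thread_id):
    """Build the three validation argv lists, keyed by the name used in the assertions."""
    return {
        "status": [orch_cli, "--db", db_path, "--json", "status", "--run", run_id],
        "ready": [orch_cli, "--db", db_path, "--json", "ready", "--run", run_id],
        "show": [inbox_cli, "--db", db_path, "--json", "show", "--thread", thread_id],
    }


def run_json(argv, timeout=60):
    """Run one command and parse its stdout as JSON.

    ASSUMPTION: the CLI writes exactly one JSON document to stdout.
    Raises CalledProcessError on a non-zero exit status.
    """
    completed = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=True
    )
    return json.loads(completed.stdout)
```

Calling `run_json` on each value of `validation_commands(...)` yields the `status`, `ready`, and `show` payloads that the assertions refer to.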

Expected Outcomes

  • worker-a successfully claims the original thread and reports in_progress
  • the leader successfully cancels T1 through orch cancel --task
  • the original thread reaches cancelled
  • the untouched task T2 remains available in the ready queue
  • the run remains open rather than collapsing into a fully cancelled run

Assertions

  • status.data.tasks contains T1 with status cancelled
  • status.data.tasks contains T2 with status ready
  • status.data.run.status == "ready"
  • ready.data.tasks contains only T2
  • show.data.thread.status == "cancelled"
  • the thread history preserves the worker progress message before the cancel
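
The first five assertions can be expressed as one function over the parsed JSON payloads. This sketch assumes that each entry in `data.tasks` is an object with `id` and `status` keys; the document names the containers but not the per-task field names, so those keys are hypothetical. The thread-history assertion is left out because the shape of the history is not specified here.

```python
def check_assertions(status, ready, show):
    """Return a list of failed assertions; an empty list means the case passed."""
    failures = []

    # ASSUMPTION: each task entry is an object with "id" and "status" keys.
    task_status = {t["id"]: t["status"] for t in status["data"]["tasks"]}
    if task_status.get("T1") != "cancelled":
        failures.append(f"T1 status is {task_status.get('T1')!r}, expected 'cancelled'")
    if task_status.get("T2") != "ready":
        failures.append(f"T2 status is {task_status.get('T2')!r}, expected 'ready'")

    run_status = status["data"]["run"]["status"]
    if run_status != "ready":
        failures.append(f"run status is {run_status!r}, expected 'ready'")

    # The ready queue must contain T2 and nothing else.
    ready_ids = [t["id"] for t in ready["data"]["tasks"]]
    if ready_ids != ["T2"]:
        failures.append(f"ready queue is {ready_ids}, expected ['T2']")

    thread_status = show["data"]["thread"]["status"]
    if thread_status != "cancelled":
        failures.append(f"thread status is {thread_status!r}, expected 'cancelled'")

    return failures
```

Collecting failures instead of stopping at the first one makes a failed replay easier to diagnose, since a cancel that leaked into T2 would typically trip several assertions at once.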

Cleanup

  • use the default cleanup policy from README.md
  • if the run fails, retain TMPDIR and coord.db for replay and manual inspection
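
The retain-on-failure rule can be illustrated with a context manager. The default cleanup policy itself lives in README.md and is not reproduced here, so this sketch assumes that a passing run deletes its temporary directory; `scratch_dir` is a hypothetical name.

```python
import shutil
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def scratch_dir(prefix="orch-cancel-"):
    """Yield a fresh temporary directory; keep it only if the body raises."""
    path = tempfile.mkdtemp(prefix=prefix)
    try:
        yield path
    except BaseException:
        # Failure: leave the directory (and coord.db inside it) for inspection.
        print(f"run failed; retaining {path} for replay", file=sys.stderr)
        raise
    else:
        # ASSUMPTION: the default policy removes the directory after a pass.
        shutil.rmtree(path)
```

Wrapping the whole case in `with scratch_dir() as tmpdir:` keeps the cleanup decision in one place and tied to the outcome of the run.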

Recorded Example Run

  • recorded on: 2026-03-19
  • execution mode: direct_cli_replay via scripts/run_orch_skill_forward_tests.sh
  • result: pass
  • observed run id: run_blog_skill_cancel_001
  • observed thread id: thr_175e00bca76549ea8529cb4c92d99fd4
  • evidence summary:
      • final orch status --run run_blog_skill_cancel_001 --json returned run.status == "ready" with task counts cancelled: 1 and ready: 1
      • that same status output showed T1.status == "cancelled" while T2.status == "ready"
      • final orch ready --run run_blog_skill_cancel_001 --json returned only T2, confirming the untouched task remained dispatchable
      • final inbox show --thread thr_175e00bca76549ea8529cb4c92d99fd4 --json returned thread.status == "cancelled" and preserved the worker progress message before the cancel
  • note: this recorded run exercised the packaged binaries directly in a temporary DB and did not spawn separate Codex role agents