# Case: `leader-cancels-active-task-through-bundled-cli`
## Test Type
This is a `forward-test` that directly validates the skill's task-cancel path.
The goal is to verify that a leader using the packaged `orch` skill can cancel an already active task attempt without cancelling unrelated ready work in the same run.
## Purpose
Validate that all of the following can be true at the same time:
- the leader can use `dispatch`, `cancel`, `ready`, and `status` through the bundled orch skill
- `worker-a` can claim the original thread and report active progress through the bundled inbox skill
- the leader can cancel that active task through `orch cancel --task`
- the original thread reaches `cancelled`
- another task in the same run remains actionable instead of being implicitly cancelled
## Preconditions
- orch skill path exists: `ORCH_SKILL_PATH=skills/orch`
- inbox skill path exists: `INBOX_SKILL_PATH=skills/inbox`
- bundled CLI executables exist at `ORCH_SKILL_PATH/assets/orch` and `INBOX_SKILL_PATH/assets/inbox`
- use an empty temporary directory `TMPDIR`
- before launching role agents, initialize `TMPDIR/coord.db` by running `INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json init`
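
The preconditions above can be checked mechanically before any agent is launched. The following is a minimal Python sketch, not part of the test harness: the function names `missing_preconditions` and `init_command` are hypothetical, and only the paths and the `--db ... --json init` invocation are taken from this document.

```python
import os
from pathlib import Path


def missing_preconditions(orch_skill_path, inbox_skill_path, tmpdir):
    """Return a list of problems; an empty list means every precondition holds."""
    problems = []
    for name, skill in (("orch", orch_skill_path), ("inbox", inbox_skill_path)):
        cli = Path(skill) / "assets" / name
        if not Path(skill).is_dir():
            problems.append(f"{name} skill path missing: {skill}")
        elif not (cli.is_file() and os.access(cli, os.X_OK)):
            problems.append(f"bundled {name} CLI missing or not executable: {cli}")
    tmp = Path(tmpdir)
    if not tmp.is_dir():
        problems.append(f"TMPDIR is not a directory: {tmpdir}")
    elif any(tmp.iterdir()):
        problems.append(f"TMPDIR is not empty: {tmpdir}")
    return problems


def init_command(inbox_skill_path, tmpdir):
    """Build the argv that initializes coord.db through the bundled inbox CLI."""
    return [
        str(Path(inbox_skill_path) / "assets" / "inbox"),
        "--db", str(Path(tmpdir) / "coord.db"),
        "--json", "init",
    ]
```

Once `missing_preconditions(...)` returns an empty list, the argv from `init_command(...)` could be passed to `subprocess.run(..., check=True)` to create the database.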
## Agent Topology
- `leader`
- `worker-a`
## Inputs
### Leader Prompt
```text
Use $orch at ORCH_SKILL_PATH to act as leader on the already initialized SQLite DB TMPDIR/coord.db. Only coordinate through the bundled orch CLI from the skill. Workflow: 1) create run run_blog_skill_cancel_001, 2) add task T1 for worker-a and a second task T2 that should remain untouched, 3) dispatch T1 with --execution-mode analysis, 4) wait until worker-a has claimed it or marked it in progress, 5) cancel T1 with a clear reason through orch, 6) inspect ready work and final run status, 7) stop after reporting THREAD_ID_1. Do not use ordinary chat to coordinate with the worker.
```
### Worker Prompt
```text
Use $inbox at INBOX_SKILL_PATH to act as worker-a on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) fetch and claim the assigned thread, 2) send one in_progress update, 3) stop after reporting THREAD_ID_1 and that the task became active. Do not use ordinary chat to coordinate with the leader.
```
## Execution Parameters
- use the shared execution contract from [README.md](./README.md)
- use the shared timeout defaults from [README.md](./README.md)
- do not override the default cleanup policy
## Execution Steps
1. Initialize `TMPDIR/coord.db` once through the bundled inbox CLI before launching agents
2. Inject `skills/orch/` into `leader`
3. Inject `skills/inbox/` into `worker-a`
4. Point both agents at the same database path `TMPDIR/coord.db`
5. Launch `leader` and `worker-a` in parallel
6. Wait for both agents to finish
7. Resolve `THREAD_ID_1` from the agent outputs
8. Independently run the validation commands from the main thread
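
Step 7 can be sketched as a small text-extraction helper. This is an illustration only: the helper name `resolve_thread_id` is hypothetical, and the id pattern (`thr_` followed by 32 hexadecimal digits) is an assumption inferred from the thread id in the recorded example run, not a documented format.

```python
import re

# Assumption: thread ids look like the recorded example, "thr_" + 32 hex digits.
THREAD_ID_PATTERN = re.compile(r"\bthr_[0-9a-f]{32}\b")


def resolve_thread_id(agent_outputs):
    """Return the single thread id that every agent output mentions.

    agent_outputs maps an agent name (e.g. "leader") to its captured text.
    Raises ValueError if an agent reports no id or the agents disagree.
    """
    common = None
    for agent, text in agent_outputs.items():
        found = set(THREAD_ID_PATTERN.findall(text))
        if not found:
            raise ValueError(f"{agent} did not report a thread id")
        # Keep only the ids that all agents seen so far have in common.
        common = found if common is None else common & found
    if common is None or len(common) != 1:
        raise ValueError(f"could not resolve a unique thread id: {sorted(common or [])}")
    return next(iter(common))
```

Requiring both agents to agree on one id guards against validating a thread that only one side actually touched.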
## Validation Commands
```bash
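# Assumes ORCH_SKILL_PATH, INBOX_SKILL_PATH, TMPDIR, and THREAD_ID_1 are set in the shell to the values described above.
"$ORCH_SKILL_PATH/assets/orch" --db "$TMPDIR/coord.db" --json status --run run_blog_skill_cancel_001
"$ORCH_SKILL_PATH/assets/orch" --db "$TMPDIR/coord.db" --json ready --run run_blog_skill_cancel_001
"$INBOX_SKILL_PATH/assets/inbox" --db "$TMPDIR/coord.db" --json show --thread "$THREAD_ID_1"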
```
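
For scripted validation, the three commands can be assembled and run from Python. The sketch below is not the repository's runner: `validation_commands` and `run_validation` are hypothetical names, and it assumes each command prints a single JSON document on stdout when `--json` is given.

```python
import json
import subprocess


def validation_commands(orch_skill_path, inbox_skill_path, tmpdir, run_id, thread_id):
    """Return the three validation argv lists, keyed by the name used in the assertions."""
    db = f"{tmpdir}/coord.db"
    orch = [f"{orch_skill_path}/assets/orch", "--db", db, "--json"]
    inbox = [f"{inbox_skill_path}/assets/inbox", "--db", db, "--json"]
    return {
        "status": orch + ["status", "--run", run_id],
        "ready": orch + ["ready", "--run", run_id],
        "show": inbox + ["show", "--thread", thread_id],
    }


def run_validation(commands):
    """Run each command and parse its stdout as JSON (assumed: one document per call)."""
    results = {}
    for name, argv in commands.items():
        completed = subprocess.run(argv, capture_output=True, text=True, check=True)
        results[name] = json.loads(completed.stdout)
    return results
```

Passing argv lists rather than a shell string avoids quoting problems if `TMPDIR` contains spaces.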
## Expected Outcomes
- `worker-a` successfully claims the original thread and reports `in_progress`
- the leader successfully cancels `T1` through `orch cancel --task`
- the original thread reaches `cancelled`
- the untouched task `T2` remains available in the ready queue
- the run remains open rather than collapsing into a fully cancelled run
## Assertions
- `status.data.tasks` contains `T1` with status `cancelled`
- `status.data.tasks` contains `T2` with status `ready`
- `status.data.run.status == "ready"`
- `ready.data.tasks` contains only `T2`
- `show.data.thread.status == "cancelled"`
- the thread history preserves the worker `progress` message before the cancel
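
The structural assertions above can be expressed as a checker over the parsed JSON payloads. This is a sketch under stated assumptions: `check_assertions` is a hypothetical helper, and it assumes each entry in `data.tasks` is an object carrying an id field (here `id`, configurable through `id_key`) and a `status` field. The thread-history assertion is left out because the message field names are not specified in this document.

```python
def check_assertions(status, ready, show, id_key="id"):
    """Return a list of failed assertions; an empty list means the case passed.

    status, ready, and show are the parsed JSON outputs of the three
    validation commands. Assumption: tasks are objects with `id_key`
    and "status" fields; the real key name may differ.
    """
    failures = []

    # Map task id -> task status from the `status` payload.
    task_status = {task[id_key]: task["status"] for task in status["data"]["tasks"]}
    for task_id, expected in (("T1", "cancelled"), ("T2", "ready")):
        actual = task_status.get(task_id)
        if actual != expected:
            failures.append(f"status: {task_id} is {actual!r}, expected {expected!r}")

    run_status = status["data"]["run"]["status"]
    if run_status != "ready":
        failures.append(f"status: run is {run_status!r}, expected 'ready'")

    # The ready queue must contain T2 and nothing else.
    ready_ids = [task[id_key] for task in ready["data"]["tasks"]]
    if ready_ids != ["T2"]:
        failures.append(f"ready: tasks are {ready_ids!r}, expected only 'T2'")

    thread_status = show["data"]["thread"]["status"]
    if thread_status != "cancelled":
        failures.append(f"show: thread is {thread_status!r}, expected 'cancelled'")

    return failures
```

Collecting failures in a list, rather than stopping at the first one, makes a failed run easier to diagnose from a single report.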
## Cleanup
- use the default cleanup policy from [README.md](./README.md)
- if the run fails, retain `TMPDIR` and `coord.db` for replay and manual inspection
## Recorded Example Run
- recorded on: `2026-03-19`
- execution mode: `direct_cli_replay` via `scripts/run_orch_skill_forward_tests.sh`
- result: `pass`
- observed run id: `run_blog_skill_cancel_001`
- observed thread id: `thr_175e00bca76549ea8529cb4c92d99fd4`
- evidence summary:
  - final `orch status --run run_blog_skill_cancel_001 --json` returned `run.status == "ready"` with task counts `cancelled: 1` and `ready: 1`
  - that same `status` output showed `T1.status == "cancelled"` while `T2.status == "ready"`
  - final `orch ready --run run_blog_skill_cancel_001 --json` returned only `T2`, confirming the untouched task remained dispatchable
  - final `inbox show --thread thr_175e00bca76549ea8529cb4c92d99fd4 --json` returned `thread.status == "cancelled"` and preserved the worker `progress` message before the cancel
- note: this recorded run exercised the packaged binaries directly in a temporary DB and did not spawn separate Codex role agents