Case: `leader-retries-failed-task-through-bundled-cli`

Test Type

This is a forward test that also validates the skill's retry path.

The goal is to verify that a leader using the packaged orch skill can reconcile a failed attempt, issue a retry, and drive the task to success through a second attempt handled by a real worker.

Purpose

Validate that all of the following can be true at the same time:

the leader can use the bundled orch skill to dispatch an initial attempt
the worker can fail the first attempt through the bundled inbox CLI
the leader can reconcile that failure and create a fresh retry attempt
the worker can complete the retried attempt
the final run reaches done and the two attempts map to different threads

Preconditions

orch skill path exists: ORCH_SKILL_PATH=skills/orch
inbox skill path exists: INBOX_SKILL_PATH=skills/inbox
bundled CLI executables exist at ORCH_SKILL_PATH/assets/orch and INBOX_SKILL_PATH/assets/inbox
use an empty temporary directory TMPDIR
before launching the role agents, initialize TMPDIR/coord.db by running INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json init
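
The preconditions above can be checked mechanically before any agent is launched. The following is a minimal sketch, not part of the shared execution contract: the helper names `init_command` and `missing_preconditions` are hypothetical, and only the paths and the `--db ... --json init` invocation come from this case.

```python
import os
from pathlib import Path


def init_command(inbox_skill_path, tmpdir):
    """Build the argv for the one-time database initialization."""
    return [
        str(Path(inbox_skill_path) / "assets" / "inbox"),
        "--db", str(Path(tmpdir) / "coord.db"),
        "--json", "init",
    ]


def missing_preconditions(orch_skill_path, inbox_skill_path, tmpdir):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    executables = (
        Path(orch_skill_path) / "assets" / "orch",
        Path(inbox_skill_path) / "assets" / "inbox",
    )
    for exe in executables:
        # Each bundled CLI must exist and be executable by the current user.
        if not (exe.is_file() and os.access(exe, os.X_OK)):
            problems.append(f"missing or non-executable CLI: {exe}")
    tmp = Path(tmpdir)
    if not tmp.is_dir():
        problems.append(f"TMPDIR does not exist: {tmp}")
    elif any(tmp.iterdir()):
        # The case requires an empty directory before coord.db is created.
        problems.append(f"TMPDIR is not empty: {tmp}")
    return problems
```

Run `missing_preconditions` first, then pass the result of `init_command` to a process runner of your choice. Doing both before the agents start keeps the leader and the worker from racing on an uninitialized database.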

Agent Topology

leader
worker-a

Inputs

Leader Prompt

Use $orch at ORCH_SKILL_PATH to act as leader on the already initialized SQLite DB TMPDIR/coord.db. Only coordinate through the bundled orch CLI from the skill. Workflow: 1) create run run_blog_skill_retry_001, 2) add and dispatch one task T1 to worker-a, 3) wait until the first attempt fails, 4) reconcile, 5) retry T1 with a short retry note, 6) wait until the retried attempt completes, 7) reconcile again and inspect final status, 8) stop after reporting RUN_ID, THREAD_ID_1, and THREAD_ID_2. Do not use ordinary chat to coordinate with the worker.

Worker Prompt

Use $inbox at INBOX_SKILL_PATH to act as worker-a on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) fetch and claim the first assigned thread, 2) fail that first attempt with a clear summary, 3) keep watching for retried work assigned to worker-a, 4) fetch and claim the retried thread, 5) finish the retried attempt with done, 6) stop after reporting both THREAD_ID_1 and THREAD_ID_2. Do not use ordinary chat to coordinate with the leader.

Execution Parameters

use the shared execution contract from README.md
use the shared timeout defaults from README.md
do not override the default cleanup policy

Execution Steps

Initialize TMPDIR/coord.db once through the bundled inbox CLI before launching agents
Inject skills/orch/ into leader
Inject skills/inbox/ into worker-a
Point both agents at the same database path TMPDIR/coord.db
Launch leader and worker-a in parallel
Wait for both agents to finish
Resolve THREAD_ID_1 and THREAD_ID_2 from the agent outputs
Independently run the validation commands from the main thread
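
The "launch in parallel, then wait for both" steps can be sketched as follows. How an agent is actually started is defined by the shared execution contract, not by this case, so each agent is represented here by a hypothetical zero-argument callable that returns the agent's final text output.

```python
from concurrent.futures import ThreadPoolExecutor


def run_agents_in_parallel(agents):
    """Run every agent concurrently and return {role: output}.

    `agents` maps a role name (for example "leader" or "worker-a") to a
    zero-argument callable. The callables are placeholders for whatever
    launcher the harness provides.
    """
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        futures = {role: pool.submit(fn) for role, fn in agents.items()}
        # result() blocks until that agent finishes and re-raises its errors.
        return {role: fut.result() for role, fut in futures.items()}
```

The agents must run concurrently rather than one after the other: the leader waits for the first attempt to fail, and the worker waits for the retried thread, so each side blocks on progress made by the other.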

Validation Commands

ORCH_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json status --run run_blog_skill_retry_001
INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json show --thread THREAD_ID_1
INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json show --thread THREAD_ID_2
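
To run these commands from a harness instead of by hand, a small wrapper along these lines may help. This is a sketch under one stated assumption: that with `--json` each CLI prints a single JSON document to stdout and exits non-zero on error. The function names are hypothetical; the argument lists mirror the commands above.

```python
import json
import subprocess


def validation_commands(orch_bin, inbox_bin, db, run_id, thread_ids):
    """Build one status command for the run and one show command per thread."""
    commands = [[orch_bin, "--db", db, "--json", "status", "--run", run_id]]
    for thread_id in thread_ids:
        commands.append(
            [inbox_bin, "--db", db, "--json", "show", "--thread", thread_id]
        )
    return commands


def run_json(argv):
    """Run one command and parse its stdout as JSON.

    Assumption: stdout holds exactly one JSON document. check=True raises
    CalledProcessError on a non-zero exit code.
    """
    proc = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)
```

Passing argument lists rather than a shell string avoids quoting problems if TMPDIR contains spaces.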

Expected Outcomes

the first worker-owned thread reaches failed
the leader successfully issues a retry
the second worker-owned thread is distinct from the first
the second worker-owned thread reaches done
the final run state is done

Assertions

THREAD_ID_1 != THREAD_ID_2
status.data.run.status == "done"
status.data.tasks[0].status == "done"
show THREAD_ID_1 reports a terminal failed thread state
show THREAD_ID_2 reports a terminal done thread state
the worker summary confirms that the retried attempt was a new thread rather than a reused one
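
The machine-checkable assertions above can be collected into one function. This is a sketch, not the harness's actual checker. `status` is the parsed output of the status command, read at the paths given in the assertions. Because this case does not specify the layout of the show output, the thread states are passed in as plain strings that the caller has already extracted; that split is an assumption of the sketch.

```python
def check_assertions(status, thread_1, thread_2):
    """Return a list of failed assertions; an empty list means the case passed.

    thread_1 and thread_2 are (thread_id, state) tuples. How the state is
    read from the show output is left to the caller.
    """
    failures = []
    (id_1, state_1), (id_2, state_2) = thread_1, thread_2

    if id_1 == id_2:
        failures.append("THREAD_ID_1 and THREAD_ID_2 are the same thread")

    run_status = status["data"]["run"]["status"]
    if run_status != "done":
        failures.append(f"run status is {run_status!r}, expected 'done'")

    task_status = status["data"]["tasks"][0]["status"]
    if task_status != "done":
        failures.append(f"task status is {task_status!r}, expected 'done'")

    if state_1 != "failed":
        failures.append(f"first thread is {state_1!r}, expected 'failed'")
    if state_2 != "done":
        failures.append(f"second thread is {state_2!r}, expected 'done'")
    return failures
```

The last assertion, about the worker summary, concerns free text in the agent output and still needs a manual read or a separate check.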

Cleanup

use the default cleanup policy from README.md
if the run fails, retain TMPDIR and coord.db for replay and manual inspection
