Case: `leader-answers-blocked-task-with-payload-json-through-bundled-cli`

Test Type

This is a forward-test and a structured-answer skill validation.

The goal is to verify that a leader using the packaged orch skill can answer a blocked task with pure --payload-json, allowing the worker to resume without relying on a freeform answer body.

Purpose

Validate that all of the following can be true at the same time:

the leader can use wait, blocked, answer --payload-json, reconcile, and status through the bundled orch skill
a worker can post a blocked question through the bundled inbox skill
the answer reaches the active thread as structured payload data
the worker resumes after reading that payload and completes the task
the final run reaches done

Preconditions

orch skill path exists: ORCH_SKILL_PATH=skills/orch
inbox skill path exists: INBOX_SKILL_PATH=skills/inbox
bundled CLI executables exist at ORCH_SKILL_PATH/assets/orch and INBOX_SKILL_PATH/assets/inbox
use an empty temporary directory TMPDIR
initialize TMPDIR/coord.db before launching role agents through INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json init

Agent Topology

leader
worker-a

Inputs

Leader Prompt

Use $orch at ORCH_SKILL_PATH to act as leader on the already initialized SQLite DB TMPDIR/coord.db. Only coordinate through the bundled orch CLI from the skill. Workflow: 1) create run run_blog_skill_payload_answer_001, 2) add and dispatch one task T1 to worker-a, 3) wait until the task becomes blocked, 4) inspect blocked tasks, 5) answer the blocked question using payload-json only with decision=stdout, source=leader, and format=structured, 6) wait until the task completes, 7) reconcile and inspect final status, 8) stop after reporting RUN_ID and THREAD_ID. Do not use ordinary chat to coordinate with the worker.

Worker Prompt

Use $inbox at INBOX_SKILL_PATH to act as worker-a on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) fetch and claim the assigned task, 2) send a blocked update asking for a structured logging decision, 3) wait for a reply, 4) confirm the reply payload tells you to use stdout, 5) finish the task with done, 6) stop after reporting the THREAD_ID you handled. Do not use ordinary chat to coordinate with the leader.

Execution Parameters

use the shared execution contract from README.md
use the shared timeout defaults from README.md
do not override the default cleanup policy

Execution Steps

Initialize TMPDIR/coord.db once through the bundled inbox CLI before launching agents
Inject skills/orch/ into leader
Inject skills/inbox/ into worker-a
Point both agents at the same database path TMPDIR/coord.db
Launch leader and worker-a in parallel
Wait for both agents to finish
Resolve THREAD_ID from the agent outputs
Independently run the validation commands from the main thread

Validation Commands

ORCH_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json status --run run_blog_skill_payload_answer_001
INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json show --thread THREAD_ID

Expected Outcomes

the leader successfully observes a blocked event and inspects the blocked queue
the leader successfully emits one payload-only answer through orch
worker-a receives that answer through inbox history and sees payload_json.decision == "stdout"
worker-a completes the task after the structured answer arrives
the final run state is done

Assertions

status.data.run.status == "done"
status.data.tasks[0].status == "done"
show.data.messages[*].kind includes question, answer, and result
one question message contains payload_json.question == "Use stdout or stderr for structured logs?"
one answer message contains payload_json.decision == "stdout"
one answer message contains payload_json.source == "leader"
one answer message contains payload_json.format == "structured"
the final thread status is done

Cleanup

use the default cleanup policy from README.md
if the run fails, retain TMPDIR and coord.db for replay and manual inspection

Recorded Example Run

recorded on: 2026-03-19
execution mode: direct_cli_replay via scripts/run_orch_skill_forward_tests.sh
result: pass
observed run id: run_blog_skill_payload_answer_001
observed thread id: thr_735bde0f91794174b2b85fbe89e80581
evidence summary:
orch wait --for task_blocked woke after the worker question, and orch blocked listed task T1 as the active blocked task
orch answer --payload-json '{"decision":"stdout","source":"leader","format":"structured"}' appended an answer message with those exact payload fields and an empty body
inbox wait-reply woke on that structured answer and exposed payload_json.decision == "stdout"
final orch status --run run_blog_skill_payload_answer_001 --json returned run.status == "done" and tasks[0].status == "done"
final inbox show --thread thr_735bde0f91794174b2b85fbe89e80581 --json contained the blocked question, the structured answer, and the terminal result
note: this recorded run exercised the packaged binaries directly in a temporary DB and did not spawn separate Codex role agents

5.3 KiB Raw Blame History

Case: leader-answers-blocked-task-with-payload-json-through-bundled-cli