Case: leader-answers-blocked-task-with-payload-json-through-bundled-cli
Test Type
This is a forward-test and a structured-answer skill validation.
The goal is to verify that a leader using the packaged orch skill can answer a blocked task with pure --payload-json, allowing the worker to resume without relying on a freeform answer body.
Purpose
Validate that all of the following can be true at the same time:
- the leader can use wait, blocked, answer --payload-json, reconcile, and status through the bundled orch skill
- a worker can post a blocked question through the bundled inbox skill
- the answer reaches the active thread as structured payload data
- the worker resumes after reading that payload and completes the task
- the final run reaches done
Preconditions
- orch skill path exists: ORCH_SKILL_PATH=skills/orch
- inbox skill path exists: INBOX_SKILL_PATH=skills/inbox
- bundled CLI executables exist at ORCH_SKILL_PATH/assets/orch and INBOX_SKILL_PATH/assets/inbox
- use an empty temporary directory TMPDIR
- initialize TMPDIR/coord.db before launching role agents through INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json init
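The preconditions above can be checked mechanically before any agent is launched. The following is a minimal sketch, not part of the packaged skills: the helper names check_preconditions and init_command are hypothetical, and the only CLI arguments used are the ones listed in the preconditions.

```python
import os
from pathlib import Path


def check_preconditions(orch_skill_path, inbox_skill_path, tmpdir):
    """Return a list of precondition failures; an empty list means all checks passed."""
    problems = []
    orch_cli = Path(orch_skill_path) / "assets" / "orch"
    inbox_cli = Path(inbox_skill_path) / "assets" / "inbox"
    for cli in (orch_cli, inbox_cli):
        # A bundled CLI must be a regular file that the current user may execute.
        if not (cli.is_file() and os.access(cli, os.X_OK)):
            problems.append(f"missing or non-executable CLI: {cli}")
    tmp = Path(tmpdir)
    if not tmp.is_dir():
        problems.append(f"TMPDIR is not a directory: {tmp}")
    elif any(tmp.iterdir()):
        # The case requires an empty temporary directory.
        problems.append(f"TMPDIR is not empty: {tmp}")
    return problems


def init_command(inbox_skill_path, tmpdir):
    """Build the argv for the init step named in the preconditions."""
    return [
        str(Path(inbox_skill_path) / "assets" / "inbox"),
        "--db", str(Path(tmpdir) / "coord.db"),
        "--json", "init",
    ]
```

Returning a list of failures instead of raising on the first one lets a harness report every unmet precondition in a single pass.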
Agent Topology
- leader
- worker-a
Inputs
Leader Prompt
Use $orch at ORCH_SKILL_PATH to act as leader on the already initialized SQLite DB TMPDIR/coord.db. Only coordinate through the bundled orch CLI from the skill. Workflow: 1) create run run_blog_skill_payload_answer_001, 2) add and dispatch one task T1 to worker-a with --execution-mode analysis, 3) wait until the task becomes blocked, 4) inspect blocked tasks, 5) answer the blocked question using payload-json only with decision=stdout, source=leader, and format=structured, 6) wait until the task completes, 7) reconcile and inspect final status, 8) stop after reporting RUN_ID and THREAD_ID. Do not use ordinary chat to coordinate with the worker.
Worker Prompt
Use $inbox at INBOX_SKILL_PATH to act as worker-a on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) fetch and claim the assigned task, 2) send a blocked update asking for a structured logging decision, 3) wait for a reply, 4) confirm the reply payload tells you to use stdout, 5) finish the task with done, 6) stop after reporting the THREAD_ID you handled. Do not use ordinary chat to coordinate with the leader.
Execution Parameters
- use the shared execution contract from README.md
- use the shared timeout defaults from README.md
- do not override the default cleanup policy
Execution Steps
- Initialize TMPDIR/coord.db once through the bundled inbox CLI before launching agents
- Inject skills/orch/ into leader
- Inject skills/inbox/ into worker-a
- Point both agents at the same database path TMPDIR/coord.db
- Launch leader and worker-a in parallel
- Wait for both agents to finish
- Resolve THREAD_ID from the agent outputs
- Independently run the validation commands from the main thread
Validation Commands
ORCH_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json status --run run_blog_skill_payload_answer_001
INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json show --thread THREAD_ID
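The two validation commands can also be driven from a small harness. This is a sketch under stated assumptions: the helper names are hypothetical, the 60-second timeout is a placeholder for the shared defaults in README.md, and it assumes each command prints a single JSON document on stdout when --json is passed.

```python
import json
import subprocess


def validation_commands(orch_skill_path, inbox_skill_path, db_path, run_id, thread_id):
    """Build argv lists for the two validation commands listed above."""
    status_cmd = [
        f"{orch_skill_path}/assets/orch",
        "--db", db_path, "--json", "status", "--run", run_id,
    ]
    show_cmd = [
        f"{inbox_skill_path}/assets/inbox",
        "--db", db_path, "--json", "show", "--thread", thread_id,
    ]
    return status_cmd, show_cmd


def run_json(argv, timeout=60):
    """Run a command and parse its stdout as JSON.

    Assumption: the CLI emits exactly one JSON document on stdout.
    A non-zero exit status raises subprocess.CalledProcessError.
    """
    proc = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=True
    )
    return json.loads(proc.stdout)
```

Passing argv as a list avoids shell quoting issues when TMPDIR contains spaces.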
Expected Outcomes
- the leader successfully observes a blocked event and inspects the blocked queue
- the leader successfully emits one payload-only answer through orch
- worker-a receives that answer through inbox history and sees payload_json.decision == "stdout"
- worker-a completes the task after the structured answer arrives
- the final run state is done
Assertions
- status.data.run.status == "done"
- status.data.tasks[0].status == "done"
- show.data.messages[*].kind includes question, answer, and result
- one question message contains payload_json.question == "Use stdout or stderr for structured logs?"
- one answer message contains payload_json.decision == "stdout"
- one answer message contains payload_json.source == "leader"
- one answer message contains payload_json.format == "structured"
- the final thread status is done
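The assertions above can be expressed as a small checker over the parsed status and show outputs. This is a hedged sketch: check_assertions and _payload are hypothetical names, and it assumes payload_json arrives either as a JSON object or as a JSON-encoded string. The final thread status check is left out because this case does not name the field that carries it.

```python
import json


def _payload(message):
    """Return payload_json as a dict, decoding it first if it is stored as a string."""
    raw = message.get("payload_json") or {}
    return json.loads(raw) if isinstance(raw, str) else raw


def check_assertions(status, show):
    """Return a list of failed assertions; an empty list means the case passed."""
    failures = []
    data = status["data"]
    if data["run"]["status"] != "done":
        failures.append("run.status is not done")
    if data["tasks"][0]["status"] != "done":
        failures.append("tasks[0].status is not done")

    messages = show["data"]["messages"]
    missing = {"question", "answer", "result"} - {m["kind"] for m in messages}
    if missing:
        failures.append(f"missing message kinds: {sorted(missing)}")

    questions = [_payload(m) for m in messages if m["kind"] == "question"]
    expected_q = "Use stdout or stderr for structured logs?"
    if not any(q.get("question") == expected_q for q in questions):
        failures.append("no question message with the expected payload")

    answers = [_payload(m) for m in messages if m["kind"] == "answer"]
    expected_a = {"decision": "stdout", "source": "leader", "format": "structured"}
    for key, value in expected_a.items():
        # Each field is checked independently, matching the per-field assertions.
        if not any(a.get(key) == value for a in answers):
            failures.append(f"no answer message with payload_json.{key} == {value!r}")
    return failures
```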
Cleanup
- use the default cleanup policy from README.md
- if the run fails, retain TMPDIR and coord.db for replay and manual inspection
Recorded Example Run
- recorded on: 2026-03-19
- execution mode: direct_cli_replay via scripts/run_orch_skill_forward_tests.sh
- result: pass
- observed run id: run_blog_skill_payload_answer_001
- observed thread id: thr_735bde0f91794174b2b85fbe89e80581
- evidence summary:
  - orch wait --for task_blocked woke after the worker question, and orch blocked listed task T1 as the active blocked task
  - orch answer --payload-json '{"decision":"stdout","source":"leader","format":"structured"}' appended an answer message with those exact payload fields and an empty body
  - inbox wait-reply woke on that structured answer and exposed payload_json.decision == "stdout"
  - final orch status --run run_blog_skill_payload_answer_001 --json returned run.status == "done" and tasks[0].status == "done"
  - final inbox show --thread thr_735bde0f91794174b2b85fbe89e80581 --json contained the blocked question, the structured answer, and the terminal result
- note: this recorded run exercised the packaged binaries directly in a temporary DB and did not spawn separate Codex role agents
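The payload argument from the recorded orch answer call can be generated rather than hand-typed, which avoids quoting mistakes. This is an illustrative sketch only: it reproduces the subcommand, flag, and payload shown in the evidence summary and deliberately omits any other arguments (such as the database path or a task selector), since this case does not record them for that call.

```python
import json
import shlex

# Fields and values as recorded in the evidence summary.
payload = {"decision": "stdout", "source": "leader", "format": "structured"}

# Compact separators reproduce the whitespace-free form used in the recorded run.
payload_arg = json.dumps(payload, separators=(",", ":"))

# Only the part of the command line that the recorded evidence shows.
answer_tail = ["answer", "--payload-json", payload_arg]

# shlex.join (Python 3.8+) adds the single quotes a POSIX shell needs.
print(shlex.join(answer_tail))
```

When the command is launched through subprocess with an argv list, payload_arg can be passed directly and no shell quoting is needed.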