# Case: `leader-blocked-answer-resume-through-bundled-cli`
## Test Type
This is a `forward-test` that validates blocked-question resolution through the packaged skills.

The goal is to verify that a leader using the packaged `orch` skill can observe a blocked task, answer it through `orch`, and drive the run to final completion while a real worker uses the packaged inbox skill.
## Purpose
Validate that all of the following hold within a single run:
- the leader can use the `orch` subcommands `wait`, `blocked`, `answer`, `reconcile`, and `status` through the bundled skill CLI
- a worker can ask a blocked question through the bundled inbox skill
- the answer reaches the active attempt thread
- the worker resumes after the answer and completes the task
- the final run reaches `done`
## Preconditions
- orch skill path exists: `ORCH_SKILL_PATH=skills/orch`
- inbox skill path exists: `INBOX_SKILL_PATH=skills/inbox`
- bundled CLI executables exist at `ORCH_SKILL_PATH/assets/orch` and `INBOX_SKILL_PATH/assets/inbox`
- use an empty temporary directory `TMPDIR`
- before launching the role agents, initialize `TMPDIR/coord.db` by running `INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json init`
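
The preconditions above could be checked and applied from a shell harness roughly as follows. This is a minimal sketch, not part of either skill: `check_preconditions` and `init_coord_db` are hypothetical helper names, and the working directory is passed as an argument instead of through the `TMPDIR` environment variable so that `mktemp` keeps its normal behavior. Only the `--db`, `--json`, and `init` arguments are taken from this document.

```bash
#!/usr/bin/env bash
# Sketch only: both functions are hypothetical helpers for a test harness.

# Fail unless both bundled CLI executables exist and are executable.
check_preconditions() {
  local orch_skill_path="$1" inbox_skill_path="$2"
  local cli
  for cli in "$orch_skill_path/assets/orch" "$inbox_skill_path/assets/inbox"; do
    if [ ! -x "$cli" ]; then
      echo "missing bundled CLI: $cli" >&2
      return 1
    fi
  done
}

# Initialize coord.db once, in an empty working directory, before agents launch.
init_coord_db() {
  local inbox_skill_path="$1" workdir="$2"
  # The case requires an empty temporary directory, so refuse anything else.
  if [ -n "$(ls -A "$workdir")" ]; then
    echo "working directory is not empty: $workdir" >&2
    return 1
  fi
  "$inbox_skill_path/assets/inbox" --db "$workdir/coord.db" --json init
}
```

A harness might create the directory with `workdir="$(mktemp -d)"`, then run `check_preconditions skills/orch skills/inbox` followed by `init_coord_db skills/inbox "$workdir"`.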
## Agent Topology
- `leader`
- `worker-a`
## Inputs
### Leader Prompt
```text
Use $orch at ORCH_SKILL_PATH to act as leader on the already initialized SQLite DB TMPDIR/coord.db. Only coordinate through the bundled orch CLI from the skill. Workflow: 1) create run run_blog_skill_002, 2) add and dispatch one task T1 to worker-a, 3) wait until the task becomes blocked, 4) inspect blocked tasks, 5) answer the blocked question with the decision "Use stdout for MVP.", 6) wait until the task completes, 7) reconcile and inspect final status, 8) stop after reporting RUN_ID and THREAD_ID. Do not use ordinary chat to coordinate with the worker.
```
### Worker Prompt
```text
Use $inbox at INBOX_SKILL_PATH to act as worker-a on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) fetch and claim the assigned task, 2) send one in_progress update, 3) send a blocked update asking "Should logging go to stdout or stderr?", 4) wait for a reply, 5) finish the task with done after you receive the leader decision, 6) stop after reporting the THREAD_ID you handled. Do not use ordinary chat to coordinate with the leader.
```
## Execution Parameters
- use the shared execution contract from [README.md](./README.md)
- use the shared timeout defaults from [README.md](./README.md)
- do not override the default cleanup policy
## Execution Steps
1. Initialize `TMPDIR/coord.db` once through the bundled inbox CLI before launching agents
2. Inject `skills/orch/` into `leader`
3. Inject `skills/inbox/` into `worker-a`
4. Point both agents at the same database path `TMPDIR/coord.db`
5. Launch `leader` and `worker-a` in parallel
6. Wait for both agents to finish
7. Resolve `THREAD_ID` from the agent outputs
8. Independently run the validation commands from the main thread
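
Steps 5 and 6 could be driven from a shell harness along these lines. This is a minimal sketch under stated assumptions: this document does not say how the role agents are started, so `run_in_parallel` is a hypothetical helper and its two arguments stand in for whatever commands launch `leader` and `worker-a`. Timeouts come from the shared contract in the README and are not modeled here.

```bash
#!/usr/bin/env bash
# Sketch only: run_in_parallel is a hypothetical helper, not part of orch or inbox.
# Each argument is a shell command string that starts one role agent.
run_in_parallel() {
  local leader_cmd="$1" worker_cmd="$2"
  local leader_pid worker_pid rc=0

  # Start both agents in the background so they run concurrently (step 5).
  bash -c "$leader_cmd" &
  leader_pid=$!
  bash -c "$worker_cmd" &
  worker_pid=$!

  # Wait for both to finish (step 6) and remember whether either one failed.
  wait "$leader_pid" || rc=1
  wait "$worker_pid" || rc=1
  return "$rc"
}
```

Waiting on each PID separately, instead of a bare `wait`, lets the harness see each agent's exit status and fail the case if either agent exits non-zero.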
## Validation Commands
```bash
ORCH_SKILL_PATH/assets/orch --db TMPDIR/coord.db --json status --run run_blog_skill_002
INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json show --thread THREAD_ID
```
## Expected Outcomes
- `leader` successfully observes a blocked event through `orch`
- `leader` successfully inspects the blocked queue and emits one `answer`
- `worker-a` receives that answer through inbox history and completes the task
- the final run state is `done`
## Assertions
- `status.data.run.status == "done"`
- `status.data.tasks[0].status == "done"`
- `show.data.messages[*].kind` includes `question`, `answer`, and `result`
- one `question` message contains `payload_json.question == "Should logging go to stdout or stderr?"`
- one `answer` message contains body `Use stdout for MVP.`
- the final thread status is `done`
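
Most of the assertions above can be checked mechanically. The following is a minimal sketch that assumes `jq` is installed and that the JSON output of the two validation commands has been saved to files. `check_assertions` and `expect_true` are hypothetical helper names. The sketch also assumes that `payload_json` is a JSON object and that the message text is stored in a field named `body`; neither is confirmed beyond the wording of the assertions. The thread-status assertion is left out because this document does not name the field that carries it.

```bash
#!/usr/bin/env bash
# Sketch only: assumes jq is installed and that the output of the two
# validation commands was saved to the files passed as $1 and $2.

# Succeed only if the jq filter yields a value other than false or null.
expect_true() {
  local label="$1" filter="$2" file="$3"
  if ! jq -e "$filter" "$file" >/dev/null 2>&1; then
    echo "assertion failed: $label" >&2
    return 1
  fi
}

check_assertions() {
  local status_json="$1" show_json="$2" failed=0

  expect_true 'run status is done' \
    '.data.run.status == "done"' "$status_json" || failed=1
  expect_true 'first task status is done' \
    '.data.tasks[0].status == "done"' "$status_json" || failed=1
  # Every required kind must appear at least once in the thread history.
  expect_true 'kinds include question, answer, result' \
    '["question","answer","result"] - [.data.messages[].kind] == []' \
    "$show_json" || failed=1
  # Assumption: payload_json is an object, as the dotted path suggests.
  expect_true 'question text matches' \
    'any(.data.messages[]; .kind == "question" and .payload_json.question == "Should logging go to stdout or stderr?")' \
    "$show_json" || failed=1
  # Assumption: the message text lives in a field named "body".
  expect_true 'answer body matches' \
    'any(.data.messages[]; .kind == "answer" and .body == "Use stdout for MVP.")' \
    "$show_json" || failed=1

  return "$failed"
}
```

For example, redirect the output of the `status` command to `status.json` and the output of the `show` command to `show.json`, then run `check_assertions status.json show.json`. The function reports every failed assertion before returning non-zero, which keeps a failing run easy to diagnose.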
## Cleanup
- use the default cleanup policy from [README.md](./README.md)
- if the run fails, retain `TMPDIR` and `coord.db` for replay and manual inspection