Add orch skill forward test evidence
@@ -96,3 +96,34 @@ INBOX_SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json show --thread THREAD_I
- use the default cleanup policy from [README.md](./README.md)
- if the run fails, retain `TMPDIR` and `coord.db` for replay and manual inspection

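The retention rule above can be sketched as a small wrapper. This is a minimal illustration, not the project's harness: the helper name `run_with_retained_evidence` and the `orch-skill-` prefix are hypothetical, and it assumes the default policy deletes the directory after a pass, which README.md may define differently.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Tuple


def run_with_retained_evidence(case: Callable[[Path], bool]) -> Tuple[bool, Path]:
    """Run `case` against a fresh temp directory and keep it only on failure.

    `case` receives the path where coord.db should live and returns True on pass.
    Assumption: a passing run may be cleaned up; README.md holds the real policy.
    """
    tmpdir = Path(tempfile.mkdtemp(prefix="orch-skill-"))  # hypothetical prefix
    db_path = tmpdir / "coord.db"
    try:
        passed = bool(case(db_path))
    except Exception:
        # An exception counts as a failed run, so the evidence is retained.
        passed = False
    if passed:
        shutil.rmtree(tmpdir, ignore_errors=True)
    # On failure the directory and coord.db stay on disk for replay and inspection.
    return passed, tmpdir
```

The caller gets the directory path back either way, so a failing run can print where the retained `coord.db` lives.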
## Recorded Example Run

- recorded on: `2026-03-19`
- execution mode: `direct_cli_replay` via `scripts/run_orch_skill_forward_tests.sh`
- result: `pass`
- observed run id: `run_blog_skill_reassign_001`
- observed original thread id: `thr_0a61240412134de3b3d9ab219b6c8f19`
- observed reassigned thread id: `thr_12fbcf6d89d948548306198d013d77a5`
- evidence summary:
  - `orch wait --for task_blocked` woke after worker-a posted a blocked question with payload `Proceed with v1 scope?`
  - `orch reassign --run run_blog_skill_reassign_001 --task T1 --to worker-b --json` returned `attempt_no == 2` and assigned the new attempt to `worker-b`
  - final `inbox show` on the original thread returned `thread.status == "cancelled"` and preserved the blocked `question` message
  - final `inbox show` on the reassigned thread returned `thread.status == "done"`
  - final `orch status --run run_blog_skill_reassign_001 --json` returned `run.status == "done"` and `tasks[0].status == "done"`
- note: this recorded run exercised the packaged binaries directly in a temporary DB and did not spawn separate Codex role agents

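The field checks in the evidence summary can be expressed as a short validator over already-parsed `--json` output. This is a sketch under assumptions: the function name `check_reassign_evidence` is hypothetical, `attempt_no` is assumed to sit at the top level of the `reassign` output, and the check for the preserved `question` message is left out because the message layout is not shown here.

```python
from typing import Any, Dict, List


def check_reassign_evidence(
    reassign: Dict[str, Any],
    original_thread: Dict[str, Any],
    reassigned_thread: Dict[str, Any],
    status: Dict[str, Any],
) -> List[str]:
    """Return the list of failed checks; an empty list means the evidence matches."""
    failures: List[str] = []

    # Assumption: attempt_no is a top-level key of the reassign JSON.
    if reassign.get("attempt_no") != 2:
        failures.append("reassign: attempt_no != 2")

    if original_thread.get("thread", {}).get("status") != "cancelled":
        failures.append('original thread: thread.status != "cancelled"')

    if reassigned_thread.get("thread", {}).get("status") != "done":
        failures.append('reassigned thread: thread.status != "done"')

    if status.get("run", {}).get("status") != "done":
        failures.append('status: run.status != "done"')

    tasks = status.get("tasks") or [{}]
    if tasks[0].get("status") != "done":
        failures.append('status: tasks[0].status != "done"')

    return failures
```

In practice each argument would come from `json.loads` on the corresponding command's output; the validator itself never calls the CLI, so it can be replayed against saved evidence.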
## Recorded Real Forward Run

- recorded on: `2026-03-19`
- execution mode: `real_subagent_forward_test`
- result: `pass`
- evidence root: `/tmp/orch-skill-subagents.J1XWgs/leader-reassigns-blocked-task-through-bundled-cli-phased`
- observed run id: `run_blog_skill_reassign_001`
- observed original thread id: `thr_7d43af5bc1f7467da98a39adb0de5808`
- observed reassigned thread id: `thr_eba253db8965423b855d0c784a29702c`
- evidence summary:
  - the same real leader agent using `skills/orch/` completed the case in three phases: initial `run/task/dispatch`, then `wait --for task_blocked` plus `reassign`, then final `wait --for task_done` plus `status`
  - a real `worker-a` agent using `skills/inbox/` claimed the original thread and posted the blocked question `Proceed with v1 scope?`
  - a real `worker-b` agent using `skills/inbox/` claimed the reassigned thread and completed it
  - main-thread validation confirmed the original thread finished `cancelled`, the reassigned thread finished `done`, and the original blocked question remained visible in thread history
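The three-phase leader flow described above can be checked against an ordered step log with a sketch like the following. The phase labels (`setup`, `unblock`, `finish`) and the function `split_into_phases` are hypothetical names introduced for illustration; the step strings are shorthand taken from the evidence summary, not full command lines.

```python
from typing import Dict, List

# Phase layout from the evidence summary; the keys are hypothetical labels.
EXPECTED_PHASES: Dict[str, List[str]] = {
    "setup": ["run", "task", "dispatch"],
    "unblock": ["wait --for task_blocked", "reassign"],
    "finish": ["wait --for task_done", "status"],
}


def split_into_phases(observed: List[str]) -> Dict[str, List[str]]:
    """Group an ordered list of leader steps by phase, or raise if the order differs."""
    expected = [step for steps in EXPECTED_PHASES.values() for step in steps]
    if observed != expected:
        raise ValueError(f"unexpected leader steps: {observed!r}")

    phases: Dict[str, List[str]] = {}
    start = 0
    for name, steps in EXPECTED_PHASES.items():
        # Slice the observed log so each phase keeps the steps actually seen.
        phases[name] = observed[start:start + len(steps)]
        start += len(steps)
    return phases
```

Keeping the expected order in one table makes it easy to see that `reassign` can only follow the `task_blocked` wake-up, which is the behaviour this case exercises.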