Title
Real Subagent Forward Tests For Orch Skill
Status
completed
Owner
- codex
Started At
2026-03-19
Goal
- Execute the documented docs/tests/orch-skill/ scenarios using real spawned role agents with injected skills/orch/ and skills/inbox/, then record concrete pass/fail evidence and sync the repository docs.
Scope
- validate subagent skill injection for project-local orch and inbox skills
- run the five documented orch-skill forward cases with real leader and worker subagents
- collect main-thread validation evidence and agent summaries
- update the orch-skill docs and implementation roadmap with the real forward-test results
Checklist
- Re-read the orch-skill shared execution contract and worker skill constraints.
- Validate project-local skill injection with a small spawned-agent probe.
- Execute the five orch-skill cases with real spawned role agents and collect evidence.
- Update the orch-skill docs and implementation roadmap with the real forward-test results.
- Archive this execution roadmap with a completion summary.
Files
- docs/tests/orch-skill/README.md
- docs/tests/orch-skill/leader-run-dispatch-reconcile-through-bundled-cli.md
- docs/tests/orch-skill/leader-blocked-answer-resume-through-bundled-cli.md
- docs/tests/orch-skill/strict-worktree-dispatch-to-cleanup-through-bundled-cli.md
- docs/tests/orch-skill/leader-retries-failed-task-through-bundled-cli.md
- docs/tests/orch-skill/leader-reassigns-blocked-task-through-bundled-cli.md
- docs/implementation-roadmap.md
- docs/roadmaps/archive/orch-skill-real-forward-test.md
Decisions
- Use real spawned role agents per case instead of the direct replay runner, because the user explicitly asked for true tests with subagents.
- Keep the main thread responsible for DB setup, fixture creation, and independent validation so the final judgment does not rely only on role-agent self-reporting.
- Fall back from fork_context: true to fork_context: false for the real case runs after the first wider-context attempt stalled and mis-executed the worker-side contract in this repo.
- For the longer retry and reassign cases, keep one leader agent active across staged prompts instead of one long monolithic prompt, because staged execution proved more reliable while still preserving a real agent-owned orch flow.
Blockers
- none
Next Step
- rerun the same five cases when the packaged skill binaries or case docs change, and consider adding the same real subagent coverage for council-review if that surface needs parity
Completion Summary
- Verified both project-local skill bundles with spawned-agent help-command probes before the real runs.
- Collected successful real subagent evidence for all five orch-skill cases under /tmp/orch-skill-subagents.J1XWgs.
- Main-thread validation confirmed all five final successful runs reached the expected orch and inbox states.
- Updated docs/tests/orch-skill/README.md, all five case files, and docs/implementation-roadmap.md to record the new real forward-test coverage.