# Implementation Roadmap

## Purpose

This document is the handoff-oriented implementation plan for the project.

It is intentionally short and execution-focused. A new agent should be able to read this file, understand the current project state, and immediately know what to build next without re-deriving the whole design.

## Current Status

As of now:

- architecture and workflow docs are written
- CLI surfaces for `inbox`, `orch`, worktree execution, and `council-review` are defined
- embedded SQLite schema and migrations exist in code
- JSON output shapes are defined for the major flows
- Go module and initial command skeletons exist
- `inbox` and `orch` both compile
- shared SQLite schema initialization exists
- `inbox` is implemented end-to-end, including send/fetch/claim/renew/update/reply/done/fail/cancel/list/show/watch/wait-reply
- `inbox` supports blocking waits, lease renewal, unread fetches backed by per-agent read cursors, `--body-file`, artifact attachments, and structured JSON errors with stable exit codes
- integration tests now cover each implemented inbox command, plus the main inbox workflows, wait/watch flows, artifact persistence, unread behavior, and JSON error contracts
- a human-readable inbox command test-plan set has been authored under `docs/tests/inbox/`
- a human-readable `orch` test-plan set has now been authored under `docs/tests/orch/`, with a `ROADMAP.md`, shared conventions, workflow scenarios, per-command indexes, and concrete case documents aligned to the current CLI surface, including supplemental coverage for key flag validation, ordering/limit behavior, payload-only answers, cleanup errors, and council report default/error contracts
- a reusable Codex skill package for `inbox` now exists under `skills/inbox/`, with a formal `SKILL.md`, `agents/openai.yaml`, and a bundled CLI binary asset
- reusable Codex skill packages for `orch` and `council-review` now exist under `skills/orch/` and `skills/council-review/`, both using bundled
copies of the `orch` CLI binary asset
- an inbox skill forward-test plan directory now exists under `docs/tests/inbox-skill/`, with a shared execution template and multiple scenario cases
- an orch skill forward-test plan directory now exists under `docs/tests/orch-skill/`, with a shared execution contract and eight leader-side workflow scenarios
- a repo-local replay runner now exists at `scripts/run_orch_skill_forward_tests.sh`, and all eight `docs/tests/orch-skill/` cases now include recorded example runs from bundled-CLI replays captured on `2026-03-19`, including added coverage for dependency-gated ready sequencing, active task cancellation, and payload-only blocked answers
- the original five `docs/tests/orch-skill/` cases also include recorded real subagent-forward runs captured on `2026-03-19`, with spawned leader and worker agents using the packaged `skills/orch/` and `skills/inbox/` bundles
- a council-review skill forward-test plan directory now exists under `docs/tests/council-review-skill/`, with a shared execution contract and nine council workflow scenarios covering end-to-end flow, unanimous-only defaults, timeout/before-tally errors, explicit minority reporting, invalid report filters, strict tally semantics, malformed reviewer JSON, and target-file inputs
- an execution-roadmap workflow now exists under `docs/roadmaps/active/` and `docs/roadmaps/archive/` for agent-level work traces and completion archives
- a forward-looking web product monorepo plan now exists under `docs/web-product-monorepo.md`, defining the recommended React frontend, `chi` HTTP service, `cmd/orchd` entrypoint, and shared application/query layering for future web work
- the Phase 1 web-product skeleton is now in place, including root `pnpm` workspace files, a standalone React app under `apps/web`, an initial OpenAPI/events contract under `api/`, and a new `cmd/orchd` HTTP service backed by `internal/app`, `internal/query`, and `internal/httpapi`
- `orchd` now serves a minimal
read-only web API with `chi`, including `/health`, runs list/detail, run task list, blocked-task list, and thread detail endpoints backed by the existing SQLite state
- HTTP tests now cover the initial read-only `orchd` slice, and the new frontend workspace builds successfully with `pnpm run web:build`
- Phase 2 frontend work has now started by bootstrapping `apps/web` with copied-in `cadence-ui` tokens and foundational components for button, input, textarea, dialog, form, tabs, card, badge, and alert, with the shared token stylesheet loaded from the frontend entrypoint
- the first real Phase 2 read-only operator UI is now implemented in `apps/web`, including routed runs list, run detail, blocked queue, and thread timeline views backed by the existing `orchd` HTTP API, plus Tailwind v4 consumer wiring so the source-owned Cadence UI components render correctly in the app
- a repo-local `scripts/package_skill_clis.sh` packaging flow now builds bundled skill CLI assets for `inbox`, `orch`, and `council-review`
- `orch` now implements `run init/show`, `task add`, `dep add`, `ready`, `dispatch`, `reconcile`, `wait`, `blocked`, `answer`, `retry`, `reassign`, `cancel`, `cleanup`, and `status`
- `orch` can create runs, gate tasks through dependencies, dispatch work through `inbox`, reconcile worker thread state back into task state, answer blocked tasks, retry or reassign work, cancel tasks or runs, clean attempt worktrees, and create per-attempt Git worktrees during strict dispatch
- `orch dispatch` now supports `--repo-path`, `--workspace-root`, and `--strict-worktree`, auto-enables strict worktree mode for code-like tasks inferred from task metadata, resolves committed base revisions, records workspace metadata on attempts, and writes that metadata into inbox task payloads
- `orch wait` now blocks on run-scoped task events and reconciles inbox state while polling so leader waits can wake on worker progress without manual sleep loops
- `orch council start` now creates a
dedicated council run, persists council target input metadata, and dispatches the three fixed reviewer roles through the existing scheduler
- `orch council wait` now blocks until the three reviewer tasks reach terminal states or a timeout is reached
- `orch council tally` now parses completed reviewer outputs, persists `council_findings`, groups recommendations into `consensus`, `majority`, and `minority`, and persists `council_groups`
- `orch council report` now reads persisted `council_groups`, renders human-readable markdown reports, writes markdown artifacts, and persists final report metadata in `council_reports`
- automated integration tests now cover the main `orch` scheduler slice, including dependency gating, dispatch, blocked-answer flow, retry, reassign, cancel, cleanup, strict worktree creation, automatic code-task worktree enablement, dirty-repo rejection rules, wait wake/timeout behavior, and council start/wait/tally/report behavior
- additional `orch` command and workflow contract tests now cover the full documented Markdown case set under `docs/tests/orch/`, including `run init/show`, `task add` validation, ready ordering, dispatch attempt/thread contracts, blocked latest-question output, answer payload-only and empty-input rejection, cleanup selector and no-match errors, status summaries, reconcile failed-state mapping, strict-worktree dispatch-to-cleanup, and council report default/error behavior

This means the project now has a working `orch` core scheduler with automatic worktree selection for code-like tasks, strict worktree-backed dispatch, the main leader-side control loop, and the full v1 council workflow from start through final report generation.
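
For readers new to the council workflow, the `consensus` / `majority` / `minority` grouping performed by `orch council tally` can be pictured with the sketch below. It is only an illustration, not the project's implementation: the `Finding` type, `GroupFindings`, and `bucketFor` are hypothetical names, recommendations are matched by exact string equality, and the thresholds (all reviewers agree, more than half agree, everything else) are an assumption. The authoritative tally rules, including `normal` and `strict` modes, are in `council-review.md`.

```go
package main

import (
	"fmt"
	"sort"
)

// Finding is a hypothetical, simplified reviewer recommendation.
// The real reviewer output format is defined in council-review.md.
type Finding struct {
	Reviewer       string
	Recommendation string
}

// bucketFor maps a support count to a bucket name.
// Assumption: full agreement is consensus, more than half is majority,
// and anything else is minority.
func bucketFor(support, reviewers int) string {
	switch {
	case support >= reviewers:
		return "consensus"
	case support*2 > reviewers:
		return "majority"
	default:
		return "minority"
	}
}

// GroupFindings counts distinct reviewers per recommendation and returns the
// recommendations for each bucket, sorted so the output is deterministic.
func GroupFindings(findings []Finding, reviewers int) map[string][]string {
	supporters := map[string]map[string]bool{}
	for _, f := range findings {
		if supporters[f.Recommendation] == nil {
			supporters[f.Recommendation] = map[string]bool{}
		}
		supporters[f.Recommendation][f.Reviewer] = true
	}

	groups := map[string][]string{}
	for rec, who := range supporters {
		bucket := bucketFor(len(who), reviewers)
		groups[bucket] = append(groups[bucket], rec)
	}
	for _, recs := range groups {
		sort.Strings(recs)
	}
	return groups
}

func main() {
	findings := []Finding{
		{Reviewer: "architecture-reviewer", Recommendation: "add retry budget"},
		{Reviewer: "implementation-reviewer", Recommendation: "add retry budget"},
		{Reviewer: "risk-reviewer", Recommendation: "add retry budget"},
		{Reviewer: "architecture-reviewer", Recommendation: "split scheduler package"},
		{Reviewer: "implementation-reviewer", Recommendation: "split scheduler package"},
		{Reviewer: "risk-reviewer", Recommendation: "pin SQLite driver version"},
	}

	groups := GroupFindings(findings, 3)
	for _, bucket := range []string{"consensus", "majority", "minority"} {
		fmt.Println(bucket, groups[bucket])
	}
}
```

Counting distinct reviewers rather than raw findings keeps a reviewer who repeats a recommendation from inflating its support.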
## Source Of Truth

Read these docs first:

- [architecture.md](/home/kurihada/project/ai-workflow-skill/docs/architecture.md)
- [inbox-cli.md](/home/kurihada/project/ai-workflow-skill/docs/inbox-cli.md)
- [orch-cli.md](/home/kurihada/project/ai-workflow-skill/docs/orch-cli.md)
- [worktree-execution.md](/home/kurihada/project/ai-workflow-skill/docs/worktree-execution.md)
- [council-review.md](/home/kurihada/project/ai-workflow-skill/docs/council-review.md)
- [web-product-monorepo.md](/home/kurihada/project/ai-workflow-skill/docs/web-product-monorepo.md)

Use this roadmap for implementation order, not for protocol design.

## Project Goal

Build a Go-based local agent orchestration stack with:

- `inbox`: worker-facing durable coordination bus
- `orch`: leader-facing scheduler and control plane
- strict worktree-backed execution for code-writing task attempts
- `council-review`: a user-facing three-reviewer brainstorm workflow implemented on top of `orch`

## Implementation Principles

- Do not redesign the protocol unless implementation reveals a real contradiction.
- Keep `inbox` and `orch` as separate CLIs or command groups, but share one SQLite file.
- Prefer one small working path over broad unfinished scaffolding.
- Make JSON output stable early.
- Implement the happy path first, then add wait/retry/cleanup.
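
As one way to picture the "make JSON output stable early" principle, the sketch below renders a structured error envelope and maps error kinds to fixed exit codes. It is a minimal illustration under stated assumptions: the `Envelope` and `ErrorBody` types, the field names, the error kinds, and the numeric exit codes are all hypothetical and are not taken from the project's actual contract, which is defined in `inbox-cli.md` and `orch-cli.md`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ErrorBody is a hypothetical structured error payload.
type ErrorBody struct {
	Code    string `json:"code"`
	Message string `json:"message"`
}

// Envelope is a hypothetical top-level JSON shape. Marshalling a struct
// (rather than a map) keeps field order fixed across runs.
type Envelope struct {
	OK    bool       `json:"ok"`
	Error *ErrorBody `json:"error,omitempty"`
}

// exitCodes maps hypothetical error kinds to process exit codes.
// The numbers are illustrative assumptions, not the project's real values.
var exitCodes = map[string]int{
	"invalid_input": 2,
	"not_found":     3,
	"conflict":      4,
}

// ExitCodeFor returns the exit code for an error kind, falling back to 1
// for kinds that have no dedicated code.
func ExitCodeFor(code string) int {
	if n, ok := exitCodes[code]; ok {
		return n
	}
	return 1
}

// RenderError serializes an error envelope as a single JSON line.
func RenderError(code, message string) (string, error) {
	b, err := json.Marshal(Envelope{
		OK:    false,
		Error: &ErrorBody{Code: code, Message: message},
	})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := RenderError("not_found", "thread not found")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(out)
	fmt.Println("exit code:", ExitCodeFor("not_found"))
}
```

Keeping the envelope in one shared type means every command emits the same error shape, so callers and contract tests can match on `code` instead of parsing messages.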
## Recommended v1 Order

## Progress Snapshot

Current implementation status:

- `Milestone 1: Go Skeleton` is complete
- `Milestone 2: Shared DB Layer` is complete enough for both CLIs
- `Milestone 3: Inbox Happy Path` is complete
- `Milestone 4: Orch Core Scheduling` is complete for the current non-worktree scheduler scope
- `Milestone 5: Strict Worktree Support` is complete
- `Milestone 6: Waiting Primitives` is complete
- `Milestone 7: Council Review` is complete
- `Milestone 8: Web Product Phase 1 Skeleton` is complete
- `Milestone 9: Web Product Phase 2 Read-Only Operator UI` is complete for the initial operator surface

The council review v1 surface is complete, the first web-product skeleton now exists as a separate monorepo workspace plus read-only HTTP backend slice, and the first real operator-facing Phase 2 read-only web views now exist on top of the internal Cadence UI component library.

### Milestone 1: Go Skeleton

Goal:

- initialize the Go module
- choose CLI framework and SQLite driver
- create package layout
- make empty commands compile

Recommended shape:

- `cmd/inbox`
- `cmd/orch`
- `internal/db`
- `internal/store`
- `internal/protocol`
- `internal/cli`

Definition of done:

- `go build ./...` succeeds
- `inbox --help` works
- `orch --help` works

Status:

- completed

### Milestone 2: Shared DB Layer

Goal:

- create the SQLite connection layer
- enable required pragmas
- add schema initialization and migration mechanism

Minimum scope:

- communication tables for `inbox`
- scheduling tables for `orch`
- shared `events` table

Definition of done:

- `inbox init` initializes the database
- `orch` can open the same database successfully

Status:

- completed for current inbox needs

Completed so far:

- shared DB open layer exists
- required SQLite pragmas are applied
- embedded schema files exist
- `inbox init` applies schema successfully

Remaining:

- decide whether `orch` should gain an explicit DB bootstrap check or continue to rely on `inbox init`

### Milestone 3: Inbox Happy Path

Goal:

- implement worker-facing coordination primitives first

First commands:

- `inbox init`
- `inbox send`
- `inbox fetch`
- `inbox claim`
- `inbox update`
- `inbox reply`
- `inbox done`
- `inbox fail`
- `inbox show`

Delay if needed:

- `watch`
- `wait-reply`
- `cancel`
- `list`

Definition of done:

- one thread can be created, claimed, updated, replied to, and completed
- all major commands support `--json`

Status:

- completed

Completed so far:

- `inbox init`
- `inbox send`
- `inbox fetch`
- `inbox claim`
- `inbox renew`
- `inbox update`
- `inbox reply`
- `inbox done`
- `inbox fail`
- `inbox cancel`
- `inbox list`
- `inbox show`
- `inbox watch`
- `inbox wait-reply`

### Milestone 4: Orch Core Scheduling

Goal:

- implement run/task/dependency/attempt orchestration on top of `inbox`

First commands:

- `orch run init`
- `orch task add`
- `orch dep add`
- `orch ready`
- `orch dispatch`
- `orch reconcile`
- `orch blocked`
- `orch answer`
- `orch status`

Delay if needed:

- `retry`
- `reassign`
- `cancel`
- `cleanup`
- `wait`

Definition of done:

- a leader can create a run
- add tasks and dependencies
- dispatch a task through `orch`
- see worker state reflected back after `reconcile`

Status:

- completed for the current non-worktree scheduling scope

Completed so far:

- `orch run init`
- `orch run show`
- `orch task add`
- `orch dep add`
- `orch ready`
- `orch dispatch`
- `orch reconcile`
- `orch wait`
- `orch blocked`
- `orch answer`
- `orch retry`
- `orch reassign`
- `orch cancel`
- `orch cleanup`
- `orch status`
- CLI integration tests cover dispatch/reconcile, dependency gating, blocked-answer flow, wait wake/timeout, retry, reassign, cancel, cleanup, and non-ready dispatch rejection

Remaining:

- none for the current scheduler control surface

### Milestone 5: Strict Worktree Support

Goal:

- ensure code-writing tasks execute in isolated worktrees

First scope:

- `orch dispatch` resolves `base_ref`
- strict mode fails when the repo is dirty and no
explicit base is provided
- worktree path and branch name are stored on the attempt

Definition of done:

- a code task dispatch creates a real worktree
- the assigned worktree path appears in attempt metadata and inbox payload

Status:

- completed

Completed so far:

- `orch dispatch` can use `--repo-path` to target a source Git repository without relying on the caller's current working directory
- `orch dispatch --strict-worktree` resolves `base_ref` to a concrete commit, defaults to `HEAD` on clean repositories, and rejects dirty repositories when `--base-ref` is omitted
- `orch dispatch` auto-selects worktree mode for code-like tasks inferred from existing task metadata such as worker role and acceptance markers
- dispatch creates a fresh branch and Git worktree per attempt and persists `base_ref`, `base_commit`, `branch_name`, `worktree_path`, and `workspace_status`
- dispatch writes workspace metadata into the inbox task payload for worker runtimes
- reconcile now advances `workspace_status` from `created` to `active`, `completed`, or `abandoned` based on thread state
- `orch cleanup` removes completed or abandoned worktrees and marks attempt workspace state as `cleaned`
- CLI integration tests cover strict worktree creation, auto-enabled worktrees for code-like tasks, explicit-base dispatch on dirty repos, strict dirty-repo rejection, and cleanup

Remaining:

- none

### Milestone 6: Waiting Primitives

Goal:

- replace blind polling with blocking CLI waits

Commands:

- `orch wait`
- `inbox wait-reply`

Definition of done:

- leader can block on new task events
- blocked worker can block on reply events

Status:

- completed

Completed so far:

- `orch wait`
- `inbox wait-reply`
- `orch wait` reconciles inbox state while polling and wakes on matching run-scoped `task_*` events
- CLI integration tests cover wait wake and timeout behavior

### Milestone 7: Council Review

Goal:

- implement the user-facing three-reviewer brainstorming workflow

First commands:

- `orch council start`
- `orch council wait`
- `orch council tally`
- `orch council report`

Definition of done:

- one council run can dispatch three reviewers
- tally grouped recommendations into `consensus`, `majority`, and `minority`
- produce stable JSON and a markdown report artifact

Status:

- completed

Completed so far:

- council-specific storage now includes run metadata, reviewer assignment rows, reviewer findings/groups tables, persisted council input references, and final report metadata
- `orch council start`
- `orch council wait`
- `orch council tally`
- `orch council report`
- council start creates a dedicated run, stores council target input metadata, creates reviewer tasks `CR1` through `CR3`, and dispatches the fixed reviewer roles `architecture-reviewer`, `implementation-reviewer`, and `risk-reviewer`
- council wait blocks until all three reviewer tasks reach terminal states or timeout
- council tally parses structured reviewer outputs from completed reviewer result messages and persists grouped recommendations
- council report reads grouped recommendations from persisted `council_groups`, supports `--show` bucket filtering, renders markdown report artifacts, and persists report metadata plus artifact paths
- CLI integration tests cover council start dispatch, metadata persistence, council wait wake/timeout behavior, council tally grouping in `normal` and `strict` modes, and council report default/all/JSON rendering behavior

Remaining:

- none for the v1 council workflow

### Milestone 8: Web Product Phase 1 Skeleton

Goal:

- create the first durable web-product backbone without replacing the existing CLI workflows

Add:

- root `pnpm` workspace files
- `apps/web`
- `api/openapi.yaml`
- `api/events.md`
- `cmd/orchd`
- `internal/app`
- `internal/query`
- `internal/httpapi`

Definition of done:

- the repository contains the agreed monorepo skeleton
- `orchd` can serve a small read-only HTTP API against the existing database
- the frontend workspace builds and can evolve
independently in later milestones

Status:

- completed

Completed so far:

- root `package.json`, `pnpm-workspace.yaml`, and `pnpm-lock.yaml` now define the monorepo JS workspace
- `apps/web` now contains a Vite + React + TypeScript + TanStack Router + TanStack Query frontend shell
- `cmd/orchd` now opens the shared SQLite database, applies migrations, and serves a `chi` router with graceful shutdown handling
- `internal/query` now exposes run list/detail, run tasks, blocked-task, and thread-detail read models for the web surface
- `internal/app` now provides a thin web-service boundary over the new read service
- `internal/httpapi` now owns HTTP routing, JSON/error helpers, and the initial read-only endpoints
- `api/openapi.yaml` now documents the implemented read-only endpoints and response shapes
- `api/events.md` now captures the planned SSE contract for the next realtime slice
- `go test ./...` covers the new HTTP slice, and `pnpm run web:build` succeeds for the frontend workspace

Remaining:

- Phase 2 should turn the frontend shell into actual run, task-board, blocked-queue, and thread-detail pages using the new HTTP contract

### Milestone 9: Web Product Phase 2 Read-Only Operator UI

Goal:

- implement the first real operator-facing read-only web UI on top of the Phase 1 shell and current `orchd` API contract

Add:

- copied-in `cadence-ui` primitives and token CSS under `apps/web/src/cadence-ui`
- Tailwind v4 consumer setup so the copied-in Cadence UI source renders correctly in the app
- routed screens for runs list, run detail, blocked queue, and thread timeline
- typed frontend read helpers for the current `orchd` endpoints

Definition of done:

- `apps/web` imports the shared Cadence token stylesheet from the frontend entrypoint
- the Cadence UI source-owned components render correctly inside the consumer app
- the first routed read-only operator screens ship against the existing `orchd` contract
- future web screens can compose from `cadence-ui` primitives
instead of raw one-off HTML controls

Status:

- completed for the first read-only operator slice

Completed so far:

- `apps/web/src/cadence-ui/` now contains copied-in Cadence UI tokens plus foundational components for button, input, textarea, dialog, form, tabs, card, badge, and alert
- `apps/web/package.json` now includes the required Radix, `react-hook-form`, `motion`, and utility dependencies for the copied-in components
- `apps/web/src/main.tsx` now imports `./cadence-ui/tokens/styles.css`
- `apps/web` now includes Tailwind v4 consumer wiring in Vite and the global stylesheet so the copied-in Cadence UI utility classes render correctly
- `apps/web` now includes a typed frontend read layer for runs, run detail, blocked queue aggregation, and thread detail
- `apps/web` now ships routed runs list, run detail, blocked queue, and thread timeline pages using Cadence UI source-owned components for cards, tabs, alerts, inputs, badges, buttons, dialogs, and textareas
- the run detail view now includes grouped task boards and blocked-task summaries, while the thread timeline view now shows message payload and artifact metadata inspectors
- `pnpm run web:build` succeeds for the new operator UI, and local Vite-to-`orchd` proxy smoke checks confirm the frontend can read the seeded runs, blocked, and thread endpoints through the dev server

Remaining:

- add operator write actions such as answer, retry, reassign, and cancel on top of the new read-only screens
- add council result/report views and realtime event handling on top of the current routed UI
- install additional `cadence-ui` components on demand as the product surface expands

## Immediate Next Task

If a new agent is taking over now, the next concrete step should be:

1. treat `Milestone 8: Web Product Phase 1 Skeleton` as complete unless a new user request reopens the backend skeleton itself
2. keep the authored inbox test-plan set in `docs/tests/inbox/` synchronized if future `orch` or web work changes shared CLI-visible behavior
3. treat `Milestone 9: Web Product Phase 2 Read-Only Operator UI` as complete for the initial operator surface and build the next web slice on top of the shipped read pages rather than replacing them
4. start the next web phase by wiring operator write actions such as answer, retry, reassign, and cancel into the existing runs, blocked, and thread views
5. add council result/report screens and realtime event handling after the operator write path is clear
6. install additional `cadence-ui` components on demand when those screens need them, instead of reintroducing bespoke primitives into `apps/web`
7. keep `api/openapi.yaml`, `api/events.md`, and `docs/web-product-monorepo.md` synchronized as the web surface expands

The inbox implementation and its human-readable test-plan set are already in place, `orch` supports the main scheduler loop plus the complete council start/wait/tally/report workflow, and the web product now has its first real operator-facing read surfaces, so the next step should be write-capable operator workflows and council/realtime expansion rather than reopening the frontend shell or basic read pages.
## Recommended Driver Choices

Current recommendation:

- CLI framework: `Cobra`
- SQLite driver: pure-Go driver

Reason:

- command surfaces are already command-group heavy
- pure-Go SQLite keeps distribution simpler

## Suggested Early Tests

Completed so far:

- schema init test
- inbox command-level CLI integration coverage aligned to `docs/tests/inbox/`
- inbox workflow lifecycle coverage
- orch scheduler lifecycle coverage for run/task/dependency/dispatch/reconcile
- orch blocked-question and answer coverage
- orch strict worktree creation and dirty-repo policy coverage
- orch wait wake and timeout coverage
- orch retry, reassign, cancel, and cleanup coverage
- orch council start dispatch and persistence coverage
- orch council wait wake and timeout coverage
- orch council tally grouping coverage
- orch council report default markdown, `--show all`, and JSON shape coverage

Still recommended before the codebase grows too much:

- worktree path generation test

## Inbox Test Documentation Roadmap

Status:

- completed for the current inbox CLI surface
- command-level and workflow Markdown documents exist under `docs/tests/inbox/`
- future updates should revise this section only when new inbox commands or materially new CLI-visible behavior are added

Goal:

- make inbox behavior easy for a new agent to understand and convert into automated tests without re-reading all code paths

Directory layout:

- `docs/tests/inbox/README.md`
- `docs/tests/inbox/_shared/README.md`
- `docs/tests/inbox/workflows/README.md`
- `docs/tests/inbox//README.md`
- `docs/tests/inbox//.md`

Initial command folders:

- `init`
- `send`
- `fetch`
- `claim`
- `renew`
- `update`
- `reply`
- `done`
- `fail`
- `cancel`
- `list`
- `show`
- `watch`
- `wait-reply`

Documentation rules:

- organize by folder with a `README.md` entrypoint
- command folders use `README.md` as an index only
- each command case lives in its own Markdown file named after the case slug
- do not use numeric test case IDs
- identify command
cases by concrete file path
- keep one command per directory, plus `workflows/` for cross-command behavior
- use `_shared/` for common fixtures, database conventions, exit-code rules, and shared JSON assertions

Required per-case structure:

- `用例意义` (case purpose)
- `前置条件` (preconditions)
- `输入` (input)
- `预期输出` (expected output)
- `断言结论` (assertion conclusion)

Case file naming pattern:

- `.md`

Authoring order:

1. global conventions in `docs/tests/inbox/README.md`
2. shared fixtures and assertion helpers in `docs/tests/inbox/_shared/README.md`
3. lifecycle flow in `docs/tests/inbox/workflows/README.md`
4. core command docs: `send`, `fetch`, `claim`, `reply`, `done`, `show`
5. secondary command docs: `renew`, `update`, `fail`, `cancel`, `list`
6. waiting and read-state docs: `watch`, `wait-reply`, unread and mark-read workflow cases

Definition of done:

- every implemented inbox command has a dedicated document directory
- every documented case contains concrete input and expected output
- shared assumptions are centralized instead of copied into each command file
- a new agent can pick any case and implement it as an automated test with minimal additional discovery

## Orch Test Documentation Roadmap

Status:

- current planned `orch` Markdown test-plan set is authored under `docs/tests/orch/`
- global conventions, shared fixtures, workflow scenarios, per-command indexes, and concrete case documents now exist
- `docs/tests/orch/ROADMAP.md` now tracks authored counts, document progress, and future additions in the same style used for `docs/tests/inbox/ROADMAP.md`
- supplemental command-visible cases now cover high-value gaps in `task add`, `ready`, `answer`, `cleanup`, and `council report`

Goal:

- make `orch` behavior easy for a new agent to understand and convert into automated tests or manual validation steps without re-reading all scheduler code paths

Directory layout:

- `docs/tests/orch/README.md`
- `docs/tests/orch/ROADMAP.md`
- `docs/tests/orch/_shared/README.md`
- `docs/tests/orch/workflows/README.md`
- `docs/tests/orch//README.md`
- `docs/tests/orch//.md`

Current document model:

- one folder per implemented leaf command
- each command folder uses `README.md` as an index only
- workflow cases live in `docs/tests/orch/workflows/README.md`
- detailed case backlog and authored-case register are tracked centrally in `docs/tests/orch/ROADMAP.md`

Next documentation step:

- keep `docs/tests/orch/ROADMAP.md` synchronized when new `orch` CLI behavior or workflow cases are added, removed, or materially revised

## Out Of Scope For First Pass

Do not block v1 on these:

- advanced auth or permissions
- background daemons beyond blocking CLI commands

## Handoff Notes For Future Agents

- The design phase is complete enough to start coding.
- Avoid reopening major design questions unless implementation forces it.
- The repository already has compiling binaries and working schema init.
- The inbox test-plan docs are in place; keep them synchronized before and during broad `orch` implementation.
- Inbox command test-plan folders use `README.md` as an index plus one file per case; keep any further structural changes consistent with the documented rules above.
- Preserve the separation:
  - `inbox` handles communication
  - `orch` handles scheduling
  - `council-review` is a workflow on top of `orch`
- When writing inbox test docs, use the folder-per-command structure described above and keep cross-command cases inside `docs/tests/inbox/workflows/`.
- Treat this file as the implementation entrypoint for new agents.