# Implementation Roadmap

## Purpose

This document is the handoff-oriented implementation plan for the project.

It is intentionally short and execution-focused. A new agent should be able to read this file, understand the current project state, and immediately know what to build next without re-deriving the whole design.

## Current Status

As of now:

- architecture and workflow docs are written
- CLI surfaces for `inbox`, `orch`, worktree execution, and `council-review` are defined
- embedded SQLite schema and migrations exist in code
- JSON output shapes are defined for the major flows
- Go module and initial command skeletons exist
- `inbox` and `orch` both compile
- shared SQLite schema initialization exists
- `inbox` is implemented end-to-end, including send/fetch/claim/renew/update/reply/done/fail/cancel/list/show/watch/wait-reply
- `inbox` supports blocking waits, lease renewal, unread fetches backed by per-agent read cursors, `--body-file`, artifact attachments, and structured JSON errors with stable exit codes
- integration tests now cover each implemented inbox command, plus the main inbox workflows, wait/watch flows, artifact persistence, unread behavior, and JSON error contracts
- a human-readable inbox command test-plan set has been authored under `docs/tests/inbox/`
- a human-readable `orch` test-plan set has now been authored under `docs/tests/orch/`, with a `ROADMAP.md`, shared conventions, workflow scenarios, per-command indexes, and concrete case documents aligned to the current CLI surface, including supplemental coverage for key flag validation, ordering/limit behavior, payload-only answers, cleanup errors, and council report default/error contracts
- a reusable Codex skill package for `inbox` now exists under `skills/inbox/`, with a formal `SKILL.md`, `agents/openai.yaml`, and a bundled CLI binary asset
- reusable Codex skill packages for `orch` and `council-review` now exist under `skills/orch/` and `skills/council-review/`, both using bundled copies of the `orch` CLI binary asset
- an inbox skill forward-test plan directory now exists under `docs/tests/inbox-skill/`, with a shared execution template and multiple scenario cases
- an orch skill forward-test plan directory now exists under `docs/tests/orch-skill/`, with a shared execution contract and eight leader-side workflow scenarios
- a repo-local replay runner now exists at `scripts/run_orch_skill_forward_tests.sh`, and all eight `docs/tests/orch-skill/` cases now include recorded example runs from bundled-CLI replays captured on `2026-03-19`, including added coverage for dependency-gated ready sequencing, active task cancellation, and payload-only blocked answers
- the original five `docs/tests/orch-skill/` cases also include recorded real subagent-forward runs captured on `2026-03-19`, with spawned leader and worker agents using the packaged `skills/orch/` and `skills/inbox/` bundles
- a council-review skill forward-test plan directory now exists under `docs/tests/council-review-skill/`, with a shared execution contract and nine council workflow scenarios covering end-to-end flow, unanimous-only defaults, timeout/before-tally errors, explicit minority reporting, invalid report filters, strict tally semantics, malformed reviewer JSON, and target-file inputs
- an execution-roadmap workflow now exists under `docs/roadmaps/active/` and `docs/roadmaps/archive/` for agent-level work traces and completion archives
- a forward-looking web product monorepo plan now exists under `docs/web-product-monorepo.md`, defining the recommended React frontend, `chi` HTTP service, `cmd/orchd` entrypoint, and shared application/query layering for future web work
- the Phase 1 web-product skeleton is now in place, including root `pnpm` workspace files, a standalone React app under `apps/web`, an initial OpenAPI/events contract under `api/`, and a new `cmd/orchd` HTTP service backed by `internal/app`, `internal/query`, and `internal/httpapi`
- `orchd` now serves a minimal read-only web API with `chi`, including `/health`, runs list/detail, run task list, blocked-task list, and thread detail endpoints backed by the existing SQLite state
- HTTP tests now cover the initial read-only `orchd` slice, and the new frontend workspace builds successfully with `pnpm run web:build`
- Phase 2 frontend work has now started by bootstrapping `apps/web` with copied-in `cadence-ui` tokens and foundational components for button, input, textarea, dialog, form, tabs, card, badge, and alert, with the shared token stylesheet loaded from the frontend entrypoint
- a repo-local `scripts/package_skill_clis.sh` packaging flow now builds bundled skill CLI assets for `inbox`, `orch`, and `council-review`
- `orch` now implements `run init/show`, `task add`, `dep add`, `ready`, `dispatch`, `reconcile`, `wait`, `blocked`, `answer`, `retry`, `reassign`, `cancel`, `cleanup`, and `status`
- `orch` can create runs, gate tasks through dependencies, dispatch work through `inbox`, reconcile worker thread state back into task state, answer blocked tasks, retry or reassign work, cancel tasks or runs, clean attempt worktrees, and create per-attempt Git worktrees during strict dispatch
- `orch dispatch` now supports `--repo-path`, `--workspace-root`, and `--strict-worktree`, auto-enables strict worktree mode for code-like tasks inferred from task metadata, resolves committed base revisions, records workspace metadata on attempts, and writes that metadata into inbox task payloads
- `orch wait` now blocks on run-scoped task events and reconciles inbox state while polling so leader waits can wake on worker progress without manual sleep loops
- `orch council start` now creates a dedicated council run, persists council target input metadata, and dispatches the three fixed reviewer roles through the existing scheduler
- `orch council wait` now blocks until the three reviewer tasks reach terminal states or a timeout is reached
- `orch council tally` now parses completed reviewer outputs, persists `council_findings`, groups recommendations into `consensus`, `majority`, and `minority`, and persists `council_groups`
- `orch council report` now reads persisted `council_groups`, renders human-readable markdown reports, writes markdown artifacts, and persists final report metadata in `council_reports`
- automated integration tests now cover the main `orch` scheduler slice, including dependency gating, dispatch, blocked-answer flow, retry, reassign, cancel, cleanup, strict worktree creation, automatic code-task worktree enablement, dirty-repo rejection rules, wait wake/timeout behavior, and council start/wait/tally/report behavior
- additional `orch` command and workflow contract tests now cover the full documented Markdown case set under `docs/tests/orch/`, including `run init/show`, `task add` validation, ready ordering, dispatch attempt/thread contracts, blocked latest-question output, answer payload-only and empty-input rejection, cleanup selector and no-match errors, status summaries, reconcile failed-state mapping, strict-worktree dispatch-to-cleanup, and council report default/error behavior

This means the project now has a working `orch` core scheduler with automatic worktree selection for code-like tasks, strict worktree-backed dispatch, the main leader-side control loop, and the full v1 council workflow from start through final report generation.
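To make the `consensus` / `majority` / `minority` grouping performed by `orch council tally` concrete, here is a minimal Go sketch. It is not the project's implementation: the `Finding` type, the `Bucket` and `Tally` functions, and the thresholds (all reviewers agree, more than half agree, anything else) are assumptions chosen for illustration, and the real `normal` and `strict` tally modes may behave differently.

```go
package main

import (
	"fmt"
	"sort"
)

// Finding is a hypothetical, simplified reviewer finding. The real
// council_findings rows may carry more fields.
type Finding struct {
	Reviewer       string
	Recommendation string
}

// Bucket maps a supporter count to a group name. The thresholds are an
// assumption: every reviewer agrees, more than half agree, or fewer.
func Bucket(supporters, reviewers int) string {
	switch {
	case supporters >= reviewers:
		return "consensus"
	case supporters*2 > reviewers:
		return "majority"
	default:
		return "minority"
	}
}

// Tally groups recommendations by how many distinct reviewers raised them.
// A reviewer repeating the same recommendation is counted once.
func Tally(findings []Finding, reviewers int) map[string][]string {
	supporters := map[string]map[string]bool{}
	for _, f := range findings {
		if supporters[f.Recommendation] == nil {
			supporters[f.Recommendation] = map[string]bool{}
		}
		supporters[f.Recommendation][f.Reviewer] = true
	}
	groups := map[string][]string{}
	for rec, by := range supporters {
		b := Bucket(len(by), reviewers)
		groups[b] = append(groups[b], rec)
	}
	// Sort each group so the output order does not depend on map iteration.
	for _, recs := range groups {
		sort.Strings(recs)
	}
	return groups
}

func main() {
	groups := Tally([]Finding{
		{"architecture-reviewer", "add index"},
		{"implementation-reviewer", "add index"},
		{"risk-reviewer", "add index"},
		{"architecture-reviewer", "split package"},
		{"risk-reviewer", "split package"},
		{"risk-reviewer", "add rate limit"},
	}, 3)
	for _, name := range []string{"consensus", "majority", "minority"} {
		fmt.Println(name, groups[name])
	}
}
```

Sorting inside each group is a deliberate choice in this sketch: it keeps the grouped output deterministic, which matters if the result feeds a stable JSON contract.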
## Source Of Truth

Read these docs first:

- [architecture.md](/home/kurihada/project/ai-workflow-skill/docs/architecture.md)
- [inbox-cli.md](/home/kurihada/project/ai-workflow-skill/docs/inbox-cli.md)
- [orch-cli.md](/home/kurihada/project/ai-workflow-skill/docs/orch-cli.md)
- [worktree-execution.md](/home/kurihada/project/ai-workflow-skill/docs/worktree-execution.md)
- [council-review.md](/home/kurihada/project/ai-workflow-skill/docs/council-review.md)
- [web-product-monorepo.md](/home/kurihada/project/ai-workflow-skill/docs/web-product-monorepo.md)

Use this roadmap for implementation order, not for protocol design.

## Project Goal

Build a Go-based local agent orchestration stack with:

- `inbox`: worker-facing durable coordination bus
- `orch`: leader-facing scheduler and control plane
- strict worktree-backed execution for code-writing task attempts
- `council-review`: a user-facing three-reviewer brainstorm workflow implemented on top of `orch`

## Implementation Principles

- Do not redesign the protocol unless implementation reveals a real contradiction.
- Keep `inbox` and `orch` as separate CLIs or command groups, but share one SQLite file.
- Prefer one small working path over broad unfinished scaffolding.
- Make JSON output stable early.
- Implement the happy path first, then add wait/retry/cleanup.
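One way to read the "make JSON output stable early" principle is to pin the error envelope in a typed struct, so field names and order cannot drift between commands. The sketch below is illustrative only: the `Envelope` and `CLIError` types, the field names, and the `not_found` / exit code `3` pairing are all hypothetical and are not taken from the actual `inbox` or `orch` contracts.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// CLIError is a hypothetical structured error payload. The field names are
// assumptions, not the project's documented JSON contract.
type CLIError struct {
	Code     string `json:"code"`
	Message  string `json:"message"`
	ExitCode int    `json:"exit_code"`
}

// Envelope is a hypothetical top-level response wrapper.
type Envelope struct {
	OK    bool      `json:"ok"`
	Error *CLIError `json:"error,omitempty"`
}

// RenderError marshals an error envelope. encoding/json emits struct fields
// in declaration order, so the output shape is deterministic.
func RenderError(code, message string, exitCode int) (string, error) {
	b, err := json.Marshal(Envelope{
		OK:    false,
		Error: &CLIError{Code: code, Message: message, ExitCode: exitCode},
	})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := RenderError("not_found", "thread not found", 3)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Prints: {"ok":false,"error":{"code":"not_found","message":"thread not found","exit_code":3}}
	fmt.Println(out)
}
```

Marshaling from a struct rather than a `map[string]any` is the design point here: map keys are sorted alphabetically on output, while struct fields keep a reviewed, intentional order.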
## Recommended v1 Order

## Progress Snapshot

Current implementation status:

- `Milestone 1: Go Skeleton` is complete
- `Milestone 2: Shared DB Layer` is complete enough for both CLIs
- `Milestone 3: Inbox Happy Path` is complete
- `Milestone 4: Orch Core Scheduling` is complete for the current non-worktree scheduler scope
- `Milestone 5: Strict Worktree Support` is complete
- `Milestone 6: Waiting Primitives` is complete
- `Milestone 7: Council Review` is complete
- `Milestone 8: Web Product Phase 1 Skeleton` is complete
- `Milestone 9: Web Product Phase 2 UI Foundation` is in progress

The council review v1 surface is complete, the first web-product skeleton now exists as a separate monorepo workspace plus read-only HTTP backend slice, and Phase 2 frontend work has started on top of the internal Cadence UI component library.

### Milestone 1: Go Skeleton

Goal:

- initialize the Go module
- choose CLI framework and SQLite driver
- create package layout
- make empty commands compile

Recommended shape:

- `cmd/inbox`
- `cmd/orch`
- `internal/db`
- `internal/store`
- `internal/protocol`
- `internal/cli`

Definition of done:

- `go build ./...` succeeds
- `inbox --help` works
- `orch --help` works

Status:

- completed

### Milestone 2: Shared DB Layer

Goal:

- create the SQLite connection layer
- enable required pragmas
- add schema initialization and migration mechanism

Minimum scope:

- communication tables for `inbox`
- scheduling tables for `orch`
- shared `events` table

Definition of done:

- `inbox init` initializes the database
- `orch` can open the same database successfully

Status:

- completed for current inbox needs

Completed so far:

- shared DB open layer exists
- required SQLite pragmas are applied
- embedded schema files exist
- `inbox init` applies schema successfully

Remaining:

- decide whether `orch` should gain an explicit DB bootstrap check or continue to rely on `inbox init`

### Milestone 3: Inbox Happy Path

Goal:

- implement worker-facing coordination primitives first

First commands:

- `inbox init`
- `inbox send`
- `inbox fetch`
- `inbox claim`
- `inbox update`
- `inbox reply`
- `inbox done`
- `inbox fail`
- `inbox show`

Delay if needed:

- `watch`
- `wait-reply`
- `cancel`
- `list`

Definition of done:

- one thread can be created, claimed, updated, replied to, and completed
- all major commands support `--json`

Status:

- completed

Completed so far:

- `inbox init`
- `inbox send`
- `inbox fetch`
- `inbox claim`
- `inbox renew`
- `inbox update`
- `inbox reply`
- `inbox done`
- `inbox fail`
- `inbox cancel`
- `inbox list`
- `inbox show`
- `inbox watch`
- `inbox wait-reply`

### Milestone 4: Orch Core Scheduling

Goal:

- implement run/task/dependency/attempt orchestration on top of `inbox`

First commands:

- `orch run init`
- `orch task add`
- `orch dep add`
- `orch ready`
- `orch dispatch`
- `orch reconcile`
- `orch blocked`
- `orch answer`
- `orch status`

Delay if needed:

- `retry`
- `reassign`
- `cancel`
- `cleanup`
- `wait`

Definition of done:

- a leader can create a run
- add tasks and dependencies
- dispatch a task through `orch`
- see worker state reflected back after `reconcile`

Status:

- completed for the current non-worktree scheduling scope

Completed so far:

- `orch run init`
- `orch run show`
- `orch task add`
- `orch dep add`
- `orch ready`
- `orch dispatch`
- `orch reconcile`
- `orch wait`
- `orch blocked`
- `orch answer`
- `orch retry`
- `orch reassign`
- `orch cancel`
- `orch cleanup`
- `orch status`
- CLI integration tests cover dispatch/reconcile, dependency gating, blocked-answer flow, wait wake/timeout, retry, reassign, cancel, cleanup, and non-ready dispatch rejection

Remaining:

- none for the current scheduler control surface

### Milestone 5: Strict Worktree Support

Goal:

- ensure code-writing tasks execute in isolated worktrees

First scope:

- `orch dispatch` resolves `base_ref`
- strict mode fails when the repo is dirty and no explicit base is provided
- worktree path and branch name are stored on the attempt

Definition of done:

- a code task dispatch creates a real worktree
- the assigned worktree path appears in attempt metadata and inbox payload

Status:

- completed

Completed so far:

- `orch dispatch` can use `--repo-path` to target a source Git repository without relying on the caller's current working directory
- `orch dispatch --strict-worktree` resolves `base_ref` to a concrete commit, defaults to `HEAD` on clean repositories, and rejects dirty repositories when `--base-ref` is omitted
- `orch dispatch` auto-selects worktree mode for code-like tasks inferred from existing task metadata such as worker role and acceptance markers
- dispatch creates a fresh branch and Git worktree per attempt and persists `base_ref`, `base_commit`, `branch_name`, `worktree_path`, and `workspace_status`
- dispatch writes workspace metadata into the inbox task payload for worker runtimes
- reconcile now advances `workspace_status` from `created` to `active`, `completed`, or `abandoned` based on thread state
- `orch cleanup` removes completed or abandoned worktrees and marks attempt workspace state as `cleaned`
- CLI integration tests cover strict worktree creation, auto-enabled worktrees for code-like tasks, explicit-base dispatch on dirty repos, strict dirty-repo rejection, and cleanup

Remaining:

- none

### Milestone 6: Waiting Primitives

Goal:

- replace blind polling with blocking CLI waits

Commands:

- `orch wait`
- `inbox wait-reply`

Definition of done:

- leader can block on new task events
- blocked worker can block on reply events

Status:

- completed

Completed so far:

- `orch wait`
- `inbox wait-reply`
- `orch wait` reconciles inbox state while polling and wakes on matching run-scoped `task_*` events
- CLI integration tests cover wait wake and timeout behavior

### Milestone 7: Council Review

Goal:

- implement the user-facing three-reviewer brainstorming workflow

First commands:

- `orch council start`
- `orch council wait`
- `orch council tally`
- `orch council report`

Definition of done:

- one council run can dispatch three reviewers
- tally grouped recommendations into `consensus`, `majority`, and `minority`
- produce stable JSON and a markdown report artifact

Status:

- completed

Completed so far:

- council-specific storage now includes run metadata, reviewer assignment rows, reviewer findings/groups tables, persisted council input references, and final report metadata
- `orch council start`
- `orch council wait`
- `orch council tally`
- `orch council report`
- council start creates a dedicated run, stores council target input metadata, creates reviewer tasks `CR1` through `CR3`, and dispatches the fixed reviewer roles `architecture-reviewer`, `implementation-reviewer`, and `risk-reviewer`
- council wait blocks until all three reviewer tasks reach terminal states or timeout
- council tally parses structured reviewer outputs from completed reviewer result messages and persists grouped recommendations
- council report reads grouped recommendations from persisted `council_groups`, supports `--show` bucket filtering, renders markdown report artifacts, and persists report metadata plus artifact paths
- CLI integration tests cover council start dispatch, metadata persistence, council wait wake/timeout behavior, council tally grouping in `normal` and `strict` modes, and council report default/all/JSON rendering behavior

Remaining:

- none for the v1 council workflow

### Milestone 8: Web Product Phase 1 Skeleton

Goal:

- create the first durable web-product backbone without replacing the existing CLI workflows

Add:

- root `pnpm` workspace files
- `apps/web`
- `api/openapi.yaml`
- `api/events.md`
- `cmd/orchd`
- `internal/app`
- `internal/query`
- `internal/httpapi`

Definition of done:

- the repository contains the agreed monorepo skeleton
- `orchd` can serve a small read-only HTTP API against the existing database
- the frontend workspace builds and can evolve independently in later milestones

Status:

- completed

Completed so far:

- root `package.json`, `pnpm-workspace.yaml`, and `pnpm-lock.yaml` now define the monorepo JS workspace
- `apps/web` now contains a Vite + React + TypeScript + TanStack Router + TanStack Query frontend shell
- `cmd/orchd` now opens the shared SQLite database, applies migrations, and serves a `chi` router with graceful shutdown handling
- `internal/query` now exposes run list/detail, run tasks, blocked-task, and thread-detail read models for the web surface
- `internal/app` now provides a thin web-service boundary over the new read service
- `internal/httpapi` now owns HTTP routing, JSON/error helpers, and the initial read-only endpoints
- `api/openapi.yaml` now documents the implemented read-only endpoints and response shapes
- `api/events.md` now captures the planned SSE contract for the next realtime slice
- `go test ./...` covers the new HTTP slice, and `pnpm run web:build` succeeds for the frontend workspace

Remaining:

- Phase 2 should turn the frontend shell into actual run, task-board, blocked-queue, and thread-detail pages using the new HTTP contract

### Milestone 9: Web Product Phase 2 UI Foundation

Goal:

- bootstrap the frontend UI layer on top of the Phase 1 shell and read-only backend contract

Add:

- copied-in `cadence-ui` primitives and token CSS under `apps/web/src/cadence-ui`
- app-wide token style wiring in the frontend entrypoint
- any additional component installs needed as real screens land

Definition of done:

- `apps/web` imports the shared Cadence token stylesheet from the frontend entrypoint
- the initial shared component set builds successfully inside the workspace
- future web screens can compose from `cadence-ui` primitives instead of raw one-off HTML controls

Status:

- in progress

Completed so far:

- `apps/web/src/cadence-ui/` now contains copied-in Cadence UI tokens plus foundational components for button, input, textarea, dialog, form, tabs, card, badge, and alert
- `apps/web/package.json` now includes the required Radix, `react-hook-form`, `motion`, and utility dependencies for the copied-in components
- `apps/web/src/main.tsx` now imports `./cadence-ui/tokens/styles.css`
- `pnpm install` refreshed the workspace lockfile, and `pnpm run web:build` succeeds with the copied-in component slice

Remaining:

- build the actual runs list, run detail, blocked queue, and thread timeline screens on top of the Cadence UI primitives
- install additional `cadence-ui` components on demand as the product surface expands

## Immediate Next Task

If a new agent is taking over now, the next concrete step should be:

1. treat `Milestone 8: Web Product Phase 1 Skeleton` as complete unless a new user request reopens the backend skeleton itself
2. keep the authored inbox test-plan set in `docs/tests/inbox/` synchronized if future `orch` or web work changes shared CLI-visible behavior
3. continue `Milestone 9: Web Product Phase 2 UI Foundation` by implementing the first runs list, run detail, blocked queue, and thread timeline screens on top of the existing `apps/web` and `orchd` contract
4. install additional `cadence-ui` components on demand when those screens need them, instead of reintroducing bespoke primitives into `apps/web`
5. keep `api/openapi.yaml`, `api/events.md`, and `docs/web-product-monorepo.md` synchronized as the web surface expands

The inbox implementation and its human-readable test-plan set are already in place, `orch` supports the main scheduler loop plus the complete council start/wait/tally/report workflow, and the web product is now past the bare frontend shell stage, so the next step should be actual Phase 2 product screens built on top of the Cadence UI foundation rather than reopening earlier milestones.
## Recommended Driver Choices

Current recommendation:

- CLI framework: `Cobra`
- SQLite driver: pure-Go driver

Reason:

- command surfaces are already command-group heavy
- pure-Go SQLite keeps distribution simpler

## Suggested Early Tests

Completed so far:

- schema init test
- inbox command-level CLI integration coverage aligned to `docs/tests/inbox/`
- inbox workflow lifecycle coverage
- orch scheduler lifecycle coverage for run/task/dependency/dispatch/reconcile
- orch blocked-question and answer coverage
- orch strict worktree creation and dirty-repo policy coverage
- orch wait wake and timeout coverage
- orch retry, reassign, cancel, and cleanup coverage
- orch council start dispatch and persistence coverage
- orch council wait wake and timeout coverage
- orch council tally grouping coverage
- orch council report default markdown, `--show all`, and JSON shape coverage

Still recommended before the codebase grows too much:

- worktree path generation test

## Inbox Test Documentation Roadmap

Status:

- completed for the current inbox CLI surface
- command-level and workflow Markdown documents exist under `docs/tests/inbox/`
- future updates should revise this section only when new inbox commands or materially new CLI-visible behavior are added

Goal:

- make inbox behavior easy for a new agent to understand and convert into automated tests without re-reading all code paths

Directory layout:

- `docs/tests/inbox/README.md`
- `docs/tests/inbox/_shared/README.md`
- `docs/tests/inbox/workflows/README.md`
- `docs/tests/inbox//README.md`
- `docs/tests/inbox//.md`

Initial command folders:

- `init`
- `send`
- `fetch`
- `claim`
- `renew`
- `update`
- `reply`
- `done`
- `fail`
- `cancel`
- `list`
- `show`
- `watch`
- `wait-reply`

Documentation rules:

- organize by folder with a `README.md` entrypoint
- command folders use `README.md` as an index only
- each command case lives in its own Markdown file named after the case slug
- do not use numeric test case IDs
- identify command cases by concrete file path
- keep one command per directory, plus `workflows/` for cross-command behavior
- use `_shared/` for common fixtures, database conventions, exit-code rules, and shared JSON assertions

Required per-case structure (the section headings are written in Chinese in the case files; English glosses are given in parentheses):

- `用例意义` (case purpose)
- `前置条件` (preconditions)
- `输入` (input)
- `预期输出` (expected output)
- `断言结论` (assertion conclusion)

Case file naming pattern:

- `.md`

Authoring order:

1. global conventions in `docs/tests/inbox/README.md`
2. shared fixtures and assertion helpers in `docs/tests/inbox/_shared/README.md`
3. lifecycle flow in `docs/tests/inbox/workflows/README.md`
4. core command docs: `send`, `fetch`, `claim`, `reply`, `done`, `show`
5. secondary command docs: `renew`, `update`, `fail`, `cancel`, `list`
6. waiting and read-state docs: `watch`, `wait-reply`, unread and mark-read workflow cases

Definition of done:

- every implemented inbox command has a dedicated document directory
- every documented case contains concrete input and expected output
- shared assumptions are centralized instead of copied into each command file
- a new agent can pick any case and implement it as an automated test with minimal additional discovery

## Orch Test Documentation Roadmap

Status:

- current planned `orch` Markdown test-plan set is authored under `docs/tests/orch/`
- global conventions, shared fixtures, workflow scenarios, per-command indexes, and concrete case documents now exist
- `docs/tests/orch/ROADMAP.md` now tracks authored counts, document progress, and future additions in the same style used for `docs/tests/inbox/ROADMAP.md`
- supplemental command-visible cases now cover high-value gaps in `task add`, `ready`, `answer`, `cleanup`, and `council report`

Goal:

- make `orch` behavior easy for a new agent to understand and convert into automated tests or manual validation steps without re-reading all scheduler code paths

Directory layout:

- `docs/tests/orch/README.md`
- `docs/tests/orch/ROADMAP.md`
- `docs/tests/orch/_shared/README.md`
- `docs/tests/orch/workflows/README.md`
- `docs/tests/orch//README.md`
- `docs/tests/orch//.md`

Current document model:

- one folder per implemented leaf command
- each command folder uses `README.md` as an index only
- workflow cases live in `docs/tests/orch/workflows/README.md`
- detailed case backlog and authored-case register are tracked centrally in `docs/tests/orch/ROADMAP.md`

Next documentation step:

- keep `docs/tests/orch/ROADMAP.md` synchronized when new `orch` CLI behavior or workflow cases are added, removed, or materially revised

## Out Of Scope For First Pass

Do not block v1 on these:

- advanced auth or permissions
- background daemons beyond blocking CLI commands

## Handoff Notes For Future Agents

- The design phase is complete enough to start coding.
- Avoid reopening major design questions unless implementation forces it.
- The repository already has compiling binaries and working schema init.
- The inbox test-plan docs are in place; keep them synchronized before and during broad `orch` implementation.
- inbox command test-plan folders use `README.md` as an index plus one file per case; keep any further structural changes consistent with the documented rules above.
- Preserve the separation:
  - `inbox` handles communication
  - `orch` handles scheduling
  - `council-review` is a workflow on top of `orch`
- When writing inbox test docs, use the folder-per-command structure described above and keep cross-command cases inside `docs/tests/inbox/workflows/`.
- Treat this file as the implementation entrypoint for new agents.