Implementation Roadmap

Purpose

This document is the handoff-oriented implementation plan for the project. It is intentionally short and execution-focused.

A new agent should be able to read this file, understand the current project state, and immediately know what to build next without re-deriving the whole design.

Current Status

As of now:

  • architecture and workflow docs are written
  • CLI surfaces for inbox, orch, worktree execution, and council-review are defined
  • embedded SQLite schema and migrations exist in code
  • JSON output shapes are defined for the major flows
  • Go module and initial command skeletons exist
  • inbox and orch both compile
  • shared SQLite schema initialization exists
  • inbox is implemented end-to-end, including send/fetch/claim/renew/update/reply/done/fail/cancel/list/show/watch/wait-reply
  • inbox supports blocking waits, lease renewal, unread fetches backed by per-agent read cursors, --body-file, artifact attachments, and structured JSON errors with stable exit codes
  • integration tests now cover each implemented inbox command, plus the main inbox workflows, wait/watch flows, artifact persistence, unread behavior, and JSON error contracts
  • a human-readable inbox command test-plan set has been authored under docs/tests/inbox/
  • a human-readable orch test-plan set has now been authored under docs/tests/orch/, with a ROADMAP.md, shared conventions, workflow scenarios, per-command indexes, and concrete case documents aligned to the current CLI surface, including supplemental coverage for key flag validation, ordering/limit behavior, payload-only answers, cleanup errors, and council report default/error contracts
  • a reusable Codex skill package for inbox now exists under skills/inbox/, with a formal SKILL.md, agents/openai.yaml, and a bundled CLI binary asset
  • an inbox skill forward-test plan directory now exists under docs/tests/inbox-skill/, with a shared execution template and multiple scenario cases
  • an execution-roadmap workflow now exists under docs/roadmaps/active/ and docs/roadmaps/archive/ for agent-level work traces and completion archives
  • orch now implements run init/show, task add, dep add, ready, dispatch, reconcile, wait, blocked, answer, retry, reassign, cancel, cleanup, and status
  • orch can create runs, gate tasks through dependencies, dispatch work through inbox, reconcile worker thread state back into task state, answer blocked tasks, retry or reassign work, cancel tasks or runs, clean attempt worktrees, and create per-attempt Git worktrees during strict dispatch
  • orch dispatch now supports --repo-path, --workspace-root, and --strict-worktree, auto-enables strict worktree mode for code-like tasks inferred from task metadata, resolves committed base revisions, records workspace metadata on attempts, and writes that metadata into inbox task payloads
  • orch wait now blocks on run-scoped task events and reconciles inbox state while polling so leader waits can wake on worker progress without manual sleep loops
  • orch council start now creates a dedicated council run, persists council target input metadata, and dispatches the three fixed reviewer roles through the existing scheduler
  • orch council wait now blocks until the three reviewer tasks reach terminal states or a timeout is reached
  • orch council tally now parses completed reviewer outputs, persists council_findings, groups recommendations into consensus, majority, and minority, and persists council_groups
  • orch council report now reads persisted council_groups, renders human-readable markdown reports, writes markdown artifacts, and persists final report metadata in council_reports
  • automated integration tests now cover the main orch scheduler slice, including dependency gating, dispatch, blocked-answer flow, retry, reassign, cancel, cleanup, strict worktree creation, automatic code-task worktree enablement, dirty-repo rejection rules, wait wake/timeout behavior, and council start/wait/tally/report behavior

This means the project now has a working orch core scheduler with automatic worktree selection for code-like tasks, strict worktree-backed dispatch, the main leader-side control loop, and the full v1 council workflow from start through final report generation.

Source Of Truth

Read these docs first:

Use this roadmap for implementation order, not for protocol design.

Project Goal

Build a Go-based local agent orchestration stack with:

  • inbox: worker-facing durable coordination bus
  • orch: leader-facing scheduler and control plane
  • strict worktree-backed execution for code-writing task attempts
  • council-review: a user-facing three-reviewer brainstorm workflow implemented on top of orch

Implementation Principles

  • Do not redesign the protocol unless implementation reveals a real contradiction.
  • Keep inbox and orch as separate CLIs or command groups, but share one SQLite file.
  • Prefer one small working path over broad unfinished scaffolding.
  • Make JSON output stable early.
  • Implement the happy path first, then add wait/retry/cleanup.

Progress Snapshot

Current implementation status:

  • Milestone 1: Go Skeleton is complete
  • Milestone 2: Shared DB Layer is complete enough for both CLIs
  • Milestone 3: Inbox Happy Path is complete
  • Milestone 4: Orch Core Scheduling is complete for the current non-worktree scheduler scope
  • Milestone 5: Strict Worktree Support is complete
  • Milestone 6: Waiting Primitives is complete
  • Milestone 7: Council Review is complete

The council review v1 surface is now complete, including final report rendering and metadata persistence.

Milestone 1: Go Skeleton

Goal:

  • initialize the Go module
  • choose CLI framework and SQLite driver
  • create package layout
  • make empty commands compile

Recommended shape:

  • cmd/inbox
  • cmd/orch
  • internal/db
  • internal/store
  • internal/protocol
  • internal/cli

Definition of done:

  • go build ./... succeeds
  • inbox --help works
  • orch --help works

Status:

  • completed

Milestone 2: Shared DB Layer

Goal:

  • create the SQLite connection layer
  • enable required pragmas
  • add schema initialization and migration mechanism
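
The migration mechanism could look roughly like the sketch below. It is only an illustration under stated assumptions: the `Execer` interface, the `Migration` type, the integer version counter, and the table names in `main` are all hypothetical and are not taken from the project's code. A real implementation would wrap the SQLite connection and persist the version in the database.

```go
package main

import (
	"fmt"
	"sort"
)

// Execer is a hypothetical stand-in for a database handle.
type Execer interface {
	Exec(stmt string) error
}

// Migration is a hypothetical versioned schema step.
type Migration struct {
	Version int
	SQL     string
}

// applyMigrations runs every migration newer than current, in ascending
// version order, and returns the last version that was applied successfully.
func applyMigrations(db Execer, current int, ms []Migration) (int, error) {
	sorted := append([]Migration(nil), ms...) // copy so the caller's slice is untouched
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Version < sorted[j].Version })
	for _, m := range sorted {
		if m.Version <= current {
			continue // already applied
		}
		if err := db.Exec(m.SQL); err != nil {
			return current, fmt.Errorf("migration %d: %w", m.Version, err)
		}
		current = m.Version
	}
	return current, nil
}

// recorder is an in-memory Execer used only for this demonstration.
type recorder struct{ stmts []string }

func (r *recorder) Exec(stmt string) error {
	r.stmts = append(r.stmts, stmt)
	return nil
}

func main() {
	db := &recorder{}
	// Table names here are placeholders, not the project's schema.
	v, err := applyMigrations(db, 1, []Migration{
		{Version: 2, SQL: "CREATE TABLE events (id INTEGER PRIMARY KEY)"},
		{Version: 1, SQL: "CREATE TABLE threads (id INTEGER PRIMARY KEY)"},
	})
	fmt.Println(v, err, len(db.stmts))
}
```

Keeping the runner behind a small interface makes the ordering and skip logic testable without opening a real database file.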

Minimum scope:

  • communication tables for inbox
  • scheduling tables for orch
  • shared events table

Definition of done:

  • inbox init initializes the database
  • orch can open the same database successfully

Status:

  • completed for the current needs of both inbox and orch

Completed so far:

  • shared DB open layer exists
  • required SQLite pragmas are applied
  • embedded schema files exist
  • inbox init applies schema successfully

Remaining:

  • decide whether orch should gain an explicit DB bootstrap check or continue to rely on inbox init

Milestone 3: Inbox Happy Path

Goal:

  • implement worker-facing coordination primitives first

First commands:

  • inbox init
  • inbox send
  • inbox fetch
  • inbox claim
  • inbox update
  • inbox reply
  • inbox done
  • inbox fail
  • inbox show

Delay if needed:

  • watch
  • wait-reply
  • cancel
  • list

Definition of done:

  • one thread can be created, claimed, updated, replied to, and completed
  • all major commands support --json
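
To make the idea of structured JSON errors with stable exit codes concrete, here is a minimal sketch. The envelope shape, the field names, the error codes, and the numeric exit statuses are all assumptions chosen for illustration; the actual contract is whatever the inbox CLI and its tests define.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cliError is a hypothetical error envelope; field names are assumptions.
type cliError struct {
	Code     string `json:"code"`
	Message  string `json:"message"`
	ExitCode int    `json:"-"` // process exit status, kept out of the JSON body
}

// exitCodes maps error codes to exit statuses. The values are illustrative.
var exitCodes = map[string]int{
	"invalid_argument": 2,
	"not_found":        3,
	"conflict":         4,
}

// newError builds an error whose exit status is derived from its code,
// falling back to 1 for codes that are not in the table.
func newError(code, msg string) cliError {
	exit, ok := exitCodes[code]
	if !ok {
		exit = 1
	}
	return cliError{Code: code, Message: msg, ExitCode: exit}
}

// JSON renders the error under a top-level "error" key.
func (e cliError) JSON() string {
	b, err := json.Marshal(map[string]cliError{"error": e})
	if err != nil {
		return `{"error":{"code":"internal","message":"failed to encode error"}}`
	}
	return string(b)
}

func main() {
	e := newError("not_found", "thread not found")
	fmt.Println(e.JSON())
	fmt.Println("exit code:", e.ExitCode)
	// A real command would finish with os.Exit(e.ExitCode).
}
```

Deriving the exit status from the error code in one table is one way to keep the two from drifting apart as commands are added.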

Status:

  • completed

Completed so far:

  • inbox init
  • inbox send
  • inbox fetch
  • inbox claim
  • inbox renew
  • inbox update
  • inbox reply
  • inbox done
  • inbox fail
  • inbox cancel
  • inbox list
  • inbox show
  • inbox watch
  • inbox wait-reply

Milestone 4: Orch Core Scheduling

Goal:

  • implement run/task/dependency/attempt orchestration on top of inbox

First commands:

  • orch run init
  • orch task add
  • orch dep add
  • orch ready
  • orch dispatch
  • orch reconcile
  • orch blocked
  • orch answer
  • orch status

Delay if needed:

  • retry
  • reassign
  • cancel
  • cleanup
  • wait

Definition of done:

  • a leader can create a run
  • add tasks and dependencies
  • dispatch a task through orch
  • see worker state reflected back after reconcile
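
Dependency gating can be pictured with a small in-memory sketch: a task is ready when it is still pending and every task it depends on is done. The `Task` type and the status strings below are hypothetical; the real scheduler reads this state from the shared SQLite database.

```go
package main

import (
	"fmt"
	"sort"
)

// Task is a hypothetical in-memory view of a scheduled task.
type Task struct {
	ID     string
	Status string   // assumed values: "pending", "dispatched", "done"
	Deps   []string // IDs of tasks that must be done first
}

// readyTasks returns the sorted IDs of pending tasks whose dependencies
// are all done.
func readyTasks(tasks []Task) []string {
	done := make(map[string]bool)
	for _, t := range tasks {
		if t.Status == "done" {
			done[t.ID] = true
		}
	}
	var ready []string
	for _, t := range tasks {
		if t.Status != "pending" {
			continue
		}
		ok := true
		for _, d := range t.Deps {
			if !done[d] {
				ok = false
				break
			}
		}
		if ok {
			ready = append(ready, t.ID)
		}
	}
	sort.Strings(ready)
	return ready
}

func main() {
	tasks := []Task{
		{ID: "T1", Status: "done"},
		{ID: "T2", Status: "pending", Deps: []string{"T1"}},
		{ID: "T3", Status: "pending", Deps: []string{"T2"}},
	}
	// T2 is ready because T1 is done; T3 still waits on T2.
	fmt.Println(readyTasks(tasks))
}
```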

Status:

  • completed for the current non-worktree scheduling scope

Completed so far:

  • orch run init
  • orch run show
  • orch task add
  • orch dep add
  • orch ready
  • orch dispatch
  • orch reconcile
  • orch wait
  • orch blocked
  • orch answer
  • orch retry
  • orch reassign
  • orch cancel
  • orch cleanup
  • orch status
  • CLI integration tests cover dispatch/reconcile, dependency gating, blocked-answer flow, wait wake/timeout, retry, reassign, cancel, cleanup, and non-ready dispatch rejection

Remaining:

  • none for the current scheduler control surface

Milestone 5: Strict Worktree Support

Goal:

  • ensure code-writing tasks execute in isolated worktrees

First scope:

  • orch dispatch resolves base_ref
  • strict mode fails when the repo is dirty and no explicit base is provided
  • worktree path and branch name are stored on the attempt
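
The base-ref policy described in this roadmap (an explicit base always wins, a clean repository defaults to HEAD, a dirty repository without an explicit base is rejected) can be sketched as a pure function. The function and error names are hypothetical, and the sketch stops before the step that resolves the chosen ref to a concrete commit.

```go
package main

import (
	"errors"
	"fmt"
)

// errDirtyRepo is a hypothetical sentinel for the strict-mode rejection.
var errDirtyRepo = errors.New("repository is dirty and no explicit base ref was provided")

// resolveBaseRef picks the ref a new worktree should branch from.
// explicitBase mirrors the --base-ref flag; dirty reports whether the
// source repository has uncommitted changes.
func resolveBaseRef(explicitBase string, dirty bool) (string, error) {
	if explicitBase != "" {
		return explicitBase, nil // an explicit base is accepted even on a dirty repo
	}
	if dirty {
		return "", errDirtyRepo
	}
	return "HEAD", nil
}

func main() {
	for _, c := range []struct {
		base  string
		dirty bool
	}{
		{"", false},
		{"", true},
		{"main", true},
	} {
		ref, err := resolveBaseRef(c.base, c.dirty)
		fmt.Printf("base=%q dirty=%v -> ref=%q err=%v\n", c.base, c.dirty, ref, err)
	}
}
```

Separating the policy from the Git calls keeps the dirty-repo rules easy to unit test without creating real repositories.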

Definition of done:

  • a code task dispatch creates a real worktree
  • the assigned worktree path appears in attempt metadata and inbox payload

Status:

  • completed

Completed so far:

  • orch dispatch can use --repo-path to target a source Git repository without relying on the caller's current working directory
  • orch dispatch --strict-worktree resolves base_ref to a concrete commit, defaults to HEAD on clean repositories, and rejects dirty repositories when --base-ref is omitted
  • orch dispatch auto-selects worktree mode for code-like tasks inferred from existing task metadata such as worker role and acceptance markers
  • dispatch creates a fresh branch and Git worktree per attempt and persists base_ref, base_commit, branch_name, worktree_path, and workspace_status
  • dispatch writes workspace metadata into the inbox task payload for worker runtimes
  • reconcile now advances workspace_status from created to active, completed, or abandoned based on thread state
  • orch cleanup removes completed or abandoned worktrees and marks attempt workspace state as cleaned
  • CLI integration tests cover strict worktree creation, auto-enabled worktrees for code-like tasks, explicit-base dispatch on dirty repos, strict dirty-repo rejection, and cleanup

Remaining:

  • none

Milestone 6: Waiting Primitives

Goal:

  • replace blind polling with blocking CLI waits

Commands:

  • orch wait
  • inbox wait-reply

Definition of done:

  • leader can block on new task events
  • blocked worker can block on reply events
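
From the caller's point of view a wait command blocks, while internally it can poll until a matching event appears or a timeout expires. The sketch below shows that shape with a context deadline and a ticker. The names `waitFor` and `pollDemo`, the 5 ms interval, and the fake check are assumptions made for illustration, not the project's implementation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitFor calls check immediately and then once per interval until it
// reports true, returns an error, or ctx is done.
func waitFor(ctx context.Context, interval time.Duration, check func() (bool, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		ok, err := check()
		if err != nil {
			return err
		}
		if ok {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

// pollDemo drives waitFor with a fake check that succeeds on the
// readyAfter-th call and gives up after timeoutMs milliseconds.
func pollDemo(readyAfter, timeoutMs int) (calls int, timedOut bool) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMs)*time.Millisecond)
	defer cancel()
	err := waitFor(ctx, 5*time.Millisecond, func() (bool, error) {
		calls++
		return calls >= readyAfter, nil
	})
	return calls, errors.Is(err, context.DeadlineExceeded)
}

func main() {
	calls, timedOut := pollDemo(3, 1000) // the event shows up on the third poll
	fmt.Println(calls, timedOut)

	_, timedOut = pollDemo(1<<30, 30) // the event never shows up
	fmt.Println(timedOut)
}
```

In a real command the check function would be the place to reconcile inbox state and look for new run-scoped events.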

Status:

  • completed

Completed so far:

  • orch wait
  • inbox wait-reply
  • orch wait reconciles inbox state while polling and wakes on matching run-scoped task_* events
  • CLI integration tests cover wait wake and timeout behavior

Milestone 7: Council Review

Goal:

  • implement the user-facing three-reviewer brainstorming workflow

First commands:

  • orch council start
  • orch council wait
  • orch council tally
  • orch council report

Definition of done:

  • one council run can dispatch three reviewers
  • tally groups recommendations into consensus, majority, and minority
  • the workflow produces stable JSON and a markdown report artifact
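
One plausible reading of the consensus, majority, and minority buckets is a count of how many reviewers raised the same recommendation. The sketch below assumes exactly that: all reviewers means consensus, more than half means majority, anything else is minority. The `Finding` type, the thresholds, and the assumption that recommendations are already normalized to comparable strings are illustrative and not confirmed by this roadmap.

```go
package main

import (
	"fmt"
	"sort"
)

// Finding is a hypothetical flattened reviewer recommendation.
type Finding struct {
	Reviewer       string
	Recommendation string // assumed normalized: equal strings mean the same advice
}

// groupFindings buckets each distinct recommendation by how many distinct
// reviewers raised it.
func groupFindings(findings []Finding, reviewers int) map[string][]string {
	supporters := make(map[string]map[string]bool)
	for _, f := range findings {
		if supporters[f.Recommendation] == nil {
			supporters[f.Recommendation] = make(map[string]bool)
		}
		supporters[f.Recommendation][f.Reviewer] = true
	}
	groups := map[string][]string{"consensus": nil, "majority": nil, "minority": nil}
	for rec, who := range supporters {
		switch n := len(who); {
		case n >= reviewers:
			groups["consensus"] = append(groups["consensus"], rec)
		case 2*n > reviewers:
			groups["majority"] = append(groups["majority"], rec)
		default:
			groups["minority"] = append(groups["minority"], rec)
		}
	}
	for _, recs := range groups {
		sort.Strings(recs) // stable ordering for stable JSON output
	}
	return groups
}

func main() {
	findings := []Finding{
		{"architecture-reviewer", "add retries"},
		{"implementation-reviewer", "add retries"},
		{"risk-reviewer", "add retries"},
		{"architecture-reviewer", "split the package"},
		{"implementation-reviewer", "split the package"},
		{"risk-reviewer", "add rate limiting"},
	}
	groups := groupFindings(findings, 3)
	fmt.Println("consensus:", groups["consensus"])
	fmt.Println("majority:", groups["majority"])
	fmt.Println("minority:", groups["minority"])
}
```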

Status:

  • completed

Completed so far:

  • council-specific storage now includes run metadata, reviewer assignment rows, reviewer findings/groups tables, persisted council input references, and final report metadata
  • orch council start
  • orch council wait
  • orch council tally
  • orch council report
  • council start creates a dedicated run, stores council target input metadata, creates reviewer tasks CR1 through CR3, and dispatches the fixed reviewer roles architecture-reviewer, implementation-reviewer, and risk-reviewer
  • council wait blocks until all three reviewer tasks reach terminal states or timeout
  • council tally parses structured reviewer outputs from completed reviewer result messages and persists grouped recommendations
  • council report reads grouped recommendations from persisted council_groups, supports --show bucket filtering, renders markdown report artifacts, and persists report metadata plus artifact paths
  • CLI integration tests cover council start dispatch, metadata persistence, council wait wake/timeout behavior, council tally grouping in normal and strict modes, and council report default/all/JSON rendering behavior

Remaining:

  • none for the v1 council workflow

Immediate Next Task

If a new agent is taking over now, the next concrete steps should be:

  1. treat Milestone 7: Council Review as complete unless a new user request introduces a new council capability
  2. keep the authored inbox test-plan set in docs/tests/inbox/ synchronized if future orch work changes shared CLI behavior
  3. choose the next milestone explicitly instead of reopening the completed council v1 slice

The inbox implementation and its human-readable test-plan set are already in place, and orch now supports the main scheduler loop plus the complete council start/wait/tally/report workflow, so any next step should be a new milestone rather than unfinished council v1 work.

Current recommendation:

  • CLI framework: Cobra
  • SQLite driver: pure-Go driver

Reason:

  • command surfaces are already command-group heavy
  • pure-Go SQLite keeps distribution simpler

Suggested Early Tests

Completed so far:

  • schema init test
  • inbox command-level CLI integration coverage aligned to docs/tests/inbox/
  • inbox workflow lifecycle coverage
  • orch scheduler lifecycle coverage for run/task/dependency/dispatch/reconcile
  • orch blocked-question and answer coverage
  • orch strict worktree creation and dirty-repo policy coverage
  • orch wait wake and timeout coverage
  • orch retry, reassign, cancel, and cleanup coverage
  • orch council start dispatch and persistence coverage
  • orch council wait wake and timeout coverage
  • orch council tally grouping coverage
  • orch council report default markdown, --show all, and JSON shape coverage

Still recommended before the codebase grows too much:

  • worktree path generation test
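
As a starting point for that test, the sketch below shows one possible path generator and the properties a test might check. The directory layout (`<root>/<run>/<task>/attempt-<n>`), the function names, and the sanitizing rule are assumptions for illustration; the project's actual path scheme may differ.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// sanitize keeps ASCII letters, digits, '-' and '_', and replaces every
// other character with '-', so IDs cannot introduce separators or "..".
func sanitize(s string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9', r == '-', r == '_':
			return r
		}
		return '-'
	}, s)
}

// worktreePath builds a per-attempt directory under the workspace root.
// The layout is hypothetical.
func worktreePath(root, runID, taskID string, attempt int) string {
	return filepath.Join(root, sanitize(runID), sanitize(taskID), fmt.Sprintf("attempt-%d", attempt))
}

func main() {
	first := worktreePath("/tmp/ws", "run 1", "T1", 1)
	second := worktreePath("/tmp/ws", "run 1", "T1", 2)
	fmt.Println(first)
	fmt.Println(second)
	fmt.Println("distinct per attempt:", first != second)
}
```

Useful assertions for such a test include that two attempts never share a path and that unusual characters in run or task IDs cannot escape the workspace root.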

Inbox Test Documentation Roadmap

Status:

  • completed for the current inbox CLI surface
  • command-level and workflow Markdown documents exist under docs/tests/inbox/
  • future updates should revise this section only when new inbox commands or materially new CLI-visible behavior are added

Goal:

  • make inbox behavior easy for a new agent to understand and convert into automated tests without re-reading all code paths

Directory layout:

  • docs/tests/inbox/README.md
  • docs/tests/inbox/_shared/README.md
  • docs/tests/inbox/workflows/README.md
  • docs/tests/inbox/<command>/README.md
  • docs/tests/inbox/<command>/<case-slug>.md

Initial command folders:

  • init
  • send
  • fetch
  • claim
  • renew
  • update
  • reply
  • done
  • fail
  • cancel
  • list
  • show
  • watch
  • wait-reply

Documentation rules:

  • organize by folder with a README.md entrypoint
  • command folders use README.md as an index only
  • each command case lives in its own Markdown file named after the case slug
  • do not use numeric test case IDs
  • identify command cases by concrete file path
  • keep one command per directory, plus workflows/ for cross-command behavior
  • use _shared/ for common fixtures, database conventions, exit-code rules, and shared JSON assertions

Required per-case structure:

  • Case purpose (用例意义)
  • Preconditions (前置条件)
  • Input (输入)
  • Expected output (预期输出)
  • Assertion conclusion (断言结论)

Case file naming pattern:

  • <stable-slug>.md

Authoring order:

  1. global conventions in docs/tests/inbox/README.md
  2. shared fixtures and assertion helpers in docs/tests/inbox/_shared/README.md
  3. lifecycle flow in docs/tests/inbox/workflows/README.md
  4. core command docs: send, fetch, claim, reply, done, show
  5. secondary command docs: renew, update, fail, cancel, list
  6. waiting and read-state docs: watch, wait-reply, unread and mark-read workflow cases

Definition of done:

  • every implemented inbox command has a dedicated document directory
  • every documented case contains concrete input and expected output
  • shared assumptions are centralized instead of copied into each command file
  • a new agent can pick any case and implement it as an automated test with minimal additional discovery

Orch Test Documentation Roadmap

Status:

  • current planned orch Markdown test-plan set is authored under docs/tests/orch/
  • global conventions, shared fixtures, workflow scenarios, per-command indexes, and concrete case documents now exist
  • docs/tests/orch/ROADMAP.md now tracks authored counts, document progress, and future additions in the same style used for docs/tests/inbox/ROADMAP.md
  • supplemental command-visible cases now cover high-value gaps in task add, ready, answer, cleanup, and council report

Goal:

  • make orch behavior easy for a new agent to understand and convert into automated tests or manual validation steps without re-reading all scheduler code paths

Directory layout:

  • docs/tests/orch/README.md
  • docs/tests/orch/ROADMAP.md
  • docs/tests/orch/_shared/README.md
  • docs/tests/orch/workflows/README.md
  • docs/tests/orch/<leaf-command>/README.md
  • docs/tests/orch/<leaf-command>/<case-slug>.md

Current document model:

  • one folder per implemented leaf command
  • each command folder uses README.md as an index only
  • workflow cases live in docs/tests/orch/workflows/README.md
  • detailed case backlog and authored-case register are tracked centrally in docs/tests/orch/ROADMAP.md

Next documentation step:

  • keep docs/tests/orch/ROADMAP.md synchronized when new orch CLI behavior or workflow cases are added, removed, or materially revised

Out Of Scope For First Pass

Do not block v1 on these:

  • advanced auth or permissions
  • background daemons beyond blocking CLI commands

Handoff Notes For Future Agents

  • The design phase is complete enough to start coding.
  • Avoid reopening major design questions unless implementation forces it.
  • The repository already has compiling binaries and working schema init.
  • The inbox test-plan docs are in place; keep them synchronized whenever orch work changes shared CLI behavior.
  • inbox command test-plan folders use README.md as an index plus one file per case; keep any further structural changes consistent with the documented rules above.
  • Preserve the separation:
    • inbox handles communication
    • orch handles scheduling
    • council-review is a workflow on top of orch
  • When writing inbox test docs, use the folder-per-command structure described above and keep cross-command cases inside docs/tests/inbox/workflows/.
  • Treat this file as the implementation entrypoint for new agents.