Add design docs and gitignore
@@ -0,0 +1,161 @@
# Agent Coordination Architecture

## Purpose

This document defines the system split between the worker-facing `inbox` layer and the leader-facing `orch` layer.

The design target is a local, file-portable agent coordination stack:

- `inbox`: durable communication bus
- `orch`: task graph and scheduling control plane
- worktree-backed task execution for code-writing workers
- optional user-facing council review workflow on top of `orch`
- shared SQLite database file
- leader and workers coordinated through stable CLI commands

## Why Two Layers

`inbox` and `orch` solve different problems.

- `inbox` answers: how do agents exchange durable messages, claim work, report progress, and return results?
- `orch` answers: what work exists, which tasks are ready, who should get them, and what happens after a block, failure, or retry?

If `inbox` is reduced to pure chat storage, the scheduler must reconstruct state from message history and ownership becomes ambiguous. If `inbox` tries to become a full scheduler, worker concerns and leader concerns get mixed into one unstable interface.

## Role Model

- `user`: talks only to the leader
- `leader`: owns the overall goal, task graph, acceptance criteria, and final integration
- `worker`: executes one assigned task at a time and reports through `inbox`
- `inbox`: durable thread/message/lease/artifact store
- `orch`: run/task/dependency/dispatch state machine built on top of `inbox`

## Default Usage Rules

- The leader should use `orch` as the default control surface.
- The leader may use `inbox` directly for inspection or manual repair.
- Workers should use `inbox` only.
- Workers should not use `orch`.
- User-facing discussion stays with the leader.
- Code-writing workers should run in `orch`-assigned Git worktrees, not in the user's primary checkout.

## Shared Storage Model

Both CLIs should point at the same SQLite file.

- `inbox` owns communication tables such as threads, messages, leases, and artifacts.
- `orch` owns scheduling tables such as runs, tasks, dependencies, and attempts.
- Both layers append to a shared event stream for blocking waits.
- `orch dispatch` creates or updates `inbox` threads.
- `orch reconcile` reads `inbox` state and updates task state.

Sharing one file while keeping table ownership separate preserves a clean boundary between the layers and keeps deployment simple.

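As a rough illustration of the ownership split, the sketch below opens one SQLite file and creates a few tables per layer. The table names come from the list above; every column, the grouping constants, and the `open_shared_db` helper are assumptions made for illustration, not the project's actual schema.

```python
import sqlite3

# Hypothetical minimal schema. Table names follow the document;
# all columns are illustrative assumptions.
INBOX_TABLES = """
CREATE TABLE IF NOT EXISTS threads  (thread_id INTEGER PRIMARY KEY, subject TEXT);
CREATE TABLE IF NOT EXISTS messages (message_id INTEGER PRIMARY KEY,
                                     thread_id INTEGER REFERENCES threads(thread_id),
                                     kind TEXT, body TEXT);
"""
ORCH_TABLES = """
CREATE TABLE IF NOT EXISTS runs     (run_id INTEGER PRIMARY KEY, goal TEXT);
CREATE TABLE IF NOT EXISTS tasks    (task_id INTEGER PRIMARY KEY,
                                     run_id INTEGER REFERENCES runs(run_id),
                                     state TEXT);
CREATE TABLE IF NOT EXISTS attempts (attempt_id INTEGER PRIMARY KEY,
                                     task_id INTEGER REFERENCES tasks(task_id),
                                     thread_id INTEGER REFERENCES threads(thread_id));
"""
SHARED_TABLES = """
CREATE TABLE IF NOT EXISTS events   (event_id INTEGER PRIMARY KEY AUTOINCREMENT,
                                     source TEXT, kind TEXT,
                                     run_id INTEGER, thread_id INTEGER);
"""


def open_shared_db(path: str) -> sqlite3.Connection:
    """Open the single database file that both CLIs would point at."""
    conn = sqlite3.connect(path)
    # Each layer creates only the tables it owns, plus the shared event stream.
    conn.executescript(INBOX_TABLES + ORCH_TABLES + SHARED_TABLES)
    return conn
```

In this sketch the `attempts.thread_id` column is where an `orch` task attempt would point at the `inbox` thread created by `orch dispatch`.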
## Worker Execution Model

For code tasks, execution should be isolated from the user's primary checkout.

- `orch dispatch` should create a task-attempt worktree
- the assigned worktree path should be stored in attempt metadata and inbox task payload
- the worker runtime should execute inside that worktree
- strict mode should require a committed base revision

See [worktree-execution.md](/home/kurihada/project/ai-workflow-skill/docs/worktree-execution.md) for the full lifecycle.

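One way to picture the dispatch step is a helper that builds the `git worktree add` invocation for a task attempt. This is a sketch under assumptions: the function name, the directory and branch naming scheme, and the reading of strict mode as "refuse to proceed without a base revision" are all hypothetical. The command is only constructed here, not executed.

```python
from pathlib import Path


def worktree_add_command(repo: Path, worktrees_root: Path, task_id: str,
                         attempt: int, base_rev: str) -> list[str]:
    """Build the git command for a fresh task-attempt worktree (not executed here)."""
    if not base_rev:
        # Strict mode, as interpreted in this sketch: no committed base, no dispatch.
        raise ValueError("strict mode requires a committed base revision")
    # Hypothetical naming scheme: one directory and one branch per attempt.
    path = worktrees_root / f"{task_id}-attempt-{attempt}"
    branch = f"task/{task_id}/attempt-{attempt}"
    return ["git", "-C", str(repo), "worktree", "add",
            "-b", branch, str(path), base_rev]
```

The returned path component is what would be recorded in attempt metadata and in the inbox task payload so the worker runtime knows where to execute.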
## Event-Driven Waiting

The leader does not receive worker messages as an in-memory push. Workers write state into `inbox`, and the leader must read it back through CLI commands.

The intended solution is event-driven blocking waits, not ad hoc `sleep` loops.

- leaders should use `orch wait`
- blocked workers should use `inbox wait-reply`
- low-level polling may still exist internally, but it should be hidden inside the CLI

This means there is still one logical leader. The extra behavior is a blocking wait primitive, not a second leader.

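A blocking wait that hides its polling could look roughly like the following. The `blocking_wait` name, its parameters, and the timeout behavior are assumptions for illustration; the actual `orch wait` and `inbox wait-reply` commands may work differently.

```python
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def blocking_wait(fetch: Callable[[], Sequence[T]], timeout_s: float,
                  poll_interval_s: float = 0.2) -> Optional[Sequence[T]]:
    """Return the first non-empty fetch() result, or None once the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        rows = fetch()
        if rows:
            return rows
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        # The sleep lives here, inside the primitive, so callers never write one.
        time.sleep(min(poll_interval_s, remaining))
```

The caller sees a single call that either returns new rows or times out, which is the property the bullets above ask for.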
## Shared Event Stream

To support blocking waits cleanly, both layers should append rows to a shared `events` table.

Typical emitters:

- `inbox`: claim, progress, blocked, answer, done, fail, cancel
- `orch`: dispatch, answer, retry, reassign, cancel, reconcile-driven task state changes

Typical consumers:

- `orch wait`: watches run-scoped task events for the leader
- `inbox wait-reply`: watches thread-scoped reply events for a blocked worker

Every waiter should use a monotonic cursor such as `event_id` or `message_id`, so it can resume safely without reprocessing old events.

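The cursor idea can be sketched with a small `events` table. The column set and the helper names (`create_events_table`, `emit`, `events_after`) are assumptions; only the table name and the `event_id` cursor come from the text above.

```python
import sqlite3


def create_events_table(conn: sqlite3.Connection) -> None:
    # AUTOINCREMENT keeps event_id strictly increasing, even after deletes.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS events (
               event_id INTEGER PRIMARY KEY AUTOINCREMENT,
               source   TEXT NOT NULL,   -- 'inbox' or 'orch'
               kind     TEXT NOT NULL,   -- e.g. 'claim', 'dispatch'
               run_id   INTEGER
           )"""
    )


def emit(conn: sqlite3.Connection, source: str, kind: str, run_id: int) -> int:
    """Append one event and return its event_id."""
    cur = conn.execute(
        "INSERT INTO events (source, kind, run_id) VALUES (?, ?, ?)",
        (source, kind, run_id),
    )
    conn.commit()
    return cur.lastrowid


def events_after(conn: sqlite3.Connection, cursor: int, run_id: int):
    """Return run-scoped events newer than the cursor, oldest first."""
    return conn.execute(
        "SELECT event_id, source, kind FROM events "
        "WHERE event_id > ? AND run_id = ? ORDER BY event_id",
        (cursor, run_id),
    ).fetchall()
```

A waiter would remember the largest `event_id` it has handled and pass it back as `cursor` on the next call, so a restart picks up exactly where it stopped.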
## Recommended Binary Layout

The recommended v1 shape is:

- `inbox` binary for communication primitives
- `orch` binary for leader-side planning and scheduling
- one shared `--db PATH`

If packaging later favors a single binary, the same model can be exposed as command groups:

- `agentctl inbox ...`
- `agentctl orch ...`

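The single-binary variant could be wired as nested command groups. The sketch below uses Python's `argparse` purely for illustration; the project's implementation language is not stated here. The subcommand names are the ones this document mentions, and no per-command flags are assumed beyond the shared `--db PATH`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical `agentctl` parser with `inbox` and `orch` groups."""
    parser = argparse.ArgumentParser(prog="agentctl")
    parser.add_argument("--db", metavar="PATH", required=True,
                        help="shared SQLite database file")
    groups = parser.add_subparsers(dest="group", required=True)

    # Worker-facing group: only the command named in this document.
    inbox = groups.add_parser("inbox")
    inbox_cmds = inbox.add_subparsers(dest="command", required=True)
    inbox_cmds.add_parser("wait-reply")

    # Leader-facing group: commands named in this document.
    orch = groups.add_parser("orch")
    orch_cmds = orch.add_subparsers(dest="command", required=True)
    for name in ("dispatch", "reconcile", "wait"):
        orch_cmds.add_parser(name)
    return parser
```

With this layout, `agentctl --db state.sqlite orch wait` parses to the `orch` group and the `wait` command, while both groups share the same database path.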
## Responsibility Split

`inbox` should own:

- directed messages
- durable threads
- worker claiming and leases
- progress, blocked, result, and failure events
- artifact references
- thread history and watch functionality
- thread-scoped waiting for replies

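Worker claiming with leases might be implemented as an atomic claim-or-fail operation. The sketch below assumes an expiry-based lease stored in a `leases` table; the columns, the `try_claim` helper, and the expiry rule are hypothetical and only illustrate why claiming belongs next to the thread store.

```python
import sqlite3
import time
from typing import Optional


def create_leases_table(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS leases ("
        "thread_id INTEGER PRIMARY KEY, worker TEXT NOT NULL, expires_at REAL NOT NULL)"
    )


def try_claim(conn: sqlite3.Connection, thread_id: int, worker: str,
              ttl_s: float, now: Optional[float] = None) -> bool:
    """Claim a thread if it has no lease or its lease has expired."""
    now = time.time() if now is None else now
    with conn:  # both statements run in one transaction
        # Take over an expired lease, if there is one.
        cur = conn.execute(
            "UPDATE leases SET worker = ?, expires_at = ? "
            "WHERE thread_id = ? AND expires_at <= ?",
            (worker, now + ttl_s, thread_id, now),
        )
        if cur.rowcount == 1:
            return True
        # Otherwise create the lease only if no row exists yet.
        cur = conn.execute(
            "INSERT OR IGNORE INTO leases (thread_id, worker, expires_at) VALUES (?, ?, ?)",
            (thread_id, worker, now + ttl_s),
        )
        return cur.rowcount == 1
```

A second worker calling `try_claim` on the same thread before the lease expires gets `False` and can move on to other work.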
`orch` should own:

- runs
- task graph and dependencies
- ready queue calculation
- dispatch decisions
- task-attempt worktree allocation
- blocked queue review for the leader
- retries, reassignment, and cancellation
- mapping task attempts to inbox threads
- run-scoped waiting for actionable events
- reusable higher-level workflows such as council review

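Ready queue calculation, for example, reduces to a check over the dependency graph. In this minimal sketch the state names `pending` and `done`, the in-memory dictionaries, and the `ready_tasks` function are assumptions; a real implementation would presumably read the tasks and dependencies tables instead.

```python
def ready_tasks(states: dict[str, str], deps: dict[str, set[str]]) -> list[str]:
    """Return pending tasks whose dependencies are all done, sorted by task id."""
    return sorted(
        task
        for task, state in states.items()
        if state == "pending"
        and all(states.get(dep) == "done" for dep in deps.get(task, set()))
    )
```

Because this logic needs the whole graph and every task state, it sits naturally in `orch` rather than in the message bus.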
## What Not To Mix

Do not put these into `inbox`:

- dependency graph logic
- automatic worker selection policy
- retry policy
- acceptance-driven task completion logic

Do not put these into `orch`:

- worker claiming
- low-level message append/reply primitives
- raw thread history storage

## Reading Order

- [inbox-cli.md](/home/kurihada/project/ai-workflow-skill/docs/inbox-cli.md): worker-facing bus and low-level message protocol
- [orch-cli.md](/home/kurihada/project/ai-workflow-skill/docs/orch-cli.md): leader-facing scheduler and task graph control plane
- [worktree-execution.md](/home/kurihada/project/ai-workflow-skill/docs/worktree-execution.md): strict worktree model for code-writing task attempts
- [council-review.md](/home/kurihada/project/ai-workflow-skill/docs/council-review.md): user-facing three-reviewer brainstorm and voting workflow
- [implementation-roadmap.md](/home/kurihada/project/ai-workflow-skill/docs/implementation-roadmap.md): handoff-oriented implementation order and next steps
- [blog-project-example.md](/home/kurihada/project/ai-workflow-skill/docs/blog-project-example.md): concrete example using both layers

## Skills

The intended skill split mirrors the CLI split.

- `inbox` skill: used when an agent needs to fetch work, claim a thread, send progress, ask blocked questions, reply, or return results through `inbox`
- `orchestrator` skill: used when the leader needs to create runs, decompose tasks, manage dependencies, dispatch ready work, inspect blocks, answer them, retry failures, or reassign work through `orch`
- `council-review` skill: used when the user explicitly wants a structured three-reviewer brainstorm or review with grouped and tallied recommendations