diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..0cfd78a
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,34 @@
+# AGENTS.md
+
+## Scope
+
+This file applies to the entire repository.
+
+## Read First
+
+Before starting substantial work, read the roadmap that matches the task:
+
+- implementation work: [docs/implementation-roadmap.md](/home/kurihada/project/ai-workflow-skill/docs/implementation-roadmap.md)
+- inbox Markdown test-plan work: [docs/tests/inbox/ROADMAP.md](/home/kurihada/project/ai-workflow-skill/docs/tests/inbox/ROADMAP.md)
+
+## Roadmap Update Rule
+
+Updating the relevant roadmap is part of the definition of done.
+
+Do not finish a task and leave its roadmap stale.
+
+Required behavior:
+
+- if you complete or materially change implementation progress, update [docs/implementation-roadmap.md](/home/kurihada/project/ai-workflow-skill/docs/implementation-roadmap.md) in the same change
+- if you add, remove, or materially revise inbox Markdown test cases or test-plan documents, update [docs/tests/inbox/ROADMAP.md](/home/kurihada/project/ai-workflow-skill/docs/tests/inbox/ROADMAP.md) in the same change
+- when a test-plan document is created, update document progress
+- when a test case is written, update authored-case tracking and pending backlog
+- when a planned item is no longer needed, mark it as removed or deferred instead of silently dropping it
+
+## Inbox Test-Plan Specific Rule
+
+For `docs/tests/inbox/`:
+
+- organize by folder plus `README.md`
+- do not use numeric test IDs
+- use stable case slugs and keep the roadmap synchronized with the actual files on disk
diff --git a/docs/implementation-roadmap.md b/docs/implementation-roadmap.md
index e8f92f0..4a3168f 100644
--- a/docs/implementation-roadmap.md
+++ b/docs/implementation-roadmap.md
@@ -20,6 +20,7 @@ As of now:
 - `inbox` is implemented end-to-end, including send/fetch/claim/renew/update/reply/done/fail/cancel/list/show/watch/wait-reply
 - `inbox` supports blocking waits, lease renewal, unread fetches backed by per-agent read cursors, `--body-file`, artifact attachments, and structured JSON errors with stable exit codes
 - integration tests cover the main inbox lifecycle, wait/watch flows, artifact persistence, and JSON error contracts
+- a human-readable inbox command test-plan set has not been authored yet
 - `orch` currently exists as a command skeleton only
 - no scheduler workflows have been implemented yet
 
@@ -269,12 +270,13 @@ Definition of done:
 
 If a new agent is taking over now, the next concrete step should be:
 
-1. implement `inbox send`
-2. implement `inbox fetch`
-3. implement `inbox claim`
-4. add a small integration test covering `init -> send -> fetch -> claim`
+1. create the inbox test documentation tree under `docs/tests/inbox/`
+2. write the shared testing conventions in `docs/tests/inbox/README.md`
+3. add `_shared/README.md` for common fixtures and assertion rules
+4. add command-level `README.md` files for the implemented inbox commands
+5. add `workflows/README.md` for cross-command cases such as unread, wait, and reply flows
 
-This is the smallest meaningful slice because the project already has a compiling skeleton and working schema initialization.
+This is the smallest meaningful documentation slice because the inbox implementation is already present and stable enough to document in detail before `orch` work begins.
 
 ## Recommended Driver Choices
 
@@ -298,6 +300,72 @@ Add these tests before the codebase grows too much:
 - worktree path generation test
 - council tally grouping test
 
+## Inbox Test Documentation Roadmap
+
+Goal:
+
+- make inbox behavior easy for a new agent to understand and convert into automated tests without re-reading all code paths
+
+Directory layout:
+
+- `docs/tests/inbox/README.md`
+- `docs/tests/inbox/_shared/README.md`
+- `docs/tests/inbox/workflows/README.md`
+- `docs/tests/inbox//README.md`
+
+Initial command folders:
+
+- `init`
+- `send`
+- `fetch`
+- `claim`
+- `renew`
+- `update`
+- `reply`
+- `done`
+- `fail`
+- `cancel`
+- `list`
+- `show`
+- `watch`
+- `wait-reply`
+
+Documentation rules:
+
+- organize by folder and `README.md`, not one file per test case
+- do not use numeric test case IDs
+- identify cases by file path plus a stable case title or `slug`
+- keep one command per directory, plus `workflows/` for cross-command behavior
+- use `_shared/` for common fixtures, database conventions, exit-code rules, and shared JSON assertions
+
+Required per-case structure:
+
+- `用例意义`
+- `前置条件`
+- `输入`
+- `预期输出`
+- `断言结论`
+
+Recommended case-title pattern:
+
+- `## case: `
+
+Authoring order:
+
+1. global conventions in `docs/tests/inbox/README.md`
+2. shared fixtures and assertion helpers in `docs/tests/inbox/_shared/README.md`
+3. lifecycle flow in `docs/tests/inbox/workflows/README.md`
+4. core command docs: `send`, `fetch`, `claim`, `reply`, `done`, `show`
+5. secondary command docs: `renew`, `update`, `fail`, `cancel`, `list`
+6. waiting and read-state docs: `watch`, `wait-reply`, unread and mark-read workflow cases
+
+Definition of done:
+
+- every implemented inbox command has a dedicated document directory
+- every documented case contains concrete input and expected output
+- shared assumptions are centralized instead of copied into each command file
+- a new agent can pick any case and implement it as an automated test with minimal additional discovery
+
 ## Out Of Scope For First Pass
 
 Do not block v1 on these:
@@ -314,9 +382,10 @@ Do not block v1 on these:
 - The design phase is complete enough to start coding.
 - Avoid reopening major design questions unless implementation forces it.
 - The repository already has compiling binaries and working schema init.
-Continue with inbox lifecycle commands before adding advanced orchestration.
+Finish the inbox test-plan docs before starting broad `orch` implementation.
 - Preserve the separation:
   - `inbox` handles communication
   - `orch` handles scheduling
   - `council-review` is a workflow on top of `orch`
+- When writing inbox test docs, use the folder-per-command structure described above and keep cross-command cases inside `docs/tests/inbox/workflows/`.
 - Treat this file as the implementation entrypoint for new agents.
diff --git a/docs/tests/inbox/ROADMAP.md b/docs/tests/inbox/ROADMAP.md
new file mode 100644
index 0000000..0b08498
--- /dev/null
+++ b/docs/tests/inbox/ROADMAP.md
@@ -0,0 +1,325 @@
+# Inbox Test Documentation Roadmap
+
+## Purpose
+
+This roadmap tracks the human-readable Markdown test plan for `inbox`.
+
+It exists so a new agent can immediately answer four questions without re-reading the whole codebase:
+
+- which test-plan documents already exist
+- which cases have already been written down
+- which cases are still missing
+- what file should be updated next
+
+This roadmap is for the Markdown test-plan set under `docs/tests/inbox/`.
+It is not a replacement for automated Go tests.
+
+## Current Snapshot
+
+Snapshot date:
+
+- `2026-03-19`
+
+Current state:
+
+- `inbox` CLI is implemented end-to-end
+- automated Go integration tests already exist for the main lifecycle, wait flows, unread behavior, artifacts, and JSON error contracts
+- this roadmap now exists under `docs/tests/inbox/ROADMAP.md`
+- the command, workflow, and shared Markdown test-plan documents have not been authored yet
+
+Progress summary for planned test-plan documents, excluding `ROADMAP.md`:
+
+- planned document files: `17`
+- authored document files: `0`
+- planned case slugs in this roadmap: `61`
+- authored case slugs in this roadmap: `0`
+
+## Scope
+
+In scope:
+
+- `inbox init`
+- `inbox send`
+- `inbox fetch`
+- `inbox claim`
+- `inbox renew`
+- `inbox update`
+- `inbox reply`
+- `inbox done`
+- `inbox fail`
+- `inbox cancel`
+- `inbox list`
+- `inbox show`
+- `inbox watch`
+- `inbox wait-reply`
+- cross-command workflows
+- shared test conventions for JSON output, exit codes, fixtures, and assertions
+
+Out of scope:
+
+- `orch`
+- `council-review`
+- implementation details that are not visible through the CLI contract
+
+## Tracking Rules
+
+Directory model:
+
+- one folder per command or shared area
+- one `README.md` per folder
+- no one-file-per-case sprawl
+
+Case identity:
+
+- do not use numeric IDs
+- identify a case by `path + case slug`
+- recommended heading pattern inside the command document:
+
+```md
+## case: send-rejects-invalid-payload-json
+```
+
+Per-case structure inside the command document:
+
+- `用例意义`
+- `前置条件`
+- `输入`
+- `预期输出`
+- `断言结论`
+
+How to update this roadmap when a new case is written:
+
+1. add the case content to the target `README.md`
+2. move the case slug from `Pending Case Backlog` to `Authored Case Register`
+3. update the authored counts in `Current Snapshot`
+4. if a whole command document is created, update `Document Progress`
+
+Allowed status values in this roadmap:
+
+- `pending`
+- `in_progress`
+- `done`
+- `deferred`
+
+## Existing Automated Coverage Reference
+
+The Markdown test-plan set starts at zero, but these automated tests already exist and should be used as source material when writing the docs:
+
+- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L12) `TestInboxLifecycle`
+- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L176) `TestInboxFailLifecycle`
+- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L243) `TestInboxRenewWaitReplyAndCancel`
+- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L392) `TestInboxWatchListUnreadAndAppend`
+- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L549) `TestInboxUnreadReadCursor`
+- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L639) `TestInboxJSONErrorsAndExitCodes`
+
+These tests do not remove the need for the Markdown plan. They only reduce discovery work.
+
+## Planned Directory Tree
+
+```text
+docs/tests/inbox/
+  ROADMAP.md
+  README.md
+  _shared/
+    README.md
+  workflows/
+    README.md
+  init/
+    README.md
+  send/
+    README.md
+  fetch/
+    README.md
+  claim/
+    README.md
+  renew/
+    README.md
+  update/
+    README.md
+  reply/
+    README.md
+  done/
+    README.md
+  fail/
+    README.md
+  cancel/
+    README.md
+  list/
+    README.md
+  show/
+    README.md
+  watch/
+    README.md
+  wait-reply/
+    README.md
+```
+
+## Document Progress
+
+| Path | Purpose | Planned Cases | Authored Cases | Status |
+| --- | --- | ---: | ---: | --- |
+| `docs/tests/inbox/README.md` | Global testing conventions and glossary | 0 | 0 | pending |
+| `docs/tests/inbox/_shared/README.md` | Shared fixtures, JSON assertions, exit-code rules | 0 | 0 | pending |
+| `docs/tests/inbox/workflows/README.md` | Cross-command scenarios | 8 | 0 | pending |
+| `docs/tests/inbox/init/README.md` | `init` command cases | 2 | 0 | pending |
+| `docs/tests/inbox/send/README.md` | `send` command cases | 6 | 0 | pending |
+| `docs/tests/inbox/fetch/README.md` | `fetch` command cases | 4 | 0 | pending |
+| `docs/tests/inbox/claim/README.md` | `claim` command cases | 4 | 0 | pending |
+| `docs/tests/inbox/renew/README.md` | `renew` command cases | 3 | 0 | pending |
+| `docs/tests/inbox/update/README.md` | `update` command cases | 5 | 0 | pending |
+| `docs/tests/inbox/reply/README.md` | `reply` command cases | 4 | 0 | pending |
+| `docs/tests/inbox/done/README.md` | `done` command cases | 4 | 0 | pending |
+| `docs/tests/inbox/fail/README.md` | `fail` command cases | 4 | 0 | pending |
+| `docs/tests/inbox/cancel/README.md` | `cancel` command cases | 3 | 0 | pending |
+| `docs/tests/inbox/list/README.md` | `list` command cases | 4 | 0 | pending |
+| `docs/tests/inbox/show/README.md` | `show` command cases | 4 | 0 | pending |
+| `docs/tests/inbox/watch/README.md` | `watch` command cases | 3 | 0 | pending |
+| `docs/tests/inbox/wait-reply/README.md` | `wait-reply` command cases | 3 | 0 | pending |
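Reviewer note: the per-document counts in the `Document Progress` table can be cross-checked mechanically against the totals in `Current Snapshot`. The following is a minimal Go sketch, not part of this change. It assumes the table is available as plain Markdown text (without the diff `+` prefixes) in the five-column layout shown above; `sumPlannedCases` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sumPlannedCases is a hypothetical helper. It scans a Markdown table laid out as
// Path | Purpose | Planned Cases | Authored Cases | Status
// and returns the number of data rows and the sum of the "Planned Cases" column.
func sumPlannedCases(table string) (rows, planned int) {
	for _, line := range strings.Split(table, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "|") {
			continue // not a table row
		}
		cells := strings.Split(strings.Trim(line, "|"), "|")
		if len(cells) != 5 {
			continue // unexpected column count
		}
		n, err := strconv.Atoi(strings.TrimSpace(cells[2]))
		if err != nil {
			continue // header and separator rows are not numeric
		}
		rows++
		planned += n
	}
	return rows, planned
}

func main() {
	// Two sample rows copied from the table; a real check would pass the whole table.
	table := strings.Join([]string{
		"| Path | Purpose | Planned Cases | Authored Cases | Status |",
		"| --- | --- | ---: | ---: | --- |",
		"| `docs/tests/inbox/workflows/README.md` | Cross-command scenarios | 8 | 0 | pending |",
		"| `docs/tests/inbox/init/README.md` | `init` command cases | 2 | 0 | pending |",
	}, "\n")
	rows, planned := sumPlannedCases(table)
	fmt.Printf("documents=%d planned_cases=%d\n", rows, planned)
}
```

Run against the full table, the two returned numbers should equal the planned document and planned case-slug totals recorded in `Current Snapshot`; a mismatch means one of the two sections is stale.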
+
+## Authoring Order
+
+Recommended order:
+
+1. `docs/tests/inbox/README.md`
+2. `docs/tests/inbox/_shared/README.md`
+3. `docs/tests/inbox/workflows/README.md`
+4. `docs/tests/inbox/send/README.md`
+5. `docs/tests/inbox/fetch/README.md`
+6. `docs/tests/inbox/claim/README.md`
+7. `docs/tests/inbox/reply/README.md`
+8. `docs/tests/inbox/done/README.md`
+9. `docs/tests/inbox/show/README.md`
+10. the remaining command documents
+
+Reason:
+
+- the workflow file captures the highest-value end-to-end behavior first
+- the command documents can then reuse shared conventions and already-fixed terminology
+
+## Authored Case Register
+
+No Markdown test cases have been authored yet.
+
+When the first case is written, add rows in this format:
+
+| Path | Case Slug | Coverage Note | Status |
+| --- | --- | --- | --- |
+| `docs/tests/inbox/send/README.md` | `send-creates-new-thread` | minimal happy path for new thread creation | done |
+
+## Pending Case Backlog
+
+### `docs/tests/inbox/workflows/README.md`
+
+- `pending` `thread-lifecycle-happy-path`
+- `pending` `blocked-question-reply-resume-to-done`
+- `pending` `fail-lifecycle-from-claim-to-terminal`
+- `pending` `cancel-lifecycle-after-worker-claim`
+- `pending` `watch-wakes-then-fetch-sees-new-thread`
+- `pending` `artifact-visible-through-send-and-show`
+- `pending` `unread-clears-after-mark-read-and-reappears-on-new-message`
+- `pending` `wait-reply-clears-blocked-unread-for-agent`
+
+### `docs/tests/inbox/init/README.md`
+
+- `pending` `init-creates-schema-on-empty-db`
+- `pending` `init-is-idempotent-on-existing-db`
+
+### `docs/tests/inbox/send/README.md`
+
+- `pending` `send-creates-new-thread`
+- `pending` `send-appends-message-to-existing-thread`
+- `pending` `send-reads-body-from-body-file`
+- `pending` `send-attaches-artifact-with-metadata`
+- `pending` `send-rejects-invalid-payload-json`
+- `pending` `send-rejects-invalid-artifact-metadata-json`
+
+### `docs/tests/inbox/fetch/README.md`
+
+- `pending` `fetch-returns-pending-thread-for-target-agent`
+- `pending` `fetch-respects-status-and-limit-filters`
+- `pending` `fetch-unread-uses-read-cursor`
+- `pending` `fetch-returns-no-matching-work-when-empty`
+
+### `docs/tests/inbox/claim/README.md`
+
+- `pending` `claim-acquires-thread-lease`
+- `pending` `claim-rejects-when-thread-missing`
+- `pending` `claim-rejects-when-thread-already-claimed`
+- `pending` `claim-records-requested-lease-duration`
+
+### `docs/tests/inbox/renew/README.md`
+
+- `pending` `renew-extends-active-lease`
+- `pending` `renew-rejects-non-owner`
+- `pending` `renew-rejects-without-active-lease`
+
+### `docs/tests/inbox/update/README.md`
+
+- `pending` `update-moves-thread-to-in-progress`
+- `pending` `update-moves-thread-to-blocked-with-payload`
+- `pending` `update-accepts-body-file-and-artifact`
+- `pending` `update-rejects-invalid-payload-json`
+- `pending` `update-rejects-non-owner`
+
+### `docs/tests/inbox/reply/README.md`
+
+- `pending` `reply-adds-answer-message`
+- `pending` `reply-supports-control-kind`
+- `pending` `reply-attaches-artifact`
+- `pending` `reply-rejects-invalid-payload-json`
+
+### `docs/tests/inbox/done/README.md`
+
+- `pending` `done-marks-thread-terminal`
+- `pending` `done-persists-result-body-and-artifact`
+- `pending` `done-rejects-non-owner`
+- `pending` `done-rejects-on-terminal-thread`
+
+### `docs/tests/inbox/fail/README.md`
+
+- `pending` `fail-marks-thread-failed`
+- `pending` `fail-persists-failure-body-and-artifact`
+- `pending` `fail-rejects-non-owner`
+- `pending` `fail-rejects-on-terminal-thread`
+
+### `docs/tests/inbox/cancel/README.md`
+
+- `pending` `cancel-marks-thread-cancelled`
+- `pending` `cancel-persists-reason-and-artifact`
+- `pending` `cancel-rejects-when-thread-missing`
+
+### `docs/tests/inbox/list/README.md`
+
+- `pending` `list-filters-by-status`
+- `pending` `list-filters-by-created-by`
+- `pending` `list-filters-by-assigned-to`
+- `pending` `list-respects-limit`
+
+### `docs/tests/inbox/show/README.md`
+
+- `pending` `show-returns-thread-and-message-history`
+- `pending` `show-includes-artifacts-per-message`
+- `pending` `show-mark-read-advances-read-cursor`
+- `pending` `show-rejects-when-thread-missing`
+
+### `docs/tests/inbox/watch/README.md`
+
+- `pending` `watch-wakes-on-matching-thread`
+- `pending` `watch-respects-status-filter`
+- `pending` `watch-times-out-with-no-activity`
+
+### `docs/tests/inbox/wait-reply/README.md`
+
+- `pending` `wait-reply-wakes-on-answer-after-message`
+- `pending` `wait-reply-can-start-from-after-event`
+- `pending` `wait-reply-times-out-when-no-reply`
+
+## Definition Of Done
+
+This roadmap is complete only when all of the following are true:
+
+- every implemented inbox command has a corresponding document folder
+- each planned command document exists
+- each pending case slug has been either authored or explicitly deferred
+- the authored-case register matches the actual Markdown files on disk
+- a new agent can pick any pending case and know exactly where it should be written
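
Reviewer note: the requirement that the authored-case register match the Markdown files on disk lends itself to a small automated check. The following is a minimal Go sketch under stated assumptions, not part of this change. It assumes every authored case uses the `## case: ` heading pattern recommended in `Tracking Rules`; `extractCaseSlugs` and `diffSlugs` are hypothetical helper names, and reading the `README.md` files and parsing the register table are left out.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// caseHeadingPrefix follows the heading pattern recommended in the roadmap.
const caseHeadingPrefix = "## case: "

// extractCaseSlugs (hypothetical helper) returns the slug of every
// "## case: <slug>" heading in one README, in document order.
// Limitation: headings quoted inside fenced code blocks are counted too.
func extractCaseSlugs(markdown string) []string {
	var slugs []string
	scanner := bufio.NewScanner(strings.NewReader(markdown))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if !strings.HasPrefix(line, caseHeadingPrefix) {
			continue
		}
		slug := strings.TrimSpace(strings.TrimPrefix(line, caseHeadingPrefix))
		if slug != "" {
			slugs = append(slugs, slug)
		}
	}
	return slugs
}

// diffSlugs (hypothetical helper) compares the slugs listed in the register
// with the slugs found on disk and reports both kinds of drift, sorted.
func diffSlugs(register, onDisk []string) (missingOnDisk, unregistered []string) {
	inRegister := make(map[string]bool, len(register))
	for _, s := range register {
		inRegister[s] = true
	}
	inDisk := make(map[string]bool, len(onDisk))
	for _, s := range onDisk {
		inDisk[s] = true
		if !inRegister[s] {
			unregistered = append(unregistered, s)
		}
	}
	for _, s := range register {
		if !inDisk[s] {
			missingOnDisk = append(missingOnDisk, s)
		}
	}
	sort.Strings(missingOnDisk)
	sort.Strings(unregistered)
	return missingOnDisk, unregistered
}

func main() {
	// Illustrative inputs only; no such README has been authored yet.
	readme := "# send\n\n## case: send-creates-new-thread\n\n## case: send-reads-body-from-body-file\n"
	register := []string{"send-creates-new-thread", "send-rejects-invalid-payload-json"}

	missing, unregistered := diffSlugs(register, extractCaseSlugs(readme))
	fmt.Println("in register but not on disk:", missing)
	fmt.Println("on disk but not in register:", unregistered)
}
```

Both result slices being empty for every command document is one way to express the "register matches disk" condition; whether such a check belongs in CI or stays a manual review step is left open here.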