Add inbox test planning guidance

2026-03-19 10:06:42 +08:00
parent 1927930570
commit dab0506c5a
3 changed files with 434 additions and 6 deletions
@@ -0,0 +1,34 @@
+# AGENTS.md
+
+## Scope
+
+This file applies to the entire repository.
+
+## Read First
+
+Before starting substantial work, read the roadmap that matches the task:
+
+- implementation work: [docs/implementation-roadmap.md](/home/kurihada/project/ai-workflow-skill/docs/implementation-roadmap.md)
+- inbox Markdown test-plan work: [docs/tests/inbox/ROADMAP.md](/home/kurihada/project/ai-workflow-skill/docs/tests/inbox/ROADMAP.md)
+
+## Roadmap Update Rule
+
+Updating the relevant roadmap is part of the definition of done.
+
+Do not finish a task and leave its roadmap stale.
+
+Required behavior:
+
+- if you complete or materially change implementation progress, update [docs/implementation-roadmap.md](/home/kurihada/project/ai-workflow-skill/docs/implementation-roadmap.md) in the same change
+- if you add, remove, or materially revise inbox Markdown test cases or test-plan documents, update [docs/tests/inbox/ROADMAP.md](/home/kurihada/project/ai-workflow-skill/docs/tests/inbox/ROADMAP.md) in the same change
+- when a test-plan document is created, update document progress
+- when a test case is written, update authored-case tracking and pending backlog
+- when a planned item is no longer needed, mark it as removed or deferred instead of silently dropping it
+
+## Inbox Test-Plan Specific Rule
+
+For `docs/tests/inbox/`:
+
+- organize by folder plus `README.md`
+- do not use numeric test IDs
+- use stable case slugs and keep the roadmap synchronized with the actual files on disk
@@ -20,6 +20,7 @@ As of now:
 - `inbox` is implemented end-to-end, including send/fetch/claim/renew/update/reply/done/fail/cancel/list/show/watch/wait-reply
 - `inbox` supports blocking waits, lease renewal, unread fetches backed by per-agent read cursors, `--body-file`, artifact attachments, and structured JSON errors with stable exit codes
 - integration tests cover the main inbox lifecycle, wait/watch flows, artifact persistence, and JSON error contracts
+- a human-readable inbox command test-plan set has not been authored yet
 - `orch` currently exists as a command skeleton only
 - no scheduler workflows have been implemented yet

@@ -269,12 +270,13 @@ Definition of done:

 If a new agent is taking over now, the next concrete step should be:

-1. implement `inbox send`
-2. implement `inbox fetch`
-3. implement `inbox claim`
-4. add a small integration test covering `init -> send -> fetch -> claim`
+1. create the inbox test documentation tree under `docs/tests/inbox/`
+2. write the shared testing conventions in `docs/tests/inbox/README.md`
+3. add `_shared/README.md` for common fixtures and assertion rules
+4. add command-level `README.md` files for the implemented inbox commands
+5. add `workflows/README.md` for cross-command cases such as unread, wait, and reply flows

-This is the smallest meaningful slice because the project already has a compiling skeleton and working schema initialization.
+This is the smallest meaningful documentation slice because the inbox implementation is already present and stable enough to document in detail before `orch` work begins.

 ## Recommended Driver Choices

@@ -298,6 +300,72 @@ Add these tests before the codebase grows too much:
 - worktree path generation test
 - council tally grouping test

+## Inbox Test Documentation Roadmap
+
+Goal:
+
+- make inbox behavior easy for a new agent to understand and convert into automated tests without re-reading all code paths
+
+Directory layout:
+
+- `docs/tests/inbox/README.md`
+- `docs/tests/inbox/_shared/README.md`
+- `docs/tests/inbox/workflows/README.md`
+- `docs/tests/inbox/<command>/README.md`
+
+Initial command folders:
+
+- `init`
+- `send`
+- `fetch`
+- `claim`
+- `renew`
+- `update`
+- `reply`
+- `done`
+- `fail`
+- `cancel`
+- `list`
+- `show`
+- `watch`
+- `wait-reply`
+
+Documentation rules:
+
+- organize by folder and `README.md`, not one file per test case
+- do not use numeric test case IDs
+- identify cases by file path plus a stable case title or `slug`
+- keep one command per directory, plus `workflows/` for cross-command behavior
+- use `_shared/` for common fixtures, database conventions, exit-code rules, and shared JSON assertions
+
+Required per-case structure:
+
+- `用例意义` (case purpose)
+- `前置条件` (preconditions)
+- `输入` (input)
+- `预期输出` (expected output)
+- `断言结论` (assertion conclusion)
+
+Recommended case-title pattern:
+
+- `## case: <stable-slug>`
+
+Authoring order:
+
+1. global conventions in `docs/tests/inbox/README.md`
+2. shared fixtures and assertion helpers in `docs/tests/inbox/_shared/README.md`
+3. lifecycle flow in `docs/tests/inbox/workflows/README.md`
+4. core command docs: `send`, `fetch`, `claim`, `reply`, `done`, `show`
+5. secondary command docs: `renew`, `update`, `fail`, `cancel`, `list`
+6. waiting and read-state docs: `watch`, `wait-reply`, unread and mark-read workflow cases
+
+Definition of done:
+
+- every implemented inbox command has a dedicated document directory
+- every documented case contains concrete input and expected output
+- shared assumptions are centralized instead of copied into each command file
+- a new agent can pick any case and implement it as an automated test with minimal additional discovery
+
 ## Out Of Scope For First Pass

 Do not block v1 on these:
@@ -314,9 +382,10 @@ Do not block v1 on these:
 - The design phase is complete enough to start coding.
 - Avoid reopening major design questions unless implementation forces it.
 - The repository already has compiling binaries and working schema init.
-- Continue with inbox lifecycle commands before adding advanced orchestration.
+- Finish the inbox test-plan docs before starting broad `orch` implementation.
 - Preserve the separation:
  - `inbox` handles communication
  - `orch` handles scheduling
  - `council-review` is a workflow on top of `orch`
+- When writing inbox test docs, use the folder-per-command structure described above and keep cross-command cases inside `docs/tests/inbox/workflows/`.
 - Treat this file as the implementation entrypoint for new agents.
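The folder-per-command layout described in this roadmap can be checked mechanically. The sketch below is a hypothetical helper, not part of the repository: it builds the planned `README.md` paths from the command list given in "Initial command folders" and reports which ones are missing on disk. The function names and the default root argument are assumptions made for illustration.

```python
from pathlib import Path

# Command names copied from the "Initial command folders" list in the roadmap.
COMMANDS = [
    "init", "send", "fetch", "claim", "renew", "update", "reply",
    "done", "fail", "cancel", "list", "show", "watch", "wait-reply",
]


def planned_documents(root: str = "docs/tests/inbox") -> list[Path]:
    """Return every planned README.md path (ROADMAP.md is not included)."""
    base = Path(root)
    docs = [
        base / "README.md",
        base / "_shared" / "README.md",
        base / "workflows" / "README.md",
    ]
    # One directory per command, each holding a single README.md.
    docs += [base / command / "README.md" for command in COMMANDS]
    return docs


def missing_documents(root: str = "docs/tests/inbox") -> list[Path]:
    """Return the planned documents that do not exist on disk yet."""
    return [path for path in planned_documents(root) if not path.is_file()]


if __name__ == "__main__":
    for path in missing_documents():
        print(f"missing: {path}")
```

Run from the repository root, a script like this would print one line per document that still needs to be created, which may help when updating document progress in the roadmap.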
@@ -0,0 +1,325 @@
+# Inbox Test Documentation Roadmap
+
+## Purpose
+
+This roadmap tracks the human-readable Markdown test plan for `inbox`.
+
+It exists so a new agent can immediately answer four questions without re-reading the whole codebase:
+
+- which test-plan documents already exist
+- which cases have already been written down
+- which cases are still missing
+- which file should be updated next
+
+This roadmap is for the Markdown test-plan set under `docs/tests/inbox/`.
+It is not a replacement for automated Go tests.
+
+## Current Snapshot
+
+Snapshot date:
+
+- `2026-03-19`
+
+Current state:
+
+- `inbox` CLI is implemented end-to-end
+- automated Go integration tests already exist for the main lifecycle, wait flows, unread behavior, artifacts, and JSON error contracts
+- this roadmap now exists at `docs/tests/inbox/ROADMAP.md`
+- the command, workflow, and shared Markdown test-plan documents have not been authored yet
+
+Progress summary for planned test-plan documents, excluding `ROADMAP.md`:
+
+- planned document files: `17`
+- authored document files: `0`
+- planned case slugs in this roadmap: `61`
+- authored case slugs in this roadmap: `0`
+
+## Scope
+
+In scope:
+
+- `inbox init`
+- `inbox send`
+- `inbox fetch`
+- `inbox claim`
+- `inbox renew`
+- `inbox update`
+- `inbox reply`
+- `inbox done`
+- `inbox fail`
+- `inbox cancel`
+- `inbox list`
+- `inbox show`
+- `inbox watch`
+- `inbox wait-reply`
+- cross-command workflows
+- shared test conventions for JSON output, exit codes, fixtures, and assertions
+
+Out of scope:
+
+- `orch`
+- `council-review`
+- implementation details that are not visible through the CLI contract
+
+## Tracking Rules
+
+Directory model:
+
+- one folder per command or shared area
+- one `README.md` per folder
+- no one-file-per-case sprawl
+
+Case identity:
+
+- do not use numeric IDs
+- identify a case by `path + case slug`
+- recommended heading pattern inside the command document:
+
+```md
+## case: send-rejects-invalid-payload-json
+```
+
+Per-case structure inside the command document:
+
+- `用例意义` (case purpose)
+- `前置条件` (preconditions)
+- `输入` (input)
+- `预期输出` (expected output)
+- `断言结论` (assertion conclusion)
+
+How to update this roadmap when a new case is written:
+
+1. add the case content to the target `README.md`
+2. move the case slug from `Pending Case Backlog` to `Authored Case Register`
+3. update the authored counts in `Current Snapshot`
+4. if a whole command document is created, update `Document Progress`
+
+Allowed status values in this roadmap:
+
+- `pending`
+- `in_progress`
+- `done`
+- `deferred`
+
+## Existing Automated Coverage Reference
+
+The Markdown test-plan set starts at zero, but these automated tests already exist and should be used as source material when writing the docs:
+
+- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L12) `TestInboxLifecycle`
+- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L176) `TestInboxFailLifecycle`
+- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L243) `TestInboxRenewWaitReplyAndCancel`
+- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L392) `TestInboxWatchListUnreadAndAppend`
+- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L549) `TestInboxUnreadReadCursor`
+- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L639) `TestInboxJSONErrorsAndExitCodes`
+
+These tests do not remove the need for the Markdown plan. They only reduce discovery work.
+
+## Planned Directory Tree
+
+```text
+docs/tests/inbox/
+  ROADMAP.md
+  README.md
+  _shared/
+    README.md
+  workflows/
+    README.md
+  init/
+    README.md
+  send/
+    README.md
+  fetch/
+    README.md
+  claim/
+    README.md
+  renew/
+    README.md
+  update/
+    README.md
+  reply/
+    README.md
+  done/
+    README.md
+  fail/
+    README.md
+  cancel/
+    README.md
+  list/
+    README.md
+  show/
+    README.md
+  watch/
+    README.md
+  wait-reply/
+    README.md
+```
+
+## Document Progress
+
+| Path | Purpose | Planned Cases | Authored Cases | Status |
+| --- | --- | ---: | ---: | --- |
+| `docs/tests/inbox/README.md` | Global testing conventions and glossary | 0 | 0 | pending |
+| `docs/tests/inbox/_shared/README.md` | Shared fixtures, JSON assertions, exit-code rules | 0 | 0 | pending |
+| `docs/tests/inbox/workflows/README.md` | Cross-command scenarios | 8 | 0 | pending |
+| `docs/tests/inbox/init/README.md` | `init` command cases | 2 | 0 | pending |
+| `docs/tests/inbox/send/README.md` | `send` command cases | 6 | 0 | pending |
+| `docs/tests/inbox/fetch/README.md` | `fetch` command cases | 4 | 0 | pending |
+| `docs/tests/inbox/claim/README.md` | `claim` command cases | 4 | 0 | pending |
+| `docs/tests/inbox/renew/README.md` | `renew` command cases | 3 | 0 | pending |
+| `docs/tests/inbox/update/README.md` | `update` command cases | 5 | 0 | pending |
+| `docs/tests/inbox/reply/README.md` | `reply` command cases | 4 | 0 | pending |
+| `docs/tests/inbox/done/README.md` | `done` command cases | 4 | 0 | pending |
+| `docs/tests/inbox/fail/README.md` | `fail` command cases | 4 | 0 | pending |
+| `docs/tests/inbox/cancel/README.md` | `cancel` command cases | 3 | 0 | pending |
+| `docs/tests/inbox/list/README.md` | `list` command cases | 4 | 0 | pending |
+| `docs/tests/inbox/show/README.md` | `show` command cases | 4 | 0 | pending |
+| `docs/tests/inbox/watch/README.md` | `watch` command cases | 3 | 0 | pending |
+| `docs/tests/inbox/wait-reply/README.md` | `wait-reply` command cases | 3 | 0 | pending |
+
+## Authoring Order
+
+Recommended order:
+
+1. `docs/tests/inbox/README.md`
+2. `docs/tests/inbox/_shared/README.md`
+3. `docs/tests/inbox/workflows/README.md`
+4. `docs/tests/inbox/send/README.md`
+5. `docs/tests/inbox/fetch/README.md`
+6. `docs/tests/inbox/claim/README.md`
+7. `docs/tests/inbox/reply/README.md`
+8. `docs/tests/inbox/done/README.md`
+9. `docs/tests/inbox/show/README.md`
+10. the remaining command documents
+
+Reason:
+
+- the workflow file captures the highest-value end-to-end behavior first
+- the command documents can then reuse shared conventions and already-settled terminology
+
+## Authored Case Register
+
+No Markdown test cases have been authored yet.
+
+When the first case is written, add rows in this format:
+
+| Path | Case Slug | Coverage Note | Status |
+| --- | --- | --- | --- |
+| `docs/tests/inbox/send/README.md` | `send-creates-new-thread` | minimal happy path for new thread creation | done |
+
+## Pending Case Backlog
+
+### `docs/tests/inbox/workflows/README.md`
+
+- `pending` `thread-lifecycle-happy-path`
+- `pending` `blocked-question-reply-resume-to-done`
+- `pending` `fail-lifecycle-from-claim-to-terminal`
+- `pending` `cancel-lifecycle-after-worker-claim`
+- `pending` `watch-wakes-then-fetch-sees-new-thread`
+- `pending` `artifact-visible-through-send-and-show`
+- `pending` `unread-clears-after-mark-read-and-reappears-on-new-message`
+- `pending` `wait-reply-clears-blocked-unread-for-agent`
+
+### `docs/tests/inbox/init/README.md`
+
+- `pending` `init-creates-schema-on-empty-db`
+- `pending` `init-is-idempotent-on-existing-db`
+
+### `docs/tests/inbox/send/README.md`
+
+- `pending` `send-creates-new-thread`
+- `pending` `send-appends-message-to-existing-thread`
+- `pending` `send-reads-body-from-body-file`
+- `pending` `send-attaches-artifact-with-metadata`
+- `pending` `send-rejects-invalid-payload-json`
+- `pending` `send-rejects-invalid-artifact-metadata-json`
+
+### `docs/tests/inbox/fetch/README.md`
+
+- `pending` `fetch-returns-pending-thread-for-target-agent`
+- `pending` `fetch-respects-status-and-limit-filters`
+- `pending` `fetch-unread-uses-read-cursor`
+- `pending` `fetch-returns-no-matching-work-when-empty`
+
+### `docs/tests/inbox/claim/README.md`
+
+- `pending` `claim-acquires-thread-lease`
+- `pending` `claim-rejects-when-thread-missing`
+- `pending` `claim-rejects-when-thread-already-claimed`
+- `pending` `claim-records-requested-lease-duration`
+
+### `docs/tests/inbox/renew/README.md`
+
+- `pending` `renew-extends-active-lease`
+- `pending` `renew-rejects-non-owner`
+- `pending` `renew-rejects-without-active-lease`
+
+### `docs/tests/inbox/update/README.md`
+
+- `pending` `update-moves-thread-to-in-progress`
+- `pending` `update-moves-thread-to-blocked-with-payload`
+- `pending` `update-accepts-body-file-and-artifact`
+- `pending` `update-rejects-invalid-payload-json`
+- `pending` `update-rejects-non-owner`
+
+### `docs/tests/inbox/reply/README.md`
+
+- `pending` `reply-adds-answer-message`
+- `pending` `reply-supports-control-kind`
+- `pending` `reply-attaches-artifact`
+- `pending` `reply-rejects-invalid-payload-json`
+
+### `docs/tests/inbox/done/README.md`
+
+- `pending` `done-marks-thread-terminal`
+- `pending` `done-persists-result-body-and-artifact`
+- `pending` `done-rejects-non-owner`
+- `pending` `done-rejects-on-terminal-thread`
+
+### `docs/tests/inbox/fail/README.md`
+
+- `pending` `fail-marks-thread-failed`
+- `pending` `fail-persists-failure-body-and-artifact`
+- `pending` `fail-rejects-non-owner`
+- `pending` `fail-rejects-on-terminal-thread`
+
+### `docs/tests/inbox/cancel/README.md`
+
+- `pending` `cancel-marks-thread-cancelled`
+- `pending` `cancel-persists-reason-and-artifact`
+- `pending` `cancel-rejects-when-thread-missing`
+
+### `docs/tests/inbox/list/README.md`
+
+- `pending` `list-filters-by-status`
+- `pending` `list-filters-by-created-by`
+- `pending` `list-filters-by-assigned-to`
+- `pending` `list-respects-limit`
+
+### `docs/tests/inbox/show/README.md`
+
+- `pending` `show-returns-thread-and-message-history`
+- `pending` `show-includes-artifacts-per-message`
+- `pending` `show-mark-read-advances-read-cursor`
+- `pending` `show-rejects-when-thread-missing`
+
+### `docs/tests/inbox/watch/README.md`
+
+- `pending` `watch-wakes-on-matching-thread`
+- `pending` `watch-respects-status-filter`
+- `pending` `watch-times-out-with-no-activity`
+
+### `docs/tests/inbox/wait-reply/README.md`
+
+- `pending` `wait-reply-wakes-on-answer-after-message`
+- `pending` `wait-reply-can-start-from-after-event`
+- `pending` `wait-reply-times-out-when-no-reply`
+
+## Definition Of Done
+
+This roadmap is complete only when all of the following are true:
+
+- every implemented inbox command has a corresponding document folder
+- each planned command document exists
+- each pending case slug has been either authored or explicitly deferred
+- the authored-case register matches the actual Markdown files on disk
+- a new agent can pick any pending case and know exactly where it should be written
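The requirement that the authored-case register match the Markdown files on disk can be sketched as a small check. The code below is an illustration under assumptions, not tooling from this repository: it treats a slug as lowercase words joined by hyphens (inferred from the slugs listed in the backlog), reads `## case: <slug>` headings from a document's text, and compares them with a set of registered slugs. The names `authored_slugs` and `compare_register` are hypothetical.

```python
import re

# Matches the recommended heading pattern, e.g. "## case: send-creates-new-thread".
# Assumption: slugs are lowercase alphanumeric words joined by single hyphens.
CASE_HEADING = re.compile(
    r"^## case: ([a-z0-9]+(?:-[a-z0-9]+)*)[ \t]*$", re.MULTILINE
)


def authored_slugs(markdown: str) -> set[str]:
    """Collect the case slugs declared by '## case:' headings in one document."""
    return set(CASE_HEADING.findall(markdown))


def compare_register(markdown: str, registered: set[str]) -> tuple[set[str], set[str]]:
    """Return (slugs in the document but not registered, slugs registered but not in the document)."""
    in_document = authored_slugs(markdown)
    return in_document - registered, registered - in_document


if __name__ == "__main__":
    # Inline sample text stands in for the contents of a command README.md.
    sample = (
        "# send\n\n"
        "## case: send-creates-new-thread\n\n"
        "## case: send-reads-body-from-body-file\n"
    )
    register = {"send-creates-new-thread", "send-rejects-invalid-payload-json"}
    unregistered, stale = compare_register(sample, register)
    print("not in register:", sorted(unregistered))
    print("not in document:", sorted(stale))
```

A heading that does not follow the assumed slug shape is simply ignored by this sketch; a stricter variant could report such headings as errors instead.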