# Inbox Test Documentation Roadmap

## Purpose

This roadmap tracks the human-readable Markdown test plan for `inbox`.

It exists so a new agent can immediately answer four questions without re-reading the whole codebase:

- which test-plan documents already exist
- which cases have already been written down
- which cases are still missing
- what file should be updated next

This roadmap is for the Markdown test-plan set under `docs/tests/inbox/`. It is not a replacement for automated Go tests.

## Current Snapshot

Snapshot date:

- `2026-03-19`

Current state:

- `inbox` CLI is implemented end-to-end
- automated Go integration tests already exist for the main lifecycle, wait flows, unread behavior, artifacts, and JSON error contracts
- this roadmap now exists under `docs/tests/inbox/ROADMAP.md`
- all planned global, shared, workflow, command-index, and command-case Markdown documents have been authored
- command-level documents have been audited once per command against current CLI and store behavior, with edge-contract notes added for defaults, fallbacks, and error boundaries where needed
- every inbox command folder now uses `README.md` as an index plus one Markdown file per case

Progress summary for planned test-plan documents, excluding `ROADMAP.md`:

- planned document files: `70`
- authored document files: `70`
- planned case slugs in this roadmap: `61`
- authored case slugs in this roadmap: `61`

## Scope

In scope:

- `inbox init`
- `inbox send`
- `inbox fetch`
- `inbox claim`
- `inbox renew`
- `inbox update`
- `inbox reply`
- `inbox done`
- `inbox fail`
- `inbox cancel`
- `inbox list`
- `inbox show`
- `inbox watch`
- `inbox wait-reply`
- cross-command workflows
- shared test conventions for JSON output, exit codes, fixtures, and assertions

Out of scope:

- `orch`
- `council-review`
- implementation details that are not visible through the CLI contract

## Tracking Rules

Directory model:

- one folder per command or shared area
- each folder keeps a `README.md` entrypoint
- command folders use `README.md` as an index only
- each command case lives in its own Markdown file named after the case slug
- cross-command workflow cases remain grouped in `docs/tests/inbox/workflows/README.md`

Case identity:

- do not use numeric IDs
- identify each command case by its concrete file path
- identify each workflow case by `path + case slug`
- command case file naming pattern:

  ```text
  .md
  ```

- workflow case heading pattern:

  ```md
  ## case: send-rejects-invalid-payload-json
  ```

Per-case structure inside the case document (the headings are written in Chinese in the case files; English glosses are given here for reference):

- `用例意义` (case purpose)
- `前置条件` (preconditions)
- `输入` (input)
- `预期输出` (expected output)
- `断言结论` (assertion conclusion)

How to update this roadmap when a new case is written:

1. if it is a command case, create or update the target `.md` file under the relevant command folder
2. if it is a command case, add or update the entry in that folder `README.md` index
3. if it is a workflow case, add or update the case inside `docs/tests/inbox/workflows/README.md`
4. move the case slug from `Pending Case Backlog` to `Authored Case Register`
5. update the authored counts in `Current Snapshot`
6. if a new Markdown file is created, update `Document Progress`

Allowed status values in this roadmap:

- `pending`
- `in_progress`
- `done`
- `deferred`

## Existing Automated Coverage Reference

The Markdown test-plan set started from zero, but these automated tests already existed and should be used as source material when writing or revising the docs:

- [integration_test.go](../../../internal/cli/inbox/integration_test.go#L12) `TestInboxLifecycle`
- [integration_test.go](../../../internal/cli/inbox/integration_test.go#L176) `TestInboxFailLifecycle`
- [integration_test.go](../../../internal/cli/inbox/integration_test.go#L243) `TestInboxRenewWaitReplyAndCancel`
- [integration_test.go](../../../internal/cli/inbox/integration_test.go#L392) `TestInboxWatchListUnreadAndAppend`
- [integration_test.go](../../../internal/cli/inbox/integration_test.go#L549) `TestInboxUnreadReadCursor`
- [integration_test.go](../../../internal/cli/inbox/integration_test.go#L639) `TestInboxJSONErrorsAndExitCodes`

These tests do not remove the need for the Markdown plan. They only reduce discovery work.
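
The case-identity and status rules from Tracking Rules are simple enough to check mechanically. The following is a minimal sketch, not part of the `inbox` codebase: the helper names `case_path`, `workflow_case_id`, and `validate_status` are hypothetical, and it assumes a command case file is named exactly after its slug with an `.md` suffix, as the directory model describes.

```python
from pathlib import PurePosixPath
from typing import Tuple

# Root of the Markdown test-plan set, as stated in this roadmap.
DOCS_ROOT = PurePosixPath("docs/tests/inbox")

# Allowed status values listed in Tracking Rules.
ALLOWED_STATUSES = {"pending", "in_progress", "done", "deferred"}


def case_path(command: str, slug: str) -> str:
    """Return the file path that identifies a command case (hypothetical helper)."""
    # Each command case lives in its own file named after the case slug.
    return str(DOCS_ROOT / command / f"{slug}.md")


def workflow_case_id(slug: str) -> Tuple[str, str]:
    """Return the (path, slug) pair that identifies a workflow case."""
    # Workflow cases stay grouped in one README, so the slug is part of the identity.
    return (str(DOCS_ROOT / "workflows" / "README.md"), slug)


def validate_status(status: str) -> str:
    """Reject any status value that this roadmap does not allow."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown roadmap status: {status!r}")
    return status
```

For example, `case_path("send", "send-rejects-invalid-payload-json")` yields the same path that appears in the Document Progress table for that case.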
## Planned Directory Tree

```text
docs/tests/inbox/
  ROADMAP.md
  README.md
  _shared/
    README.md
  workflows/
    README.md
  init/
    README.md
    .md
  send/
    README.md
    .md
  fetch/
    README.md
    .md
  claim/
    README.md
    .md
  renew/
    README.md
    .md
  update/
    README.md
    .md
  reply/
    README.md
    .md
  done/
    README.md
    .md
  fail/
    README.md
    .md
  cancel/
    README.md
    .md
  list/
    README.md
    .md
  show/
    README.md
    .md
  watch/
    README.md
    .md
  wait-reply/
    README.md
    .md
```

## Document Progress

| Path | Purpose | Planned Cases | Authored Cases | Status |
| --- | --- | ---: | ---: | --- |
| `docs/tests/inbox/README.md` | Global testing conventions and glossary | 0 | 0 | done |
| `docs/tests/inbox/_shared/README.md` | Shared fixtures, JSON assertions, exit-code rules | 0 | 0 | done |
| `docs/tests/inbox/workflows/README.md` | Cross-command scenarios | 8 | 8 | done |
| `docs/tests/inbox/init/README.md` | `init` command case index | 0 | 0 | done |
| `docs/tests/inbox/init/init-creates-schema-on-empty-db.md` | `init` command case | 1 | 1 | done |
| `docs/tests/inbox/init/init-is-idempotent-on-existing-db.md` | `init` command case | 1 | 1 | done |
| `docs/tests/inbox/send/README.md` | `send` command case index | 0 | 0 | done |
| `docs/tests/inbox/send/send-creates-new-thread.md` | `send` command case | 1 | 1 | done |
| `docs/tests/inbox/send/send-appends-message-to-existing-thread.md` | `send` command case | 1 | 1 | done |
| `docs/tests/inbox/send/send-reads-body-from-body-file.md` | `send` command case | 1 | 1 | done |
| `docs/tests/inbox/send/send-attaches-artifact-with-metadata.md` | `send` command case | 1 | 1 | done |
| `docs/tests/inbox/send/send-rejects-invalid-payload-json.md` | `send` command case | 1 | 1 | done |
| `docs/tests/inbox/send/send-rejects-invalid-artifact-metadata-json.md` | `send` command case | 1 | 1 | done |
| `docs/tests/inbox/fetch/README.md` | `fetch` command case index | 0 | 0 | done |
| `docs/tests/inbox/fetch/fetch-returns-pending-thread-for-target-agent.md` | `fetch` command case | 1 | 1 | done |
| `docs/tests/inbox/fetch/fetch-respects-status-and-limit-filters.md` | `fetch` command case | 1 | 1 | done |
| `docs/tests/inbox/fetch/fetch-unread-uses-read-cursor.md` | `fetch` command case | 1 | 1 | done |
| `docs/tests/inbox/fetch/fetch-returns-no-matching-work-when-empty.md` | `fetch` command case | 1 | 1 | done |
| `docs/tests/inbox/claim/README.md` | `claim` command case index | 0 | 0 | done |
| `docs/tests/inbox/claim/claim-acquires-thread-lease.md` | `claim` command case | 1 | 1 | done |
| `docs/tests/inbox/claim/claim-rejects-when-thread-missing.md` | `claim` command case | 1 | 1 | done |
| `docs/tests/inbox/claim/claim-rejects-when-thread-already-claimed.md` | `claim` command case | 1 | 1 | done |
| `docs/tests/inbox/claim/claim-records-requested-lease-duration.md` | `claim` command case | 1 | 1 | done |
| `docs/tests/inbox/renew/README.md` | `renew` command case index | 0 | 0 | done |
| `docs/tests/inbox/renew/renew-extends-active-lease.md` | `renew` command case | 1 | 1 | done |
| `docs/tests/inbox/renew/renew-rejects-non-owner.md` | `renew` command case | 1 | 1 | done |
| `docs/tests/inbox/renew/renew-rejects-without-active-lease.md` | `renew` command case | 1 | 1 | done |
| `docs/tests/inbox/update/README.md` | `update` command case index | 0 | 0 | done |
| `docs/tests/inbox/update/update-moves-thread-to-in-progress.md` | `update` command case | 1 | 1 | done |
| `docs/tests/inbox/update/update-moves-thread-to-blocked-with-payload.md` | `update` command case | 1 | 1 | done |
| `docs/tests/inbox/update/update-accepts-body-file-and-artifact.md` | `update` command case | 1 | 1 | done |
| `docs/tests/inbox/update/update-rejects-invalid-payload-json.md` | `update` command case | 1 | 1 | done |
| `docs/tests/inbox/update/update-rejects-non-owner.md` | `update` command case | 1 | 1 | done |
| `docs/tests/inbox/reply/README.md` | `reply` command case index | 0 | 0 | done |
| `docs/tests/inbox/reply/reply-adds-answer-message.md` | `reply` command case | 1 | 1 | done |
| `docs/tests/inbox/reply/reply-supports-control-kind.md` | `reply` command case | 1 | 1 | done |
| `docs/tests/inbox/reply/reply-attaches-artifact.md` | `reply` command case | 1 | 1 | done |
| `docs/tests/inbox/reply/reply-rejects-invalid-payload-json.md` | `reply` command case | 1 | 1 | done |
| `docs/tests/inbox/done/README.md` | `done` command case index | 0 | 0 | done |
| `docs/tests/inbox/done/done-marks-thread-terminal.md` | `done` command case | 1 | 1 | done |
| `docs/tests/inbox/done/done-persists-result-body-and-artifact.md` | `done` command case | 1 | 1 | done |
| `docs/tests/inbox/done/done-rejects-non-owner.md` | `done` command case | 1 | 1 | done |
| `docs/tests/inbox/done/done-rejects-on-terminal-thread.md` | `done` command case | 1 | 1 | done |
| `docs/tests/inbox/fail/README.md` | `fail` command case index | 0 | 0 | done |
| `docs/tests/inbox/fail/fail-marks-thread-failed.md` | `fail` command case | 1 | 1 | done |
| `docs/tests/inbox/fail/fail-persists-failure-body-and-artifact.md` | `fail` command case | 1 | 1 | done |
| `docs/tests/inbox/fail/fail-rejects-non-owner.md` | `fail` command case | 1 | 1 | done |
| `docs/tests/inbox/fail/fail-rejects-on-terminal-thread.md` | `fail` command case | 1 | 1 | done |
| `docs/tests/inbox/cancel/README.md` | `cancel` command case index | 0 | 0 | done |
| `docs/tests/inbox/cancel/cancel-marks-thread-cancelled.md` | `cancel` command case | 1 | 1 | done |
| `docs/tests/inbox/cancel/cancel-persists-reason-and-artifact.md` | `cancel` command case | 1 | 1 | done |
| `docs/tests/inbox/cancel/cancel-rejects-when-thread-missing.md` | `cancel` command case | 1 | 1 | done |
| `docs/tests/inbox/list/README.md` | `list` command case index | 0 | 0 | done |
| `docs/tests/inbox/list/list-filters-by-status.md` | `list` command case | 1 | 1 | done |
| `docs/tests/inbox/list/list-filters-by-created-by.md` | `list` command case | 1 | 1 | done |
| `docs/tests/inbox/list/list-filters-by-assigned-to.md` | `list` command case | 1 | 1 | done |
| `docs/tests/inbox/list/list-respects-limit.md` | `list` command case | 1 | 1 | done |
| `docs/tests/inbox/show/README.md` | `show` command case index | 0 | 0 | done |
| `docs/tests/inbox/show/show-returns-thread-and-message-history.md` | `show` command case | 1 | 1 | done |
| `docs/tests/inbox/show/show-includes-artifacts-per-message.md` | `show` command case | 1 | 1 | done |
| `docs/tests/inbox/show/show-mark-read-advances-read-cursor.md` | `show` command case | 1 | 1 | done |
| `docs/tests/inbox/show/show-rejects-when-thread-missing.md` | `show` command case | 1 | 1 | done |
| `docs/tests/inbox/watch/README.md` | `watch` command case index | 0 | 0 | done |
| `docs/tests/inbox/watch/watch-wakes-on-matching-thread.md` | `watch` command case | 1 | 1 | done |
| `docs/tests/inbox/watch/watch-respects-status-filter.md` | `watch` command case | 1 | 1 | done |
| `docs/tests/inbox/watch/watch-times-out-with-no-activity.md` | `watch` command case | 1 | 1 | done |
| `docs/tests/inbox/wait-reply/README.md` | `wait-reply` command case index | 0 | 0 | done |
| `docs/tests/inbox/wait-reply/wait-reply-wakes-on-answer-after-message.md` | `wait-reply` command case | 1 | 1 | done |
| `docs/tests/inbox/wait-reply/wait-reply-can-start-from-after-event.md` | `wait-reply` command case | 1 | 1 | done |
| `docs/tests/inbox/wait-reply/wait-reply-times-out-when-no-reply.md` | `wait-reply` command case | 1 | 1 | done |

## Authoring Order

Recommended order:

1. `docs/tests/inbox/README.md`
2. `docs/tests/inbox/_shared/README.md`
3. `docs/tests/inbox/workflows/README.md`
4. `docs/tests/inbox/send/README.md` plus its linked case files
5. `docs/tests/inbox/fetch/README.md` plus its linked case files
6. `docs/tests/inbox/claim/README.md` plus its linked case files
7. `docs/tests/inbox/reply/README.md` plus its linked case files
8. `docs/tests/inbox/done/README.md` plus its linked case files
9. `docs/tests/inbox/show/README.md` plus its linked case files
10. the remaining command indexes and case files

Reason:

- the workflow file captures the highest-value end-to-end behavior first
- the command documents can then reuse shared conventions and already-fixed terminology

## Authored Case Register

| Path | Case Slug | Coverage Note | Status |
| --- | --- | --- | --- |
| `docs/tests/inbox/workflows/README.md` | `thread-lifecycle-happy-path` | end-to-end happy path from send to show after done | done |
| `docs/tests/inbox/workflows/README.md` | `blocked-question-reply-resume-to-done` | blocked thread receives answer and resumes to done | done |
| `docs/tests/inbox/workflows/README.md` | `fail-lifecycle-from-claim-to-terminal` | claimed thread transitions to failed terminal state | done |
| `docs/tests/inbox/workflows/README.md` | `cancel-lifecycle-after-worker-claim` | claimed thread can be cancelled by initiator | done |
| `docs/tests/inbox/workflows/README.md` | `watch-wakes-then-fetch-sees-new-thread` | watch wake-up remains consistent with unread fetch visibility | done |
| `docs/tests/inbox/workflows/README.md` | `artifact-visible-through-send-and-show` | body-file and artifact data survive send and show | done |
| `docs/tests/inbox/workflows/README.md` | `unread-clears-after-mark-read-and-reappears-on-new-message` | read cursor clears unread and new message restores it | done |
| `docs/tests/inbox/workflows/README.md` | `wait-reply-clears-blocked-unread-for-agent` | wait-reply consumes reply and clears blocked unread view | done |
| `docs/tests/inbox/init/init-creates-schema-on-empty-db.md` | `init-creates-schema-on-empty-db` | initializes an empty database path and returns initialized status | done |
| `docs/tests/inbox/init/init-is-idempotent-on-existing-db.md` | `init-is-idempotent-on-existing-db` | repeated init succeeds on the same database path | done |
| `docs/tests/inbox/send/send-creates-new-thread.md` | `send-creates-new-thread` | creates a pending thread with an initial task message | done |
| `docs/tests/inbox/send/send-appends-message-to-existing-thread.md` | `send-appends-message-to-existing-thread` | appends a message to an existing non-terminal thread | done |
| `docs/tests/inbox/send/send-reads-body-from-body-file.md` | `send-reads-body-from-body-file` | reads message body from a file path | done |
| `docs/tests/inbox/send/send-attaches-artifact-with-metadata.md` | `send-attaches-artifact-with-metadata` | persists artifact path, kind, and metadata on send | done |
| `docs/tests/inbox/send/send-rejects-invalid-payload-json.md` | `send-rejects-invalid-payload-json` | rejects malformed payload JSON with `invalid_input` | done |
| `docs/tests/inbox/send/send-rejects-invalid-artifact-metadata-json.md` | `send-rejects-invalid-artifact-metadata-json` | rejects malformed artifact metadata JSON | done |
| `docs/tests/inbox/fetch/fetch-returns-pending-thread-for-target-agent.md` | `fetch-returns-pending-thread-for-target-agent` | returns pending candidate work for the target agent | done |
| `docs/tests/inbox/fetch/fetch-respects-status-and-limit-filters.md` | `fetch-respects-status-and-limit-filters` | enforces status filtering and max row count | done |
| `docs/tests/inbox/fetch/fetch-unread-uses-read-cursor.md` | `fetch-unread-uses-read-cursor` | unread filtering depends on per-agent read cursor state | done |
| `docs/tests/inbox/fetch/fetch-returns-no-matching-work-when-empty.md` | `fetch-returns-no-matching-work-when-empty` | empty fetch result returns `no_matching_work` | done |
| `docs/tests/inbox/claim/claim-acquires-thread-lease.md` | `claim-acquires-thread-lease` | claims a pending thread and records a claim event message | done |
| `docs/tests/inbox/claim/claim-rejects-when-thread-missing.md` | `claim-rejects-when-thread-missing` | missing thread returns `not_found` | done |
| `docs/tests/inbox/claim/claim-rejects-when-thread-already-claimed.md` | `claim-rejects-when-thread-already-claimed` | active lease conflict returns `lease_conflict` | done |
| `docs/tests/inbox/claim/claim-records-requested-lease-duration.md` | `claim-records-requested-lease-duration` | claim event payload records requested lease duration | done |
| `docs/tests/inbox/renew/renew-extends-active-lease.md` | `renew-extends-active-lease` | owner renews an active lease and gets a renewal event | done |
| `docs/tests/inbox/renew/renew-rejects-non-owner.md` | `renew-rejects-non-owner` | non-owner renew attempt returns `lease_conflict` | done |
| `docs/tests/inbox/renew/renew-rejects-without-active-lease.md` | `renew-rejects-without-active-lease` | missing active lease returns `invalid_state` | done |
| `docs/tests/inbox/update/update-moves-thread-to-in-progress.md` | `update-moves-thread-to-in-progress` | moves a claimed thread to `in_progress` and emits a progress message | done |
| `docs/tests/inbox/update/update-moves-thread-to-blocked-with-payload.md` | `update-moves-thread-to-blocked-with-payload` | moves a claimed thread to `blocked` with structured question payload | done |
| `docs/tests/inbox/update/update-accepts-body-file-and-artifact.md` | `update-accepts-body-file-and-artifact` | persists update body from file plus artifacts | done |
| `docs/tests/inbox/update/update-rejects-invalid-payload-json.md` | `update-rejects-invalid-payload-json` | rejects malformed `--payload-json` input | done |
| `docs/tests/inbox/update/update-rejects-non-owner.md` | `update-rejects-non-owner` | rejects update when caller is not the active lease owner | done |
| `docs/tests/inbox/reply/reply-adds-answer-message.md` | `reply-adds-answer-message` | appends default `answer` message to an existing non-terminal thread | done |
| `docs/tests/inbox/reply/reply-supports-control-kind.md` | `reply-supports-control-kind` | supports explicit `--kind control` reply message | done |
| `docs/tests/inbox/reply/reply-attaches-artifact.md` | `reply-attaches-artifact` | appends reply message with artifact payload | done |
| `docs/tests/inbox/reply/reply-rejects-invalid-payload-json.md` | `reply-rejects-invalid-payload-json` | rejects malformed `--payload-json` input | done |
| `docs/tests/inbox/done/done-marks-thread-terminal.md` | `done-marks-thread-terminal` | marks a claimed thread as `done` with a result message | done |
| `docs/tests/inbox/done/done-persists-result-body-and-artifact.md` | `done-persists-result-body-and-artifact` | persists result body and artifact for follow-up reads | done |
| `docs/tests/inbox/done/done-rejects-non-owner.md` | `done-rejects-non-owner` | rejects `done` from non-owner agent | done |
| `docs/tests/inbox/done/done-rejects-on-terminal-thread.md` | `done-rejects-on-terminal-thread` | rejects `done` on terminal thread states | done |
| `docs/tests/inbox/fail/fail-marks-thread-failed.md` | `fail-marks-thread-failed` | marks a claimed thread as `failed` with a result message | done |
| `docs/tests/inbox/fail/fail-persists-failure-body-and-artifact.md` | `fail-persists-failure-body-and-artifact` | persists failure body and artifacts for diagnosis | done |
| `docs/tests/inbox/fail/fail-rejects-non-owner.md` | `fail-rejects-non-owner` | rejects `fail` from non-owner agent | done |
| `docs/tests/inbox/fail/fail-rejects-on-terminal-thread.md` | `fail-rejects-on-terminal-thread` | rejects `fail` on terminal thread states | done |
| `docs/tests/inbox/cancel/cancel-marks-thread-cancelled.md` | `cancel-marks-thread-cancelled` | moves a non-terminal thread into `cancelled` and emits a control message | done |
| `docs/tests/inbox/cancel/cancel-persists-reason-and-artifact.md` | `cancel-persists-reason-and-artifact` | persists cancel reason text and attached artifacts | done |
| `docs/tests/inbox/cancel/cancel-rejects-when-thread-missing.md` | `cancel-rejects-when-thread-missing` | returns stable not-found contract when thread does not exist | done |
| `docs/tests/inbox/list/list-filters-by-status.md` | `list-filters-by-status` | filters returned threads by status set | done |
| `docs/tests/inbox/list/list-filters-by-created-by.md` | `list-filters-by-created-by` | filters returned threads by creator | done |
| `docs/tests/inbox/list/list-filters-by-assigned-to.md` | `list-filters-by-assigned-to` | filters returned threads by current assignee | done |
| `docs/tests/inbox/list/list-respects-limit.md` | `list-respects-limit` | enforces hard cap on returned thread count | done |
| `docs/tests/inbox/show/show-returns-thread-and-message-history.md` | `show-returns-thread-and-message-history` | returns thread details and full time-ordered message history | done |
| `docs/tests/inbox/show/show-includes-artifacts-per-message.md` | `show-includes-artifacts-per-message` | expands per-message artifacts in the show payload | done |
| `docs/tests/inbox/show/show-mark-read-advances-read-cursor.md` | `show-mark-read-advances-read-cursor` | advances caller read cursor when `--mark-read` is used | done |
| `docs/tests/inbox/show/show-rejects-when-thread-missing.md` | `show-rejects-when-thread-missing` | returns stable not-found contract for missing thread | done |
| `docs/tests/inbox/watch/watch-wakes-on-matching-thread.md` | `watch-wakes-on-matching-thread` | wakes when a matching post-start event arrives and returns event context | done |
| `docs/tests/inbox/watch/watch-respects-status-filter.md` | `watch-respects-status-filter` | wakes only when thread transitions into requested status | done |
| `docs/tests/inbox/watch/watch-times-out-with-no-activity.md` | `watch-times-out-with-no-activity` | returns timeout contract when no matching activity arrives | done |
| `docs/tests/inbox/wait-reply/wait-reply-wakes-on-answer-after-message.md` | `wait-reply-wakes-on-answer-after-message` | wakes for a qualifying reply after known message boundary | done |
| `docs/tests/inbox/wait-reply/wait-reply-can-start-from-after-event.md` | `wait-reply-can-start-from-after-event` | resumes waiting from a known event cursor | done |
| `docs/tests/inbox/wait-reply/wait-reply-times-out-when-no-reply.md` | `wait-reply-times-out-when-no-reply` | returns timeout contract when no qualifying reply arrives | done |

## Pending Case Backlog

No pending case slugs remain in the current plan.

When a new CLI contract or workflow needs coverage:

1. if it is a command case, create a new `.md` file under the relevant command folder and add it to that folder `README.md` index
2. if it is a workflow case, add it to `docs/tests/inbox/workflows/README.md`
3. add the new slug to `Authored Case Register`
4. update `Current Snapshot` and `Document Progress`

## Definition Of Done

This roadmap is complete only when all of the following are true:

- every implemented inbox command has a corresponding document folder
- each planned command index and case document exists
- each pending case slug has been either authored or explicitly deferred
- the authored-case register matches the actual Markdown files on disk
- a new agent can pick any pending case and know exactly where it should be written
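
The requirement that the authored-case register match the Markdown files on disk can be verified with a small script. This is a minimal sketch under stated assumptions, not an existing tool in the repository: `register_drift` is a hypothetical name, the registered paths are given relative to the test-plan root (for example `send/send-creates-new-thread.md`), and only command case files are compared because workflow cases live inside `workflows/README.md`.

```python
from pathlib import Path
from typing import Set, Tuple

# Index and roadmap files are not case documents, so they are skipped.
NON_CASE_FILES = {"README.md", "ROADMAP.md"}


def register_drift(root: Path, registered: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Compare registered command case files with Markdown files under root.

    Returns (missing_on_disk, unregistered_on_disk); both sets hold
    POSIX-style paths relative to root. Two empty sets mean no drift.
    """
    on_disk = {
        path.relative_to(root).as_posix()
        for path in root.rglob("*.md")
        if path.name not in NON_CASE_FILES
    }
    missing_on_disk = registered - on_disk
    unregistered_on_disk = on_disk - registered
    return missing_on_disk, unregistered_on_disk
```

A caller would pass `Path("docs/tests/inbox")` together with the case paths copied from the Authored Case Register and treat any non-empty result as a reason to update either the register or the files.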