Inbox Test Documentation Roadmap

Purpose

This roadmap tracks the human-readable Markdown test plan for inbox.

It exists so a new agent can immediately answer four questions without re-reading the whole codebase:

  • which test-plan documents already exist
  • which cases have already been written down
  • which cases are still missing
  • what file should be updated next

This roadmap is for the Markdown test-plan set under docs/tests/inbox/. It is not a replacement for automated Go tests.

Current Snapshot

Snapshot date:

  • 2026-03-19

Current state:

  • inbox CLI is implemented end-to-end
  • automated Go integration tests already exist for the main lifecycle, wait flows, unread behavior, artifacts, and JSON error contracts
  • this roadmap now exists under docs/tests/inbox/ROADMAP.md
  • all planned global, shared, workflow, command-index, and command-case Markdown documents have been authored
  • command-level documents have been audited once per command against current CLI and store behavior, with edge-contract notes added for defaults, fallbacks, and error boundaries where needed
  • every inbox command folder now uses README.md as an index plus one Markdown file per case

Progress summary for planned test-plan documents, excluding ROADMAP.md:

  • planned document files: 70
  • authored document files: 70
  • planned case slugs in this roadmap: 61
  • authored case slugs in this roadmap: 61
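The four counts above can be recomputed from the Document Progress rows instead of being maintained by hand. The following is a minimal sketch, assuming each row has already been parsed into a (path, planned cases, authored cases, status) tuple. The `Row` type, the `summarize` helper, and the rule that a file counts as authored once its status is `done` are illustrative assumptions, not part of the repository.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One Document Progress row (hypothetical parsed form)."""
    path: str
    planned: int
    authored: int
    status: str


def summarize(rows: list[Row]) -> dict[str, int]:
    """Recompute the Current Snapshot counts from Document Progress rows."""
    return {
        "planned_files": len(rows),
        # Assumption: a file counts as authored once its status is "done".
        "authored_files": sum(1 for r in rows if r.status == "done"),
        "planned_cases": sum(r.planned for r in rows),
        "authored_cases": sum(r.authored for r in rows),
    }


# Three rows copied from Document Progress, used only as sample input.
sample = [
    Row("docs/tests/inbox/README.md", 0, 0, "done"),
    Row("docs/tests/inbox/workflows/README.md", 8, 8, "done"),
    Row("docs/tests/inbox/init/init-creates-schema-on-empty-db.md", 1, 1, "done"),
]
print(summarize(sample))
```

Run against all rows, a mismatch between these computed values and the numbers in the summary would indicate that the snapshot was not updated after the last change.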

Scope

In scope:

  • inbox init
  • inbox send
  • inbox fetch
  • inbox claim
  • inbox renew
  • inbox update
  • inbox reply
  • inbox done
  • inbox fail
  • inbox cancel
  • inbox list
  • inbox show
  • inbox watch
  • inbox wait-reply
  • cross-command workflows
  • shared test conventions for JSON output, exit codes, fixtures, and assertions

Out of scope:

  • orch
  • council-review
  • implementation details that are not visible through the CLI contract

Tracking Rules

Directory model:

  • one folder per command or shared area
  • each folder keeps a README.md entrypoint
  • command folders use README.md as an index only
  • each command case lives in its own Markdown file named after the case slug
  • cross-command workflow cases remain grouped in docs/tests/inbox/workflows/README.md

Case identity:

  • do not use numeric IDs
  • identify each command case by its concrete file path
  • identify each workflow case by path + case slug
  • command case file naming pattern:
<case-slug>.md
  • workflow case heading pattern:
## case: send-rejects-invalid-payload-json
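The two naming patterns above can be expressed as small helpers. This is a sketch only: the function names are hypothetical, and the slug check assumes slugs are lowercase kebab-case, which matches every slug in this roadmap but is not stated as a rule anywhere.

```python
import re
from pathlib import PurePosixPath

ROOT = PurePosixPath("docs/tests/inbox")

# Assumption: slugs are lowercase kebab-case, as all slugs in this roadmap are.
SLUG_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def _check_slug(slug: str) -> None:
    if not SLUG_RE.fullmatch(slug):
        raise ValueError(f"not a kebab-case slug: {slug!r}")


def command_case_path(command: str, slug: str) -> str:
    """Return the file path that identifies a command case."""
    _check_slug(slug)
    return str(ROOT / command / f"{slug}.md")


def workflow_case_heading(slug: str) -> str:
    """Return the heading that identifies a case inside workflows/README.md."""
    _check_slug(slug)
    return f"## case: {slug}"


print(command_case_path("send", "send-creates-new-thread"))
print(workflow_case_heading("thread-lifecycle-happy-path"))
```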

Per-case structure inside the case document:

  • case purpose (用例意义)
  • preconditions (前置条件)
  • input (输入)
  • expected output (预期输出)
  • assertion conclusion (断言结论)
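A case file can be checked mechanically for these five sections. The sketch below assumes each section appears in the case file as a Markdown heading whose title is exactly the original section name (for example 用例意义). That heading convention is an assumption, and `missing_sections` is a hypothetical helper rather than existing tooling.

```python
# Section names as listed in this roadmap; assumed to be used verbatim as headings.
REQUIRED_SECTIONS = ["用例意义", "前置条件", "输入", "预期输出", "断言结论"]


def missing_sections(markdown_text: str, required=REQUIRED_SECTIONS) -> list[str]:
    """Return the required section names that never appear as a Markdown heading."""
    headings = set()
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Drop the leading '#' characters and surrounding whitespace.
            headings.add(stripped.lstrip("#").strip())
    return [name for name in required if name not in headings]


draft = "# send-creates-new-thread\n\n## 用例意义\n\n## 前置条件\n\n## 输入\n"
print(missing_sections(draft))
```

An empty result means the case document has all five sections; any names returned are the ones still to be written.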

How to update this roadmap when a new case is written:

  1. if it is a command case, create or update the target <case-slug>.md file under the relevant command folder
  2. if it is a command case, add or update the entry in that folder README.md index
  3. if it is a workflow case, add or update the case inside docs/tests/inbox/workflows/README.md
  4. move the case slug from Pending Case Backlog to Authored Case Register
  5. update the authored counts in Current Snapshot
  6. if a new Markdown file is created, update Document Progress

Allowed status values in this roadmap:

  • pending
  • in_progress
  • done
  • deferred
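If the status columns in this roadmap are ever validated by a script, the four allowed values map naturally onto an enumeration. A minimal sketch, with `Status` and `parse_status` as hypothetical names:

```python
from enum import Enum


class Status(str, Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    DONE = "done"
    DEFERRED = "deferred"


def parse_status(cell: str) -> Status:
    """Parse a status cell; raises ValueError for anything outside the allowed set."""
    return Status(cell.strip())


print(parse_status(" done ").value)
```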

Existing Automated Coverage Reference

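The Markdown test-plan set started from zero, but automated Go integration tests already exist for the main lifecycle, wait flows, unread behavior, artifacts, and JSON error contracts (see Current Snapshot). Use them as source material when writing the docs.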

These tests do not remove the need for the Markdown plan. They only reduce discovery work.

Planned Directory Tree

docs/tests/inbox/
  ROADMAP.md
  README.md
  _shared/
    README.md
  workflows/
    README.md
  init/
    README.md
    <case-slug>.md
  send/
    README.md
    <case-slug>.md
  fetch/
    README.md
    <case-slug>.md
  claim/
    README.md
    <case-slug>.md
  renew/
    README.md
    <case-slug>.md
  update/
    README.md
    <case-slug>.md
  reply/
    README.md
    <case-slug>.md
  done/
    README.md
    <case-slug>.md
  fail/
    README.md
    <case-slug>.md
  cancel/
    README.md
    <case-slug>.md
  list/
    README.md
    <case-slug>.md
  show/
    README.md
    <case-slug>.md
  watch/
    README.md
    <case-slug>.md
  wait-reply/
    README.md
    <case-slug>.md
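The tree above implies a fixed set of entrypoint files: ROADMAP.md and README.md at the root, plus one README.md per shared area and per command. The sketch below lists whichever of those are absent from a checkout. The command list is copied from Scope; `missing_entrypoints` is a hypothetical helper, and the caller is assumed to pass the docs/tests/inbox directory as `root`.

```python
from pathlib import Path

# Folder names taken from the planned directory tree.
SHARED_AREAS = ["_shared", "workflows"]
COMMANDS = [
    "init", "send", "fetch", "claim", "renew", "update", "reply",
    "done", "fail", "cancel", "list", "show", "watch", "wait-reply",
]


def missing_entrypoints(root: Path) -> list[str]:
    """List expected ROADMAP.md / README.md files that are absent under root."""
    expected = [root / "ROADMAP.md", root / "README.md"]
    expected += [root / area / "README.md" for area in SHARED_AREAS + COMMANDS]
    # Report paths relative to root, POSIX-style, in tree order.
    return [p.relative_to(root).as_posix() for p in expected if not p.is_file()]
```

Calling `missing_entrypoints(Path("docs/tests/inbox"))` from the repository root should return an empty list once every folder has its README.md.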

Document Progress

| Path | Purpose | Planned Cases | Authored Cases | Status |
| --- | --- | --- | --- | --- |
| docs/tests/inbox/README.md | Global testing conventions and glossary | 0 | 0 | done |
| docs/tests/inbox/_shared/README.md | Shared fixtures, JSON assertions, exit-code rules | 0 | 0 | done |
| docs/tests/inbox/workflows/README.md | Cross-command scenarios | 8 | 8 | done |
| docs/tests/inbox/init/README.md | init command case index | 0 | 0 | done |
| docs/tests/inbox/init/init-creates-schema-on-empty-db.md | init command case | 1 | 1 | done |
| docs/tests/inbox/init/init-is-idempotent-on-existing-db.md | init command case | 1 | 1 | done |
| docs/tests/inbox/send/README.md | send command case index | 0 | 0 | done |
| docs/tests/inbox/send/send-creates-new-thread.md | send command case | 1 | 1 | done |
| docs/tests/inbox/send/send-appends-message-to-existing-thread.md | send command case | 1 | 1 | done |
| docs/tests/inbox/send/send-reads-body-from-body-file.md | send command case | 1 | 1 | done |
| docs/tests/inbox/send/send-attaches-artifact-with-metadata.md | send command case | 1 | 1 | done |
| docs/tests/inbox/send/send-rejects-invalid-payload-json.md | send command case | 1 | 1 | done |
| docs/tests/inbox/send/send-rejects-invalid-artifact-metadata-json.md | send command case | 1 | 1 | done |
| docs/tests/inbox/fetch/README.md | fetch command case index | 0 | 0 | done |
| docs/tests/inbox/fetch/fetch-returns-pending-thread-for-target-agent.md | fetch command case | 1 | 1 | done |
| docs/tests/inbox/fetch/fetch-respects-status-and-limit-filters.md | fetch command case | 1 | 1 | done |
| docs/tests/inbox/fetch/fetch-unread-uses-read-cursor.md | fetch command case | 1 | 1 | done |
| docs/tests/inbox/fetch/fetch-returns-no-matching-work-when-empty.md | fetch command case | 1 | 1 | done |
| docs/tests/inbox/claim/README.md | claim command case index | 0 | 0 | done |
| docs/tests/inbox/claim/claim-acquires-thread-lease.md | claim command case | 1 | 1 | done |
| docs/tests/inbox/claim/claim-rejects-when-thread-missing.md | claim command case | 1 | 1 | done |
| docs/tests/inbox/claim/claim-rejects-when-thread-already-claimed.md | claim command case | 1 | 1 | done |
| docs/tests/inbox/claim/claim-records-requested-lease-duration.md | claim command case | 1 | 1 | done |
| docs/tests/inbox/renew/README.md | renew command case index | 0 | 0 | done |
| docs/tests/inbox/renew/renew-extends-active-lease.md | renew command case | 1 | 1 | done |
| docs/tests/inbox/renew/renew-rejects-non-owner.md | renew command case | 1 | 1 | done |
| docs/tests/inbox/renew/renew-rejects-without-active-lease.md | renew command case | 1 | 1 | done |
| docs/tests/inbox/update/README.md | update command case index | 0 | 0 | done |
| docs/tests/inbox/update/update-moves-thread-to-in-progress.md | update command case | 1 | 1 | done |
| docs/tests/inbox/update/update-moves-thread-to-blocked-with-payload.md | update command case | 1 | 1 | done |
| docs/tests/inbox/update/update-accepts-body-file-and-artifact.md | update command case | 1 | 1 | done |
| docs/tests/inbox/update/update-rejects-invalid-payload-json.md | update command case | 1 | 1 | done |
| docs/tests/inbox/update/update-rejects-non-owner.md | update command case | 1 | 1 | done |
| docs/tests/inbox/reply/README.md | reply command case index | 0 | 0 | done |
| docs/tests/inbox/reply/reply-adds-answer-message.md | reply command case | 1 | 1 | done |
| docs/tests/inbox/reply/reply-supports-control-kind.md | reply command case | 1 | 1 | done |
| docs/tests/inbox/reply/reply-attaches-artifact.md | reply command case | 1 | 1 | done |
| docs/tests/inbox/reply/reply-rejects-invalid-payload-json.md | reply command case | 1 | 1 | done |
| docs/tests/inbox/done/README.md | done command case index | 0 | 0 | done |
| docs/tests/inbox/done/done-marks-thread-terminal.md | done command case | 1 | 1 | done |
| docs/tests/inbox/done/done-persists-result-body-and-artifact.md | done command case | 1 | 1 | done |
| docs/tests/inbox/done/done-rejects-non-owner.md | done command case | 1 | 1 | done |
| docs/tests/inbox/done/done-rejects-on-terminal-thread.md | done command case | 1 | 1 | done |
| docs/tests/inbox/fail/README.md | fail command case index | 0 | 0 | done |
| docs/tests/inbox/fail/fail-marks-thread-failed.md | fail command case | 1 | 1 | done |
| docs/tests/inbox/fail/fail-persists-failure-body-and-artifact.md | fail command case | 1 | 1 | done |
| docs/tests/inbox/fail/fail-rejects-non-owner.md | fail command case | 1 | 1 | done |
| docs/tests/inbox/fail/fail-rejects-on-terminal-thread.md | fail command case | 1 | 1 | done |
| docs/tests/inbox/cancel/README.md | cancel command case index | 0 | 0 | done |
| docs/tests/inbox/cancel/cancel-marks-thread-cancelled.md | cancel command case | 1 | 1 | done |
| docs/tests/inbox/cancel/cancel-persists-reason-and-artifact.md | cancel command case | 1 | 1 | done |
| docs/tests/inbox/cancel/cancel-rejects-when-thread-missing.md | cancel command case | 1 | 1 | done |
| docs/tests/inbox/list/README.md | list command case index | 0 | 0 | done |
| docs/tests/inbox/list/list-filters-by-status.md | list command case | 1 | 1 | done |
| docs/tests/inbox/list/list-filters-by-created-by.md | list command case | 1 | 1 | done |
| docs/tests/inbox/list/list-filters-by-assigned-to.md | list command case | 1 | 1 | done |
| docs/tests/inbox/list/list-respects-limit.md | list command case | 1 | 1 | done |
| docs/tests/inbox/show/README.md | show command case index | 0 | 0 | done |
| docs/tests/inbox/show/show-returns-thread-and-message-history.md | show command case | 1 | 1 | done |
| docs/tests/inbox/show/show-includes-artifacts-per-message.md | show command case | 1 | 1 | done |
| docs/tests/inbox/show/show-mark-read-advances-read-cursor.md | show command case | 1 | 1 | done |
| docs/tests/inbox/show/show-rejects-when-thread-missing.md | show command case | 1 | 1 | done |
| docs/tests/inbox/watch/README.md | watch command case index | 0 | 0 | done |
| docs/tests/inbox/watch/watch-wakes-on-matching-thread.md | watch command case | 1 | 1 | done |
| docs/tests/inbox/watch/watch-respects-status-filter.md | watch command case | 1 | 1 | done |
| docs/tests/inbox/watch/watch-times-out-with-no-activity.md | watch command case | 1 | 1 | done |
| docs/tests/inbox/wait-reply/README.md | wait-reply command case index | 0 | 0 | done |
| docs/tests/inbox/wait-reply/wait-reply-wakes-on-answer-after-message.md | wait-reply command case | 1 | 1 | done |
| docs/tests/inbox/wait-reply/wait-reply-can-start-from-after-event.md | wait-reply command case | 1 | 1 | done |
| docs/tests/inbox/wait-reply/wait-reply-times-out-when-no-reply.md | wait-reply command case | 1 | 1 | done |

Authoring Order

Recommended order:

  1. docs/tests/inbox/README.md
  2. docs/tests/inbox/_shared/README.md
  3. docs/tests/inbox/workflows/README.md
  4. docs/tests/inbox/send/README.md plus its linked case files
  5. docs/tests/inbox/fetch/README.md plus its linked case files
  6. docs/tests/inbox/claim/README.md plus its linked case files
  7. docs/tests/inbox/reply/README.md plus its linked case files
  8. docs/tests/inbox/done/README.md plus its linked case files
  9. docs/tests/inbox/show/README.md plus its linked case files
  10. the remaining command indexes and case files

Reason:

  • the workflow file captures the highest-value end-to-end behavior first
  • the command documents can then reuse shared conventions and already-fixed terminology

Authored Case Register

| Path | Case Slug | Coverage Note | Status |
| --- | --- | --- | --- |
| docs/tests/inbox/workflows/README.md | thread-lifecycle-happy-path | end-to-end happy path from send to show after done | done |
| docs/tests/inbox/workflows/README.md | blocked-question-reply-resume-to-done | blocked thread receives answer and resumes to done | done |
| docs/tests/inbox/workflows/README.md | fail-lifecycle-from-claim-to-terminal | claimed thread transitions to failed terminal state | done |
| docs/tests/inbox/workflows/README.md | cancel-lifecycle-after-worker-claim | claimed thread can be cancelled by initiator | done |
| docs/tests/inbox/workflows/README.md | watch-wakes-then-fetch-sees-new-thread | watch wake-up remains consistent with unread fetch visibility | done |
| docs/tests/inbox/workflows/README.md | artifact-visible-through-send-and-show | body-file and artifact data survive send and show | done |
| docs/tests/inbox/workflows/README.md | unread-clears-after-mark-read-and-reappears-on-new-message | read cursor clears unread and new message restores it | done |
| docs/tests/inbox/workflows/README.md | wait-reply-clears-blocked-unread-for-agent | wait-reply consumes reply and clears blocked unread view | done |
| docs/tests/inbox/init/init-creates-schema-on-empty-db.md | init-creates-schema-on-empty-db | initializes an empty database path and returns initialized status | done |
| docs/tests/inbox/init/init-is-idempotent-on-existing-db.md | init-is-idempotent-on-existing-db | repeated init succeeds on the same database path | done |
| docs/tests/inbox/send/send-creates-new-thread.md | send-creates-new-thread | creates a pending thread with an initial task message | done |
| docs/tests/inbox/send/send-appends-message-to-existing-thread.md | send-appends-message-to-existing-thread | appends a message to an existing non-terminal thread | done |
| docs/tests/inbox/send/send-reads-body-from-body-file.md | send-reads-body-from-body-file | reads message body from a file path | done |
| docs/tests/inbox/send/send-attaches-artifact-with-metadata.md | send-attaches-artifact-with-metadata | persists artifact path, kind, and metadata on send | done |
| docs/tests/inbox/send/send-rejects-invalid-payload-json.md | send-rejects-invalid-payload-json | rejects malformed payload JSON with invalid_input | done |
| docs/tests/inbox/send/send-rejects-invalid-artifact-metadata-json.md | send-rejects-invalid-artifact-metadata-json | rejects malformed artifact metadata JSON | done |
| docs/tests/inbox/fetch/fetch-returns-pending-thread-for-target-agent.md | fetch-returns-pending-thread-for-target-agent | returns pending candidate work for the target agent | done |
| docs/tests/inbox/fetch/fetch-respects-status-and-limit-filters.md | fetch-respects-status-and-limit-filters | enforces status filtering and max row count | done |
| docs/tests/inbox/fetch/fetch-unread-uses-read-cursor.md | fetch-unread-uses-read-cursor | unread filtering depends on per-agent read cursor state | done |
| docs/tests/inbox/fetch/fetch-returns-no-matching-work-when-empty.md | fetch-returns-no-matching-work-when-empty | empty fetch result returns no_matching_work | done |
| docs/tests/inbox/claim/claim-acquires-thread-lease.md | claim-acquires-thread-lease | claims a pending thread and records a claim event message | done |
| docs/tests/inbox/claim/claim-rejects-when-thread-missing.md | claim-rejects-when-thread-missing | missing thread returns not_found | done |
| docs/tests/inbox/claim/claim-rejects-when-thread-already-claimed.md | claim-rejects-when-thread-already-claimed | active lease conflict returns lease_conflict | done |
| docs/tests/inbox/claim/claim-records-requested-lease-duration.md | claim-records-requested-lease-duration | claim event payload records requested lease duration | done |
| docs/tests/inbox/renew/renew-extends-active-lease.md | renew-extends-active-lease | owner renews an active lease and gets a renewal event | done |
| docs/tests/inbox/renew/renew-rejects-non-owner.md | renew-rejects-non-owner | non-owner renew attempt returns lease_conflict | done |
| docs/tests/inbox/renew/renew-rejects-without-active-lease.md | renew-rejects-without-active-lease | missing active lease returns invalid_state | done |
| docs/tests/inbox/update/update-moves-thread-to-in-progress.md | update-moves-thread-to-in-progress | moves a claimed thread to in_progress and emits a progress message | done |
| docs/tests/inbox/update/update-moves-thread-to-blocked-with-payload.md | update-moves-thread-to-blocked-with-payload | moves a claimed thread to blocked with structured question payload | done |
| docs/tests/inbox/update/update-accepts-body-file-and-artifact.md | update-accepts-body-file-and-artifact | persists update body from file plus artifacts | done |
| docs/tests/inbox/update/update-rejects-invalid-payload-json.md | update-rejects-invalid-payload-json | rejects malformed --payload-json input | done |
| docs/tests/inbox/update/update-rejects-non-owner.md | update-rejects-non-owner | rejects update when caller is not the active lease owner | done |
| docs/tests/inbox/reply/reply-adds-answer-message.md | reply-adds-answer-message | appends default answer message to an existing non-terminal thread | done |
| docs/tests/inbox/reply/reply-supports-control-kind.md | reply-supports-control-kind | supports explicit --kind control reply message | done |
| docs/tests/inbox/reply/reply-attaches-artifact.md | reply-attaches-artifact | appends reply message with artifact payload | done |
| docs/tests/inbox/reply/reply-rejects-invalid-payload-json.md | reply-rejects-invalid-payload-json | rejects malformed --payload-json input | done |
| docs/tests/inbox/done/done-marks-thread-terminal.md | done-marks-thread-terminal | marks a claimed thread as done with a result message | done |
| docs/tests/inbox/done/done-persists-result-body-and-artifact.md | done-persists-result-body-and-artifact | persists result body and artifact for follow-up reads | done |
| docs/tests/inbox/done/done-rejects-non-owner.md | done-rejects-non-owner | rejects done from non-owner agent | done |
| docs/tests/inbox/done/done-rejects-on-terminal-thread.md | done-rejects-on-terminal-thread | rejects done on terminal thread states | done |
| docs/tests/inbox/fail/fail-marks-thread-failed.md | fail-marks-thread-failed | marks a claimed thread as failed with a result message | done |
| docs/tests/inbox/fail/fail-persists-failure-body-and-artifact.md | fail-persists-failure-body-and-artifact | persists failure body and artifacts for diagnosis | done |
| docs/tests/inbox/fail/fail-rejects-non-owner.md | fail-rejects-non-owner | rejects fail from non-owner agent | done |
| docs/tests/inbox/fail/fail-rejects-on-terminal-thread.md | fail-rejects-on-terminal-thread | rejects fail on terminal thread states | done |
| docs/tests/inbox/cancel/cancel-marks-thread-cancelled.md | cancel-marks-thread-cancelled | moves a non-terminal thread into cancelled and emits a control message | done |
| docs/tests/inbox/cancel/cancel-persists-reason-and-artifact.md | cancel-persists-reason-and-artifact | persists cancel reason text and attached artifacts | done |
| docs/tests/inbox/cancel/cancel-rejects-when-thread-missing.md | cancel-rejects-when-thread-missing | returns stable not-found contract when thread does not exist | done |
| docs/tests/inbox/list/list-filters-by-status.md | list-filters-by-status | filters returned threads by status set | done |
| docs/tests/inbox/list/list-filters-by-created-by.md | list-filters-by-created-by | filters returned threads by creator | done |
| docs/tests/inbox/list/list-filters-by-assigned-to.md | list-filters-by-assigned-to | filters returned threads by current assignee | done |
| docs/tests/inbox/list/list-respects-limit.md | list-respects-limit | enforces hard cap on returned thread count | done |
| docs/tests/inbox/show/show-returns-thread-and-message-history.md | show-returns-thread-and-message-history | returns thread details and full time-ordered message history | done |
| docs/tests/inbox/show/show-includes-artifacts-per-message.md | show-includes-artifacts-per-message | expands per-message artifacts in the show payload | done |
| docs/tests/inbox/show/show-mark-read-advances-read-cursor.md | show-mark-read-advances-read-cursor | advances caller read cursor when --mark-read is used | done |
| docs/tests/inbox/show/show-rejects-when-thread-missing.md | show-rejects-when-thread-missing | returns stable not-found contract for missing thread | done |
| docs/tests/inbox/watch/watch-wakes-on-matching-thread.md | watch-wakes-on-matching-thread | wakes when a matching post-start event arrives and returns event context | done |
| docs/tests/inbox/watch/watch-respects-status-filter.md | watch-respects-status-filter | wakes only when thread transitions into requested status | done |
| docs/tests/inbox/watch/watch-times-out-with-no-activity.md | watch-times-out-with-no-activity | returns timeout contract when no matching activity arrives | done |
| docs/tests/inbox/wait-reply/wait-reply-wakes-on-answer-after-message.md | wait-reply-wakes-on-answer-after-message | wakes for a qualifying reply after known message boundary | done |
| docs/tests/inbox/wait-reply/wait-reply-can-start-from-after-event.md | wait-reply-can-start-from-after-event | resumes waiting from a known event cursor | done |
| docs/tests/inbox/wait-reply/wait-reply-times-out-when-no-reply.md | wait-reply-times-out-when-no-reply | returns timeout contract when no qualifying reply arrives | done |

Pending Case Backlog

No pending case slugs remain in the current plan.

When a new CLI contract or workflow needs coverage:

  1. if it is a command case, create a new <case-slug>.md file under the relevant command folder and add it to that folder README.md index
  2. if it is a workflow case, add it to docs/tests/inbox/workflows/README.md
  3. add the new slug to Authored Case Register
  4. update Current Snapshot and Document Progress

Definition Of Done

This roadmap is complete only when all of the following are true:

  • every implemented inbox command has a corresponding document folder
  • each planned command index and case document exists
  • each pending case slug has been either authored or explicitly deferred
  • the authored-case register matches the actual Markdown files on disk
  • a new agent can pick any pending case and know exactly where it should be written
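The fourth condition, that the authored-case register matches the Markdown files on disk, is the easiest one to drift and the easiest to check by script. The sketch below compares a set of registered command-case paths with the case files actually present. It is illustrative only: `register_drift` is a hypothetical helper, paths are assumed to be given relative to docs/tests/inbox, and workflow cases are not covered because they live inside workflows/README.md rather than in their own files.

```python
from pathlib import Path


def register_drift(root: Path, registered: set[str]) -> tuple[set[str], set[str]]:
    """Compare registered command-case paths with case files on disk.

    Returns (registered but missing on disk, on disk but not registered).
    All paths are POSIX-style and relative to root.
    """
    on_disk = {
        p.relative_to(root).as_posix()
        # Case files sit exactly one folder below root; README.md is an index only.
        for p in root.glob("*/*.md")
        if p.name != "README.md"
    }
    return registered - on_disk, on_disk - registered
```

Both returned sets being empty means the register and the directory agree. A non-empty first set points at register rows without a file, and a non-empty second set points at files that were never registered.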