Add inbox test planning guidance

2026-03-19 10:06:42 +08:00
parent 1927930570
commit dab0506c5a
3 changed files with 434 additions and 6 deletions
@@ -0,0 +1,325 @@
# Inbox Test Documentation Roadmap
## Purpose
This roadmap tracks the human-readable Markdown test plan for `inbox`.
It exists so a new agent can immediately answer four questions without re-reading the whole codebase:
- which test-plan documents already exist
- which cases have already been written down
- which cases are still missing
- what file should be updated next
This roadmap is for the Markdown test-plan set under `docs/tests/inbox/`.
It is not a replacement for automated Go tests.
## Current Snapshot
Snapshot date:
- `2026-03-19`
Current state:
- `inbox` CLI is implemented end-to-end
- automated Go integration tests already exist for the main lifecycle, wait flows, unread behavior, artifacts, and JSON error contracts
- this roadmap now exists at `docs/tests/inbox/ROADMAP.md`
- the command, workflow, and shared Markdown test-plan documents have not been authored yet
Progress summary for planned test-plan documents, excluding `ROADMAP.md`:
- planned document files: `17`
- authored document files: `0`
- planned case slugs in this roadmap: `61`
- authored case slugs in this roadmap: `0`
## Scope
In scope:
- `inbox init`
- `inbox send`
- `inbox fetch`
- `inbox claim`
- `inbox renew`
- `inbox update`
- `inbox reply`
- `inbox done`
- `inbox fail`
- `inbox cancel`
- `inbox list`
- `inbox show`
- `inbox watch`
- `inbox wait-reply`
- cross-command workflows
- shared test conventions for JSON output, exit codes, fixtures, and assertions
Out of scope:
- `orch`
- `council-review`
- implementation details that are not visible through the CLI contract
## Tracking Rules
Directory model:
- one folder per command or shared area
- one `README.md` per folder
- cases are sections inside that `README.md`; do not create one file per case
Case identity:
- do not use numeric IDs
- identify a case by `path + case slug`
- recommended heading pattern inside the command document:
```md
## case: send-rejects-invalid-payload-json
```
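The heading pattern above makes case slugs easy to collect mechanically. The following is a minimal sketch, not part of the `inbox` codebase: `caseSlugs` and `caseID` are hypothetical helpers, and the `#` separator used to join path and slug is an assumption, since this roadmap only says a case is identified by `path + case slug`.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// casePrefix is the heading prefix recommended in this roadmap.
const casePrefix = "## case: "

// caseSlugs returns the slug of every "## case: <slug>" heading in doc,
// in document order.
func caseSlugs(doc string) []string {
	var slugs []string
	sc := bufio.NewScanner(strings.NewReader(doc))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, casePrefix) {
			slugs = append(slugs, strings.TrimSpace(strings.TrimPrefix(line, casePrefix)))
		}
	}
	return slugs
}

// caseID joins a document path and a case slug into one identifier.
// The "#" separator is an assumption made for this sketch.
func caseID(path, slug string) string {
	return fmt.Sprintf("%s#%s", path, slug)
}

func main() {
	doc := "# send\n\n## case: send-creates-new-thread\n\n## case: send-rejects-invalid-payload-json\n"
	for _, slug := range caseSlugs(doc) {
		fmt.Println(caseID("docs/tests/inbox/send/README.md", slug))
	}
}
```

This sketch does not skip fenced code blocks, so a heading quoted inside a fence (as in this roadmap) would also be counted.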
Per-case structure inside the command document:
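- `用例意义` (case purpose: why the case matters)
- `前置条件` (preconditions)
- `输入` (input)
- `预期输出` (expected output)
- `断言结论` (assertion conclusion)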
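A completeness check for the five parts listed above could look like the sketch below. It assumes each part name sits on a line of its own, optionally decorated as a heading, a bold label, or with a trailing colon; the roadmap does not specify that layout, and `checkCase` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// requiredParts lists the five per-case parts in the order given above:
// purpose, preconditions, input, expected output, assertion conclusion.
var requiredParts = []string{"用例意义", "前置条件", "输入", "预期输出", "断言结论"}

// checkCase returns an error naming every required part that does not
// appear on its own line in caseBody. Heading markers, bold markers,
// spaces, and colons around the name are ignored (an assumed layout).
func checkCase(slug, caseBody string) error {
	seen := map[string]bool{}
	for _, line := range strings.Split(caseBody, "\n") {
		seen[strings.Trim(strings.TrimSpace(line), "#*:： ")] = true
	}
	var missing []string
	for _, part := range requiredParts {
		if !seen[part] {
			missing = append(missing, part)
		}
	}
	if len(missing) == 0 {
		return nil
	}
	return fmt.Errorf("case %s: missing parts: %s", slug, strings.Join(missing, ", "))
}

func main() {
	body := "### 用例意义\n...\n### 前置条件\n...\n### 输入\n...\n"
	if err := checkCase("send-creates-new-thread", body); err != nil {
		fmt.Println(err)
	}
}
```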
How to update this roadmap when a new case is written:
1. add the case content to the target `README.md`
2. move the case slug from `Pending Case Backlog` to `Authored Case Register`
3. update the authored counts in `Current Snapshot`
4. if a whole command document is created, update `Document Progress`
Allowed status values in this roadmap:
- `pending`
- `in_progress`
- `done`
- `deferred`
## Existing Automated Coverage Reference
The Markdown test-plan set starts at zero, but these automated tests already exist and should be used as source material when writing the docs:
- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L12) `TestInboxLifecycle`
- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L176) `TestInboxFailLifecycle`
- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L243) `TestInboxRenewWaitReplyAndCancel`
- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L392) `TestInboxWatchListUnreadAndAppend`
- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L549) `TestInboxUnreadReadCursor`
- [integration_test.go](/home/kurihada/project/ai-workflow-skill/internal/cli/inbox/integration_test.go#L639) `TestInboxJSONErrorsAndExitCodes`
These tests do not remove the need for the Markdown plan. They only reduce discovery work.
## Planned Directory Tree
```text
docs/tests/inbox/
  ROADMAP.md
  README.md
  _shared/
    README.md
  workflows/
    README.md
  init/
    README.md
  send/
    README.md
  fetch/
    README.md
  claim/
    README.md
  renew/
    README.md
  update/
    README.md
  reply/
    README.md
  done/
    README.md
  fail/
    README.md
  cancel/
    README.md
  list/
    README.md
  show/
    README.md
  watch/
    README.md
  wait-reply/
    README.md
```
## Document Progress
| Path | Purpose | Planned Cases | Authored Cases | Status |
| --- | --- | ---: | ---: | --- |
| `docs/tests/inbox/README.md` | Global testing conventions and glossary | 0 | 0 | pending |
| `docs/tests/inbox/_shared/README.md` | Shared fixtures, JSON assertions, exit-code rules | 0 | 0 | pending |
| `docs/tests/inbox/workflows/README.md` | Cross-command scenarios | 8 | 0 | pending |
| `docs/tests/inbox/init/README.md` | `init` command cases | 2 | 0 | pending |
| `docs/tests/inbox/send/README.md` | `send` command cases | 6 | 0 | pending |
| `docs/tests/inbox/fetch/README.md` | `fetch` command cases | 4 | 0 | pending |
| `docs/tests/inbox/claim/README.md` | `claim` command cases | 4 | 0 | pending |
| `docs/tests/inbox/renew/README.md` | `renew` command cases | 3 | 0 | pending |
| `docs/tests/inbox/update/README.md` | `update` command cases | 5 | 0 | pending |
| `docs/tests/inbox/reply/README.md` | `reply` command cases | 4 | 0 | pending |
| `docs/tests/inbox/done/README.md` | `done` command cases | 4 | 0 | pending |
| `docs/tests/inbox/fail/README.md` | `fail` command cases | 4 | 0 | pending |
| `docs/tests/inbox/cancel/README.md` | `cancel` command cases | 3 | 0 | pending |
| `docs/tests/inbox/list/README.md` | `list` command cases | 4 | 0 | pending |
| `docs/tests/inbox/show/README.md` | `show` command cases | 4 | 0 | pending |
| `docs/tests/inbox/watch/README.md` | `watch` command cases | 3 | 0 | pending |
| `docs/tests/inbox/wait-reply/README.md` | `wait-reply` command cases | 3 | 0 | pending |
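The counts in `Current Snapshot` are derived from this table, so they can be recomputed rather than maintained by hand. The sketch below assumes the five-column layout shown above; `totals` and `snapshot` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// totals counts the data rows of a Document Progress table and sums its
// "Planned Cases" and "Authored Cases" columns. Header and separator rows
// are skipped because their count cells are not integers.
func totals(table string) (docs, planned, authored int) {
	for _, line := range strings.Split(table, "\n") {
		cells := strings.Split(strings.Trim(strings.TrimSpace(line), "|"), "|")
		if len(cells) != 5 {
			continue
		}
		p, errP := strconv.Atoi(strings.TrimSpace(cells[2]))
		a, errA := strconv.Atoi(strings.TrimSpace(cells[3]))
		if errP != nil || errA != nil {
			continue
		}
		docs++
		planned += p
		authored += a
	}
	return docs, planned, authored
}

// snapshot formats the totals in the wording used by Current Snapshot.
func snapshot(table string) string {
	d, p, a := totals(table)
	return fmt.Sprintf("planned document files: %d\nplanned case slugs: %d\nauthored case slugs: %d", d, p, a)
}

func main() {
	excerpt := "| Path | Purpose | Planned Cases | Authored Cases | Status |\n" +
		"| --- | --- | ---: | ---: | --- |\n" +
		"| `docs/tests/inbox/workflows/README.md` | Cross-command scenarios | 8 | 0 | pending |\n" +
		"| `docs/tests/inbox/init/README.md` | `init` command cases | 2 | 0 | pending |\n"
	fmt.Println(snapshot(excerpt))
}
```

Run against the full table, the planned-case total it reports should match `planned case slugs` in `Current Snapshot`.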
## Authoring Order
Recommended order:
1. `docs/tests/inbox/README.md`
2. `docs/tests/inbox/_shared/README.md`
3. `docs/tests/inbox/workflows/README.md`
4. `docs/tests/inbox/send/README.md`
5. `docs/tests/inbox/fetch/README.md`
6. `docs/tests/inbox/claim/README.md`
7. `docs/tests/inbox/reply/README.md`
8. `docs/tests/inbox/done/README.md`
9. `docs/tests/inbox/show/README.md`
10. the remaining command documents
Reason:
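- the global and shared documents come first so that conventions and terminology are fixed before any case is written
- the workflow file follows because it captures the highest-value end-to-end behavior
- the command documents can then reuse the shared conventions and the already-fixed terminology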
## Authored Case Register
No Markdown test cases have been authored yet.
When the first case is written, add rows in this format (the row below is an illustration, not an authored case):
| Path | Case Slug | Coverage Note | Status |
| --- | --- | --- | --- |
| `docs/tests/inbox/send/README.md` | `send-creates-new-thread` | minimal happy path for new thread creation | done |
## Pending Case Backlog
### `docs/tests/inbox/workflows/README.md`
- `pending` `thread-lifecycle-happy-path`
- `pending` `blocked-question-reply-resume-to-done`
- `pending` `fail-lifecycle-from-claim-to-terminal`
- `pending` `cancel-lifecycle-after-worker-claim`
- `pending` `watch-wakes-then-fetch-sees-new-thread`
- `pending` `artifact-visible-through-send-and-show`
- `pending` `unread-clears-after-mark-read-and-reappears-on-new-message`
- `pending` `wait-reply-clears-blocked-unread-for-agent`
### `docs/tests/inbox/init/README.md`
- `pending` `init-creates-schema-on-empty-db`
- `pending` `init-is-idempotent-on-existing-db`
### `docs/tests/inbox/send/README.md`
- `pending` `send-creates-new-thread`
- `pending` `send-appends-message-to-existing-thread`
- `pending` `send-reads-body-from-body-file`
- `pending` `send-attaches-artifact-with-metadata`
- `pending` `send-rejects-invalid-payload-json`
- `pending` `send-rejects-invalid-artifact-metadata-json`
### `docs/tests/inbox/fetch/README.md`
- `pending` `fetch-returns-pending-thread-for-target-agent`
- `pending` `fetch-respects-status-and-limit-filters`
- `pending` `fetch-unread-uses-read-cursor`
- `pending` `fetch-returns-no-matching-work-when-empty`
### `docs/tests/inbox/claim/README.md`
- `pending` `claim-acquires-thread-lease`
- `pending` `claim-rejects-when-thread-missing`
- `pending` `claim-rejects-when-thread-already-claimed`
- `pending` `claim-records-requested-lease-duration`
### `docs/tests/inbox/renew/README.md`
- `pending` `renew-extends-active-lease`
- `pending` `renew-rejects-non-owner`
- `pending` `renew-rejects-without-active-lease`
### `docs/tests/inbox/update/README.md`
- `pending` `update-moves-thread-to-in-progress`
- `pending` `update-moves-thread-to-blocked-with-payload`
- `pending` `update-accepts-body-file-and-artifact`
- `pending` `update-rejects-invalid-payload-json`
- `pending` `update-rejects-non-owner`
### `docs/tests/inbox/reply/README.md`
- `pending` `reply-adds-answer-message`
- `pending` `reply-supports-control-kind`
- `pending` `reply-attaches-artifact`
- `pending` `reply-rejects-invalid-payload-json`
### `docs/tests/inbox/done/README.md`
- `pending` `done-marks-thread-terminal`
- `pending` `done-persists-result-body-and-artifact`
- `pending` `done-rejects-non-owner`
- `pending` `done-rejects-on-terminal-thread`
### `docs/tests/inbox/fail/README.md`
- `pending` `fail-marks-thread-failed`
- `pending` `fail-persists-failure-body-and-artifact`
- `pending` `fail-rejects-non-owner`
- `pending` `fail-rejects-on-terminal-thread`
### `docs/tests/inbox/cancel/README.md`
- `pending` `cancel-marks-thread-cancelled`
- `pending` `cancel-persists-reason-and-artifact`
- `pending` `cancel-rejects-when-thread-missing`
### `docs/tests/inbox/list/README.md`
- `pending` `list-filters-by-status`
- `pending` `list-filters-by-created-by`
- `pending` `list-filters-by-assigned-to`
- `pending` `list-respects-limit`
### `docs/tests/inbox/show/README.md`
- `pending` `show-returns-thread-and-message-history`
- `pending` `show-includes-artifacts-per-message`
- `pending` `show-mark-read-advances-read-cursor`
- `pending` `show-rejects-when-thread-missing`
### `docs/tests/inbox/watch/README.md`
- `pending` `watch-wakes-on-matching-thread`
- `pending` `watch-respects-status-filter`
- `pending` `watch-times-out-with-no-activity`
### `docs/tests/inbox/wait-reply/README.md`
- `pending` `wait-reply-wakes-on-answer-after-message`
- `pending` `wait-reply-can-start-from-after-event`
- `pending` `wait-reply-times-out-when-no-reply`
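The backlog uses a regular line shape (a `###` heading with the document path, then one status and slug per bullet), so it can be read back into structured entries. The following sketch assumes exactly that shape; `Entry` and `parseBacklog` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// pathRe matches a backlog heading such as: ### `docs/tests/inbox/init/README.md`
	pathRe = regexp.MustCompile("^### `([^`]+)`$")
	// itemRe matches a backlog bullet such as: - `pending` `init-creates-schema-on-empty-db`
	itemRe = regexp.MustCompile("^- `([a-z_]+)` `([a-z0-9-]+)`$")
)

// Entry is one backlog line together with the document it belongs to.
type Entry struct {
	Path   string
	Status string
	Slug   string
}

// String renders the entry as "path + case slug" with its status.
func (e Entry) String() string {
	return fmt.Sprintf("%s %s [%s]", e.Path, e.Slug, e.Status)
}

// parseBacklog reads the Pending Case Backlog section line by line.
// Bullets that appear before any path heading get an empty Path.
func parseBacklog(section string) []Entry {
	var entries []Entry
	current := ""
	for _, line := range strings.Split(section, "\n") {
		line = strings.TrimSpace(line)
		if m := pathRe.FindStringSubmatch(line); m != nil {
			current = m[1]
		} else if m := itemRe.FindStringSubmatch(line); m != nil {
			entries = append(entries, Entry{Path: current, Status: m[1], Slug: m[2]})
		}
	}
	return entries
}

func main() {
	section := "### `docs/tests/inbox/init/README.md`\n" +
		"- `pending` `init-creates-schema-on-empty-db`\n" +
		"- `pending` `init-is-idempotent-on-existing-db`\n"
	for _, e := range parseBacklog(section) {
		fmt.Println(e)
	}
}
```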
## Definition Of Done
This roadmap is complete only when all of the following are true:
- every implemented inbox command has a corresponding document folder
- each planned command document exists
- each pending case slug has been either authored or explicitly deferred
- the authored-case register matches the actual Markdown files on disk
- a new agent can pick any pending case and know exactly where it should be written
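The requirement that the register match the files on disk amounts to a two-way set comparison. A minimal sketch, assuming both sides have already been collected as lists of case identifiers in some consistent format (the `path#slug` form in the example is hypothetical); `diff` and `checkRegister` are illustrative helpers.

```go
package main

import (
	"fmt"
	"sort"
)

// diff returns the identifiers present in a but absent from b, sorted.
func diff(a, b []string) []string {
	inB := make(map[string]bool, len(b))
	for _, id := range b {
		inB[id] = true
	}
	var out []string
	for _, id := range a {
		if !inB[id] {
			out = append(out, id)
		}
	}
	sort.Strings(out)
	return out
}

// checkRegister returns nil when the register and the on-disk cases agree,
// and otherwise an error listing the differences in both directions.
func checkRegister(registered, onDisk []string) error {
	stale := diff(registered, onDisk)    // registered but not found on disk
	unlisted := diff(onDisk, registered) // on disk but not registered
	if len(stale) == 0 && len(unlisted) == 0 {
		return nil
	}
	return fmt.Errorf("register out of sync: not on disk %v, not registered %v", stale, unlisted)
}

func main() {
	registered := []string{"docs/tests/inbox/send/README.md#send-creates-new-thread"}
	onDisk := []string{} // nothing authored yet in this example
	if err := checkRegister(registered, onDisk); err != nil {
		fmt.Println(err)
	}
}
```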