docs: add inbox skill test scenarios

2026-03-19 12:35:05 +08:00
parent 1a9fc4c136
commit 72d7caa552
7 changed files with 568 additions and 0 deletions
@@ -0,0 +1,83 @@
+# Case: `artifact-roundtrip-through-bundled-cli`
+
+## Test Type
+
+This is a `forward-test` and an artifact-preservation validation.
+
+The goal is to verify that agents using the packaged inbox skill can exchange body-file content and artifacts through the bundled CLI without losing message data.
+
+## Purpose
+
+Validate that all of the following hold within a single run:
+
+- the leader can create task input files and send them through the bundled CLI
+- the worker can inspect those artifacts through inbox history
+- the worker can return a final result using body-file or artifact inputs
+- the final thread history preserves both task-side and result-side file references
+
+## Preconditions
+
+- skill path exists: `SKILL_PATH=skills/inbox`
+- bundled CLI executable exists: `SKILL_PATH/assets/inbox`
+- use an empty temporary directory `TMPDIR`
+- test database path is `TMPDIR/coord.db`
+
+## Agent Topology
+
+- `leader`
+- `worker-a`
+
+## Inputs
+
+### Leader Prompt
+
+```text
+Use $inbox at SKILL_PATH to act as leader on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) initialize the DB, 2) create a small task file under TMPDIR, 3) send one task to worker-a using body-file plus at least one artifact and artifact metadata, 4) wait until worker-a marks the thread done, 5) inspect the final thread with show, 6) stop. Do not use ordinary chat to coordinate with the other agent.
+```
+
+### Worker Prompt
+
+```text
+Use $inbox at SKILL_PATH to act as worker-a on SQLite DB TMPDIR/coord.db. Only coordinate through the bundled inbox CLI from the skill. Workflow: 1) fetch and claim the task, 2) inspect the task message with show and confirm the artifact is visible, 3) create a small result file under TMPDIR, 4) finish the thread with done using body-file or artifact input, 5) stop after reporting what files were preserved. Do not use ordinary chat to coordinate with the other agent.
+```
+
+## Execution Parameters
+
+- use the shared execution contract from [README.md](./README.md)
+- use the shared timeout defaults from [README.md](./README.md)
+- do not override the default cleanup policy
+
+## Execution Steps
+
+1. Inject the same `skills/inbox/` skill into both real agents
+2. Point both agents at the same database path `TMPDIR/coord.db`
+3. Launch `leader` and `worker-a` in parallel
+4. Wait for both agents to finish
+5. Resolve `THREAD_ID` from the agent outputs or inbox history
+6. Independently run the validation commands from the main thread (the session driving the test, not either agent)
+
+## Validation Commands
+
+```bash
+SKILL_PATH/assets/inbox --db TMPDIR/coord.db --json show --thread THREAD_ID
+```
+
+## Expected Outcomes
+
+- `leader` successfully creates a task file and sends it through `body-file`
+- the initial task message contains at least one artifact reference
+- `worker-a` successfully inspects the task artifact through `show`
+- `worker-a` completes the thread with `done`
+- the final `show` output preserves both task-side and result-side file content or artifact references
+
+## Assertions
+
+- the first task message contains non-empty body content sourced from a file
+- the first task message contains at least one artifact entry
+- the final `result` message contains either body-file content or at least one artifact entry
+- the final thread status is `done`
+
+## Cleanup
+
+- use the default cleanup policy from [README.md](./README.md)
+- if the run fails, retain `TMPDIR`, created files, and `coord.db` for replay and manual inspection
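
The assertions in this case could be checked mechanically against the `--json show` output, roughly as sketched below. This is a minimal sketch, not part of the commit: the JSON field names (`status`, `messages`, `kind`, `body`, `artifacts`) are assumptions about the CLI's output shape and are not confirmed by the case file, and `load_thread` and `check_thread` are hypothetical helper names. Only the command line itself is taken from the Validation Commands section.

```python
import json
import subprocess


def load_thread(skill_path: str, db_path: str, thread_id: str) -> dict:
    """Run the documented `show` command and parse its JSON output."""
    completed = subprocess.run(
        [f"{skill_path}/assets/inbox", "--db", db_path,
         "--json", "show", "--thread", thread_id],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(completed.stdout)


def check_thread(thread: dict) -> list[str]:
    """Return the failed assertions; an empty list means the case passed.

    ASSUMPTION: every key used here is a guess at the JSON schema.
    Adjust the names to whatever the bundled CLI actually emits.
    """
    messages = thread.get("messages") or []
    if not messages:
        return ["thread has no messages"]

    failures = []

    # Assertions 1 and 2: the first (task) message has a body and an artifact.
    first = messages[0]
    if not (first.get("body") or "").strip():
        failures.append("first task message has empty body")
    if not first.get("artifacts"):
        failures.append("first task message has no artifact entry")

    # Assertion 3: the final result message has a body or an artifact.
    results = [m for m in messages if m.get("kind") == "result"]
    if not results:
        failures.append("no result message found")
    else:
        final = results[-1]
        has_body = bool((final.get("body") or "").strip())
        if not (has_body or final.get("artifacts")):
            failures.append("final result message has neither body nor artifact")

    # Assertion 4: the thread reached the `done` status.
    if thread.get("status") != "done":
        failures.append("thread status is not done")

    return failures
```

A run would call `check_thread(load_thread(SKILL_PATH, f"{TMPDIR}/coord.db", THREAD_ID))` and treat a non-empty list as a failure. The sketch can only confirm that the body is non-empty. Confirming that the body came from a file still requires comparing it against the task file retained under `TMPDIR`.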