Repo Memory Skill Test Plan
Purpose
This directory tracks human-readable test plans for the skills/repo-memory/
Codex skill bundle.
These documents are not direct CLI command-contract specs for repo-memory.
That coverage now lives under ../repo-memory/.
These documents are also not package-level unit tests for the runtime.
Those live under packages/repo-memory-runtime/.
This directory covers a different surface:
- whether an agent can actually use the packaged repo-memory skill
- whether the bundled ./assets/repo-memory CLI works inside real skill-guided repository work
- whether durable repository knowledge is stored and retrieved correctly
Test Model
- README.md is the index for this directory
- each skill test case lives in its own Markdown file
- use stable case slugs in filenames
Shared Execution Contract
Use these defaults unless a case file explicitly overrides them:
- run the scenario with one real agent using the bundled repo-memory skill
- create an isolated temporary directory, repository fixture, and SQLite DB path
- require the agent to use the bundled ./assets/repo-memory CLI instead of ad hoc notes
- validate final database state independently from the main thread after the agent stops
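The isolation step in this contract could be sketched roughly as follows. This is only an illustration: the function name, the `repo` subdirectory, and the `memory.sqlite` file name are assumptions, not something the skill bundle prescribes. The dictionary keys mirror the TMPDIR, REPO_PATH, and DB_PATH values that case files pass to the role agent.

```python
import tempfile
from pathlib import Path


def make_isolated_workspace(prefix: str = "repo-memory-case-") -> dict:
    """Create a fresh temporary directory with a repo fixture dir and a DB path.

    Hypothetical helper: the layout below is an assumption for illustration.
    """
    root = Path(tempfile.mkdtemp(prefix=prefix))

    # Empty directory that a case would populate with its repository fixture.
    repo_path = root / "repo"
    repo_path.mkdir()

    # Only the path is reserved here; the DB file itself is not created,
    # so the bundled CLI remains the only writer.
    db_path = root / "memory.sqlite"

    return {"TMPDIR": root, "REPO_PATH": repo_path, "DB_PATH": db_path}
```

Each call returns a new directory, so two cases run back to back never share a database or fixture.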
How An Agent Runs These Cases
Use one test-runner agent to execute each case.
The test-runner agent is responsible for:
- reading this README.md first, then one specific case file
- creating an isolated temporary directory, repository fixture, and SQLite DB path
- injecting skills/repo-memory/ into the role agent
- passing the concrete SKILL_PATH, TMPDIR, DB_PATH, and REPO_PATH values from the case file
- requiring the role agent to use the bundled ./assets/repo-memory CLI instead of free-form notes
- collecting the role agent final summary as evidence
- running the case Validation Commands from the main thread after the role agent stops
- comparing the observed results against Expected Outcomes and Assertions
The role agent is responsible for:
- acting only within the case scope
- using the injected repo-memory skill rather than ad hoc repository discovery
- coordinating through the bundled CLI and SQLite DB
- reporting concrete keys, entry ids, and final observed state back to the test-runner agent
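Because validation happens from the main thread after the role agent stops, one cautious option is to open the SQLite file read-only so the check cannot alter the state it inspects. The sketch below only lists table names; it assumes nothing about the actual repo-memory schema, and the function name is hypothetical.

```python
import sqlite3
from pathlib import Path


def list_tables_read_only(db_path) -> list:
    """Return the table names in a SQLite file without allowing writes."""
    # mode=ro makes SQLite reject writes and fail if the file does not exist.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

The real checks for a case remain its Validation Commands; this only shows a non-mutating way to look at the database alongside them.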
Default Timeouts
Use these defaults unless a case file explicitly overrides them:
- per-agent timeout: 3m
- overall scenario timeout: 4m
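A test runner might resolve these defaults against per-case overrides along the following lines. The key names `per_agent` and `overall`, and the `s`/`m`/`h` suffix grammar, are assumptions made for this sketch; case files may express overrides differently.

```python
import re

# Defaults from this README; the key names are hypothetical.
DEFAULT_TIMEOUTS = {"per_agent": "3m", "overall": "4m"}

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text: str) -> int:
    """Convert a duration such as '3m' or '90s' into seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def resolve_timeouts(overrides=None) -> dict:
    """Merge case-level overrides over the defaults and return seconds."""
    merged = {**DEFAULT_TIMEOUTS, **(overrides or {})}
    return {name: parse_duration(value) for name, value in merged.items()}
```

For example, `resolve_timeouts({"per_agent": "90s"})` keeps the overall default and shortens only the per-agent limit.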
Default Failure Conditions
Treat the test as failed if any of the following happens:
- the role agent does not reach a final state before timeout
- a required bundled CLI command returns a non-success result unless the case expects that failure
- the final repo-memory DB state conflicts with the documented assertions
- the role agent falls back to free-form notes for durable knowledge that should go through the bundled CLI
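The four conditions above can be read as a simple checklist over what the test runner observed. The following is a minimal sketch; `RunObservation` and its fields are hypothetical names for illustration, not part of the skill bundle.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RunObservation:
    """What the test-runner agent recorded for one case (hypothetical shape)."""
    reached_final_state: bool
    # Failures of required CLI commands that the case did NOT expect.
    unexpected_cli_failures: List[str] = field(default_factory=list)
    # Documented assertions that the final DB state contradicts.
    db_assertion_conflicts: List[str] = field(default_factory=list)
    used_free_form_notes: bool = False


def failure_reasons(obs: RunObservation) -> List[str]:
    """Return every default failure condition that applies; empty means pass."""
    reasons = []
    if not obs.reached_final_state:
        reasons.append("role agent did not reach a final state before timeout")
    if obs.unexpected_cli_failures:
        reasons.append("required bundled CLI command failed unexpectedly")
    if obs.db_assertion_conflicts:
        reasons.append("final DB state conflicts with documented assertions")
    if obs.used_free_form_notes:
        reasons.append("durable knowledge was kept in free-form notes")
    return reasons
```

Collecting all reasons, rather than stopping at the first, keeps the failure report useful when several conditions trip in the same run.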
Evidence Capture
Collect at least the following artifacts for every run:
- the role agent final summary
- the temporary DB path and repository path
- the outputs of the case Validation Commands
- any resolved entry ids, keys, or relation rows needed to verify the case
Cleanup Policy
Use these defaults unless a case file explicitly overrides them:
- keep the temporary DB and repo fixture on failure for debugging
- clean up on success only if replay artifacts are not needed
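As a rough sketch, the default policy reduces to one condition: remove the workspace only when the run passed and nobody needs the artifacts for replay. The function and parameter names here are assumptions for illustration.

```python
import shutil
from pathlib import Path


def apply_cleanup_policy(workspace: Path, passed: bool,
                         keep_replay_artifacts: bool = False) -> bool:
    """Delete the temporary workspace when the default policy allows it.

    Returns True if the workspace was removed, False if it was kept.
    """
    if passed and not keep_replay_artifacts:
        shutil.rmtree(workspace)
        return True
    # Failed runs, and successful runs kept for replay, stay on disk.
    return False
```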
Per-Case Template
Each case file should use this structure:
- Test Type
- Purpose
- Preconditions
- Inputs
- Execution Parameters
- Execution Steps
- Validation Commands
- Expected Outcomes
- Assertions
- Cleanup
- Recorded Example Run, when a real run has already been captured
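A small structural check can catch case files that drift from this template. The sketch below assumes each section title sits on its own line, optionally prefixed with Markdown `#` markers; that heading convention is an assumption, so adjust the matching to whatever the case files actually use.

```python
REQUIRED_SECTIONS = [
    "Test Type",
    "Purpose",
    "Preconditions",
    "Inputs",
    "Execution Parameters",
    "Execution Steps",
    "Validation Commands",
    "Expected Outcomes",
    "Assertions",
    "Cleanup",
]
# "Recorded Example Run" is optional, so it is deliberately not required.


def missing_sections(case_text: str) -> list:
    """Return required section titles that never appear as a line of their own."""
    titles = set()
    for line in case_text.splitlines():
        # Drop leading '#' heading markers and surrounding whitespace.
        titles.add(line.strip().lstrip("#").strip())
    return [name for name in REQUIRED_SECTIONS if name not in titles]
```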
Case Files
| Case Slug | File | Coverage Note |
|---|---|---|
| search-and-add-through-bundled-cli | search-and-add-through-bundled-cli.md | validates that an agent can miss on search, add one durable entry, then retrieve it through the packaged repo-memory skill |
| ingest-and-search-through-bundled-cli | ingest-and-search-through-bundled-cli.md | validates that an agent can ingest docs/ai markdown through the bundled CLI and then retrieve imported knowledge through search and list |
| verify-downgrade-after-file-change-through-bundled-cli | verify-downgrade-after-file-change-through-bundled-cli.md | validates that an agent can record confirmed knowledge, mutate the tracked file, run verify, and observe a needs_review downgrade |
| verify-stale-missing-hard-dependency-through-bundled-cli | verify-stale-missing-hard-dependency-through-bundled-cli.md | validates that an agent can detect a missing hard dependency through verify and observe a stale result |
| link-two-entries-through-bundled-cli | link-two-entries-through-bundled-cli.md | validates that an agent can add two entries, link them, and leave a durable relation in the packaged repo-memory database |
Scope
In scope:
- explicit $repo-memory skill invocation
- bundled ./assets/repo-memory CLI usage
- durable knowledge add/search/list/event flows
- markdown ingest through docs/ai
- verify downgrade and stale transitions
- entry relation/link flows
- package-backed SQLite memory database behavior as surfaced through the skill
Out of scope:
- direct CLI contract coverage that now belongs under ../repo-memory/
- package-level unit tests for packages/repo-memory-runtime
- future auto-export flows such as repo-brief generation
- implicit skill triggering without $repo-memory