Add repo-memory skill case plans
@@ -35,6 +35,60 @@ Use these defaults unless a case file explicitly overrides them:
- validate final database state independently from the main thread after the
  agent stops

## How An Agent Runs These Cases

Use one test-runner agent to execute each case.

The test-runner agent is responsible for:

- reading this `README.md` first, then one specific case file
- creating an isolated temporary directory, repository fixture, and SQLite DB path
- injecting `skills/repo-memory/` into the role agent
- passing the concrete `SKILL_PATH`, `TMPDIR`, `DB_PATH`, and `REPO_PATH` values from the case file
- requiring the role agent to use the bundled `./assets/repo-memory` CLI instead of free-form notes
- collecting the role agent's final summary as evidence
- running the case's `Validation Commands` from the main thread after the role agent stops
- comparing the observed results against `Expected Outcomes` and `Assertions`

The role agent is responsible for:

- acting only within the case scope
- using the injected `repo-memory` skill rather than ad hoc repository discovery
- coordinating through the bundled CLI and SQLite DB
- reporting concrete keys, entry ids, and final observed state back to the test-runner agent

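As a rough illustration of the workspace setup the test-runner agent performs, the sketch below creates an isolated temporary directory and derives the DB and fixture paths from it. The default value for `SKILL_PATH` is a placeholder assumption; the file and directory names follow the `Inputs` sections of the case files.

```bash
# Create an isolated workspace for one case run.
# SKILL_PATH is assumed to be supplied by the caller; the default is a placeholder.
SKILL_PATH="${SKILL_PATH:-./skills/repo-memory}"

# TMPDIR mirrors the name used in the case files; it shadows the conventional
# temp-dir variable for the rest of this shell.
TMPDIR="$(mktemp -d)"
DB_PATH="$TMPDIR/repo-memory.db"
REPO_PATH="$TMPDIR/repo-fixture"
mkdir -p "$REPO_PATH"

# Print the concrete values so they can be passed to the role agent.
printf 'SKILL_PATH=%s\nTMPDIR=%s\nDB_PATH=%s\nREPO_PATH=%s\n' \
  "$SKILL_PATH" "$TMPDIR" "$DB_PATH" "$REPO_PATH"
```

The fixture directory still has to be turned into a Git repository before a case starts; that step is left out here.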
## Default Timeouts

Use these defaults unless a case file explicitly overrides them:

- per-agent timeout: `3m`
- overall scenario timeout: `4m`

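One possible way to enforce these limits from a shell-based runner is sketched below. It assumes GNU coreutils `timeout` is available; `run_with_timeout` and the launcher named in the comment are hypothetical, not part of the skill.

```bash
# Run a command under a time limit. GNU coreutils `timeout` exits with
# status 124 when the limit is reached.
run_with_timeout() {
  limit="$1"; shift
  timeout "$limit" "$@"
  status=$?
  if [ "$status" -eq 124 ]; then
    echo "timed out after $limit: $*" >&2
  fi
  return "$status"
}

PER_AGENT_TIMEOUT=3m
OVERALL_TIMEOUT=4m
# Example (hypothetical launcher script):
#   run_with_timeout "$PER_AGENT_TIMEOUT" ./launch-role-agent.sh
```

A status of 124 can then be treated as "the role agent did not reach a final state before timeout".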
## Default Failure Conditions

Treat the test as failed if any of the following happens:

- the role agent does not reach a final state before timeout
- a required bundled CLI command returns a non-success result unless the case expects that failure
- the final repo-memory DB state conflicts with the documented assertions
- the role agent falls back to free-form notes for durable knowledge that should go through the bundled CLI

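The second condition (a CLI failure counts only when the case does not expect it) can be expressed as a small check. This is a minimal sketch; `cli_step_ok` is a hypothetical helper, and it discards command output, which a real runner would keep as evidence.

```bash
# Return 0 when a CLI step behaved as the case expects, non-zero otherwise.
# $1 is "success" or "failure" (the outcome the case file expects);
# the remaining arguments are the command to run.
cli_step_ok() {
  expected="$1"; shift
  if "$@" >/dev/null 2>&1; then
    actual=success
  else
    actual=failure
  fi
  [ "$actual" = "$expected" ]
}
```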
## Evidence Capture

Collect at least the following artifacts for every run:

- the role agent's final summary
- the temporary DB path and repository path
- the outputs of the case's `Validation Commands`
- any resolved entry ids, keys, or relation rows needed to verify the case

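A simple way to keep these artifacts together is to write each command's output and exit status into one directory per run. The sketch below assumes a hypothetical `ARTIFACT_DIR` and file naming scheme; neither is defined by the skill.

```bash
# Run one command and keep its output and exit status as evidence.
# ARTIFACT_DIR and the <name>.out / <name>.status naming are assumptions.
capture() {
  name="$1"; shift
  "$@" >"$ARTIFACT_DIR/$name.out" 2>&1
  status=$?
  echo "$status" >"$ARTIFACT_DIR/$name.status"
  return "$status"
}

ARTIFACT_DIR="$(mktemp -d)"

# Record the temporary DB path and repository path for this run.
capture paths printf 'DB_PATH=%s\nREPO_PATH=%s\n' \
  "${DB_PATH:-unset}" "${REPO_PATH:-unset}"
```

Each of the case's validation commands could be wrapped the same way, for example `capture list <command...>`.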
## Cleanup Policy

Use these defaults unless a case file explicitly overrides them:

- keep the temporary DB and repo fixture on failure for debugging
- clean up on success only if replay artifacts are not needed

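The policy above amounts to a two-condition check before deleting anything. A minimal sketch, using a hypothetical `cleanup_case` helper:

```bash
# Remove the case workspace only when the run passed and replay artifacts
# are not needed.
# $1: workspace directory, $2: "pass" or "fail", $3: "keep" to retain artifacts.
cleanup_case() {
  dir="$1"; result="$2"; replay="${3:-}"
  if [ -n "$dir" ] && [ "$result" = pass ] && [ "$replay" != keep ]; then
    rm -rf -- "$dir"
  else
    echo "keeping $dir for inspection" >&2
  fi
}
```

For example, `cleanup_case "$TMPDIR" fail` leaves the workspace in place, while `cleanup_case "$TMPDIR" pass` removes it.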
## Per-Case Template

Each case file should use this structure:
@@ -56,6 +110,10 @@ Each case file should use this structure:
| Case Slug | File | Coverage Note |
| --- | --- | --- |
| `search-and-add-through-bundled-cli` | [search-and-add-through-bundled-cli.md](./search-and-add-through-bundled-cli.md) | validates that an agent can miss on search, add one durable entry, then retrieve it through the packaged `repo-memory` skill |
| `ingest-and-search-through-bundled-cli` | [ingest-and-search-through-bundled-cli.md](./ingest-and-search-through-bundled-cli.md) | validates that an agent can ingest `docs/ai` markdown through the bundled CLI and then retrieve imported knowledge through search and list |
| `verify-downgrade-after-file-change-through-bundled-cli` | [verify-downgrade-after-file-change-through-bundled-cli.md](./verify-downgrade-after-file-change-through-bundled-cli.md) | validates that an agent can record confirmed knowledge, mutate the tracked file, run verify, and observe a `needs_review` downgrade |
| `verify-stale-missing-hard-dependency-through-bundled-cli` | [verify-stale-missing-hard-dependency-through-bundled-cli.md](./verify-stale-missing-hard-dependency-through-bundled-cli.md) | validates that an agent can detect a missing hard dependency through `verify` and observe a `stale` result |
| `link-two-entries-through-bundled-cli` | [link-two-entries-through-bundled-cli.md](./link-two-entries-through-bundled-cli.md) | validates that an agent can add two entries, link them, and leave a durable relation in the packaged repo-memory database |

## Scope

@@ -64,6 +122,9 @@ In scope:
- explicit `$repo-memory` skill invocation
- bundled `./assets/repo-memory` CLI usage
- durable knowledge add/search/list/event flows
- markdown ingest through `docs/ai`
- verify downgrade and stale transitions
- entry relation/link flows
- package-backed SQLite memory database behavior as surfaced through the skill

Out of scope:

@@ -0,0 +1,72 @@
# Ingest And Search Through Bundled CLI

## Test Type

- forward skill execution

## Purpose

- validate that a single agent can use `skills/repo-memory/` to ingest
  repository-local `docs/ai` markdown through the bundled CLI and retrieve the
  imported knowledge afterwards through `search` and `list`

## Preconditions

- `skills/repo-memory/assets/repo-memory` exists and is executable
- the test runner can create a temporary Git repository fixture
- the test runner can create a temporary SQLite DB path
- the repository fixture includes one `docs/ai/repo-memory.md` file with at
  least `Module Map` and `Danger Zones` sections

## Inputs

- `SKILL_PATH=/.../skills/repo-memory`
- `TMPDIR=/tmp/...`
- `DB_PATH=TMPDIR/repo-memory.db`
- `REPO_PATH=TMPDIR/repo-fixture`

## Execution Parameters

- one agent only
- per-agent timeout: `3m`
- overall timeout: `4m`

## Execution Steps

1. Create a temporary Git repository fixture under `REPO_PATH`.
2. Add `docs/ai/repo-memory.md` with markdown content that describes module and
   danger knowledge.
3. Ask the agent to use `$repo-memory` against `DB_PATH`.
4. Have the agent initialize or bootstrap the DB as needed, run `ingest`
   against `REPO_PATH`, then use `search` and `list` to confirm the imported
   knowledge is visible.
5. Capture the agent summary and the concrete imported entry keys it reports.

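For step 2, a fixture document along the following lines would satisfy the `Module Map` and `Danger Zones` precondition. The section bodies and the `##` heading level are illustrative assumptions; the word `gateway` is included because the validation `search` queries for it. Initializing and committing the Git repository (step 1) is not shown.

```bash
# Write a minimal docs/ai/repo-memory.md into the fixture.
# Section contents and heading levels are assumptions for illustration.
REPO_PATH="${REPO_PATH:-$(mktemp -d)}"
mkdir -p "$REPO_PATH/docs/ai"
cat >"$REPO_PATH/docs/ai/repo-memory.md" <<'EOF'
# Repo Memory

## Module Map

- gateway: accepts incoming requests and routes them to internal services

## Danger Zones

- gateway config: changing route order can shadow existing handlers
EOF
```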
## Validation Commands

Run these from the main thread after the agent stops:

```bash
# Assumes SKILL_PATH, DB_PATH, and REPO_PATH are set to the values from Inputs.
"$SKILL_PATH/assets/repo-memory" ingest --db "$DB_PATH" --repo "$REPO_PATH"
"$SKILL_PATH/assets/repo-memory" search --db "$DB_PATH" --repo "$REPO_PATH" --query "gateway"
"$SKILL_PATH/assets/repo-memory" list --db "$DB_PATH" --repo "$REPO_PATH"
```

## Expected Outcomes

- `ingest` succeeds and reports one imported markdown document
- `search` returns the imported `module` entry for the `Module Map` section
- `list` returns at least one `module` entry and one `danger` entry for the
  fixture repo

## Assertions

- the agent used the bundled CLI instead of copying markdown into ad hoc notes
- the imported knowledge is attached to the target repo path
- the imported keys match the expected `repo-memory:<slug>` style generated from
  the markdown sections

## Cleanup

- keep the temporary DB and repo on failure
- remove temporary artifacts on success only if replay evidence is not needed
@@ -0,0 +1,69 @@
# Link Two Entries Through Bundled CLI

## Test Type

- forward skill execution

## Purpose

- validate that a single agent can use `skills/repo-memory/` to add two durable
  knowledge entries, create a relation between them through the bundled CLI,
  and leave a durable graph edge in the SQLite database

## Preconditions

- `skills/repo-memory/assets/repo-memory` exists and is executable
- the test runner can create a temporary Git repository fixture
- the test runner can create a temporary SQLite DB path
- the repository fixture includes any evidence files needed for the two entries

## Inputs

- `SKILL_PATH=/.../skills/repo-memory`
- `TMPDIR=/tmp/...`
- `DB_PATH=TMPDIR/repo-memory.db`
- `REPO_PATH=TMPDIR/repo-fixture`

## Execution Parameters

- one agent only
- per-agent timeout: `3m`
- overall timeout: `4m`

## Execution Steps

1. Create a temporary Git repository fixture under `REPO_PATH`.
2. Add any files needed to justify two durable knowledge entries.
3. Ask the agent to use `$repo-memory` against `DB_PATH`.
4. Have the agent add one `term` entry and one `chain` entry for the same repo.
5. Have the agent link the first entry to the second with relation
   `related_to`.
6. Capture the agent summary and the concrete entry ids it reports.

## Validation Commands

Run these from the main thread after the agent stops:

```bash
# Assumes SKILL_PATH, DB_PATH, and REPO_PATH are set to the values from Inputs.
# Ids 1 and 2 assume a fresh DB; substitute the entry ids the agent reported if they differ.
"$SKILL_PATH/assets/repo-memory" list --db "$DB_PATH" --repo "$REPO_PATH"
"$SKILL_PATH/assets/repo-memory" events --db "$DB_PATH" --id 1
"$SKILL_PATH/assets/repo-memory" events --db "$DB_PATH" --id 2
sqlite3 "$DB_PATH" "SELECT relation FROM knowledge_links WHERE from_entry_id = 1 AND to_entry_id = 2;"
```

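Because step 6 captures the concrete entry ids, the SQL check can be built from those ids instead of fixed values. The sketch below uses two hypothetical helpers, `link_query` and `assert_single_relation`; the table and column names are taken from the validation query in this case file.

```bash
# Build the validation query from the ids the agent reported.
# printf's %d conversion rejects non-numeric ids.
link_query() {
  printf 'SELECT relation FROM knowledge_links WHERE from_entry_id = %d AND to_entry_id = %d;' "$1" "$2"
}

# Succeed only when the query output is exactly one row holding the expected relation.
# $1: query output, $2: expected relation name.
assert_single_relation() {
  [ "$1" = "$2" ]
}

# Usage (requires sqlite3 and the ids from the agent summary):
#   rows="$(sqlite3 "$DB_PATH" "$(link_query "$FROM_ID" "$TO_ID")")"
#   assert_single_relation "$rows" related_to
```

Comparing the whole output against the relation name means that zero rows or duplicate rows both fail the check.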
## Expected Outcomes

- both `add` calls succeed and leave two queryable entries
- `link` succeeds and reports the relation textually
- the final SQL validation returns one `related_to` row

## Assertions

- the agent used the bundled CLI for entry creation and relation creation
- the relation is durable in the packaged SQLite DB, not just mentioned in the summary
- both entries remain independently inspectable through `events`

## Cleanup

- keep the temporary DB and repo on failure
- remove temporary artifacts on success only if replay evidence is not needed
@@ -0,0 +1,71 @@
# Verify Downgrade After File Change Through Bundled CLI

## Test Type

- forward skill execution

## Purpose

- validate that a single agent can use `skills/repo-memory/` to record
  confirmed knowledge with a hard file dependency, change that file, run
  `verify`, and observe the expected `needs_review` downgrade

## Preconditions

- `skills/repo-memory/assets/repo-memory` exists and is executable
- the test runner can create a temporary Git repository fixture
- the repository fixture contains one evidence file committed in Git before the
  agent starts
- the test runner can modify the evidence file before or during the scenario

## Inputs

- `SKILL_PATH=/.../skills/repo-memory`
- `TMPDIR=/tmp/...`
- `DB_PATH=TMPDIR/repo-memory.db`
- `REPO_PATH=TMPDIR/repo-fixture`
- `EVIDENCE_PATH=REPO_PATH/foo.txt`

## Execution Parameters

- one agent only
- per-agent timeout: `3m`
- overall timeout: `4m`

## Execution Steps

1. Create a temporary Git repository fixture under `REPO_PATH`.
2. Commit one evidence file at `EVIDENCE_PATH`.
3. Ask the agent to use `$repo-memory` against `DB_PATH`.
4. Have the agent add one `confirmed` entry that depends on `EVIDENCE_PATH`.
5. Mutate `EVIDENCE_PATH` after the entry is recorded.
6. Have the agent run `verify`, then inspect the result with `list` and
   `events`.
7. Capture the agent summary and the final entry status it reports.

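Step 5 only has value if the file content really changes. The sketch below appends a line and confirms that the checksum differs; `mutate_evidence` is a hypothetical helper. How `verify` itself detects the change is not stated in this case file, so the checksum is only a sanity check on the runner side.

```bash
# Append a line to the evidence file and confirm that its checksum changed.
# Returns non-zero if the content is unchanged.
mutate_evidence() {
  file="$1"
  before="$(cksum <"$file")"
  printf 'mutated at step 5\n' >>"$file"
  after="$(cksum <"$file")"
  [ "$before" != "$after" ]
}
```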
## Validation Commands

Run these from the main thread after the agent stops:

```bash
# Assumes SKILL_PATH, DB_PATH, and REPO_PATH are set to the values from Inputs.
# Id 1 assumes a fresh DB; substitute the entry id the agent reported if it differs.
"$SKILL_PATH/assets/repo-memory" verify --db "$DB_PATH" --repo "$REPO_PATH"
"$SKILL_PATH/assets/repo-memory" list --db "$DB_PATH" --repo "$REPO_PATH" --status needs_review
"$SKILL_PATH/assets/repo-memory" events --db "$DB_PATH" --id 1
```

## Expected Outcomes

- `verify` reports one downgraded entry
- `list` returns the target entry in `needs_review`
- `events` includes a `downgraded` event for the target entry

## Assertions

- the agent used the bundled CLI for both the write and the verification flow
- the downgrade reason is driven by real repository state, not by chat-only reasoning
- the final state transition is visible both in the current listing and the event history

## Cleanup

- keep the temporary DB and repo on failure
- remove temporary artifacts on success only if replay evidence is not needed
@@ -0,0 +1,70 @@
# Verify Stale Missing Hard Dependency Through Bundled CLI

## Test Type

- forward skill execution

## Purpose

- validate that a single agent can use `skills/repo-memory/` to record
  confirmed knowledge with a missing hard dependency, run `verify`, and observe
  the expected `stale` outcome

## Preconditions

- `skills/repo-memory/assets/repo-memory` exists and is executable
- the test runner can create a temporary Git repository fixture
- the repository fixture has a valid Git HEAD before verification starts
- the hard dependency path referenced by the entry does not exist

## Inputs

- `SKILL_PATH=/.../skills/repo-memory`
- `TMPDIR=/tmp/...`
- `DB_PATH=TMPDIR/repo-memory.db`
- `REPO_PATH=TMPDIR/repo-fixture`
- `MISSING_PATH=REPO_PATH/missing.txt`

## Execution Parameters

- one agent only
- per-agent timeout: `3m`
- overall timeout: `4m`

## Execution Steps

1. Create a temporary Git repository fixture under `REPO_PATH` and ensure it
   has an initial commit.
2. Ask the agent to use `$repo-memory` against `DB_PATH`.
3. Have the agent add one `confirmed` entry that declares `MISSING_PATH` as a
   hard dependency.
4. Have the agent run `verify`, then inspect the result with `list` and
   `events`.
5. Capture the agent summary and the final entry status it reports.

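The case depends on `MISSING_PATH` really being absent, so a runner may want to check that precondition before handing over to the agent. A minimal sketch with a hypothetical `require_missing` helper:

```bash
# Fail fast if the "missing" dependency exists (including as a dangling
# symlink), since the case would then not exercise the stale path.
require_missing() {
  if [ -e "$1" ] || [ -L "$1" ]; then
    echo "precondition violated: $1 exists" >&2
    return 1
  fi
}
```

Calling `require_missing "$MISSING_PATH"` once before step 2 and again before the validation commands would catch a file that the agent created along the way.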
## Validation Commands

Run these from the main thread after the agent stops:

```bash
# Assumes SKILL_PATH, DB_PATH, and REPO_PATH are set to the values from Inputs.
# Id 1 assumes a fresh DB; substitute the entry id the agent reported if it differs.
"$SKILL_PATH/assets/repo-memory" verify --db "$DB_PATH" --repo "$REPO_PATH"
"$SKILL_PATH/assets/repo-memory" list --db "$DB_PATH" --repo "$REPO_PATH" --status stale
"$SKILL_PATH/assets/repo-memory" events --db "$DB_PATH" --id 1
```

## Expected Outcomes

- `verify` reports one stale entry
- `list` returns the target entry in `stale`
- `events` includes a `marked_stale` event for the target entry

## Assertions

- the agent used the bundled CLI for the full verify flow
- the stale result is driven by the missing hard dependency, not by a generic command failure
- the final state is visible in both current listing output and event history

## Cleanup

- keep the temporary DB and repo on failure
- remove temporary artifacts on success only if replay evidence is not needed