Add repo-memory skill case plans

2026-03-20 16:19:46 +08:00
parent 9915e12a30
commit 693a79345b
6 changed files with 343 additions and 0 deletions
@@ -35,6 +35,60 @@ Use these defaults unless a case file explicitly overrides them:
 - validate final database state independently from the main thread after the
  agent stops

+## How An Agent Runs These Cases
+
+Use one test-runner agent to execute each case.
+
+The test-runner agent is responsible for:
+
+- reading this `README.md` first, then one specific case file
+- creating an isolated temporary directory, repository fixture, and SQLite DB path
+- injecting `skills/repo-memory/` into the role agent
+- passing the concrete `SKILL_PATH`, `TMPDIR`, `DB_PATH`, and `REPO_PATH` values from the case file
+- requiring the role agent to use the bundled `./assets/repo-memory` CLI instead of free-form notes
+- collecting the role agent's final summary as evidence
+- running the case's `Validation Commands` from the main thread after the role agent stops
+- comparing the observed results against `Expected Outcomes` and `Assertions`
+
+The role agent is responsible for:
+
+- acting only within the case scope
+- using the injected `repo-memory` skill rather than ad hoc repository discovery
+- coordinating through the bundled CLI and SQLite DB
+- reporting concrete keys, entry ids, and final observed state back to the test-runner agent
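
The isolation duties listed for the test-runner agent can be sketched as follows. This is a minimal sketch rather than the project's own harness: `CASE_TMPDIR` is a hypothetical name standing in for the case file's `TMPDIR` value (renamed so the sketch does not overwrite the `TMPDIR` variable that `mktemp` itself reads), and `SKILL_PATH` is left to the case file because its concrete value is not given here.

```shell
#!/bin/sh
set -eu

# One fresh directory per case run, so runs cannot see each other's state.
CASE_TMPDIR=$(mktemp -d)
DB_PATH="$CASE_TMPDIR/repo-memory.db"
REPO_PATH="$CASE_TMPDIR/repo-fixture"

# Create an empty Git repository as the fixture. The DB file is not created
# here; it is left for the bundled CLI to create.
mkdir -p "$REPO_PATH"
git -C "$REPO_PATH" init -q

# Print the concrete values that would be passed to the role agent.
printf 'TMPDIR=%s\nDB_PATH=%s\nREPO_PATH=%s\n' "$CASE_TMPDIR" "$DB_PATH" "$REPO_PATH"
```

Deriving `DB_PATH` and `REPO_PATH` from a single temporary root keeps cleanup and evidence capture to one directory.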
+
+## Default Timeouts
+
+Use these defaults unless a case file explicitly overrides them:
+
+- per-agent timeout: `3m`
+- overall scenario timeout: `4m`
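
One way to enforce the per-agent default is to wrap the agent launch in a time limit. The sketch below assumes GNU coreutils `timeout` is available (it exits with status 124 when the limit is hit); `run_role_agent` and `PER_AGENT_TIMEOUT` are hypothetical names, and `sleep` stands in for whatever command launches the role agent.

```shell
#!/bin/sh
# Default taken from this README; a case file may override it.
PER_AGENT_TIMEOUT="${PER_AGENT_TIMEOUT:-3m}"

# run_role_agent COMMAND...: run the launch command under the per-agent limit.
run_role_agent() {
    timeout "$PER_AGENT_TIMEOUT" "$@"
}

# Demonstration with a stand-in command and a one-second limit.
PER_AGENT_TIMEOUT=1
status=0
run_role_agent sleep 3 || status=$?
if [ "$status" -eq 124 ]; then
    echo "role agent did not reach a final state before timeout"
fi
```

The overall scenario limit could be applied the same way around the whole case script.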
+
+## Default Failure Conditions
+
+Treat the test as failed if any of the following happens:
+
+- the role agent does not reach a final state before timeout
+- a required bundled CLI command returns a non-success result, unless the case expects that failure
+- the final repo-memory DB state conflicts with the documented assertions
+- the role agent falls back to free-form notes for durable knowledge that should go through the bundled CLI
+
+## Evidence Capture
+
+Collect at least the following artifacts for every run:
+
+- the role agent's final summary
+- the temporary DB path and repository path
+- the outputs of the case's `Validation Commands`
+- any resolved entry ids, keys, or relation rows needed to verify the case
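
The artifacts above could be gathered into one directory per run. This is a sketch under stated assumptions: `EVIDENCE_DIR`, the file names, the `save_output` helper, and the sample values are all hypothetical, and `echo` stands in for a real validation command.

```shell
#!/bin/sh
set -eu

EVIDENCE_DIR=$(mktemp -d)
DB_PATH="/tmp/example/repo-memory.db"      # sample value only
REPO_PATH="/tmp/example/repo-fixture"      # sample value only

# save_output NAME COMMAND...: run a command, keep its combined output and
# its exit status, and do not abort the capture if the command fails.
save_output() {
    name=$1
    shift
    status=0
    "$@" >"$EVIDENCE_DIR/$name.out" 2>&1 || status=$?
    printf '%s\n' "$status" >"$EVIDENCE_DIR/$name.status"
}

# Record the paths and a stand-in for the role agent's final summary.
printf 'DB_PATH=%s\nREPO_PATH=%s\n' "$DB_PATH" "$REPO_PATH" >"$EVIDENCE_DIR/paths.txt"
printf 'example summary text\n' >"$EVIDENCE_DIR/agent-summary.txt"

# Stand-in for one of the case's validation commands.
save_output validation-1 echo "example validation output"
```

Storing the exit status next to the output makes it possible to tell an expected failure from an unexpected one when reviewing the run later.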
+
+## Cleanup Policy
+
+Use these defaults unless a case file explicitly overrides them:
+
+- keep the temporary DB and repo fixture on failure for debugging
+- clean up on success only if replay artifacts are not needed
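
The two cleanup defaults reduce to a single decision. A minimal sketch, assuming the runner already knows the case result; `cleanup_case` and its argument values are hypothetical names, and the directories are throwaway stand-ins for the temporary DB and repo fixture.

```shell
#!/bin/sh
# cleanup_case RESULT KEEP_ARTIFACTS DIR
#   RESULT          "pass" or "fail"
#   KEEP_ARTIFACTS  "yes" when replay artifacts are still needed
cleanup_case() {
    result=$1
    keep_artifacts=$2
    dir=$3
    if [ "$result" = "pass" ] && [ "$keep_artifacts" != "yes" ]; then
        rm -rf -- "$dir"
        echo "removed $dir"
    else
        # Failures are always kept for debugging.
        echo "kept $dir"
    fi
}

# Demonstration with throwaway directories.
failed_dir=$(mktemp -d)
passed_dir=$(mktemp -d)
cleanup_case fail no "$failed_dir"
cleanup_case pass no "$passed_dir"
```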
+
 ## Per-Case Template

 Each case file should use this structure:
@@ -56,6 +110,10 @@ Each case file should use this structure:
 | Case Slug | File | Coverage Note |
 | --- | --- | --- |
 | `search-and-add-through-bundled-cli` | [search-and-add-through-bundled-cli.md](./search-and-add-through-bundled-cli.md) | validates that an agent can miss on search, add one durable entry, then retrieve it through the packaged `repo-memory` skill |
+| `ingest-and-search-through-bundled-cli` | [ingest-and-search-through-bundled-cli.md](./ingest-and-search-through-bundled-cli.md) | validates that an agent can ingest `docs/ai` markdown through the bundled CLI and then retrieve imported knowledge through search and list |
+| `verify-downgrade-after-file-change-through-bundled-cli` | [verify-downgrade-after-file-change-through-bundled-cli.md](./verify-downgrade-after-file-change-through-bundled-cli.md) | validates that an agent can record confirmed knowledge, mutate the tracked file, run verify, and observe a `needs_review` downgrade |
+| `verify-stale-missing-hard-dependency-through-bundled-cli` | [verify-stale-missing-hard-dependency-through-bundled-cli.md](./verify-stale-missing-hard-dependency-through-bundled-cli.md) | validates that an agent can detect a missing hard dependency through `verify` and observe a `stale` result |
+| `link-two-entries-through-bundled-cli` | [link-two-entries-through-bundled-cli.md](./link-two-entries-through-bundled-cli.md) | validates that an agent can add two entries, link them, and leave a durable relation in the packaged repo-memory database |

 ## Scope

@@ -64,6 +122,9 @@ In scope:
 - explicit `$repo-memory` skill invocation
 - bundled `./assets/repo-memory` CLI usage
 - durable knowledge add/search/list/event flows
+- markdown ingest through `docs/ai`
+- verify downgrade and stale transitions
+- entry relation/link flows
 - package-backed SQLite memory database behavior as surfaced through the skill

 Out of scope:
@@ -0,0 +1,72 @@
+# Ingest And Search Through Bundled CLI
+
+## Test Type
+
+- forward skill execution
+
+## Purpose
+
+- validate that a single agent can use `skills/repo-memory/` to ingest
+  repository-local `docs/ai` markdown through the bundled CLI and retrieve the
+  imported knowledge afterwards through `search` and `list`
+
+## Preconditions
+
+- `skills/repo-memory/assets/repo-memory` exists and is executable
+- the test runner can create a temporary Git repository fixture
+- the test runner can create a temporary SQLite DB path
+- the repository fixture includes one `docs/ai/repo-memory.md` file with at
+  least `Module Map` and `Danger Zones` sections
+
+## Inputs
+
+- `SKILL_PATH=/.../skills/repo-memory`
+- `TMPDIR=/tmp/...`
+- `DB_PATH=TMPDIR/repo-memory.db`
+- `REPO_PATH=TMPDIR/repo-fixture`
+
+## Execution Parameters
+
+- one agent only
+- per-agent timeout: `3m`
+- overall timeout: `4m`
+
+## Execution Steps
+
+1. Create a temporary Git repository fixture under `REPO_PATH`.
+2. Add `docs/ai/repo-memory.md` with markdown content that describes module and
+   danger knowledge.
+3. Ask the agent to use `$repo-memory` against `DB_PATH`.
+4. Have the agent initialize or bootstrap the DB as needed, run `ingest`
+   against `REPO_PATH`, then use `search` and `list` to confirm the imported
+   knowledge is visible.
+5. Capture the agent summary and the concrete imported entry keys it reports.
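
Steps 1 and 2 can be sketched as below. Only the file path and the two section names come from this case; the markdown body, the committer identity, and the `CASE_TMPDIR` name are illustrative assumptions. Whether `ingest` needs the file committed, and which markdown shapes it parses, is not stated here, so treat the content as a placeholder.

```shell
#!/bin/sh
set -eu

CASE_TMPDIR=$(mktemp -d)
REPO_PATH="$CASE_TMPDIR/repo-fixture"

# Step 1: temporary Git repository fixture.
mkdir -p "$REPO_PATH/docs/ai"
git -C "$REPO_PATH" init -q

# Step 2: markdown with module and danger knowledge (hypothetical content).
cat >"$REPO_PATH/docs/ai/repo-memory.md" <<'EOF'
# Repo Memory

## Module Map

- gateway: accepts inbound requests and forwards them to internal services

## Danger Zones

- gateway: changing the retry settings can cause duplicate requests
EOF

# Committing the file is an assumption; the case only requires that it exists.
git -C "$REPO_PATH" add docs/ai/repo-memory.md
git -C "$REPO_PATH" -c user.name="case-runner" \
    -c user.email="case-runner@example.invalid" \
    commit -q -m "add docs/ai fixture"
```

The word "gateway" appears in the sample content so that the `search --query "gateway"` validation has something to match.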
+
+## Validation Commands
+
+Run these from the main thread after the agent stops:
+
+```bash
+"$SKILL_PATH"/assets/repo-memory ingest --db "$DB_PATH" --repo "$REPO_PATH"
+"$SKILL_PATH"/assets/repo-memory search --db "$DB_PATH" --repo "$REPO_PATH" --query "gateway"
+"$SKILL_PATH"/assets/repo-memory list --db "$DB_PATH" --repo "$REPO_PATH"
+```
+
+## Expected Outcomes
+
+- `ingest` succeeds and reports one imported markdown document
+- `search` returns the imported `module` entry for the `Module Map` section
+- `list` returns at least one `module` entry and one `danger` entry for the
+  fixture repo
+
+## Assertions
+
+- the agent used the bundled CLI instead of copying markdown into ad hoc notes
+- the imported knowledge is attached to the target repo path
+- the imported keys match the expected `repo-memory:<slug>` style generated from
+  the markdown sections
+
+## Cleanup
+
+- keep the temporary DB and repo on failure
+- remove temporary artifacts on success only if replay evidence is not needed
@@ -0,0 +1,69 @@
+# Link Two Entries Through Bundled CLI
+
+## Test Type
+
+- forward skill execution
+
+## Purpose
+
+- validate that a single agent can use `skills/repo-memory/` to add two durable
+  knowledge entries, create a relation between them through the bundled CLI,
+  and leave a durable graph edge in the SQLite database
+
+## Preconditions
+
+- `skills/repo-memory/assets/repo-memory` exists and is executable
+- the test runner can create a temporary Git repository fixture
+- the test runner can create a temporary SQLite DB path
+- the repository fixture includes any evidence files needed for the two entries
+
+## Inputs
+
+- `SKILL_PATH=/.../skills/repo-memory`
+- `TMPDIR=/tmp/...`
+- `DB_PATH=TMPDIR/repo-memory.db`
+- `REPO_PATH=TMPDIR/repo-fixture`
+
+## Execution Parameters
+
+- one agent only
+- per-agent timeout: `3m`
+- overall timeout: `4m`
+
+## Execution Steps
+
+1. Create a temporary Git repository fixture under `REPO_PATH`.
+2. Add any files needed to justify two durable knowledge entries.
+3. Ask the agent to use `$repo-memory` against `DB_PATH`.
+4. Have the agent add one `term` entry and one `chain` entry for the same repo.
+5. Have the agent link the first entry to the second with relation
+   `related_to`.
+6. Capture the agent summary and the concrete entry ids it reports.
+
+## Validation Commands
+
+Run these from the main thread after the agent stops:
+
+```bash
+"$SKILL_PATH"/assets/repo-memory list --db "$DB_PATH" --repo "$REPO_PATH"
+"$SKILL_PATH"/assets/repo-memory events --db "$DB_PATH" --id 1
+"$SKILL_PATH"/assets/repo-memory events --db "$DB_PATH" --id 2
+sqlite3 "$DB_PATH" "SELECT relation FROM knowledge_links WHERE from_entry_id = 1 AND to_entry_id = 2;"
+```
+
+## Expected Outcomes
+
+- both `add` calls succeed and leave two queryable entries
+- `link` succeeds and reports the relation textually
+- the final SQL validation returns one `related_to` row
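
The last expected outcome can be checked mechanically once the query output is captured. A minimal sketch, assuming the `sqlite3` shell's default list output (one row per line); `assert_single_relation` is a hypothetical helper, and the literal strings stand in for captured query output.

```shell
#!/bin/sh
# assert_single_relation QUERY_OUTPUT EXPECTED
# Succeeds only when the output is exactly one row equal to EXPECTED.
# Zero rows give an empty string and two rows give a two-line string,
# so a plain string comparison covers both failure shapes.
assert_single_relation() {
    output=$1
    expected=$2
    if [ "$output" = "$expected" ]; then
        echo "ok: one $expected row"
    else
        echo "fail: expected exactly one $expected row, got: $output" >&2
        return 1
    fi
}

# In a real run the first argument would be the captured output of the
# sqlite3 validation command rather than a literal.
assert_single_relation "related_to" "related_to"
```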
+
+## Assertions
+
+- the agent used the bundled CLI for entry creation and relation creation
+- the relation is durable in the packaged SQLite DB, not just mentioned in the summary
+- both entries remain independently inspectable through `events`
+
+## Cleanup
+
+- keep the temporary DB and repo on failure
+- remove temporary artifacts on success only if replay evidence is not needed
@@ -0,0 +1,71 @@
+# Verify Downgrade After File Change Through Bundled CLI
+
+## Test Type
+
+- forward skill execution
+
+## Purpose
+
+- validate that a single agent can use `skills/repo-memory/` to record
+  confirmed knowledge with a hard file dependency, change that file, run
+  `verify`, and observe the expected `needs_review` downgrade
+
+## Preconditions
+
+- `skills/repo-memory/assets/repo-memory` exists and is executable
+- the test runner can create a temporary Git repository fixture
+- the repository fixture contains one evidence file committed in Git before the
+  agent starts
+- the test runner can modify the evidence file after the entry is recorded
+
+## Inputs
+
+- `SKILL_PATH=/.../skills/repo-memory`
+- `TMPDIR=/tmp/...`
+- `DB_PATH=TMPDIR/repo-memory.db`
+- `REPO_PATH=TMPDIR/repo-fixture`
+- `EVIDENCE_PATH=REPO_PATH/foo.txt`
+
+## Execution Parameters
+
+- one agent only
+- per-agent timeout: `3m`
+- overall timeout: `4m`
+
+## Execution Steps
+
+1. Create a temporary Git repository fixture under `REPO_PATH`.
+2. Commit one evidence file at `EVIDENCE_PATH`.
+3. Ask the agent to use `$repo-memory` against `DB_PATH`.
+4. Have the agent add one `confirmed` entry that depends on `EVIDENCE_PATH`.
+5. Mutate `EVIDENCE_PATH` after the entry is recorded.
+6. Have the agent run `verify`, then inspect the result with `list` and
+   `events`.
+7. Capture the agent summary and the final entry status it reports.
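
The fixture side of steps 1, 2, and 5 can be sketched as below. The file content, the committer identity, and the `CASE_TMPDIR` name are illustrative assumptions. The case does not say whether `verify` compares against the working tree or a commit, so this sketch leaves the change uncommitted.

```shell
#!/bin/sh
set -eu

CASE_TMPDIR=$(mktemp -d)
REPO_PATH="$CASE_TMPDIR/repo-fixture"
EVIDENCE_PATH="$REPO_PATH/foo.txt"

# Steps 1-2: create the fixture and commit the evidence file.
mkdir -p "$REPO_PATH"
git -C "$REPO_PATH" init -q
printf 'original evidence\n' >"$EVIDENCE_PATH"
git -C "$REPO_PATH" add foo.txt
git -C "$REPO_PATH" -c user.name="case-runner" \
    -c user.email="case-runner@example.invalid" \
    commit -q -m "add evidence file"

# Steps 3-4 (the agent records its confirmed entry) would happen here.

# Step 5: mutate the evidence file after the entry is recorded.
printf 'changed evidence\n' >>"$EVIDENCE_PATH"

# Confirm the file really differs from the committed version, so that any
# downgrade is driven by repository state.
if git -C "$REPO_PATH" diff --quiet -- foo.txt; then
    echo "evidence file unchanged" >&2
    exit 1
fi
echo "evidence file modified"
```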
+
+## Validation Commands
+
+Run these from the main thread after the agent stops:
+
+```bash
+"$SKILL_PATH"/assets/repo-memory verify --db "$DB_PATH" --repo "$REPO_PATH"
+"$SKILL_PATH"/assets/repo-memory list --db "$DB_PATH" --repo "$REPO_PATH" --status needs_review
+"$SKILL_PATH"/assets/repo-memory events --db "$DB_PATH" --id 1
+```
+
+## Expected Outcomes
+
+- `verify` reports one downgraded entry
+- `list` returns the target entry in `needs_review`
+- `events` includes a `downgraded` event for the target entry
+
+## Assertions
+
+- the agent used the bundled CLI for both the write and the verification flow
+- the downgrade reason is driven by real repository state, not by chat-only reasoning
+- the final state transition is visible both in the current listing and the event history
+
+## Cleanup
+
+- keep the temporary DB and repo on failure
+- remove temporary artifacts on success only if replay evidence is not needed
@@ -0,0 +1,70 @@
+# Verify Stale Missing Hard Dependency Through Bundled CLI
+
+## Test Type
+
+- forward skill execution
+
+## Purpose
+
+- validate that a single agent can use `skills/repo-memory/` to record
+  confirmed knowledge with a missing hard dependency, run `verify`, and observe
+  the expected `stale` outcome
+
+## Preconditions
+
+- `skills/repo-memory/assets/repo-memory` exists and is executable
+- the test runner can create a temporary Git repository fixture
+- the repository fixture has a valid Git HEAD before verification starts
+- the hard dependency path referenced by the entry does not exist
+
+## Inputs
+
+- `SKILL_PATH=/.../skills/repo-memory`
+- `TMPDIR=/tmp/...`
+- `DB_PATH=TMPDIR/repo-memory.db`
+- `REPO_PATH=TMPDIR/repo-fixture`
+- `MISSING_PATH=REPO_PATH/missing.txt`
+
+## Execution Parameters
+
+- one agent only
+- per-agent timeout: `3m`
+- overall timeout: `4m`
+
+## Execution Steps
+
+1. Create a temporary Git repository fixture under `REPO_PATH` and ensure it
+   has an initial commit.
+2. Ask the agent to use `$repo-memory` against `DB_PATH`.
+3. Have the agent add one `confirmed` entry that declares `MISSING_PATH` as a
+   hard dependency.
+4. Have the agent run `verify`, then inspect the result with `list` and
+   `events`.
+5. Capture the agent summary and the final entry status it reports.
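
Step 1 and the two fixture preconditions can be sketched as below. The committer identity and the `CASE_TMPDIR` name are illustrative assumptions; an empty commit is used only because it is the smallest way to give the fixture a valid HEAD.

```shell
#!/bin/sh
set -eu

CASE_TMPDIR=$(mktemp -d)
REPO_PATH="$CASE_TMPDIR/repo-fixture"
MISSING_PATH="$REPO_PATH/missing.txt"

# Step 1: fixture with an initial commit.
mkdir -p "$REPO_PATH"
git -C "$REPO_PATH" init -q
git -C "$REPO_PATH" -c user.name="case-runner" \
    -c user.email="case-runner@example.invalid" \
    commit -q --allow-empty -m "initial fixture commit"

# Precondition: the repository has a valid HEAD.
git -C "$REPO_PATH" rev-parse -q --verify HEAD >/dev/null

# Precondition: the hard dependency path must not exist.
if [ -e "$MISSING_PATH" ]; then
    echo "precondition violated: $MISSING_PATH exists" >&2
    exit 1
fi
echo "fixture ready"
```

Checking both preconditions before the agent starts helps separate a genuine `stale` result from a failure caused by a malformed fixture.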
+
+## Validation Commands
+
+Run these from the main thread after the agent stops:
+
+```bash
+"$SKILL_PATH"/assets/repo-memory verify --db "$DB_PATH" --repo "$REPO_PATH"
+"$SKILL_PATH"/assets/repo-memory list --db "$DB_PATH" --repo "$REPO_PATH" --status stale
+"$SKILL_PATH"/assets/repo-memory events --db "$DB_PATH" --id 1
+```
+
+## Expected Outcomes
+
+- `verify` reports one stale entry
+- `list` returns the target entry in `stale`
+- `events` includes a `marked_stale` event for the target entry
+
+## Assertions
+
+- the agent used the bundled CLI for the full verify flow
+- the stale result is driven by the missing hard dependency, not by a generic command failure
+- the final state is visible in both current listing output and event history
+
+## Cleanup
+
+- keep the temporary DB and repo on failure
+- remove temporary artifacts on success only if replay evidence is not needed