
Repo Memory Skill Test Plan

Purpose

This directory tracks human-readable test plans for the skills/repo-memory/ Codex skill bundle.

These documents are not direct CLI command-contract specs for repo-memory. That coverage now lives under ../repo-memory/.

These documents are also not package-level unit tests for the runtime. Those live under packages/repo-memory-runtime/.

This directory covers a different surface:

  • whether an agent can actually use the packaged repo-memory skill
  • whether the bundled ./assets/repo-memory CLI works inside real skill-guided repository work
  • whether durable repository knowledge is stored and retrieved correctly

Test Model

  • README.md is the index for this directory
  • each skill test case lives in its own Markdown file
  • use stable case slugs in filenames

Shared Execution Contract

Use these defaults unless a case file explicitly overrides them:

  • run the scenario with one real agent using the bundled repo-memory skill
  • create an isolated temporary directory, repository fixture, and SQLite DB path
  • require the agent to use the bundled ./assets/repo-memory CLI instead of ad hoc notes
  • validate the final database state from the main thread, independently of the agent, after the agent stops
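The isolation and independent-validation defaults above can be sketched in a few lines of Python. This is a minimal illustration, not the harness itself: the `repo-memory-case-` prefix, the `repo` and `memory.sqlite` names, and both function names are assumptions, and the fixture here is a plain directory rather than an initialized repository. The read-only check lists table names only, because this README does not document the database schema.

```python
import sqlite3
import tempfile
from pathlib import Path


def make_workspace() -> dict:
    """Create a fresh temp directory with a repo fixture and a DB path."""
    tmpdir = Path(tempfile.mkdtemp(prefix="repo-memory-case-"))
    repo_path = tmpdir / "repo"          # hypothetical fixture location
    repo_path.mkdir()
    (repo_path / "README.md").write_text("fixture repository\n", encoding="utf-8")
    db_path = tmpdir / "memory.sqlite"   # hypothetical filename; not created here
    return {"TMPDIR": tmpdir, "REPO_PATH": repo_path, "DB_PATH": db_path}


def list_tables(db_path: Path) -> list:
    """Open the DB read-only and list table names without assuming a schema."""
    conn = sqlite3.connect(f"{db_path.resolve().as_uri()}?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Opening the database with `mode=ro` means the validation step cannot alter the state it is inspecting, and it fails loudly if the agent never created the file.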

How An Agent Runs These Cases

Use one test-runner agent to execute each case.

The test-runner agent is responsible for:

  • reading this README.md first, then one specific case file
  • creating an isolated temporary directory, repository fixture, and SQLite DB path
  • injecting skills/repo-memory/ into the role agent
  • passing the concrete SKILL_PATH, TMPDIR, DB_PATH, and REPO_PATH values from the case file
  • requiring the role agent to use the bundled ./assets/repo-memory CLI instead of free-form notes
  • collecting the role agent's final summary as evidence
  • running the case's Validation Commands from the main thread after the role agent stops
  • comparing the observed results against Expected Outcomes and Assertions
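As a rough sketch of the validation step listed above, the test-runner could run each of a case's Validation Commands as a subprocess and keep the raw output for later comparison. The function name, the 60-second per-command timeout, and the result layout are assumptions made for illustration. The commands themselves come from the case file, so no repo-memory flags are assumed here.

```python
import subprocess
import sys


def run_validation_commands(commands, cwd=None, timeout_s=60):
    """Run each argv list in order and record its exit code and output."""
    results = []
    for argv in commands:
        proc = subprocess.run(
            argv, cwd=cwd, capture_output=True, text=True, timeout=timeout_s
        )
        results.append(
            {
                "argv": list(argv),
                "returncode": proc.returncode,
                "stdout": proc.stdout,
                "stderr": proc.stderr,
            }
        )
    return results


if __name__ == "__main__":
    # Stand-in command; a real case supplies its own Validation Commands.
    demo = [[sys.executable, "-c", "print('ok')"]]
    for result in run_validation_commands(demo):
        print(result["returncode"], result["stdout"].strip())
```

Keeping the exit code separate from the output lets a case that expects a failure assert on a non-zero code without treating it as a harness error.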

The role agent is responsible for:

  • acting only within the case scope
  • using the injected repo-memory skill rather than ad hoc repository discovery
  • coordinating through the bundled CLI and SQLite DB
  • reporting concrete keys, entry ids, and final observed state back to the test-runner agent

Default Timeouts

Use these defaults unless a case file explicitly overrides them:

  • per-agent timeout: 3m
  • overall scenario timeout: 4m
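One way to read the two defaults together is that an agent may run for at most 3 minutes, and never past the 4-minute scenario deadline. That interaction is an assumption, not something this README states, and the helper name below is hypothetical.

```python
PER_AGENT_TIMEOUT_S = 3 * 60   # per-agent timeout: 3m
SCENARIO_TIMEOUT_S = 4 * 60    # overall scenario timeout: 4m


def agent_budget(
    elapsed_s: float,
    per_agent_s: float = PER_AGENT_TIMEOUT_S,
    scenario_s: float = SCENARIO_TIMEOUT_S,
) -> float:
    """Seconds an agent started now may run without exceeding either limit."""
    remaining = scenario_s - elapsed_s
    return max(0.0, min(per_agent_s, remaining))
```

For example, an agent started 100 seconds into the scenario would get 140 seconds, because the scenario deadline is closer than the per-agent limit.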

Default Failure Conditions

Treat the test as failed if any of the following happens:

  • the role agent does not reach a final state before timeout
  • a required bundled CLI command returns a non-success result unless the case expects that failure
  • the final repo-memory DB state conflicts with the documented assertions
  • the role agent falls back to free-form notes for durable knowledge that should go through the bundled CLI
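The four failure conditions above can be expressed as a small check over what the test-runner observed. The `RunObservation` fields and the `failure_reasons` function are hypothetical names for this sketch. How each observation is gathered (for example, how free-form notes are detected) is left to the harness.

```python
from dataclasses import dataclass, field


@dataclass
class RunObservation:
    reached_final_state: bool                 # agent finished before timeout
    unexpected_cli_failures: list = field(default_factory=list)
    db_assertion_mismatches: list = field(default_factory=list)
    used_free_form_notes: bool = False


def failure_reasons(obs: RunObservation) -> list:
    """Return one message per violated condition; an empty list means pass."""
    reasons = []
    if not obs.reached_final_state:
        reasons.append("role agent did not reach a final state before timeout")
    for cmd in obs.unexpected_cli_failures:
        reasons.append(f"unexpected CLI failure: {cmd}")
    for mismatch in obs.db_assertion_mismatches:
        reasons.append(f"DB state conflicts with assertion: {mismatch}")
    if obs.used_free_form_notes:
        reasons.append("durable knowledge recorded outside the bundled CLI")
    return reasons
```

CLI failures that a case expects are simply left out of `unexpected_cli_failures`, which mirrors the "unless the case expects that failure" exception.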

Evidence Capture

Collect at least the following artifacts for every run:

  • the role agent's final summary
  • the temporary DB path and repository path
  • the outputs of the case's Validation Commands
  • any resolved entry ids, keys, or relation rows needed to verify the case
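A single serializable record is one possible way to keep these artifacts together. The field names and the JSON layout below are assumptions for illustration. This README does not prescribe an evidence format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Evidence:
    case_slug: str
    agent_summary: str                    # the role agent's final summary
    db_path: str
    repo_path: str
    validation_outputs: list = field(default_factory=list)
    resolved_ids: dict = field(default_factory=dict)  # entry ids, keys, relation rows


def to_json(evidence: Evidence) -> str:
    """Serialize the record with stable key order so runs diff cleanly."""
    return json.dumps(asdict(evidence), indent=2, sort_keys=True)
```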

Cleanup Policy

Use these defaults unless a case file explicitly overrides them:

  • keep the temporary DB and repo fixture on failure for debugging
  • clean up on success only if replay artifacts are not needed
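The default cleanup policy reduces to one decision, sketched below. The sketch assumes the DB and repo fixture both live under a single temporary directory, and the function and parameter names are hypothetical.

```python
import shutil
from pathlib import Path


def apply_cleanup_policy(
    tmpdir: Path, passed: bool, keep_replay_artifacts: bool = False
) -> bool:
    """Remove tmpdir only for a passing run whose artifacts are not needed.

    Returns True if the directory was removed.
    """
    if passed and not keep_replay_artifacts:
        shutil.rmtree(tmpdir)
        return True
    return False  # failed runs and replay runs keep everything for debugging
```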

Per-Case Template

Each case file should use this structure:

  • Test Type
  • Purpose
  • Preconditions
  • Inputs
  • Execution Parameters
  • Execution Steps
  • Validation Commands
  • Expected Outcomes
  • Assertions
  • Cleanup
  • Recorded Example Run (only when a real run has already been captured)
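A case file can be checked against this template mechanically. The sketch below assumes each section title sits on its own line, optionally prefixed with Markdown `#` markers. The real heading style of the case files is not shown in this README, so treat the matching rule as a placeholder.

```python
REQUIRED_SECTIONS = [
    "Test Type",
    "Purpose",
    "Preconditions",
    "Inputs",
    "Execution Parameters",
    "Execution Steps",
    "Validation Commands",
    "Expected Outcomes",
    "Assertions",
    "Cleanup",
]
OPTIONAL_SECTIONS = ["Recorded Example Run"]  # present only after a captured run


def missing_sections(case_text: str) -> list:
    """Return required section titles that never appear as a heading line."""
    headings = {line.lstrip("#").strip() for line in case_text.splitlines()}
    return [name for name in REQUIRED_SECTIONS if name not in headings]
```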

Case Files

| Case Slug | File | Coverage Note |
| --- | --- | --- |
| search-and-add-through-bundled-cli | search-and-add-through-bundled-cli.md | validates that an agent can miss on search, add one durable entry, then retrieve it through the packaged repo-memory skill |
| ingest-and-search-through-bundled-cli | ingest-and-search-through-bundled-cli.md | validates that an agent can ingest docs/ai markdown through the bundled CLI and then retrieve imported knowledge through search and list |
| verify-downgrade-after-file-change-through-bundled-cli | verify-downgrade-after-file-change-through-bundled-cli.md | validates that an agent can record confirmed knowledge, mutate the tracked file, run verify, and observe a needs_review downgrade |
| verify-stale-missing-hard-dependency-through-bundled-cli | verify-stale-missing-hard-dependency-through-bundled-cli.md | validates that an agent can detect a missing hard dependency through verify and observe a stale result |
| link-two-entries-through-bundled-cli | link-two-entries-through-bundled-cli.md | validates that an agent can add two entries, link them, and leave a durable relation in the packaged repo-memory database |
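Every row in the index above pairs a slug with a file named `<slug>.md`. A small consistency check along these lines could keep the index and the filenames in step. The slug pattern (lowercase letters, digits, and single hyphens) is inferred from the listed cases and is an assumption, as is the function name.

```python
import re

# Assumed slug shape, inferred from the cases listed in the index.
SLUG_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def index_mismatches(rows):
    """Return (slug, filename) rows with a malformed slug or a file that is not <slug>.md."""
    bad = []
    for slug, filename in rows:
        if not SLUG_PATTERN.fullmatch(slug) or filename != f"{slug}.md":
            bad.append((slug, filename))
    return bad
```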

Scope

In scope:

  • explicit $repo-memory skill invocation
  • bundled ./assets/repo-memory CLI usage
  • durable knowledge add/search/list/event flows
  • markdown ingest through docs/ai
  • verify downgrade and stale transitions
  • entry relation/link flows
  • package-backed SQLite memory database behavior as surfaced through the skill

Out of scope:

  • direct CLI contract coverage that now belongs under ../repo-memory/
  • package-level unit tests for packages/repo-memory-runtime
  • future auto-export flows such as repo-brief generation
  • implicit skill triggering without $repo-memory