Verify Downgrade After File Change Through Bundled CLI

Test Type

  • forward skill execution

Purpose

  • validate that a single agent can use skills/repo-memory/ to record confirmed knowledge with a hard file dependency, change that file, run verify, and observe the expected needs_review downgrade

Preconditions

  • skills/repo-memory/assets/repo-memory exists and is executable
  • the test runner can create a temporary Git repository fixture
  • the repository fixture contains one evidence file committed in Git before the agent starts
  • the test runner can modify the evidence file during the scenario, after the entry is recorded and before verify runs
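
The first two preconditions can be checked before the agent starts. This is a minimal sketch assuming a POSIX shell; the function name check_preconditions is hypothetical and not part of the skill.

```shell
# Hypothetical helper: returns 0 only if the bundled CLI is executable
# and git is available for building the repository fixture.
check_preconditions() {
  skill_path="$1"
  if [ ! -x "$skill_path/assets/repo-memory" ]; then
    echo "missing or non-executable: $skill_path/assets/repo-memory" >&2
    return 1
  fi
  if ! command -v git >/dev/null 2>&1; then
    echo "git not found on PATH" >&2
    return 1
  fi
  return 0
}
```

Calling check_preconditions "$SKILL_PATH" at the top of the runner lets the test fail fast with a clear message instead of failing inside the agent session.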

Inputs

  • SKILL_PATH=/.../skills/repo-memory
  • TMPDIR=/tmp/...
  • DB_PATH=$TMPDIR/repo-memory.db
  • REPO_PATH=$TMPDIR/repo-fixture
  • EVIDENCE_PATH=$REPO_PATH/foo.txt
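
The derived inputs could be set up as follows. This is a sketch under assumptions: WORK_DIR is a hypothetical name standing in for the spec's TMPDIR, and SKILL_PATH is left to the caller because the spec elides its prefix.

```shell
# WORK_DIR stands in for the spec's TMPDIR; a separate name avoids clobbering
# the standard TMPDIR environment variable that mktemp itself consults.
WORK_DIR="$(mktemp -d)"
DB_PATH="$WORK_DIR/repo-memory.db"
REPO_PATH="$WORK_DIR/repo-fixture"
EVIDENCE_PATH="$REPO_PATH/foo.txt"

# SKILL_PATH is expected to be exported by the caller; it is not guessed here.
printf 'DB_PATH=%s\nREPO_PATH=%s\nEVIDENCE_PATH=%s\n' \
  "$DB_PATH" "$REPO_PATH" "$EVIDENCE_PATH"
```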

Execution Parameters

  • one agent only
  • per-agent timeout: 3m
  • overall timeout: 4m

Execution Steps

  1. Create a temporary Git repository fixture under REPO_PATH.
  2. Commit one evidence file at EVIDENCE_PATH.
  3. Ask the agent to use $repo-memory against DB_PATH.
  4. Have the agent add one confirmed entry that depends on EVIDENCE_PATH.
  5. Mutate EVIDENCE_PATH after the entry is recorded.
  6. Have the agent run verify, then inspect the result with list and events.
  7. Capture the agent summary and the final entry status it reports.
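
Steps 1, 2, and 5 are plain Git and file operations, so they can be sketched without the skill itself. The file contents and commit identity below are illustrative assumptions. The spec does not say whether verify compares against the working tree or a commit, so this sketch leaves the change uncommitted.

```shell
# Steps 1-2: create the fixture and commit the evidence file.
REPO_PATH="$(mktemp -d)/repo-fixture"
EVIDENCE_PATH="$REPO_PATH/foo.txt"
mkdir -p "$REPO_PATH"
git -C "$REPO_PATH" init -q
printf 'original evidence\n' > "$EVIDENCE_PATH"
git -C "$REPO_PATH" add foo.txt
# Inline identity so the commit works on machines without a global git config.
git -C "$REPO_PATH" -c user.name="fixture" -c user.email="fixture@example.invalid" \
  commit -q -m "add evidence file"

# Steps 3-4 happen here: the agent records the confirmed entry.

# Step 5: mutate the evidence file after the entry is recorded.
printf 'changed evidence\n' >> "$EVIDENCE_PATH"

# Show that Git now sees the evidence file as modified.
git -C "$REPO_PATH" status --porcelain
```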

Validation Commands

Run these from the main thread after the agent stops:

"$SKILL_PATH/assets/repo-memory" verify --db "$DB_PATH" --repo "$REPO_PATH"
"$SKILL_PATH/assets/repo-memory" list --db "$DB_PATH" --repo "$REPO_PATH" --status needs_review
"$SKILL_PATH/assets/repo-memory" events --db "$DB_PATH" --id 1

Expected Outcomes

  • verify reports one downgraded entry
  • list returns the target entry in needs_review
  • events includes a downgraded event for the target entry
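
The list and events outcomes could be checked mechanically along these lines. The spec does not define the CLI's output format, so this sketch assumes the status name and the event name appear as plain text in the output. The helper names expect_contains and validate_downgrade are hypothetical.

```shell
# Hypothetical helper: fail loudly when captured output lacks a substring.
expect_contains() {
  label="$1"; output="$2"; needle="$3"
  case "$output" in
    *"$needle"*) echo "ok: $label" ;;
    *) echo "FAIL: $label (no '$needle' in output)" >&2; return 1 ;;
  esac
}

# Assumption: status and event names are printed verbatim by the CLI.
# Requires SKILL_PATH, DB_PATH, and REPO_PATH to be set by the runner.
validate_downgrade() {
  cli="$SKILL_PATH/assets/repo-memory"
  expect_contains "list" \
    "$("$cli" list --db "$DB_PATH" --repo "$REPO_PATH" --status needs_review)" \
    "needs_review" &&
  expect_contains "events" \
    "$("$cli" events --db "$DB_PATH" --id 1)" \
    "downgraded"
}
```

If the CLI offers a structured output mode, matching on parsed fields would be more robust than substring checks.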

Assertions

  • the agent used the bundled CLI for both the write and the verification flow
  • the downgrade reason is driven by real repository state, not by chat-only reasoning
  • the final state transition is visible both in the current listing and the event history

Cleanup

  • keep the temporary DB and repo on failure
  • remove temporary artifacts on success only if replay evidence is not needed
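
The cleanup policy can be expressed as a small function. This is a sketch only; cleanup_fixture and its keep_evidence argument are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: cleanup_fixture <work_dir> <exit_status> [keep_evidence]
# Keeps artifacts on failure, or on success when keep_evidence is 1.
cleanup_fixture() {
  work_dir="$1"; status="$2"; keep="${3:-0}"
  # Refuse to act on an empty path.
  [ -n "$work_dir" ] || return 1
  if [ "$status" -ne 0 ] || [ "$keep" = "1" ]; then
    echo "kept: $work_dir"
    return 0
  fi
  rm -rf -- "$work_dir"
  echo "removed: $work_dir"
}
```

For example, cleanup_fixture "$WORK_DIR" "$?" 1 after a passing run keeps the DB and repo for replay.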