# Harness Engineering
Cadence UI already has good validation primitives. Harness engineering makes those primitives
agent-usable by turning the repo into a system that can explain itself, accept explicit execution
plans, and expose repeatable machine-runnable feedback loops.
## What it means in this repo
For Cadence UI, harness engineering means:
- the repository has clear system-of-record files for architecture, contracts, and release rules
- non-trivial work starts with an execution plan checked into git
- validation is exposed as stable suites that humans, agents, and CI can run the same way
- known gaps are recorded explicitly instead of being rediscovered by every new task

This follows the direction described in OpenAI's harness engineering guidance: repositories should
be more legible to agents, use explicit execution plans, and prefer faster feedback loops over
purely ad hoc prompting.
## System of Record
Agents and contributors should treat these files as the repository's baseline knowledge:
- `DESIGN.md`: active visual language, dynamic color direction, and motion rules
- `README.md`: repo purpose, workspace layout, distribution modes, and QA surface
- `CONTRIBUTING.md`: component contract, review expectations, and definition of done
- `roadmap.md`: current system direction and planned component work
- `packages/ui/src/lib/contracts.ts`: public authoring contract for components
- `apps/docs/src/component-authoring.stories.tsx`: review surface for authoring rules
- `docs/registry.md`: source-copy registry contract
- `docs/releasing.md`: package release contract
- `docs/rfcs/*`: design decisions that should not be silently bypassed
- `AGENTS.md`: agent operating mode for this repository
## Execution Plans
Non-trivial changes should start with an execution plan under `docs/exec-plans/`.
Use an execution plan when the work:
- touches multiple repo surfaces such as `packages/ui`, `apps/docs`, `tests`, or `registry`
- changes public component contracts or release behavior
- introduces new dependencies, workflows, or automation
- is large enough that another engineer or agent may need to resume it later

Plans should state:
- the problem or goal
- constraints and non-goals
- affected surfaces and likely files
- validation suites to run
- a status log with concrete checkpoints
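
As a rough illustration of the checklist above, a plan can be modeled as a typed record and checked for empty fields before work starts. The `ExecutionPlan` shape, its field names, and the `missingPlanFields` helper are hypothetical and not part of the repository; real plans are files under `docs/exec-plans/`, and the sample values are made up.

```typescript
// Hypothetical model of an execution plan; field names are illustrative only.
interface StatusEntry {
  date: string;
  note: string;
}

interface ExecutionPlan {
  goal: string;
  constraintsAndNonGoals: string[];
  affectedSurfaces: string[];
  validationSuites: string[];
  statusLog: StatusEntry[];
}

type PlanField = keyof ExecutionPlan;

const PLAN_FIELDS: PlanField[] = [
  "goal",
  "constraintsAndNonGoals",
  "affectedSurfaces",
  "validationSuites",
  "statusLog",
];

// Returns the fields that are still empty, in checklist order.
function missingPlanFields(plan: ExecutionPlan): PlanField[] {
  return PLAN_FIELDS.filter((field) => {
    const value = plan[field];
    return typeof value === "string" ? value.trim() === "" : value.length === 0;
  });
}

// Made-up draft plan used only to exercise the helper.
const draft: ExecutionPlan = {
  goal: "Add a Tooltip component",
  constraintsAndNonGoals: ["no new runtime dependencies"],
  affectedSurfaces: ["packages/ui", "apps/docs"],
  validationSuites: [],
  statusLog: [],
};

const missing = missingPlanFields(draft);
```

A check like this would only confirm that each section exists; whether the content is adequate remains a review decision.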
## Validation Suites
Harness validation is exposed through `scripts/harness/validate.mjs` and root `pnpm` scripts.
Primary suites:
- `pnpm harness:validate:static`
  - lint and workspace typecheck for general repo changes
- `pnpm harness:validate:component`
  - lint, typecheck, and unit coverage for normal component work
- `pnpm harness:validate:docs`
  - Storybook build for docs-surface changes
- `pnpm harness:validate:a11y`
  - curated Storybook accessibility validation from the shared browser coverage contract
- `pnpm harness:validate:docs-smoke`
  - Playwright smoke coverage for high-value Storybook flows
- `pnpm harness:validate:consumers`
  - registry metadata plus consumer smoke validation
- `pnpm harness:validate:pr`
  - baseline pull request gate for packages, docs build, and consumer surfaces
- `pnpm harness:validate:release`
  - full release gate, including browser-driven smoke coverage
- `pnpm harness:validate:changed`
  - selects suites from git diff or working tree changes before validating

Each run writes a JSON report to `.artifacts/harness/<suite>.json`.

GitHub Actions uploads the generated harness, a11y, and browser test artifacts from `.artifacts/`
so CI failures stay debuggable after the run finishes.
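
The report location can be derived from the suite name. The sketch below assumes the suite names listed in this document map one-to-one onto report files; `changed` is left out because the document does not say which file name that selector run uses. `REPORTING_SUITES`, `isReportingSuite`, and `reportPath` are hypothetical helpers, not exports of `scripts/harness/validate.mjs`.

```typescript
// Suite names taken from the list in this document (assumed to match report file names).
const REPORTING_SUITES = [
  "static",
  "component",
  "docs",
  "a11y",
  "docs-smoke",
  "consumers",
  "pr",
  "release",
] as const;

type ReportingSuite = (typeof REPORTING_SUITES)[number];

// Type guard so arbitrary CLI input can be narrowed to a known suite name.
function isReportingSuite(name: string): name is ReportingSuite {
  return REPORTING_SUITES.some((suite) => suite === name);
}

// Builds the documented report path `.artifacts/harness/<suite>.json`.
function reportPath(suite: string): string {
  if (!isReportingSuite(suite)) {
    throw new Error(`Unknown harness suite: ${suite}`);
  }
  return `.artifacts/harness/${suite}.json`;
}

const a11yReport = reportPath("a11y");
```

Rejecting unknown names early keeps a typo from silently pointing a debugging session at a file that no run ever wrote.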
## Working Loop
Recommended change loop:
1. Read the relevant system-of-record files.
2. Create or update an execution plan when the change is non-trivial.
3. Modify the smallest surface that can prove the change.
4. Run the narrowest useful harness suite first.
5. Escalate to broader suites before merge.
6. Record any skipped checks or known failures in the execution plan or PR.
## Diff-aware Selection
Harness selection is exposed through `pnpm harness:select` and `pnpm harness:validate:changed`.
- `pnpm harness:select`
  - reads the current working tree diff by default
- `pnpm harness:select -- --from <ref> --to <ref>`
  - selects suites from an explicit git range
- `pnpm harness:validate:changed`
  - runs the selected suites with a JSON report

The CI workflow uses `pnpm harness:validate:changed` as its execution entrypoint. Selection is no
longer advisory-only: harness-risk and release-risk surfaces escalate directly to the broader `pr`
or `release` gates instead of falling back to `static`.

Selection intentionally maps repo surfaces to validation surfaces:
- package source changes select `component`, `docs`, `docs-smoke`, and `consumers`
- docs/story changes select `static`, `docs`, and `docs-smoke`
- registry/consumer changes select `static` and `consumers`
- doc-only or metadata-only changes may select no suites

High-risk control-plane changes override those narrow mappings:
- `scripts/harness/*` and other harness-control files escalate to `pr`
- release workflows, release docs, registry docs, and consumer/release scripts escalate to `release`
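
The mapping and override rules above can be sketched as a small rule table. This is an illustration under stated assumptions, not the repository's selector: the path patterns (in particular `.github/workflows/release` and the `registry/` prefix), the `Rule` type, and `selectSuites` are hypothetical, and the sketch assumes an escalation replaces the narrow suites with a single gate, with `release` taking precedence over `pr`.

```typescript
type Suite =
  | "static"
  | "component"
  | "docs"
  | "docs-smoke"
  | "consumers"
  | "pr"
  | "release";

interface Rule {
  pattern: RegExp;
  suites: Suite[];
}

// Assumed path patterns for high-risk control-plane files.
const ESCALATIONS: Rule[] = [
  { pattern: /^scripts\/harness\//, suites: ["pr"] },
  { pattern: /^\.github\/workflows\/release/, suites: ["release"] },
  { pattern: /^docs\/(releasing|registry)\.md$/, suites: ["release"] },
];

// Assumed path patterns for the narrow surface mappings.
const SURFACES: Rule[] = [
  { pattern: /^packages\/ui\/src\//, suites: ["component", "docs", "docs-smoke", "consumers"] },
  { pattern: /^apps\/docs\//, suites: ["static", "docs", "docs-smoke"] },
  { pattern: /^registry\//, suites: ["static", "consumers"] },
];

// Collects the suites of every rule matched by any changed file, without duplicates.
function matchSuites(rules: Rule[], changedFiles: string[]): Suite[] {
  const found: Suite[] = [];
  for (const file of changedFiles) {
    for (const rule of rules) {
      if (!rule.pattern.test(file)) continue;
      for (const suite of rule.suites) {
        if (found.indexOf(suite) === -1) found.push(suite);
      }
    }
  }
  return found;
}

// Escalations win over narrow mappings; unmatched files select nothing.
function selectSuites(changedFiles: string[]): Suite[] {
  const escalated = matchSuites(ESCALATIONS, changedFiles);
  if (escalated.indexOf("release") !== -1) return ["release"];
  if (escalated.indexOf("pr") !== -1) return ["pr"];
  return matchSuites(SURFACES, changedFiles);
}
```

With these assumed patterns, a change limited to `README.md` selects no suites, while any diff that touches `scripts/harness/validate.mjs` resolves to the `pr` gate regardless of what else changed.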
## Browser Coverage Contract
Curated browser coverage lives in `tests/e2e/support/story-harness-contract.json`.
That contract is the shared source of truth for:
- `pnpm harness:validate:a11y`
- `pnpm harness:validate:docs-smoke`

Each entry records the story id, why it is covered, which suites own it, and any required
interaction scenario keys. When a new high-value review surface should participate in browser or
accessibility validation, update this contract instead of adding one-off story lists in multiple
scripts.
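
To make the single-source-of-truth idea concrete, the sketch below models a contract entry and derives each suite's story list from the same data. The key names (`storyId`, `reason`, `suites`, `scenarios`), the sample story ids, and the `storiesForSuite` helper are assumptions for illustration; the actual keys in `story-harness-contract.json` may differ.

```typescript
type BrowserSuite = "a11y" | "docs-smoke";

// Hypothetical entry shape mirroring the fields described above.
interface StoryContractEntry {
  storyId: string;
  reason: string;
  suites: BrowserSuite[];
  scenarios?: string[];
}

// Derives the story ids owned by one suite from the shared contract.
function storiesForSuite(
  entries: StoryContractEntry[],
  suite: BrowserSuite,
): string[] {
  return entries
    .filter((entry) => entry.suites.indexOf(suite) !== -1)
    .map((entry) => entry.storyId);
}

// Made-up sample entries; these story ids are not taken from the repository.
const sampleContract: StoryContractEntry[] = [
  {
    storyId: "components-button--default",
    reason: "baseline interactive control",
    suites: ["a11y", "docs-smoke"],
  },
  {
    storyId: "components-dialog--open",
    reason: "focus management flow",
    suites: ["docs-smoke"],
    scenarios: ["open-close"],
  },
];

const a11yStories = storiesForSuite(sampleContract, "a11y");
const smokeStories = storiesForSuite(sampleContract, "docs-smoke");
```

Because both lists come from one array, adding a story to a suite is a single contract edit rather than a change to each script.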
## Worktree Orchestration
Cadence UI also exposes an orchestration wrapper:
- `pnpm harness:orch -- <orch command>`

The wrapper keeps orchestration state under `.artifacts/orch/` and applies worktree defaults for
dispatch. Details live in [docs/orchestration.md](./orchestration.md).
Use `pnpm harness:orch -- doctor` to verify binary discovery and plan parsing on a new machine, and
prefer `dispatch --plan-file <plan>` so execution plans remain the source of truth for generated
task bodies.
## Current Rollout Scope
This repository is adding harness engineering in phases.
Phase 1 established:
- a documented harness workflow
- execution-plan conventions
- shared validation suites
- a pull request workflow that runs the repository harness gate

Phase 2 established:
- change-aware suite selection from git diff
- a stabilized Storybook smoke harness
- worktree-oriented orchestration defaults

Phase 3 hardens:
- authoritative changed-suite execution in CI
- uploaded CI artifacts for harness, a11y, and browser test output
- a shared browser coverage contract
- orchestration preflight and plan-linked dispatch helpers
- harness self-tests for selection, artifact, and orchestration helpers

Future phases can layer on:
- richer browser/app harnesses for interactive review
- deeper plan-to-task automation for orchestration runs
- stronger safety rails for agent-owned automation