Harness Engineering
Cadence UI already has good validation primitives. Harness engineering makes those primitives agent-usable by turning the repo into a system that can explain itself, accept explicit execution plans, and expose repeatable machine-runnable feedback loops.
What it means in this repo
For Cadence UI, harness engineering means:
- the repository has clear system-of-record files for architecture, contracts, and release rules
- non-trivial work starts with an execution plan checked into git
- validation is exposed as stable suites that humans, agents, and CI can run the same way
- known gaps are recorded explicitly instead of being rediscovered by every new task
This follows the direction described in OpenAI's harness engineering guidance: repositories should be more legible to agents, use explicit execution plans, and prefer faster feedback loops over purely ad hoc prompting.
System Of Record
Agents and contributors should treat these files as the repository's baseline knowledge:
- DESIGN.md: active visual language, dynamic color direction, and motion rules
- README.md: repo purpose, workspace layout, distribution modes, and QA surface
- CONTRIBUTING.md: component contract, review expectations, and definition of done
- roadmap.md: current system direction and planned component work
- packages/ui/src/lib/contracts.ts: public authoring contract for components
- apps/docs/src/component-authoring.stories.tsx: review surface for authoring rules
- docs/registry.md: source-copy registry contract
- docs/releasing.md: package release contract
- docs/rfcs/*: design decisions that should not be silently bypassed
- AGENTS.md: agent operating mode for this repository
Execution Plans
Non-trivial changes should start with an execution plan under docs/exec-plans/.
Use an execution plan when the work:
- touches multiple repo surfaces such as packages/ui, apps/docs, tests, or registry
- changes public component contracts or release behavior
- introduces new dependencies, workflows, or automation
- is large enough that another engineer or agent may need to resume it later
Plans should state:
- the problem or goal
- constraints and non-goals
- affected surfaces and likely files
- validation suites to run
- a status log with concrete checkpoints
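The plan checklist above can be read as a small data shape. The sketch below is illustrative only: the `ExecutionPlan` interface, its field names, and the `missingPlanSections` helper are hypothetical and are not part of the repository; they just mirror the five items a plan is expected to state.

```typescript
// Hypothetical shape for an execution plan; field names are assumptions
// chosen to mirror the checklist, not a format the repo defines.
interface ExecutionPlan {
  goal: string;
  constraints: string[];
  nonGoals: string[];
  affectedSurfaces: string[];
  validationSuites: string[];
  statusLog: { checkpoint: string; done: boolean }[];
}

// Returns the checklist items that the plan leaves empty.
function missingPlanSections(plan: ExecutionPlan): string[] {
  const missing: string[] = [];
  if (plan.goal.trim() === "") missing.push("goal");
  if (plan.constraints.length === 0 && plan.nonGoals.length === 0) {
    missing.push("constraints and non-goals");
  }
  if (plan.affectedSurfaces.length === 0) missing.push("affected surfaces");
  if (plan.validationSuites.length === 0) missing.push("validation suites");
  if (plan.statusLog.length === 0) missing.push("status log");
  return missing;
}
```

A check like this could run as a review aid before a plan is committed under docs/exec-plans/, though the actual plans may well be free-form prose files.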
Validation Suites
Harness validation is exposed through scripts/harness/validate.mjs and root pnpm scripts.
Primary suites:
- pnpm harness:validate:static - lint and workspace typecheck for general repo changes
- pnpm harness:validate:component - lint, typecheck, and unit coverage for normal component work
- pnpm harness:validate:docs - Storybook build for docs-surface changes
- pnpm harness:validate:a11y - curated Storybook accessibility validation from the shared browser coverage contract
- pnpm harness:validate:docs-smoke - Playwright smoke coverage for high-value Storybook flows
- pnpm harness:validate:consumers - registry metadata plus consumer smoke validation
- pnpm harness:validate:pr - baseline pull request gate for packages, docs build, and consumer surfaces
- pnpm harness:validate:release - full release gate, including browser-driven smoke coverage
- pnpm harness:validate:changed - selects suites from git diff or working tree changes before validating
Each run writes a JSON report to .artifacts/harness/<suite>.json.
GitHub Actions uploads the generated harness, a11y, and browser test artifacts from .artifacts/
so CI failures stay debuggable after the run finishes.
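As a rough illustration of the report convention, the sketch below derives the report path for a named suite and summarizes a report. Only the `.artifacts/harness/<suite>.json` path pattern and the suite names come from this document; the `HarnessReport` shape and the `summarizeReport` helper are hypothetical, since the real JSON fields written by scripts/harness/validate.mjs are not described here.

```typescript
// Suite names listed in this document (the diff-aware "changed" entrypoint
// is left out because its report name is not specified here).
type HarnessSuite =
  | "static" | "component" | "docs" | "a11y"
  | "docs-smoke" | "consumers" | "pr" | "release";

// Path pattern taken from the document: .artifacts/harness/<suite>.json
function reportPath(suite: HarnessSuite): string {
  return `.artifacts/harness/${suite}.json`;
}

// Hypothetical report shape; the real fields may differ.
interface HarnessReport {
  suite: HarnessSuite;
  steps: { name: string; ok: boolean }[];
}

// Produces a one-line summary naming any failed steps.
function summarizeReport(report: HarnessReport): string {
  const failed = report.steps.filter((s) => !s.ok).map((s) => s.name);
  return failed.length === 0
    ? `${report.suite}: passed`
    : `${report.suite}: failed (${failed.join(", ")})`;
}
```

A stable path per suite is what lets CI upload the directory wholesale and lets an agent find the last result without parsing console output.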
Working Loop
Recommended change loop:
- Read the relevant system-of-record files.
- Create or update an execution plan when the change is non-trivial.
- Modify the smallest surface that can prove the change.
- Run the narrowest useful harness suite first.
- Escalate to broader suites before merge.
- Record any skipped checks or known failures in the execution plan or PR.
Diff-aware Selection
Harness selection is exposed through pnpm harness:select and pnpm harness:validate:changed.
- pnpm harness:select - reads the current working tree diff by default
- pnpm harness:select -- --from <ref> --to <ref> - selects suites from an explicit git range
- pnpm harness:validate:changed - runs the selected suites with a JSON report
The CI workflow uses pnpm harness:validate:changed as its execution entrypoint. Selection is no
longer advisory-only: harness-risk and release-risk surfaces escalate directly to the broader pr
or release gates instead of falling back to static.
Selection intentionally maps repo surfaces to validation surfaces:
- package source changes select component, docs, docs-smoke, and consumers
- docs/story changes select static, docs, and docs-smoke
- registry/consumer changes select static and consumers
- doc-only or metadata-only changes may select no suites
High-risk control-plane changes override those narrow mappings:
- scripts/harness/* and other harness-control files escalate to pr
- release workflows, release docs, registry docs, and consumer/release scripts escalate to release
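The mapping and escalation rules above can be sketched as a small pure function. This is a minimal illustration, not the logic in the real selector: the path prefixes (for example `registry/` and `.github/workflows/release`) are assumptions standing in for whatever patterns the repository actually uses, and treating release as taking precedence over pr is also an assumption.

```typescript
type SelectedSuite =
  | "static" | "component" | "docs" | "docs-smoke" | "consumers" | "pr" | "release";

// Assumed path prefixes; the real selector's patterns are not shown in this document.
const RELEASE_RISK = [".github/workflows/release", "docs/releasing.md", "docs/registry.md"];
const HARNESS_CONTROL = ["scripts/harness/"];
const NARROW_RULES: { prefix: string; suites: SelectedSuite[] }[] = [
  { prefix: "packages/ui/", suites: ["component", "docs", "docs-smoke", "consumers"] },
  { prefix: "apps/docs/", suites: ["static", "docs", "docs-smoke"] },
  { prefix: "registry/", suites: ["static", "consumers"] },
];

function hasPrefix(file: string, prefix: string): boolean {
  return file.slice(0, prefix.length) === prefix;
}

function matchesAny(file: string, prefixes: string[]): boolean {
  return prefixes.some((p) => hasPrefix(file, p));
}

// High-risk surfaces short-circuit to a broad gate; otherwise the narrow
// mappings are unioned in first-seen order. Unmatched files select nothing.
function selectSuites(changedFiles: string[]): SelectedSuite[] {
  if (changedFiles.some((f) => matchesAny(f, RELEASE_RISK))) return ["release"];
  if (changedFiles.some((f) => matchesAny(f, HARNESS_CONTROL))) return ["pr"];
  const selected: SelectedSuite[] = [];
  for (const file of changedFiles) {
    for (const rule of NARROW_RULES) {
      if (!hasPrefix(file, rule.prefix)) continue;
      for (const suite of rule.suites) {
        if (selected.indexOf(suite) === -1) selected.push(suite);
      }
    }
  }
  return selected;
}
```

Under these assumed prefixes, a change to README.md alone selects no suites, while any change under scripts/harness/ escalates to pr regardless of what else is in the diff.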
Browser Coverage Contract
Curated browser coverage lives in tests/e2e/support/story-harness-contract.json.
That contract is the shared source of truth for:
- pnpm harness:validate:a11y
- pnpm harness:validate:docs-smoke
Each entry records the story id, why it is covered, which suites own it, and any required interaction scenario keys. When a new high-value review surface should participate in browser or accessibility validation, update this contract instead of adding one-off story lists in multiple scripts.
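To make the single-source-of-truth idea concrete, the sketch below models a contract entry and derives each suite's story list from it. The field names (`storyId`, `reason`, `suites`, `scenarios`) and the sample story ids are hypothetical; the actual keys in tests/e2e/support/story-harness-contract.json may differ.

```typescript
type BrowserSuite = "a11y" | "docs-smoke";

// Hypothetical entry shape mirroring what each entry is said to record:
// story id, why it is covered, owning suites, and scenario keys.
interface StoryCoverageEntry {
  storyId: string;
  reason: string;
  suites: BrowserSuite[];
  scenarios: string[];
}

// Derives the story ids a given suite should visit from the shared contract.
function storiesForSuite(contract: StoryCoverageEntry[], suite: BrowserSuite): string[] {
  return contract
    .filter((entry) => entry.suites.indexOf(suite) !== -1)
    .map((entry) => entry.storyId);
}

// Illustrative entries only; these story ids are made up.
const sampleContract: StoryCoverageEntry[] = [
  {
    storyId: "components-button--default",
    reason: "core interactive primitive",
    suites: ["a11y", "docs-smoke"],
    scenarios: ["click"],
  },
  {
    storyId: "components-dialog--open",
    reason: "focus management",
    suites: ["a11y"],
    scenarios: [],
  },
];
```

Because both suites would read from the same list, adding a story to one entry's `suites` array is enough to enroll it, with no second list to keep in sync.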
Worktree Orchestration
Cadence UI also exposes an orchestration wrapper:
pnpm harness:orch -- <orch command>
The wrapper keeps orchestration state under .artifacts/orch/ and applies worktree defaults for
dispatch. Details live in docs/orchestration.md.
Use pnpm harness:orch -- doctor to verify binary discovery and plan parsing on a new machine, and
prefer dispatch --plan-file <plan> so execution plans remain the source of truth for generated
task bodies.
Current Rollout Scope
This repository is adding harness engineering in phases.
Phase 1 established:
- a documented harness workflow
- execution-plan conventions
- shared validation suites
- a pull request workflow that runs the repository harness gate
Phase 2 established:
- change-aware suite selection from git diff
- a stabilized Storybook smoke harness
- worktree-oriented orchestration defaults
Phase 3 hardens:
- authoritative changed-suite execution in CI
- uploaded CI artifacts for harness, a11y, and browser test output
- a shared browser coverage contract
- orchestration preflight and plan-linked dispatch helpers
- harness self-tests for selection, artifact, and orchestration helpers
Future phases can layer on:
- richer browser/app harnesses for interactive review
- deeper plan-to-task automation for orchestration runs
- stronger safety rails for agent-owned automation