Case: council-tally-keeps-distinct-proposals-in-strict-mode

Purpose

Verify that council tally --similarity strict does not merge proposals whose wording differs: even when they are semantically close, each one is kept as a separate recommendation.

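Preconditions

  • An isolated temporary directory TMPDIR is used
  • sqlite3 is available locally to read reviewer thread IDs from task_attempts
  • Three reviewer output JSON files are prepared; the architecture and implementation proposals are semantically close but worded differently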
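The first two preconditions can be set up with a short sketch like the one below. The variable names CASE_DIR and SQLITE3_AVAILABLE are hypothetical and do not come from the case itself. A separate name is used because TMPDIR is also a standard environment variable pointing at the system temp location.

```shell
#!/bin/sh
# Hypothetical variable: an isolated working directory standing in for the
# TMPDIR placeholder used in this case.
CASE_DIR=$(mktemp -d)

# Remove the directory when the shell exits so repeated runs stay isolated.
trap 'rm -rf "$CASE_DIR"' EXIT

# The case reads thread IDs with sqlite3, so check that it is on PATH.
if command -v sqlite3 >/dev/null 2>&1; then
  SQLITE3_AVAILABLE=yes
else
  SQLITE3_AVAILABLE=no
  echo "sqlite3 not found; thread IDs cannot be read from task_attempts" >&2
fi

echo "working directory: $CASE_DIR (sqlite3 available: $SQLITE3_AVAILABLE)"
```

With this in place, the paths in the input section would be read as relative to the directory held in CASE_DIR.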

Input

cat <<'EOF' > TMPDIR/architecture-review.json
{"reviewer_role":"architecture-reviewer","findings":[{"title":"Split contracts","summary":"Transport contracts are mixed into UI code.","proposal":"Move API contract definitions into a dedicated module.","rationale":"This lowers coupling.","confidence":"high","tags":["architecture"],"target_refs":{"repo_path":"."}}]}
EOF

cat <<'EOF' > TMPDIR/implementation-review.json
{"reviewer_role":"implementation-reviewer","findings":[{"title":"Extract API contracts","summary":"Shared transport shapes are duplicated.","proposal":"Move API contract definitions into dedicated module","rationale":"This reduces duplication.","confidence":"medium","tags":["maintainability"],"target_refs":{"repo_path":"."}}]}
EOF

cat <<'EOF' > TMPDIR/risk-review.json
{"reviewer_role":"risk-reviewer","findings":[{"title":"Add auth integration tests","summary":"Login regressions are hard to catch.","proposal":"Add integration tests for auth flows.","rationale":"This catches regressions earlier.","confidence":"high","tags":["risk"],"target_refs":{"repo_path":"."}}]}
EOF

orch --db TMPDIR/coord.db --json council start \
  --run council_blog_tally_002 \
  --target "Review the current blog architecture."

THREAD_ID_CR1=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_tally_002' AND task_id = 'CR1' AND attempt_no = 1;")
THREAD_ID_CR2=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_tally_002' AND task_id = 'CR2' AND attempt_no = 1;")
THREAD_ID_CR3=$(sqlite3 TMPDIR/coord.db "SELECT thread_id FROM task_attempts WHERE run_id = 'council_blog_tally_002' AND task_id = 'CR3' AND attempt_no = 1;")

inbox --db TMPDIR/coord.db --json claim --agent architecture-reviewer --thread "$THREAD_ID_CR1"
inbox --db TMPDIR/coord.db --json done --agent architecture-reviewer --thread "$THREAD_ID_CR1" --summary "Review complete" --body-file TMPDIR/architecture-review.json

inbox --db TMPDIR/coord.db --json claim --agent implementation-reviewer --thread "$THREAD_ID_CR2"
inbox --db TMPDIR/coord.db --json done --agent implementation-reviewer --thread "$THREAD_ID_CR2" --summary "Review complete" --body-file TMPDIR/implementation-review.json

inbox --db TMPDIR/coord.db --json claim --agent risk-reviewer --thread "$THREAD_ID_CR3"
inbox --db TMPDIR/coord.db --json done --agent risk-reviewer --thread "$THREAD_ID_CR3" --summary "Review complete" --body-file TMPDIR/risk-review.json

orch --db TMPDIR/coord.db --json council tally \
  --run council_blog_tally_002 \
  --similarity strict

Expected output

  • council tally exits with code 0
  • tally.data.similarity == "strict"
  • tally.data.counts.minority == 3
  • tally.data.grouped_recommendations has length 3
  • All three recommendation groups fall into minority

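Assertion conclusions

  • The goal of strict mode is to preserve literal differences between proposals rather than merge them loosely
  • When no proposals are merged, each recommendation's support count degrades to a single supporting reviewer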
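The sketch below illustrates why the three proposals in this case end up as three single-reviewer groups. It assumes that strict similarity behaves like exact string comparison, which is a simplification for illustration and not a description of how orch implements it. The looser normalization is likewise hypothetical and shown only for contrast.

```shell
#!/bin/sh
# The three proposal strings from the reviewer outputs in this case.
proposals='Move API contract definitions into a dedicated module.
Move API contract definitions into dedicated module
Add integration tests for auth flows.'

# Assumption: strict mode is modeled as exact string equality, so any
# difference in wording or punctuation yields a separate group.
strict_group_count=$(printf '%s\n' "$proposals" | sort -u | wc -l | tr -d ' ')

# Hypothetical looser normalization, for contrast only: lowercase the text,
# drop a trailing period, and drop the article "a" before grouping.
loose_group_count=$(printf '%s\n' "$proposals" \
  | tr '[:upper:]' '[:lower:]' \
  | sed -e 's/\.$//' -e 's/ a / /g' \
  | sort -u | wc -l | tr -d ' ')

echo "strict groups: $strict_group_count"   # prints "strict groups: 3"
echo "loose groups:  $loose_group_count"    # prints "loose groups:  2"
```

Under the exact-match assumption, every group has one supporting reviewer, which matches the expectation that all three recommendations are counted as minority.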