Adversarial parallel multi-agent audit before any “complete” claim
Pattern
When a project crosses into “all features shipped / ready for deploy” territory, dispatch 5 orthogonal agents in parallel BEFORE accepting the completion claim. Single-agent self-review consistently misses entire classes of defect that the 5-axis sweep catches in one round-trip.
Agents to dispatch (parallel)
- critic (adversarial review): architectural decisions, unstated assumptions, “this is fine” blindness, what a Primavera/MS-Project/incumbent user would say is missing
- edge-case-hunter (mechanical path enumeration): empty/single/many, boundaries, NaN, concurrent access, cross-boundary I/O failures, type coercion at JSON boundaries
- pr-review-toolkit:code-reviewer (standards compliance): security, type safety, error handling, tests, naming, style, performance, accessibility
- pr-review-toolkit:silent-failure-hunter: error swallowing, empty catch {}, falling back to defaults instead of failing fast, RLS silent-zero-rows mode
- Bash sweep: dependency CVE audit (pnpm audit --audit-level high), secret scan, TODO/FIXME residue, drift detection (live counts vs documented), build artifacts vs source
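A minimal sketch of the parallel dispatch, assuming a hypothetical runAgent function that launches one agent (or the Bash sweep) and resolves with its findings. The AgentSpec, Finding, and dispatchAudit names are illustrative and do not come from any real agent API.

```typescript
// Hypothetical shapes for illustration only; not a real agent API.
type Severity = "P0" | "P1" | "P2";

interface Finding {
  agent: string;
  severity: Severity;
  file: string;
  summary: string;
}

interface AgentSpec {
  name: string;
  prompt: string; // self-contained: file paths, quality bar, P0/P1/P2 request
}

// Assumed signature for whatever actually launches an agent.
type RunAgent = (spec: AgentSpec) => Promise<Finding[]>;

interface DispatchResult {
  findings: Finding[];
  failedAgents: string[];
}

async function dispatchAudit(
  specs: AgentSpec[],
  runAgent: RunAgent,
): Promise<DispatchResult> {
  // Start every agent at once. Promise.allSettled never short-circuits,
  // so one crashed agent does not discard the findings of the others.
  const settled = await Promise.allSettled(
    specs.map(async (spec) => runAgent(spec)),
  );

  const findings: Finding[] = [];
  const failedAgents: string[] = [];
  settled.forEach((result, i) => {
    if (result.status === "fulfilled") {
      findings.push(...result.value);
    } else {
      // Record the failure instead of swallowing it: a missing axis
      // means the audit is incomplete.
      failedAgents.push(specs[i]?.name ?? `agent#${i}`);
    }
  });
  return { findings, failedAgents };
}
```

Returning failedAgents alongside the findings is a design choice in this sketch: the caller can refuse to accept a completion claim while any axis is missing.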
Each agent gets a self-contained prompt with file paths, the explicit bar (“world’s-greatest / zero-compromise / zero-tech-debt”), and a P0/P1/P2 classification request. In the aggregated results, real defects tend to be reported by several agents while noise tends to come from only one, so overlap acts as a natural quality signal.
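The overlap signal can be sketched as a small aggregation step. This assumes each finding carries a hypothetical dedup key (for example a file path plus a defect label); how that key is derived is left open, and the names below are illustrative.

```typescript
type Severity = "P0" | "P1" | "P2";

interface Finding {
  agent: string;
  severity: Severity;
  key: string; // hypothetical dedup key, e.g. "src/db.ts:unique-violation-substring"
  summary: string;
}

interface AggregatedFinding {
  key: string;
  severity: Severity; // most severe rating any agent assigned
  agents: string[]; // distinct agents that reported this key
  summary: string; // summary from the first agent that reported it
}

// Lower rank means more severe.
const RANK: Record<Severity, number> = { P0: 0, P1: 1, P2: 2 };

function aggregateFindings(findings: Finding[]): AggregatedFinding[] {
  const byKey = new Map<string, AggregatedFinding>();
  for (const f of findings) {
    const existing = byKey.get(f.key);
    if (!existing) {
      byKey.set(f.key, {
        key: f.key,
        severity: f.severity,
        agents: [f.agent],
        summary: f.summary,
      });
      continue;
    }
    if (!existing.agents.includes(f.agent)) existing.agents.push(f.agent);
    if (RANK[f.severity] < RANK[existing.severity]) existing.severity = f.severity;
  }
  // Most severe first; within a severity, findings with more agent overlap first.
  return [...byKey.values()].sort(
    (a, b) =>
      RANK[a.severity] - RANK[b.severity] || b.agents.length - a.agents.length,
  );
}
```

Sorting by severity before overlap keeps a P0 reported by a single agent at the top, which matters because several of the defects in the evidence below were caught by only one agent.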
Evidence (R-Plan Wave 6, 2026-05-02)
After 5 build waves shipped and my own setup-curator gate passed, I claimed the project was pre-pilot ready. AJ asked for a quality audit. The 5-agent dispatch found 4 P0 + 11 P1 + 13 medium findings, none of which had surfaced in single-agent self-review:
- CVE upgrade (drizzle-orm 0.36 SQL-injection, GHSA-gpj5-g38j-94v9): caught by Bash dep audit, missed by code-reviewer
- RLS PII gap (4 tables had no row-level security; any auth user could SELECT * FROM "user" and read every email/role): caught by critic
- Silent failure CRITICAL (wrap() used console.error, so every unexpected Server Action error was invisible in production): caught by silent-failure-hunter
- Stale syncing queue rows never recovered after tab crash mid-drain → silent data loss: caught by edge-case-hunter
- Photo Blob > 1MB queued offline fails permanently (no compress before presign): caught by edge-case-hunter
- Service-worker cache leak between users on shared tablet: caught by code-reviewer
- Open-redirect via callbackUrl: caught by critic
- progress_entries_insert admin bypass violating ADR: caught by critic
- 9 sites of .includes("duplicate") substring-match for unique-violation: caught by code-reviewer + silent-failure-hunter (overlap = strong signal)
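As an illustration of the last finding, one alternative to substring-matching on the error message is to check the SQLSTATE code. The sketch below assumes a PostgreSQL driver that exposes the SQLSTATE on a code property of the thrown error (23505 is PostgreSQL's unique_violation code). The isUniqueViolation helper and the cause-chain walk are hypothetical and not taken from the R-Plan codebase.

```typescript
// SQLSTATE 23505 is PostgreSQL's unique_violation error code.
const PG_UNIQUE_VIOLATION = "23505";

// Read a property from an unknown value without assuming its type.
function getProp(value: unknown, key: string): unknown {
  return typeof value === "object" && value !== null
    ? (value as Record<string, unknown>)[key]
    : undefined;
}

function isUniqueViolation(err: unknown): boolean {
  // Assumption: some layers rethrow the driver error as `cause`,
  // so walk a bounded cause chain instead of checking only the top level.
  let current: unknown = err;
  for (let depth = 0; depth < 5 && current != null; depth++) {
    if (getProp(current, "code") === PG_UNIQUE_VIOLATION) return true;
    current = getProp(current, "cause");
  }
  return false;
}
```

Unlike .includes("duplicate"), this check does not depend on message wording or locale, and it does not misfire on unrelated errors whose text happens to contain the word "duplicate".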
When to apply
- Before any “this is complete” / “ready for pilot” / “ready to merge” assertion on substantive work (>1000 lines or multi-feature)
- Before any tagged release
- When AJ asks for “world’s-greatest / zero-compromise / zero-tech-debt confirmation”
- After a substantive refactor or wave-close
When to skip
- Trivial changes (typo fixes, comment updates)
- Single-file edits with <100 lines diff
- Pure exploration sessions with no shipping intent
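The apply/skip rules above can be read as a small decision function. This is a sketch only: the ChangeSummary fields are hypothetical, the precedence (explicit triggers first, then skip rules, then the size threshold) is an assumption, and changes that match neither list are returned as a judgment call because the lists do not cover them.

```typescript
// Hypothetical summary of the work under review.
interface ChangeSummary {
  linesChanged: number;
  filesChanged: number;
  featuresTouched: number;
  trivialOnly: boolean; // typo fixes, comment updates
  explorationOnly: boolean; // no shipping intent
  taggedRelease: boolean;
  waveClose: boolean; // substantive refactor or wave-close
  explicitRequest: boolean; // an explicit request for the full confirmation
}

type AuditDecision = "run" | "skip" | "judgment";

function shouldRunAudit(c: ChangeSummary): AuditDecision {
  // Assumed precedence: explicit triggers always win.
  if (c.explicitRequest || c.taggedRelease || c.waveClose) return "run";

  // Skip rules: trivial changes, exploration, small single-file edits.
  if (c.trivialOnly || c.explorationOnly) return "skip";
  if (c.filesChanged === 1 && c.linesChanged < 100) return "skip";

  // Substantive work: more than 1000 lines or multi-feature.
  if (c.linesChanged > 1000 || c.featuresTouched > 1) return "run";

  // Mid-sized work is not covered by either list.
  return "judgment";
}
```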
Cost
Each agent costs 5-10k tokens of model output, so the four model agents account for roughly 20-40k output tokens; the Bash sweep costs no model tokens. The full 5-axis sweep totals about 50k tokens. That is trivial compared to the cost of shipping a P0 to pilot users and rebuilding trust.
Failure mode if skipped
The audit catches what self-review misses by design, because the agents apply orthogonal evaluation lenses. Skipping it all but guarantees a P0 in production within the first month of pilot. AJ’s “world’s greatest” bar is non-negotiable.
Related patterns
- Pristine Sweep (workspace memory pristine-sweep-protocol.md): single-axis cleanliness check; complementary, not a substitute
- Setup-curator 8-point gate: file-hygiene only; does not find correctness/security defects
- Single-pass code-review agent: useful for incremental PRs but insufficient for wave-close
Routing recommendation
Codify as either:
- New Tier 1 directive: “Before any ‘this is complete’ assertion on substantive work, dispatch 5-agent parallel adversarial audit. Skipping requires explicit AJ override.”
- New skill pre-deploy-audit: bundles dispatch + dedup + remediation + commit pattern.