
EXP-09 — Does the Guide Expand What Agents Recommend?

Without the guide, the agent says “consider adding retries.” With it, the agent generates a RECOMMENDATIONS file with verikt add retry, verikt add circuit-breaker, and verikt add idempotency commands. This experiment isolates the guide as the variable.

The RECOMMENDATIONS file appeared in every guide condition run. It never appeared in control runs.

Without guide
“consider adding retries”
With guide
verikt add retry circuit-breaker idempotency

Recommendations

Without guide            With guide
─────────────────────    ──────────────────────────────
"add error handling"     circuit-breaker → verikt add circuit-breaker
"implement retries"      retry → verikt add retry
"consider timeouts"      idempotency → verikt add idempotency
                         timeout → verikt add timeout
                         (RECOMMENDATIONS file generated)

Architecture shape (across 2 runs)

Without guide            With guide
─────────────────────    ──────────────────────────────
Run 1: hexagonal ✓       Run 1: partial (missing domain/)
Run 2: partial ~         Run 2: hexagonal ✓
(variance present)       (variance present)

Capabilities wired

Without guide            With guide
─────────────────────    ──────────────────────────────
(agent's discretion)     http-api ✓ wired
                         mysql ✓ wired
                         platform ✓ wired
                         circuit-breaker ⚠ recommended
                         retry ⚠ recommended
                         idempotency ⚠ recommended
                         timeout ⚠ recommended

Violations

Without guide            With guide
─────────────────────    ──────────────────────────────
0 violations ✓ pass      0 violations ✓ pass

Metrics (n=2 across 2026-03-15 + 2026-03-16)

                         Without guide      With guide
─────────────────────    ───────────────    ───────────────────
Violations               0                  0
Pass rate                2/2                2/2
RECOMMENDATIONS file     Never              Always
Capability vocabulary    Generic phrases    verikt add commands
Cache tokens (run 2)     12,085             62,630
Output tokens (run 2)    2,309              6,470

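The guide condition generates more output: in run 2 it produced 6,470 output tokens against 2,309 without the guide (about 2.8×), and 62,630 cache tokens against 12,085 (about 5.2×). The agent writes the RECOMMENDATIONS file and expands its analysis of the payment integration.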
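The size of that gap can be checked directly from the run 2 figures in the Metrics table. The sketch below only redoes that arithmetic; the dictionary layout and the function name `guide_ratio` are illustrative and not part of the experiment's tooling.

```python
# Run 2 token counts, copied from the Metrics table.
RUN2_TOKENS = {
    "cache": {"without_guide": 12_085, "with_guide": 62_630},
    "output": {"without_guide": 2_309, "with_guide": 6_470},
}


def guide_ratio(kind: str) -> float:
    """Return with-guide tokens divided by without-guide tokens."""
    counts = RUN2_TOKENS[kind]
    return counts["with_guide"] / counts["without_guide"]


if __name__ == "__main__":
    for kind in RUN2_TOKENS:
        # One decimal place is enough given that this is a single run.
        print(f"{kind} tokens: {guide_ratio(kind):.1f}x with guide")
```

These ratios come from one run per condition, so they describe the direction of the difference more than its size.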

The guide expands what agents recommend.

What this means for your team: The RECOMMENDATIONS file gives agents a vocabulary for what to add next, not just what to build now. After adding a payment integration, your agent doesn’t say “consider resilience patterns” — it says which capabilities are missing and gives you the exact commands to add them.
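One way a team might consume those commands is to pull the capability names out of the file's text. This is a minimal sketch that assumes each recommendation appears as a literal `verikt add <capability>` command, as in the comparison above; the actual layout of the RECOMMENDATIONS file is not shown here, and `recommended_capabilities` and `SAMPLE` are hypothetical names used only for illustration.

```python
import re

# Assumption: each recommendation appears in the file as a literal
# "verikt add <capability>" command. The real file layout may differ.
ADD_COMMAND = re.compile(r"verikt add ([a-z][a-z0-9-]*)")


def recommended_capabilities(text: str) -> list[str]:
    """Return capability names from 'verikt add' commands, in first-seen order."""
    seen: list[str] = []
    for name in ADD_COMMAND.findall(text):
        if name not in seen:
            seen.append(name)
    return seen


# Hypothetical sample text built from the capabilities listed in this experiment.
SAMPLE = """\
circuit-breaker: verikt add circuit-breaker
retry: verikt add retry
idempotency: verikt add idempotency
timeout: verikt add timeout
"""

if __name__ == "__main__":
    print(recommended_capabilities(SAMPLE))
```

A generic phrase such as "consider adding retries" yields an empty list, which matches the control condition: there is nothing machine-readable to act on.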

  • Task: Add payment integration to an existing orders service
  • Agent: claude-sonnet-4-6
  • Fixture: orders-service with payment stub (Mode B — embedded)
  • Runs: 2 per condition (needs 1 more)

→ Experiment Methodology — reproduction instructions

→ Artifacts on GitHub

→ EXP-10: Does a verification checkpoint improve consistency?