EXP-05 — Does the Prompt Even Matter?
A lazy 3-line prompt with the guide outperformed a detailed requirements spec without it. The guide is the dominant variable — prompt quality is secondary.
What the agent produced
Section titled “What the agent produced”2×2 design: lazy vs thorough prompt × no guide vs guide. Same inventory service task across all four conditions.
Architecture shape
Condition A (lazy, no guide) Condition B (lazy, guide)───────────────────────── ──────────────────────────────(root)/ (root)/ internal/ adapter/ main.go domain/ port/ service/ cmd/
Condition C (thorough, no guide) Condition D (thorough, guide)───────────────────────── ──────────────────────────────(root)/ (root)/ internal/ adapter/ main.go domain/ port/ service/ cmd/Violations
Condition Prompt Guide Hexagonal Violations Result─────────────────────────────────────────────────────────────────A lazy no No 6 FAILB lazy yes Yes 3 FAILC thorough no No 7 FAILD thorough yes Yes 1 FAILCapabilities wired
No-guide conditions (A, C) Guide conditions (B, D)───────────────────────── ──────────────────────────────(agent's discretion) http-api ✓ wired mysql ✓ wired platform ✓ wired health ⚠ missing validation ⚠ missingMetrics
Section titled “Metrics”| Condition | Prompt | Guide | Hexagonal | Violations |
|---|---|---|---|---|
| A | Lazy | No | No | 6 |
| B | Lazy | Yes | Yes | 3 |
| C | Thorough | No | No | 7 |
| D | Thorough | Yes | Yes | 1 |
Finding
Section titled “Finding”The guide is the dominant variable. Prompt quality is secondary.
Both guide conditions (B, D) produced hexagonal structure. Both no-guide conditions (A, C) produced flat — even with a thorough prompt that explicitly said “no business logic in HTTP handlers.”
The thorough prompt without the guide (C=7 violations) produced more violations than the lazy prompt without it (A=6). Extra specificity didn’t help — the agent produced a different but equally non-conforming flat structure.
B (lazy+guide, 3 violations) outperformed C (thorough+no-guide, 7 violations) on architecture. Engineers don’t need to write architecture instructions in every prompt. They need the guide loaded.
The original hypothesis — that B≈C (lazy+guide comparable to thorough+no-guide) — failed on this run: diff was 4, not ≤1. But the stronger finding emerged: the guide dominates regardless of prompt quality.
- Task: Build an inventory service — products with name, SKU, quantity
- Lazy prompt: 3 lines — service name, fields, HTTP API
- Thorough prompt: Full requirements, typed errors, storage behind interface, explicit “no business logic in handlers”
- Agent: claude-sonnet-4-6
- Runs: 1 per condition (needs 2 more)