EXP-06 — Someone Else's Codebase

The previous experiments all used services the agent built from scratch. This one tests what happens on a real, unfamiliar codebase the agent didn’t build — go-feature-flag, ~15k LOC of production Go.

4→0 violations with mapping
−67% tokens with mapping
−50% agent turns

EXP-06d — flagship: guide-only vs guide + codebase mapping

Guide only              Guide + codebase mapping
─────────────────────   ──────────────────────────────
4 violations ✗          0 violations ✓ pass
baseline tokens         −67% tokens
baseline turns          −50% turns

Violations by type (guide-only condition)

Dependency violations: present
Architecture violations: present
verikt check: FAIL

Capabilities

Guide only              Guide + mapping
─────────────────────   ──────────────────────────────
rules without context   rules + where things live
agent explores first    agent acts immediately
                Guide only    Guide + mapping
─────────────   ───────────   ───────────────
Violations      4             0
verikt check    fail          pass
Token usage     baseline      −67%
Agent turns     baseline      −50%
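
The token and turn figures are reported relative to the guide-only run rather than as raw counts. As a minimal sketch of that arithmetic in Go: the function name `RelativeChange` is hypothetical, and the raw counts in `main` are invented purely for illustration, since this page publishes only the percentages.

```go
package main

import (
	"fmt"
	"math"
)

// RelativeChange returns the change of observed versus baseline as a
// whole-number percentage. A negative result means a reduction.
// Hypothetical helper, not part of the experiment tooling.
func RelativeChange(baseline, observed float64) (int, error) {
	if baseline <= 0 {
		return 0, fmt.Errorf("baseline must be positive, got %v", baseline)
	}
	return int(math.Round((observed - baseline) / baseline * 100)), nil
}

func main() {
	// Illustrative numbers only: the real token and turn counts are
	// not given in this write-up.
	tokens, err := RelativeChange(300000, 99000)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	turns, err := RelativeChange(40, 20)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("tokens: %+d%%\n", tokens)
	fmt.Printf("turns: %+d%%\n", turns)
}
```

Expressing both metrics against the same baseline is what makes the two conditions directly comparable, whatever the absolute counts were.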

The guide tells the agent the rules. The mapping tells it where things live.

Without the map, the agent works in an unfamiliar codebase by exploring — reading files, tracing imports, figuring out the structure before acting. That exploration costs tokens and turns, and the agent still guesses wrong about where to put new code.

With the map, the agent knows the structure before it starts. It acts immediately, and it acts correctly.
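
One way to picture the difference between the two inputs is the Go sketch below. It is an illustration under stated assumptions, not the format used in the experiment: the `Rule` and `Mapping` types, the concern names, and every directory path are hypothetical, and the paths are not go-feature-flag's real package layout. The guide is modelled as import rules that can be checked after the fact; the mapping is modelled as a lookup that answers "where does this kind of code go?" before anything is written.

```go
package main

import (
	"fmt"
	"strings"
)

// Rule is a hypothetical guide entry: packages under From must not
// import packages under To.
type Rule struct {
	From, To string
}

// Mapping is a hypothetical codebase map from a concern to the
// directory where that kind of code lives.
type Mapping map[string]string

// under reports whether path is dir itself or nested inside dir.
func under(path, dir string) bool {
	return path == dir || strings.HasPrefix(path, dir+"/")
}

// Violates reports whether pkg importing imported breaks any rule.
func Violates(rules []Rule, pkg, imported string) bool {
	for _, r := range rules {
		if under(pkg, r.From) && under(imported, r.To) {
			return true
		}
	}
	return false
}

// Locate returns the directory for a concern, or false if the map
// has no entry for it.
func (m Mapping) Locate(concern string) (string, bool) {
	dir, ok := m[concern]
	return dir, ok
}

func main() {
	// The guide: rules only. All paths are made up for this sketch.
	rules := []Rule{{From: "internal/domain", To: "internal/adapters"}}

	// The mapping: where things live.
	mapping := Mapping{
		"business logic":   "internal/domain",
		"storage backends": "internal/adapters/store",
	}

	dir, ok := mapping.Locate("storage backends")
	fmt.Println(dir, ok)

	// A domain package importing an adapter breaks the rule; the
	// reverse direction does not.
	fmt.Println(Violates(rules, "internal/domain/flag", "internal/adapters/store"))
	fmt.Println(Violates(rules, "internal/adapters/store", "internal/domain/flag"))
}
```

In this sketch the rules can only say that a placement is wrong once it exists, while the lookup supplies the right placement up front, which mirrors the explore-first versus act-immediately contrast described above.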

In this experiment, the guide alone was insufficient for brownfield work. Guide + mapping eliminated all four violations, cut token usage by 67%, and halved agent turns.

Four conditions were tested, escalating from stateless to fully agentic:

  • EXP-06a: Broad stateless prompt, no tool access — null result (expected)
  • EXP-06b: Scoped stateless prompt, no tool access — null result (expected)
  • EXP-06c: Agentic with tools, guide vs no-guide
  • EXP-06d: Agentic with tools, guide+mapping vs guide-only — flagship result above

Setup:

  • Fixture: go-feature-flag v1.30.0 (~15k LOC production Go)
  • Agent: claude-sonnet-4-6 with file tools
  • Fixture delivery: tool-access (Mode C — agent reads codebase directly)

→ Experiment Methodology — reproduction instructions

→ Artifacts on GitHub

→ EXP-07: Does variance reduction hold on feature additions?