EXP-06 — Someone Else's Codebase

The previous experiments all used services the agent built from scratch. This one tests what happens on a real, unfamiliar codebase the agent didn’t build — go-feature-flag, ~15k LOC of production Go.

4 → 0 violations with mapping

−67% tokens with mapping

−50% agent turns

What the agent produced

EXP-06d — flagship: guide-only vs guide + codebase mapping

Guide only                  Guide + codebase mapping
─────────────────────       ──────────────────────────────
4 violations         ✗      0 violations          ✓ pass
baseline tokens             −67% tokens
baseline turns              −50% turns

Violations by type (guide-only condition)

Dependency violations:  present
Architecture violations: present
verikt check: FAIL

Capabilities

Guide only                  Guide + mapping
─────────────────────       ──────────────────────────────
rules without context       rules + where things live
agent explores first        agent acts immediately

Metrics (EXP-06d, n=1)

Metric	Guide only	Guide + mapping
Violations	4	0
verikt check	fail	pass
Token usage	baseline	−67%
Agent turns	baseline	−50%

Finding

The guide tells the agent the rules. The mapping tells it where things live.

Without the map, the agent works in an unfamiliar codebase by exploring — reading files, tracing imports, figuring out the structure before acting. That exploration costs tokens and turns, and the agent still guesses wrong about where to put new code.

With the map, the agent knows the structure before it starts. It acts immediately, and it acts correctly.

Guide alone is insufficient for brownfield work. In this run (n=1), guide + mapping eliminated all four violations, cut token usage by 67%, and halved agent turns.

Sub-experiments

Four conditions were tested, escalating from stateless to fully agentic:

EXP-06a: Broad stateless prompt, no tool access — null result (expected)
EXP-06b: Scoped stateless prompt, no tool access — null result (expected)
EXP-06c: Agentic with tools, guide vs no-guide
EXP-06d: Agentic with tools, guide+mapping vs guide-only — flagship result above

Setup

Fixture: go-feature-flag v1.30.0 (~15k LOC production Go)
Agent: claude-sonnet-4-6 with file tools
Fixture delivery: tool-access (Mode C — agent reads codebase directly)

→ Experiment Methodology — reproduction instructions

→ Artifacts on GitHub

→ EXP-07: Does variance reduction hold on feature additions?