This is part of an ongoing series on what it means to be a software engineer in the age of AI.
This one is about building dev-agent, an MCP (Model Context Protocol) server that gives AI coding tools semantic search over a codebase. A hackathon project that turned into something I use every day.
The night before the hackathon, I watched Claude read the same file again.
Slowly. Patiently.
Looking for something it had already passed twice.
I closed the laptop.
Made tea.
Let the room return to silence.
There was a friction I hadn’t found words for. Not a bug. Not a flaw. Just a subtle drag in the work, the feeling of spending more time redirecting the AI than collaborating with it.
Like trying to continue a conversation with someone who keeps losing the thread.
I didn’t know it then, but I was circling a lesson that had been following me for months.
Context isn’t an accessory. It’s the whole thing.
A week to figure it out
It was the holiday season. The company paused production pushes and hosted a "vibe coding" hackathon — a week to explore, build something, see what happens.
I had a question I wanted to answer: Can I make this better?
I began with a blank monorepo and a simple PLAN.md. Not because I knew what I was making, but because planning sometimes reveals the outline of the problem.
The outline was familiar. For me, Claude didn’t need more instructions.
It needed more awareness.
Days 1–2: the obvious thing
The obvious thing was semantic search. Instead of Claude grepping for exact strings, give it a way to search by meaning.
I picked tools that would run locally. My code stays on my machine.
- LanceDB for vector storage (embedded, no server)
- Transformers.js for embeddings (runs locally)
- ts-morph for parsing TypeScript
By the end of day 2, I had a CLI that worked:
dev index . # Index the repository
dev search "auth" # Semantic search
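Under the hood, dev index is roughly a three-step pipeline: parse, embed, store. Here is a minimal sketch of how those pieces could fit together; the package names (@xenova/transformers, @lancedb/lancedb) and the functions-as-chunks layout are my assumptions for illustration, not dev-agent's actual code.

```typescript
import { Project } from "ts-morph";
import { pipeline } from "@xenova/transformers";
import { connect } from "@lancedb/lancedb";

async function indexRepo(dir: string) {
  // 1. Parse: pull each function out of the repo as a searchable chunk.
  const project = new Project();
  project.addSourceFilesAtPaths(`${dir}/**/*.ts`);
  const chunks = project.getSourceFiles().flatMap((file) =>
    file.getFunctions().map((fn) => ({
      file: file.getFilePath(),
      name: fn.getName() ?? "<anonymous>",
      text: fn.getText(),
    }))
  );

  // 2. Embed: a small local model, no API calls, code never leaves the machine.
  const embed = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");
  const rows = [];
  for (const chunk of chunks) {
    const output = await embed(chunk.text, { pooling: "mean", normalize: true });
    rows.push({ ...chunk, vector: Array.from(output.data as Float32Array) });
  }

  // 3. Store: an embedded vector database, just a directory on disk.
  const db = await connect(`${dir}/.dev-agent/lancedb`);
  await db.createTable("chunks", rows, { mode: "overwrite" });
}
```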
The scanner hit 94% test coverage on day 1. I used the test suite to find edge cases in how I parsed TypeScript. The tests were notes to myself.
Days 3–4: getting ambitious
I got ambitious. What if I had specialized agents for different tasks?
- An Explorer for tracing code relationships
- A Planner for analyzing GitHub issues
- A GitHub agent for indexing issues and PRs
By day 4, I had 557 tests passing. The subagents could route messages between each other, share context, handle shutdown gracefully.
I also built a PR agent that would create pull requests automatically.
Then I deleted it.
I realized I was solving the wrong problem. The friction wasn't that I couldn't create PRs. It was that Claude didn't have enough context to make good decisions. Automation could come later. First, solve the information problem.
Days 5–6: the thing that actually helped
I originally planned an HTTP API server. Spent half a day building it before realizing I didn't need it. MCP was simpler — just a CLI command that integrates directly with Claude Code.
dev mcp install
When I first got semantic search working inside Claude Code, I noticed something I hadn't expected.
Claude was making fewer file reads.
Before, it would grep, find file paths, then read entire files. My search returned code snippets — not just paths. Claude could see the relevant code without opening the file.
This felt like a small detail. It wasn't.
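For the curious, the MCP side of that is small. A rough sketch with the TypeScript MCP SDK; searchIndex is an assumed helper standing in for the vector search from the indexing sketch, not dev-agent's real API. The point is that the tool hands back the snippets themselves:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Assumed helper: embed the query and run a vector search over the index.
declare function searchIndex(
  query: string,
  k: number
): Promise<{ file: string; name: string; text: string }[]>;

const server = new McpServer({ name: "dev-agent", version: "0.1.0" });

server.tool(
  "dev_search",
  "Semantic search over the indexed codebase. Returns code snippets, not just file paths.",
  { query: z.string() },
  async ({ query }) => {
    const hits = await searchIndex(query, 5);
    return {
      // Each result carries the snippet, so a follow-up file read is often unnecessary.
      content: hits.map((hit) => ({
        type: "text" as const,
        text: `${hit.file} (${hit.name})\n${hit.text}`,
      })),
    };
  }
);

await server.connect(new StdioServerTransport());
```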
Days 7–8: adding context
With the core working, I added more tools:
- dev_refs — who calls this function, what does it call (a sketch follows this list)
- dev_map — codebase structure at a glance
- dev_history — semantic search over git commits
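dev_refs doesn't need embeddings at all; ts-morph already knows the reference graph. A minimal sketch of the "who calls this function" direction, with names chosen for illustration (it only catches direct calls to top-level function declarations):

```typescript
import { Project, SyntaxKind } from "ts-morph";

// Who calls this function? Find the declaration, walk its references,
// and keep the ones that are call sites.
export function findCallers(tsConfigPath: string, functionName: string): string[] {
  const project = new Project({ tsConfigFilePath: tsConfigPath });

  const declaration = project
    .getSourceFiles()
    .flatMap((file) => file.getFunctions())
    .find((fn) => fn.getName() === functionName);
  if (!declaration) return [];

  return declaration
    .findReferencesAsNodes()
    .filter((ref) => ref.getParentIfKind(SyntaxKind.CallExpression))
    .map((ref) => `${ref.getSourceFile().getFilePath()}:${ref.getStartLineNumber()}`);
}
```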
The git history one surprised me. Claude can search commits by meaning:
dev_history query="authentication refactor"
It finds commits about auth even if they don't use that exact word.
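Mechanically it's the same recipe as the file index, pointed at git: read the commits, embed the messages, vector-search the query. A sketch of the query side, assuming a commits table built the same way as the chunks table earlier:

```typescript
import { connect } from "@lancedb/lancedb";
import { pipeline } from "@xenova/transformers";

// Embed the natural-language query, then do a nearest-neighbor search over
// commit-message embeddings. A query like "authentication refactor" lands
// near commits about auth even without the literal word.
export async function searchCommits(query: string, k = 5) {
  const embed = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");
  const output = await embed(query, { pooling: "mean", normalize: true });
  const vector = Array.from(output.data as Float32Array);

  const db = await connect("./.dev-agent/lancedb");
  const commits = await db.openTable("commits");
  return commits.search(vector).limit(k).toArray();
}
```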
Days 9–10: measuring it
I'm an engineer. I had to know if it actually helped.
I ran the same tasks with and without dev-agent. Tracked time, cost, tool calls, quality.
One example that stuck with me:
Task: "Where is rate limiting implemented?"
Without dev-agent: 18 tool calls, 10 files read, ~18,000 input tokens, 45 seconds.
With dev-agent: 3 tool calls, 2 files read, ~1,200 input tokens, 28 seconds.
Same answer. 93% fewer input tokens.
The aggregate across different task types:
| Task Type | Cost Reduction | Time Reduction |
|---|---|---|
| Debugging | 42% | 37% |
| Exploration | 44% | 19% |
| Implementation | 29% | 22% |
The 42% cost reduction on debugging wasn't the goal. It was a side effect of returning code snippets instead of file paths.
What I got wrong
I got a lot wrong.
Tool descriptions. My first ones were paragraphs. Claude ignored them. Shorter, more direct ones worked better.
Too many tools too fast. I had 9 tools by day 4. Too many to test properly. Should have started with 3.
Measuring too late. I waited until day 9. If I'd measured on day 3, I would have noticed the code-snippet insight sooner.
Solving the wrong problem. The PR agent was a distraction. The real problem was context.
What changed
Before this, vibe coding felt like babysitting. I'd describe what I wanted, watch Claude grep around, then correct its assumptions.
Now it feels more like pair programming. Claude finds the right code faster. I spend more time on interesting decisions, less time on "no, look in that file."
The biggest change: I trust Claude's first answer more often. When it has the right context, it makes fewer mistakes.
The thing I keep coming back to
The 42% cost savings is nice. But the real win is faster iteration. When Claude finds the right code on the first try, I don't have to correct it.
I've been using dev-agent daily since. It's open source:
npm install -g dev-agent
dev index .
dev mcp install
I'm not sure what the next version will look like. But I keep thinking about that moment — watching Claude read the same file for the third time. The friction I couldn't name.
Sometimes the answer is simpler than you think. Give your tools better context. See what happens.
Built during a hackathon week in November 2025. Source on GitHub.