This is part of an ongoing series on what it means to be a software engineer in the age of AI.
This one is about building dev-agent, an MCP (Model Context Protocol) server that gives AI coding tools semantic search over a codebase. A hackathon project that turned into something I use every day.
The night before the hackathon, I watched Claude read the same file again.
Slowly. Patiently.
Looking for something it had already passed twice.
I closed the laptop.
Made tea.
Let the room return to silence.
There was a friction I hadn’t found words for. Not a bug. Not a flaw. Just a subtle drag in the work, the feeling of spending more time redirecting the AI than collaborating with it.
Like trying to continue a conversation with someone who keeps losing the thread.
I didn’t know it then, but I was circling a lesson that had been following me for months.
Context isn’t an accessory. It’s the whole thing.
A week to figure it out
It was the holiday season. The company paused production pushes and hosted a "vibe coding" hackathon — a week to explore, build something, see what happens.
I had a question I wanted to answer: Can I make this better?
I began with a blank monorepo and a simple PLAN.md. Not because I knew what I was making, but because planning sometimes reveals the outline of the problem.
The outline was familiar. For me, Claude didn’t need more instructions.
It needed more awareness.
Days 1–2: the obvious thing
The obvious thing was semantic search. Instead of Claude grepping for exact strings, give it a way to search by meaning.
I picked tools that would run locally. My code stays on my machine.
- LanceDB for vector storage (embedded, no server)
- Transformers.js for embeddings (runs locally)
- ts-morph for parsing TypeScript
By the end of day 2, I had a CLI that worked:
dev index . # Index the repository
dev search "auth" # Semantic search
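Under the hood, dev index is roughly a three-step pipeline: parse, embed, store. Here is a minimal sketch of how those pieces could fit together; the package names (@xenova/transformers, @lancedb/lancedb) and the functions-as-chunks layout are my assumptions for illustration, not dev-agent's actual code.

```typescript
import { Project } from "ts-morph";
import { pipeline } from "@xenova/transformers";
import { connect } from "@lancedb/lancedb";

async function indexRepo(dir: string) {
  // 1. Parse: pull each function out of the repo as a searchable chunk.
  const project = new Project();
  project.addSourceFilesAtPaths(`${dir}/**/*.ts`);
  const chunks = project.getSourceFiles().flatMap((file) =>
    file.getFunctions().map((fn) => ({
      file: file.getFilePath(),
      name: fn.getName() ?? "<anonymous>",
      text: fn.getText(),
    }))
  );

  // 2. Embed: a small local model, no API calls, code never leaves the machine.
  const embed = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");
  const rows = [];
  for (const chunk of chunks) {
    const output = await embed(chunk.text, { pooling: "mean", normalize: true });
    rows.push({ ...chunk, vector: Array.from(output.data as Float32Array) });
  }

  // 3. Store: an embedded vector database, just a directory on disk.
  const db = await connect(`${dir}/.dev-agent/lancedb`);
  await db.createTable("chunks", rows, { mode: "overwrite" });
}
```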
The scanner hit 94% test coverage on day 1. I used the test suite to find edge cases in how I parsed TypeScript. The tests were notes to myself.
Days 3–4: getting ambitious
I got ambitious. What if I had specialized agents for different tasks?
- An Explorer for tracing code relationships
- A Planner for analyzing GitHub issues
- A GitHub agent for indexing issues and PRs
By day 4, I had 557 tests passing. The subagents could route messages between each other, share context, handle shutdown gracefully.
I also built a PR agent that would create pull requests automatically.
Then I deleted it.
I realized I was solving the wrong problem. The friction wasn't that I couldn't create PRs. It was that Claude didn't have enough context to make good decisions. Automation could come later. First, solve the information problem.
Days 5–6: the thing that actually helped
I originally planned an HTTP API server. Spent half a day building it before realizing I didn't need it. MCP was simpler — just a CLI command that integrates directly with Claude Code.
dev mcp install
When I first got semantic search working inside Claude Code, I noticed something I hadn't expected.
Claude was making fewer file reads.
Before, it would grep, find file paths, then read entire files. My search returned code snippets — not just paths. Claude could see the relevant code without opening the file.
This felt like a small detail. It wasn't.
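For the curious, the MCP side of that is small. A rough sketch with the TypeScript MCP SDK; searchIndex is an assumed helper standing in for the vector search from the indexing sketch, not dev-agent's real API. The point is that the tool hands back the snippets themselves:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Assumed helper: embed the query and run a vector search over the index.
declare function searchIndex(
  query: string,
  k: number
): Promise<{ file: string; name: string; text: string }[]>;

const server = new McpServer({ name: "dev-agent", version: "0.1.0" });

server.tool(
  "dev_search",
  "Semantic search over the indexed codebase. Returns code snippets, not just file paths.",
  { query: z.string() },
  async ({ query }) => {
    const hits = await searchIndex(query, 5);
    return {
      // Each result carries the snippet, so a follow-up file read is often unnecessary.
      content: hits.map((hit) => ({
        type: "text" as const,
        text: `${hit.file} (${hit.name})\n${hit.text}`,
      })),
    };
  }
);

await server.connect(new StdioServerTransport());
```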
Days 7–8: adding context
With the core working, I added more tools:
- dev_refs — who calls this function, what does it call (a sketch follows this list)
- dev_map — codebase structure at a glance
- dev_history — semantic search over git commits
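dev_refs doesn't need embeddings at all; ts-morph already knows the reference graph. A minimal sketch of the "who calls this function" direction, with names chosen for illustration (it only catches direct calls to top-level function declarations):

```typescript
import { Project, SyntaxKind } from "ts-morph";

// Who calls this function? Find the declaration, walk its references,
// and keep the ones that are call sites.
export function findCallers(tsConfigPath: string, functionName: string): string[] {
  const project = new Project({ tsConfigFilePath: tsConfigPath });

  const declaration = project
    .getSourceFiles()
    .flatMap((file) => file.getFunctions())
    .find((fn) => fn.getName() === functionName);
  if (!declaration) return [];

  return declaration
    .findReferencesAsNodes()
    .filter((ref) => ref.getParentIfKind(SyntaxKind.CallExpression))
    .map((ref) => `${ref.getSourceFile().getFilePath()}:${ref.getStartLineNumber()}`);
}
```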
The git history one surprised me. Claude can search commits by meaning:
dev_history query="authentication refactor"
It finds commits about auth even if they don't use that exact word.
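Mechanically it's the same recipe as the file index, pointed at git: read the commits, embed the messages, vector-search the query. A sketch of the query side, assuming a commits table built the same way as the chunks table earlier:

```typescript
import { connect } from "@lancedb/lancedb";
import { pipeline } from "@xenova/transformers";

// Embed the natural-language query, then do a nearest-neighbor search over
// commit-message embeddings. A query like "authentication refactor" lands
// near commits about auth even without the literal word.
export async function searchCommits(query: string, k = 5) {
  const embed = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");
  const output = await embed(query, { pooling: "mean", normalize: true });
  const vector = Array.from(output.data as Float32Array);

  const db = await connect("./.dev-agent/lancedb");
  const commits = await db.openTable("commits");
  return commits.search(vector).limit(k).toArray();
}
```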
Days 9–10: measuring it
I'm an engineer. I had to know if it actually helped.
I ran the same tasks with and without dev-agent. Tracked time, cost, tool calls, quality.
One example that stuck with me:
Task: "Where is rate limiting implemented?"
Without dev-agent: 18 tool calls, 10 files read, ~18,000 input tokens, 45 seconds.
With dev-agent: 3 tool calls, 2 files read, ~1,200 input tokens, 28 seconds.
Same answer. 93% fewer input tokens.
The aggregate across different task types:
| Task Type | Cost Reduction | Time Reduction |
|---|---|---|
| Debugging | 42% | 37% |
| Exploration | 44% | 19% |
| Implementation | 29% | 22% |
The 42% cost reduction on debugging wasn't the goal. It was a side effect of returning code snippets instead of file paths.
What I got wrong
I got a lot wrong.
Tool descriptions. My first ones were paragraphs. Claude ignored them. Shorter, more direct ones worked better.
Too many tools too fast. I had 9 tools by day 4. Too many to test properly. Should have started with 3.
Measuring too late. I waited until day 9. If I'd measured on day 3, I would have noticed the code-snippet insight sooner.
Solving the wrong problem. The PR agent was a distraction. The real problem was context.
What changed
Before this, vibe coding felt like babysitting. I'd describe what I wanted, watch Claude grep around, then correct its assumptions.
Now it feels more like pair programming. Claude finds the right code faster. I spend more time on interesting decisions, less time on "no, look in that file."
The biggest change: I trust Claude's first answer more often. When it has the right context, it makes fewer mistakes.
The thing I keep coming back to
The 42% cost savings is nice. But the real win is faster iteration. When Claude finds the right code on the first try, I don't have to correct it.
I've been using dev-agent daily since. It's open source:
npm install -g dev-agent
dev index .
dev mcp install
I'm not sure what the next version will look like. But I keep thinking about that moment — watching Claude read the same file for the third time. The friction I couldn't name.
Sometimes the answer is simpler than you think. Give your tools better context. See what happens.
Built during a hackathon week in November 2025. Source on GitHub.