2/7/2026
In 2025, a nonprofit called METR ran a proper randomized controlled trial on AI-assisted coding. Not a vibes survey, not a vendor benchmark, not "we asked 500 developers how they feel." An actual RCT with 16 experienced developers working on real codebases they'd maintained for years, using Cursor Pro with Claude Sonnet.
The result: AI made them 19% slower.
This was supposed to be the bad news. The story that proved the skeptics right. But I think it's the most important piece of evidence we have for why agentic coding actually works, and why the shift happening right now is real and permanent. The slowdown wasn't caused by bad tools. It was caused by what happens when experienced developers actually engage with AI output responsibly. They spent time reviewing AI suggestions, prompting and re-prompting, waiting for generations, and verifying results against their own understanding of the codebase. They didn't blindly accept. They applied judgment. And judgment takes time.
That 19% overhead is not a tax on productivity. It's the cost of engineering. And it's the exact thing that separates what Andrej Karpathy just started calling "agentic engineering" from the reckless "vibe coding" he accidentally named a year ago.
The naming matters more than you think
Three days ago Karpathy posted a retrospective on the first anniversary of his vibe coding tweet. He seemed almost embarrassed by it. A shower thought that now has its own Wikipedia article. But the correction he offered wasn't cosmetic. He proposed "agentic engineering" and defined it precisely: "agentic because the new default is that you are not writing the code directly 99% of the time, you are orchestrating agents who do and acting as oversight. Engineering to emphasize that there is an art and science and expertise to it. It's something you can learn and become better at, with its own depth of a different kind."
That last sentence is the one people are sleeping on. A different kind of depth. Not shallower than traditional coding. Different. The skill isn't gone, it migrated. You used to demonstrate competence by writing an elegant implementation. Now you demonstrate it by recognizing when the agent's implementation is subtly wrong, or by designing the constraints that prevent it from going wrong in the first place.
Addy Osmani crystallized the distinction: "Vibe coding = YOLO. Agentic engineering = AI does the implementation, human owns the architecture, quality, and correctness." That ownership is the whole game. Ownership requires understanding. Understanding requires fundamentals. Fundamentals require years of writing code the hard way.
The numbers back this up. The 2025 Stack Overflow Developer Survey (49,000 respondents) found that developer trust in AI accuracy dropped from 40% to 29% year over year, even as adoption climbed to 84%. The top frustration, cited by 45% of devs: AI solutions that are "almost right, but not quite." Two-thirds said they spend more time debugging AI output than expected. People are using these tools more and trusting them less. That's not a contradiction. That's a profession learning that the hard part was never the typing.
Which brings us to the paradox nobody wants to confront.
antirez measured what Karpathy theorized
While Karpathy was finding the right vocabulary, antirez (Salvatore Sanfilippo, creator of Redis) was measuring the compression ratio. In his post "Don't fall into the anti-AI hype" (which pulled 431,000 views because it earned every one of them), he documented what he actually built with Claude Code in the span of hours:
He reproduced weeks of Redis Streams work in about 20 minutes. He built a pure C library for BERT model inference in 5 minutes: 700 lines of code, same output as PyTorch, 15% slower. He fixed transient test failures in Redis, the kind of flaky timing-related bugs that are genuinely miserable to debug manually. He added UTF-8 support to his linenoise library along with a terminal emulation testing framework, something he'd wanted for years but couldn't justify the time investment.
His conclusion was blunt: "Writing code is no longer needed for the most part. It is now a lot more interesting to understand what to do, and how to do it."
But here's the part most people miss when they quote that line. antirez didn't hand Claude Code a vague prompt and accept whatever came back. He had design documents, years of context about his own codebases, and the ability to look at Claude's output and immediately tell when something was off. He inspected, provided guidance, and iterated. He was doing agentic engineering before the term existed. Give an inexperienced developer the same tools and the same prompts and they'd produce garbage. Confidently. In 5 minutes.
That's the thing the METR study actually proved. AI doesn't replace judgment. It amplifies whatever judgment you already have. If you have good judgment, it's a force multiplier. If you don't, it's a confidence multiplier. And confidence without competence is how you ship bugs at scale.
The real shift: from typist to architect
I've been living this transition in a small, concrete way. I run a personal AI agent (OpenClaw) on a VPS, orchestrated entirely through Telegram. I built the system from scratch: hardened the server after a malware scare, set up a Tailscale mesh for zero-port access, configured workspace files that shape the agent's personality and operational rules, built a custom Notion integration so it can publish blog posts directly. I switch between DeepSeek V3.2 for daily tasks and Kimi K2.5 for writing. I'm building cron jobs for morning briefings, overnight research, and weekly behavioral audits.
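The model-per-task split is the kind of decision that lives in config, not in prompts. As a rough sketch (the task names and routing function here are invented for illustration, not OpenClaw's actual configuration), it amounts to a small lookup with a sensible default:

```python
# Illustrative sketch of per-task model routing.
# Task names and structure are hypothetical, not the real config.
TASK_MODELS = {
    "daily": "deepseek-v3.2",   # general chat, ops tasks, quick lookups
    "writing": "kimi-k2.5",     # long-form drafting and editing
}

def pick_model(task_kind: str) -> str:
    """Return the model for a task, falling back to the daily driver."""
    return TASK_MODELS.get(task_kind, TASK_MODELS["daily"])
```

The point of keeping this explicit rather than ad hoc is that swapping a model for a class of tasks becomes a one-line change instead of a habit you have to remember.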
I am not vibe coding. I haven't written a single line of code "by vibes." Every decision about how this system works was an architecture decision. What model for what task. What tools the agent can access. What constraints it operates under. What it's allowed to do autonomously versus what requires my confirmation. The prompts matter, but the system design matters more.
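The autonomous-versus-confirmation boundary can be sketched as a simple gate. Everything below is an illustrative assumption (the action names and allowlist are invented for the example), but it shows the shape of the constraint: a short allowlist of actions the agent may run unattended, with everything else held for human approval.

```python
# Sketch of an autonomy gate: allowlisted actions run unattended,
# everything else waits for human confirmation. Names are illustrative.
AUTONOMOUS_ACTIONS = {"read_notes", "fetch_feed", "draft_post"}

def dispatch(action: str, confirmed: bool = False) -> str:
    """Run allowlisted actions immediately; queue the rest for approval."""
    if action in AUTONOMOUS_ACTIONS or confirmed:
        return f"run:{action}"
    return f"needs_confirmation:{action}"
```

The design choice worth noting: the default is "ask," and autonomy is the exception you opt actions into, never the other way around.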
This is what I think the real shift looks like for working software engineers. We are not becoming obsolete. We are being promoted to the job we should have been doing all along. Most of what we called "software engineering" was actually implementation labor: translating a known design into syntax. The design part, the part where you decide what to build, how the components interact, what the failure modes are, what tradeoffs you're willing to accept, was always the actual engineering. We just spent 90% of our time on the other thing because there was no alternative.
Now there is. And the engineers who thrive will be the ones who were already good at the design part, or who develop that skill fast. The ones who struggle will be those whose entire value proposition was "I can write React components quickly" or "I memorized the API surface of Spring Boot." That's not engineering. That's typing with context.
The pipeline problem nobody is solving
But here's where my optimism runs into a wall.
antirez wrote that he worries about people getting fired. That's a legitimate concern. But I worry about something downstream that might be worse: where do the next senior engineers come from?
Every senior engineer I know, myself included, built their judgment by writing bad code for years. You learn what good architecture looks like by living through the consequences of bad architecture. You learn to spot subtle bugs because you've spent painful hours debugging similar ones by hand. You develop taste in code because you've read thousands of lines of it, good and terrible, and built an intuitive sense for what's right.
If junior developers enter the industry in 2026 and their primary workflow is prompting agents and accepting output they don't fully understand, they will ship code. They might even ship it fast. But they won't be building the muscle memory, the scar tissue, the hard-won intuition that turns a junior into a senior. Osmani flagged this as "dangerous skill atrophy" and I think he's underselling it. It's not atrophy of existing skills. It's the prevention of skills that were never developed in the first place.
The METR study showed that AI slowed experienced developers down. Imagine what it does to inexperienced ones. Not slower, because they were never fast in the right ways. Faster at producing things they can't evaluate. That's not productivity. That's technical debt with a smiley face.
antirez ended his post with something generous: "The fun is still there, untouched." He's right, for people like him. For someone with 20 years of systems programming intuition, AI is pure leverage. You finally get to spend all your time on the interesting problems. But for someone just starting out, the interesting problems are invisible if you've never struggled with the boring ones first. Debugging a segfault for 6 hours teaches you something that no amount of "paste the error back into Claude" ever will.
Where this actually goes
I don't think AI kills software engineering. I think it kills the version of software engineering that was already dying: the commodity implementation work that we pretended was engineering because it required a CS degree to do it at a minimally competent level.
What replaces it is harder and more valuable. System design. Constraint definition. Agent orchestration. Quality ownership. Failure mode analysis. The stuff that was always the real job, now finally visible because the noise of implementation has been stripped away.
Karpathy is right that agentic engineering is a skill you can learn and get better at, with its own depth. antirez is right that refusing to engage with these tools is self-sabotage. The METR study is right that this isn't free, and that the overhead of verification is where the actual engineering happens.
But we need to solve the pipeline problem. If we build a world where AI handles all the implementation and humans handle all the judgment, we need to figure out how humans develop that judgment without the implementation reps. Nobody has a good answer for this yet. And if we don't find one, we'll end up with a generation of "agentic engineers" who can orchestrate agents beautifully but can't tell when the agents are confidently, eloquently wrong.
That's not a profession being upgraded. That's a profession eating its own seed corn.