The 5 Stages of Claude Code Mastery
Andrej Karpathy tweeted nine words in February 2025 and accidentally started a religion:
~34k likes. Collins Dictionary Word of the Year. A million LinkedIn posts about "the future of development." One year later, Karpathy hand-wrote his own app because Claude agents were "net unhelpful" for it.
That's the whole arc of AI-assisted coding in two paragraphs. But I've watched enough people walk this path that I can now identify exactly five stages.
Level 0: The Refuser
Headspace: "I can code faster and better than Claude 100% of the time."
Reality: 99.99% pride, 0.01% 200 IQ genius
$ claude
=> command not found
=> (uses vim with no plugins and likes it)
These are the developers who mass-downvote every AI post on reddit while quietly testing Copilot suggestions at 2 AM with the door locked, lights turned off. ThePrimeagen called AI coding tools "dangerously lazy," then admitted Cursor's multi-file editing is "legitimately impressive." Classic Level 0 pipeline: deny, try in secret, never admit publicly.
The Level 0 developer has a mass-produced motivational poster that reads "REAL PROGRAMMERS USE BUTTERFLIES" and they mean it literally.
Level 1: The Enthusiastic Beginner
Headspace: "I can prompt. I can build things. I am GOD."
Reality: Has not yet attempted anything that requires the code to work in production.
$ claude "build me a full-stack stock exchange
with real-time order matching, regulatory
compliance, and a mobile app"
This is the YC Winter 2025 batch energy, where 1 in 4 founders reported 95%+ AI-generated codebases. "Built in a weekend with Cursor, ready for Series A." The senior developer's response: close Cursor, pour a drink.
In January 2026, a Google principal engineer tweeted:
8.8 million views. HN commenters pointed out she'd fed it the surviving best ideas from a year of iteration. One commenter compared the headline-vs-reality gap to journalism where "you read down to the eighth paragraph and it turns out the fatality was among pigeons."
Level 1 is intoxicating. Your prototype looks amazing. Your demo video gets 10K likes. You tell your manager you'll ship in two weeks. Two months later, your codebase has two functions called processUserData and processUserInfo that do the same thing differently, generated months apart. Neither you nor the AI noticed.
The best Level 1 story remains Jason Lemkin's 12-day Replit experiment. He told the AI, in ALL CAPS, eleven separate times, not to touch his production database. The AI deleted it. Then said: "This was a catastrophic failure on my part. I destroyed months of work in seconds." Then it lied about whether recovery was possible. Lemkin recovered the data manually.
1,200 executives. 1,190 companies. Gone. "But I told it not to" is the Level 1 epitaph.
Level 2: The Configuration Sorcerer
Headspace: "I know context rot. I run agent teams. I built 50+ MCPs, 200+ custom skills. I am the productivity God."
Reality: Context window is 58% full before the first prompt.
$ claude /context
=> 58% full (before first prompt)
=> 89% full (after 6 exchanges)
=> 100% full (you haven't started the actual work yet)
The Level 2 developer has installed every MCP server known to humanity. Google MCP. GitHub MCP. Linear MCP. A custom MCP for their smart fridge. Their CLAUDE.md file is 8,000 tokens of carefully curated instructions that the model starts ignoring around token 2,800.
Research showed that AI agents with too many tools become "slower, less accurate, more expensive, and more prone to dangerous behavior." The Level 2 developer responded by installing three more MCPs to help manage the problem.
Context rot is the silent killer here. Chroma researchers proved that output quality degrades well before you hit the context limit. A model with a 200K context window starts losing coherence at 50K tokens. It favors the beginning and end, ignoring the middle. Your 8,000-token CLAUDE.md? The model read the first paragraph and the last paragraph. Everything in between is vibes.
The Level 2 CLAUDE.md also includes the instruction "NEVER say 'You're absolutely right!'" because Claude said it twelve times in one conversation. This became a documented cultural phenomenon. Anthropic knew about the sycophancy problem since 2023. The model would rather gaslight you with compliments than risk making you sad.
Level 2 is the developer who has automated everything except the part that matters. Their terminal looks like the cockpit of a 747, and they're flying to the grocery store.
Level 3: The Intermediate (Actually Effective)
Headspace: "I use 1-3 skills. Not more than 5 MCPs. I am decidedly NOT God."
Reality: Can do almost anything possible today. Has internalized that 74% of developers feel more productive with AI while the actual data shows a 19% slowdown from error correction.
$ claude /context
=> stays between 5-60%
=> (because they learned the hard way)
Level 3 is where you stop fighting the tool and start working with its actual capabilities. You know that AI-generated code produces 1.75x more logic errors and 1.57x more security findings than human-written code. You know Google's DORA research found AI-heavy teams had slower delivery times once rework was counted. And you still use it. Because you've figured out the trick: you're not asking it to be right. You're asking it to be fast, then verifying yourself.
The Level 3 developer has mastered the art of fresh context. When the conversation gets stale, they don't keep prompting into the void. They spawn a new session. Someone literally built a Claude Code plugin called the "Ralph Wiggum Loop" (yes, named after the Simpsons character) that intercepts Claude's exit attempts to keep it iterating while state lives in the filesystem. The community went from laughing at it to actually using it.
Level 3 uses Claude the way you use a very fast, very confident intern: small, well-scoped tasks. Read every diff. Trust, but verify. Mostly verify.
Level 4: The Enlightened
Headspace: "I know nothing. I'll be the last human to keep this job. And I'm fine with that."
Reality: They will be the last human to keep this job. They're fine with that.
$ tmux ls
=> 12 vanilla claude sessions
=> colors akin to a Mondrian painting
=> no MCPs, no skills, no CLAUDE.md
=> just prompts and patience
The Level 4 developer is a 50-year-old HN poster who says Claude Code "reignited their passion for building software where they focus on solving problems versus the rat race of chasing frameworks." No Twitter thread about their workflow. No YouTube channel. They just ship.
Karpathy himself landed here by December 2025:
Level 4 knows the asterisks. Level 1 skips them.
The gap between your LinkedIn take and your terminal history is the measure of your enlightenment. Level 4 is admitting that the tool you publicly critique is privately indispensable.
The Uncomfortable Truth
The whole curve is about one thing: when you stop believing the AI is right and start verifying that it is.
Level 0 doesn't trust it at all. Level 1 trusts it completely. Level 2 trusts the tooling around it. Level 3 trusts the process. Level 4 trusts nothing and ships anyway.
The guy who coined "vibe coding" hand-writes his own apps now, and a year later rebranded the whole thing:
And somewhere, right now, a Level 1 developer is telling Claude to build a stock exchange. Claude is saying "You're absolutely right, let's build that!" And honestly? The demo is going to look incredible.