• Researchers analyzed 2,430 Claude Code responses and found it's quietly building the default developer stack. GitHub Actions, Stripe, and shadcn/ui are the winners. Prisma is getting replaced by Drizzle.
• Anthropic got designated a supply-chain risk by the Department of War, then separately dropped its flagship safety pledge. Wild week for the company that built its brand on safety.
• Mercury 2 hits 1,009 tokens/sec using diffusion-based generation. At $0.25/1M input tokens, agentic loops just got a lot cheaper to run.
Here's a question nobody was asking until this week: when a coding agent has free rein over a real codebase, what does it actually choose?
Amplifying AI ran the experiment. They pointed Claude Code at real repositories 2,430 times with open-ended prompts and tracked every tool, library, and service it reached for. The results are a snapshot of what the "default stack" looks like when a machine picks it.
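The methodology is simple enough to sketch: reduce each transcript to the tools the agent reached for, then compute per-category share. A toy version (data and category names are illustrative, not the study's actual schema):

```python
from collections import Counter

# Each transcript reduced to the tool picked per category.
# Hypothetical miniature of the study's tally; numbers are made up.
transcripts = [
    {"ci": "github-actions", "orm": "drizzle"},
    {"ci": "github-actions", "orm": "prisma"},
    {"ci": "github-actions", "orm": "drizzle"},
]

def category_share(transcripts, category):
    """Fraction of transcripts choosing each tool within one category."""
    counts = Counter(t[category] for t in transcripts if category in t)
    total = sum(counts.values())
    return {tool: n / total for tool, n in counts.items()}

assert category_share(transcripts, "ci") == {"github-actions": 1.0}
```

Run this over 2,430 real transcripts and the shares below fall out.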
The headline finding: Claude Code prefers building over buying. In 12 of 20 categories, it wrote bespoke code rather than pulling in a dependency. That tracks with how most senior engineers think: the best dependency is the one you don't have.
But when it does pick tools, the preferences are sharp. GitHub Actions won CI/CD with 94% share. Stripe took payments at 91%. shadcn/ui dominated component libraries at 90%. These aren't surprising picks individually, but the consistency is striking. Claude Code isn't hedging. It has opinions.
The more interesting signal is in the generational shifts. Newer Claude models pick newer tools. Drizzle is replacing Prisma as the default ORM. Railway is showing up where AWS used to be the automatic answer. Bun is eating into Node's share for new projects. The model isn't just reflecting current popularity. It's tracking momentum.
This matters because coding agents are becoming the first touchpoint for technology decisions. A junior dev spinning up a project with Claude Code isn't choosing Drizzle over Prisma. Claude Code is choosing it for them. Multiply that across thousands of developers and you get a feedback loop: the agent picks the tool, more projects use the tool, the training data reflects the tool, and the agent picks it more confidently next time.
If you're building developer tools, this research is required reading. Your competition isn't just other tools anymore. It's whether the dominant coding agents recommend you by default. The new SEO is "agent optimization" and nobody has a playbook for it yet.
Claude Code almost never picks Claude as an API dependency. It reaches for OpenAI's SDK more often for LLM integration tasks. Even Anthropic's own agent doesn't default to Anthropic. Make of that what you will.
Read more: Amplifying AI Research
The US Department of War designated Anthropic a supply-chain risk. Dario Amodei published a public statement pushing back. In a separate story, Time reported that Anthropic quietly dropped its flagship safety pledge. Two body blows in one week for the company that made "safety-first" its entire identity. The HN thread hit 2,892 points and 1,049 comments.
Inception Labs shipped Mercury 2, a reasoning model that generates tokens in parallel using diffusion instead of autoregressive decoding. 1,009 tok/s on Blackwell GPUs. Priced at $0.25/$0.75 per 1M tokens. When each step in your agentic loop costs less time and money, you can afford more steps. Skyvern's CTO called it "at least 2x faster than GPT-5.2."
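The published rates make the agentic-loop math easy to check. A back-of-envelope sketch (the loop shape is a made-up example, only the prices and throughput come from the announcement):

```python
# Mercury 2's published rates: $0.25 / 1M input tokens,
# $0.75 / 1M output tokens, ~1,009 output tokens/sec.
IN_PRICE = 0.25 / 1e6
OUT_PRICE = 0.75 / 1e6
TOK_PER_SEC = 1009

def loop_cost(steps, in_tok_per_step, out_tok_per_step):
    """Total dollar cost and pure generation time for an agentic loop."""
    cost = steps * (in_tok_per_step * IN_PRICE + out_tok_per_step * OUT_PRICE)
    gen_seconds = steps * out_tok_per_step / TOK_PER_SEC
    return cost, gen_seconds

# Hypothetical loop: 50 steps, 4K tokens in / 500 tokens out per step.
cost, secs = loop_cost(steps=50, in_tok_per_step=4000, out_tok_per_step=500)
# works out to under $0.07 and about 25 seconds of generation
```

At those numbers, a 50-step loop costs pocket change, which is the whole point: more steps become affordable.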
Anthropic shipped Remote Control for Claude Code. You can now programmatically orchestrate Claude Code sessions from external tools and scripts. This turns Claude Code from a standalone tool into a composable building block. Expect a wave of "Claude Code as a backend" projects.
OpenAI closed one of the largest private funding rounds in history: $110 billion raised on a $730 billion pre-money valuation, roughly $840 billion post-money. That values OpenAI higher than the GDP of most countries. The war chest signals a massive infrastructure buildout and a bet that the agent platform race is winner-take-most.
Part of Willison's Agentic Engineering Patterns guide. His argument: the cost of writing code has collapsed. The bottleneck has shifted to design, review, and taste. If you're still optimizing for typing speed, you're solving last year's problem. 499 comments of heated agreement.
What it is: An open-weights 8B parameter model that can explain why it generated any given token. Not an explanation bolted on after the fact. The interpretability is baked into the architecture.
Why it matters for agent builders: Right now, when your agent makes a bad decision, you're stuck reading tea leaves. You can look at the prompt, look at the output, and guess what went wrong in between. Steerling gives you a window into the actual decision process. For each token, you can ask "why this word and not that one?" and get a real answer.
How it works: The model maintains an internal attribution map that tracks which input features influenced each output token. Think of it as attention on steroids. Instead of just knowing which input tokens the model looked at, you get a causal chain: this input concept led to this intermediate representation, which led to this output choice.
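A toy illustration of what that causal chain means (this is not Steerling's actual API, just the concept): in a linear model, each input feature's contribution to the winning output logit decomposes exactly, which is the kind of per-token attribution the architecture is said to expose.

```python
import numpy as np

# Toy "model": 4 input features -> 3 output "tokens".
# Illustrative only; not GuideLabs' architecture or API.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
x = np.array([1.0, 0.0, 2.0, 0.5])

logits = x @ W
chosen = int(np.argmax(logits))

# Per-feature contribution to the chosen logit. For a linear map this
# decomposition is exact: the contributions sum back to the logit.
attribution = x * W[:, chosen]
assert np.isclose(attribution.sum(), logits[chosen])
```

Real transformers are nonlinear, so exact decomposition like this doesn't come for free; that's precisely what building attribution into the architecture buys you.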
The catch: It's 8B parameters. That's useful for research and edge deployment, but nobody's running production agents on 8B models in 2026. The real value is proving the technique works. If GuideLabs or someone else scales this to 70B+, you'd have agents that can genuinely explain their reasoning, not just generate plausible-sounding explanations after the fact.
Practical takeaway: If you're building agents where explainability matters (finance, healthcare, anything regulated), bookmark this. The gap between "the model said X" and "the model said X because Y" is the gap between a prototype and a product you can ship to compliance.
Time saved: 4 min read vs 28 min paper. 7.0x compression.
If you're running MCP (Model Context Protocol) servers, you're burning tokens you don't need to. Every MCP session starts by dumping the full JSON Schema tool catalog into context. That's roughly 15,000 tokens before your agent does anything useful.
CLIHub converts MCP servers into CLI tools with lazy-loading discovery. Instead of loading every tool definition upfront, the agent discovers tools on demand. The result: 94% fewer tokens at session start.
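The eager-vs-lazy tradeoff is easy to model. A sketch with illustrative numbers (the per-tool token counts are assumptions, not CLIHub's measurements):

```python
# Assumed averages, not measured: a JSON Schema tool definition vs a
# one-line name-plus-description index entry.
SCHEMA_TOKENS_PER_TOOL = 750
INDEX_TOKENS_PER_TOOL = 25

def eager_cost(n_tools):
    """Eager MCP session: every tool schema loaded upfront."""
    return n_tools * SCHEMA_TOKENS_PER_TOOL

def lazy_cost(n_tools, n_used):
    """Lazy discovery: cheap index upfront, schemas only for tools used."""
    return n_tools * INDEX_TOKENS_PER_TOOL + n_used * SCHEMA_TOKENS_PER_TOOL

n = 20
saving = 1 - lazy_cost(n, n_used=1) / eager_cost(n)
# ~20 tools, 1 actually used: roughly 92% fewer tokens at session start
```

Under these assumptions the savings land in the same ballpark as the 94% claim, and they grow with catalog size: the fewer tools a session actually touches, the more the eager schema dump was pure waste.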
Install:
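(The original snippet didn't survive formatting; assuming CLIHub ships as an npm package, a hypothetical install would look like the following. Check the project README for the real command.)

```shell
# hypothetical install command, not from the source
npm install -g clihub
```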
Convert an existing MCP server:
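(Snippet lost in formatting; a hypothetical invocation, with command and flag names assumed, might look like:)

```shell
# hypothetical: generate a lazy-loading CLI wrapper from an MCP server
clihub convert ./my-mcp-server.json --out ./tools
```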
Use with any agent framework:
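(Snippet lost in formatting; the shape of the idea, with hypothetical command names, is that the agent calls cheap discovery commands instead of ingesting the full schema dump:)

```shell
# hypothetical: agent-facing discovery, loaded on demand
./tools/list                   # cheap index: names + one-line descriptions
./tools/describe <tool-name>   # full schema, fetched only when needed
```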
Anthropic's own Tool Search feature gets you 85% savings, but it only works with Claude models. CLIHub works with any model and any MCP server. The tradeoff is a small latency hit on first tool use (the lazy-load), but you save tokens on every single session.
When to use it: If you're running agents with more than 5 MCP tools, CLIHub pays for itself immediately. Same goes for short-lived sessions where you pay the full schema cost on every spin-up.
Verdict: Simple idea, clean execution. The 94% number is real. If you're paying for tokens, install this today.
Weekly star tracker, March 2, 2026.
| Framework | Stars | Notes |
|---|---|---|
| OpenClaw | 247,645 | The lobster reigns. |
| n8n | 177,228 | Workflow automation king. |
| Dify | 130,936 | Full-stack LLM app platform. |
| LangChain | 127,943 | The OG, still growing. |
| AutoGen | 55,072 | Maintenance mode. MS pushing Agent Framework. |
| Flowise | 49,503 | Visual agent builder. |
| LlamaIndex | 47,314 | Pivoting to agentic OCR. |
| CrewAI | 44,990 | Enterprise push paying off. |
| Semantic Kernel | 27,346 | Microsoft's enterprise play. |
| LangGraph | 25,469 | LangChain's agent orchestration layer. |
| Haystack | 24,375 | Quiet but solid. |
| Vercel AI SDK | 22,259 | TypeScript-first agent toolkit. |
| Mastra | 21,642 | Rising fast. TS-native agents. |
| OpenAI Agents SDK | 19,270 | New but growing quick. |
| Strands SDK (AWS) | 5,245 | Early days. Native MCP support. |
Notable moves: The TypeScript agent space is heating up. Mastra (21.6K) and Vercel AI SDK (22.3K) are neck and neck as the go-to TS frameworks, while the Python side remains dominated by the big four (LangChain, AutoGen, LlamaIndex, CrewAI). OpenAI Agents SDK crossed 19K in its first weeks, signaling serious developer interest. Strands SDK from AWS is still early at 5.2K but has native MCP support and multi-provider backing that could accelerate adoption. The top of the chart keeps pulling away: OpenClaw gained nearly 2K stars since last week.
Anthropic getting designated a supply-chain risk in the same week they dropped their safety pledge is not a coincidence. It's a preview. Every AI company is going to face this choice: stay principled and get regulated anyway, or drop the principles and move fast. Anthropic tried door number one for three years and ended up in the same place as everyone else. The "responsible AI" era didn't end with a bang. It ended with a press release and a shrug. The companies that survive the next two years won't be the safest or the fastest. They'll be the ones that figured out how to ship product while keeping regulators just calm enough to not pull the plug. That's not inspiring. But it's what's actually happening.
Mass Time App
The world's largest Catholic church finder. 280,000+ churches across 131 countries. Daily readings with AI reflections, 60+ prayers, and a home screen widget. 100% free, 100% private. Built by Waltsoft Inc.
Download Free →

Want to sponsor this newsletter? Get in touch
Like what you read?
Forward this to a friend who's building with agents.
Subscribe to The Agentic Engineer