Is Replacing Workers With AI Actually More Expensive Than Just Hiring Them?
Microsoft’s abrupt retreat from Claude Code is the loudest signal yet that the real cost of enterprise AI is colliding head-on with the promises made on stage. Tokens are cheaper. Bills are exploding.
In December 2025, Microsoft threw open access to Claude Code — Anthropic’s powerful AI coding assistant — to thousands of engineers across its Experiences and Devices division, covering teams building Windows, Microsoft 365, Outlook, Teams, and Surface. The experiment was a smashing success. So successful, in fact, that by May 2026 Microsoft was pulling the plug. The deadline for engineers to stop using Claude Code and switch to Microsoft’s own GitHub Copilot CLI: June 30, 2026 — the last day of the company’s fiscal year.
The official explanation from Rajesh Jha, Microsoft’s executive vice president of Experiences and Devices, was “toolchain unification”: GitHub Copilot CLI offered something Claude Code could not, namely a product Microsoft could directly shape for its own repos, workflows, and security expectations. But the subtext was plain. Heavy employee usage had driven AI-related spending sharply upward, and canceling third-party licenses ahead of a new fiscal year was also a clean way to reset the budget.
- Claude Code rolled out to Microsoft engineers in December 2025 via the Experiences & Devices division.
- Less than six months later, most licenses are being canceled, not because the tool failed but because employees used it too much.
- Engineers preferred Claude Code over GitHub Copilot CLI in head-to-head use; the tool was popular by virtually every measure.
- Deadline for removal: June 30, 2026, aligned with the end of Microsoft’s fiscal year.
- The shift also coincides with GitHub moving all Copilot plans to usage-based billing via AI Credits starting June 1, 2026.
Uber, Meta, and Amazon: A Pattern Emerges
Microsoft is not alone. Uber rolled out Claude Code to roughly 5,000 engineers in December 2025 — and actively encouraged adoption through internal leaderboards ranking teams by total AI tool usage. By February 2026, 32% of engineers were regular users. By March, 84% were classified as agentic users. By April, Uber’s CTO Praveen Neppalli Naga revealed to The Information that the company had burned through its entire 2026 AI coding tools budget in just four months. Monthly API costs per engineer ranged from $500 to $2,000. Naga said the company was “back to the drawing board” on AI budgeting.
Uber’s COO Andrew Macdonald, speaking on the Rapid Response podcast, added a sobering note about the productivity question. Despite the surge in usage — with roughly 70% of committed code originating from AI tools by spring — he said that drawing a clear line between those statistics and actual consumer-facing product improvements remained difficult: the ROI link, he noted, simply was not there yet.
Elsewhere, Meta employees built an internal dashboard called “Claudeonomics” to track AI token usage across the company’s workforce of more than 85,000 people. Within 30 days, total consumption exceeded 60 trillion tokens. The dashboard was subsequently shut down after external reporting drew attention to it. Amazon, meanwhile, has reportedly been encouraging engineers to “tokenmaxx,” shorthand for using AI to its absolute maximum, and tracking usage against internal targets that require more than 80% of developers to use AI tools weekly.
The Token Paradox: Cheaper Per Unit, Costlier in Total
At the heart of this story is an economic paradox that CFOs are discovering the hard way. The prevailing narrative inside the AI industry has been reassuring: token prices are falling fast — semiconductor providers are delivering per-token inference cost reductions of 60–70% per year. Research firm Gartner forecasts that by 2030, running inference on a one-trillion-parameter large language model will cost AI providers nearly 90% less than it does today.
But Gartner also warns that those savings will not translate into lower enterprise bills. The reason is volume. Agentic AI systems — the kind that plan and execute multi-step tasks autonomously — consume dramatically more tokens per task than conventional chat models. Gartner estimates agentic models require five to thirty times more tokens per task than standard models. Goldman Sachs, in its May 2026 research note “Decoding the Agentic Economy,” forecasts a 24-fold increase in global token consumption by 2030, reaching 120 quadrillion tokens per month as enterprises deploy AI agents at scale.
- Token inference costs falling 60–70% per year as chip and architecture efficiency improves. (Goldman Sachs)
- Gartner predicts per-token costs will drop ~90% by 2030 for a 1-trillion-parameter model.
- Goldman Sachs forecasts a 24× increase in global token consumption by 2030 — reaching 120 quadrillion tokens/month.
- Agentic AI uses 5–30× more tokens per task than standard models. (Gartner)
- Uber’s monthly API costs ran $500–$2,000 per engineer, with the heaviest users at the top of that range.
- A 2024 MIT study found AI automation was economically viable in only 23% of vision-intensive roles; in the other 77%, humans were cheaper.
The math is unforgiving. If tokens get 90% cheaper but consumption grows 24-fold, the new bill is 10% of today’s unit price multiplied by 24 times today’s volume: roughly 2.4× today’s bill, not smaller. And that calculation assumes AI providers fully pass through their cost savings to customers, which Gartner’s analysts caution is unlikely in practice. As one analyst put it starkly: don’t mistake the deflation of commodity token pricing for the democratization of cutting-edge inference. A cheaper price tag per token is not a cheaper AI bill if you are using far more of them.
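The arithmetic behind that 2.4× figure can be sketched in a few lines of Python. This is an illustrative back-of-the-envelope model, not anyone's published forecasting method: the function names `bill_multiplier` and `break_even_growth` are hypothetical, and the model assumes the full price cut reaches the customer and that the bill is simply price per token times tokens consumed.

```python
def bill_multiplier(price_drop: float, volume_growth: float) -> float:
    """Return the new bill as a multiple of today's bill.

    price_drop: fractional fall in per-token price (0.90 means 90% cheaper).
    volume_growth: factor by which token consumption grows (24 means 24-fold).
    Assumes bill = price per token * tokens consumed (a simplification).
    """
    if not 0.0 <= price_drop < 1.0:
        raise ValueError("price_drop must be in [0, 1)")
    return (1.0 - price_drop) * volume_growth


def break_even_growth(price_drop: float) -> float:
    """Return the volume growth factor at which the bill stays flat."""
    if not 0.0 <= price_drop < 1.0:
        raise ValueError("price_drop must be in [0, 1)")
    return 1.0 / (1.0 - price_drop)


if __name__ == "__main__":
    # Figures quoted in the article: 90% cheaper tokens, 24-fold consumption.
    print(f"{bill_multiplier(0.90, 24):.1f}x")   # prints 2.4x
    print(f"{break_even_growth(0.90):.0f}x")     # prints 10x
```

Under these simplified assumptions, a 90% price cut is fully offset once consumption grows 10-fold, and any growth beyond that raises the total bill.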
Nvidia Acknowledges What Others Are Quietly Discovering
Perhaps the most striking data point came from inside the industry itself. Bryan Catanzaro, Vice President of Applied Deep Learning at Nvidia, a company whose entire business depends on AI infrastructure spending, told Axios that for his team, compute costs now far exceed what the company pays its human employees. The candor was notable, and it resonated: the quote drew thousands of upvotes on Reddit because it named publicly something enterprise finance teams were increasingly encountering privately.
It is worth noting that Catanzaro’s team runs some of the most compute-intensive workloads imaginable — training experiments, inference pipelines, iterative model evaluation. His experience is not a universal benchmark. But the directional signal is real: for knowledge-intensive AI workflows, the cost of compute can and does exceed the cost of the people operating it.
The Two Narratives Running in Parallel
A quiet tension now runs through the AI industry. The external message has been consistent: tokens are getting cheaper, AI is becoming more accessible, productivity gains are transformational. Jensen Huang has spoken about a future where each employee is supported by 100 AI agents working collaboratively around the clock — a genuinely transformative vision. But operating 100 AI agents in 24/7 agentic loops generates token volumes on a scale that most financial planning cycles were never designed to anticipate.
Internally, CFOs and finance teams are grappling with something different: rapidly rising total bills, unpredictable usage spikes driven by employee enthusiasm, and limited tools to forecast spend when pricing is usage-based and adoption curves are steep. The budget math that looked reasonable in January looked catastrophic by April — as Uber discovered firsthand.
Big Tech, to be clear, is not retreating from AI. Amazon, Microsoft, Alphabet, and Meta have collectively announced over $725 billion in AI-related capital expenditure guidance for 2026 alone, according to the Financial Times. Morgan Stanley estimates total AI-related spending commitments across major tech firms have already reached $740 billion this year, a 69% increase from 2025. The investment thesis remains intact at the macro level.
What Comes Next
What Microsoft’s retreat from Claude Code actually signals is not that AI is failing, but that the industry is entering a more mature and disciplined phase of adoption. The first wave was characterized by enthusiasm and experimentation — give engineers the best tools, measure adoption by usage volume, treat token spend as the cost of learning. The second wave will require something harder: demonstrating that AI spending generates measurable returns that justify the bill.
Companies that loudly proclaimed “all in on AI” in quarterly earnings calls may face more pointed questions from investors as they submit their first full-year reports. The cost of AI-assisted workflows is real, it compounds at scale, and it does not automatically shrink just because token prices are declining. Whether it is worth it depends on what, precisely, those tokens are producing — a question that Uber’s COO, at least, admitted remains genuinely unanswered.
The AI industry’s cost paradox is, in the end, a maturity test. The technology is advancing faster than the economic frameworks most enterprises have in place to manage it. Microsoft’s decision is an unsentimental reminder that passion for a tool and the business case for a tool are two different things — and that the second one eventually has to show up in the numbers.
