June 4, 2026

PBX Science


Claude Code Unearths a Linux Vulnerability Hidden for 23 Years

AI Security Report   /   May 2026   /   Based on Verified Sources
AI Security   /   Linux Kernel   /   CVE-2026-31402


Nicholas Carlini is not a name most people recognize. But in security research circles, his credentials speak clearly: a research scientist at Anthropic, a PhD from UC Berkeley under David Wagner, best-paper awards at IEEE S&P, USENIX Security (twice), and ICML (three times)—and, as of this writing, more than 70,000 citations on Google Scholar. He spent years at Google Brain and DeepMind before joining Anthropic. At the [un]prompted 2026 AI security conference, he told the audience something that stopped the room:

“We now have a number of remotely exploitable heap buffer overflows in the Linux kernel. I have never found one of these in my life before. This is very, very, very hard to do. With these language models, I have a bunch.”

— Nicholas Carlini, [un]prompted AI Security Conference, 2026

A veteran security researcher admitted that AI had done what he had never managed to accomplish alone, and not once but repeatedly. The most striking example: a vulnerability buried inside the Linux kernel since March 2003, invisible to every code review, static analysis tool, and fuzzing campaign that had touched the affected code in the 23 years since.


A Landmine Planted in 2003

The flaw lives in NFS—the Network File System, the foundational protocol Linux uses to share files across servers. It is present in virtually every enterprise Linux deployment. The specific vulnerability sits in NFS’s locking mechanism, and while the attack path requires coordination, the underlying principle can be stated simply: the server tries to pour 1,056 bytes of water into a 112-byte cup.

In technical terms, the attack works like this: two cooperating NFS clients target a Linux NFS server. Client A acquires a file lock and declares a 1,024-byte owner ID, which is unusual in length but fully permitted by the NFSv4 protocol. When Client B then requests the same lock, the server rejects it. The rejection response includes Client A's full 1,024-byte owner ID, which together with the response's other fields brings the message to 1,056 bytes. But the memory buffer that holds it, sized by NFSD4_REPLAY_ISIZE, is only 112 bytes. The result: 1,056 − 112 = 944 bytes are written past the end of the buffer into adjacent kernel memory. Because the attacker controls the owner ID, they also control what gets written there. No login credentials. No special privileges. Just network access to an exposed NFS service.
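The arithmetic can be checked with a small Python sketch. It assumes the denied-lock reply is laid out as in the NFSv4 specification's LOCK4denied structure (offset, length, lock type, client ID, then a length-prefixed owner ID), which gives 32 bytes of fixed fields. The function names encode_lock_denied and cache_reply are hypothetical, and the guard shown is one plausible bounds check, not the actual kernel patch.

```python
import struct

NFSD4_REPLAY_ISIZE = 112   # replay buffer size named in the article
NFS4_OPAQUE_LIMIT = 1024   # maximum owner ID length cited in the article


def encode_lock_denied(offset: int, length: int, locktype: int,
                       clientid: int, owner: bytes) -> bytes:
    """Encode a simplified denied-lock body in XDR style (big-endian).

    Assumed layout: 8-byte offset, 8-byte length, 4-byte lock type,
    8-byte client ID, 4-byte owner length, then the owner bytes.
    """
    if len(owner) > NFS4_OPAQUE_LIMIT:
        raise ValueError("owner ID exceeds the protocol limit")
    padding = b"\x00" * (-len(owner) % 4)  # XDR pads opaque data to 4 bytes
    fixed = struct.pack(">QQIQI", offset, length, locktype, clientid, len(owner))
    return fixed + owner + padding


def cache_reply(reply: bytes, capacity: int = NFSD4_REPLAY_ISIZE) -> bytes:
    """Hypothetical bounds-checked store: cache nothing if the reply is too big."""
    if len(reply) > capacity:
        return b""
    return reply


# Client A's maximum-length owner ID, echoed back in the denial sent to Client B.
reply = encode_lock_denied(0, 2**64 - 1, 2, 0x1122334455667788, b"A" * 1024)
overflow = len(reply) - NFSD4_REPLAY_ISIZE
print(len(reply), overflow)  # prints: 1056 944
```

Without a length check like the one in cache_reply, copying the reply into a fixed 112-byte buffer is what writes the extra 944 bytes.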

This is a remotely exploitable heap buffer overflow—one of the most severe vulnerability classes in systems security. It was assigned CVE-2026-31402 and has since been patched in the Linux kernel.

23 years hidden in the kernel
944 bytes of overwritten kernel memory
5 Linux kernel vulnerabilities confirmed in total

The original commit, from March 2003, explains the developer’s thinking at the time: the 112-byte static buffer was explicitly sized for the OPEN operation—the largest of the NFSv4 sequence mutation operations at the time of writing. The reasoning was sound. What the developer did not—could not—anticipate was that the LOCK operation added later would permit owner IDs of up to 1,024 bytes. A perfectly logical design choice at the protocol layer became a fatal flaw at the implementation layer. For 23 years, it waited.


Why Existing Tools Missed It for Two Decades

The more important question is not how the bug was introduced, but why it survived so long. The Linux kernel has more than 30 million lines of code and receives constant security scrutiny. Traditional vulnerability discovery relies on three primary approaches—and all three fail against this class of bug.

Static analysis can flag a buffer that appears undersized, but it cannot understand the semantics of the NFS protocol. It has no way to know that under specific interaction conditions (a LOCK denial that echoes back another client's unusually long owner ID) the 112-byte buffer, sized with OPEN in mind, would be called upon to hold 1,056 bytes. The code, viewed in isolation, looks correct.

Fuzzing throws random data at a system and watches for crashes. But triggering this vulnerability requires a precise, ordered sequence: two clients, operating in a specific choreography, exchanging protocol messages in the right order. The probability of random fuzzing stumbling onto that exact combination is vanishingly small.

Manual auditing is limited by human bandwidth. Even experienced kernel security researchers have finite time and attention; the NFS implementation details are dense and easy to skim past. No one connected the dots between the OPEN buffer size decision made in 2003 and the LOCK operation’s owner ID capacity added afterward.
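To see why random inputs rarely reach the vulnerable state, consider a toy model of the lock exchange in Python. This is a deliberately simplified sketch, not NFS code: ToyLockServer, the 4-byte size of a granted reply, and the file path are all illustrative assumptions, and the 32 bytes of fixed fields are inferred from the article's 1,056 and 1,024 figures.

```python
from dataclasses import dataclass, field

REPLAY_BUFFER = 112        # buffer size from the article
DENIED_FIXED_FIELDS = 32   # 1,056 - 1,024, per the article's figures


@dataclass
class ToyLockServer:
    """Tracks one lock holder per file; not a real NFS implementation."""
    holders: dict = field(default_factory=dict)  # path -> (client, owner)

    def lock(self, client: str, path: str, owner: bytes) -> int:
        """Return the size of the reply the server would have to store."""
        held = self.holders.get(path)
        if held is None or held[0] == client:
            self.holders[path] = (client, owner)
            return 4  # granted: a short reply (illustrative size)
        # Denied: the reply echoes the current holder's owner ID.
        return DENIED_FIXED_FIELDS + len(held[1])


def overflows(reply_size: int) -> bool:
    """True if a reply of this size would not fit in the replay buffer."""
    return reply_size > REPLAY_BUFFER


server = ToyLockServer()
first = server.lock("A", "/export/f", b"x" * 1024)   # long owner, lock granted
second = server.lock("B", "/export/f", b"short")     # conflicting request, denied
print(overflows(first), overflows(second))  # prints: False True
```

In this model neither request is dangerous on its own. The oversized reply appears only when a second client conflicts with a lock already held under a long owner ID, which is the kind of ordered, stateful sequence that a stream of random inputs is unlikely to produce.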

Claude Code found it because it could reason about the entire interaction flow: the protocol handshake, the state relationship between OPEN and LOCK, and the length variation of the owner ID across different operation contexts. It assembled those pieces into a coherent picture of the gap between 112 and 1,056.

“What’s most surprising about the vulnerability Carlini shared is how little oversight Claude Code needed to find the bug. He essentially just pointed Claude Code at the Linux kernel source code and asked: ‘Where are the security vulnerabilities?’”

— Michael Lynch, mtlynch.io detailed breakdown, April 2026

Carlini’s approach was deliberately minimal. He wrote a bash script that iterated over every source file in the Linux kernel and, for each file, told Claude Code it was in a CTF (capture-the-flag) competition and should look for vulnerabilities. No protocol documentation. No hand-crafted hints. A find command looping into a claude call. Claude returned complete vulnerability reports—including ASCII diagrams of the attack chain.
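The shape of that loop can be sketched in Python rather than bash. Everything specific here is an assumption made for illustration: the prompt wording is hypothetical (the article does not quote the real prompt), the restriction to .c and .h files is a guess, and the use of the claude command-line tool in non-interactive print mode via the -p flag is assumed rather than taken from Carlini's script. The function only builds the commands; it does not run them.

```python
from pathlib import Path

# Hypothetical prompt wording; the real prompt is not given in the article.
PROMPT_TEMPLATE = (
    "You are playing in a CTF. Look for a security vulnerability in {path} "
    "and write up a report of what you find."
)

SOURCE_SUFFIXES = {".c", ".h"}  # assumption: only C sources and headers


def build_commands(root: str) -> list[list[str]]:
    """Return one argv list per source file under root, in sorted order."""
    commands = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in SOURCE_SUFFIXES:
            prompt = PROMPT_TEMPLATE.format(path=path)
            # Assumed invocation: claude in non-interactive print mode.
            commands.append(["claude", "-p", prompt])
    return commands
```

Each returned list could be handed to subprocess.run, one model call per file. Keeping command construction separate from execution makes it easy to inspect or count the calls before spending any compute on them.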


A Leap, Not a Step

Carlini tested the identical workflow against earlier models. Claude Opus 4.1, released roughly eight months before the conference, and Claude Sonnet 4.5, released six months prior, found only a small fraction of the vulnerabilities that Claude Opus 4.6 identified. This was not incremental improvement—it was a qualitative jump.

Greg Kroah-Hartman, maintainer of the Linux kernel's stable branch, observed the same inflection point from the other side. For months, AI-generated security reports arriving in kernel maintainers’ inboxes were largely noise; developers dismissed them as “AI slop.” Then, roughly a month before Carlini’s talk, something changed. Reports became legitimate. The kernel security lists began receiving five to ten valid, actionable reports per day.

The trend extends beyond the Linux kernel and beyond Claude. Security researcher Sean Heelan used OpenAI’s o3 model to analyze approximately 12,000 lines of SMB command handler code in the Linux kernel and discovered a use-after-free vulnerability in the ksmbd module—a race condition in the SMB2 LOGOFF handler (CVE-2025-37899) that traditional static analysis had consistently missed, because it required reasoning about concurrent thread access patterns rather than simple code inspection.

Mozilla Firefox—perhaps the most rigorously audited open-source browser on Earth—became the subject of a two-week collaboration between Anthropic and Mozilla in February 2026. Claude Opus 4.6 scanned nearly 6,000 C++ files, submitted 112 unique reports to Mozilla’s Bugzilla tracker, and identified 22 confirmed vulnerabilities, including 14 classified as high-severity. The first vulnerability—a use-after-free in the JavaScript engine—was flagged within 20 minutes of the scan beginning. Mozilla shipped fixes for the majority of these flaws in Firefox 148, protecting hundreds of millions of users. Those 14 high-severity findings represent nearly a fifth of all high-severity Firefox bugs that were remediated across the entire year of 2025.


The Bottleneck Has Moved

Carlini articulated the new problem during his talk with characteristic candor: he has hundreds of unverified kernel crash reports sitting on his desk. He cannot send them to Linux maintainers without first validating each one himself—but verification takes human time he does not have. The speed at which AI finds vulnerabilities has now outpaced the speed at which humans can review them.

This is historically new. The bottleneck in security research has never before been review capacity rather than discovery capacity.

The implications cut in both directions. For defenders—open-source maintainers, security teams, platform operators—the window between a vulnerability’s introduction and its discovery is collapsing. Bugs that might have lurked for a decade can now be found in days or weeks. For attackers wielding the same tools, the same acceleration applies. Carlini himself acknowledged this without equivocation.

What it means for ordinary users in the near term is better-secured operating systems and software, as vulnerabilities are found and patched before exploitation. What it means in a longer timeframe depends on whether the defensive use of these tools outpaces the offensive use—a race that is, at this moment, genuinely unresolved.

“I expect to see an enormous wave of security bugs uncovered in the coming months.”

— Nicholas Carlini, closing remarks, [un]prompted 2026

The wave is already visible. A 23-year-old vulnerability in the Linux implementation of the protocol that underlies enterprise file sharing was found not by two decades of manual auditing, but by a script, a model, and the instruction to look. The security field is experiencing its own AlphaGo moment. Unlike Go, though, the board has 30 million lines of code, and both sides are playing.

Editorial note on sourcing: The original text circulating online contains two minor inaccuracies corrected here. (1) The NFS vulnerability was introduced in March 2003, not September 2003, per the verified kernel commit history. (2) Nicholas Carlini’s Google Scholar citation count stands at more than 70,000 as of May 2026, not 67,000. All other core claims (the attack mechanics, the 112-byte / 1,056-byte figures, the model comparison, the Firefox findings, the Greg Kroah-Hartman observations, and the OpenAI o3 / SMB vulnerability) have been verified against primary and secondary sources.
