Speaking on Wednesday, March 4, 2026, at the Morgan Stanley Technology, Media & Telecom Conference in New York, NVIDIA CEO Jensen Huang delivered what observers are calling one of his most consequential public addresses in years — a sweeping declaration that agentic artificial intelligence has crossed an irreversible threshold, anchored by an extraordinary claim about a single piece of open-source software.

OpenClaw is probably the single most important release of software, probably ever. Linux took some 30 years to reach this level. OpenClaw, in three weeks, has now surpassed it.

— Jensen Huang, CEO, NVIDIA — Morgan Stanley TMT Conference, March 4, 2026

Huang’s remarks centered on OpenClaw, a recently released open-source AI agent framework, which he described as the benchmark for measuring just how dramatically adoption curves have changed in the age of agentic AI. The software, he noted, is now the single most downloaded open-source project in history, a milestone the Linux operating system took three decades to approach. OpenClaw eclipsed that benchmark in three weeks, and its adoption curve, Huang said, “looks like the Y-axis” even when plotted on a semi-logarithmic scale.

The “Five-Layer Cake” of AI Value

Huang described the current AI technology stack as a “five-layer cake,” spanning physical infrastructure at the base through to end-user applications at the top. He argued that the applications layer — where OpenClaw and similar agentic tools reside — is now the most productive and financially rewarding tier for hyperscalers and frontier AI labs alike.

The reason, he explained, is not technical complexity but practical impact. OpenClaw has demonstrated that AI agents can operate effectively in highly personalized environments, autonomously completing tasks that previously required domain expertise — from reading an unfamiliar tool’s manual and learning its functions on the fly, to conducting web research, generating reports, and iterating on software code, all without human intervention. “If it has to use a tool it’s never used before,” Huang said, “it reads the manual of the tool.”

Context: What Is OpenClaw?
OpenClaw is an open-source AI agent orchestration framework that enables AI models to autonomously plan, use tools, browse the web, and execute multi-step workflows. Its rapid adoption has made it a defining piece of infrastructure in the emerging agentic AI era.
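OpenClaw’s actual API is not documented in Huang’s remarks, but the pattern it embodies (plan a task, choose a tool, execute, carry context forward, iterate) is the core of any agent orchestration framework. The sketch below illustrates that loop in miniature; every name in it is hypothetical and none comes from OpenClaw itself.

```python
# Hypothetical sketch of an agent orchestration loop of the kind OpenClaw
# is described as implementing. All names here are illustrative inventions.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    tools: dict[str, Callable[[str], str]]   # tool name -> callable
    history: list[str] = field(default_factory=list)

    def plan(self, task: str) -> list[tuple[str, str]]:
        # A real framework would ask an LLM to decompose the task;
        # here a fixed two-step plan stands in for that call.
        return [("search", task), ("report", task)]

    def run(self, task: str) -> list[str]:
        for tool_name, arg in self.plan(task):
            result = self.tools[tool_name](arg)  # execute the chosen tool
            self.history.append(result)          # carry context across steps
        return self.history

agent = Agent(tools={
    "search": lambda q: f"results for {q!r}",
    "report": lambda q: f"report on {q!r}",
})
print(agent.run("GPU market trends"))
```

The key property Huang emphasizes, persistence, corresponds to running a loop like this continuously rather than once per prompt.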

The Token Surge and the Compute Vacuum

Huang drew a sharp distinction between generative AI — where a single prompt produces a single response — and agentic AI, where agents run continuously in the background, handling tasks on behalf of users and organizations indefinitely. A standard generative interaction consumes a fixed, bounded number of tokens. A single agentic task already requires roughly 1,000 times more. But persistent agents of the type enabled by OpenClaw, Huang said, are now consuming approximately one million times more tokens than a traditional generative prompt.
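The scaling Huang describes is easy to make concrete. Assuming a baseline generative exchange of about 2,000 tokens (a figure chosen here purely for illustration, not from his remarks), the two multipliers work out as:

```python
# Back-of-the-envelope arithmetic for Huang's token multipliers.
# The 2,000-token baseline is an assumed figure for illustration only.
baseline_tokens = 2_000                          # one prompt + response
agentic_task = baseline_tokens * 1_000           # ~1,000x per agentic task
persistent_agent = baseline_tokens * 1_000_000   # ~1,000,000x for persistent agents

print(f"generative:  {baseline_tokens:>13,}")    # 2,000 tokens
print(f"agentic:     {agentic_task:>13,}")       # 2,000,000 tokens
print(f"persistent:  {persistent_agent:>13,}")   # 2,000,000,000 tokens
```

At the persistent-agent tier, a single user’s workload reaches billions of tokens, which is the arithmetic behind the “compute vacuum” Huang describes next.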

NVIDIA itself has deployed OpenClaw agents internally. “These OpenClaw [agents] are running continuously in the background, doing things for us — writing, developing tools, developing software,” Huang said. The consequence: the company’s own compute appetite has “skyrocketed,” and he expects every major enterprise to face the same reality as agentic AI penetrates its operations.

This creates what Huang termed a “compute vacuum” — a structural condition in which, regardless of how aggressively hardware is deployed, compute will remain a constrained resource in the long term. The implication for the semiconductor industry is clear: demand for AI compute infrastructure is not approaching saturation; it may be accelerating away from it.

The amount of compute every company needs is skyrocketing. No matter how large hardware deployments become, computing power will remain constrained.

— Jensen Huang, Morgan Stanley TMT Conference, March 4, 2026

Vera Rubin: Built for the Agentic Era

Huang positioned NVIDIA’s next-generation hardware platform, Vera Rubin — announced at CES 2026 in January — as the architectural answer to the demands of persistent, context-rich AI agents. Unlike its predecessors, Hopper and Blackwell, which were primarily optimized for training workloads, Vera Rubin has been purpose-built for inference at scale and agentic operation.

The platform integrates six distinct chips into a unified system. At its center is the R100 GPU, manufactured on TSMC’s 3nm process, paired with the Vera CPU, whose 88 custom “Olympus” cores run 176 simultaneous threads. NVLink 6 ties the components together, delivering 3.6 TB/s of bidirectional inter-GPU bandwidth in the NVL72 rack-scale system, which functions as a single massive logical GPU.

NVIDIA Vera Rubin Platform — Key Specifications
GPU (R100): ~336 billion transistors; TSMC 3nm (N3P); up to 50 PFLOPS inference
Memory: HBM4 at 22 TB/s bandwidth
Vera CPU: 88 custom Olympus Arm cores; 176 threads; 1.2 TB/s SOCAMM LPDDR5X
Interconnect: NVLink 6, 3.6 TB/s bidirectional
Rack system: NVL72, 72 Rubin GPUs as a single logical unit; liquid-cooled
ICMS: BlueField-4 DPU-powered flash KV cache tier; 5× tokens/sec vs. traditional storage
Efficiency vs. Blackwell: 10× reduction in inference token cost; 4× fewer GPUs to train equivalent MoE models
Availability: full production H2 2026; Microsoft Azure and CoreWeave among first deployers

The ICMS Breakthrough: Memory for Long Contexts

Perhaps the most technically significant innovation in the Vera Rubin platform is the Inference Context Memory Storage (ICMS) system — a new AI-native memory tier powered by NVIDIA’s BlueField-4 data processing unit. As agentic AI workloads push token contexts into the millions, the key-value (KV) cache generated during inference grows rapidly beyond what on-chip GPU memory can hold. Previously, this overflow forced state into either scarce GPU HBM or conventional enterprise storage, creating latency and efficiency penalties that became prohibitive at scale.

ICMS establishes a dedicated, flash-based “G3.5” context memory layer that sits between the GPU and traditional storage — optimized specifically for the low-latency, high-throughput demands of agentic inference. According to NVIDIA, the platform delivers up to five times more tokens per second, five times better performance per total cost of ownership dollar, and five times better power efficiency compared to traditional network storage solutions used in inference contexts. The result is that AI agents can maintain long-term memory and reason across massive contexts without being throttled by hardware constraints.
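The pressure ICMS is meant to relieve can be quantified. For a standard transformer, KV-cache size grows linearly with context length: roughly 2 × layers × KV heads × head dimension × bytes per element, per token. The estimate below uses illustrative model dimensions chosen for this sketch (not figures from NVIDIA or the article) to show why million-token contexts overflow on-chip HBM:

```python
# Rough KV-cache sizing for long agentic contexts. All model dimensions
# below are illustrative assumptions, not specs from the article.
def kv_cache_bytes(seq_len: int, layers: int = 80, kv_heads: int = 8,
                   head_dim: int = 128, bytes_per_elem: int = 2) -> int:
    # Factor of 2 covers the separate key and value tensors in each layer;
    # bytes_per_elem=2 assumes FP16/BF16 cache entries.
    return 2 * layers * kv_heads * head_dim * bytes_per_elem * seq_len

for tokens in (8_000, 1_000_000):
    gib = kv_cache_bytes(tokens) / 2**30
    print(f"{tokens:>9,} tokens -> {gib:,.1f} GiB of KV cache")
```

Under these assumptions a 1M-token context needs on the order of 300 GiB of KV cache, far beyond a single GPU’s HBM, which is precisely the gap a flash-based context tier like ICMS targets.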

Investments in OpenAI and Anthropic: A Chapter Closing

Huang also used the conference to address NVIDIA’s investment strategy in frontier AI labs. He confirmed that the company’s recently finalized $30 billion investment in OpenAI — part of a $110 billion funding round — will likely be NVIDIA’s last equity stake in the firm, citing OpenAI’s anticipated IPO later in 2026. A previously discussed $100 billion infrastructure partnership between the two companies is, Huang said, “not in the cards.” NVIDIA’s $10 billion investment in Anthropic, made in late 2025, is similarly expected to be its final commitment to that company as it too approaches a public offering.

Analyst Perspective
Huang’s remarks at Morgan Stanley signal a structural shift: NVIDIA is transitioning from being a passive financial backer of AI labs to positioning itself as the foundational infrastructure layer across the entire agentic AI stack — hardware, memory, networking, and orchestration — with Vera Rubin as its platform of record for the next computing era.
— ✦ —

Taken together, Huang’s presentation at the Morgan Stanley TMT Conference mapped out a coherent vision: AI has moved decisively from an era of discrete generative tasks to one of persistent autonomous agents operating at massive scale, consuming compute at a rate the industry is only beginning to grasp. OpenClaw is, in Huang’s framing, the proof of concept. Vera Rubin is the infrastructure response. And NVIDIA, with its full-stack approach spanning silicon, memory, networking, and software, intends to be the architecture on which that future is built.