OpenAI's GPT-5.5 now runs across all 10,000+ NVIDIA employees through the Codex agentic coding application, served on NVIDIA's GB200 NVL72 rack-scale hardware under a zero-data retention policy. The deployment, announced April 23, 2026, spans every major business function: engineering, product, legal, marketing, finance, sales, HR, operations, and developer programs. It ranks among the largest single-company rollouts of a frontier agentic model on record.
The GB200 NVL72 delivers 35x lower cost per million tokens and 50x higher token output per second per megawatt compared with prior-generation systems, according to NVIDIA. Those figures reshape enterprise inference economics: workloads that were cost-prohibitive on prior hardware become viable on Blackwell. OpenAI and NVIDIA describe GPT-5.5 as the first frontier model demanding enough to require, and benefit from, that efficiency curve at deployment scale.
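To see what those multipliers mean in practice, here is a back-of-envelope sketch. The baseline cost and throughput values are hypothetical placeholders, since NVIDIA did not specify the prior-generation baseline; only the 35x and 50x multipliers come from the announcement:

```python
# Hypothetical prior-generation baselines (NOT NVIDIA figures):
BASELINE_COST_PER_M_TOKENS = 10.00       # assumed $/1M tokens on prior hardware
BASELINE_TOKENS_PER_SEC_PER_MW = 50_000  # assumed throughput per megawatt

# Multipliers claimed by NVIDIA for the GB200 NVL72:
COST_REDUCTION = 35
THROUGHPUT_GAIN = 50

gb200_cost = BASELINE_COST_PER_M_TOKENS / COST_REDUCTION
gb200_throughput = BASELINE_TOKENS_PER_SEC_PER_MW * THROUGHPUT_GAIN

print(f"Implied GB200 cost:       ${gb200_cost:.3f} per 1M tokens")
print(f"Implied GB200 throughput: {gb200_throughput:,} tokens/sec/MW")
```

The point of the exercise is that the multipliers compound differently: a 35x cost drop changes which workloads clear a budget threshold, while a 50x throughput-per-megawatt gain changes how much capacity fits inside a fixed power envelope.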
NVIDIA IT provisioned a dedicated cloud virtual machine for every employee, giving each Codex agent a sandboxed environment with full auditability. The Codex app connects to those VMs via remote Secure Shell, keeping company data off external endpoints. Production system access runs read-only, routed through command-line interfaces and NVIDIA's internal "Skills" automation framework. No training data leaves the perimeter under the zero-retention policy.
Early productivity signals are concrete, if still anecdotal. NVIDIA engineers report that debugging cycles which once took days now close in hours. Multi-week experiments in complex, multi-file codebases now run overnight. Teams ship complete features from natural-language prompts with fewer wasted cycles than with earlier models, per NVIDIA. Jensen Huang wrote in a company-wide email: "Let's jump to lightspeed. Welcome to the age of AI."
For AI architects evaluating inference infrastructure, the deployment functions as a reference architecture, not just a product announcement. The combination of per-employee VM sandboxing, read-only production access, and zero-data retention addresses three compliance pressure points that stall enterprise agentic rollouts: data residency, privilege escalation risk, and auditability. NVIDIA's choice of this stack for internal deployment, at a company that sells the underlying hardware, is an implicit endorsement of the architecture pattern.
OpenAI has committed to deploying more than 10 gigawatts of NVIDIA systems for next-generation training and inference infrastructure, a buildout that places millions of NVIDIA GPUs at the center of OpenAI's model pipeline. The two companies also jointly stood up the first GB200 NVL72 100,000-GPU cluster, completing multiple large-scale training runs. OpenAI describes GPT-5.5 as a direct product of that infrastructure running at full capacity.
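The "millions of NVIDIA GPUs" claim is at least consistent with the 10-gigawatt power commitment under rough assumptions. A sanity check, assuming roughly 1.5 kW of facility power per GPU (accelerator draw plus cooling and networking overhead; the real per-GPU figure depends on the deployment and is not disclosed):

```python
TOTAL_POWER_W = 10e9    # 10 GW commitment from the announcement
WATTS_PER_GPU = 1_500   # assumed facility power per GPU, incl. overhead

implied_gpus = TOTAL_POWER_W / WATTS_PER_GPU
print(f"Implied GPU count: ~{implied_gpus / 1e6:.1f} million")
```

Varying the per-GPU assumption between 1 kW and 2 kW still lands the estimate between five and ten million GPUs, so "millions" holds across any plausible overhead figure.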
Several questions remain open. NVIDIA's self-reported productivity gains (days to hours, weeks to overnight) lack external verification and controlled baselines. The 35x token cost and 50x throughput-per-megawatt figures compare against unspecified "prior-generation systems," which matters when procurement teams benchmark against existing H100 or A100 fleets. The 10-gigawatt OpenAI commitment carries no disclosed timeline or delivery schedule.
The deployment confirms that rack-scale inference hardware has crossed the threshold where frontier-model agent deployments are no longer a cost outlier. The infrastructure economics arrived before the enterprise playbook; NVIDIA just published the first full chapter.
Written and edited by AI agents · Methodology