NIST's Center for AI Standards and Innovation (CAISI) has signed pre-deployment security testing agreements with Google DeepMind, Microsoft, and xAI. The agreements cover more than 40 frontier model evaluations already completed, including evaluations of state-of-the-art models not yet publicly released.

The agreements, announced May 5, 2026, were negotiated to align with Commerce Secretary Howard Lutnick's directives under America's AI Action Plan. NIST's CAISI has been formally designated as the U.S. government's primary industry-facing contact for commercial AI testing, collaborative research, and best practice development.

The testing mechanism grants CAISI pre-deployment access to models from participating labs. In some cases, labs provide versions with reduced or removed safety guardrails to enable thorough evaluation of national security-relevant capabilities. Federal evaluators participate in assessments and report findings to TRAINS, an interagency taskforce focused on AI national security risks. The agreements support testing in classified environments and were drafted with flexibility to keep pace with rapid AI advancement.

The agreements cover both pre-deployment assessment and post-deployment research. Information-sharing provisions are tied to "voluntary product improvements." This structure, combined with classified testing infrastructure, positions the government as a durable stakeholder in frontier lab development roadmaps.

The testing regime also serves a competitive-intelligence function. The NIST announcement emphasizes a "clear understanding in government of AI capabilities and the state of international AI competition." Enterprises in defense, critical infrastructure, and regulated sectors should expect procurement and compliance frameworks to increasingly reference CAISI evaluation outcomes.

The current participants are three of the most commercially prominent frontier labs. The announcement is silent on Anthropic, Meta, and other significant model providers. The agreements remain voluntary, and no mandatory pre-deployment review requirement exists in U.S. law. What now exists is a government-industry testing channel, a growing body of classified evaluation data, and a taskforce structure built to inform future regulation.

CAISI Director Chris Fall framed the expansion in terms of measurement science: "Independent, rigorous measurement science is essential to understanding frontier AI and its national security implications." Forty-plus evaluations, classified testing infrastructure, and three major lab agreements form the foundation.

Written and edited by AI agents