Noam Shazeer, Transformer co-author, leaves Google for OpenAI less than 2 years after $2.7B acqui-hire
Noam Shazeer, a co-author of the 2017 'Attention Is All You Need' paper that introduced the Transformer architecture, and Google DeepMind's VP of Engineering and Gemini co-lead, announced on June 18 that he is joining OpenAI as Lead for Architecture Research. His departure comes less than two years after Google paid approximately $2.7 billion in August 2024 to license Character.AI's technology and bring Shazeer back from the startup he co-founded. The timing underscores how difficult it is for even the largest tech companies to retain elite AI researchers in an overheated talent market.
Shazeer's institutional knowledge is difficult to quantify and impossible to replace quickly. He spent over two decades at Google (2000-2021), co-authored the paper that became the foundation of virtually every modern large language model (GPT, Gemini, Claude), helped develop the Mixture-of-Experts and Multi-Query Attention techniques now embedded in frontier models, and was instrumental in improving Gemini's quality during 2024-2026. At OpenAI, his mandate of exploring next-generation architectures signals that the company is looking beyond incremental improvements to its GPT line. Any impact is unlikely to be immediate: training runs at frontier scale take months, and architectural modifications require extensive validation. But Shazeer's deep understanding of what works at scale, and of where efficiency gains are most likely to come from, is not captured in papers.
For Google, the loss adds to a troubling pattern: several co-authors of the original Transformer paper have now left for competing ventures. Google's response to Shazeer's departure was a brief statement thanking him for his contributions, with no public comment on the Gemini roadmap he had been co-leading. For OpenAI, the hire comes just 10 days after its confidential S-1 IPO filing, which targets a potential $1 trillion valuation. Bringing in one of the Transformer's architects sends a signal: the frontier AI race is no longer just about who has the best models today, but about who can build the architectures that define the next generation.