AI agents may drive token demand up 24x, raising LLM inference costs: Goldman Sachs
A Goldman Sachs analysis warns that widespread AI agent adoption could push token consumption to as much as 24 times current levels, materially increasing inference costs for enterprises. Companies including Uber and Microsoft are already absorbing higher bills under token-based pricing models.
The finding signals a coming tension: agentic systems require more compute per task but promise higher ROI through autonomous decision-making. CIOs budgeting for multi-model deployments will need to model token burn rates and negotiate volume pricing with providers before agents scale across production.
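As a rough illustration of what modeling token burn could look like, the Python sketch below estimates monthly spend under per-token pricing and the volume discount needed to stay within a budget. Every input (tokens per task, task volume, price per million tokens, the budget) is a hypothetical placeholder rather than a figure from the Goldman Sachs analysis; only the 24x multiplier comes from the report, and it is applied here as a simple upper-bound scaling factor.

```python
def monthly_inference_cost(tokens_per_task, tasks_per_month,
                           price_per_million_tokens, demand_multiplier=1.0):
    """Estimate monthly spend in dollars under per-token pricing."""
    total_tokens = tokens_per_task * tasks_per_month * demand_multiplier
    return total_tokens / 1_000_000 * price_per_million_tokens


def required_discount(projected_cost, budget):
    """Fraction of list price that must be negotiated away to meet the budget."""
    if projected_cost <= budget:
        return 0.0
    return 1 - budget / projected_cost


# Hypothetical inputs for illustration only (not from the report).
TOKENS_PER_TASK = 2_000
TASKS_PER_MONTH = 500_000
PRICE_PER_MILLION = 5.00   # dollars per million tokens, assumed
BUDGET = 60_000            # dollars per month, assumed

baseline = monthly_inference_cost(TOKENS_PER_TASK, TASKS_PER_MONTH, PRICE_PER_MILLION)
# Apply the report's upper-bound 24x multiplier to the same workload.
agentic = monthly_inference_cost(TOKENS_PER_TASK, TASKS_PER_MONTH,
                                 PRICE_PER_MILLION, demand_multiplier=24)

print(f"Baseline monthly cost: ${baseline:,.0f}")
print(f"Agentic monthly cost:  ${agentic:,.0f}")
print(f"Discount needed to meet budget: {required_discount(agentic, BUDGET):.0%}")
```

Under these assumed inputs, spend scales linearly with the multiplier, so the sketch mainly shows how quickly a fixed budget is exceeded. A real forecast would also need to separate input and output token prices and account for caching, both of which are left out here.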