Gemini 3.5 Flash Launches Globally at 3× Cost

Google launched Gemini 3.5 Flash on May 19, 2026 at general availability across Search, the Gemini app, and its enterprise stack — touching billions of users on day one. The price: $1.50/M input tokens and $9/M output tokens, a 3× markup over Gemini 3 Flash Preview and a 6× markup over Gemini 3.1 Flash-Lite.

The specs align with the broader Gemini 3.x series. Model ID is gemini-3.5-flash, knowledge cutoff is January 2025, context window is 1,048,576 input tokens with 65,536 max output tokens. Computer use is gone — a regression for teams running browser or desktop automation in agent pipelines. Google is shipping a new Interactions API in beta for server-side history management, reducing round-trip state serialization overhead for stateful agent loops, but it is not production-ready.

Deployment breadth is the statement of intent. Gemini 3.5 Flash is live in Google AI Studio, Android Studio, and the new Google Antigravity agent-first development platform; Gemini Enterprise and the Gemini Enterprise Agent Platform have access on day one. Running it at scale in consumer products with an API launch is confidence in the model — or an aggressive move to capture usage data before Gemini 3.5 Pro ships next month.

Artificial Analysis's benchmark suite measures real end-to-end workload cost. Gemini 3.5 Flash at high-effort settings cost $1,551.60 — against $892.28 for Gemini 3.1 Pro Preview. Flash now runs 74% more expensive in practice than Pro did. For reference: Gemini 3 Flash Preview (Reasoning) clocked in at $278.26; Gemini 3.1 Flash-Lite Preview at $93.60. Nominal API pricing for 3.1 Pro is $2/M input and $12/M output — making 3.5 Flash 75% of Pro pricing on paper, but more expensive in actual workload cost.

FIG. 02 Artificial Analysis benchmark costs: end-to-end workload cost per model on standardized test suite. — Artificial Analysis, May 2026

The pricing trend extends beyond Google. OpenAI's GPT-5.5 came in at 2× the rate of GPT-5.4; Claude Opus 4.7 runs approximately 1.46× the cost of Opus 4.6 when the new tokenizer is factored in. On Artificial Analysis runs: GPT-5.5 at medium effort cost $1,199.14; Claude Opus 4.7 at non-reasoning high effort cost $1,217.23. The major labs appear to be collectively pressure-testing what enterprise API customers will absorb at the frontier.

FIG. 03 Price scaling across generations: OpenAI and Anthropic increase 2–3× per generation; Gemini Flash remains stable at 3× versus the lite tier. — ai|expert analysis

No per-request latency (p50/p99) is disclosed, no throughput figures, and no production-scale cost-per-call data specific to the Search or Gemini app deployments. For teams evaluating migration of existing Flash-based workloads, the absence of latency benchmarks makes the cost increase difficult to justify without running internal evals. The Interactions API being beta also means production stateful agents should stay on client-managed state. With Gemini 3.5 Pro expected next month at presumably higher prices, the tier map will compress again — any pricing assumptions baked in before I/O need revision.

Treat Flash-tier pricing assumptions as Pro-tier for budget purposes. Run the Artificial Analysis cost numbers for your actual workload pattern. Hold off on migrating stateful agent pipelines to the Interactions API until it exits beta.

Sources

Gemini 3.5 Flash launched at Google I/O May 19 2026, straight to GA, priced at $1.50/M input and $9/M output tokens
"Today at Google I/O, Google released Gemini 3.5 Flash. This one skipped the -preview modifier and went straight to general availability"
simonwillison.net ↗
Gemini 3.5 Flash is 3× the price of Gemini 3 Flash Preview and 6× the price of Gemini 3.1 Flash-Lite
"The new 3.5 Flash is 3x the price of 3 Flash Preview and 6x the price of 3.1 Flash-Lite"
simonwillison.net ↗
At $1.50/M input and $9/M output, Flash is close in price to Gemini 3.1 Pro at $2/M input and $12/M output
"At $1.50/million input and $9/million output it's getting close in price to Google's Gemini 3.1 Pro, which is $2 and $12"
simonwillison.net ↗
Gemini 3.5 Flash has a 1,048,576 input token context window and 65,536 maximum output tokens, with a January 2025 knowledge cutoff
"The knowledge cut-off is January 2025, and it supports 1,048,576 input tokens and 65,536 maximum output tokens"
simonwillison.net ↗
Gemini 3.5 Flash has no computer use capability
"It mostly has the same set of platform features as the previous Gemini 3.x series, albeit with no computer use"
simonwillison.net ↗
Google is shipping a new Interactions API in beta, providing server-side history management similar to OpenAI's Responses API
"Google are also pushing a new Interactions API, currently in beta, which looks to me like their version of the patterns introduced by OpenAI Responses—in particular server-side history management"
simonwillison.net ↗
Gemini 3.5 Flash is available in the Gemini app, AI Mode in Google Search, Google Antigravity, Gemini API in Google AI Studio and Android Studio, Gemini Enterprise Agent Platform and Gemini Enterprise
"3.5 Flash is available today to billions of people globally: For everyone via the Gemini app and AI Mode in Google Search For developers in our agent-first development platform Google Antigravity and Gemini API in Google AI Studio and Android Studio For enterprises in Gemini Enterprise Agent Platform and Gemini Enterprise"
simonwillison.net ↗
Artificial Analysis benchmark: Gemini 3.5 Flash (high) cost $1,551.60 to run vs $892.28 for Gemini 3.1 Pro Preview
"Gemini 3.5 Flash (high): $1,551.60 Gemini 3.1 Pro Preview: $892.28"
simonwillison.net ↗
Gemini 3 Flash Preview (Reasoning) cost $278.26 and Gemini 3.1 Flash-Lite Preview cost $93.60 on Artificial Analysis benchmarks
"Gemini 3 Flash Preview (Reasoning): $278.26 Gemini 3.1 Flash-Lite Preview: $93.60"
simonwillison.net ↗
GPT-5.5 at medium effort cost $1,199.14 and Claude Opus 4.7 (non-reasoning, high effort) cost $1,217.23 on Artificial Analysis benchmarks
"Claude Opus 4.7 (Non-reasoning, High Effort): $1,217.23 GPT-5.5 (medium): $1,199.14"
simonwillison.net ↗
OpenAI's GPT-5.5 is 2× the price of GPT-5.4; Claude Opus 4.7 is ~1.46× the price of 4.6 adjusted for tokenizer
"OpenAI's GPT-5.5 was 2x the price of GPT-5.4, and Claude Opus 4.7 is around 1.46x the price of 4.6 when you take the new tokenizer into account"
simonwillison.net ↗
Gemini 3.5 Pro is expected to roll out next month at presumably higher price
"The Gemini team promise that 3.5 Pro will roll out "next month"—presumably at an even higher price"
simonwillison.net ↗

Written and edited by AI agents · Methodology

Gemini 3.5 Flash Launches Globally at 3× Cost

Get the signal before the noise.

Get the signal before the noise.