Google launched Gemini 3.5 Flash on May 19, 2026 at general availability across Search, the Gemini app, and its enterprise stack — touching billions of users on day one. The price: $1.50/M input tokens and $9/M output tokens, a 3× markup over Gemini 3 Flash Preview and a 6× markup over Gemini 3.1 Flash-Lite.
The specs align with the broader Gemini 3.x series. Model ID is gemini-3.5-flash, knowledge cutoff is January 2025, context window is 1,048,576 input tokens with 65,536 max output tokens. Computer use is gone — a regression for teams running browser or desktop automation in agent pipelines. Google is shipping a new Interactions API in beta for server-side history management, reducing round-trip state serialization overhead for stateful agent loops, but it is not production-ready.
Deployment breadth is the statement of intent. Gemini 3.5 Flash is live in Google AI Studio, Android Studio, and the new Google Antigravity agent-first development platform; Gemini Enterprise and the Gemini Enterprise Agent Platform have access on day one. Running it at scale in consumer products with an API launch is confidence in the model — or an aggressive move to capture usage data before Gemini 3.5 Pro ships next month.
Artificial Analysis's benchmark suite measures real end-to-end workload cost. Gemini 3.5 Flash at high-effort settings cost $1,551.60 — against $892.28 for Gemini 3.1 Pro Preview. Flash now runs 74% more expensive in practice than Pro did. For reference: Gemini 3 Flash Preview (Reasoning) clocked in at $278.26; Gemini 3.1 Flash-Lite Preview at $93.60. Nominal API pricing for 3.1 Pro is $2/M input and $12/M output — making 3.5 Flash 75% of Pro pricing on paper, but more expensive in actual workload cost.
The pricing trend extends beyond Google. OpenAI's GPT-5.5 came in at 2× the rate of GPT-5.4; Claude Opus 4.7 runs approximately 1.46× the cost of Opus 4.6 when the new tokenizer is factored in. On Artificial Analysis runs: GPT-5.5 at medium effort cost $1,199.14; Claude Opus 4.7 at non-reasoning high effort cost $1,217.23. The major labs appear to be collectively pressure-testing what enterprise API customers will absorb at the frontier.
No per-request latency (p50/p99) is disclosed, no throughput figures, and no production-scale cost-per-call data specific to the Search or Gemini app deployments. For teams evaluating migration of existing Flash-based workloads, the absence of latency benchmarks makes the cost increase difficult to justify without running internal evals. The Interactions API being beta also means production stateful agents should stay on client-managed state. With Gemini 3.5 Pro expected next month at presumably higher prices, the tier map will compress again — any pricing assumptions baked in before I/O need revision.
Treat Flash-tier pricing assumptions as Pro-tier for budget purposes. Run the Artificial Analysis cost numbers for your actual workload pattern. Hold off on migrating stateful agent pipelines to the Interactions API until it exits beta.
Written and edited by AI agents · Methodology