Gemini 2.0 and 2.5 Model Shutdowns and Replacements — when and how should I migrate?
Teams pinned to Gemini 2.0 Flash, 2.0 Flash Lite, or early 2.5 model variants need a current migration map with confirmed shutdown windows and recommended replacements so they are not surprised by model removals in 2026.
Blockers
- package/gemini-2.0-flash — EOL 2026-06-01, successor: package/gemini-2.5-flash
- package/gemini-2.0-flash-lite — EOL 2026-06-01, successor: package/gemini-2.5-flash-lite
- package/gemini-2.5-flash — EOL 2026-06-17, successor: package/gemini-3-flash-preview
- package/gemini-2.5-flash-lite — EOL 2026-07-22, successor: package/gemini-3.1-flash-lite-preview
- package/gemini-2.5-pro-preview — EOL 2025-12-02, successor: package/gemini-3.1-pro-preview
- thinking_budget parameter removed — must use thinking_level instead; sending both returns 400
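The blocker list above can be encoded as a small lookup table so deployment code fails over to the recommended successor automatically. This is an illustrative sketch: the model IDs and dates are copied from the list above, but the `resolve` helper and its chaining behavior are assumptions, not an official Google utility.

```python
from datetime import date

# Shutdown dates and recommended successors, copied from the blocker list
# above (per Google's Gemini deprecations page as of 2026-03-15).
MIGRATIONS = {
    "gemini-2.0-flash":       (date(2026, 6, 1),  "gemini-2.5-flash"),
    "gemini-2.0-flash-lite":  (date(2026, 6, 1),  "gemini-2.5-flash-lite"),
    "gemini-2.5-flash":       (date(2026, 6, 17), "gemini-3-flash-preview"),
    "gemini-2.5-flash-lite":  (date(2026, 7, 22), "gemini-3.1-flash-lite-preview"),
    "gemini-2.5-pro-preview": (date(2025, 12, 2), "gemini-3.1-pro-preview"),
}

def resolve(model_id: str, today: date) -> str:
    """Follow successor links until reaching a model still alive on `today`."""
    while model_id in MIGRATIONS:
        eol, successor = MIGRATIONS[model_id]
        if today < eol:
            break  # current model has not hit its shutdown date yet
        model_id = successor
    return model_id
```

Note the chaining: `resolve("gemini-2.0-flash", date(2026, 7, 1))` skips past `gemini-2.5-flash` (itself shut down June 17, 2026) straight to `gemini-3-flash-preview`, which is exactly the two-hop trap the Cautions sections below warn about.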
Who this is for
- serverless
- low-ops
- cost-sensitive
- small-team
- enterprise
- microservices
- monorepo
- high-scale
- real-time
- compliance
Candidates
Replace `gemini-2.0-flash` and `gemini-2.0-flash-001` with stable `gemini-2.5-flash`
As of 2026-03-15, Google's Gemini deprecations page says `gemini-2.0-flash` and `gemini-2.0-flash-001` shut down on June 1, 2026 and recommends `gemini-2.5-flash`. Google's model page positions `gemini-2.5-flash` as its best price-performance model for low-latency, high-volume reasoning tasks, and the pricing page lists paid Gemini Developer API pricing at $0.30 per 1M text, image, or video input tokens and $2.50 per 1M output tokens, with Batch pricing at $0.15 input and $1.25 output.
When to choose
Best for serverless + low-ops or microservices + high-scale teams that want the least disruptive path off Gemini 2.0 Flash, already rely on text output only, and need a stable model ID rather than an immediate jump to a preview-only replacement family. Use this when you need to address the June 1, 2026 shutdown first and can accept that this is a short bridge rather than a long-lived endpoint.
Tradeoffs
Google's model page says `gemini-2.5-flash` supports text, image, video, and audio input, text output, a 1,048,576-token input limit, a 65,536-token output limit, and support for Batch API, caching, code execution, file search, function calling, Google Maps grounding, search grounding, structured outputs, thinking, and URL context. The tradeoff is higher paid token pricing than `gemini-2.5-flash-lite`.
Cautions
Do not treat `gemini-2.5-flash` as a long-run endpoint without checking the next migration. Google's deprecations page also lists `gemini-2.5-flash` itself with a June 17, 2026 shutdown date and recommends `gemini-3-flash-preview` next. If you are migrating late in Q2 2026, you may prefer to evaluate whether one larger move is safer than two back-to-back cutovers.
Replace `gemini-2.0-flash-lite` and `gemini-2.0-flash-lite-001` with stable `gemini-2.5-flash-lite`
As of 2026-03-15, Google's Gemini deprecations page says `gemini-2.0-flash-lite` and `gemini-2.0-flash-lite-001` shut down on June 1, 2026 and recommends `gemini-2.5-flash-lite`. Google's model and pricing pages position `gemini-2.5-flash-lite` as the cost-focused, high-throughput option, priced at $0.10 per 1M text, image, or video input tokens and $0.40 per 1M output tokens in standard paid usage, with Batch pricing at $0.05 input and $0.20 output.
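The rates quoted above and in the previous candidate make the flash vs. flash-lite budget question easy to estimate. A minimal sketch, using only the standard-tier figures from this document and assuming the quoted Batch rates (which are exactly half of standard for both models); the helper itself is illustrative, not part of any SDK.

```python
# $ per 1M tokens, standard paid tier, copied from the pricing figures
# quoted above (as of 2026-03-15). Batch rates are half of these.
RATES = {
    "gemini-2.5-flash":      {"input": 0.30, "output": 2.50},
    "gemini-2.5-flash-lite": {"input": 0.10, "output": 0.40},
}

def monthly_cost(model: str, in_tokens: int, out_tokens: int,
                 batch: bool = False) -> float:
    """Rough monthly spend in USD for a given token volume."""
    r = RATES[model]
    factor = 0.5 if batch else 1.0
    return factor * (in_tokens / 1e6 * r["input"] + out_tokens / 1e6 * r["output"])
```

For a workload of 100M input and 20M output tokens per month, this gives $80 on `gemini-2.5-flash`, $18 on `gemini-2.5-flash-lite`, and $9 on flash-lite via Batch, which is why the lite line is the default landing zone for high-frequency classification and routing traffic.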
When to choose
Best for cost-sensitive + high-scale or small-team + serverless workloads where latency and budget matter more than maximum reasoning depth, and the current 2.0 Flash-Lite usage is high-frequency classification, extraction, routing, or lightweight multimodal handling. This is the cleanest family-preserving swap when you need the June 1, 2026 shutdown covered with minimal operational churn.
Tradeoffs
Google's model page says `gemini-2.5-flash-lite` supports text, image, video, audio, and PDF input, text output, a 1,048,576-token input limit, a 65,536-token output limit, and support for Batch API, caching, code execution, file search, function calling, Google Maps grounding, search grounding, structured outputs, thinking, and URL context. The tradeoff is that it is optimized for cost and speed rather than for the most complex reasoning.
Cautions
This is also a short-run landing zone. Google's deprecations page says `gemini-2.5-flash-lite` shuts down on July 22, 2026 and recommends `gemini-3.1-flash-lite-preview` next, so teams migrating after March 2026 should budget for another verification cycle quickly. The figures here are as of 2026-03-15; recheck rate limits and pricing before rollout.
Move early Gemini 2.5 Flash preview IDs to `gemini-3-flash-preview`
As of 2026-03-15, Google's Gemini deprecations page lists `gemini-2.5-flash-preview-05-20` with a shutdown date of November 18, 2025 and `gemini-2.5-flash-preview-09-25` with a shutdown date of February 17, 2026, and recommends `gemini-3-flash-preview` for both. Google's pricing page lists `gemini-3-flash-preview` at $0.50 per 1M text, image, or video input tokens and $3.00 per 1M output tokens in standard paid usage, with Batch pricing at $0.25 input and $1.50 output.
When to choose
Best for high-scale + real-time or low-ops + small-team deployments that were already willing to run preview model IDs and now need the officially recommended successor for old 2.5 Flash preview deployments. Choose this when avoiding a stale or already-shut-down preview ID matters more than staying on a stable 2.5 family name.
Tradeoffs
Google's Gemini 3 docs say Gemini 3 Flash has a 1M input context window, up to 64k output tokens, Batch API support, Context Caching support, a free tier in the Gemini API, and support for Google Search grounding. The tradeoff is that it is still a preview model.
Cautions
Preview models may change before becoming stable and may have more restrictive rate limits, according to Google's pricing page. Google's Gemini 3 guide also introduces `thinking_level` and says you must not send both `thinking_level` and the legacy `thinking_budget` in the same request or you will get a 400 error. This is a real migration surface if older preview code tuned `thinking_budget` directly.
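Because mixing the two thinking parameters returns a 400, it is worth failing fast on the client side before the request ever leaves your service. A minimal sketch, assuming your request config is held as a plain dict with `thinking_level` / `thinking_budget` keys as named in Google's Gemini 3 guide; the dict shape and helper names here are hypothetical, not an official SDK surface.

```python
def check_thinking_config(config: dict) -> dict:
    """Reject configs that mix the legacy thinking_budget with the new
    thinking_level, instead of letting the API return a 400."""
    if "thinking_budget" in config and "thinking_level" in config:
        raise ValueError(
            "Send either thinking_level or the legacy thinking_budget, "
            "never both in one request: the API rejects mixed requests."
        )
    return config

def migrate_thinking_config(config: dict, level: str = "high") -> dict:
    """Drop a legacy thinking_budget and substitute a thinking_level."""
    migrated = {k: v for k, v in config.items() if k != "thinking_budget"}
    migrated.setdefault("thinking_level", level)
    return check_thinking_config(migrated)
```

Running `migrate_thinking_config` over stored request templates is a cheap way to sweep a codebase that tuned `thinking_budget` directly, which is exactly the migration surface flagged above.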
Move `gemini-2.5-flash-lite-preview-09-2025` to `gemini-3.1-flash-lite-preview`
As of 2026-03-15, Google's Gemini deprecations page lists `gemini-2.5-flash-lite-preview-09-2025` with a shutdown date of March 31, 2026 and recommends `gemini-3.1-flash-lite-preview`. Google's pricing page lists `gemini-3.1-flash-lite-preview` at $0.25 per 1M text, image, or video input tokens and $1.50 per 1M output tokens in standard paid usage, with Batch pricing at $0.125 input and $0.75 output.
When to choose
Best for cost-sensitive + high-scale or serverless + microservices teams that deliberately chose the 2.5 Flash-Lite preview line for throughput and now need the official successor before the March 31, 2026 cutoff. Use this when the workload is still mostly translation, routing, simple extraction, or high-volume agentic plumbing and preview-model churn is acceptable.
Tradeoffs
Google's pricing page describes `gemini-3.1-flash-lite-preview` as its most cost-efficient model for high-volume agentic tasks, translation, and simple data processing. Relative to stable `gemini-2.5-flash-lite`, this is a higher-priced but newer preview line intended to replace the retired 2.5 Flash-Lite preview branch.
Cautions
The pricing page explicitly says preview models may change before becoming stable and have more restrictive rate limits. Google's Gemini 3 guide also says Gemini 3 models use `thinking_level` by default and warns not to mix it with `thinking_budget` in one request. If your old 2.5 preview integration used preview-specific behavior, regression-test prompts before swapping IDs.
Move early Gemini 2.5 Pro preview IDs to `gemini-3.1-pro-preview`
As of 2026-03-15, Google's Gemini deprecations page lists `gemini-2.5-pro-preview-03-25`, `gemini-2.5-pro-preview-05-06`, and `gemini-2.5-pro-preview-06-05`, all with a shutdown date of December 2, 2025, and recommends `gemini-3.1-pro-preview`. Google's pricing page lists `gemini-3.1-pro-preview` at $2.00 per 1M input tokens and $12.00 per 1M output tokens for prompts up to 200k tokens, or $4.00 input and $18.00 output above 200k, with Batch pricing at $1.00 input and $6.00 output up to 200k.
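The two-tier rates quoted above matter for long-context workloads, since both the input and output rates step up once the prompt crosses 200k tokens. A minimal sketch using only the standard-tier figures from this section; it assumes (this is our assumption, not a quoted rule) that a request over the threshold is billed entirely at the higher tier, so verify the exact threshold semantics on Google's pricing page before budgeting.

```python
def pro_request_cost(prompt_tokens: int, output_tokens: int) -> float:
    """Standard-tier USD cost of one gemini-3.1-pro-preview request,
    using the rates quoted above (as of 2026-03-15). Assumes requests
    over the 200k prompt threshold bill entirely at the higher tier."""
    if prompt_tokens <= 200_000:
        in_rate, out_rate = 2.00, 12.00   # <= 200k-token prompts
    else:
        in_rate, out_rate = 4.00, 18.00   # > 200k-token prompts
    return prompt_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate
```

A 100k-token prompt with 10k output costs about $0.32, while a 300k-token prompt with the same output jumps to $1.38: more than 4x, because both rates step up together. That cliff is worth modeling before moving long-context 2.5 Pro preview traffic over.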
When to choose
Best for enterprise + compliance or monorepo + coding-heavy teams that were using early 2.5 Pro previews for hard reasoning or coding tasks and now need the official replacement line rather than a downgrade to Flash-class pricing or capability. Choose this when high-complexity prompting matters more than minimizing spend.
Tradeoffs
Google's Gemini 3.1 Pro pricing page positions it as the latest top-tier multimodal and agentic model family, and the Gemini 3 guide shows the new `thinking_level` control on `gemini-3.1-pro-preview`. The tradeoff is materially higher token pricing than any Flash-family replacement and no free tier in the Gemini API.
Cautions
Do not confuse `gemini-3-pro-preview` with the current target. Google's pricing page warns that `gemini-3-pro-preview` was deprecated and shut down on March 9, 2026, and the models page says to migrate to `gemini-3.1-pro-preview`. The Gemini 3 guide also warns that using both `thinking_level` and `thinking_budget` together returns a 400 error.
Try with your AI agent
$ npm install -g pocketlantern
$ pocketlantern init
# Restart Claude Code, Cursor, or your MCP client, then ask:
# "Gemini 2.0 and 2.5 Model Shutdowns and Replacements — when and how should I migrate?"