NEWS  /  Analysis

Google Deploys Faster Gemini 3 Flash to Challenge OpenAI, Sets as Default App Model and Powers Search

By LiDan | Dec 17, 2025, 1:21 p.m. ET

The rapid deployment across Google's products—from Search to its app to enterprise platforms—reflects Google's strategy of leveraging its ubiquity advantage as ChatGPT's traffic growth shows signs of cooling while Gemini's market share gains momentum.

Alphabet Inc.'s Google is escalating its AI rivalry with OpenAI by launching Gemini 3 Flash, a faster and more cost-efficient model that went live across its ecosystem Wednesday, just one month after releasing its flagship Gemini 3 Pro. The new model immediately replaced Gemini 2.5 Flash as the default in the Gemini app and began powering AI Mode in Google Search the same day, marking an aggressive push to leverage its distribution advantage in the intensifying competition with OpenAI and other AI rivals.


The rollout comes as the AI race increasingly consolidates into a standoff between Google and OpenAI. Last month's Gemini 3 Pro launch reportedly triggered a "code red" at OpenAI, prompting the release of GPT-5.2 and an updated image generation model within weeks. Google has now answered with a model designed to democratize access to advanced AI capabilities through speed and affordability while maintaining frontier-level performance.

"A few weeks ago we released Pro, and we are excited about the reception," said Tulsee Doshi, senior director of product management for Gemini at Google DeepMind. With Gemini 3 Flash, she said, "we bring the model to everyone."

Flash Matches or Exceeds Larger Models on Key Benchmarks

Gemini 3 Flash delivers frontier-level performance despite its focus on speed and efficiency. On GPQA Diamond, which measures PhD-level reasoning, the model scored 90.4%, and it achieved 33.7% on the Humanity's Last Exam benchmark without tool use, approaching Gemini 3 Pro's 37.5% score on the latter test. The model reached 81.2% on MMMU-Pro, a multimodal reasoning benchmark, surpassing all competitors.

More striking is Gemini 3 Flash's performance on coding tasks. On SWE-bench Verified, which evaluates agentic coding capabilities, it achieved a 78% resolution rate—second only to GPT-5.2 and exceeding Gemini 3 Pro as well as other frontier models. This demonstrates the model's particular strength in development workflows where rapid iteration matters.

"We really position flash as more of your workhorse model," Doshi told TechCrunch in a briefing, noting that its combination of speed and reasoning makes it ideal for bulk tasks and high-frequency workflows.

Cost and Speed Advantages Over Predecessor

Google priced Gemini 3 Flash at $0.50 per million input tokens and $3.00 per million output tokens, slightly higher than Gemini 2.5 Flash's $0.30 and $2.50 rates. However, the company emphasizes that the new model outperforms the more expensive Gemini 2.5 Pro while operating three times faster, based on Artificial Analysis benchmarking.

Additionally, Gemini 3 Flash uses 30% fewer tokens on average than 2.5 Pro for typical tasks when processing at the highest thinking level, potentially offsetting the higher per-token cost. The model was designed to modulate its thinking—spending more computational resources on complex queries but completing routine tasks more efficiently.
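To make the pricing comparison concrete, the sketch below computes the list-price cost of a bulk workload under the per-million-token rates quoted above. The workload sizes (10,000 requests at 2,000 input and 500 output tokens each) are purely illustrative assumptions, not figures from Google; real costs would also depend on the token-efficiency differences the article describes.

```python
# Illustrative cost comparison using the per-million-token list prices
# quoted in the article. The workload numbers are hypothetical.

PRICE_PER_MTOK = {
    "gemini-2.5-flash": {"input": 0.30, "output": 2.50},
    "gemini-3-flash":   {"input": 0.50, "output": 3.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the quoted per-million-token rates."""
    p = PRICE_PER_MTOK[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Hypothetical bulk workload: 10,000 requests, 2,000 input / 500 output tokens each.
N_REQUESTS = 10_000
old = N_REQUESTS * request_cost("gemini-2.5-flash", 2_000, 500)
new = N_REQUESTS * request_cost("gemini-3-flash", 2_000, 500)
print(f"2.5 Flash: ${old:.2f}, 3 Flash: ${new:.2f}")  # $18.50 vs $25.00
```

At list prices alone the newer model costs more for an identical workload; the offset Google points to comes from Gemini 3 Flash emitting fewer tokens per task, which shrinks the output-token term that dominates the cost above.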

This efficiency enables Google to power AI agents at less than a quarter the cost of Gemini 3 Pro while maintaining higher rate limits, making it particularly attractive for enterprise applications requiring scale and responsiveness.

Broad Rollout Across Consumer and Enterprise Products

Gemini 3 Flash became available globally Wednesday across multiple platforms. In the Gemini app, it replaced 2.5 Flash as the default model for all users at no cost, representing a significant upgrade to the everyday experience. Users can still select the Pro model from the model picker for more complex math and coding questions.

In Google Search, the new Flash model now powers AI Mode, which delivers conversational responses rather than traditional links. According to Robby Stein, vice president of product for Google Search, Gemini 3 Flash excels at refined searches with multiple conditions, such as finding evening activities for parents with young children. Google is also making Gemini 3 Pro available in AI Mode for U.S. users, along with expanded access to Nano Banana Pro, its premium image generation tool.

For developers, Gemini 3 Flash is available through Google AI Studio, Gemini CLI, Vertex AI, and Antigravity, Google's new agentic development platform. The model's multimodal capabilities enable applications from video analysis to in-game assistants requiring split-second responsiveness.

Early Enterprise Adoption Shows Strong Reception

Companies including JetBrains, Figma, Cursor, Harvey, Latitude, Salesforce, Workday, and Bridgewater Associates are already deploying Gemini 3 Flash. Denis Shiryaev, head of AI DevTools Ecosystem at JetBrains, said the model "delivered quality close to Gemini 3 Pro, while offering significantly lower inference latency and cost" in the company's AI Chat and coding evaluations.

Jasjeet Sekhon, chief scientist and head of AIA Labs at Bridgewater Associates, noted that "Gemini 3 Flash is the first to deliver Pro-class depth at the speed and scale our workflows demand."

Since releasing Gemini 3, Google has processed over one trillion tokens per day through its API, underscoring the scale of deployment and developer adoption. The company said the release reflects an industry-wide dynamic in which rival labs continually push one another to keep pace. "All of these models are continuing to be awesome, challenge each other, push the frontier," Doshi said.