Gemini 3.2 Flash leaks: fast and cheap seems to be the focus
- Gemini 3.2 Flash looks focused on making AI much faster and cheaper without sacrificing too much quality
- According to my sources, Google may rename it to Gemini 3.5 Flash
- It may perform close to Gemini 3.1 Pro while keeping latency very low, with sub-200ms responses rumored for many queries
- Pricing leaks point to around $0.25 input / $2 output per 1M tokens, though honestly that still feels too cheap to fully trust right now
- Google is reportedly using stronger distillation and sparsity techniques to compress the capabilities of larger models into a lightweight version
- Knowledge cutoff is said to be updated to January 2026
- Google also seems focused on grounding + search reliability to reduce hallucinations in real-world workflows
- Expected around Google I/O, possibly 1-2 days before the keynote
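
To put the rumored pricing in perspective, here is a rough cost sketch. It assumes the leaked $0.25 input / $2 output per 1M tokens figures hold, which is unconfirmed; the function name and the example token counts are made up for illustration.

```python
# Rumored, unconfirmed prices in USD per 1M tokens (taken from the leak above).
RUMORED_INPUT_PRICE_PER_M = 0.25
RUMORED_OUTPUT_PRICE_PER_M = 2.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request under the rumored per-token pricing."""
    total = (
        input_tokens * RUMORED_INPUT_PRICE_PER_M
        + output_tokens * RUMORED_OUTPUT_PRICE_PER_M
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 10,000 input tokens and 1,000 output tokens.
    cost = estimate_cost_usd(10_000, 1_000)
    print(f"Estimated cost per request: ${cost:.4f}")
```

If those numbers are real, a request of that size would cost under half a cent, and a million of them would come to about $4,500.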