Bindu Reddy (@bindureddy): Gemini 3.2 Flash - Capitalizing on DeepMind's clever distillation techniques... Rumors are that benchmarks show it's hitting 92% of GPT 5.5's performance on coding and reasoning tasks while being 15-20x cheaper on inference costs. The latency improvements are insane - sub-200ms for most queries. Google's distillation + sparsity techniques are paying off massively. They've essentially compressed a frontier model into a flash variant without the usual quality cliff.

Bindu Reddy

@bindureddy

CEO of @abacusai, the world’s first AI super assistant and general-purpose agent - your AI control center! open-source AI zealot. ex-AWS & Google

Joined May 2007

318 Following 213K Followers

Bindu Reddy@bindureddy

2026.05.14 03:36
