
區塊先生 🐡 ⚠️ (rock #58)
@mrblock
investor & advisor at @CurveFinance @lighter_xyz & Puffer 🎍 I make Crypto videos; mostly I just make Crypto videos and do angel investing
Joined May 2019
4.1K Following    99.8K Followers
Another AI contender has arrived! SubQ has launched its first frontier LLM, built on a fully sub-quadratic sparse-attention (SSA) architecture with a context window of 12 million tokens.

Key numbers:
• Roughly 1/1000 the compute of a conventional quadratic Transformer
• 1M-token prefill 52x faster than FlashAttention
• Less than 5% the cost of Claude Opus

Benchmarks:
• SWE-Bench Verified: 81.8%
• RULER @128K: 95%

Early access and the SubQ Code agent are now open:
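The "roughly 1/1000 the compute" claim can be sanity-checked with simple arithmetic. A minimal sketch, assuming each query attends to a fixed budget of keys; the 12,000-key figure is hypothetical, since the post gives no actual sparsity numbers:

```python
# Back-of-envelope check of the "~1/1000 the compute" claim.
# Assumption (mine, not the post's): each query attends to a fixed
# budget of ~12,000 keys out of the 12M-token context.
n = 12_000_000                      # context window in tokens
keys_per_query = 12_000             # hypothetical sparse budget
dense_pairs = n * n                 # quadratic attention: 1.44e14 score pairs
sparse_pairs = n * keys_per_query   # sparse attention: 1.44e11 score pairs
print(dense_pairs / sparse_pairs)   # 1000.0 -> the claimed ~1000x reduction
```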
Introducing SubQ - a major breakthrough in LLM intelligence.

It is the first model built on a fully sub-quadratic sparse-attention (SSA) architecture, and the first frontier model with a 12 million token context window, which is:
- 52x faster than FlashAttention at 1M tokens
- Less than 5% the cost of Opus

Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention). Only a small fraction actually matter. @subquadratic finds and focuses only on the ones that do.

That's nearly 1,000x less compute and a new way for LLMs to scale.
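The quoted explanation describes the general idea of sparse attention. Below is a minimal toy sketch of one common variant, top-k sparse attention, where each query keeps only its highest-scoring keys. This illustrates the general technique, not SubQ's actual SSA mechanism, which the post does not detail; for simplicity the scoring step here is still quadratic, so a real sub-quadratic design would also need a cheaper candidate-selection step.

```python
# Toy top-k sparse attention (illustrative; not SubQ's SSA).
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, k_top=64):
    """Each query attends only to its k_top highest-scoring keys.

    q, k, v: (batch, seq_len, dim). With k_top fixed, the softmax and
    weighted sum cost O(n * k_top) instead of O(n^2). The dense scoring
    below is kept for clarity and is where a real sub-quadratic model
    would substitute a cheap key-selection mechanism.
    """
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5   # (b, n, n)
    top_vals, top_idx = scores.topk(k_top, dim=-1)          # keep k_top keys per query
    weights = F.softmax(top_vals, dim=-1)                   # softmax over kept keys only
    # Gather the selected value vectors: (b, n, k_top, dim)
    v_sel = torch.gather(
        v.unsqueeze(1).expand(-1, q.shape[1], -1, -1),
        2,
        top_idx.unsqueeze(-1).expand(-1, -1, -1, v.shape[-1]),
    )
    return (weights.unsqueeze(-1) * v_sel).sum(dim=2)       # (b, n, dim)

q = k = v = torch.randn(1, 512, 64)
out = topk_sparse_attention(q, k, v)
print(out.shape)  # torch.Size([1, 512, 64])
```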