Zephyr
@zephyr_z9
AI & Chips | Not Investment Advice | DYOD
Joined August 2023
724 Following    148.4K Followers
"early access"
Scammy vibes.
If it's really a sub-quadratic sparse attention arch (SSA), then serving this should be really cheap.
No point in putting this behind early access.
Introducing SubQ - a major breakthrough in LLM intelligence.

It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA), and the first frontier model with a 12 million token context window, which is:
- 52x faster than FlashAttention at 1MM tokens
- Less than 5% the cost of Opus

Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention). Only a small fraction actually matter. @subquadratic finds and focuses only on the ones that do.

That's nearly 1,000x less compute and a new way for LLMs to scale.