
0xSero
@0xSero
Dad | Open Source AI | Back to Pleroma | ⵣ
Joined December 2020
969 Following    50.2K Followers
GLM-5.1-478B-NVFP4

Running on:
- 4x RTX Pro 6000
- Sglang
- 370,000 max tokens (1.75x full context)
- p10 27.7 | p90 45.6 tok/s decode (gen)
- 1340 tok/s prefill

I could get 2x decode if I limit to 64k context (100 tok/s)

In this video it operates Figma (: