Lee Sharkey
@leedsharkey
Scruting matrices @ Goodfire | Previously: cofounded Apollo Research
Joined March 2015
1.6K Following    3.3K Followers
My team at @GoodfireAI has been cooking up a new way to do interpretability: decompose a language model’s weights, not its activations. Our decomposition natively handles attention (!) and behaves less like a lookup table and more like a generalizing algorithm. (1/6)