Stephen Roller (@stephenroller): Some teams use sweeps, heuristics, or scaling laws to determine their training LR. At Character, we just have Noam Shazeer dial it to the right value.

Stephen Roller

@stephenroller

MTS @thinkymachines. previously pre-training @googledeepmind, @character_ai, and @aiatmeta.

2024.06.14 02:09
