⚡🛡️ Evan Pappas
@Hevalon
🛡️ Ex Technologia Libertas - Έλευθερία διὰ τῆς τέχνης - (Dec/Acc)
Joined September 2009
4K Following    1.4K Followers
I built autoresearch-rl and pointed it at GRPO fine-tuning on @basilic_ai A100s. One command. 15 iterations. Zero human intervention. 100% infrastructure success rate. GSM8K pass@1: 26% baseline to 36%. The hard part wasn't the search algorithm. It was the infrastructure.