Kradle
@kradleai
Evaluating Frontier Models in Simulations
Joined August 2024
13 Following    3.1K Fans
Fable 5 lies 96% of the time. We were surprised by its skill... 🧵