YC Bench by @CollinearAI:
A benchmark for agents that play CEO of an AI startup for one simulated year, acting through CLI tool calls against a deterministic discrete-event simulation.
Score = final $$ amount achieved
by @nazneenrajani and team
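
For anyone unfamiliar with the term, here is a rough toy sketch of a deterministic discrete-event simulation with a "final funds" score. It is not YC Bench's code: the event types, dollar amounts, and the `policy` callback (standing in for an agent's CLI commands) are all made up for illustration.

```python
import heapq

def run_simulation(policy, start_funds=100_000.0, horizon_days=365):
    """Toy deterministic discrete-event loop; returns final funds as the score."""
    funds = start_funds
    queue = []  # heap entries: (day, seq, kind, amount)
    seq = 0     # unique tie-breaker so same-day events pop in a fixed order

    def schedule(day, kind, amount=0.0):
        nonlocal seq
        if day <= horizon_days:  # events past the horizon never happen
            heapq.heappush(queue, (day, seq, kind, amount))
            seq += 1

    # Hypothetical fixed costs: payroll every 30 days.
    for day in range(30, horizon_days + 1, 30):
        schedule(day, "payroll", -20_000.0)
    # Hypothetical decision points where the agent gets to act.
    for day in range(0, 360, 30):
        schedule(day, "decision")

    while queue:
        day, _, kind, amount = heapq.heappop(queue)
        if kind == "decision":
            # The policy sees the current state and returns a command string.
            if policy(day, funds) == "take_contract":
                schedule(day + 45, "payment", 30_000.0)
        else:
            funds += amount
    return funds

always_take = lambda day, funds: "take_contract"
always_idle = lambda day, funds: "idle"
print(run_simulation(always_take))  # 190000.0
print(run_simulation(always_idle))  # -140000.0
```

No randomness and a fixed event order mean the same policy always gets the same score, which is what makes runs comparable.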
Also a good opportunity to showcase this recent hf command ⬇️