We're hiring at Gradient.
We're building open-source environment infrastructure for our distributed RL training stack: reproducible and scalable to thousand-GPU runs.
Looking for 1–2 RL Environments engineers / tech leads: You've designed verifiers, built sandboxes for agentic RL rollouts, or shipped RL training data pipelines that survived contact with real training.
Domain depth in math, code, agents, tool use, or GUI is a plus. PhD not required.
Also hiring research interns: PhD / Master's students with hands-on RLHF / RLVR / GRPO / DPO / agentic RL experience. Open-source footprint matters more than paper count. Most intern roles convert post-grad. No age cap. Founding-team-level equity for the right people.
DMs open.