Inference Chips for Agent Workflows
@sdianahu
Most AI chips are designed for "prompt in, response out." Agents don't work that way. They loop, branch, and hold context across dozens of steps — so the accelerator sits idle while tools run between inference calls, and current GPUs hit 30–40% utilization as a result.
That gap is where purpose-built silicon wins.
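The loop-and-wait pattern above can be sketched in a few lines. This is a toy illustration, not any real framework: `call_model`, `run_tool`, and the latency numbers are all hypothetical stand-ins, chosen only to show how an agent alternates GPU-bound inference with off-GPU tool execution while its context grows each step.

```python
import time

def call_model(context):
    # Stand-in for a GPU inference call — the step an accelerator performs.
    time.sleep(0.01)  # hypothetical inference latency
    step = len(context)
    return {"action": "search" if step % 2 == 0 else "summarize",
            "output": f"step-{step}"}

def run_tool(action):
    # Tool execution (API call, code run, retrieval) happens off-GPU;
    # the accelerator idles for this entire interval.
    time.sleep(0.02)  # hypothetical tool latency
    return f"result-of-{action}"

def agent_loop(task, max_steps=5):
    context = [task]  # context accumulates across every step
    for _ in range(max_steps):
        decision = call_model(context)               # GPU busy
        observation = run_tool(decision["action"])   # GPU idle
        context.append(decision["output"])
        context.append(observation)
    return context

ctx = agent_loop("book a flight")
print(len(ctx))  # 1 initial task + 2 entries per step -> 11
```

Even in this toy version, the GPU-busy phase is a minority of wall-clock time, and each inference call serves a batch of one — exactly the shape general-purpose GPUs weren't built for.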