AI models aren’t yet general-purpose alignment scientists. Progress isn't as easy to verify on most alignment research tasks: our AARs would find “fuzzier” research much harder.
But our experiment does show that Claude can increase the rate of experimentation and exploration.