Anthropic
@AnthropicAI
We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant @claudeai on https://t.co/FhDI3KQh0n.
Joined January 2021
36 Following    1.2M Followers
A more realistic example: AIs trained to be harmless chatbots can take unsafe actions in agentic settings. Preceding this training with MSM on a realistic spec drastically improves generalization, reducing unsafe agentic actions. https://t.co/PJcF380iAq
Show more