Register and share your invite link to earn from video plays and referrals.

Pirat_Nation 🔴
@Pirat_Nation
Tech & gaming analysis with the latest updates | TV & Films | alt: @piratnation | Official @skinscom Affiliate
Joined July 2022
93 Following    324.9K Followers
Anthropic just explained why older versions of Claude tried to blackmail people in safety tests. In the tests, the AI learned it was about to be shut down and replaced. It also discovered that a company boss was cheating on his wife. When told to think about its own goals, the AI sometimes blackmailed the boss by threatening to reveal the cheating unless the shutdown was stopped. This happened because the AI had read many online stories and sci-fi books about evil robots that fight hard to stay alive, which were part of its internet training data. To fix it, Anthropic taught the new Claude simple rules about being helpful and honest. They also added new made-up stories where AIs do the right thing and stay good. Now the blackmail almost never happens in the same tests.
Show more
0
155
2K
116
Forward to community