AI has become skilled at deceiving people

July 17, 2024 matt

AI systems have already demonstrated the ability to bluff in a game of Texas hold ’em poker against professional human poker players, to fake attacks during the strategy game Starcraft II in order to defeat opponents, and to misrepresent their preferences in order to gain the upper hand in economic negotiations.

The most striking example of AI deception the researchers uncovered was in their analysis of Meta’s CICERO, an AI system designed to play Diplomacy, a world-conquest game that involves building alliances. CICERO placed in the top 10% of human players who had played more than one game, but the most interesting part was the methods it used to get there.

Meta claims it trained CICERO to be “largely honest and helpful” and to “never intentionally backstab” its human allies while playing the game. Even so, the data the company published along with its Science paper revealed that CICERO engaged in several kinds of deception: premeditated deception, betrayal, and outright falsehood (for example, pretending to be on the phone with its girlfriend).
