Why it is so dangerous for AI to learn how to lie: ‘It will deceive us like the rich’ (2024)

A poker player has bad cards but makes the biggest bet. The rest of the players are scared off by the bluff and concede victory. A buyer wants to negotiate for a product, but shows no interest. They look at other things first and ask questions. Then, casually, they ask about the item they really want, in order to get a cheaper price. These two examples come not from humans, but from models made with artificial intelligence (AI).

A new scientific article titled AI Deception: A Survey of Examples, Risks, and Potential Solutions, published in the journal Patterns, analyzes known cases of models that have deceived via manipulation, sycophancy and cheating to achieve their goals. These models are not aware of what they are doing and are only looking for the best way to achieve their objectives, but the researchers believe that these incipient deceptions do not bode well if legislation does not limit AI’s options.

“At this point, my biggest fear about AI deception is that a super-intelligent autonomous AI will use its deception capabilities to form an ever-growing coalition of human allies and eventually use this coalition to achieve power, in the long-term pursuit of a mysterious goal that would not be known until after the fact,” says Peter S. Park, a postdoctoral researcher in AI existential safety at the Massachusetts Institute of Technology (MIT) and one of the paper’s lead authors.

Park’s fear is hypothetical, but we have already seen it play out in an AI programmed for a game. Meta announced in 2022 that its Cicero model had beaten human rivals at Diplomacy, a strategy game that — in the company’s words — is a mix of Risk, poker and the television show Survivor. As in real diplomacy, one of the resources players have is to lie and pretend. Meta employees noticed that when Cicero lied its moves were worse, so they programmed it to be more honest. But it wasn’t, really.

Peter S. Park and his co-authors also tested Cicero’s honesty. “It fell to us to correct Meta’s false claim about Cicero’s supposed honesty, which had been published in Science.” The political context of the Diplomacy game involves less risk than real-life contexts such as elections and military conflicts. But three facts should be kept in mind, says Park: “First, Meta successfully trained its AI to excel in the pursuit of political power, albeit in a game. Second, Meta tried, but failed, to train that AI to be honest. And third, it was up to us outside independent scientists to, long after the fact, disprove Meta’s falsehood that its power-seeking AI was supposedly honest. The combination of these three facts is, in my opinion, sufficient cause for concern.”

How AI lies

The researchers identify several ways in which specific AI models have shown they can deceive effectively: they can manipulate, as in Diplomacy; feign intent by saying they will do something while knowing they will not; bluff, as in poker; haggle in negotiations; play dead to avoid detection; and trick human reviewers into believing that the AI has done what it was supposed to do when it has not.

Not all types of deception involve this kind of deliberate maneuvering. Sometimes, and unintentionally, AI models are “sycophants” that simply agree with human users. “Sycophancy could lead to persistent false beliefs in human users. Unlike ordinary errors, sycophantic claims are specifically designed to appeal to the user. When a user encounters these claims, they may be less likely to fact-check their sources. This could result in long-term trends away from accurate belief formation,” states the study.

No one knows for sure how to make these models tell the truth, says Park: “With our current level of scientific understanding, no one can reliably train large language models not to deceive.” What’s more, many engineers at many companies are working on creating different, more powerful models, and not everyone is equally interested in their models being honest: “Some engineers take the risk of AI deception very seriously, to the point of advocating for or implementing AI safety measures. Other engineers do not take it so seriously and believe that a trial-and-error process will be enough to move towards safe, non-deceptive AI. And there are still others who refuse to even accept that the risk of AI deception exists,” says Park.

Deceiving to gain power

In the article, the researchers compare the behavior of a super-intelligent AI to the way the rich aspire to gain ever more power. “Throughout history, wealthy actors have used deception to increase their power,” the study reads.

Park explains how this could happen: “AI companies are in an uncontrolled race to create a super-intelligent AI that surpasses humans in most economically and strategically relevant capabilities. An AI of this type, like the rich, would be an expert at carrying out long-term plans in the service of deceptively seeking power over various parts of society, such as influencing politicians with incomplete or false information, financing disinformation in the media or in research, and evading responsibility by using the law. Just as money translates into power, many AI capabilities, such as deception, also translate into power.”

But not all academics are as concerned as Park. Michael Rovatsos, professor of Artificial Intelligence at the University of Edinburgh, told SMC Spain that the research is too speculative: “I am not so convinced that the ability to deceive creates a risk of ‘loss of control’ over AI systems, if appropriate rigor is applied to their design; the real problem is that this is not currently the case and systems are released into the market without such safety checks. The discussion of the long-term implications of deceptive capabilities raised in the article is highly speculative and makes many additional assumptions about things that may or may not happen in the future.”

The study argues that the solution to curtailing the risks of AI deception is legislation. The European Union assigns each AI system one of four risk levels: unacceptable, high, limited, and minimal (or no) risk. Systems with unacceptable risk are prohibited, while high-risk systems are subject to special requirements. “We argue that AI deception presents a wide range of risks to society, so [deceptive systems] should be treated by default as high risk or unacceptable risk,” says Park.
