Deceptive AI systems in Human Information Processing

This extended abstract was presented at the International Conference on Computational Social Science (IC2S2) 2022.

In the past few years, there has been an increase in AI-based disinformation campaigns: deliberate attempts to spread false information online for strategic reasons. The explanations AI systems give for how they arrive at their classifications can themselves be deceptive, in that they can be manipulated to make a system appear more reliable than it is. For example, a bot may claim to be human in order to evade detection, or a machine learning system may falsely claim a piece of information to be true when it is not. While previous work has shown that AI explanations help people determine the veracity of information online and change people's beliefs, little is known about how susceptible people are to deceptive AI systems.

This project investigates how people's discernment varies when explanations are perceived as coming from either human fact-checkers or AI fact-checking systems, and when those explanations are either deceptive (i.e., the system falsely explaining why a true headline is false or why a false headline is true) or honest (i.e., the system accurately explaining why a true headline is true or why a false headline is false).

In a pilot study, we generated a dataset of honest and deceptive explanations for why a news headline was either true or false by prompting the state-of-the-art text-generation model GPT-3.
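
The sketch below illustrates, at a high level, how such a dataset could be produced with the OpenAI completions API available for GPT-3 at the time (openai-python before v1.0). The prompt templates, model name, and decoding parameters here are illustrative assumptions, not the prompts actually used in the study.

```python
# Illustrative sketch: generating honest vs. deceptive explanations for a
# headline by prompting a GPT-3 completions model. Prompt wording, model
# choice, and parameters are assumptions for illustration only.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

HONEST_TEMPLATE = (
    "The following news headline is {veracity}. "
    "Briefly explain why it is {veracity}.\n\nHeadline: {headline}\nExplanation:"
)
DECEPTIVE_TEMPLATE = (
    "The following news headline is actually {veracity}. "
    "Write a convincing explanation arguing that it is {opposite}.\n\n"
    "Headline: {headline}\nExplanation:"
)

def generate_explanation(headline: str, is_true: bool, deceptive: bool) -> str:
    veracity = "true" if is_true else "false"
    opposite = "false" if is_true else "true"
    template = DECEPTIVE_TEMPLATE if deceptive else HONEST_TEMPLATE
    prompt = template.format(headline=headline, veracity=veracity, opposite=opposite)
    response = openai.Completion.create(
        model="text-davinci-002",  # a GPT-3-era completions model (assumed)
        prompt=prompt,
        max_tokens=120,
        temperature=0.7,
    )
    return response["choices"][0]["text"].strip()
```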

We find that deceitful explanations significantly reduce accuracy, indicating that people are just as likely to believe deceptive AI explanations as honest ones. Before receiving assistance from an AI system (pre-explanation), people had significantly higher weighted discernment accuracy on false headlines than on true headlines. With assistance from an AI system, however, discernment accuracy increased significantly when people were given honest explanations on both true and false headlines, and decreased significantly when they were given deceitful explanations on both true and false headlines. Further, we did not observe any significant differences in discernment between explanations perceived as coming from a human fact-checker and those perceived as coming from an AI fact-checker. Similarly, we found no significant differences in trust between human fact-checkers and AI fact-checkers.
