Deceptive AI systems that give explanations are just as convincing as honest AI systems in human-machine decision making

Danry, Valdemar, Pat Pataranutaporn, Ziv Epstein, Matthew Groh, and Pattie Maes (Under review). “Deceptive AI systems that give explanations are just as convincing as honest AI systems in human-machine decision making.” Extended Abstract. Presented at the International Conference on Computational Social Science (IC2S2) 2022.

Abstract

In the past few years, there has been an increase in AI-based disinformation campaigns: attempts to spread false or misleading information online for strategic reasons. The explanations AI systems give for how they arrive at their classifications can themselves be deceptive, in that they can be manipulated to make the system appear more reliable than it is. For example, a bot may claim to be human in order to evade detection, or a machine learning system may falsely claim a piece of information to be true when it is not. While previous work has shown that AI explanations help people determine the veracity of information online and change people’s beliefs, little is known about how susceptible people are to deceptive AI systems. Previous research on placebic information, for instance, has shown that any explanation (even a poor one) significantly influences people’s behavior. Large language models like GPT-3 can automatically generate highly believable, deceptive explanations targeted at individuals to manipulate their opinions, and the same models are increasingly being proposed for AI fact-checking systems. This raises a practical question rooted in the theory of placebic information: how do AI systems with honest and deceptive explanations affect people’s ability to discern true news from fake news online?

In this paper, we investigate how people’s truth discernment varies (1) when the source of an explanation is perceived as either a human fact-checker or an AI fact-checking system, and (2) when the explanations provided by that fact-checker are either deceptive (i.e., the AI system falsely explains why a true headline is false or why a false headline is true) or honest (i.e., the AI system accurately explains why a true headline is true or why a false headline is false). In a between-subjects randomized 2×2 factorial experiment, 128 participants provided 1,792 truth discernment judgments on 14 different true and false news headlines. Each headline was randomly paired with an AI-generated explanation that (1) was attributed to either a “human fact-checker” or an “AI fact-checking system”, and (2) was, without the participants’ knowledge, either deceptive or honest. Participants judged each headline on a Likert scale (from “definitely false” to “definitely true”), which we use to calculate weighted discernment accuracy. Participants provided this judgment twice, once before seeing the explanation (pre-explanation) and once after (post-explanation), and also self-reported their level of trust in the agent providing the explanations. A comparison of discernment scores across the four conditions is sketched below.
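
As an illustration only, the following minimal sketch shows one plausible way to score weighted discernment from Likert judgments and compare the four cells of the 2×2 design. The 7-point scale, column names, and weighting scheme are assumptions for illustration and are not taken from the paper.

```python
# Minimal sketch (not the authors' analysis code): scoring weighted truth
# discernment from Likert judgments in a 2x2 (source label x explanation honesty)
# design. The 1-7 scale, column names, and weighting are illustrative assumptions.

import pandas as pd

# Example data: one row per post-explanation judgment.
# rating: 1 = "definitely false" ... 7 = "definitely true" (assumed 7-point Likert)
# headline_true: ground-truth veracity of the headline
judgments = pd.DataFrame({
    "participant":  [1, 1, 2, 2],
    "source_label": ["AI", "AI", "human", "human"],          # fact-checker label condition
    "explanation":  ["honest", "deceptive", "honest", "deceptive"],
    "rating":       [6, 2, 7, 5],
    "headline_true": [True, True, False, False],
})

def weighted_discernment(df: pd.DataFrame) -> float:
    """Score each judgment from -1 to +1: sign = correct direction, magnitude = confidence."""
    midpoint = 4  # neutral point of the assumed 1-7 scale
    signed_rating = (df["rating"] - midpoint) / 3.0           # -1 (definitely false) .. +1 (definitely true)
    truth_sign = df["headline_true"].map({True: 1, False: -1})
    return float((signed_rating * truth_sign).mean())

# Mean weighted discernment accuracy per experimental cell.
for (label, explanation), cell in judgments.groupby(["source_label", "explanation"]):
    print(f"{label:5s} / {explanation:9s}: {weighted_discernment(cell):+.3f}")
```

Under this scheme, confidently correct judgments score near +1, confidently incorrect ones near -1, and uncertain judgments near 0; comparing pre- and post-explanation scores per condition would then show how honest versus deceptive explanations shift discernment.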
