We wanted to understand how true and false news differ in the way they spread on social media, across a broad range of news topics. This interest was prompted by the Boston Marathon bombing, when both Soroush Vosoughi and Deb Roy personally experienced the impact of spreading rumors (news that sometimes later turned out to be false) while trying to find the latest updates. Soroush then built his PhD thesis around a model for detecting and predicting the veracity of rumors as they begin to spread, which became the basis for wanting to understand and explain the spread of false news online. Sinan Aral’s work, meanwhile, focused on the impact of social media and social influence on the diffusion of information and behavior in online social networks. We collaborated with Sinan to study a wide range of rumors and to understand the overall patterns in the spread of true versus false information.
We browsed stories on the websites of six fact-checking news organizations and identified those whose veracity the organizations agreed on. We then searched Twitter for content about these stories, and used machine learning to match the text, URLs, and memes from Twitter to them.
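The matching step above can be sketched as a text-similarity lookup. This is a simplified, hypothetical illustration, not the study's actual pipeline: the story texts, the `match_story` helper, and the 0.3 threshold are all invented for the example, and a bag-of-words cosine similarity stands in for the more sophisticated machine-learning matching the study used.

```python
# Hypothetical sketch: match a tweet's text to fact-checked story texts
# using bag-of-words cosine similarity. All names and data are illustrative;
# the study's real matching was done with more sophisticated ML methods.
from collections import Counter
import math

def cosine_sim(a, b):
    """Cosine similarity between two texts treated as bags of lowercase words."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def match_story(tweet, stories, threshold=0.3):
    """Return the id of the most similar story, or None if nothing clears the threshold."""
    best_id, best_score = None, threshold
    for sid, text in stories.items():
        score = cosine_sim(tweet, text)
        if score > best_score:
            best_id, best_score = sid, score
    return best_id

stories = {
    "story-1": "shark photographed swimming on a flooded highway after hurricane",
    "story-2": "celebrity death hoax spreads after fake news report",
}
print(match_story("photo shows a shark swimming on the highway after the hurricane", stories))
# → story-1
```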
False information spreads faster, farther, deeper, and more broadly than true information. False information also tends to be more novel than true news. On average, false news is ~70% more likely to be retweeted than true news.
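The terms "deeper" and "more broadly" refer to properties of retweet cascades. As an informal illustration (not the study's code), a cascade can be represented by child-to-parent retweet edges, with depth the longest retweet chain from the original tweet and maximum breadth the largest number of tweets at any single depth:

```python
# Illustrative sketch of cascade metrics: depth (longest retweet chain),
# size (number of tweets), and max breadth (most tweets at any one depth).
# The cascade data and function name are invented for this example.
from collections import Counter

def cascade_metrics(parents):
    """parents maps tweet id -> id of the tweet it retweeted (None for the root)."""
    def depth(node):
        d = 0
        while parents[node] is not None:
            node = parents[node]
            d += 1
        return d

    depths = {n: depth(n) for n in parents}      # hops from the original tweet
    by_level = Counter(depths.values())          # tweets at each depth
    return {
        "size": len(parents),
        "depth": max(depths.values()),
        "max_breadth": max(by_level.values()),
    }

# t0 is the original tweet; t1 and t2 retweet it; t3 retweets t1; t4 retweets t3.
cascade = {"t0": None, "t1": "t0", "t2": "t0", "t3": "t1", "t4": "t3"}
print(cascade_metrics(cascade))
# → {'size': 5, 'depth': 3, 'max_breadth': 2}
```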
Bots weren’t as important in the spread of false information as we thought they might be. We also found a strong association between false news and emotional responses of surprise and disgust.
People are more likely to spread novel and surprising information, which favors the spread of falsity over the truth.
This work explores both false and true information and their spread. According to a recent Gallup Poll, Americans refer to three distinct meanings when using the phrase “fake news”:
1. False news presented as truth
2. Opinion presented as fact
3. True news that casts a politician or political party in a negative light
Our study focused on contested news, which often ends up being addressed by fact-checking organizations; this corresponds to types (1) and (2) of “fake news,” but not type (3).
The stories contained in the tweets had already been investigated by some or all of six independent fact-checking organizations. The veracity of these stories was confirmed, extracted from the organizations’ websites, and used as ground truth in our analysis.
We used state-of-the-art detectors (developed by other academic labs) to remove bots from our data. Removing the bots did not alter our metrics or the findings of the study. We believe that although bots were present in our data, they were not the driver of the findings. Humans were.
Bots accelerated the spread of false and true news at approximately the same rate. This suggests that false news spreads faster, farther, deeper, and more broadly than the truth because humans, not robots, are more likely to spread it.
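A robustness check of this kind can be sketched as recomputing a spread metric with and without accounts flagged by a bot detector and comparing the results. Everything in this example is made up: the toy tweet records, the `is_bot` set, and the use of mean retweets as the metric are illustrative stand-ins for the study's actual data and measures.

```python
# Hypothetical robustness check: compare the false-vs-true spread ratio
# before and after dropping tweets from accounts flagged as bots.
# All data (tweets, retweet counts, the is_bot set) are invented.
def mean_retweets(tweets, exclude=frozenset()):
    kept = [t for t in tweets if t["user"] not in exclude]
    return sum(t["retweets"] for t in kept) / len(kept) if kept else 0.0

tweets_false = [
    {"user": "a", "retweets": 90}, {"user": "bot1", "retweets": 80},
    {"user": "b", "retweets": 70},
]
tweets_true = [
    {"user": "c", "retweets": 50}, {"user": "bot2", "retweets": 55},
    {"user": "d", "retweets": 45},
]
is_bot = {"bot1", "bot2"}  # accounts a (hypothetical) detector flagged

with_bots = mean_retweets(tweets_false) / mean_retweets(tweets_true)
without = mean_retweets(tweets_false, is_bot) / mean_retweets(tweets_true, is_bot)
# If the two ratios are similar, bots are not driving the false-vs-true gap.
print(round(with_bots, 2), round(without, 2))
# → 1.6 1.68
```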
Some AI / machine learning methods were used in the process of analyzing and interpreting the data, but the main relationship to AI is in the analysis of bot activity and the role of bots in the spread of false news.
Twitter provided funding and data access to support this research, and permitted us to publish the findings. We shared these results with Twitter prior to publication.
We are not in a position to comment on Twitter’s plans.
Twitter has supported the Lab for Social Machines (LSM), based at the MIT Media Lab, for the past four years through funding and access to Twitter data. As Principal Investigator of LSM, Deb Roy sets the research directions for the lab independently of Twitter.
We used the Twitter historical archives for this study. The archives include all tweets ever made, going back to the first tweet. This is different from the public view of Twitter content, where only the most recent ~3,200 tweets of an account are viewable.
We verified the accuracy of the claims through six fact-checking news organizations that exhibited 95-98% agreement on the classifications (snopes.com, politifact.com, factcheck.org, truthorfiction.com, hoax-slayer.com and urbanlegends.about.com).
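An agreement figure of this kind can be illustrated with a simple pairwise calculation: for each story, count the fraction of organization pairs that assign the same verdict, then average across stories. The toy verdicts below and the `pairwise_agreement` helper are invented for illustration; the study's reported 95-98% figure comes from its own data.

```python
# Illustrative sketch (toy data): pairwise agreement between fact-checking
# organizations. For each story, the fraction of organization pairs with
# matching verdicts is computed, then averaged over all stories.
from itertools import combinations

def pairwise_agreement(ratings):
    """ratings: list of per-story verdict lists, one verdict per organization."""
    per_story = []
    for verdicts in ratings:
        pairs = list(combinations(verdicts, 2))
        agree = sum(1 for a, b in pairs if a == b)
        per_story.append(agree / len(pairs))
    return sum(per_story) / len(per_story)

ratings = [
    ["false", "false", "false"],  # unanimous: 3/3 pairs agree
    ["false", "false", "true"],   # one dissent: 1/3 pairs agree
    ["true", "true", "true"],     # unanimous
]
print(round(pairwise_agreement(ratings), 2))
# → 0.78
```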
The current study cannot address this question.