Comment: Discerning truth in the age of ubiquitous disinformation

Professor Kalina Bontcheva from the University of Sheffield’s Department of Computer Science comments on fake news and reflects on her experience of giving evidence at the Digital, Culture, Media and Sport (DCMS) Committee inquiry into fake news

The impact of fake news 

By Professor Kalina Bontcheva, Department of Computer Science, 9.1.18

The past few years have heralded the age of ubiquitous disinformation - aka fake news - which raises serious questions about the role of social media and the internet in modern democratic societies.

Topics and examples abound, ranging from the Brexit referendum and the US presidential election to medical misinformation such as miraculous cures for cancer.

Social media now routinely reinforces its users’ confirmation bias, so that often little to no attention is paid to opposing views or critical reflections. Blatant lies make the rounds, are re-posted and shared thousands of times, and sometimes even jump successfully into mainstream media. Debunks and corrections, on the other hand, receive comparatively little attention.

I often get asked: “So why is this happening?”

My short answer is the 4Ps of the modern disinformation age: post-truth politics, online propaganda, partisan media and polarised crowds.

  • Post-truth politics: The first societal and political challenge comes from the emergence of post-truth politics, where politicians, parties and governments tend to frame key political issues in propaganda instead of facts. Misleading claims are continuously repeated, even when proven untrue through fact-checking by media or independent experts (e.g. the Vote Leave claim that Britain was paying the EU £350 million a week). This has a highly corrosive effect on public trust.
  • Online propaganda and fake news: State-backed (e.g. Russia Today), ideology-driven (e.g. misogynistic or Islamophobic), or for-profit clickbait websites and social media accounts are all engaged in spreading misinformation, often with the intent to deepen social division and/or influence key political outcomes such as the 2016 US presidential election.
  • Partisan media: The pressures of the 24-hour news cycle and today’s highly competitive online media landscape have resulted in poorer-quality reporting and worsening opinion diversity, with misinformation, bias and factual inaccuracies routinely creeping in.
  • Polarised crowds: As more and more citizens turn to online sources as their primary source of news, the social media platforms and their advertising and content recommendation algorithms have facilitated the creation of partisan camps and polarised crowds, characterised by flame wars and biased content sharing, which, in turn, reinforce users’ prior beliefs (typically referred to as confirmation bias).

On Tuesday 19 December 2017, I gave evidence in front of the Digital, Culture, Media and Sport (DCMS) Committee as part of their inquiry into fake news (although I prefer the term disinformation) and automation (aka bots) - their ubiquity, their impact on society and democracy, the role of platforms and technology in creating the problem, and, briefly, whether we can use existing technology to detect and neutralise the effect of bots and disinformation.

We had only an hour-long session to answer 51 questions spanning all these issues, so each answer had to be kept very brief. The full transcript is available here.

The list of questions was not given to us in advance, which, coupled with the need for short answers, left me with a number of additional points I would like to make. This is the first of several blog posts where I will revisit some of these questions.

So let me focus here on the first four questions (Q1 to Q4 in the transcript), which were about the availability and accuracy of technology for automatic detection of disinformation on social media platforms. In particular - can such technology identify disinformation in real time (part of Q3) and should it be adopted by the social media platforms themselves (Q4)?

The very short answer is: Yes, in principle, but we are still far from solving key socio-technical issues, so, when it comes to containing the spread of disinformation, we should not use this as yet another stick to beat the social media platforms with.

And here is why this is the case:

  • Non-trivial scalability: While some of our algorithms work in near real time on specific datasets, such as tweets about the Brexit referendum, applying them across all posts on all topics, as Twitter would need to do, is very far from trivial. Just to give a sense of the scale: prior to 23 June 2016 (referendum day) we had to process fewer than 50 Brexit-related tweets per second, which was doable. Twitter, however, would need to process more than 6,000 tweets per second, which is a serious software engineering, computational, and algorithmic challenge.
  • Algorithms make mistakes: while 90 per cent accuracy intuitively sounds very promising, we must not forget the errors - 10 per cent in this case, or double that for an algorithm that is only 80 per cent accurate. At 6,000 tweets per second, this 10 per cent amounts to 600 wrongly labelled tweets per second, rising to 1,200 for the lower-accuracy algorithm. To make matters worse, automatic disinformation analysis often combines more than one algorithm: the first determines which story a post refers to, and the second whether that story is likely true, false, or uncertain. Unfortunately, when algorithms are executed in sequence, errors have a cumulative effect (see the back-of-the-envelope sketch after this list).
  • These mistakes can be very costly: broadly speaking, algorithms make two kinds of errors - false negatives, in which disinformation is wrongly labelled as true or bot accounts are wrongly identified as human, and false positives, in which correct information is wrongly labelled as disinformation or genuine users are wrongly identified as bots. False negatives are a problem on social platforms because the high volume and velocity of social posts (e.g. 6,000 tweets per second on average) still leaves us with a lot of disinformation “in the wild”. If we draw an analogy with email spam: even though most of it is filtered out automatically, we still receive a significant proportion of spam messages. False positives, on the other hand, pose an even more significant problem, as they could be regarded as censorship. Facebook, for example, has a growing problem with some users having their accounts wrongly suspended.
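To make the arithmetic in the first two bullets concrete, here is a minimal back-of-the-envelope sketch in Python. The volumes and accuracy figures (roughly 50 Brexit-related tweets per second, more than 6,000 tweets per second across Twitter, 90 and 80 per cent accuracy, a two-stage story-plus-veracity pipeline) come from the bullets above; the function name and the assumption that both pipeline stages are 90 per cent accurate are mine, purely for illustration.

```python
# Illustrative arithmetic only: the volumes and accuracies are the ones quoted
# in the bullets above, not measurements.

TWITTER_TWEETS_PER_SECOND = 6_000   # approximate platform-wide volume
BREXIT_TWEETS_PER_SECOND = 50       # Brexit-only volume before 23 June 2016

# Scale contrast: Brexit-only stream vs Twitter as a whole.
print(TWITTER_TWEETS_PER_SECOND / BREXIT_TWEETS_PER_SECOND)     # 120x more posts

def mislabelled_per_second(volume: float, accuracy: float) -> float:
    """Posts that a single classifier of the given accuracy gets wrong each second."""
    return volume * (1.0 - accuracy)

print(mislabelled_per_second(TWITTER_TWEETS_PER_SECOND, 0.90))  # ~600 per second
print(mislabelled_per_second(TWITTER_TWEETS_PER_SECOND, 0.80))  # ~1,200 per second

# Chaining two algorithms (story detection, then veracity classification):
# a post is handled correctly only if both stages get it right, so stage
# accuracies multiply and errors accumulate. Assuming 90 per cent per stage:
pipeline_accuracy = 0.90 * 0.90                                 # 0.81
print(mislabelled_per_second(TWITTER_TWEETS_PER_SECOND, pipeline_accuracy))  # ~1,140 per second
```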
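The relative cost of the two kinds of error also depends on how rare disinformation is. The sketch below is hypothetical: the 5 per cent false-positive and false-negative rates and the assumption that 1 in 100 posts is disinformation are illustrative numbers of mine, not figures from the inquiry, chosen only to show how wrongly flagged genuine posts and undetected disinformation accumulate at Twitter-scale volume.

```python
# Hypothetical split of errors into false positives and false negatives,
# for illustration only.

TWEETS_PER_SECOND = 6_000
DISINFO_RATE = 0.01          # assumed: 1 in 100 posts is disinformation
FALSE_NEGATIVE_RATE = 0.05   # assumed: 5% of disinformation posts are missed
FALSE_POSITIVE_RATE = 0.05   # assumed: 5% of genuine posts are wrongly flagged

disinfo_posts = TWEETS_PER_SECOND * DISINFO_RATE        # 60 disinformation posts/second
genuine_posts = TWEETS_PER_SECOND - disinfo_posts       # 5,940 genuine posts/second

false_negatives = disinfo_posts * FALSE_NEGATIVE_RATE   # disinformation left "in the wild"
false_positives = genuine_posts * FALSE_POSITIVE_RATE   # genuine posts wrongly flagged

print(f"Disinformation missed:         {false_negatives:.0f} posts/second")  # 3
print(f"Genuine posts wrongly flagged: {false_positives:.0f} posts/second")  # 297
```

With a low base rate of disinformation, even a modest false-positive rate wrongly flags far more genuine posts than the number of disinformation posts that slip through, which is why the censorship risk looms so large.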

Views posted in comment articles are those of the author(s) and do not necessarily reflect the opinion of the University of Sheffield.