source link: https://blog.plan99.net/did-russian-bots-impact-brexit-ad66f08c014a

Did Russian bots impact Brexit?

Don’t believe the new narrative

The New York Times recently ran a report headlined, “Signs of Russian Meddling in Brexit Referendum” based on a report in the Times of London. It makes sensational claims that were picked up and repeated by important politicians … like the British Prime Minister.

LONDON — More than 150,000 Russian-language Twitter accounts posted tens of thousands of messages in English urging Britain to leave the European Union in the days before last year’s referendum on the issue, a team of researchers disclosed on Wednesday.

The separate findings amount to the strongest evidence yet of a Russian attempt to use social media to manipulate British politics in the same way the Kremlin has done in the United States, France and elsewhere.

These claims concern me a great deal. I would go so far as to describe them as deliberate lying, or if you like, fake news.

My credentials

I am one of the very, very few people in the world who has actually fought bots on social media platforms. As a member of the Google abuse team from 2010–2013 I spent a large amount of time working on anti-spam and anti-automation platforms.

A talk I gave in 2012 at an internet engineering conference on anti-spam and anti-hacking at Gmail

One initiative I was particularly proud of was a project started in my 20% time, called BotGuard. We used it to quadruple the cost of black market Google accounts (price of fake accounts is one important metric of success in this space). BotGuard went on to be deployed on most of Google’s most important websites at the time: web search, Gmail, AdSense, account creation, YouTube and a host of smaller sites like Blogger and Google Groups. We enjoyed reading the laments of spammers and even the occasional compliment. Signals gathered from our various anti-spam systems were used to throttle or terminate accounts.

Spam fighting teams tend to be small. They usually fit in an average sized meeting room. After leaving Google I was approached by several other tech companies, including a former abuse-fighting colleague who’d gone to Twitter. The discussions never went anywhere because I was kind of tired of fighting bots and had moved on to Bitcoin. But when I visited my friend at Twitter for lunch one day, I was surprised to discover that they weren’t putting much effort into bot fighting at all. At the time Twitter was taking a pounding in the media for too much abuse in the human sense: people being mean to each other. That seemed like a higher priority.

How did they do that?

When I read that a small group of academics had reliably identified over 100,000 accounts as Russian-controlled bots despite not working at Twitter, I was immediately skeptical. The efforts of my team took years of R&D and constant changes to the source code of the websites themselves. Our most effective techniques weren’t based on the words being posted by bots, which are rarely a reliable signal of anything (it is a myth that spam filters work by spotting “spam words”). They weren’t something any outsider could have replicated. And they never yielded information about who was behind the botted accounts.

So I went looking for the actual research paper that the story was based on.

It was tricky to find. The newspapers obviously won’t link to primary sources. The authors mention it on their website under varying names but don’t link to it. Eventually I located what appears to be the only copy on the internet. It’s called “Social network, sentiment and political outcomes: Evidence from #Brexit” by Gorodnichenko, Pham and Talavera. After emailing one of the authors, I discovered that there are multiple versions in circulation and was sent the second (later) version. It’s interesting to see how the paper evolved over time.

The initial version makes several remarkable claims:

  1. “Public opinions about Brexit were likely to be manipulated by bots”
  2. Leave supporters are affected by bots but not remain supporters.
  3. “Since bots play an important role in aggregating information on Twitter and indeed could compel humans’ opinions, there should be a legal framework to control the use of online bots.”

The second version makes substantially similar claims in different words.

Research about social media might be useful if it could be made reliable. But this paper has such severe methodological errors that the entire set of conclusions must be treated as invalid.

How not to spot bots

The biggest problem starts on page 8 of the initial version, where they detail how they performed the technically challenging task of identifying automated accounts:

Bots are defined by three categories: (1) abnormal tweeting time (from 00:00 to 06:00 UK time); (2) abnormal number of tweets per day and (3) tweet sources are platforms

This set of criteria is hopeless! You can’t detect bots with rules like that; if it were this easy, Google wouldn’t have had to invest so many years of R&D in the problem. All they’re doing is selecting accounts that happen to use Twitter a lot.

The most obvious problem is that real people have been known to tweet after midnight using a smartphone (“platforms”). Another is that, according to the second version of the paper, only 5 tweets had to be posted after midnight for a day to be considered “suspicious” (or, alternatively, more than 10 tweets at any time that day), and an account only needed half of its posting days to be suspicious to be classified as a bot.
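
To make the looseness concrete, here is a minimal Python sketch of the classification rule as I understand it from the paper; the thresholds are the ones quoted above, while the function and variable names are my own invention:

```python
# Reconstruction of the paper's rule (second version), as described above:
# a day is "suspicious" if the account posted 5+ tweets between 00:00 and
# 06:00 UK time, or more than 10 tweets in total that day. An account is
# labelled a "bot" if at least half of its posting days are suspicious.

def is_suspicious_day(posting_hours):
    """posting_hours: list of hours (0-23, UK time) at which the account
    tweeted on a single day."""
    night_tweets = sum(1 for hour in posting_hours if 0 <= hour < 6)
    return night_tweets >= 5 or len(posting_hours) > 10

def classify_account(tweets_by_day):
    """tweets_by_day: dict mapping a date to the list of posting hours."""
    days = list(tweets_by_day.values())
    if not days:
        return "human"
    suspicious_days = sum(1 for day in days if is_suspicious_day(day))
    return "bot" if suspicious_days >= len(days) / 2 else "human"

# A night-shift worker, an insomniac or anyone a few time zones east of
# the UK trips the rule immediately:
night_owl = {"2016-06-20": [0, 1, 1, 2, 5, 23],
             "2016-06-21": [0, 0, 1, 3, 4]}
print(classify_account(night_owl))  # -> "bot"
```

Nothing in this rule looks at automation at all; it only measures when and how often an account tweets.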

Not surprisingly this ultra-lax approach leads to a lot of humans being classified as bots. The Times sheepishly notes in the final paragraph of their article that the list of “bots” includes the official account of the Russian Embassy in London. I’d expect there to be many more they aren’t letting on.

However, the paper does not discuss the possibility of classification errors. All accounts selected this way are called bots, and all accounts not selected are called humans, with 100% confidence.
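
To see why ignoring error rates matters, here is a back-of-the-envelope illustration. Every number below is a hypothetical assumption of mine, not a figure from the paper; the point is only that when genuine bots are rare, even a modest false-positive rate produces a “bot” population that is mostly human:

```python
# Hypothetical illustration of how false positives inflate the "bot"
# count when genuine bots are rare. None of these numbers come from the
# paper; they are assumptions for the arithmetic only.

total_accounts = 300_000   # accounts tweeting about the referendum (assumed)
true_bot_rate  = 0.02      # assume 2% are genuinely automated
false_pos_rate = 0.18      # humans wrongly flagged (assumed)
true_pos_rate  = 0.90      # bots correctly flagged (assumed)

bots   = total_accounts * true_bot_rate
humans = total_accounts - bots

flagged_bots   = bots * true_pos_rate
flagged_humans = humans * false_pos_rate
flagged_total  = flagged_bots + flagged_humans

print(f"Flagged as bots: {flagged_total:,.0f} "
      f"({flagged_total / total_accounts:.0%} of all accounts)")
print(f"Share of flagged accounts that are actually human: "
      f"{flagged_humans / flagged_total:.0%}")
```

In this made-up scenario roughly 19% of accounts get flagged, close to the paper’s 20% figure, yet about 91% of the flagged accounts are human. Any analysis that treats the flagged set as “bots, with 100% confidence” is built on sand.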

The second version of the paper says, “There is a clear pattern in humans’ hourly tweeting activities that they are more active during the time from 6 am until 6 pm then reduce their tweeting activities afterwards. However, we do not observe any clear pattern in the hour-by-hour tweeting activity of bots.”

Sounds like they’re on to something! But this is a very odd statement because they then show the following graphs, which clearly show the so-called “bots” going to sleep in the same way as the “humans” do the night before polling day. Why do they say there is no pattern?

[Figure: hour-by-hour tweeting activity of the “human” and “bot” groups in the days around the referendum]
Watch out — the Y axes are not aligned so the magnitude in the graphs aren’t visually comparable.

It would be extremely unusual for bots of any kind to simulate sleep. In fact I never encountered any in all my years of fighting them. When we were identifying bots on the Google network seeing a diurnal activity pattern in the graphs was always taken as proof that our queries had accidentally included real users.
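
For what it’s worth, checking for a diurnal pattern is trivial once you have tweet timestamps. A minimal sketch, assuming the timestamps are already loaded (nothing here comes from the paper):

```python
from collections import Counter

def hourly_activity(timestamps):
    """timestamps: iterable of datetime objects for one group's tweets.
    Returns the share of tweets posted in each hour of the day."""
    counts = Counter(ts.hour for ts in timestamps)
    total = sum(counts.values()) or 1
    return {hour: counts.get(hour, 0) / total for hour in range(24)}

def looks_diurnal(profile, night_hours=range(0, 6)):
    """Crude check: does activity drop off heavily overnight?
    An account posting on an automated schedule usually won't."""
    night_share = sum(profile[h] for h in night_hours)
    # 6 of 24 hours is 25% of the day; people who sleep post far less
    # than that overnight.
    return night_share < 0.10
```

Comparing normalised hourly profiles like this, rather than raw counts on unaligned axes as in the paper’s figure, makes the “bots sleep too” pattern hard to miss.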

They go on to say:

Most bots accounts are newly created with large number of followers and statuses while the number of friends is significantly lower than those of humans’ accounts.

If we had used logic this sloppy in my team we’d have accidentally terminated half the userbase and wrecked the company. In fact their data shows there is no significant difference in number of friends or statuses (see below). Here they’d only be at risk of terminating one fifth of everyone tweeting about the referendum:

In our sample, the number of bots accounts for around 20 percent of the total users.

Not something to take lightly!

Finally, the second version of the paper includes this bizarre statement:

Next, if the user gets the score of 1 (suspicious) for majority of days, then the user is defined as a bot otherwise the user is defined as a human. In addition, we are aware of the existence of users whose tweeting activities are only observed for less than three days. Given 99.9% of those users are defined as humans based on our definition, we can rule out the possibility of overestimating the probability of being a Twitter bot.

As their definition of “bot” is simply accounts that are reasonably active, selecting accounts that only tweet on two days in the entire measurement period (i.e. essentially inactive) will obviously show that all such accounts are not “bots”. So how does it follow that this rules out the possibility of overestimation of the bot rate? It’s this sort of thing that makes the paper hard to understand.

Garbage in, garbage out

Having labelled 20% of all Brexit related tweeters as bots, including the Russian embassy, they proceed to do what looks superficially like a mathematical analysis on their dataset. The problem is this:

[Excerpt from the paper]

After defining a “bot” as anyone who tweets a lot using apps and who has sometimes posted after midnight, they discover that “bots” have more followers. But this is what we’d expect to see if they had simply selected people who like to tweet frequently and are less interested in reading tweets — not many people will follow an account that hardly says anything.

They also claim bots are more likely to be newly created, implying the setup of a network of fake accounts specifically for influencing the referendum. But their data tables don’t agree:

[Table from the paper: account age, followers and friends for “humans” vs “bots”, reported as natural logarithms]

It may not be immediately apparent how to interpret this data, because they have decided to present numbers in the form of natural logarithms. No justification for this is provided. It appears to be a form of mathematical obfuscation. I can’t escape the feeling that presenting numbers in this way is designed to make the paper’s results look more scientific than they really are.

Anyway, raising e to the power of x reverses the natural log and puts the figures back into ordinary units, letting us see that the average pre-referendum account age is 1209 days (or 3.3 years) for “humans” and 1042 days (or 2.8 years) for “bots”. The claim “most bots are newly created” is vague, but it is not supported by the presented data: on average both groups are around three years old.
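
The round trip is just exp undoing ln. A short sketch using the day figures above (the exact log values in the paper’s table may differ slightly from these due to rounding):

```python
import math

# The paper reports means as natural logs; exponentiating recovers the
# ordinary units. The day counts below are the figures derived in the text.
human_age_days = 1209   # "humans"
bot_age_days   = 1042   # "bots"

for label, days in [("humans", human_age_days), ("bots", bot_age_days)]:
    log_value = math.log(days)          # what a log-scale table would show
    recovered = math.exp(log_value)     # back to days
    print(f"{label}: ln = {log_value:.2f}, "
          f"about {recovered:.0f} days, about {recovered / 365:.2f} years")
```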

The average number of followers is 255 for “humans” and 414 for “bots”. The difference between number of friends that “humans” have (354) and “bots” (309) is likewise hardly significant — in fact their table says none of their measurements has a significance of more than 1%. Selecting more recent and more active accounts would obviously show that more content = more followers, and should show a slightly lower number of friends naturally because friends accumulate over time.

All this data says is that if you select users on the basis that they tweet a lot, then they tend to be slightly more recent Twitter users and to use it more intensively.

Unfortunately, because the original dataset is not available the analysis is not reproducible.

Mathiness

By now you see a trend in what I’m pointing out — the English language statements about what the data indicates simply don’t match the data itself.

They say they identified “bots”, but their criteria just select active accounts. They say their “bots” don’t sleep, but the graphs show they do. They say bots are significantly newer accounts, but the tables show the average age of both groups is around three years. They say bots have a “significantly” lower number of friends (354 vs 309), but their tables state the significance of the difference is only 1%. Equations are everywhere, but the inputs to the formulas don’t represent what they’re claimed to represent and the summaries of the outputs are wrong.

These problems crop up frequently in certain areas of academia.

On page 7 we see lots of equations with Greek letters in them which form a vector auto-regressive model: looks impressive! But no matter how clever your mathematical model is, if you feed garbage in you’ll get garbage out.

The authors of this paper are not computer scientists and have no background in bot detection. They are professors of economics and finance — exactly the same segment of society that has been consistently and homogeneously wrong about Brexit. Their track record is so bad that the Bank of England’s chief economist has claimed the entire profession of economics is in crisis.

I’m willing to make aggressive claims about mathematical obfuscation because the problem of obfuscated economics papers is well known in the field. It’s called “mathiness” and was first discussed in a 2015 paper by Paul Romer:

The style that I am calling mathiness lets academic politics masquerade as science. Like mathematical theory, mathiness uses a mixture of words and symbols, but instead of making tight links, it leaves ample room for slippage between statements in natural versus formal language and between statements with theoretical as opposed to empirical content.

That theme was quickly picked up by a variety of other authors who felt the problem had become endemic. Nobel prize winner Paul Krugman has written about this in the field of growth theory, and also wrote days after Leave had won:

What we’re hearing overwhelmingly from economists is the claim that it will also have severe short-run adverse impacts. And that claim seems dubious.

Or maybe more to the point, it’s a claim that doesn’t follow in any clear way from standard macroeconomics — but it’s being presented as if it does. And I worry that what we’re seeing is a case of motivated reasoning, which could end up damaging economists’ credibility.

We know from what happened next (no recession) that Krugman was right.

This paper is a great example of how academic politics has been dressed up to look like science. It’s in reality more like the personal political opinions of the authors projected onto the tweet stream. They use their research to argue for government censorship of social media: “cherishing diversity does not mean that one should allow dumping lies and manipulations to the extent that the public cannot make a well-informed decision … bots could shape public opinions in negative ways. If so, policy-makers should consider mechanisms to prevent abuse of bots in the future” (from the second version). This is just ordinary left wing university politics: people without degrees are weak minded and so governments should control online speech.

The anti-Russia angle that got tacked onto the later version of the paper is probably a mix of the bog standard pro-EU academic slant, combined with the fact that the authors appear to be passionate about the politics of Ukraine — one of them is Ukrainian and strongly anti-Russia, and all three of them are involved with a website called VoxUkraine.

Conclusion

These new claims that Brexit had something to do with bot-controlled Twitter accounts can be traced back to a paper that defines a bot as more or less any account that tweets a lot. Its authors have definitely misidentified human accounts as bots, yet did not consider error rates in their analysis. Its claims are unverifiable and do not match its own data tables, which are presented in obfuscated form. The authors use their dodgy conclusions to argue for government control of social media. Their claims about Russia assume that any account configured to use the Russian translation of Twitter which mentioned Brexit is part of a badly run but malicious conspiracy, rather than the more plausible explanation: Russian people who have opinions on foreign politics. And the authors appear to have pre-existing political biases against Russia, related to events in Ukraine.

Reliably detecting bots is a very difficult problem which only the tech companies themselves are in any position to solve. Twitter is in a particularly poor position to do this due to their prior focus on harassment rather than bulk automation. But at least they have a chance: no academic attempt to statistically identify bots is going to produce any meaningful results at all. The false classification rates will be so enormous that they render this entire line of research pointless. If I had implemented bot-detection logic like the logic these academics used, I’d have shut down so many legitimate accounts it would have made the news.

The New York Times refers to the results of this paper as “findings” and “the strongest evidence yet”. As Theresa May’s belief in this theory makes clear, that is fantastically dangerous and could in the worst case lead to war. It is the most irresponsible abuse of maths I’ve seen for a long time.

