February 7, 2024

Using AI to monitor the internet for terror content is inescapable—but also fraught with pitfalls

by Stuart Macdonald, Ashley A. Mattheis and David Wells, The Conversation

Credit: Pixabay/CC0 Public Domain

Every minute, millions of social media posts, photos and videos flood the internet. In that time, on average, Facebook users share 694,000 stories, X (formerly Twitter) users publish 360,000 posts, Snapchat users send 2.7 million snaps and YouTube users upload more than 500 hours of video.

This vast ocean of online material needs to be constantly monitored for harmful or illegal content, such as material promoting terrorism and violence.

The sheer volume of content means that it's not possible for people to inspect and check all of it manually, which is why automated tools, including artificial intelligence (AI), are essential. But such tools also have their limitations.

The concerted effort in recent years to develop tools for the identification and removal of online terrorist content has, in part, been fueled by the emergence of new laws and regulations. These include the EU's terrorist content online regulation, which requires hosting service providers to remove terrorist content from their platforms within one hour of receiving a removal order from a competent national authority.

Behavior and content-based tools

In broad terms, there are two types of tools used to root out terrorist content. The first looks at certain account and message behavior. This includes how old the account is, the use of trending or unrelated hashtags and abnormal posting volume.

In many ways, this is similar to spam detection, in that it does not pay attention to content. It is valuable for detecting the rapid dissemination of large volumes of content, which is often bot-driven.
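The behavioral signals described above can be sketched as a simple rule check. This is only an illustration: the `AccountActivity` fields, the `behavior_flags` function and every threshold below are hypothetical and are not taken from any real platform's tooling.

```python
from dataclasses import dataclass


@dataclass
class AccountActivity:
    account_age_days: int        # how old the account is
    posts_last_hour: int         # recent posting volume
    hashtags_used: set[str]      # hashtags on the account's recent posts
    trending_hashtags: set[str]  # hashtags currently trending


def behavior_flags(activity: AccountActivity,
                   min_age_days: int = 7,
                   max_posts_per_hour: int = 60,
                   max_trending_tags: int = 3) -> list[str]:
    """Return the behavioral signals an account trips.

    Only metadata is inspected; the content of the posts is never read.
    All thresholds are assumed values chosen for illustration.
    """
    flags = []
    if activity.account_age_days < min_age_days:
        flags.append("new_account")
    if activity.posts_last_hour > max_posts_per_hour:
        flags.append("abnormal_posting_volume")
    if len(activity.hashtags_used & activity.trending_hashtags) > max_trending_tags:
        flags.append("trending_hashtag_stuffing")
    return flags


suspect = AccountActivity(2, 300, {"a", "b", "c", "d", "x"}, {"a", "b", "c", "d"})
regular = AccountActivity(900, 2, {"cats"}, {"a", "b"})
print(behavior_flags(suspect))
print(behavior_flags(regular))
```

A real system would most likely weigh many more signals and learn its thresholds from data rather than hard-coding them.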

The second type of tool is content-based. It focuses on linguistic characteristics, word use, images and web addresses. Automated content-based tools take one of two approaches.

1. Matching

The first approach is based on comparing new images or videos to an existing database of images and videos that have previously been identified as terrorist in nature. One challenge here is that terror groups are known to try to evade such methods by producing subtle variants of the same piece of content.

After the Christchurch terror attack in New Zealand in 2019, for example, hundreds of visually distinct versions of the livestream video of the atrocity were in circulation.

So, to combat this, matching-based tools generally use perceptual hashing rather than cryptographic hashing. Hashes are a bit like digital fingerprints, and cryptographic hashing acts like a secure, unique identity tag. Even changing a single pixel in an image drastically alters its fingerprint, preventing false matches.
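To make the "single pixel" point concrete, here is a minimal sketch using SHA-256 from Python's standard library. The 64-byte "image" is a made-up stand-in for real pixel data, and SHA-256 is used only as a familiar example of a cryptographic hash.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of raw bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def differing_hex_chars(a: str, b: str) -> int:
    """Count the positions at which two equal-length hex strings differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 8x8 grayscale image stored as 64 bytes, one byte per pixel.
original = bytes([128] * 64)
# Same image with the first pixel made one shade brighter.
altered = bytes([129]) + original[1:]

h1 = sha256_hex(original)
h2 = sha256_hex(altered)
print(h1)
print(h2)
print(differing_hex_chars(h1, h2), "of", len(h1), "hex characters differ")
```

The two digests look unrelated even though the inputs differ by a single pixel, which is why an exact-match database built on cryptographic hashes misses lightly edited copies.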

Perceptual hashing, on the other hand, focuses on similarity. It overlooks minor changes like pixel color adjustments, but identifies images with the same core content. This makes perceptual hashing more resilient to tiny alterations to a piece of content. But it also means that the hashes are not entirely random, and so could potentially be used to try to recreate the original image.
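As a rough illustration, the sketch below implements "average hash", one of the simplest perceptual hashing schemes. It is not claimed to be what any platform actually deploys. It assumes the image has already been reduced to an 8x8 grid of grayscale values, and the sample images, the `matches_known` helper and the `max_distance` threshold are all hypothetical.

```python
def average_hash(pixels: list[list[int]]) -> int:
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions at which two hashes differ."""
    return bin(a ^ b).count("1")


def matches_known(candidate: int, known_hashes: list[int], max_distance: int = 5) -> bool:
    """True if the candidate is within max_distance bits of any known hash."""
    return any(hamming_distance(candidate, k) <= max_distance for k in known_hashes)


# Dark left half, bright right half.
original = [[40] * 4 + [200] * 4 for _ in range(8)]
# Subtle variant: every pixel slightly brightened.
variant = [[min(255, p + 5) for p in row] for row in original]
# Unrelated image: dark top half, bright bottom half.
different = [[40] * 8 for _ in range(4)] + [[200] * 8 for _ in range(4)]

database = [average_hash(original)]
print(hamming_distance(average_hash(original), average_hash(variant)))    # 0
print(hamming_distance(average_hash(original), average_hash(different)))  # 32
print(matches_known(average_hash(variant), database))                     # True
print(matches_known(average_hash(different), database))                   # False
```

Each bit of this hash records whether one region of the image is brighter than average, so the hash itself reveals a coarse light-and-dark layout of the picture. That is a small-scale view of why perceptual hashes are not entirely random.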

This article is republished from The Conversation under a Creative Commons license. Read the original article.
