Mitigating hate speech online using AI

Employing AI to reduce the emotional toll of combating online hate speech.

Online hate speech has real-world consequences, including hate crimes and physical attacks. Research shows that exposure to online hate speech weakens empathy [1], which may lead to more criminal behaviour [2].

Key Findings from Online Hate and Harassment: The American Experience 2023 report.

Data reported by the US [3] [4] [5] [6], the UK [7], and European countries [8] indicate that online hate speech has surged in recent years and that hate-related crimes are at a record high.

Most of the hate speech still occurs on mainstream social media platforms.

Despite tech companies’ commitments to making their platforms safe, hate speech continues to find its way onto major platforms [9] [10] [11].

Top 3 platforms on which respondents experienced online abuse (percentages — multiple answers). [9], p.24

During the COVID-19 pandemic, there was a spike in hate speech targeting women, with the majority of the abuse occurring on mainstream platforms like Twitter (now X), Facebook, Instagram, WhatsApp, Slack, and Snapchat.

Social media platforms claim to take action, but there are caveats.

Social platforms report record numbers in the removal of hate speech [12], yet their commitments are not always consistent [13] [14] [15].

Simply removing hate speech content does not decrease its prevalence.

In addition, content removal serves to fracture and polarize the internet [16] [17]. As the UN Secretary-General puts it, ‘Social media provides a global megaphone for hate’ [18].

Taking down content hardly changes the minds of people who spread hate.

Straightforward moderation hardly changes the minds of those who speak hatefully [19]. Instead, they might ‘go underground’, sharing hate speech in other online places [20] [21].

* [19] Online hate and freedom of speech. Qualitative research into the impact of online hate. Page 39

The shooter who attacked a synagogue in Pittsburgh used Gab to share antisemitic hate speech and hinted at his plans before he killed eleven people and injured six more [22].

Research indicates empathetic responses can be used to alter offender actions.

Counter-speech messages that generate empathy for hate speech victims are more likely to convince senders to change their ways [23] [24] [25].

Hate speakers are 2.5x more likely to delete their hateful comments when confronted with empathetic counterspeech.

Researchers discovered that messages sparking empathy were 2.5 times more effective than humor or warnings of consequences in getting authors to remove offensive comments.

Although empathic counter-speech can be effective, it has its downsides.

The main reason people are hesitant to engage in counter-speech is that it can be emotionally taxing and potentially risky [24].

* [19] Reactions and coping behaviours. Qualitative research into the impact of online hate. Page 27

Counter-speakers often become targets of online attacks themselves, and staying empathic towards offenders under those conditions is harder still.

Current LLMs are capable of eliminating the downsides associated with empathic responses.

LLMs like ChatGPT can understand the language semantics of comments and provide context for hate speech.

ChatGPT was prompted to estimate the sentiment of a hate speech comment.

They can also understand human biases involved and generate empathic replies for effective counter-speech.
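As a rough illustration of this idea, the sketch below builds a prompt that asks a model to estimate the sentiment of a comment and explain its context. Everything named here is an assumption made for the example: the `complete` callable stands in for whichever LLM API a platform would actually call, and the prompt wording and JSON fields are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Callable

# Hypothetical prompt; the wording and the JSON fields are assumptions.
PROMPT_TEMPLATE = (
    "You are assisting a content moderator.\n"
    "Estimate the sentiment of the comment below and explain its context.\n"
    'Reply with JSON only: {{"sentiment": "negative|neutral|positive", '
    '"is_hateful": true|false, "context": "<one sentence>"}}\n\n'
    "Comment: {comment}"
)


@dataclass
class Assessment:
    sentiment: str
    is_hateful: bool
    context: str


def assess_comment(comment: str, complete: Callable[[str], str]) -> Assessment:
    """Ask a model to assess a comment.

    `complete` is a placeholder: any function that maps a prompt to model text.
    """
    raw = complete(PROMPT_TEMPLATE.format(comment=comment))
    try:
        data = json.loads(raw)
        return Assessment(
            sentiment=str(data["sentiment"]),
            is_hateful=bool(data["is_hateful"]),
            context=str(data["context"]),
        )
    except (json.JSONDecodeError, KeyError, TypeError):
        # Models do not always return valid JSON; report "unknown" instead of guessing.
        return Assessment(sentiment="unknown", is_hateful=False, context="")
```

Keeping the model behind a plain callable means the same function can be exercised with a canned response during testing and wired to a real model later.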

How can social media platforms use LLMs to mitigate hate speech?

There are a few aspects that social media platforms can address to help reduce the downsides associated with empathic responses.

Emotional health, physical safety, self-control, and guidance are the main themes in hate speech research.

Make combating hate speech less emotionally taxing.

There is a substantial number of people willing to combat hate speech online. Social platforms can support these individuals by flagging instances of hate speech.

Providing context for hate speech may play a crucial role in addressing it.

Big social media platforms already place sensitive content, including nudity, violence, and other sensitive topics, behind a content warning wall, but this does not extend to hate speech.

Hate speech content flagging settings.

Providing context for harmful content can protect people from excessive exposure to hate speech and help them choose their battles more wisely.

Flagged hate speech content in the feed.
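The flagging setting described above could work roughly like the sketch below. It is only an illustration: the `Visibility` options, the `threshold` default, and the `hate_score` field (assumed to come from an upstream classifier) are hypothetical names, not part of any platform's real API.

```python
from dataclasses import dataclass
from enum import Enum


class Visibility(Enum):
    SHOW = "show"  # show flagged posts as usual
    WARN = "warn"  # place flagged posts behind a content warning
    HIDE = "hide"  # hide flagged posts completely


@dataclass
class UserSettings:
    # Hypothetical user preference for flagged hate speech.
    flagged_hate_speech: Visibility = Visibility.WARN
    # Hypothetical score at or above which a post counts as flagged.
    threshold: float = 0.8


@dataclass
class Post:
    text: str
    hate_score: float  # assumed classifier output in [0, 1]
    context_note: str = ""  # optional short explanation of why it was flagged


def render(post: Post, settings: UserSettings) -> str:
    """Return what the feed shows for this post under the user's settings."""
    flagged = post.hate_score >= settings.threshold
    if not flagged or settings.flagged_hate_speech is Visibility.SHOW:
        return post.text
    if settings.flagged_hate_speech is Visibility.HIDE:
        return "[Post hidden by your hate speech settings]"
    note = f" Context: {post.context_note}." if post.context_note else ""
    return f"[Content warning: possible hate speech.{note} Tap to view.]"
```

Defaulting to a warning instead of hiding leaves the decision with the reader, which matches the goal of helping people choose their battles.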

Protect people who engage and speak up.

Furthermore, people might be more inclined to join efforts in combating hate speech if they feel safe while commenting on it.

Safety and privacy features to protect against stalking.

Social media platforms like Facebook already implement features to protect personal information [26]. They could take a further step by extending these policies to combat hate speech.

Anonymous comments settings.

Hiding personal information from people outside of your circle, especially when commenting on hot topics, might protect individuals from vulnerable social groups and empower them to speak out.

Anonymous comments view in the feed.
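A minimal sketch of the anonymous comments idea follows, assuming a hypothetical `User` record with a `circle` of trusted account ids and an `anonymous_outside_circle` preference. None of these names come from a real platform; they only illustrate the rule of hiding identity from viewers outside the author's circle.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class User:
    user_id: str
    display_name: str
    # Hypothetical: ids of the accounts this user trusts (friends, followers).
    circle: Set[str] = field(default_factory=set)
    # Hypothetical preference: hide identity from everyone outside the circle.
    anonymous_outside_circle: bool = True


def author_label(author: User, viewer_id: str) -> str:
    """Return the name a given viewer sees next to the author's comment."""
    in_circle = viewer_id == author.user_id or viewer_id in author.circle
    if in_circle or not author.anonymous_outside_circle:
        return author.display_name
    return "Anonymous"
```

For example, with `User("u1", "Dana", circle={"u2"})`, viewer `u2` sees "Dana", while any other viewer sees "Anonymous".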

Provide guard rails for emotional swings.

People often use hate speech to relieve tension and frustration [27, p.5]. The Gallup Global Emotions Report showed that one in three people worldwide felt daily pain in 2023 [28]. This suggests that any of us may slip into hateful speech from time to time.

Emotional self-check feature.

Companies like Grammarly can analyze message sentiment in real time [29], an approach that social media platforms could adopt to help prevent hate speech.

Message sentiment analysis while typing.

Automatically checking sentiment while typing a post or reply can help individuals in emotional states avoid posting hateful or inappropriate comments.

Indicating harmful speech detection.
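One way such a self-check could behave is sketched below: the draft is scored only after the author pauses typing, and a gentle nudge appears if the score is high. The `DraftChecker` class, its thresholds, and the word list are all hypothetical; `toy_hostility_score` is a deliberately crude stand-in for a real sentiment model.

```python
from typing import Callable, Optional

# Toy stand-in vocabulary; a real feature would use a trained model instead.
HOSTILE_WORDS = {"hate", "stupid", "disgusting"}


def toy_hostility_score(text: str) -> float:
    """Return the share of words in the text that appear in HOSTILE_WORDS."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    if not words:
        return 0.0
    return sum(w in HOSTILE_WORDS for w in words) / len(words)


class DraftChecker:
    """Scores a draft only once the author has paused typing."""

    def __init__(
        self,
        score: Callable[[str], float],
        pause_seconds: float = 0.5,  # hypothetical pause before checking
        warn_at: float = 0.2,  # hypothetical score that triggers a nudge
    ) -> None:
        self.score = score
        self.pause_seconds = pause_seconds
        self.warn_at = warn_at
        self._draft = ""
        self._last_edit_at: Optional[float] = None

    def on_keystroke(self, draft: str, now: float) -> None:
        """Record the latest draft text and the time of the edit."""
        self._draft = draft
        self._last_edit_at = now

    def poll(self, now: float) -> Optional[str]:
        """Return a nudge if the author has paused and the draft scores high."""
        if self._last_edit_at is None:
            return None
        if now - self._last_edit_at < self.pause_seconds:
            return None  # still typing; do not interrupt
        if self.score(self._draft) >= self.warn_at:
            return "This message may come across as hostile. Review before posting?"
        return None
```

Timestamps are passed in explicitly so the pause logic can be tested without real timers; a client would call `poll` from its own UI loop.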

Help people use counter-speech effectively.

If there were tools to simplify and enhance responses to hateful posts or comments, more people might be encouraged to join the fight against hate speech.

Reply suggestions feature.

Social platforms offer resources and guidance to users encountering individuals with self-harm and suicidal thoughts [30], yet similar support is not extended for instances of hate speech [31].

Clear guidance on engaging with hate speech.

By offering comprehensive guides and suggestions generated by LLMs, social media platforms can help people know how to act when they face hate speech and engage effectively [32].

Smart response suggestions view.
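The reply suggestions feature could be approximated along the lines below. This is a sketch under stated assumptions: `complete` is a placeholder for any LLM call, and the `GUIDANCE` text is a hypothetical instruction written for this example, favouring empathy because the research cited earlier found it more effective than humor or warnings.

```python
from typing import Callable, List

# Hypothetical guidance text; a platform would write and review its own.
GUIDANCE = (
    "Respond calmly, do not insult the author, and help the author see "
    "how the comment affects the people it targets."
)


def suggest_replies(
    hateful_comment: str,
    complete: Callable[[str], str],
    n: int = 3,
) -> List[str]:
    """Ask a model for up to `n` short, empathetic replies to a comment.

    `complete` is a placeholder: any function that maps a prompt to model text.
    """
    prompt = (
        f"{GUIDANCE}\n"
        f"Write {n} short, empathetic replies to the comment below, "
        "one per line.\n\n"
        f"Comment: {hateful_comment}"
    )
    # Drop blank lines and leading bullet characters from the model output.
    lines = [line.strip(" -*\t") for line in complete(prompt).splitlines()]
    return [line for line in lines if line][:n]
```

The suggestions are only a starting point; the person replying stays in control of what is actually posted.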

Counter-speech beyond social platforms.

The use of counter-speech could be extended to modern browsers and operating systems (OS).

Browser integrations for commenting online on websites.

Browser settings or browser extension

Browsers such as Safari, Chrome, and Brave could incorporate built-in features or extensions enabling users to self-check their communication on platforms beyond mainstream social media, including comments on tabloids and news websites.

iOS and Android integrations for commenting through messaging apps.

Mobile and desktop platforms

iOS, alongside its existing sensitive content feature, could introduce a function to flag hate speech in iMessage or while browsing the web in Safari. Users could control this feature directly from their device settings or through social platform preferences, with portable LLMs providing the support [33].

Counter-speech app.

To demonstrate the concept, I’ve constructed a custom ChatGPT. This version can assist users in self-checking their messages or crafting responses with empathy to hateful comments.

Custom ChatGPT as a proof of concept.

It follows UN guidelines on how to debate hate speech and is aware of the human cognitive biases often present in it.

Case 1: Response to a hateful comment.

You just need to provide the text you’re about to send or the hateful comment you would like to push back on.

Case 2: Pre-checking personal messages before they are posted.

Try DisagreeGPT →
