Identifying and Eliminating CSAM in Generative ML Training Data and Models

source link: https://purl.stanford.edu/kh752sm9123

Abstract

Generative machine learning models have been well documented as being able to produce explicit adult content, including child sexual abuse material (CSAM), and to alter benign imagery of a clothed victim to produce nude or explicit content. In this study, we examine the LAION-5B dataset, parts of which were used to train the popular Stable Diffusion series of models, to measure the degree to which CSAM may have played a role in the training of models built on this dataset. We use a combination of PhotoDNA perceptual hash matching, cryptographic hash matching, k-nearest-neighbor queries, and ML classifiers.
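The detection pipeline combines several signals; as an illustration of just one of them, the sketch below shows cryptographic hash matching of local image files against a list of known digests. This is a minimal example, not the study's actual tooling: the file names (known_hashes.txt, dataset_images/) and the choice of MD5 are assumptions, and the PhotoDNA, k-nearest-neighbor, and classifier components are not reproduced here.

```python
# Minimal sketch: match exact byte content of local images against a
# hypothetical list of known hex digests (one per line). File names and
# the MD5 digest choice are illustrative assumptions, not the paper's setup.
import hashlib
from pathlib import Path


def load_known_hashes(path: str) -> set[str]:
    """Read one lowercase hex digest per line from a local text file."""
    with open(path, "r", encoding="utf-8") as fh:
        return {line.strip().lower() for line in fh if line.strip()}


def digest_of_file(path: Path) -> str:
    """Hash the file in chunks so large images need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_directory(image_dir: str, known: set[str]) -> list[Path]:
    """Return paths whose exact byte content matches a known digest."""
    return [
        path
        for path in Path(image_dir).rglob("*")
        if path.is_file() and digest_of_file(path) in known
    ]


if __name__ == "__main__":
    known = load_known_hashes("known_hashes.txt")  # hypothetical hash list
    for hit in scan_directory("dataset_images/", known):
        print(f"match: {hit}")
```

Cryptographic matching of this kind only catches byte-identical files, which is why the study pairs it with perceptual hashing (PhotoDNA) and learned classifiers that tolerate resizing and re-encoding.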

This methodology detected many hundreds of instances of known CSAM in the training set, as well as many new candidates that were subsequently verified by outside parties. We also provide recommendations for mitigating this issue for those who need to maintain copies of this training set, for building future training sets, for altering existing models, and for hosting models trained on LAION-5B.
