Ngo VM, Gajula R, Thorpe C, McKeever S. Discovering child sexual abuse material creators' behaviors and preferences on the dark web. Child Abuse & Neglect. 2024;147:106558. [PMID: 38041966] [DOI: 10.1016/j.chiabu.2023.106558]
Abstract
BACKGROUND
Producing, distributing or discussing child sexual abuse material (CSAM) often takes place on the dark web, where offenders stay hidden from search engines and evade detection by law enforcement agencies. On the dark web, CSAM creators also employ various techniques to conceal their activities and avoid detection. The large volume of CSAM on the dark web is a global social problem and poses a significant challenge for helplines, hotlines and law enforcement agencies.
OBJECTIVE
To identify CSAM discussions on the dark web and to uncover, from the associated metadata, insights into the characteristics, behaviors and motivations of CSAM creators.
PARTICIPANTS AND SETTING
We analysed more than 353,000 posts written by 35,400 distinct users in 118 different languages across eight dark web forums in 2022. Of these, approximately 221,000 posts were written in English, contributed by around 29,500 unique users.
METHOD
We propose a CSAM detection intelligence system. The system uses a manually labeled dataset to train, evaluate and select an efficient CSAM classification model. Once we identify CSAM creators and victims through CSAM posts on the dark web, we analyse and visualize the associated metadata to uncover information about the behaviors of CSAM creators and victims.
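The abstract does not specify the feature representation or toolkit used. Purely as an illustration, the following is a minimal sketch of the train-evaluate-select step using TF-IDF features and scikit-learn; the toy posts, labels and candidate models are hypothetical stand-ins, not the paper's actual pipeline.

# Minimal sketch of the train/evaluate/select step described above.
# Assumptions (not from the paper): TF-IDF features, scikit-learn models,
# and a toy labeled dataset standing in for the manually labeled posts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

posts = ["benign post %d" % i for i in range(4)] + \
        ["flagged post %d" % i for i in range(4)]   # placeholder texts
labels = [0] * 4 + [1] * 4                          # 1 = CSAM, 0 = non-CSAM

X_train, X_test, y_train, y_test = train_test_split(
    posts, labels, test_size=0.25, stratify=labels, random_state=0
)

# Candidate classifiers; the best is selected on held-out metrics.
candidates = {
    "svm": make_pipeline(TfidfVectorizer(), LinearSVC()),
    "naive_bayes": make_pipeline(TfidfVectorizer(), MultinomialNB()),
}

for name, model in candidates.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print(name,
          "precision:", precision_score(y_test, pred, zero_division=0),
          "recall:", recall_score(y_test, pred, zero_division=0),
          "accuracy:", accuracy_score(y_test, pred))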
RESULT
The CSAM classifier based on a Support Vector Machine (SVM) model performed best overall, achieving the highest precision (92.3 %) and accuracy (87.6 %), while the Naive Bayes combination achieved the best recall (89 %). Across the eight forums in 2022, our SVM model detected around 63,000 English CSAM posts and identified nearly 10,500 English-writing CSAM creators. Analysis of the metadata of CSAM posts revealed meaningful information about CSAM creators, their victims and the social media platforms they used, including: (1) the topics of interest and preferred social media platforms of the 20 most active CSAM creators (for example, the two top creators were interested in topics such as video, webcam and general forum content, and frequently used platforms such as Omegle and Skype); (2) the ages and nationalities of the victims typically mentioned by CSAM creators, such as victims aged 12 and 13 with nationalities including British and Russian; (3) the social media platforms preferred by CSAM creators for sharing or uploading CSAM, including Omegle, YouTube, Skype, Instagram and Telegram.
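As an illustration of the kind of metadata aggregation behind these findings, the sketch below counts the most active creators, the most mentioned victim ages and nationalities, and the most referenced platforms using pandas; the file csam_posts.csv and every column name are hypothetical stand-ins, since the paper's actual schema is not given.

# Sketch of the metadata aggregation behind the reported findings.
# The CSV file and column names (author, victim_age, victim_nationality,
# platforms_mentioned) are hypothetical, not the paper's actual schema.
import pandas as pd

posts = pd.read_csv("csam_posts.csv")  # one row per detected CSAM post

# (1) The 20 most active CSAM creators by post count.
top_creators = posts["author"].value_counts().head(20)

# (2) Victim ages and nationalities most often mentioned in posts.
top_ages = posts["victim_age"].value_counts().head(5)
top_nationalities = posts["victim_nationality"].value_counts().head(5)

# (3) Platforms most often referenced; assumes each post may mention
# several platforms separated by ";".
top_platforms = (
    posts["platforms_mentioned"]
    .str.split(";")
    .explode()
    .str.strip()
    .value_counts()
    .head(5)
)

print(top_creators, top_ages, top_nationalities, top_platforms, sep="\n\n")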
CONCLUSION
Our CSAM detection system achieves high precision, recall and accuracy when classifying CSAM and non-CSAM posts in real time. It can also extract and visualize valuable, unique insights about CSAM creators and victims using advanced statistical methods. These insights benefit our partners, i.e., national hotlines and child agencies.