Towards Cyberbullying Detection: Building, Benchmarking and Longitudinal Analysis of Aggressiveness and Conflicts/Attacks Datasets From Twitter

Paula Alexandra Nunes da Costa Ferreira; Nádia Salgado Pereira; Hugo Rosa; Sofia Oliveira; Luísa Coheur; Sofia Mateus Francisco; Sidclay Souza; Ricardo Ribeiro; João Paulo Carvalho; Paula Paulino; Isabel Trancoso; Ana Margarida Veiga Simão

Scientific journal paper Q1

Paula Alexandra Nunes da Costa Ferreira (Ferreira, P.); Nádia Salgado Pereira (Pereira, N.); Hugo Rosa (Rosa, H.); Sofia Oliveira (Oliveira, S.); Luísa Coheur (Coheur, L.); Sofia Mateus Francisco (Francisco, S.); Sidclay Souza (Souza, S.); Ricardo Ribeiro (Ribeiro, R.); João Paulo Carvalho (Carvalho, J. P.); Paula Paulino (Paulino, P.); Isabel Trancoso (Trancoso, I.); Ana Margarida Veiga Simão (Veiga-Simão, A. M.); et al.

Journal Title

IEEE Transactions on Affective Computing

Year (definitive publication)

2025

Language

English

Country

United States of America

More Information

Web of Science®

Times Cited: 0

(Last checked: 2026-06-04 18:03)

Scopus

Times Cited: 0

(Last checked: 2026-05-20 16:01)

Google Scholar

Times Cited: 3

(Last checked: 2026-06-02 20:18)

Overton

This publication is not indexed in Overton

Abstract

Offense and hate speech are a source of online conflicts, which have become common on social media; as such, their study is a growing research topic in machine learning and natural language processing. This article presents two Portuguese-language offense-related datasets that deepen the study of the subject: an Aggressiveness dataset and a Conflicts/Attacks dataset. While the former is similar to other offense detection datasets, the latter is novel in that it uses the history of the interaction between users. Four studies were carried out to construct and analyze the datasets. The first study gathered expressions of verbal aggression witnessed by adolescents to guide data extraction for the datasets. The second study extracted data from Twitter (in Portuguese) that matched the most frequent expressions, words, and sentences identified in the first study. The third study consisted of the development of the Aggressiveness dataset, the Conflicts/Attacks dataset, and classification models. The fourth study examined whether online aggression and conflicts/attacks showed any trend changes over time in a sample of 86 adolescents. It also investigated whether the number of tweets sent over a period of 273 days was related to online aggression and conflicts/attacks. Lastly, we analyzed the percentage of participants who took part in the aggressions and/or conflicts/attacks.
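
As a rough illustration of the expression-matching step that the abstract attributes to the second study, the sketch below filters a list of tweets against a list of target expressions. It is not the authors' implementation: the function names (`normalize`, `build_matcher`, `filter_tweets`), the choice to ignore case and diacritics, the whole-word matching rule, and the sample expressions are all assumptions made for the example.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase and strip diacritics (assumed preprocessing, not from the paper)."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_matcher(expressions):
    """Compile one regex matching any expression as a whole word or phrase."""
    # Longest expressions first, so multi-word phrases win over their substrings.
    escaped = sorted(
        (re.escape(normalize(e)) for e in expressions), key=len, reverse=True
    )
    return re.compile(r"(?<!\w)(?:" + "|".join(escaped) + r")(?!\w)")


def filter_tweets(tweets, expressions):
    """Return the tweets that contain at least one target expression."""
    matcher = build_matcher(expressions)
    return [t for t in tweets if matcher.search(normalize(t))]


# Hypothetical placeholder expressions and tweets, for illustration only.
expressions = ["idiota", "cala a boca"]
tweets = [
    "És mesmo IDIOTA",
    "Bom dia a todos",
    "cala a boca já",
    "idiotas",  # not matched: whole-word rule excludes inflected forms
]
matched = filter_tweets(tweets, expressions)
```

Whole-word matching keeps the filter conservative; a real pipeline would likely also need to handle inflected forms, which this sketch deliberately leaves out.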

Keywords

Aggression,Offense,Hate speech,Social networks,Natural language processing,Dataset

Fields of Science and Technology Classification

Computer and Information Sciences - Natural Sciences

Funding Records

Funding Reference	Funding Entity
PTDC/MHC/PED/3297/2014	Fundação para a Ciência e a Tecnologia
PTDC/PSI-GER/1918/2020	Fundação para a Ciência e a Tecnologia
UIDB/04527/2020	Fundação para a Ciência e a Tecnologia
UIDP/04527/2020	Fundação para a Ciência e a Tecnologia
UIDB/50021/2020	Fundação para a Ciência e a Tecnologia

Publication Identifiers

Scopus (source: Ciência_Iscte)	2-s2.0-85212842471
DOI (source: author)	10.1109/TAFFC.2024.3518587
WoS (source: Ciência_Iscte)	WOS:001566948500004
WoS (source: author)	WOS:001566948500004
Scopus (source: author)	2-s2.0-85212842471
Ciência_Iscte ID	ci-pub-107041

Other Publication Details

Online Publication Year	2024
Publisher	IEEE
Indexes	Web of Science©; Scopus
ISSN	1949-3045 (print) 1949-3045 (online)
Volume	16	Number	3
Pages	1473 - 1487
Peer Reviewed	Yes
Dissemination Mean	Digital
