Open access

Advancements in Offensive Language Detection: A Comprehensive Review and Experimental Analysis

20 Feb 2025

The proliferation of offensive language in digital communication has become a significant challenge in the internet era, creating an urgent need for advanced Natural Language Processing (NLP) techniques to identify and mitigate it. Focusing on NLP techniques, machine learning, deep learning, and transformer models, this study presents a thorough review of the shifting landscape of offensive language identification from 2020 through 2023. We examine the datasets used in prior research, particularly those for Dravidian languages such as Tamil and Malayalam. The preprocessing techniques covered include data cleansing and text representation methods such as TF-IDF and Word2Vec, which are used to train and optimize the models. We review past work to compare standard supervised learning models, such as Support Vector Machine and Naive Bayes, with emergent transformer models such as BERT, in order to identify the approach that most improves a model's accuracy and effectiveness.
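
To make the TF-IDF representation mentioned above concrete, the following is a minimal sketch in plain Python. It is not taken from the reviewed studies: the function name `tfidf`, the whitespace tokenization, the smoothed IDF formula, and the toy messages are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumptions (illustrative only): lowercase whitespace tokenization,
    term frequency normalized by document length, and the smoothed IDF
    log((1 + N) / (1 + df)) + 1. Other TF-IDF variants exist.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


# Toy, made-up messages (not drawn from any dataset in the review).
docs = ["you are kind", "you are awful", "awful awful day"]
vectors = tfidf(docs)
```

In this sketch, terms that occur in fewer documents receive a higher weight than equally frequent terms that occur in many, which is the property that makes such vectors usable as input features for classifiers like Support Vector Machine or Naive Bayes.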

Language:
English
Publication frequency:
6 times per year
Journal subjects:
Computer Science, Foundations of Computer Science, Theoretical Computer Science, Information Security and Cryptology