Open Access

Advancements in Offensive Language Detection: A Comprehensive Review and Experimental Analysis

20 Feb 2025


Figure 1. Related Work Study Plan
Figure 2. Dataset
Figure 3. Dataset Preprocess
Figure 4. Category count
Figure 5. Dataset Translation rate
Figure 6. Logistic Regression
Figure 7. Support Vector Machine
Figure 8. K-Nearest Neighbor
Figure 9. GCN Work Flow
Figure 10. mBERT

Imbalanced Dataset without Translation

Type      Algorithm                F1 Score (%)  Time (s)  Kappa (%)
BOW       Bernoulli Naive Bayes    88            0.08      27
BOW       Support Vector Machine   88            1281.27   45
BOW       Logistic Regression      89            2.35      45
BOW       K-Nearest Neighbor       86            0.03      0
TF-IDF    Bernoulli Naive Bayes    88            0.07      27
TF-IDF    Support Vector Machine   89            469.69    47
TF-IDF    Logistic Regression      88            1.4       42
TF-IDF    K-Nearest Neighbor       86            0.03      1
Word2Vec  Bernoulli Naive Bayes    74            0.13      26
Word2Vec  Support Vector Machine   86            747.66    0
Word2Vec  Logistic Regression      86            2.57      14
Word2Vec  K-Nearest Neighbor       86            0.05      4
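
The imbalanced tables pair a high F1 score with a kappa at or near 0 in several rows (for example, K-Nearest Neighbor). The following is a minimal sketch of how that combination can arise, assuming binary labels and a classifier that always predicts the majority class. The 86/14 label split, the micro-averaged F1, and the function names are illustrative assumptions; the source does not state which F1 averaging or class distribution was used.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels and predictions, corrected for chance."""
    n = len(y_true)
    # Observed agreement: fraction of items where prediction equals label.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: computed from the marginal label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    expected = sum(true_counts[l] * pred_counts[l] for l in labels) / (n * n)
    if expected == 1.0:
        return 0.0  # degenerate case: only one label occurs anywhere
    return (observed - expected) / (1.0 - expected)


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, FN over all classes before computing F1."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for l in labels:
        for t, p in zip(y_true, y_pred):
            if p == l and t == l:
                tp += 1
            elif p == l and t != l:
                fp += 1
            elif p != l and t == l:
                fn += 1
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical imbalanced test set: 86 non-offensive (0), 14 offensive (1).
y_true = [0] * 86 + [1] * 14
# Hypothetical degenerate classifier: always predicts the majority class.
y_pred = [0] * 100

f1 = micro_f1(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(f"micro F1 = {f1:.2f}, kappa = {kappa:.2f}")
```

In this sketch the F1 score only reflects the share of the majority class, while kappa subtracts the agreement expected by chance and so drops to 0. Under these assumptions, kappa is the more informative column when classes are imbalanced.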

Balanced Dataset with Translation

Type      Algorithm                F1 Score (%)  Time (s)  Kappa (%)
BOW       Bernoulli Naive Bayes    71            73        0.17
BOW       Support Vector Machine   75            78        2845.36
BOW       Logistic Regression      75            76        3.2
BOW       K-Nearest Neighbor       18            54        0.06
TF-IDF    Bernoulli Naive Bayes    76            74        0.32
TF-IDF    Support Vector Machine   75            76        1207.85
TF-IDF    Logistic Regression      74            75        1.53
TF-IDF    K-Nearest Neighbor       26            52        0.07
Word2Vec  Bernoulli Naive Bayes    65            69        0.27
Word2Vec  Support Vector Machine   67            72        1912.06
Word2Vec  Logistic Regression      69            72        5.97
Word2Vec  K-Nearest Neighbor       65            71        0.07

Balanced Dataset without Translation

Type      Algorithm                F1 Score (%)  Time (s)  Kappa (%)
BOW       Bernoulli Naive Bayes    80            82        0.35
BOW       Support Vector Machine   82            83        4290.71
BOW       Logistic Regression      82            83        5.2
BOW       K-Nearest Neighbor       16            54        0.06
TF-IDF    Bernoulli Naive Bayes    86            85        0.17
TF-IDF    Support Vector Machine   85            85        1698.51
TF-IDF    Logistic Regression      82            83        2.46
TF-IDF    K-Nearest Neighbor       7             52        0.06
Word2Vec  Bernoulli Naive Bayes    62            66        0.27
Word2Vec  Support Vector Machine   68            72        1645.81
Word2Vec  Logistic Regression      70            73        7.39
Word2Vec  K-Nearest Neighbor       68            72        0.07

Imbalanced Dataset with Translation

Type      Algorithm                F1 Score (%)  Time (s)  Kappa (%)
BOW       Bernoulli Naive Bayes    86            0.35      25
BOW       Support Vector Machine   86            4290.71   25
BOW       Logistic Regression      87            5.2       29
BOW       K-Nearest Neighbor       86            0.06      0
TF-IDF    Bernoulli Naive Bayes    86            0.17      25
TF-IDF    Support Vector Machine   87            1698.51   22
TF-IDF    Logistic Regression      87            2.46      28
TF-IDF    K-Nearest Neighbor       86            0.06      0
Word2Vec  Bernoulli Naive Bayes    73            0.12      26
Word2Vec  Support Vector Machine   86            444.38    0
Word2Vec  Logistic Regression      86            4.3       6
Word2Vec  K-Nearest Neighbor       86            0.05      10

Language: English
Publication frequency: 6 issues per year
Journal subjects: Computer Science, Foundations of Computer Science, Theoretical Computer Science, IT Security and Cryptology