Abstract
Hate speech detection is crucial as social media diversifies. This research presents a lightweight, scalable system that combines traditional machine learning methods with a new feature selection approach, the Spiral-Grey Wolf Optimizer (S-GWO).
S-GWO selects key features from the Term Frequency-Inverse Document Frequency (TF-IDF) space that capture both meaning and content, yielding a high-quality representation without excessive computing power.
The proposed system was tested on one Arabic dataset and one English dataset using six machine learning methods: SVM, RF, LR, KNN, NB, and SGD. It achieved 92% accuracy and F1 score on the Arabic dataset and 100% accuracy on the English dataset, which supports efforts to reduce hate speech and toxicity.
Overall, the enhanced algorithms improve accuracy and efficiency, offering an effective alternative to costly deep learning models even with noisy and unbalanced data.
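The pipeline summarized above (TF-IDF features, wolf-based feature selection, then a classical classifier) can be illustrated with a minimal sketch. It rests on several assumptions: it uses the standard Grey Wolf Optimizer position update rather than the spiral variant proposed here, whose details are not given in this abstract; the nearest-centroid fitness, the size penalty, the toy corpus, and every function name are hypothetical choices made only for illustration.

```python
import math
import random
from collections import Counter

def tfidf(docs):
    """Return (vocab, rows) of TF-IDF weights with smoothed IDF; docs are token lists."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for d in docs:
        tf = Counter(d)
        rows.append([tf[t] / len(d) * idf[t] for t in vocab])
    return vocab, rows

def fitness(mask, X, y, penalty=0.01):
    """Assumed fitness: nearest-centroid training accuracy minus a size penalty."""
    idx = [j for j, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0
    cents = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        cents[label] = [sum(r[j] for r in rows) / len(rows) for j in idx]
    correct = 0
    for x, lab in zip(X, y):
        pred = min(cents, key=lambda c: sum((x[j] - cj) ** 2
                                            for j, cj in zip(idx, cents[c])))
        correct += pred == lab
    return correct / len(y) - penalty * len(idx) / len(mask)

def gwo_select(X, y, wolves=8, iters=30, seed=0):
    """Standard GWO over [0, 1]^dim; a feature is kept when its coordinate > 0.5."""
    rng = random.Random(seed)
    dim = len(X[0])
    pos = [[rng.random() for _ in range(dim)] for _ in range(wolves)]
    to_mask = lambda p: [v > 0.5 for v in p]
    best_mask, best_fit = None, -1.0
    for t in range(iters + 1):
        ranked = sorted(pos, key=lambda p: fitness(to_mask(p), X, y), reverse=True)
        alpha, beta, delta = ranked[:3]
        f = fitness(to_mask(alpha), X, y)
        if f > best_fit:
            best_fit, best_mask = f, to_mask(alpha)
        if t == iters:
            break
        a = 2 - 2 * t / iters  # exploration factor decays linearly from 2 towards 0
        new_pos = []
        for p in pos:
            q = []
            for j in range(dim):
                est = 0.0
                for leader in (alpha, beta, delta):
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    est += leader[j] - A * abs(C * leader[j] - p[j])
                q.append(min(1.0, max(0.0, est / 3)))  # average of 3 leaders, clamped
            new_pos.append(q)
        pos = new_pos
    return best_mask, best_fit

# Toy corpus (hypothetical): label 1 = hateful, label 0 = neutral.
docs = [
    "i hate you people".split(),
    "you are hateful trash".split(),
    "have a nice day".split(),
    "what a lovely day".split(),
]
labels = [1, 1, 0, 0]
vocab, X = tfidf(docs)
mask, score = gwo_select(X, labels)
selected = [term for term, keep in zip(vocab, mask) if keep]
print(len(selected), "of", len(vocab), "features kept; fitness =", round(score, 3))
```

In a fuller experiment, the fitness would normally be a cross-validated score from one of the six classifiers named above, and the selected columns would then be passed to each classifier for the final evaluation.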
Keywords
Arabic dialect
Feature representation
Grey wolf optimizer
Hate speech
Machine learning
Spiral motion