This repository contains the implementation of a FastText-based machine learning model for identifying offensive language in tweets. The project uses the OLID and SOLID datasets for training and testing. It also integrates the LIME (Local Interpretable Model-Agnostic Explanations) technique to enhance interpretability by visually showing key words that influenced the model's decisions.
The project is designed to classify offensive language at three levels:
- Offensive Language Detection: Determines whether a text is offensive or not.
- Categorization of Offensive Language: Classifies offensive language as targeted or untargeted.
- Target Identification: Identifies the target of offensive language as an individual, group, or others (e.g., organizations, events).
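The three levels form a cascade: Level B runs only on tweets that Level A marks offensive, and Level C runs only on those Level B marks targeted. A minimal sketch of that routing, with hypothetical stub predictors standing in for the trained FastText models:

```python
# Hierarchical routing sketch: the per-level predictors below are
# placeholder stubs, NOT the repository's trained FastText models.

def predict_level_a(text):
    # Level A: OFF (offensive) vs. NOT -- stubbed with a keyword check.
    return "OFF" if "idiot" in text.lower() else "NOT"

def predict_level_b(text):
    # Level B: TIN (targeted insult) vs. UNT (untargeted) -- stubbed.
    return "TIN" if "you" in text.lower() else "UNT"

def predict_level_c(text):
    # Level C: IND (individual), GRP (group), OTH (other) -- stubbed.
    return "IND"

def classify(text):
    """Run the three-level cascade and return the labels that apply."""
    labels = {"A": predict_level_a(text)}
    if labels["A"] == "OFF":                  # only offensive tweets reach level B
        labels["B"] = predict_level_b(text)
        if labels["B"] == "TIN":              # only targeted insults reach level C
            labels["C"] = predict_level_c(text)
    return labels

print(classify("you are an idiot"))   # {'A': 'OFF', 'B': 'TIN', 'C': 'IND'}
print(classify("lovely weather"))     # {'A': 'NOT'}
```

The cascade mirrors the OLID annotation scheme, where levels B and C are only defined for tweets labeled OFF and TIN respectively.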
This repository is ideal for researchers, developers, and data scientists working on natural language processing (NLP), toxic content moderation, or social media analysis.
- FastText Model:
  - Utilizes subword representations for efficient and robust text classification.
  - Fast training on large-scale text data with minimal preprocessing.
- Datasets:
  - OLID: 14,100 manually labeled tweets with a three-level taxonomy.
  - SOLID: 9 million semi-supervised labeled tweets for large-scale training.
- LIME Visualizations: Enhance interpretability by explaining individual model predictions.
- Hierarchical Approach: Multi-level classification for granular offensive language detection.
- Predefined Scripts: Dedicated Python scripts simplify training, evaluation, and visualization.
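FastText's robustness to rare words and misspellings comes from representing each word as a bag of character n-grams. A short illustration of the n-gram extraction idea (not fastText's internal code):

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Extract character n-grams FastText-style, with < and > marking word
    boundaries. Illustrative only -- not the library's implementation."""
    word = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(word) - n + 1):
            grams.append(word[i:i + n])
    return grams

# A misspelled variant still shares most of its n-grams with the original word,
# so its embedding stays close even if the exact token was never seen in training.
print(char_ngrams("where", n_min=3, n_max=3))
# ['<wh', 'whe', 'her', 'ere', 're>']
```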
Below are examples of LIME (Local Interpretable Model-Agnostic Explanations) visualizations. Each one explains a single prediction by highlighting the words that most influenced the offensive language classification.
Figure 1: LIME Visualization for Offensive Language Detection
Figure 2: LIME Visualization for Categorization of Offensive Language
Figure 3: LIME Visualization for Target Identification
- OLID Dataset: Contains the training set, test set labels, and test set tweets for OLID.
- SOLID Dataset: Contains the training set, test set labels, and test set tweets for SOLID.
- `train_fasttext.py`: A Python script that trains FastText models on the OLID and/or SOLID datasets. It supports all three classification levels.
- `test_fasttext.py`: A Python script that evaluates the trained FastText models on the SOLID test set and generates classification reports for each level.
- `lime_popup.py`: A Python script that uses the LIME technique to interpret and visualize individual model predictions.
- `settings.py`: Contains hyperparameters and other settings shared across the other Python scripts.
- `Results/results_fasttext_dataX_preprocessY.txt`: Six files storing classification reports for FastText models across the three levels:
  - `data1`: OLID only
  - `data2`: SOLID only
  - `data3`: OLID + SOLID
  - `preprocessTrue`: with tweet text preprocessing
  - `preprocessFalse`: without tweet text preprocessing
- `model_fasttext_a/b/c.bin`: Saved FastText models for the three classification levels. Note: the model checkpoint files (`model_fasttext_a.bin`, `model_fasttext_b.bin`, `model_fasttext_c.bin`) are too large to upload to this repository.
- `Documentation.docx`: Detailed documentation about the project.
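The `preprocessTrue`/`preprocessFalse` result variants differ only in whether the tweet text is cleaned before training. The exact steps live in the scripts; the sketch below shows typical tweet cleaning (the URL/mention normalization and lowercasing are assumptions, not a copy of `settings.py`):

```python
import re

def preprocess_tweet(text):
    """Illustrative tweet cleaning; the repository's actual steps may differ."""
    text = re.sub(r"https?://\S+", "URL", text)   # normalize links to a placeholder
    text = re.sub(r"@\w+", "@USER", text)         # normalize mentions (OLID-style)
    text = re.sub(r"\s+", " ", text).strip()      # collapse repeated whitespace
    return text.lower()

print(preprocess_tweet("@john_doe check this https://t.co/abc  NOW"))
# '@user check this url now'
```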
- 14,100 English tweets manually labeled using a three-level taxonomy:
- Offensive Language Detection
- Categorization of Offensive Language
- Offensive Language Target Identification
- Public dataset for offensive language classification on social media.
- 9 million English tweets created using a semi-supervised approach.
- Significantly larger dataset to improve model performance.
Note: The `SOLID Dataset` folder is too large to include in this repository. You can access it via the HuggingFace page for SOLID.
- Python 3.7
- Required Python libraries: `pandas`, `numpy`, `fasttext`, `nltk`, `sklearn`, `tqdm`, `lime`
- Standard-library modules used (shipped with Python): `re`, `os`, `random`, `tkinter`
- Train the FastText models using the `train_fasttext.py` script.
- Combine the OLID and SOLID datasets or use them independently during training.
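FastText's supervised mode expects one example per line, with the label prefixed by `__label__`. A sketch of building such a file (the file name and the hyperparameters shown are illustrative, not the values in `settings.py`):

```python
def write_fasttext_file(rows, path):
    """Write (label, text) pairs in FastText's supervised training format."""
    with open(path, "w", encoding="utf-8") as f:
        for label, text in rows:
            f.write(f"__label__{label} {text}\n")

write_fasttext_file([("OFF", "you are awful"), ("NOT", "have a nice day")],
                    "train_level_a.txt")

# Training then looks like this (requires the `fasttext` package;
# the hyperparameter values here are illustrative):
# import fasttext
# model = fasttext.train_supervised("train_level_a.txt",
#                                   lr=0.5, epoch=25, wordNgrams=2)
# model.save_model("model_fasttext_a.bin")
```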
- Test the trained models on the SOLID test set using the `test_fasttext.py` script.
- Classification metrics (accuracy, precision, recall, F1-score) are reported for each task.
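The per-class F1 scores reported below can be reproduced from a list of gold and predicted labels. The project presumably relies on `sklearn`'s classification report; the stdlib-only sketch below just shows the underlying arithmetic:

```python
def per_class_f1(y_true, y_pred):
    """Per-label F1 computed from first principles (precision/recall harmonic mean)."""
    scores = {}
    for label in set(y_true) | set(y_pred):
        tp = sum(t == p == label for t, p in zip(y_true, y_pred))
        pred_pos = sum(p == label for p in y_pred)      # predicted positives
        actual_pos = sum(t == label for t in y_true)    # gold positives
        precision = tp / pred_pos if pred_pos else 0.0
        recall = tp / actual_pos if actual_pos else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores[label] = round(f1, 2)
    return scores

y_true = ["OFF", "OFF", "NOT", "NOT"]
y_pred = ["OFF", "NOT", "NOT", "NOT"]
print(per_class_f1(y_true, y_pred))  # {'OFF': 0.67, 'NOT': 0.8}
```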
- Use the `lime_popup.py` script to generate LIME explanations for model predictions.
- Explanations are saved as HTML files and displayed in a pop-up window.
Offensive Language Detection (Level A):
- Accuracy: 0.92
- F1-Score: 0.91 (NOT), 0.92 (OFF)

Categorization of Offensive Language (Level B):
- Accuracy: 0.54
- F1-Score: 0.68 (UNT), 0.24 (TIN)

Target Identification (Level C):
- Accuracy: 0.60
- F1-Score: 0.54 (IND), 0.73 (GRP), 0.19 (OTH)
- OLID Dataset: Predicting the Type and Target of Offensive Posts in Social Media (NAACL-HLT 2019)
- SOLID Dataset: SOLID: A Large-Scale Semi-Supervised Dataset for Offensive Language Identification (ACL-IJCNLP 2021)
- FastText: Enriching Word Vectors With Subword Information (TACL 2017)
- LIME: LIME GitHub Repository