This project builds on the GoEmotions dataset, a human-annotated corpus of 58k Reddit comments labeled for 27 fine-grained emotion categories plus Neutral. Our goal is to analyze, evaluate, and build improved emotion classification systems using both fine-grained and coarse-grained (Ekman) taxonomies.
The project involves:
- Dataset analysis and visualization
- Emotion-word association extraction
- Label remapping to higher-level categories (e.g., Ekman)
- Model training and evaluation using fine-tuned BERT, RoBERTa, and DeBERTa models
- Experiments with contextual metadata: subreddit and author identity
Dataset overview:

- Source: Reddit comments, curated by Google Research
- Examples: 58,009
- Labels: 27 fine-grained emotion categories + Neutral
- Sequence length: Max 30 tokens
After filtering by rater agreement (at least 2 raters agreeing on a label), we use the following splits:
- Train: 43,410
- Validation: 5,426
- Test: 5,427
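As an optional cross-check (not required by this repo), the same filtered splits are published on the Hugging Face hub, where the `"simplified"` configuration corresponds to the rater-agreement-filtered version:

```python
from datasets import load_dataset

# "simplified" is the rater-agreement-filtered version of GoEmotions.
ds = load_dataset("go_emotions", "simplified")
print({split: len(ds[split]) for split in ds})  # expect 43410 / 5426 / 5427
```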
The 27 fine-grained emotion categories are: admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise
- Raw `.csv` files in `data/full_dataset/` include metadata and annotations.
- Filtered and remapped `.tsv` files contain:
  - Comment text
  - Comma-separated emotion IDs
  - Comment ID
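For example, one of the filtered `.tsv` splits can be loaded with pandas, assuming the three tab-separated columns listed above (the path is illustrative):

```python
import pandas as pd

# The filtered .tsv files have no header row and contain three columns:
# comment text, comma-separated emotion IDs, and comment ID.
train = pd.read_csv(
    "data/train.tsv",  # illustrative path; adjust to your layout
    sep="\t",
    header=None,
    names=["text", "emotion_ids", "comment_id"],
)

# Expand the comma-separated label IDs into lists of ints for downstream use.
train["emotion_ids"] = train["emotion_ids"].apply(
    lambda ids: [int(i) for i in str(ids).split(",")]
)
print(train.head())
```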
Analysis notebooks:

- `analyze_data.ipynb`: Computes label distributions, correlations, and hierarchical clustering.
- `extract_words.ipynb`: Computes top words for each emotion using log-odds ratio analysis.
- `replace_emotions.ipynb`: Maps fine-grained labels into coarse-grained categories (e.g., Ekman).
- `calculate_metrics.ipynb`: Evaluates classifier predictions against ground truth using accuracy, precision, recall, and F1 (macro, micro, weighted).
- `EDA.ipynb`: Contains visualizations, label imbalance checks, and exploratory statistics.
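As a reference for what `calculate_metrics.ipynb` computes, here is a minimal scikit-learn sketch; the label arrays are stand-ins, not real predictions:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Toy single-label predictions over four classes (stand-ins for model output).
y_true = np.array([0, 2, 1, 3, 2, 0])
y_pred = np.array([0, 2, 2, 3, 1, 0])

print("accuracy:", accuracy_score(y_true, y_pred))
for avg in ("macro", "micro", "weighted"):
    p, r, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average=avg, zero_division=0
    )
    print(f"{avg}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```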
We include support for:
- Heatmaps of label correlations
- Dendrograms of emotion label clustering
- Sentiment-colored clustermaps
- Top word bar plots by emotion
- Confusion matrices, classification reports, and macro-F1 plots
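A minimal sketch of the label-correlation heatmap and clustermap, assuming a multi-hot label matrix (one 0/1 column per emotion) has already been built; the random data below is a placeholder:

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Placeholder multi-hot label matrix; in practice, build one column per emotion.
rng = np.random.default_rng(0)
labels = pd.DataFrame(
    rng.integers(0, 2, size=(500, 5)),
    columns=["anger", "joy", "sadness", "fear", "surprise"],
)

corr = labels.corr()  # pairwise correlations between emotion labels
sns.heatmap(corr, annot=True, cmap="coolwarm", center=0)
plt.title("Emotion label correlations")
plt.tight_layout()
plt.show()

# A clustermap also reorders rows/columns by hierarchical clustering (dendrograms).
sns.clustermap(corr, cmap="coolwarm", center=0)
plt.show()
```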
This project explores multiple transformer-based emotion classification models trained on the GoEmotions dataset, using both text-only and context-augmented variants.
- `bert_classifier.py`: Fine-tunes BERT-base (cased) for multi-label classification on GoEmotions. Supports optional label remapping and hierarchical loss.
- `roberta_classifier.py`: Fine-tunes RoBERTa-base for multi-class classification using cross-entropy loss. Supports token prepending for contextual metadata (e.g., `[SUBREDDIT:]`, `[AUTHOR:]`) and dynamic embedding resizing. Integrated with Hugging Face's `Trainer` API, with macro F1 as the main evaluation metric.
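For orientation, a minimal multi-label fine-tuning sketch in the spirit of `bert_classifier.py`, using Hugging Face's built-in `problem_type` (the example text and label indices are illustrative, not the repo's exact code):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

NUM_LABELS = 28  # 27 fine-grained emotions + Neutral

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-cased",
    num_labels=NUM_LABELS,
    problem_type="multi_label_classification",  # BCEWithLogitsLoss internally
)

# A comment can carry several emotions, so labels are a multi-hot float vector.
enc = tokenizer(
    "Thank you so much, this made my day!",
    truncation=True, max_length=30, return_tensors="pt",
)
labels = torch.zeros((1, NUM_LABELS))
labels[0, [15, 17]] = 1.0  # e.g., gratitude and joy (indices illustrative)

out = model(**enc, labels=labels)
print(out.loss)
```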
Notebooks:

- `01_baseline_text_model.ipynb`: Trains the baseline RoBERTa model on comment text only (no context).
- `text_model_with_no_context_RoBERTa_cleaned_data.ipynb`: RoBERTa trained on cleaned data without any context.
- `text_model_with_subreddit_context_RoBERTa_cleaned_data.ipynb`: RoBERTa trained with the subreddit prepended.
- `text_model_with_author_context_RoBERTa_cleaned_data.ipynb`: RoBERTa trained with author identity prepended.
- `text_model_with_subreddit_and_author_context_RoBERTa_cleaned_data.ipynb`: RoBERTa trained with combined subreddit and author context.
- `bert_multi_label_text_classification.ipynb`: Fine-tunes multi-label BERT using the raw or remapped emotion labels.
- `deberta_model_raw_data.ipynb`: DeBERTa trained on the raw GoEmotions dataset (unfiltered).
- `deberta_clean_data.ipynb`: DeBERTa trained on the cleaned dataset without context.
- `cos_deberta_context.ipynb`: DeBERTa trained on cleaned data with author + subreddit context.
- `text_model_with_subreddit_and_author_context_deBERTa_cleaned_data.ipynb`: Final context-aware DeBERTa model, also used to generate plots.
- `plots_for_paper.ipynb`: Generates all final evaluation plots, confusion matrices, and JSON output for paper reporting.
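The context-augmented notebooks above prepend metadata markers to the comment text. A hedged sketch of that mechanism, assuming the markers are registered as special tokens (the exact marker format in the notebooks may differ):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=28
)

# Register the context markers so the tokenizer keeps them as single tokens.
tokenizer.add_special_tokens(
    {"additional_special_tokens": ["[SUBREDDIT:]", "[AUTHOR:]"]}
)
model.resize_token_embeddings(len(tokenizer))  # dynamic embedding resizing

def with_context(text: str, subreddit: str, author: str) -> str:
    # Prepend subreddit and author identity before the comment itself.
    return f"[SUBREDDIT:] {subreddit} [AUTHOR:] {author} {text}"

enc = tokenizer(
    with_context("This is great!", "r/aww", "user123"),
    truncation=True, return_tensors="pt",
)
```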
We provide a mapping file, `ekman_mapping.json`, that aggregates the fine-grained GoEmotions labels into the six Ekman universal emotions plus Neutral. This enables:
- Coarser-grained classification
- Emotion-level analysis at different abstraction levels
Example (abridged):

```json
{
  "anger": ["anger", "annoyance", "disapproval"],
  "joy": ["joy", "gratitude", "love"],
  "sadness": ["grief", "remorse", "sadness", "disappointment"],
  "neutral": ["neutral"]
}
```
To get started:

- Mount your Google Drive in Colab:

```python
from google.colab import drive
drive.mount('/content/drive')
```

- Then follow the individual notebooks/scripts:
  - For data analysis: see `analyze_data.ipynb`
  - For top words: see `extract_words.ipynb`
  - For label remapping: see `replace_emotions.ipynb`
  - For evaluation: see `calculate_metrics.ipynb`
  - For EDA and visualizations: see `EDA.ipynb` and `plots_for_paper.ipynb`
- Python 3.8+
- pandas, numpy, matplotlib, seaborn, scikit-learn
- PyTorch or TensorFlow for modeling (depending on classifier)
- Hugging Face Transformers
We replicate and extend the BERT-based baseline reported in the GoEmotions paper. Metrics include:
- Emotion-level F1 (macro, micro)
- Ekman-level and sentiment-level performance
- Reddit-based data introduces demographic and cultural bias
- Labels were assigned without conversational context and are often ambiguous without the surrounding thread
- Annotators were native English speakers from India, which may affect emotion perception
We highlight the importance of cautious deployment and fairness-aware modeling.
If you use this code or dataset, please cite:
```bibtex
@inproceedings{demszky2020goemotions,
  author = {Demszky, Dorottya and Movshovitz-Attias, Dana and Ko, Jeongwoo and Cowen, Alan and Nemade, Gaurav and Ravi, Sujith},
  title = {{GoEmotions: A Dataset of Fine-Grained Emotions}},
  booktitle = {ACL},
  year = {2020}
}
```

Contributors:

- Arun Agarwal (UC Berkeley MIDS)
- Original GoEmotions team @ Google Research
Apache 2.0. See LICENSE file for details.
For more on model cards, ethical use, and detailed results, see: