GitHub - Keerthana4110/Amazon-Review-Sentiment-Analysis: Multiclass sentiment analysis on Amazon product reviews using lexicon-based and machine learning models, including AFINN, TF-IDF + XGBoost with nested cross-validation.

Project Description

This project implements a complete multiclass sentiment analysis pipeline using Amazon product reviews. The objective is to classify customer reviews into three sentiment categories: Negative, Neutral, and Positive. The project is designed to demonstrate a structured and reproducible NLP workflow, combining traditional text processing techniques with supervised machine learning models.

The workflow begins with cleaning and preprocessing the raw review text, followed by feature extraction using TF-IDF vectorization. Several approaches are explored, including a lexicon-based sentiment scoring baseline (such as AFINN) and a stronger machine learning model that uses XGBoost for multiclass classification.
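The cleaning and lexicon-scoring steps can be sketched as follows. This is a minimal illustration, not the notebook's code: `TOY_LEXICON`, `clean_text`, `lexicon_label`, and the `threshold` parameter are hypothetical names, and the word scores are chosen for illustration rather than copied from the real AFINN list.

```python
import re

# Hypothetical mini-lexicon in the AFINN style (one integer valence per word).
# The scores here are illustrative only; a real lexicon has thousands of entries.
TOY_LEXICON = {
    "good": 3, "great": 3, "love": 3,
    "bad": -3, "terrible": -3, "broken": -2,
}

def clean_text(text):
    """Lowercase, drop HTML-like tags and non-letters, collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", text.lower())
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def lexicon_label(text, lexicon=TOY_LEXICON, threshold=0):
    """Sum word valences and map the total to one of three sentiment classes."""
    score = sum(lexicon.get(token, 0) for token in clean_text(text).split())
    if score > threshold:
        return "Positive"
    if score < -threshold:
        return "Negative"
    return "Neutral"

print(lexicon_label("Great product, I love it"))
print(lexicon_label("Terrible, arrived broken"))
print(lexicon_label("It is a phone"))
```

A lexicon baseline like this needs no training data, which makes it a useful reference point for a learned model.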

To obtain a performance estimate that is not optimistically biased by hyperparameter tuning, the project applies nested cross-validation: inner folds are used for hyperparameter tuning and outer folds for model evaluation. Model performance is assessed using standard classification metrics: accuracy, precision, recall, macro-averaged F1, confusion matrices, and ROC-AUC (where applicable).
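Nested cross-validation can be sketched with scikit-learn by passing a `GridSearchCV` object (the inner loop) to `cross_val_score` (the outer loop). The example below is a sketch under stated assumptions: the corpus is made up, the fold counts and parameter grid are arbitrary, and `LogisticRegression` stands in for XGBoost to keep the dependencies light. `xgboost.XGBClassifier` follows the scikit-learn estimator interface and could be swapped in with its own parameter grid.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline

# Toy corpus (made up for illustration): 12 short reviews per class.
vocab = {
    0: ["terrible", "awful", "broken", "refund"],   # Negative
    1: ["okay", "average", "fine", "acceptable"],   # Neutral
    2: ["great", "excellent", "love", "perfect"],   # Positive
}
texts, labels = [], []
for label, words in vocab.items():
    for i in range(12):
        texts.append(f"{words[i % 4]} product order {i + 10}")
        labels.append(label)
labels = np.array(labels)

# TF-IDF lives inside the pipeline, so it is refit on each training fold
# and never sees the held-out reviews.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(max_iter=1000)),  # stand-in for XGBoost
])
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__C": [0.1, 1.0, 10.0],
}

inner_cv = StratifiedKFold(n_splits=2, shuffle=True, random_state=0)  # tuning
outer_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)  # evaluation

search = GridSearchCV(pipeline, param_grid, cv=inner_cv, scoring="f1_macro")
outer_scores = cross_val_score(search, texts, labels, cv=outer_cv, scoring="f1_macro")
print("macro-F1 per outer fold:", outer_scores)
```

Each outer fold runs its own grid search on the outer training split only, so the score reported for that fold comes from data the tuning step never touched.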

This project emphasizes good machine learning practices such as proper label encoding, stratified sampling, cross-validation, and clear separation of training and evaluation logic. The code is written in a modular and readable manner, making it suitable for academic submissions as well as real-world NLP applications.
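Label encoding and stratified sampling can be illustrated with a short sketch. The data below is invented, and the split ratio and `random_state` are arbitrary choices for the example. Encoding to consecutive integers matters because recent versions of XGBoost's scikit-learn wrapper expect class labels of the form 0, 1, ..., K-1.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Made-up, imbalanced labels: 12 Positive, 4 Neutral, 4 Negative.
sentiments = ["Positive"] * 12 + ["Neutral"] * 4 + ["Negative"] * 4
reviews = [f"review text {i}" for i in range(len(sentiments))]

# LabelEncoder sorts class names, giving Negative=0, Neutral=1, Positive=2.
encoder = LabelEncoder()
y = encoder.fit_transform(sentiments)

# stratify=y keeps the class proportions the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    reviews, y, test_size=0.5, stratify=y, random_state=42
)

print("classes:", list(encoder.classes_))
print("train class counts:", np.bincount(y_train))
print("test class counts:", np.bincount(y_test))
```

Predictions can be mapped back to readable names with `encoder.inverse_transform`.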

Overall, this repository is an example of how classical NLP methods can be systematically applied, evaluated, and compared for sentiment classification on large-scale text data.

Key Features

Multiclass sentiment classification (Negative / Neutral / Positive)
Text preprocessing and feature engineering using TF-IDF
Nested cross-validation for robust model evaluation
Hyperparameter tuning with GridSearchCV
XGBoost-based multiclass classification
Comprehensive evaluation using multiple performance metrics
Reproducible and well-structured Jupyter Notebook workflow
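The evaluation metrics listed above can be computed with scikit-learn as in the sketch below. The labels and predicted probabilities are invented for illustration (0 = Negative, 1 = Neutral, 2 = Positive), and one-vs-rest macro averaging for ROC-AUC is an assumed choice, not necessarily the one used in the notebook.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score, roc_auc_score

# Invented ground truth and class probabilities for six reviews.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_proba = np.array([
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.2, 0.7],
    [0.5, 0.1, 0.4],
])
y_pred = y_proba.argmax(axis=1)  # predicted class = highest probability

accuracy = accuracy_score(y_true, y_pred)
macro_f1 = f1_score(y_true, y_pred, average="macro")  # unweighted mean over classes
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])  # rows: true, cols: predicted
auc = roc_auc_score(y_true, y_proba, multi_class="ovr", average="macro")

print("accuracy:", accuracy)
print("macro-F1:", macro_f1)
print("confusion matrix:\n", cm)
print("ROC-AUC (one-vs-rest, macro):", auc)
```

Macro averaging weights every class equally, so a weak Neutral class lowers the score even when Positive reviews dominate the data.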

Technologies Used

Python
pandas, numpy
scikit-learn
XGBoost
Natural Language Processing (NLP)
Jupyter Notebook

Applications

Customer review sentiment analysis
Product feedback monitoring
Business intelligence and customer insights
Benchmarking NLP classification models

Repository Files

Final.ipynb
README.md
