
This project performs sentiment analysis on IMDB movie reviews using Natural Language Processing (NLP). It classifies reviews as positive or negative and compares the performance of multiple machine learning models. Models used include Logistic Regression, Naive Bayes, SVM, and Random Forest.


jahanOSTG/Movie_Review_Sentiment_NLP1


IMDB Sentiment Analysis: Multiple NLP Models Comparison

This project is a Natural Language Processing (NLP) pipeline for sentiment analysis on the IMDB movie reviews dataset. The goal is to classify each review as positive or negative and compare the performance of different models.


🛠️ Tools & Libraries Used

  • Python 3.x
  • Jupyter Notebook / Google Colab
  • Libraries:
    • pandas → Data manipulation
    • numpy → Numerical operations
    • scikit-learn → Machine learning models & metrics
    • nltk → Text preprocessing (stopwords, lemmatization)
    • matplotlib → Accuracy comparison chart
  • Dataset: IMDB Dataset of 50K Movie Reviews

🔹 Project Workflow

  1. Dataset Loading
    • Load the IMDB dataset (CSV file)
  2. Text Preprocessing
    • Lowercase
    • Remove HTML tags
    • Remove non-alphabetic characters
    • Stopwords removal
    • Lemmatization
  3. Feature Extraction
    • TF-IDF vectorization (max_features=5000)
  4. Train/Test Split
    • 80% training, 20% testing
  5. Model Training & Evaluation
    • Logistic Regression
    • Naive Bayes (MultinomialNB)
    • Support Vector Machine (SVM)
    • Random Forest
    • Metrics:
      • Accuracy
      • Classification Report (Precision, Recall, F1-score)
      • Confusion Matrix
  6. Accuracy Comparison
    • Bar chart visualization of all models
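
The workflow above can be sketched end to end as follows. This is a minimal illustration, not the notebook's actual code: the toy reviews stand in for the IMDB CSV, the small `STOPWORDS` set and the identity `lemmatize` default are stand-ins for NLTK's stopword list and `WordNetLemmatizer`, `LinearSVC` is an assumed choice for the SVM, and the names `preprocess` and `evaluate_models` are hypothetical.

```python
import re

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

# Stand-in stopword list (assumption). With the NLTK corpora downloaded,
# set(nltk.corpus.stopwords.words("english")) could be passed instead.
STOPWORDS = {"a", "an", "and", "the", "is", "it", "this", "was", "of", "to", "i"}


def preprocess(text, stopwords=STOPWORDS, lemmatize=lambda word: word):
    """Lowercase, strip HTML tags, keep letters only, drop stopwords, lemmatize.

    `lemmatize` defaults to the identity function (assumption); passing
    nltk.stem.WordNetLemmatizer().lemmatize would apply real lemmatization.
    """
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)   # remove HTML tags such as <br />
    text = re.sub(r"[^a-z\s]", " ", text)  # remove non-alphabetic characters
    return " ".join(lemmatize(w) for w in text.split() if w not in stopwords)


def evaluate_models(texts, labels, seed=42):
    """Clean the text, build TF-IDF features, split 80/20, then fit and score."""
    cleaned = [preprocess(t) for t in texts]
    features = TfidfVectorizer(max_features=5000).fit_transform(cleaned)
    X_train, X_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.2, random_state=seed, stratify=labels
    )
    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Naive Bayes": MultinomialNB(),
        "SVM": LinearSVC(),  # assumption: a linear-kernel SVM
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predicted = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, predicted),
            "report": classification_report(y_test, predicted, zero_division=0),
            "confusion_matrix": confusion_matrix(y_test, predicted),
        }
    return results


# Toy corpus (made up for illustration): 10 positive and 10 negative reviews.
TOY_TEXTS = [
    "I loved this wonderful film!<br />Great acting.",
    "A brilliant story and a beautiful ending.",
] * 5 + [
    "Terrible plot, I hated every boring minute.",
    "Awful acting and a dull, predictable script.",
] * 5
TOY_LABELS = ["positive"] * 10 + ["negative"] * 10

results = evaluate_models(TOY_TEXTS, TOY_LABELS)
for name, scores in results.items():
    print(f"{name}: accuracy={scores['accuracy']:.2f}")
```

The sketch follows the order listed above, so the TF-IDF vocabulary is fitted before the split. Fitting the vectorizer on the training rows only and then calling `transform` on the test rows would keep test-set term statistics out of the features.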

📊 Example Accuracy Results

| Model               | Accuracy |
|---------------------|----------|
| Logistic Regression | 0.88     |
| Naive Bayes         | 0.85     |
| SVM                 | 0.89     |
| Random Forest       | 0.86     |
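
The accuracy comparison chart could be drawn along these lines. The figures are the example accuracies listed above; the function name `plot_accuracies`, the output file name, and the styling are assumptions made for illustration rather than the notebook's own code.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # headless backend so the script also runs without a display
import matplotlib.pyplot as plt

# Example accuracies copied from the results listed above.
ACCURACIES = {
    "Logistic Regression": 0.88,
    "Naive Bayes": 0.85,
    "SVM": 0.89,
    "Random Forest": 0.86,
}


def plot_accuracies(accuracies, path):
    """Save a bar chart with one bar per model and return the output path."""
    names = list(accuracies)
    values = [accuracies[n] for n in names]
    fig, ax = plt.subplots(figsize=(7, 4))
    ax.bar(names, values)
    ax.set_ylim(0, 1)
    ax.set_ylabel("Accuracy")
    ax.set_title("Model accuracy comparison")
    for index, value in enumerate(values):
        # Write the numeric accuracy just above each bar.
        ax.text(index, value + 0.01, f"{value:.2f}", ha="center")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
    return path


chart_path = plot_accuracies(
    ACCURACIES, os.path.join(tempfile.gettempdir(), "accuracy_comparison.png")
)
best_model = max(ACCURACIES, key=ACCURACIES.get)
print(f"Chart saved to {chart_path}; highest example accuracy: {best_model}")
```

In a notebook, calling `plt.show()` instead of saving to a file would display the chart inline.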
