EDA, RFM Segmentation, K-Means Clustering, & Recommendation Impact Analysis

arguto1993/wine-customer-analysis

Wine Customer Analysis

Built with: Python, Jupyter

A comprehensive data analysis project focused on understanding customer behavior, spending patterns, and campaign effectiveness for a wine retail company.

📊 Project Overview

This repository contains a series of analytical notebooks that explore customer data to derive actionable business insights. The analysis progresses from basic exploratory data analysis through advanced segmentation and clustering techniques.

📁 Dataset

File: dataset/customer_analysis.csv

  • Size: 2,240 customers × 29 attributes
  • Content: Customer demographics, purchase behavior, spending patterns, and marketing campaign responses
  • Documentation: See dataset/dictionary.md for detailed attribute descriptions

Key Data Categories

  • Demographics: Age, education, marital status, income, household composition
  • Products: Spending on wines, fruits, meat, fish, sweets, and gold products
  • Promotion: Campaign acceptance rates and deal purchases
  • Place: Purchase channels (web, catalog, store) and web visits

📓 Analysis Notebooks

The analysis is organized into sequential notebooks, each focusing on specific aspects:

0_customer_eda.ipynb - Exploratory Data Analysis

Initial exploration of the dataset covering:

  • Data quality assessment and validation
  • Demographic profiling
  • Spending pattern analysis
  • Purchase behavior insights
  • Campaign performance overview
  • Key findings and business recommendations
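
A data quality assessment like the one listed above often starts with a per-column summary of types, missing values, and distinct values. The following is a minimal sketch, not the notebook's actual code: the helper `quality_report` and the column names in the stand-in frame are hypothetical and are not taken from dataset/dictionary.md.

```python
import pandas as pd


def quality_report(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column dtype, missing-value count, and number of distinct values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "unique": df.nunique(),  # NaN is not counted as a distinct value
    })


# Tiny stand-in frame; the column names are assumptions for illustration only.
sample = pd.DataFrame({
    "income": [52000.0, None, 71000.0],
    "education": ["Graduate", "PhD", "Graduate"],
})
report = quality_report(sample)
print(report)

# With the real file, assuming it is a plain comma-separated CSV:
# df = pd.read_csv("dataset/customer_analysis.csv")
# print(df.shape)  # the README states 2,240 rows and 29 columns
# print(quality_report(df))
```

Columns with a non-zero `missing` count are natural candidates for imputation or exclusion before the later notebooks.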

1_correlation_study.ipynb - Correlation Analysis

Statistical correlation analysis to identify:

  • Relationships between customer metrics
  • Strong positive and negative correlations
  • Feature dependencies
  • Insights for feature selection
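
One way to surface strong positive and negative correlations is to compute a Pearson correlation matrix and keep only the pairs above an absolute threshold. The sketch below assumes this approach; the helper `strong_correlations`, the 0.8 threshold, and the demo column names are illustrative choices, not values taken from the notebook.

```python
import numpy as np
import pandas as pd


def strong_correlations(df: pd.DataFrame, threshold: float = 0.5) -> pd.Series:
    """Return feature pairs whose absolute Pearson correlation is >= threshold."""
    corr = df.corr(numeric_only=True)
    # Keep the strict upper triangle so each pair appears once, without the diagonal.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack()
    # Masked (NaN) entries fail this comparison and are dropped here as well.
    pairs = pairs[pairs.abs() >= threshold]
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False)


# Hypothetical demo data; column names are assumptions for illustration.
demo = pd.DataFrame({
    "income": [20, 40, 60, 80, 100],
    "wine_spend": [5, 22, 41, 58, 80],
    "web_visits": [9, 7, 6, 3, 1],
    "recency": [3, 1, 4, 1, 5],
})
pairs = strong_correlations(demo, threshold=0.8)
print(pairs)
```

Highly correlated pairs are worth reviewing before clustering, since near-duplicate features effectively get counted twice in distance-based methods.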

2_rfm_analysis.ipynb - RFM Segmentation

Customer segmentation using RFM (Recency, Frequency, Monetary) methodology:

  • RFM score calculation
  • Customer tier classification
  • Behavioral segment identification
  • Targeted marketing recommendations
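
RFM scoring is commonly done by ranking customers into quantile bins on each dimension and summing the three scores. The sketch below assumes quartile scoring (1 to 4) and hypothetical column names (`recency`, `frequency`, `monetary`); the tier thresholds are also illustrative and may differ from those used in the notebook.

```python
import pandas as pd


def rfm_scores(df: pd.DataFrame, bins: int = 4) -> pd.DataFrame:
    """Add quantile-based R, F, M scores (1 = worst, bins = best) and their sum."""
    out = df.copy()
    labels = list(range(1, bins + 1))
    # rank(method="first") breaks ties so qcut always gets unique bin edges.
    # Recency is reversed: fewer days since the last purchase scores higher.
    out["R"] = pd.qcut(out["recency"].rank(method="first"), bins,
                       labels=labels[::-1]).astype(int)
    out["F"] = pd.qcut(out["frequency"].rank(method="first"), bins,
                       labels=labels).astype(int)
    out["M"] = pd.qcut(out["monetary"].rank(method="first"), bins,
                       labels=labels).astype(int)
    out["RFM"] = out["R"] + out["F"] + out["M"]
    return out


# Hypothetical customers: days since last purchase, purchase count, total spend.
customers = pd.DataFrame({
    "recency": [5, 80, 30, 60, 10, 90, 45, 20],
    "frequency": [20, 2, 8, 4, 15, 1, 6, 12],
    "monetary": [1500, 50, 400, 120, 900, 20, 300, 700],
})
scored = rfm_scores(customers)

# Illustrative tier cut-offs on the summed score (range 3 to 12 with 4 bins).
scored["tier"] = pd.cut(scored["RFM"], bins=[2, 6, 9, 12],
                        labels=["Low", "Mid", "High"])
print(scored[["R", "F", "M", "RFM", "tier"]])
```

Summing the three scores weights each dimension equally; a weighted sum or a concatenated score such as "444" are common alternatives.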

3_kmeans_clustering.ipynb - K-Means Clustering

Unsupervised machine learning for customer segmentation:

  • Optimal cluster determination
  • Customer group profiling
  • Cluster characteristics analysis
  • Segment-specific strategies
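
Choosing the number of clusters can be sketched with a silhouette-score sweep over candidate values of k. This is one common approach and an assumption here; the notebook may use a different criterion, such as the elbow method. The helper `best_k` and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def best_k(X, k_values=range(2, 7), random_state=0):
    """Return the k with the highest silhouette score, plus all scores."""
    # Standardize first: K-Means is distance-based and sensitive to feature scale.
    X_scaled = StandardScaler().fit_transform(X)
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(X_scaled)
        scores[k] = silhouette_score(X_scaled, labels)
    return max(scores, key=scores.get), scores


# Synthetic stand-in data: three well-separated groups in two features.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 0.0]])
X = np.vstack([c + rng.normal(0, 0.5, size=(50, 2)) for c in centers])

k, scores = best_k(X)
print(k, scores)
```

On real customer data the silhouette curve is usually much flatter than on synthetic blobs, so the chosen k should also be checked against how interpretable the resulting segments are.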

4_impact_analysis.ipynb - Impact Analysis

Feature importance and impact assessment:

  • Key drivers of customer behavior
  • Predictive feature analysis
  • Business metric relationships
  • Strategic recommendations
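
Feature importance can be estimated in several ways; the sketch below uses the impurity-based importances of a random forest on synthetic data. This technique is an assumption chosen for illustration, since this README does not state which method the notebook uses, and the feature names and target are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data: only "income" actually drives the target.
rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame({
    "income": rng.normal(50, 15, n),
    "web_visits": rng.integers(0, 10, n),
    "noise": rng.normal(0, 1, n),
})
# Hypothetical binary outcome, e.g. whether a customer accepted a campaign.
y = (X["income"] + rng.normal(0, 5, n) > 50).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances sum to 1 across features.
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(importances)
```

Impurity-based importances can favor high-cardinality features, so permutation importance on held-out data is a useful cross-check.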

📚 Libraries Used

  • pandas - Data loading and manipulation
  • NumPy - Numerical operations (the array layer that pandas builds on)
  • matplotlib & seaborn - Data visualization
  • scikit-learn - Machine learning (used in the clustering notebook)

📦 Prerequisites

To re-run the notebooks locally:

pip install pandas numpy matplotlib seaborn scikit-learn jupyter

📈 Key Insights

The analysis reveals:

  • Wine products dominate customer spending
  • Low campaign acceptance rates suggest a need for better targeting
  • Store purchases are the preferred channel
  • Distinct customer segments with different behavioral patterns
  • Opportunities for personalized marketing and product recommendations

For detailed findings and recommendations, refer to the individual notebooks.

📝 Project Structure

wine-customer-analysis/
├── dataset/
│   ├── customer_analysis.csv
│   ├── customer_data_segmented.csv
│   └── dictionary.md
├── 0_customer_eda.ipynb
├── 1_correlation_study.ipynb
├── 2_rfm_analysis.ipynb
├── 3_kmeans_clustering.ipynb
├── 4_impact_analysis.ipynb
└── README.md

🎯 Use Cases

This analysis can help businesses:

  • Identify high-value customer segments
  • Optimize marketing campaign targeting
  • Improve customer retention strategies
  • Develop personalized product recommendations
  • Allocate resources more effectively across channels
  • Create data-driven pricing and promotional strategies

📄 License

This project is available for educational and analytical purposes.

👤 Author

LinkedIn


For questions or suggestions, please open an issue in this repository.
