A comprehensive data analysis project focused on understanding customer behavior, spending patterns, and campaign effectiveness for a wine retail company.
This repository contains a series of analytical notebooks that explore customer data to derive actionable business insights. The analysis progresses from basic exploratory data analysis through advanced segmentation and clustering techniques.
File: dataset/customer_analysis.csv
- Size: 2,240 customers × 29 attributes
- Content: Customer demographics, purchase behavior, spending patterns, and marketing campaign responses
- Documentation: See dataset/dictionary.md for detailed attribute descriptions
- Demographics: Age, education, marital status, income, household composition
- Products: Spending on wines, fruits, meat, fish, sweets, and gold products
- Promotion: Campaign acceptance rates and deal purchases
- Place: Purchase channels (web, catalog, store) and web visits
The analysis is organized into sequential notebooks, each focusing on specific aspects:
0_customer_eda.ipynb - Exploratory Data Analysis
Initial exploration of the dataset covering:
- Data quality assessment and validation
- Demographic profiling
- Spending pattern analysis
- Purchase behavior insights
- Campaign performance overview
- Key findings and business recommendations
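The data quality step listed above could look roughly like the following. This is a minimal sketch on a tiny made-up frame; the column names (`customer_id`, `income`, `education`) and the helper `quality_report` are illustrative assumptions, not names taken from the notebook or the dataset.

```python
import numpy as np
import pandas as pd

def quality_report(frame: pd.DataFrame) -> pd.DataFrame:
    """Summarize dtype, missing values, and cardinality for each column."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "n_missing": frame.isna().sum(),
        "pct_missing": (frame.isna().mean() * 100).round(2),
        "n_unique": frame.nunique(),  # NaN is not counted as a value
    })

# Made-up sample rows standing in for the real customer table.
sample = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "income": [52_000.0, np.nan, 61_000.0, 48_000.0],
    "education": ["Graduate", "PhD", "Graduate", None],
})

report = quality_report(sample)
n_duplicates = int(sample.duplicated().sum())
print(report)
print("duplicate rows:", n_duplicates)
```

On the real data the same helper would be applied to the frame read from dataset/customer_analysis.csv; the file's delimiter and encoding are not stated here, so the `pd.read_csv` arguments may need adjusting.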
1_correlation_study.ipynb - Correlation Analysis
Statistical correlation analysis to identify:
- Relationships between customer metrics
- Strong positive and negative correlations
- Feature dependencies
- Insights for feature selection
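One way to surface strong positive and negative correlations is to scan the upper triangle of a Pearson correlation matrix. The sketch below uses synthetic data; the column names, the 0.7 threshold, and the helper `strong_pairs` are assumptions made for illustration and may not match what the notebook does.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; wine_spend is built to depend on income.
rng = np.random.default_rng(0)
income = rng.normal(50_000, 15_000, 200)
df = pd.DataFrame({
    "income": income,
    "wine_spend": income * 0.01 + rng.normal(0, 50, 200),
    "web_visits": rng.integers(0, 20, 200),
})

def strong_pairs(frame: pd.DataFrame, threshold: float = 0.7) -> pd.DataFrame:
    """Return column pairs whose absolute Pearson correlation meets the threshold."""
    corr = frame.corr(numeric_only=True)
    # Keep only the upper triangle so each pair appears once and the diagonal is dropped.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna().rename("r").reset_index()
    pairs.columns = ["feature_a", "feature_b", "r"]
    strong = pairs[pairs["r"].abs() >= threshold]
    return strong.sort_values("r", key=lambda s: s.abs(), ascending=False)

result = strong_pairs(df)
print(result)
```

Pairs that clear the threshold are candidates for dropping or combining during feature selection, since they carry largely redundant information.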
2_rfm_analysis.ipynb - RFM Segmentation
Customer segmentation using RFM (Recency, Frequency, Monetary) methodology:
- RFM score calculation
- Customer tier classification
- Behavioral segment identification
- Targeted marketing recommendations
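As a rough illustration of RFM scoring, the sketch below assigns each customer a 1 to 5 score per dimension using quantile bins and then maps the summed score to a tier. The input columns, the five-bin choice, and the tier cut-offs and labels are all hypothetical; the notebook may define them differently.

```python
import pandas as pd

# Hypothetical per-customer inputs; real column names may differ.
customers = pd.DataFrame({
    "customer_id": range(1, 11),
    "recency_days": [5, 40, 12, 90, 3, 60, 25, 75, 18, 50],
    "n_purchases": [20, 4, 15, 2, 25, 6, 10, 3, 12, 5],
    "total_spend": [1500, 200, 900, 50, 2000, 300, 600, 100, 800, 250],
})

def rfm_scores(frame: pd.DataFrame, bins: int = 5) -> pd.DataFrame:
    """Add R, F, M scores (1 = worst, bins = best), their sum, and a tier."""
    out = frame.copy()
    # Ranking with method="first" breaks ties so qcut always gets unique bin edges.
    # Recency is ranked in reverse: fewer days since the last purchase scores higher.
    r_rank = out["recency_days"].rank(method="first", ascending=False)
    f_rank = out["n_purchases"].rank(method="first")
    m_rank = out["total_spend"].rank(method="first")
    out["R"] = pd.qcut(r_rank, bins, labels=False) + 1
    out["F"] = pd.qcut(f_rank, bins, labels=False) + 1
    out["M"] = pd.qcut(m_rank, bins, labels=False) + 1
    out["rfm_total"] = out[["R", "F", "M"]].sum(axis=1)
    # Illustrative tier boundaries on the summed score (range 3 to 15 for 5 bins).
    out["tier"] = pd.cut(out["rfm_total"], bins=[2, 6, 10, 15],
                         labels=["low", "mid", "high"])
    return out

scored = rfm_scores(customers)
print(scored[["customer_id", "R", "F", "M", "rfm_total", "tier"]])
```

Summing the three scores weights recency, frequency, and monetary value equally; a weighted sum or a concatenated score string are common alternatives.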
3_kmeans_clustering.ipynb - K-Means Clustering
Unsupervised machine learning for customer segmentation:
- Optimal cluster determination
- Customer group profiling
- Cluster characteristics analysis
- Segment-specific strategies
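Optimal cluster determination can be approached in several ways; the sketch below uses the silhouette score, which is an assumption on my part (the notebook may use the elbow method or another criterion). The data are three synthetic groups in a hypothetical income and total spend space, not the real customers.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
# Three synthetic, well-separated groups: columns are (income, total_spend).
centers = np.array([[30_000, 200], [60_000, 800], [90_000, 1_600]])
X = np.vstack([c + rng.normal(0, [3_000, 60], size=(50, 2)) for c in centers])

# K-Means is distance based, so features are standardized first.
X_scaled = StandardScaler().fit_transform(X)

# Fit K-Means for a range of k and record the mean silhouette score for each.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_scaled)
    scores[k] = silhouette_score(X_scaled, labels)

best_k = max(scores, key=scores.get)
print("silhouette by k:", scores)
print("selected k:", best_k)
```

Once a value of k is chosen, profiling each cluster usually means grouping the original (unscaled) features by cluster label and comparing means or medians.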
4_impact_analysis.ipynb - Impact Analysis
Feature importance and impact assessment:
- Key drivers of customer behavior
- Predictive feature analysis
- Business metric relationships
- Strategic recommendations
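The method used for feature importance is not specified here, so the following is only one plausible approach: impurity-based importances from a random forest. The features, the synthetic target (campaign acceptance driven mainly by income), and all parameter values are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 500
# Hypothetical features; the real notebook may use different columns.
X = pd.DataFrame({
    "income": rng.normal(50_000, 15_000, n),
    "recency_days": rng.integers(0, 100, n),
    "web_visits": rng.integers(0, 20, n),
})
# Synthetic target: acceptance depends on income plus noise, nothing else.
y = (X["income"] + rng.normal(0, 5_000, n) > 60_000).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances sum to 1 across features.
importance = pd.Series(model.feature_importances_, index=X.columns)
importance = importance.sort_values(ascending=False)
print(importance)
```

Impurity-based importances can overstate high-cardinality features; permutation importance on held-out data is a common cross-check.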
- pandas - Data loading and manipulation
- NumPy - Numerical operations (the array layer underlying pandas)
- matplotlib & seaborn - Data visualization
- scikit-learn - Machine learning (used in the clustering notebook)
To re-run the notebooks locally:
pip install pandas numpy matplotlib seaborn scikit-learn jupyter
The analysis reveals:
- Wine products dominate customer spending
- Low campaign acceptance rates suggest a need for better targeting
- Store purchases are the preferred channel
- Distinct customer segments with different behavioral patterns
- Opportunities for personalized marketing and product recommendations
For detailed findings and recommendations, refer to the individual notebooks.
wine-customer-analysis/
├── dataset/
│ ├── customer_analysis.csv
│ ├── customer_data_segmented.csv
│ └── dictionary.md
├── 0_customer_eda.ipynb
├── 1_correlation_study.ipynb
├── 2_rfm_analysis.ipynb
├── 3_kmeans_clustering.ipynb
├── 4_impact_analysis.ipynb
└── README.md
This analysis can help businesses:
- Identify high-value customer segments
- Optimize marketing campaign targeting
- Improve customer retention strategies
- Develop personalized product recommendations
- Allocate resources more effectively across channels
- Create data-driven pricing and promotional strategies
This project is available for educational and analytical purposes.
For questions or suggestions, please open an issue in this repository.