This project, completed as part of the Quantium Virtual Experience Program on Theforage.com, focuses on analyzing retail transaction data for chips and crisps. The analysis aims to provide actionable insights to the category manager to enhance sales and fine-tune customer targeting.
Constructive feedback is most welcome!
- Examine and clean transaction and customer data to ensure data quality
- Identify distinct customer segments based on purchasing behavior
- Analyze sales drivers and develop key performance metrics
- Create data visualizations to effectively communicate insights
- Provide strategic recommendations based on the analysis
- Programming language: Python
- Libraries: pandas, matplotlib, plotly, numpy
The analysis is based on retail transaction data containing:
- Customer purchase information
- Product details for chips and crisps category
- Transaction timestamps and amounts
- Store location data
- Data Cleaning and Preparation: this step was done manually beforehand.
  - Transaction data examination
  - Anomaly detection and handling
  - Data standardization
- Customer Segmentation
  - Behavioral analysis
  - Purchase pattern identification
  - Segment profiling
- Sales Analysis
  - Trend identification
  - Sales driver analysis
  - Performance metrics development
- Visualization and Reporting
  - Creation of informative charts and graphs
  - Key findings presentation
  - Strategic recommendations
- Clone this repository
- Install required dependencies:
  pip install pandas numpy matplotlib
- Navigate to the notebooks directory to view the analysis
To jump directly to the relevant graph:
- Copy the chosen title below
- Open the notebook
- Press Ctrl+F and paste. This takes you directly to the relevant section.
- Sales drivers:
  - 4 primary brands drive sales, by quantity and by total sales: Kettle, Smiths, Doritos and Pringles
  - 4 primary pack sizes drive sales, by quantity and by total sales: 175G, 150G, 134G, 110G
  - Interesting point: as individual products, some 380G, 330G and 175G packs rank among the top 10 most profitable products, which could be an interesting wave to ride. Was this the effect of the last marketing campaign? Of the in-store product layout? Discuss this with stakeholders
- Customer segmentation deep dive:
  - New families don't buy many chips; are they a possible campaign target? Define which premium type to offer them
  - Analyze product-market fit for each lifestage and select the most relevant premium category to market. This needs stakeholder expertise
  - Older and Midage Singles/Couples drive premium product sales; ride that wave
- Sales trends:
  - Sales hit a low point in week 20 and around week 32. This low point is felt mostly by budget and mainstream, but not by premium products; this is worth looking into
  - Plotting weekly sales by lifestage gives an intuitive visualization of the profitable segments
- Most profitable shops and cumulative gross sales:
  - The mean contribution to turnover is 0.37% per shop
  - A median could be an interesting metric to visualize, to isolate the 50% most profitable shops and analyze their strategies (product layout, product selection, discounts, etc.)
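
The per-shop contribution and the median split described above can be sketched as follows. This is only an illustration on made-up data: the column names `STORE_NBR` and `TOT_SALES` are assumptions, not necessarily the ones used in the notebooks. Note that the mean share is by construction 100% divided by the number of shops, so a 0.37% mean suggests roughly 270 shops.

```python
import pandas as pd

# Toy data for illustration only; column names are assumptions.
transactions = pd.DataFrame({
    "STORE_NBR": [1, 1, 2, 3, 3, 3, 4],
    "TOT_SALES": [10.0, 5.0, 20.0, 2.0, 3.0, 5.0, 55.0],
})

# Gross sales per shop, sorted from highest to lowest.
store_sales = (
    transactions.groupby("STORE_NBR")["TOT_SALES"]
    .sum()
    .sort_values(ascending=False)
)

# Each shop's share of total turnover (in percent) and the running total.
share_pct = store_sales / store_sales.sum() * 100
cumulative_pct = share_pct.cumsum()

# The mean share always equals 100 / number of shops.
mean_share = share_pct.mean()

# Shops at or above the median gross sales: the top half to study further.
median_sales = store_sales.median()
top_half = store_sales[store_sales >= median_sales]

print(share_pct.round(2))
print(cumulative_pct.round(2))
print(f"mean share: {mean_share:.2f}%")
print(top_half)
```

Plotting `cumulative_pct` against shop rank is one way to draw a cumulative gross sales curve.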
The category manager experimented with a new product layout in stores 77, 86 and 88 during a trial period. We have to determine whether the layout actually drove sales by leading customers to make more transactions (a customer walks through the product aisles, likes what they see, and spends more). There are 3 metrics to follow:
- Total transactions
- Total customers
- Total transactions per customer

The product layout will have had an influence on sales if and only if total transactions per customer increased, as the following graphs show. After this visual method, a statistical method (Z-test) is used to confirm or refute the hypothesis that the product layout had an influence.
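
As a rough sketch of how the three metrics can be computed with pandas, assuming hypothetical column names (`STORE_NBR`, `LYLTY_CARD_NBR`, `TXN_ID`) and a hand-made `PERIOD` label; the numbers below are invented for illustration and are not the project's data:

```python
import pandas as pd

# Toy data for illustration only; column names and values are assumptions.
transactions = pd.DataFrame({
    "STORE_NBR":      [88, 88, 88, 88, 88, 88, 86, 86, 86, 86, 86, 86],
    "PERIOD":         ["pre", "pre", "trial", "trial", "trial", "trial",
                       "pre", "pre", "trial", "trial", "trial", "trial"],
    "LYLTY_CARD_NBR": [1, 2, 1, 1, 2, 2, 3, 4, 3, 4, 5, 6],
    "TXN_ID":         list(range(1, 13)),
})

# Total transactions and total customers per store and period.
metrics = transactions.groupby(["STORE_NBR", "PERIOD"]).agg(
    total_transactions=("TXN_ID", "nunique"),
    total_customers=("LYLTY_CARD_NBR", "nunique"),
)

# Transactions per customer: the metric that signals a layout effect.
metrics["transactions_per_customer"] = (
    metrics["total_transactions"] / metrics["total_customers"]
)

print(metrics)
```

In this toy example both stores double their transactions during the trial, but only the first one does so through more transactions per customer; the second simply has more customers.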
Key insights:
- The trial period did not affect sales at all in store 77
- Sales increased in store 86. Further analysis showed this was due to an increase in the number of customers, not in the number of transactions per customer
- The trial period did affect sales in store 88, and the main reason was the increase in the number of transactions per customer
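
The exact form of the Z-test lives in the notebooks and is not restated here. As one possible illustration, a two-sample Z-test comparing the mean number of transactions per customer between two periods could look like the sketch below. The function name `two_sample_z_test` and the sample values are hypothetical, and the normal approximation is only reasonable for large samples.

```python
import math

import numpy as np


def two_sample_z_test(sample_a, sample_b):
    """Return (z, two-sided p-value) for the difference in means.

    Illustrative helper, not taken from the notebooks. Uses the sample
    variances, so the normal approximation assumes large samples.
    """
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    # Standard error of the difference between the two sample means.
    standard_error = math.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    z = (a.mean() - b.mean()) / standard_error
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up transactions per customer, for illustration only.
trial_period = [2, 3, 2, 3, 2, 3, 2, 3]
pre_trial_period = [1, 2, 1, 2, 1, 2, 1, 2]

z, p_value = two_sample_z_test(trial_period, pre_trial_period)
print(f"z = {z:.3f}, p = {p_value:.5f}")
```

A `|z|` above 1.96 corresponds to significance at the 5% level for a two-sided test.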
Recommendations:
- Confirm this trend with a statistical significance test (Z-test)
- After passing the data through the Z-test, our hypotheses above have been confirmed. The explanations are available in the notebooks
- Now that we have confirmation, we can expand the product layout of store 88 to all the stores, sit, ???, profit
- Quantium for providing the dataset
- Theforage.com for hosting the virtual experience program
- Me for doing the work :)









