This project is an in-depth analysis of Indian Premier League (IPL) matches using Python and Pandas, based on raw JSON data from Cricsheet. The goal is to extract meaningful insights into player and team performance without using a SQL database; everything is done in code.
- 📂 Loads and processes multiple IPL match JSON files
- 🧼 Cleans and transforms data into structured formats
- 📊 Answers key analytical questions such as:
  - Who scored the most runs?
  - Which team won the most matches?
  - What’s the highest successful chase?
  - How many centuries have been scored?
  - Which team has the best powerplay performance?
  - Which partnerships exceeded 100 runs?
- 📋 Generates scorecards and player-specific statistics (e.g., Virat Kohli)
- ✅ Entirely built using Python and Pandas (no SQL!)
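As a rough illustration of the load, clean, and analyse steps listed above, here is a minimal sketch. It assumes the match files follow Cricsheet's JSON layout (`innings` → `overs` → `deliveries`, with `batter` and `runs` fields); the helper names `flatten_match` and `load_deliveries` and the sample data are hypothetical and are not taken from the notebook.

```python
import json
from pathlib import Path

import pandas as pd


def flatten_match(match: dict, match_id: str) -> list:
    """Turn one match dict (assumed Cricsheet layout) into one row per delivery."""
    rows = []
    for innings_no, innings in enumerate(match.get("innings", []), start=1):
        for over in innings.get("overs", []):
            for ball in over.get("deliveries", []):
                runs = ball.get("runs", {})
                rows.append({
                    "match_id": match_id,
                    "innings": innings_no,
                    "batting_team": innings.get("team"),
                    "over": over.get("over"),
                    "batter": ball.get("batter"),
                    "bowler": ball.get("bowler"),
                    "batter_runs": runs.get("batter", 0),
                    "total_runs": runs.get("total", 0),
                })
    return rows


def load_deliveries(folder: str) -> pd.DataFrame:
    """Read every *.json file in `folder` into a single ball-by-ball DataFrame."""
    rows = []
    for path in sorted(Path(folder).glob("*.json")):
        with open(path, encoding="utf-8") as fh:
            rows.extend(flatten_match(json.load(fh), path.stem))
    return pd.DataFrame(rows)


# Made-up sample match so the sketch runs without any data files.
sample_match = {
    "innings": [
        {"team": "Team X", "overs": [{"over": 0, "deliveries": [
            {"batter": "Batter A", "bowler": "Bowler P",
             "runs": {"batter": 4, "extras": 0, "total": 4}},
            {"batter": "Batter A", "bowler": "Bowler P",
             "runs": {"batter": 6, "extras": 0, "total": 6}},
            {"batter": "Batter B", "bowler": "Bowler P",
             "runs": {"batter": 1, "extras": 0, "total": 1}},
        ]}]},
        {"team": "Team Y", "overs": [{"over": 0, "deliveries": [
            {"batter": "Batter C", "bowler": "Bowler Q",
             "runs": {"batter": 2, "extras": 0, "total": 2}},
        ]}]},
    ]
}

deliveries = pd.DataFrame(flatten_match(sample_match, "sample"))

# "Who scored the most runs?": total batter runs, highest first.
top_scorers = (
    deliveries.groupby("batter")["batter_runs"].sum().sort_values(ascending=False)
)

# "How many centuries?": count (match, batter) totals of 100 or more.
per_match = deliveries.groupby(["match_id", "batter"])["batter_runs"].sum()
centuries = int((per_match >= 100).sum())
```

With real data, `load_deliveries("ipl_json")` would replace the sample DataFrame; the same `groupby` pattern extends to team-level questions by grouping on `batting_team` instead of `batter`.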
| Tool | Description |
|---|---|
| 🐍 Python | Core programming language |
| 📊 Pandas | Data analysis and manipulation |
| 📒 Jupyter Notebook | Interactive development environment |
| 🧾 JSON | Raw IPL match data format |
📦 ipl-data-analysis
├── ipl_json/ # Folder with all raw IPL match JSON files
├── IPL data analysis.ipynb # Main notebook with all logic and outputs
└── README.md # Project documentation
To run this project locally:
- Clone the repository: `git clone https://github.com/Amanchouhan2708/ipl-data-analysis.git`
- Move into the project folder: `cd ipl-data-analysis`
🔮 Future Enhancements
- Integrate with a SQLite database for SQL-based queries
- Add interactive dashboards using Plotly or Streamlit
- Build an NLP-powered MCP Server to answer user questions dynamically
- Deploy insights as a web app
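The SQLite enhancement listed above could start from something like the following sketch. It is not part of the current project: the table name `deliveries`, the column names, and the sample rows are all assumptions made for illustration. It relies only on the standard-library `sqlite3` module together with pandas' `DataFrame.to_sql` and `read_sql_query`.

```python
import sqlite3

import pandas as pd

# Hypothetical ball-by-ball data; in practice this would come from the parsed JSON files.
sample_deliveries = pd.DataFrame({
    "batter": ["Batter A", "Batter A", "Batter B", "Batter C"],
    "batter_runs": [4, 6, 1, 2],
})

# An in-memory database keeps the example self-contained; a file path would persist it.
conn = sqlite3.connect(":memory:")
sample_deliveries.to_sql("deliveries", conn, index=False, if_exists="replace")

# The same "most runs" question, expressed in SQL instead of a Pandas groupby.
query = """
    SELECT batter, SUM(batter_runs) AS runs
    FROM deliveries
    GROUP BY batter
    ORDER BY runs DESC
"""
result = pd.read_sql_query(query, conn)
conn.close()
```

Keeping the Pandas DataFrame as the source and writing it to SQLite means the existing cleaning logic would not need to change; SQL would simply become an additional query layer.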
🧑‍💻 Author
Aman Chouhan
Data Enthusiast | Python Programmer