This project performs an in-depth analysis of the Indian Premier League (IPL) 2025 dataset using Python.
It explores match results, ball-by-ball data, players, and venues to surface performance trends, visualize them, and build baseline predictive models (e.g., match outcome prediction).
    ipl-2025-analysis/
    ├── data/               # Raw IPL 2025 datasets (matches.csv, ball_by_ball.csv, players.csv)
    ├── notebooks/
    │   ├── 01-data-cleaning.ipynb
    │   ├── 02-eda.ipynb
    │   └── 03-modeling.ipynb
    ├── src/                # Utility scripts (data loaders, functions)
    ├── reports/
    │   └── figures/        # Saved visualizations
    ├── requirements.txt
    └── README.md
⚠️ Note: Raw data files are large and may not be included in the repository. Download them from the sources below and place them inside the `data/` folder.
- IPL 2025 Records (Kaggle)
- IPL Dataset 2008–2025 (Kaggle)
- GitHub IPL Dataset
- ESPNcricinfo (for validation)
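
After downloading, it can help to confirm that the files landed where the notebooks expect them. The snippet below is a minimal sketch, not part of the project's `src/` code: the function name `missing_data_files` is hypothetical, and the file names are the ones shown in the project layout, so adjust them if your downloads are named differently.

```python
from pathlib import Path

# File names taken from the project layout above; this tuple is an assumption
# about what the notebooks read, not something the project enforces.
EXPECTED_FILES = ("matches.csv", "ball_by_ball.csv", "players.csv")


def missing_data_files(data_dir="data"):
    """Return the expected CSV names that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing from data/:", ", ".join(missing))
    else:
        print("All expected data files are in place.")
```

Run it from the repository root so that the relative `data/` path resolves correctly.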
    # Clone the repository
    git clone https://github.com/yourusername/ipl-2025-analysis.git
    cd ipl-2025-analysis

    # Create a virtual environment
    python -m venv venv
    source venv/bin/activate   # Linux/Mac
    venv\Scripts\activate      # Windows

    # Install dependencies
    pip install -r requirements.txt

    # Launch Jupyter Lab
    jupyter lab
## Analysis Highlights
- Season-wise Summary: Matches, winners, average runs/wickets
- Top Players: Batsmen (runs, strike rates), Bowlers (wickets, economy rates)
- Venue Analysis: Average scores, win percentages
- Toss Impact: Toss vs Match Outcome analysis
- Overs Breakdown: Powerplay, middle overs, death overs
- Predictive Modeling: Baseline match outcome prediction using ML
- Visualizations: Matplotlib, Seaborn, and Plotly
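
To make a few of the metrics above concrete, here is a rough sketch of how strike rate, economy rate, the overs breakdown, and toss impact could be computed. It is not the notebooks' actual code: the record fields (`over`, `batter`, `bowler`, `batter_runs`, `wide`, `toss_winner`, `winner`) are hypothetical and the real CSV columns may differ. It also assumes the common convention of overs 1-6 as powerplay, 7-15 as middle, and 16-20 as death, and it simplifies extras to wides only (no-balls, byes, and leg byes are ignored). It uses only the standard library so it runs without the project's dependencies.

```python
from collections import defaultdict


def phase(over):
    """Map a 0-indexed over number (0-19) to a T20 phase (assumed boundaries)."""
    if over < 6:
        return "powerplay"
    if over < 15:
        return "middle"
    return "death"


def strike_rates(balls):
    """Strike rate = 100 * runs scored / balls faced (wides are not balls faced)."""
    runs, faced = defaultdict(int), defaultdict(int)
    for b in balls:
        if b["wide"]:
            continue
        runs[b["batter"]] += b["batter_runs"]
        faced[b["batter"]] += 1
    return {p: 100 * runs[p] / faced[p] for p in faced}


def economy_rates(balls):
    """Economy = runs conceded per over, counting six legal balls per over."""
    conceded, legal = defaultdict(int), defaultdict(int)
    for b in balls:
        conceded[b["bowler"]] += b["batter_runs"] + (1 if b["wide"] else 0)
        if not b["wide"]:
            legal[b["bowler"]] += 1
    return {p: 6 * conceded[p] / legal[p] for p in legal}


def runs_by_phase(balls):
    """Total runs (bat plus wides) scored in each phase of the innings."""
    totals = defaultdict(int)
    for b in balls:
        totals[phase(b["over"])] += b["batter_runs"] + (1 if b["wide"] else 0)
    return dict(totals)


def toss_win_rate(matches):
    """Share of decided matches won by the toss winner; None if no match was decided."""
    decided = [m for m in matches if m["winner"] is not None]
    if not decided:
        return None
    return sum(m["toss_winner"] == m["winner"] for m in decided) / len(decided)


# Made-up sample records for illustration only; these are not real IPL data.
SAMPLE_BALLS = [
    {"over": 0, "batter": "A", "bowler": "X", "batter_runs": 4, "wide": False},
    {"over": 0, "batter": "A", "bowler": "X", "batter_runs": 0, "wide": True},
    {"over": 7, "batter": "A", "bowler": "Y", "batter_runs": 1, "wide": False},
    {"over": 18, "batter": "B", "bowler": "X", "batter_runs": 6, "wide": False},
]
SAMPLE_MATCHES = [
    {"toss_winner": "T1", "winner": "T1"},
    {"toss_winner": "T2", "winner": "T1"},
    {"toss_winner": "T1", "winner": None},  # no result
]

if __name__ == "__main__":
    print(strike_rates(SAMPLE_BALLS))
    print(economy_rates(SAMPLE_BALLS))
    print(runs_by_phase(SAMPLE_BALLS))
    print(toss_win_rate(SAMPLE_MATCHES))
```

On the full dataset, the same aggregations would more naturally be written as grouped operations on a DataFrame; plain dictionaries are used here only to keep the definitions visible.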
## Future Improvements
- Build a live win probability model using ball-by-ball data
- Deploy interactive dashboards with Streamlit or Power BI
- Extend analysis across multiple IPL seasons (2008–2025)
## Credits
- Data Sources: Kaggle, GitHub IPL Datasets, ESPNcricinfo
- Purpose: Data analysis portfolio project using Python