A machine learning project that predicts optimal locations for electric vehicle (EV) charging stations based on EV population data and geographic factors.
This project analyzes Electric Vehicle Population Data to identify the most suitable cities for installing new charging stations. Users can input a county name to get predictions for cities within that county, helping infrastructure planners make data-driven decisions.
- Data Analysis: Comprehensive analysis of EV distribution patterns
- Machine Learning Models: Multiple ML models (Linear Regression, Ridge, Random Forest, XGBoost)
- Interactive Web App: Streamlit-based interface for easy interaction
- Geographic Visualization: Interactive maps showing EV distribution and charging station suitability
- County-based Predictions: Input county name to get city-level predictions
SMART_CHARGE_LOCATOR/
├── .venv/ # Virtual environment
├── app/ # Streamlit web application
│ └── app.py # Main application file
├── data/ # Data directory
│ ├── processed/ # Processed and cleaned data
│ ├── raw/ # Original dataset
│ └── Electric_Vehicle_Population_Data.csv
├── models/ # Trained ML models
│ ├── linear_regression.pkl
│ ├── ridge_regression.pkl
│ ├── random_forest.pkl
│ └── xgboost.pkl
├── notebooks/ # Jupyter notebooks
│ ├── model_training/ # Model training notebooks
│ │ ├── 4.1_Linear_Regression.ipynb
│ │ ├── 4.2_Ridge_Regression.ipynb
│ │ ├── 4.3_Random_Forest.ipynb
│ │ └── 4.4_XGBoost.ipynb
│ ├── 01_Data_Loading_and_Cleaning.ipynb
│ ├── 02_Exploratory_Data_Analysis.ipynb
│ └── 03_Feature_Engineering.ipynb
├── README.md # Project documentation
└── requirements.txt # Python dependencies
- Python 3.8 or higher
- pip (Python package installer)
- Clone or download the project
    # If using git
    git clone <repository-url>
    cd SMART_CHARGE_LOCATOR
    # Or simply navigate to the project directory
    cd SMART_CHARGE_LOCATOR
- Create and activate virtual environment
    # Create virtual environment
    python -m venv .venv
    # Activate virtual environment
    # On Windows:
    .venv\Scripts\activate
    # On macOS/Linux:
    source .venv/bin/activate
- Install dependencies
    pip install -r requirements.txt
- Execute the data processing notebooks (in order):
    # Start Jupyter Notebook
    jupyter notebook
    # Run notebooks in this order:
    # 1. notebooks/01_Data_Loading_and_Cleaning.ipynb
    # 2. notebooks/02_Exploratory_Data_Analysis.ipynb
    # 3. notebooks/03_Feature_Engineering.ipynb
    # 4. notebooks/model_training/4.1_Linear_Regression.ipynb
    # 5. notebooks/model_training/4.2_Ridge_Regression.ipynb
    # 6. notebooks/model_training/4.3_Random_Forest.ipynb
    # 7. notebooks/model_training/4.4_XGBoost.ipynb
- Launch the Streamlit web application
    streamlit run app/app.py
- Open your browser and navigate to http://localhost:8501
- Loads the Electric Vehicle Population Data
- Handles missing values and data inconsistencies
- Creates additional features like vehicle age
- Extracts geographic coordinates
- Saves cleaned data for further processing
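The cleaning steps above could look roughly like the following. This is a minimal sketch, not the notebook's actual code: the column names (`County`, `City`, `Model Year`, `Vehicle Location`), the `POINT (lon lat)` coordinate format, and the helper name `clean_ev_data` are all assumptions to adjust against the real CSV header.

```python
from datetime import date
from typing import Optional

import pandas as pd

# Assumed format of the location column: "POINT (<longitude> <latitude>)".
POINT_PATTERN = r"POINT \((?P<lon>-?\d+(?:\.\d+)?) (?P<lat>-?\d+(?:\.\d+)?)\)"


def clean_ev_data(df: pd.DataFrame, current_year: Optional[int] = None) -> pd.DataFrame:
    """Drop incomplete rows, derive vehicle age, and extract coordinates."""
    if current_year is None:
        current_year = date.today().year

    # Rows without a county, city, or location cannot be placed on a map.
    out = df.dropna(subset=["County", "City", "Vehicle Location"]).copy()

    # Normalise inconsistent spacing and capitalisation in place names.
    out["City"] = out["City"].str.strip().str.title()
    out["County"] = out["County"].str.strip().str.title()

    # Derived feature: vehicle age in years.
    out["Vehicle Age"] = current_year - out["Model Year"]

    # Pull longitude and latitude out of the location string.
    coords = out["Vehicle Location"].str.extract(POINT_PATTERN).astype(float)
    out["Longitude"] = coords["lon"]
    out["Latitude"] = coords["lat"]

    # Discard rows whose location string did not match the assumed format.
    return out.dropna(subset=["Longitude", "Latitude"]).reset_index(drop=True)
```

The result can then be written to `data/processed/` with `DataFrame.to_csv`; the output file name is left to the notebook.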
- Analyzes EV distribution by county and city
- Creates charging station suitability scores
- Generates interactive maps
- Identifies patterns in EV adoption
- Saves analysis results
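The README does not state how the suitability score is computed, so the following is only an illustration of one way such a score might be built. It assumes an `Electric Range` column and uses placeholder weights: cities with more EVs and a lower average range score higher, with both inputs min-max scaled within each county. The function names and the formula are hypothetical.

```python
import pandas as pd


def _min_max(series: pd.Series) -> pd.Series:
    """Scale a series to [0, 1]; a constant series maps to all zeros."""
    span = series.max() - series.min()
    if span == 0:
        return pd.Series(0.0, index=series.index)
    return (series - series.min()) / span


def city_suitability(df: pd.DataFrame, count_weight: float = 0.7,
                     range_weight: float = 0.3) -> pd.DataFrame:
    """Aggregate vehicles per city and compute an illustrative 0-100 score."""
    city = (
        df.groupby(["County", "City"])
        .agg(ev_count=("Electric Range", "size"),
             avg_range=("Electric Range", "mean"))
        .reset_index()
    )
    by_county = city.groupby("County")
    count_norm = by_county["ev_count"].transform(_min_max)
    range_norm = by_county["avg_range"].transform(_min_max)

    # Placeholder formula: more EVs and shorter average range raise the score.
    city["suitability"] = 100 * (
        count_weight * count_norm + range_weight * (1 - range_norm)
    )
    return city.sort_values(
        ["County", "suitability"], ascending=[True, False]
    ).reset_index(drop=True)
```

Scaling within each county keeps scores comparable among the cities a planner actually chooses between, at the cost of making scores from different counties incomparable.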
- Creates city-level aggregated features
- Engineers new features for machine learning
- Handles categorical variables
- Prepares training and test datasets
- Saves processed data for model training
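As a rough sketch of the feature engineering steps above: aggregate vehicle rows to one row per city, one-hot encode the county, and split into training and test sets. The column names, the chosen aggregates, and the target column name `charging_score` are assumptions for illustration; the notebook may use different features.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def build_city_features(df: pd.DataFrame) -> pd.DataFrame:
    """Collapse vehicle-level rows into one feature row per (county, city)."""
    return (
        df.groupby(["County", "City"])
        .agg(
            ev_count=("Model Year", "size"),
            mean_model_year=("Model Year", "mean"),
            mean_range=("Electric Range", "mean"),
            # Share of battery-electric vehicles, assuming the type label contains "BEV".
            bev_share=("Electric Vehicle Type",
                       lambda s: s.str.contains("BEV", regex=False).mean()),
        )
        .reset_index()
    )


def make_splits(city: pd.DataFrame, target: str,
                test_size: float = 0.25, seed: int = 42):
    """One-hot encode the county and return X_train, X_test, y_train, y_test."""
    features = city.drop(columns=[target, "City"])
    X = pd.get_dummies(features, columns=["County"])
    y = city[target]
    return train_test_split(X, y, test_size=test_size, random_state=seed)
```

The city name is dropped from the features because it identifies a single row; keeping it would let a model memorise cities rather than learn from their characteristics.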
- Linear Regression: Baseline model with interpretable coefficients
- Ridge Regression: Regularized linear model to prevent overfitting
- Random Forest: Ensemble method capturing non-linear relationships
- XGBoost: Gradient boosting for high-performance predictions
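To compare the four model families on the same split, something like the sketch below may help. The hyperparameters are arbitrary placeholders rather than the values used in the training notebooks, and the helper names `candidate_models` and `fit_and_score` are hypothetical. XGBoost is included only if it can be imported.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score


def candidate_models(seed: int = 42) -> dict:
    """Return untrained regressors keyed by the names used in models/."""
    models = {
        "linear_regression": LinearRegression(),
        "ridge_regression": Ridge(alpha=1.0),
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=seed),
    }
    try:
        from xgboost import XGBRegressor
        models["xgboost"] = XGBRegressor(n_estimators=200, random_state=seed)
    except Exception:
        # xgboost is not installed or its native library failed to load.
        pass
    return models


def fit_and_score(models: dict, X_train, y_train, X_test, y_test) -> dict:
    """Fit each model and report RMSE and R^2 on the held-out data."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        scores[name] = {
            "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
            "r2": float(r2_score(y_test, pred)),
        }
    return scores
```

Calling `fit_and_score(candidate_models(), X_train, y_train, X_test, y_test)` returns a dictionary that can be turned into a comparison table with `pandas.DataFrame(scores).T`.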
- Select a Model: Choose from available trained models
- Choose a County: Select a county to analyze
- View the Map: Interactive map showing EV distribution and charging scores
- Check Rankings: See top cities by charging station suitability
- Make Predictions: Select a specific city to get charging score predictions
- View Statistics: County-level statistics and model performance metrics
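Behind the "Select a Model", "Choose a County", and "Check Rankings" steps, the app presumably loads a saved model and scores the cities in the chosen county. The sketch below shows one way to do that outside Streamlit. It assumes the `.pkl` files were written with `pickle` (use `joblib.load` instead if they were saved with joblib) and that a city-level feature table with `County` and `City` columns is available; `load_model`, `rank_cities`, and `feature_cols` are hypothetical names.

```python
import pickle
from pathlib import Path

import pandas as pd


def load_model(name: str, model_dir: str = "models"):
    """Load a trained model such as 'random_forest' from models/<name>.pkl."""
    with open(Path(model_dir) / f"{name}.pkl", "rb") as fh:
        return pickle.load(fh)


def rank_cities(model, city_features: pd.DataFrame, county: str,
                feature_cols: list, top_n: int = 5) -> pd.DataFrame:
    """Predict a score for every city in one county and return the top N."""
    wanted = county.strip().lower()
    subset = city_features[city_features["County"].str.lower() == wanted].copy()
    if subset.empty:
        raise ValueError(f"No cities found for county {county!r}")

    # feature_cols must match the columns the model was trained on.
    subset["predicted_score"] = model.predict(subset[feature_cols])
    return (
        subset.sort_values("predicted_score", ascending=False)
        .head(top_n)[["City", "predicted_score"]]
        .reset_index(drop=True)
    )
```

Only load pickle files you created yourself, since unpickling untrusted data can execute arbitrary code.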
The project includes multiple machine learning models with different strengths:
- Linear Regression: Fast, interpretable, good baseline
- Ridge Regression: Regularized, prevents overfitting
- Random Forest: Handles non-linear relationships, feature importance
- XGBoost: High performance, gradient boosting
- Interactive Maps: Folium-based maps with EV distribution
- Real-time Predictions: Get charging scores for any city
- Model Comparison: Switch between different ML models
- County Filtering: Focus on specific geographic areas
- Performance Metrics: View model accuracy and performance
- Responsive Design: Works on desktop and mobile devices
The project uses the Electric Vehicle Population Data, which includes:
- Vehicle information (VIN, make, model, year)
- Geographic data (county, city, state, coordinates)
- EV specifications (electric range, MSRP, vehicle type)
- Registration details and utility information
- Python: Core programming language
- Pandas: Data manipulation and analysis
- NumPy: Numerical computing
- Scikit-learn: Machine learning algorithms
- XGBoost: Gradient boosting framework
- Streamlit: Web application framework
- Plotly: Interactive visualizations
- Folium: Interactive maps
- Jupyter: Notebook environment
- Select your target county
- View the interactive map to see current EV distribution
- Check the top cities ranking for charging station priority
- Use the prediction tool to evaluate specific locations
- Make data-driven decisions for charging station placement
- Run the analysis notebooks to understand EV adoption patterns
- Experiment with different feature engineering approaches
- Compare model performance across different algorithms
- Analyze feature importance to understand key factors
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is for educational purposes as part of the FDM module mini project.
- Electric Vehicle Population Data source
- FDM module instructors
- Open source libraries and frameworks used
For questions or issues:
- Check the notebook documentation
- Review the error messages in the web app
- Ensure all dependencies are installed correctly
- Verify that all notebooks have been executed in order
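One quick way to check the last point is to confirm that the four model files listed in the project structure exist. The following is a small sketch using only the standard library; the helper name `missing_model_files` is hypothetical, and it checks only `models/`, not the contents of `data/processed/`.

```python
from pathlib import Path

# File names taken from the models/ directory in the project structure.
EXPECTED_MODELS = (
    "linear_regression.pkl",
    "ridge_regression.pkl",
    "random_forest.pkl",
    "xgboost.pkl",
)


def missing_model_files(project_root: str = ".") -> list:
    """Return the expected model files that are absent from models/."""
    models_dir = Path(project_root) / "models"
    return [name for name in EXPECTED_MODELS if not (models_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing models, rerun the training notebooks:", ", ".join(missing))
    else:
        print("All expected model files are present.")
```

Run it from the project root before launching the app; a missing file usually means the corresponding training notebook was skipped.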
Smart Charge Locator - Making EV infrastructure planning smarter with data science! 🔋⚡