A Geospatial Machine Learning Platform for Predictive Landmine Risk Assessment
- Overview
- Technical Architecture
- Features
- Methodology
- Installation & Setup
- Usage
- Model Performance
- API Reference
- Contributing
- License
- Acknowledgments
- Contact
The Dynamic Landmine Risk Intelligence System is an advanced geospatial machine learning platform designed to predict and visualize landmine risk across geographic regions. Built as a graduate research prototype, this system demonstrates the application of ensemble learning, explainable AI, and interactive geospatial visualization for humanitarian demining operations.
Note: Screenshots are referenced below. To add actual screenshots, follow the instructions in docs/screenshots/README.md.
Interactive heatmap visualization showing landmine risk probabilities across the geographic region
Real-time model performance metrics including AUC, accuracy, and feature importance rankings
Model interpretability through SHAP values showing feature contributions to predictions
Model Performance Achieved:
- AUC Score: 0.791 - Good discriminative ability
- Accuracy: 80.8% - Roughly 4 out of 5 cases predicted correctly
- Cross-validation AUC: 0.749 ± 0.012 - Consistent performance across data folds
- Feature Correlations: 0.085-0.275 - Weak to moderate feature-target relationships
Technical Achievements:
- Real-time Processing - Sub-second prediction generation
- Interactive Visualization - Dynamic heatmap with customizable parameters
- Model Interpretability - SHAP-based explanations for transparent AI
- Comprehensive Analytics - Full performance metrics and feature analysis
- Real-time Risk Prediction: Machine learning models trained on geospatial features
- Interactive Heatmap Visualization: Dynamic risk assessment across geographic regions
- Model Interpretability: SHAP-based explanations for transparent decision-making
- Incremental Learning: Support for continuous model updates with new field data
- Web-based Interface: Accessible through modern web browsers
    ┌─────────────────────────────────────────────────────────────┐
    │                   Streamlit Web Interface                   │
    ├─────────────────────────────────────────────────────────────┤
    │  • Interactive Controls        • Real-time Visualization    │
    │  • Model Management            • Data Input/Export          │
    └─────────────────────────────────────────────────────────────┘
                                   │
    ┌─────────────────────────────────────────────────────────────┐
    │                  Machine Learning Pipeline                  │
    ├─────────────────────────────────────────────────────────────┤
    │  • Random Forest Classifier    • Feature Engineering        │
    │  • Cross-validation            • Model Persistence          │
    └─────────────────────────────────────────────────────────────┘
                                   │
    ┌─────────────────────────────────────────────────────────────┐
    │                 Geospatial Data Processing                  │
    ├─────────────────────────────────────────────────────────────┤
    │  • Synthetic Data Generation   • Coordinate Transformation  │
    │  • Feature Extraction          • Spatial Interpolation      │
    └─────────────────────────────────────────────────────────────┘
- Frontend: Streamlit, Folium, Matplotlib
- Machine Learning: Scikit-learn, SHAP, Joblib
- Geospatial: Folium, NumPy, Pandas
- Visualization: Plotly, Matplotlib, Streamlit-Folium
- Deployment: Streamlit Cloud
- Dynamic Heatmaps: Real-time risk probability visualization
- Customizable Parameters: Adjustable heatmap radius and opacity
- Multi-layer Display: Risk heatmap with labeled ground truth markers
- Geographic Context: CartoDB Positron basemap for clear visualization
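The heatmap layer described in the list above could be assembled roughly as follows. This is a minimal sketch rather than the project's actual rendering code: the column names `lat` and `lon` and the helper names `heatmap_rows` and `build_risk_map` are assumptions made for illustration. It relies on Folium's `HeatMap` plugin and the CartoDB Positron tile set.

```python
import pandas as pd


def heatmap_rows(df: pd.DataFrame) -> list:
    """Convert a prediction frame into [lat, lon, weight] rows.

    Assumes (hypothetically) columns named 'lat', 'lon', and 'risk_proba'.
    """
    cols = ["lat", "lon", "risk_proba"]
    return df[cols].dropna().values.tolist()


def build_risk_map(df: pd.DataFrame, radius: int = 20, min_opacity: float = 0.3):
    """Build a Folium map with a risk heatmap on a CartoDB Positron basemap."""
    # Imported lazily so the data-preparation helper works without Folium.
    import folium
    from folium.plugins import HeatMap

    center = [df["lat"].mean(), df["lon"].mean()]
    fmap = folium.Map(location=center, zoom_start=10, tiles="CartoDB positron")
    HeatMap(heatmap_rows(df), radius=radius, min_opacity=min_opacity).add_to(fmap)
    return fmap


# Two illustrative points with arbitrary coordinates.
demo = pd.DataFrame(
    {"lat": [10.0, 10.1], "lon": [20.0, 20.1], "risk_proba": [0.2, 0.9]}
)
rows = heatmap_rows(demo)
```

The `radius` and `min_opacity` arguments correspond to the adjustable radius and opacity controls; in a Streamlit app the returned map would typically be rendered through Streamlit-Folium.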
- Ensemble Learning: Random Forest with optimized hyperparameters
- Class Balancing: Automatic handling of imbalanced datasets
- Cross-validation: Robust performance estimation
- Feature Engineering: Geospatial feature extraction and normalization
- SHAP Explanations: Local and global feature importance
- Feature Importance Rankings: Tree-based importance analysis
- Interactive Plots: Beeswarm and bar chart visualizations
- Sample-based Analysis: Efficient computation on data subsets
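One way to obtain sample-based SHAP explanations for a tree ensemble is sketched below. The helper names (`positive_class_shap`, `explain_sample`, `global_importance`) are hypothetical and this is not the project's `explain_model_shap` implementation. The sketch assumes a binary classifier and accounts for the two layouts a tree explainer may return for it: a list with one array per class, or a single 3-D array.

```python
import numpy as np


def positive_class_shap(shap_values) -> np.ndarray:
    """Return a (n_samples, n_features) array for the positive class."""
    if isinstance(shap_values, list):
        # Layout 1: one array per class.
        return np.asarray(shap_values[1])
    values = np.asarray(shap_values)
    if values.ndim == 3:
        # Layout 2: (n_samples, n_features, n_classes).
        return values[:, :, 1]
    return values


def explain_sample(model, X, n_samples: int = 100, seed: int = 42):
    """Compute SHAP values on a random subset of rows to keep it cheap."""
    import shap  # imported lazily; requires the shap package

    X_sample = X.sample(min(n_samples, len(X)), random_state=seed)
    explainer = shap.TreeExplainer(model)
    values = positive_class_shap(explainer.shap_values(X_sample))
    return explainer, values, X_sample


def global_importance(values: np.ndarray) -> np.ndarray:
    """Mean absolute SHAP value per feature (global importance)."""
    return np.abs(values).mean(axis=0)


# Hand-made SHAP values for 2 samples and 3 features, in the per-class layout.
fake = [np.zeros((2, 3)), np.array([[0.1, -0.2, 0.0], [0.3, 0.2, 0.0]])]
ranking = global_importance(positive_class_shap(fake))
```

Local explanations are the individual rows of the returned array, while the mean absolute value per column gives the global ranking that a bar plot would show.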
- Performance Metrics: AUC, Accuracy, Precision, Recall
- Risk Distribution: Histogram visualization of predicted probabilities
- Dataset Statistics: Comprehensive data overview and summaries
- Export Capabilities: CSV download of current dataset
The system employs a sophisticated synthetic data generation approach that simulates realistic landmine risk scenarios:
    # Risk function incorporating multiple geospatial factors
    risk_score = (
        6.0 * vegetation +                              # Vegetation density
        5.0 * soil_moisture +                           # Soil moisture content
        -2.0 * np.log1p(distance_to_road) +             # Distance to infrastructure
        3.0 * (conflict_intensity / 3.0) +              # Historical conflict data
        0.003 * (elevation - 1200) +                    # Topographic elevation
        # Interaction effects
        2.0 * vegetation * soil_moisture +              # Environmental synergy
        -1.0 * vegetation * np.log1p(distance_to_road)  # Accessibility factor
    )

Primary Features:
- `vegetation`: Vegetation density index (0-1)
- `soil_moisture`: Soil moisture content (0-1)
- `distance_to_road`: Distance to nearest road (km)
- `conflict_intensity`: Historical conflict level (0-3)
- `elevation`: Topographic elevation (meters)
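The risk function above produces a continuous score, while the generated dataset carries binary labels. The sketch below shows one plausible way to get from one to the other. The sampling distributions for each feature and the labeling step (standardize, apply a logistic function, draw Bernoulli labels) are assumptions for illustration and may differ from what `generate_synthetic_geodata` actually does.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1200

# Feature ranges follow the list above; the distributions are assumed.
vegetation = rng.uniform(0.0, 1.0, n)
soil_moisture = rng.uniform(0.0, 1.0, n)
distance_to_road = rng.exponential(2.0, n)    # km
conflict_intensity = rng.integers(0, 4, n)    # levels 0-3
elevation = rng.normal(1200.0, 300.0, n)      # meters

risk_score = (
    6.0 * vegetation
    + 5.0 * soil_moisture
    - 2.0 * np.log1p(distance_to_road)
    + 3.0 * (conflict_intensity / 3.0)
    + 0.003 * (elevation - 1200)
    + 2.0 * vegetation * soil_moisture
    - 1.0 * vegetation * np.log1p(distance_to_road)
)

# Assumed labeling step: standardize the score, squash it into (0, 1),
# then draw each label as a Bernoulli trial with that probability.
centered = (risk_score - risk_score.mean()) / risk_score.std()
probability = 1.0 / (1.0 + np.exp(-centered))
label = (rng.uniform(size=n) < probability).astype(int)
```

Drawing labels at random instead of thresholding the score keeps the classes overlapping, so a classifier trained on such data cannot reach a perfect AUC.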
Feature Transformations:
- Logarithmic scaling for distance-based features
- Normalization for continuous variables
- Categorical encoding for conflict intensity
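The three transformations listed above can be illustrated on a tiny frame. This is a sketch under stated assumptions: min-max scaling is only one possible normalization, one-hot encoding is only one possible categorical encoding, and the output column names are invented for the example.

```python
import numpy as np
import pandas as pd

raw = pd.DataFrame(
    {
        "distance_to_road": [0.0, 1.0, 9.0],
        "elevation": [900.0, 1200.0, 1500.0],
        "conflict_intensity": [0, 3, 1],
    }
)

features = pd.DataFrame(index=raw.index)

# Logarithmic scaling: log1p(x) = log(1 + x), which is defined at x = 0.
features["log_distance_to_road"] = np.log1p(raw["distance_to_road"])

# Min-max normalization to the [0, 1] range.
elev = raw["elevation"]
features["elevation_norm"] = (elev - elev.min()) / (elev.max() - elev.min())

# One-hot encoding with the category set fixed to levels 0-3, so that a
# level absent from this sample still gets its own (all-zero) column.
levels = raw["conflict_intensity"].astype(pd.CategoricalDtype(categories=[0, 1, 2, 3]))
one_hot = pd.get_dummies(levels, prefix="conflict", dtype=int)
features = pd.concat([features, one_hot], axis=1)
```

Tree-based models such as Random Forest are insensitive to monotonic rescaling of a single feature, so these steps matter mostly for plotting and for any non-tree model trained on the same data.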
Random Forest Classifier:

    from sklearn.ensemble import RandomForestClassifier

    RandomForestClassifier(
        n_estimators=300,                   # Ensemble size
        max_depth=None,                     # Unrestricted depth
        min_samples_leaf=3,                 # Minimum leaf samples
        class_weight="balanced_subsample",  # Handle class imbalance
        random_state=42,                    # Reproducibility
        n_jobs=-1                           # Parallel processing
    )

Performance Optimization:
- Stratified train-test splitting
- 5-fold cross-validation
- Class-weighted sampling
- Feature importance analysis
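The evaluation steps listed above (stratified split, 5-fold cross-validation, class weighting) can be sketched with scikit-learn as follows. The data comes from `make_classification` as a stand-in for the project's dataset, and the 25% test split is an assumption, so the resulting numbers are not comparable to the reported metrics.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

# Stand-in data: 5 features and an imbalanced binary target.
X, y = make_classification(
    n_samples=600, n_features=5, n_informative=4, n_redundant=0,
    weights=[0.7, 0.3], random_state=42,
)

# A stratified split keeps the class ratio equal in the train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

model = RandomForestClassifier(
    n_estimators=300, min_samples_leaf=3,
    class_weight="balanced_subsample", random_state=42, n_jobs=-1,
)

# 5-fold stratified cross-validation on the training portion, scored by AUC.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
cv_auc = cross_val_score(model, X_train, y_train, cv=cv, scoring="roc_auc")

model.fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]
metrics = {
    "test_auc": roc_auc_score(y_test, proba),
    "test_accuracy": accuracy_score(y_test, model.predict(X_test)),
    "cv_auc_mean": cv_auc.mean(),
    "cv_auc_std": cv_auc.std(),
}
```

Running cross-validation only on the training portion keeps the held-out test set untouched, which is why a test AUC and a CV AUC can be reported side by side.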
- Python 3.8 or higher
- pip package manager
- Git (for cloning)
- Clone the repository:

      git clone https://github.com/KrishJani/Dynamic-Landmine-Risk-Heatmap.git
      cd Dynamic-Landmine-Risk-Heatmap

- Create a virtual environment:

      python -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate

- Install dependencies:

      pip install -r requirements.txt

- Run the application:

      streamlit run app_streamlit.py

Docker deployment uses the following Dockerfile:

    FROM python:3.9-slim
    WORKDIR /app
    COPY requirements.txt .
    RUN pip install -r requirements.txt
    COPY . .
    EXPOSE 8501
    CMD ["streamlit", "run", "app_streamlit.py", "--server.port=8501", "--server.address=0.0.0.0"]
- Access the Application: Navigate to https://dynamic-landmine-risk-heatmap.streamlit.app/
- Model Overview: Review current model performance metrics in the sidebar
- Interactive Controls:
  - Adjust heatmap radius (8-40 pixels)
  - Retrain model with current dataset
  - Add labeled data points (simulate field reports)
- Visualization:
  - Explore the dynamic heatmap
  - Analyze risk distribution plots
  - Review feature importance rankings
- Model Interpretability:
  - Enable SHAP explanations
  - Choose between beeswarm and bar plots
  - Analyze local and global feature contributions
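The "add labeled data points" and "retrain" controls suggest an update loop like the one sketched below. Scikit-learn's Random Forest has no `partial_fit`, so this sketch assumes that an update means refitting on the enlarged dataset. The `label` column name and the helpers `add_field_report` and `retrain` are hypothetical.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["vegetation", "soil_moisture", "distance_to_road",
            "conflict_intensity", "elevation"]


def add_field_report(df: pd.DataFrame, report: dict) -> pd.DataFrame:
    """Return a new frame with one labeled observation appended."""
    return pd.concat([df, pd.DataFrame([report])], ignore_index=True)


def retrain(df: pd.DataFrame) -> RandomForestClassifier:
    """Refit from scratch on the full, updated dataset."""
    model = RandomForestClassifier(n_estimators=50, random_state=42)
    return model.fit(df[FEATURES], df["label"])


data = pd.DataFrame({
    "vegetation": [0.1, 0.8, 0.3, 0.9],
    "soil_moisture": [0.2, 0.7, 0.1, 0.8],
    "distance_to_road": [5.0, 0.5, 4.0, 0.2],
    "conflict_intensity": [0, 3, 1, 2],
    "elevation": [1100.0, 1250.0, 1300.0, 1180.0],
    "label": [0, 1, 0, 1],
})

# A simulated field report: one new observation with a confirmed label.
report = {"vegetation": 0.7, "soil_moisture": 0.6, "distance_to_road": 0.8,
          "conflict_intensity": 3, "elevation": 1210.0, "label": 1}

data = add_field_report(data, report)
model = retrain(data)
```

Refitting is simple and keeps the model consistent with all data seen so far, at the cost of training time that grows with the dataset.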
    from model_and_utils import train_rf, predict_grid, explain_model_shap
    from simulate_data import generate_synthetic_geodata

    # Generate synthetic data
    df = generate_synthetic_geodata(n_points=1200, seed=42)

    # Train model
    model, metrics, feature_importances = train_rf(df)

    # Generate predictions
    df_with_predictions = predict_grid(model, df)

    # Explain model decisions
    explainer, shap_values, X_sample = explain_model_shap(model, df.sample(100))

| Metric | Value | Interpretation |
|---|---|---|
| Test AUC | 0.791 | Good discriminative ability |
| Test Accuracy | 0.808 | 80.8% correct predictions |
| CV AUC (mean ± std) | 0.749 ± 0.012 | Consistent cross-validation performance |
| Feature Correlation | 0.085-0.275 | Weak to moderate feature-target relationships |
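Feature-target correlations like those in the table can be computed as Pearson correlations between each feature and the 0/1 label, which for a binary target is the point-biserial correlation. The sketch below uses made-up stand-in data and an assumed `label` column; it is not how the reported range was necessarily obtained.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "vegetation": rng.uniform(0, 1, n),
    "soil_moisture": rng.uniform(0, 1, n),
})

# Stand-in label loosely driven by vegetation (illustrative only).
df["label"] = (df["vegetation"] + rng.normal(0, 0.8, n) > 0.5).astype(int)

# Absolute correlation of every feature with the label, strongest first.
correlations = (
    df.drop(columns="label")
    .corrwith(df["label"])
    .abs()
    .sort_values(ascending=False)
)
```

Values in the 0.1 to 0.3 range are conventionally read as weak to moderate linear association, which is compatible with a useful non-linear model such as a Random Forest.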
Histogram showing the distribution of predicted risk probabilities across all data points
Bar chart displaying the relative importance of each geospatial feature in the model
- Soil Moisture (0.271) - Primary environmental indicator
- Distance to Road (0.260) - Accessibility factor
- Vegetation (0.248) - Environmental cover
- Elevation (0.166) - Topographic influence
- Conflict Intensity (0.055) - Historical context
- Cross-validation: 5-fold stratified CV ensures robust performance estimation
- Class balancing: Automatic handling of imbalanced datasets
- Feature scaling: Normalized features for optimal model performance
- Hyperparameter optimization: Tuned for maximum discriminative power
`train_rf`: Trains a Random Forest classifier with optimized parameters.
Parameters:
- `df`: DataFrame with features and target variable
- `test_size`: Proportion of data for testing
- `random_state`: Random seed for reproducibility
- `cv`: Number of cross-validation folds

Returns:
- `model`: Trained RandomForestClassifier
- `metrics`: Dictionary of performance metrics
- `feature_importances`: Pandas Series of feature importance scores
`predict_grid`: Generates risk predictions for all data points.
Parameters:
- `model`: Trained machine learning model
- `df`: DataFrame with feature columns

Returns:
- DataFrame with added `risk_proba` column
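A function with this contract could look like the sketch below. It is a hypothetical stand-in named `predict_grid_sketch`, not the project's implementation, and it assumes the model was trained on the five primary feature columns.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["vegetation", "soil_moisture", "distance_to_road",
            "conflict_intensity", "elevation"]


def predict_grid_sketch(model, df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a 'risk_proba' column for the positive class."""
    out = df.copy()
    # Column 1 of predict_proba holds the probability of class 1 (risk present).
    out["risk_proba"] = model.predict_proba(out[FEATURES])[:, 1]
    return out


train = pd.DataFrame({
    "vegetation": [0.1, 0.8, 0.3, 0.9, 0.2, 0.7],
    "soil_moisture": [0.2, 0.7, 0.1, 0.8, 0.3, 0.6],
    "distance_to_road": [5.0, 0.5, 4.0, 0.2, 6.0, 0.9],
    "conflict_intensity": [0, 3, 1, 2, 0, 3],
    "elevation": [1100.0, 1250.0, 1300.0, 1180.0, 1050.0, 1220.0],
    "label": [0, 1, 0, 1, 0, 1],
})
model = RandomForestClassifier(n_estimators=25, random_state=0)
model.fit(train[FEATURES], train["label"])

scored = predict_grid_sketch(model, train.drop(columns="label"))
```

Returning a copy leaves the caller's DataFrame unchanged, which avoids surprises when the same frame is reused for retraining.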
`explain_model_shap`: Generates SHAP explanations for model predictions.
Parameters:
- `model`: Trained machine learning model
- `df_sample`: Sample DataFrame for explanation

Returns:
- `explainer`: SHAP explainer object
- `shap_values`: Array of SHAP values
- `X_sample`: Feature matrix
`generate_synthetic_geodata`: Generates synthetic geospatial data for landmine risk assessment.
Parameters:
- `n_points`: Number of data points to generate
- `seed`: Random seed for reproducibility
- `bbox`: Bounding box (lon_min, lat_min, lon_max, lat_max)

Returns:
- DataFrame with geospatial features and binary labels
We welcome contributions to improve the Dynamic Landmine Risk Intelligence System. Please follow these guidelines:
- Fork the repository
- Create a feature branch: `git checkout -b feature/amazing-feature`
- Make your changes with appropriate tests
- Commit your changes: `git commit -m 'Add amazing feature'`
- Push to the branch: `git push origin feature/amazing-feature`
- Open a Pull Request
- Follow PEP 8 style guidelines
- Include docstrings for all functions
- Add type hints where appropriate
- Write comprehensive tests for new features
- Update documentation for API changes
- Model Improvements: Advanced ML algorithms, feature engineering
- Visualization: Enhanced geospatial visualizations, interactive plots
- Performance: Optimization, caching, scalability improvements
- Documentation: Tutorials, examples, API documentation
- Testing: Unit tests, integration tests, performance benchmarks
This project is licensed under the MIT License - see the LICENSE file for details.
- Research Context: Graduate research prototype for geospatial machine learning
- Technologies: Built with Streamlit, Scikit-learn, and Folium
- Deployment: Hosted on Streamlit Cloud
- Data: Synthetic geospatial data for demonstration purposes
Developer: Krish Jani
Email: [email protected]
Project: Dynamic Landmine Risk Intelligence System
Live Demo: https://dynamic-landmine-risk-heatmap.streamlit.app/
This project demonstrates the application of machine learning and geospatial analysis for humanitarian demining operations. The synthetic data and models are designed for research and educational purposes.