This project is our submission for the Data4Good Festival Hackathon, focusing on urban innovation in Berlin 🇩🇪. By leveraging open datasets on traffic accidents, air quality, and cycling infrastructure, we developed a data-driven solution to identify safer, cleaner, and more efficient cycling routes throughout the city. The goal is to transform raw data into actionable insights that can enhance the quality of life for Berlin's citizens.
The primary objective of this 48-hour challenge was to design, explore, and prototype a solution that addresses the multifaceted challenges of urban living. Our specific goals were to:
- Analyze and identify accident hotspots across the city.
- Develop a methodology to define and recommend the safest and healthiest cycling routes.
- Create an interactive tool to help cyclists and city planners make better, data-informed decisions.
This project was created during the Data4Good Festival Hackathon, organized by the Hertie School in Berlin, Germany, from April 19-21, 2024. As participants, we were challenged to develop a meaningful, data-driven solution to a real-world problem within a 48-hour timeframe.
Our analysis was built upon three core datasets provided for the hackathon:
- Accidents in Berlin (2021): Detailed records of road accidents, including precise geolocation data.
- Air Pollution Data (2021): Air quality measurements (e.g., NO₂, PM10) from official monitoring stations.
- Urban Infrastructure Data: Geospatial data on existing cycling routes across Berlin.
For a detailed overview of the data preparation process, see the original data preparation notebook.
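As a rough illustration of the kind of cleaning such records tend to need, the sketch below parses a tiny made-up CSV with pandas. It is not taken from the notebook: the column names (`lat`, `lon`, `timestamp`), the semicolon separator, the comma decimal mark, and the sample rows are all assumptions chosen for the example.

```python
import io

import pandas as pd

# Toy input standing in for an accident export (hypothetical columns and values).
raw = io.StringIO(
    "lat;lon;timestamp\n"
    "52,5212;13,4062;2021-03-01 08:15\n"
    ";13,4100;2021-03-02 17:40\n"
    "52,5312;13,4212;not recorded\n"
)

# decimal="," converts comma-formatted numbers to floats while reading.
df = pd.read_csv(raw, sep=";", decimal=",")

# Coerce anything unparseable to NaN / NaT instead of raising an error.
df["lat"] = pd.to_numeric(df["lat"], errors="coerce")
df["lon"] = pd.to_numeric(df["lon"], errors="coerce")
df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")

# Rows without usable coordinates cannot be mapped, so drop them.
clean = df.dropna(subset=["lat", "lon"]).reset_index(drop=True)
print(clean)
```

Rows with a missing timestamp are kept here, since they can still be placed on a map; whether to drop them depends on the analysis.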
The project was implemented entirely in Python, leveraging its powerful data science and geospatial libraries.
Our 48-hour sprint followed a condensed data science process:
- Data Exploration & Cleaning: Loaded the datasets, handled missing values, and standardized data types, particularly for geospatial coordinates and timestamps.
- Geospatial Analysis:
  - Mapped accident locations to identify high-risk zones or "accident hotspots".
  - Integrated air quality data from monitoring stations to create a pollution layer across the city.
- Route Scoring: Developed a simple but effective scoring algorithm to evaluate cycling routes by combining safety and air quality metrics.
- Interactive Visualization: Built an interactive map using Folium to present our findings in an accessible and user-friendly way.
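The hotspot, pollution-layer, and route-scoring steps above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the notebook's implementation: the grid size, the inverse-distance weighting used for the pollution layer, the weights and caps in the score, the function names (`cell_of`, `hotspot_counts`, `idw_pollution`, `route_score`), and all coordinates and values are hypothetical.

```python
import math
from collections import Counter

CELL_DEG = 0.005  # hypothetical grid cell size in degrees


def cell_of(lat, lon, cell=CELL_DEG):
    """Map a coordinate to an integer grid cell index."""
    return (math.floor(lat / cell), math.floor(lon / cell))


def hotspot_counts(accidents, cell=CELL_DEG):
    """Count accidents per grid cell; high counts mark candidate hotspots."""
    return Counter(cell_of(lat, lon, cell) for lat, lon in accidents)


def idw_pollution(lat, lon, stations, power=2):
    """Inverse-distance-weighted estimate from (lat, lon, value) stations."""
    num = den = 0.0
    for s_lat, s_lon, value in stations:
        d = math.hypot(lat - s_lat, lon - s_lon)
        if d == 0:
            return value  # exactly on a station: use its reading
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def route_score(route, counts, stations, w_safety=0.6, w_air=0.4,
                max_accidents=10, max_pollution=40.0):
    """Return a score in [0, 1]; higher means safer and cleaner.

    Weights and caps are illustrative assumptions, not tuned values.
    """
    acc = sum(counts.get(cell_of(lat, lon), 0) for lat, lon in route) / len(route)
    pol = sum(idw_pollution(lat, lon, stations) for lat, lon in route) / len(route)
    risk = min(acc / max_accidents, 1.0)
    dirt = min(pol / max_pollution, 1.0)
    return 1.0 - (w_safety * risk + w_air * dirt)


# Toy data (made up for the example).
accidents = [(52.5212, 13.4062), (52.5213, 13.4063), (52.5312, 13.4212)]
stations = [(52.5100, 13.3900, 30.0), (52.5412, 13.4312, 18.0)]
route_a = [(52.5212, 13.4062), (52.5262, 13.4112)]  # passes through a hotspot cell
route_b = [(52.5412, 13.4312), (52.5462, 13.4362)]  # no accidents, cleaner air

counts = hotspot_counts(accidents)
score_a = route_score(route_a, counts, stations)
score_b = route_score(route_b, counts, stations)
print(f"route A: {score_a:.2f}, route B: {score_b:.2f}")
```

A fixed grid keeps the hotspot step simple; a density estimate or clustering method would be a reasonable alternative when cell boundaries split a cluster.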
We developed a prototype for an interactive web map called AIDSCA, designed to help cyclists plan their journeys based on safety and environmental factors.
- Interactive Map Layers: The Folium map allows users to toggle different data layers on and off, including:
  - Accident hotspots.
  - 💨 Air pollution levels.
  - 🚲 Existing cycling infrastructure.
- Route Safety Score: Each cycling route is color-coded based on a calculated safety & health score, which penalizes routes for proximity to accidents and high-pollution zones.
- Data-Driven Insights: The analysis pinpoints specific intersections and road segments that are high-risk, providing a basis for data-driven policy recommendations to the city.
To explore our analysis and replicate the results, please follow these steps:
- Prerequisites:
  - Ensure you have Python 3.x installed.
  - We recommend using a virtual environment.
- Clone the Repository:

      git clone https://github.com/Silvestre17/Data4Good_Berlim.git
      cd Data4Good_Berlim

- Launch Jupyter:
  - Run Jupyter Notebook from your terminal:

        jupyter notebook

  - Open the main project notebook (ISCTE Derrubar.ipynb) to view the full workflow, from data loading to the final Folium map generation.
This project was a collaborative effort by:
- André Silvestre
- Rita Matos
- Maria Margarida Pereira
While the project is documented in English, the context is the vibrant, data-rich environment of Berlin, Germany 🇩🇪. 🍺🥨 We hope our findings and tools can contribute to making Berlin a safer and more sustainable city for cyclists.