Data engineering project on the Paris 2024 Olympics using Microsoft Azure, with a detailed Power BI dashboard

aRUsh-codes/paris-olympic-de

Paris Olympics Data Engineering Project in Microsoft Azure 🏅

Overview 📝

This project demonstrates an end-to-end ETL process for analyzing Olympics data using Microsoft Azure services: Azure Data Factory, Azure Databricks, Azure Storage, and Azure Synapse Analytics. The dataset includes four tables: Athletes, Coaches, Teams, and Medals.

Components Used ⚙️

  1. Azure Data Factory 🏭: Orchestrated the ETL pipeline by extracting data from Azure Storage, transforming it in Databricks, and loading it into Azure Synapse Analytics.

  2. Azure Databricks 🔥: Handled data transformation and analysis using PySpark. The cleaned and processed data was prepared for deeper insights, such as athlete performance trends and medal distributions.

  3. Azure Storage 📦: Stored the raw Olympics data (CSV files) as the source for the pipeline.

  4. Azure Synapse Analytics 📊: Served as the data warehouse, allowing complex SQL queries for team performance and medal analysis.
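
The extract, transform, and load flow described above can be sketched locally in plain Python. This is a minimal illustration, not the project's actual pipeline code: an in-memory CSV string stands in for the raw files in Azure Storage, a small cleaning function stands in for the PySpark step in Databricks, and an in-memory SQLite database stands in for Azure Synapse Analytics. The column names (`athlete`, `country`, `medal`), the sample rows, and all function names are hypothetical and are not taken from the Kaggle dataset.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a raw medals CSV in Azure Storage.
# Column names and rows are assumptions, not the real dataset schema.
RAW_MEDALS_CSV = """athlete,country,medal
Alice Example, france ,Gold
Bob Example,France,silver
Carol Example,Japan,Gold
Dan Example,Japan,
"""


def extract(raw_text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: trim whitespace, normalize casing, drop rows with no medal."""
    cleaned = []
    for row in rows:
        medal = (row["medal"] or "").strip().title()
        if not medal:
            continue  # skip athletes without a medal entry
        cleaned.append({
            "athlete": row["athlete"].strip(),
            "country": row["country"].strip().title(),
            "medal": medal,
        })
    return cleaned


def load(rows, conn):
    """Load: write cleaned rows into a table (SQLite stands in for Synapse)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS medals (athlete TEXT, country TEXT, medal TEXT)"
    )
    conn.executemany(
        "INSERT INTO medals VALUES (:athlete, :country, :medal)", rows
    )
    conn.commit()


def medal_counts(conn):
    """Query total medals per country, the kind of SQL analysis run in the warehouse."""
    cursor = conn.execute(
        "SELECT country, COUNT(*) AS total FROM medals "
        "GROUP BY country ORDER BY total DESC, country"
    )
    return cursor.fetchall()


def run_pipeline(raw_text):
    """Chain the three steps, mirroring what the orchestrator does."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(raw_text)), conn)
    return medal_counts(conn)


if __name__ == "__main__":
    for country, total in run_pipeline(RAW_MEDALS_CSV):
        print(country, total)
```

In the real setup each function would map to a separate service, with Azure Data Factory triggering them in order; the sketch only shows how the stages hand data to one another.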

Dataset 📚

The dataset includes:

  • Athletes
  • Coaches
  • Teams
  • Medals

It is available on Kaggle: https://www.kaggle.com/datasets/piterfm/paris-2024-olympic-summer-games/data

Work 🎯

[Video: Screen.Recording.2024-09-08.004920.mp4]

My Power BI Dashboard

This links to my live Power BI dashboard. Click the image below to open it:

PowerBI Dashboard
