Skip to content

A collection of real-world ETL projects using Microsoft Power Query for Excel - demonstrating data cleaning, transformation, and automation workflows across diverse datasets.

Notifications You must be signed in to change notification settings

Muskantayal02/Power-Query-ETL-Projects

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

31 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

PROJECT OVERVIEW:

This repository showcases multiple ETL (Extract, Transform, Load) projects demonstrating my end-to-end capability in data transformation, validation, and automation using Microsoft Power Query.

Each project demonstrates how real-world data challenges - from messy Excel sheets to combining datasets across files - can be solved through systematic data cleaning, transformation, and automation techniques.

These highlight, different Power Query capabilities - from basic cleaning to advanced joins, appends, and dynamic folder consolidation. It includes multiple data cleaning and reporting scenarios showing how to extract, transform, and load data efficiently for analysis and reporting.

All project outputs are consolidated in one Excel workbook: PowerQuery_ETL_Projects.xlsx

Every sheet in the compiled workbook represents a standalone case study designed to highlight a specific Power Query capability such as data cleaning, merging, appending, and dynamic folder automation.

SKILLS DEMONSTRATED

  • Power Query ETL Process (Extract, Transform, Load)
  • Data Cleaning, Transformation & Structuring
  • Data Validation and Error Handling
  • Query Referencing & Load Optimization
  • Merging, Appending, Filtering and Reshaping Data
  • Dynamic Folder Automation
  • Web Scraping
  • Unpivot & Pivot Transformations
  • Custom Columns & Conditional Logic
  • Connection Management and Query Grouping
  • Automation using Connection-Only Queries
  • Business-Reay Data Modeling
  • Applied Steps Documentation  

PROJECT SUMMARIES

A) Employee Data Cleanup: (Sheet Name - Employee Data Cleaning)

Goal: Clean and prepare HR dataset for reporting.

Steps:

  • Removed extra rows/columns, promoted headers, and handled nulls.
  • Replaced missing gender and department values with defaults.
  • Extracted country from location column using “Column from Examples”.
  • Standardized departments and filtered out ex-employee
  • Converted data types for salary (currency) and date.
1  EmployeeData_AppliedSteps

 

Outcome: Cleaned employee dataset ready for dashboard/reporting.

Skills: Data cleaning, type transformation, text extraction.

 

B) Web Scraping - India Population Data (Sheet Name - India Population Web)

Goal: Transform a messy web dataset into a structured table.

Steps:

  • Connected to HTML table via Get Data → From Web.
  • Removed unnecessary rows and columns, promoted headers
  • Unpivoted year columns to normalize the data and reshape into tidy format.
  • Extracted year values using Text Between Delimiters.
  • Cleaned NA values, removed errors, and ensured standardized column types

 

2  WebScraping_Unpivot  

Outcome: A long-format dataset for population trends by state and year.

Skills: Web data extraction, Web connector, unpivoting, string parsing, data validation.

 

C) Excel Files Consolidation (Sheet Name - Projects Consolidation)

Goal: Combine monthly project data from multiple Excel files into one master table.

Steps:

  • Connected to folder path and combined data using Combine & Transform.
  • Transformed sample file, standardized project names (project-alpha.xlsx → project-alpha).
  • Added a custom column to convert month numbers into date values using a formula =#date(2025, [Month],1)
  • Filtered only .xlsx extensions files, changed types, and loaded final dataset
  • Tested automation by adding a new file and refreshing queries successfully

 

3  FolderConsolidation_Transform  

Outcome: An automated consolidated project database that refreshes on new file addition.

 

Automation for New files  

Initial files:  

Screenshot (602)  

New File added:

  Screenshot (603)  

Refreshed Output(Automated):  

Screenshot (606)

 

Skills: Folder automation, dynamic ETL, custom column formulas, refresh testing.

 

D) Students & Courses (Append and Merge) (Sheet Name - Female Students)

Goal: Combine data from multiple classes and enrich it with course-level details.

Steps:

  • Imported Year 11 and Year 12 student data, and added custom “Year” columns
  • Appended both tables to create a single student dataset
  • Merged with Course table (teacher, credits, assessment type) on Course field using Left Outer Join.
  • Expanded course details (teacher, credits, assessment type).  
4  StudentCourse_Append_Merge

 

Outcome: Unified student-course dataset ready for analytics and academic reporting.

Skills: Append queries, merge queries, relational data integration, reference queries, connection only.

 

E) Filtering Female Students on ≥3 Courses (Sheet Name - Female Students)

Goal: Identify and analyze high-performing female students across multiple courses.

Steps:

  • Created a reference query from the unified student dataset
  • Filtered Gender = Female
  • Applied filter for Credits ≥ 3
  • Customized load behavior (set support tables as connection-only)  
5  FemaleStudents_Filter  

Outcome: A filtered, query-based insight table highlighting targeted academic groups.

Skills: Query referencing, load optimization, conditional filtering.  

DELIVERABLES

  • Excel Workbook: PowerQuery_ETL_Projects.xlsx (contains all project sheets)
  • Transformation Screenshots: Embedded in README below each project section.  

KEY HIGHLIGHTS

  • Combined 5 distinct Power Query projects demonstrating versatile ETL capabilities
  • Used Append, Merge, Unpivot, Custom Columns, Replace Values, and Connection-only Queries
  • Optimized query groups and loading behavior for performance
  • Demonstrated both one-time cleaning and refresh-based automation workflows
  • Showcases proficiency across data preparation, transformation, and integration  

TECH STACK

Tool Purpose
Microsoft Excel (Power Query) Data extraction, transformation, and automation
Category Tools / Skills
Data Extraction & Cleaning Excel Power Query, ETL Techniques
Data Modeling Power Pivot, Data Relationships
Transformation Logic Merge Queries, Append Queries, Conditional Columns
Automation & Formulas Dynamic Ranges, Named Tables
Data Analysis Pivot Tables, Aggregation, Filtering
Documentation Excel Comments, Step Annotations
Conceptual Understanding SQL Joins (theory), Data Integrity Checks
Data Sources Excel files, Web HTML tables
Output Clean Excel datasets and summary pivot reports

 

OUTCOME

  • This project demonstrates hands-on data preparation proficiency using Power Query - efficiently cleaning, merging, and automating data pipelines.
  • It reflects analytical and technical ability to transform raw, unstructured data into structured, decision-ready outputs for analysis and reporting.

 

FUTURE ENHANCEMENTS

  • Link outputs to Power BI dashboards for visualization.
  • Integrate data refresh automation with OneDrive or Power Automate.
  • Add error-handling and dynamic parameterization for folders and URLs.  

HOW TO USE

  • Clone or download this repository.
  • Open PowerQuery_ETL_Projects.xlsx in Excel (ensure Power Query is enabled).
  • Go to Data → Queries & Connections → Refresh All to reload transformations.
  • Open Power Query Editor to explore applied steps for each sheet.  

REPOSITORY STRUCTURE

PowerQuery_Project/
│
├── PowerQuery_ETL_Projects.xlsx      ➜    Final Consolidated workbook with all project outputs
│
├── Raw_Files/
│   ├── M01.xlsx                      ➜ Employee Data Raw
│   ├── students and courses.xlsx     ➜ Student and Course Data Raw
│   ├── Consolidation_Files/          ➜ Projects for Consolidation Raw
│   │   ├── Project-alpha.xlsx
│   │   ├── Project-beta.xlsx
│   │   ├── Project-something.xlsx
│   │   ├── Project-something-else.xlsx
│   │   └── Project-new.xlsx
│
├── Screenshots/
│   ├── 1. EmployeeData_AppliedSteps.png
│   ├── 2. WebScraping_Unpivot.png
│   ├── 3. FolderConsolidation_Transform.png
│   ├── 4. StudentCourse_Append_Merge.png
│   └── 5. FemaleStudents_Filter.png
│
└── README.md

Designed and documented by Muskan Tayal

About

A collection of real-world ETL projects using Microsoft Power Query for Excel - demonstrating data cleaning, transformation, and automation workflows across diverse datasets.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published