This repository contains learning resources from a live hands-on workshop focused on Data Pre-Processing and Visualization, two critical steps performed before applying Machine Learning algorithms.
The session combined conceptual understanding, practical implementation, and interactive Q&A to help students work with real-world data confidently.
In this workshop, we explored how raw, unclean data is transformed into clean, meaningful data using preprocessing techniques and visualization.
Participants learned why these steps are necessary, how to apply them, and how to choose the right preprocessing step for a given dataset.
Delivered LIVE on Zoom: 14 December 2025
Audience: University students & beginners in Machine Learning
- Conceptual explanation of:
  - What data is and why preprocessing is required
  - Common data issues (missing values, outliers, categorical data)
  - Feature scaling and train-test split
  - Importance of data visualization in ML
- Beginner-friendly explanations with real-world analogies
- Used during the live workshop session
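
To make the feature scaling and train-test split topics concrete, here is a minimal sketch in plain pandas. The toy data and the column names (`heart_rate`, `step_count`) are assumptions for illustration and are not taken from the workshop material. The point it tries to show is the order of operations: split first, then learn the scaling parameters from the training rows only.

```python
import pandas as pd

# Hypothetical toy data; column names are illustrative assumptions.
df = pd.DataFrame({
    "heart_rate": [62, 75, 88, 70, 95, 66, 81, 73, 90, 68],
    "step_count": [1200, 8500, 4300, 10200, 560, 7600, 3900, 6100, 9800, 2500],
})

# 1. Split first: keep 80% of the rows for training, hold out the rest.
train = df.sample(frac=0.8, random_state=42)
test = df.drop(train.index)

# 2. Learn the scaling parameters from the training rows only,
#    so no information from the test set leaks into preprocessing.
mean = train.mean()
std = train.std(ddof=0)

# 3. Apply the same parameters to both splits: (x - mean) / std.
train_scaled = (train - mean) / std
test_scaled = (test - mean) / std

print(train_scaled.round(2))
print(test_scaled.round(2))
```

In practice, scikit-learn's `train_test_split` and `StandardScaler` are commonly used for the same two steps; the pandas-only version above is just meant to keep the arithmetic visible.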
- End-to-end implementation of:
  - Loading and inspecting raw data
  - Handling missing values and duplicates
  - Encoding categorical features correctly
  - Feature scaling
  - Visualizing data using histograms, box plots, and heatmaps
- Includes step-by-step explanations and reasoning
- Designed for live demonstration and self-practice
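
The steps listed above can be sketched roughly as follows. This is not the workshop notebook itself: the small in-memory table and its column names (`user_id`, `heart_rate`, `activity_level`) are hypothetical, and median/mode imputation and min-max scaling are just one reasonable set of choices among several.

```python
import numpy as np
import pandas as pd

# Hypothetical raw records; column names and values are assumptions.
raw = pd.DataFrame({
    "user_id": [1, 2, 2, 3, 4, 5],
    "heart_rate": [72.0, 80.0, 80.0, np.nan, 65.0, 79.0],
    "activity_level": ["Active", "Sedentary", "Sedentary",
                       "Active", None, "Highly Active"],
})

# 1. Inspect: size of the table and missing values per column.
print(raw.shape)
print(raw.isna().sum())

# 2. Remove exact duplicate rows.
clean = raw.drop_duplicates().copy()

# 3. Fill missing values: median for numeric, most frequent value for categorical.
clean["heart_rate"] = clean["heart_rate"].fillna(clean["heart_rate"].median())
clean["activity_level"] = clean["activity_level"].fillna(
    clean["activity_level"].mode()[0]
)

# 4. One-hot encode the categorical column.
encoded = pd.get_dummies(clean, columns=["activity_level"], dtype=int)

# 5. Min-max scale heart_rate into the range [0, 1].
hr = encoded["heart_rate"]
encoded["heart_rate_scaled"] = (hr - hr.min()) / (hr.max() - hr.min())

print(encoded)
```

For brevity the scaling here uses the whole table; when a model is trained afterwards, the minimum and maximum would normally be computed on the training split only.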
Notebooks:
1. Working on Unclean Smart Watch Records
2. Student Performance Record
- Smartwatch health dataset used during the workshop
- Intentionally unclean to simulate real-world scenarios
- Used to demonstrate:
  - Data quality issues
  - Visualization-driven preprocessing decisions
  - Difference between raw vs cleaned data
📁 Dataset file:
unclean_smartwatch_health_data.csv
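
As an illustration of a visualization-driven decision and of the raw vs cleaned comparison, the sketch below applies the rule a box plot uses for its whiskers (1.5 × IQR beyond the quartiles) to flag outliers. The readings are made-up values, not rows from the dataset file, and `plot_raw_vs_cleaned` is a hypothetical helper name. Dropping the flagged points is only one option; capping or investigating them may be more appropriate depending on the data.

```python
import pandas as pd

# Hypothetical heart-rate readings with one implausible value (assumed data).
raw = pd.Series([68, 72, 75, 70, 74, 71, 69, 250, 73, 76], name="heart_rate")

# Box-plot rule: values beyond 1.5 * IQR from the quartiles count as outliers.
q1, q3 = raw.quantile(0.25), raw.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = raw[(raw < lower) | (raw > upper)]
cleaned = raw[(raw >= lower) & (raw <= upper)]

# Compare summary statistics before and after removing the flagged values.
summary = pd.DataFrame({"raw": raw.describe(), "cleaned": cleaned.describe()})
print(outliers.tolist())
print(summary.round(2))


def plot_raw_vs_cleaned(raw, cleaned, path="raw_vs_cleaned.png"):
    """Save side-by-side box plots; needs matplotlib (imported lazily)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display required
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, 2, figsize=(8, 3))
    axes[0].boxplot(raw.to_numpy())
    axes[0].set_title("Raw")
    axes[1].boxplot(cleaned.to_numpy())
    axes[1].set_title("Cleaned")
    fig.savefig(path)
    plt.close(fig)
```

Calling `plot_raw_vs_cleaned(raw, cleaned)` writes the two box plots to an image file, which makes the effect of the single extreme reading on the scale of the raw plot easy to see.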
By using these resources, learners will be able to:
- Understand why preprocessing is essential before ML
- Identify and fix common data quality problems
- Use visualization to guide preprocessing decisions
- Prepare real-world data for machine learning models
Thanks to everyone who joined the live session and actively participated in the Q&A.
Your engagement made the workshop interactive and impactful!
If you find this repository helpful:
- Star the repo
- Share it with others learning Machine Learning
- Feel free to raise issues or suggestions
Happy Learning 🚀