Data Engineering Zoomcamp 2025 Homework Repository. Contains assignments on containerization, workflow orchestration, cloud, data warehouses, analytics engineering, batch processing, and streaming.

valeqm/Data-Engineer-Zoomcamp-Homework

DE Zoomcamp 2025 Homework Repository

Welcome 👋 This is where I've uploaded the assignments I've completed for the DataTalks.Club Data Engineering Zoomcamp 2025 course.

Overview

Module 1: Containerization and Infrastructure as Code

  • Course overview
  • Introduction to GCP
  • Docker and docker-compose
  • Running Postgres locally with Docker
  • Setting up infrastructure on GCP with Terraform
  • Preparing the environment for the course

📄Homework 1: Docker, SQL and Terraform

Module 2: Workflow Orchestration

  • Data Lake
  • Workflow orchestration
  • Workflow orchestration with Kestra

📄Homework 2: Workflow Orchestration

Workshop 1: Data Ingestion

  • Reading from APIs
  • Building scalable pipelines
  • Normalising data
  • Incremental loading

📄Workshop 1: Ingestion with dlt

Module 3: Data Warehouse

  • Data Warehouse
  • BigQuery
  • Partitioning and clustering
  • BigQuery best practices
  • Internals of BigQuery
  • BigQuery Machine Learning

📄Homework 3: Data Warehousing

Module 4: Analytics engineering

  • Basics of analytics engineering
  • dbt (data build tool)
  • BigQuery and dbt
  • Postgres and dbt
  • dbt models
  • Testing and documenting
  • Deployment to the cloud and locally
  • Visualizing the data with Google Data Studio and Metabase

📄Homework 4: Analytics Engineering

Module 5: Batch processing

  • Batch processing
  • What is Spark
  • Spark Dataframes
  • Spark SQL
  • Internals: GroupBy and joins

📄Homework 5: Batch

Module 6: Streaming

  • Introduction to Kafka
  • Schemas (Avro)
  • Kafka Streams
  • Kafka Connect and KSQL

📄Homework 6: Streaming

Project

Putting everything we learned into practice

  • Weeks 1 and 2: working on your project
  • Week 3: reviewing your peers

🔗Link to My Project

📖 ZoomCamp Course

Data Engineering Zoomcamp

Author

Valeria Q.M

LinkedIn | Credly | Google Cloud Skills Boost | GitHub | Reddit
