Data Forge — a modern data stack playground to practice flows and best practices, not just tools. Spark, Trino, Kafka, Iceberg, ClickHouse, Airflow, MinIO, Superset — all wired together locally with Docker Compose.
Building a modern data warehouse with Microsoft SQL Server, including ETL processes across the Bronze, Silver, and Gold layers, data modeling, and analytics.
A production-style ELT data pipeline built on Databricks using Auto Loader, Delta Live Tables (DLT), CDC-based Silver models, and dynamic Gold-layer dimension & fact tables designed for analytics and dbt-based transformations.
📚 An End-to-End Advanced SQL Project covering Data Warehousing, ETL Pipeline (Bronze → Silver → Gold), Star Schema Modeling, EDA, and Advanced SQL Analytics. Built using PostgreSQL, this project simulates a real-world Data Engineering + Data Analytics workflow using raw ERP & CRM data to generate production-ready customer and product insights.
Data Engineering portfolio showcasing batch pipelines, Data Lake architecture (Bronze/Silver/Gold), robust ingestion, analytical modeling and production-oriented practices.
A complete SQL-based data warehousing project implementing the Medallion Architecture (Bronze, Silver, Gold layers) for structured data ingestion, cleaning, and transformation. The project covers schema creation, stored procedures, data validation scripts, and analytical model design for business-ready reporting.
Data engineering project built with Kedro that implements a complete ETL pipeline for a restaurant dataset. The project follows the Medallion Architecture (Bronze, Silver, Gold) — ingesting raw data, transforming it into analytical tables, and producing business insights such as average order value and ticket counts per order.
Data engineering pipeline (Databricks Free Edition) for the SCANIA Component X Dataset: ingestion via Volumes, Delta Lake, and Medallion architecture (Bronze→Silver→Gold), star schema modeling, and dashboards/SQL for predictive maintenance.
End-to-end Azure Databricks retail data engineering project using Medallion Architecture (Bronze, Silver, Gold). Implements Auto Loader, Unity Catalog, Delta Lake, SCD Type 1 & 2 dimensions, and Fact Orders for analytics-ready star schema modeling.
End-to-end data engineering project using Microsoft SQL Server. Implements a layered data warehouse (Bronze, Silver, Gold) with SQL-based ETL, data cleaning, and star schema modeling from CRM and ERP sources, optimized for analytics and reporting.
Quant-AI Predictive Platform 🚀: Bitcoin price prediction at T+10 min with real-time Binance ingestion, PySpark, Medallion Bronze/Silver, time-series ML, and a secured REST API (FastAPI + JWT). 📈💡
End-to-end workforce intelligence project built on ~1.3M LinkedIn job postings from 2024. Implements a Bronze–Silver–Gold architecture in Databricks to extract confidence-aware role and skill demand signals. Focuses on skill penetration, persistence, role skill structure, geographic variation, and data coverage to support transparent labor market analysis.
Production-style PySpark data pipeline implementing a Bronze–Silver–Gold architecture to transform raw e-commerce event data into analytics-ready business metrics.
Production-style data pipeline built with AWS MWAA (Apache Airflow) and Snowflake, implementing a Bronze–Silver–Gold architecture. Orchestrates parallel ingestion of healthcare CSV data from S3 into Snowflake using external stages and stored procedures, with full AWS IAM integration and execution verification.
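The Bronze, Silver, Gold layering that recurs across the projects above can be illustrated with a minimal sketch in plain Python. It is not taken from any of the listed repositories: the sample records, field names, and the functions `to_bronze`, `to_silver`, and `to_gold` are all hypothetical, and a real pipeline would typically use Spark, SQL, or a similar engine instead of in-memory lists.

```python
from collections import defaultdict

# Hypothetical raw order records, as they might arrive from a source system.
RAW_ORDERS = [
    {"order_id": "1", "customer": " alice ", "amount": "20.50"},
    {"order_id": "2", "customer": "BOB", "amount": "10.00"},
    {"order_id": "2", "customer": "BOB", "amount": "10.00"},  # duplicate row
    {"order_id": "3", "customer": "alice", "amount": "n/a"},  # unparseable amount
    {"order_id": "4", "customer": "Alice", "amount": "4.50"},
]


def to_bronze(raw):
    """Bronze: store records exactly as ingested, without cleaning."""
    return [dict(record) for record in raw]


def to_silver(bronze):
    """Silver: cast types, normalize names, drop invalid rows and duplicates."""
    seen_ids = set()
    cleaned = []
    for record in bronze:
        try:
            amount = float(record["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        if record["order_id"] in seen_ids:
            continue  # keep only the first occurrence of each order
        seen_ids.add(record["order_id"])
        cleaned.append(
            {
                "order_id": int(record["order_id"]),
                "customer": record["customer"].strip().lower(),
                "amount": amount,
            }
        )
    return cleaned


def to_gold(silver):
    """Gold: aggregate into a business-level metric (total spend per customer)."""
    totals = defaultdict(float)
    for record in silver:
        totals[record["customer"]] += record["amount"]
    return dict(totals)


silver = to_silver(to_bronze(RAW_ORDERS))
gold = to_gold(silver)
print(gold)
```

Each layer reads only from the one before it, so the raw Bronze copy stays available for reprocessing if the Silver cleaning rules change.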