Enterprise-grade Data Warehouse built with SQL Server, Python, and dimensional modeling (Kimball). Includes OLTP modeling, ETL pipelines, SCD Type 2, multi-currency support, and analytical SQL/Python workflows.
This project simulates the analytical backbone of a retail cellphone company.
- Sales performance analysis across stores, brands, models, channels
- RFM segmentation & customer lifecycle insights
- ABC/Pareto analysis for product profitability
- Seller performance tracking (Top / Medium / Low)
- Multi-currency reporting: ARS, USD, EUR, BRL, CNY
- BI-ready datasets for Power BI & Python
This architecture is representative of a real Canadian retail BI system.
OLTP (Normalized 3NF)
β ETL
Data Warehouse (Star Schema)
β
SQL Analytics + Python + Power BI
Operational data model used as the source system for the DW.
Features:
- β 9 normalized tables
- β Referential integrity enforced
- β Real-time operations
- β Source for ETL extraction
Figure 1: Normalized OLTP schema with referential integrity constraints.
Dimensional model optimized for BI, KPIs, aggregations and advanced analysis.
- DimDate β Date dimension with temporal attributes
- DimCustomer β Customer profiles
- DimProduct β Products (brand, model, specs)
- DimStore β Store locations
- DimChannel β Sales channels (Web / Store)
- DimCurrency β Multi-currency support
- DimPaymentMethod β Payment methods
- DimSeller β Sellers with SCD Type 2 versioning
- FactSales β Quantities, revenue, margins, multi-currency metrics
Figure 2: Star schema with fact table and dimension tables optimized for analytical queries.
Daily automated ETL ensuring fresh, consistent analytical data.
- Extract from OLTP
- Transform (business rules, currency conversion)
- SCD Type 2 for sellers (versioning)
- Load DW tables (facts & dimensions)
- DimDate: 10-year temporal attributes (year, month, quarter, weekday)
- DimChannel: Web / Store / Online
- DimCurrency: ARS, USD, EUR, BRL, CNY
Figure 3: ETL pipeline with dimension population and incremental load strategy.
Implemented in SQL and Python.
- YoY / MoM growth rates
- Moving averages
- Running totals
- Trend analysis
- ABC / Pareto analysis
- RFM segmentation (Champions, Loyal, At Risk, Lost)
- Best seller by revenue
- Best store by profit margin
- Top brand/model
- Correct conversion with historical exchange rates
- Aggregation-safe metrics
- Multi-currency reporting (ARS, USD, EUR, BRL, CNY)
retail-sales-data-warehouse-sql-refactored/
βββ 01_base_datos/ # Database creation scripts
βββ 02_oltp/ # OLTP DDL and data load
βββ 03_datawarehouse/ # DW schema definitions
βββ 04_etl/ # ETL initial and incremental
βββ 05_consultas/ # Analytical SQL queries
βββ 06_analisis/ # Jupyter notebooks
βββ 07_validacion/ # Data quality validation
βββ 08_scripts_auxiliares/ # Helper scripts
βββ 09_documentacion/ # Documentation
βββ sql/ # Organized SQL scripts
β βββ ddl/
β βββ dml/
β βββ views/
βββ src/ # Python modules
β βββ etl/
β βββ utils/
βββ notebooks/ # Analysis notebooks
βββ data/ # Raw and processed data
βββ img/ # Architecture diagrams
βββ docs/ # Technical documentation
- SQL Server 2017+ (Developer or Express Edition)
- Python 3.9+
- SSMS or Azure Data Studio (optional)
-- 1. Create databases
:r 01_base_datos/00_creacion_bases.sql
-- 2. OLTP schema
:r 02_oltp/01_ddl_oltp.sql
-- 3. Load sample data
:r 02_oltp/02_carga_oltp.sql
-- 4. DW schema
:r 03_datawarehouse/03_ddl_dw.sql
-- 5. Initial ETL
:r 04_etl/04_etl_dw_inicial.sql-- Daily refresh
:r 04_etl/05_reproceso_diario.sql# Install dependencies
pip install -r requirements.txt
# Launch Jupyter
jupyter notebook notebooks/Notebook_Estadistica_Ventas.ipynb-- Execute any query from 05_consultas/
:r 05_consultas/09_analisis_abc_pareto.sql
:r 05_consultas/10_analisis_rfm.sql| Category | Tools |
|---|---|
| Database | SQL Server 2017+ |
| Languages | T-SQL, Python 3.9+ |
| Python Stack | Pandas, SQLAlchemy, Matplotlib, Seaborn, Plotly |
| BI Tools | Power BI, Jupyter Notebooks |
| IDEs | VS Code, SSMS, Azure Data Studio |
| Version Control | Git, GitHub |
- β OLTP β DW full pipeline with initial and incremental ETL
- β SCD Type 2 for seller dimension versioning
- β Multi-currency support with historical exchange rates
- β Star schema optimized for analytical queries
- β Advanced analytics: ABC/Pareto, RFM, temporal analysis
- β Python integration: Jupyter notebooks with visualization
- β Production-ready: Data quality validation, helper scripts
MIT License β see LICENSE for details.
Ramiro Ottone Villar
Built with β€οΈ for Canadian tech market β showcasing enterprise-grade data engineering and analytics skills.


