The project is an extensive endeavour involving the design and implementation of a data mart, complemented by exploration through various data science techniques and analytical tools. The project is segmented into several phases, each with specific deliverables.
- Teams are tasked with designing an initial conceptual model for an enriched data mart.
- This phase includes defining dimensions, attributes, and key indicators essential for tracking and analyzing trends over time. An example includes tracking financial trends by incorporating dimensions like time, geography, financial products, and customer demographics.
- Teams are required to comprehensively detail all dimensions, measures/facts, and assumptions. They must also create a checklist to ensure avoidance of common design mistakes.
- This phase involves translating the conceptual design into a physical database structure. It includes the selection of suitable database systems and technologies, creation of tables, and establishment of relationships between different data entities.
- Teams engage in data staging, encompassing Extract, Transform, Load (ETL) processes. This involves extracting data from various sources, such as internal databases, external APIs, or open-source datasets. The transformation step ensures data is cleaned, normalized, and formatted correctly, and the load phase involves populating the data mart with this processed data.
- Teams utilize Online Analytical Processing (OLAP) for multi-dimensional data analysis. This includes crafting OLAP queries to perform operations such as drill-down, roll-up, slice, dice, and pivot to analyze data from Page 2 of 2 different perspectives. For example, a team might analyze sales data to compare performance across various regions or time periods. o Additionally, teams are responsible for developing a Business Intelligence (BI) dashboard for data visualization and trend identification. Tools like Power BI, Tableau, or Apache Superset may be used to create interactive dashboards that allow users to explore data through charts, graphs, and maps. A well-constructed dashboard can highlight key performance indicators, uncover historical trends, and aid in decision-making processes.
- The final phase involves applying data mining techniques to discover patterns, classify data, and predict trends. Teams might use algorithms for classification, clustering, or anomaly detection to glean meaningful insights from the data. For instance, a machine learning model could be trained to predict customer churn or to segment customers based on purchasing behavior.
- This phase also includes exploring and summarizing data, preprocessing data, and selecting features for further analysis.
@Lixu4n CELESTE DUGUAY Lixu4n • Git Repository Owner
@Aissatou123 AISSATOU KANGOU DIOME Aissatou123 • Collaborator
@Michel-Ako Michel-Ako • Collaborator