This repository contains an end-to-end implementation of a Digital Twin for a ball bearing manufacturing process. A digital twin is a virtual representation of a physical asset that synchronizes with real-world sensor data in real time, enabling monitoring, analysis, and optimization of industrial processes.
The solution demonstrates how to build a complete digital twin system using Databricks, incorporating IoT data ingestion, semantic data modeling with RDF (Resource Description Framework), real-time synchronization, and interactive visualization via Databricks Apps.
This accelerator shows you how to:
- Ingest IoT sensor data from manufacturing equipment using Databricks Zerobus Ingest
- Transform relational data into semantic triples using RDF/OWL ontologies
- Map sensor readings to digital twin models with declarative pipelines
- Synchronize latest sensor values to Lakebase for low-latency queries
- Visualize the digital twin as an interactive knowledge graph
- Monitor production lines with real-time dashboards
While this example focuses on manufacturing, the architecture and approach are reusable for many other scenarios including:
- Smart buildings and facilities
- Energy grid monitoring
- Supply chain optimization
- Healthcare equipment tracking
- Fleet management
The solution leverages several Databricks technologies:
- Databricks Zerobus - High-throughput IoT data ingestion
- Delta Lake & Unity Catalog - Unified data storage and governance
- Lakeflow Declarative Pipelines - Data transformation and mapping
- Lakebase - Real-time data synchronization
- Databricks Apps - Interactive visualization and monitoring
The fastest way to deploy this solution is using Databricks Asset Bundles, which automates the entire setup process.
- Databricks workspace (AWS, Azure, or GCP)
- Databricks CLI installed and configured (installation guide)
- Unity Catalog enabled in your workspace
- SQL Warehouse for serving queries
- Clone this repository to your local machine:

      git clone https://github.com/databricks-industry-solutions/digital-twin.git
      cd digital-twin

- Configure your parameters in the `0-Parameters` notebook:

  Open `0-Parameters.ipynb` and update all values to match your workspace environment (table locations, URLs, credentials, etc.).

  Important: These parameters must be configured before running the asset bundles or individual notebooks.

- Deploy and run the accelerator:

      databricks bundle deploy
      databricks bundle run setup_solution_accelerator
The bundle will automatically:
- Create all required Delta tables
- Set up Zerobus ingestion endpoints
- Deploy the mapping pipeline
- Configure Lakebase synchronization
- Launch the visualization app
Important! Ingesting data via Zerobus Ingest is not strictly necessary to successfully run this Solution Accelerator. If you don't have access to Zerobus, you can skip this task by following these steps:
- Comment out or remove the entire task whose `task_key` is `"ingest_data"` in the `databricks.yml` file.
- Modify the task whose `task_key` is `"setup_mapping_pipeline"` so that it depends on `"create_bronze_table"` instead of `"ingest_data"`.
When you're finished, remove all assets created by the accelerator:

    databricks bundle run teardown_solution_accelerator
    databricks bundle destroy

If you prefer to run each step individually, follow this notebook-by-notebook approach.
Before you begin: Configure the `0-Parameters` notebook with your workspace-specific settings. All other notebooks depend on these parameters.
The setup process is divided into several notebooks that illustrate each component of the solution:
Start here! This notebook contains all the configuration parameters required for the entire accelerator. You must customize these settings for your environment before deploying the solution.
All subsequent notebooks reference these parameters using `%run ./0-Parameters`, ensuring consistent configuration across the entire solution.
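For orientation, a parameters cell usually amounts to a handful of assignments like the sketch below. The variable names and values here are hypothetical placeholders, not the exact names defined in `0-Parameters.ipynb`:

```python
# Hypothetical parameter names/values -- adapt to whatever 0-Parameters.ipynb
# actually defines for your workspace.
CATALOG = "main"                                      # Unity Catalog catalog
SCHEMA = "digital_twin"                               # schema for accelerator tables
BRONZE_TABLE = f"{CATALOG}.{SCHEMA}.sensor_bronze"    # raw IoT telemetry
TRIPLES_TABLE = f"{CATALOG}.{SCHEMA}.twin_triples"    # mapped RDF triples
SQL_WAREHOUSE_ID = "<your-sql-warehouse-id>"          # serves the app's queries
WORKSPACE_URL = "https://<your-workspace>.cloud.databricks.com"
```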
Define and create the Delta table where Zerobus will store incoming IoT telemetry. This notebook also includes a data generator to simulate sensor data if you don't have access to real IoT devices or Zerobus.
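As a rough sketch of what this step involves, the snippet below creates a bronze Delta table and appends a few simulated readings using the notebook's built-in `spark` session. The table name and column layout are illustrative assumptions, not the notebook's exact schema:

```python
import random
from datetime import datetime, timezone

# Illustrative table name and schema (assumptions, not the notebook's definitions).
bronze_table = "main.digital_twin.sensor_bronze"

spark.sql(f"""
    CREATE TABLE IF NOT EXISTS {bronze_table} (
        device_id   STRING,
        sensor_type STRING,
        value       DOUBLE,
        event_time  TIMESTAMP
    ) USING DELTA
""")

# Simulate a handful of readings in place of real IoT devices or Zerobus.
now = datetime.now(timezone.utc)
rows = [
    (
        f"grinder-{i % 3}",
        random.choice(["vibration_mm_s", "temperature_c"]),
        round(random.uniform(20.0, 80.0), 2),
        now,
    )
    for i in range(10)
]
schema = "device_id STRING, sensor_type STRING, value DOUBLE, event_time TIMESTAMP"
spark.createDataFrame(rows, schema).write.mode("append").saveAsTable(bronze_table)
```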
Set up the Zerobus endpoint and connect it to the bronze table. This notebook demonstrates how to write data to the Zerobus API (in production, this would be done by the IoT devices themselves).
Important! Ingesting data via Zerobus Ingest is not strictly necessary to run this Solution Accelerator successfully. If you don't have access to Zerobus, skip this notebook and move on to the next one. If you are deploying via Databricks Asset Bundles, you can skip this task by following these steps:
- Comment out or remove the entire task whose `task_key` is `"ingest_data"` in the `databricks.yml` file.
- Modify the task whose `task_key` is `"setup_mapping_pipeline"` so that it depends on `"create_bronze_table"` instead of `"ingest_data"`.
Convert incoming sensor data into timestamped RDF triples that are compatible with the digital twin ontology. This notebook uses Lakeflow Declarative Pipelines with the `spark-r2r` library to perform the semantic mapping. The result is a Delta Lake table containing RDF triples ready for querying and visualization.
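The actual mapping is driven declaratively by `spark-r2r` and the ontology, but conceptually each sensor row becomes one or more subject/predicate/object triples. A simplified PySpark sketch of that output shape (a made-up namespace and the illustrative table/column names from the earlier sketch, not the `spark-r2r` API) might look like this:

```python
from pyspark.sql import functions as F

# Conceptual sketch only -- the real mapping is defined declaratively with spark-r2r.
EX = "http://example.org/twin/"  # made-up namespace for illustration

bronze = spark.table("main.digital_twin.sensor_bronze")

triples = (
    bronze
    .withColumn("subject", F.concat(F.lit(EX + "sensor/"), F.col("device_id")))
    .select(
        "subject",
        F.lit(EX + "hasValue").alias("predicate"),
        F.col("value").cast("string").alias("object"),
        F.col("event_time").alias("observed_at"),
    )
)

triples.write.mode("append").saveAsTable("main.digital_twin.twin_triples")
```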
Enhance query performance by serving the latest sensor readings from Lakebase. The synced table automatically maintains the most recent value from each sensor based on timestamps, providing sub-second query latency.
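To make the "latest value per sensor" semantics concrete, the PySpark expression below computes the same result over the bronze telemetry (same illustrative table and column names as above); in the accelerator this is maintained automatically by the synced table rather than recomputed like this:

```python
from pyspark.sql import functions as F
from pyspark.sql.window import Window

bronze = spark.table("main.digital_twin.sensor_bronze")

# Rank readings per device/sensor pair from newest to oldest, then keep the newest.
newest_first = Window.partitionBy("device_id", "sensor_type").orderBy(F.col("event_time").desc())

latest_readings = (
    bronze
    .withColumn("rn", F.row_number().over(newest_first))
    .filter("rn = 1")
    .drop("rn")
)

latest_readings.show()
```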
Deploy a Databricks App that serves the triple data and displays the digital twin model as an interactive knowledge graph (a small graph-construction sketch follows the list below). The app provides:
- Real-time sensor value displays
- Interactive graph visualization of the twin model
- Filtering and navigation capabilities
- Historical trend analysis
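The graph view is essentially a rendering of the triples table. As a minimal standalone sketch (not the app's actual code, and reusing the illustrative table name from earlier), the triples can be loaded into a `networkx` graph like this:

```python
import networkx as nx

# Pull a sample of triples and build a directed graph: subject --predicate--> object.
rows = spark.table("main.digital_twin.twin_triples").limit(1000).collect()

graph = nx.DiGraph()
for row in rows:
    graph.add_edge(row["subject"], row["object"], predicate=row["predicate"])

print(f"{graph.number_of_nodes()} nodes, {graph.number_of_edges()} edges")
```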
Remove all resources created by the solution accelerator for a clean slate.
The solution includes a custom `line_data_generator` library that simulates realistic sensor data from a ball bearing production line (a simplified sketch of this kind of simulator follows the list below). This is useful for:
- Testing the solution without real IoT hardware
- Generating training data
- Demonstrating the system to stakeholders
- Load testing the pipeline
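The snippet below is a minimal, self-contained sketch of what such a simulator can look like; it is not the API of the bundled `line_data_generator` library, just an illustration of emitting plausible readings:

```python
import random
import time
from datetime import datetime, timezone


def simulate_reading(station: str) -> dict:
    """Produce one plausible sensor reading for a production-line station."""
    return {
        "device_id": station,
        "sensor_type": random.choice(["vibration_mm_s", "temperature_c", "spindle_rpm"]),
        "value": round(random.gauss(50.0, 10.0), 2),
        "event_time": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    # Emit one reading per station per second; in the accelerator these records
    # land in the bronze table (directly or via Zerobus Ingest).
    stations = ["grinder-0", "grinder-1", "lapping-0"]
    for _ in range(5):
        for station in stations:
            print(simulate_reading(station))
        time.sleep(1)
```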
The cost of running this accelerator depends on:
- Cluster/warehouse compute time
- Data storage volume in Delta Lake
- Lakebase instance size and uptime
- App serving hours
It is the user's responsibility to monitor and manage associated costs. Consider starting with smaller configurations and scaling as needed.
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
For security concerns, please review SECURITY.md.
© 2025 Databricks, Inc. All rights reserved.
The source in this notebook is provided subject to the Databricks License. All included or referenced third-party libraries are subject to their respective licenses.
For questions or issues:
- Check the Databricks documentation
- Open an issue in this repository
- Contact Databricks support if you're a customer
Ready to build your digital twin? Start by cloning this repo and running the Quick Start guide above!

