README.md (2 additions, 8 deletions)
@@ -16,11 +16,12 @@ This project template demonstrates how to:
 - utilize [pytest package](https://pypi.org/project/pytest/) to run unit tests on transformations.
 - utilize [argparse package](https://pypi.org/project/argparse/) to build a flexible command line interface to start your jobs.
 - utilize [funcy package](https://pypi.org/project/funcy/) to log the execution time of each transformation.
-- utilize [Databricks CLI](https://docs.databricks.com/en/dev-tools/cli/index.html) and (the new!!!) [Databricks Asset Bundles](https://docs.databricks.com/en/dev-tools/bundles/index.html) to package/deploy/run a Python wheel package on Databricks.
+- utilize [Databricks CLI](https://docs.databricks.com/en/dev-tools/cli/index.html) and [Databricks Asset Bundles](https://docs.databricks.com/en/dev-tools/bundles/index.html) to package/deploy/run a Python wheel package on Databricks.
 - utilize [Databricks SDK for Python](https://docs.databricks.com/en/dev-tools/sdk-python.html) to manage workspaces and accounts. This script enables your metastore system tables, which hold [relevant data about billing, usage, lineage, prices, and access](https://www.youtube.com/watch?v=LcRWHzk8Wm4).
 - utilize [Databricks Unity Catalog](https://www.databricks.com/product/unity-catalog) instead of Hive as your data catalog and get data lineage for your tables and columns, plus a simplified permission model for your data, for free.
 - utilize [Databricks Workflows](https://docs.databricks.com/en/workflows/index.html) to execute a DAG and [task parameters](https://docs.databricks.com/en/workflows/jobs/parameter-value-references.html) to share context information between tasks (see the [Task Parameters section](#task-parameters)). You don't need Airflow to manage your DAGs here!
 - utilize [Databricks job clusters](https://docs.databricks.com/en/workflows/jobs/use-compute.html#use-databricks-compute-with-your-jobs) to reduce costs.
+- define clusters on AWS and Azure.
 - execute a CI/CD pipeline with [GitHub Actions](https://docs.github.com/en/actions) after a repo push.

 For a debate about the use of notebooks vs. Python packages, please refer to:
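To make the funcy and argparse bullets in the hunk above concrete, here is a minimal sketch of a task entry point. All names (module, function, CLI flags) are illustrative and are not taken from the template's actual code: the transformation's execution time is logged with funcy's `log_durations`, and argparse supplies the command line interface.

```python
# jobs/example_task.py -- illustrative sketch only; names are not from the template.
import argparse
import logging

from funcy import log_durations

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


@log_durations(logger.info)  # funcy logs how long each call to transform() took
def transform(rows: list) -> list:
    """A toy transformation: keep only the active rows."""
    return [row for row in rows if row.get("active")]


def main() -> None:
    parser = argparse.ArgumentParser(description="Run one transformation task.")
    parser.add_argument("--env", default="dev", help="target environment")
    parser.add_argument("--table", required=True, help="table to transform")
    args = parser.parse_args()

    logger.info("Running transformation for %s in %s", args.table, args.env)
    transform([{"id": 1, "active": True}, {"id": 2, "active": False}])


if __name__ == "__main__":
    main()
```

Run locally with something like `python example_task.py --table my_table --env dev` (again, hypothetical names) to see the argparse interface and the duration log line produced by funcy.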
@@ -98,13 +99,6 @@ Update "job_clusters" properties on wf_template.yml file. There are different pr
 Configure [GitHub Actions repository secrets](https://docs.github.com/en/actions/security-guides/using-secrets-in-github-actions) DATABRICKS_HOST and DATABRICKS_TOKEN.

-### 5) enable system tables on Catalog Explorer
-
-python sdk_system_tables.py
-
 ... and now you can code the transformations for each task and run unit and integration tests.
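The removed step invoked `sdk_system_tables.py`; per the SDK bullet earlier, that script enables the metastore system tables. Below is a rough sketch of what such a script might look like with the Databricks SDK for Python. The schema list is illustrative, the service and method names follow the SDK's Unity Catalog system-schemas API as I understand it, and authentication is assumed to come from the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables (the same values stored as GitHub Actions secrets), which the SDK's default credential chain reads.

```python
# sdk_system_tables.py -- rough sketch, not the template's exact script.
# Assumes DATABRICKS_HOST and DATABRICKS_TOKEN are set in the environment.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Metastore the current workspace is assigned to.
metastore_id = w.metastores.current().metastore_id

# Illustrative subset of system schemas (billing, lineage, access, ...).
for schema_name in ["access", "billing", "lineage"]:
    try:
        w.system_schemas.enable(metastore_id=metastore_id, schema_name=schema_name)
        print(f"enabled system schema: {schema_name}")
    except Exception as exc:  # already enabled, or not available on this metastore
        print(f"skipped {schema_name}: {exc}")
```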
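To illustrate the closing note (and the pytest bullet at the top), here is a minimal unit-test sketch for a transformation. The transformation itself is hypothetical and defined inline so the test stays self-contained; it only needs pytest and pyspark installed locally.

```python
# tests/test_transformations.py -- minimal pytest sketch with hypothetical names.
import pytest
from pyspark.sql import SparkSession
import pyspark.sql.functions as F


def add_ingestion_date(df):
    """Hypothetical transformation: append an ingestion timestamp column."""
    return df.withColumn("ingestion_date", F.current_timestamp())


@pytest.fixture(scope="session")
def spark():
    # Local SparkSession so the unit test runs without a Databricks cluster.
    return SparkSession.builder.master("local[1]").appName("unit-tests").getOrCreate()


def test_add_ingestion_date_adds_column(spark):
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
    result = add_ingestion_date(df)
    assert "ingestion_date" in result.columns
    assert result.count() == 2
```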