Using Databricks Asset Bundles for Client Projects

playbook
databricks
IaC
Author

David Mautz

Published

August 25, 2025

Introduction

Databricks Asset Bundles (DABs) are an Infrastructure-as-Code tool for managing Databricks resources, such as notebooks, jobs, Delta Live Tables, and clusters, using YAML configurations. They enable standardized, automated workflows for client projects, improving efficiency, consistency, and compliance. This post outlines when to use DABs, how to implement them, and best practices.

When to Use DABs

DABs are suited for:

  • Projects with multiple teams needing standardized configurations.
  • Deployments across development, staging, and production environments.
  • Regulated industries requiring versioned configurations for compliance.
  • Workflows needing CI/CD automation for rapid delivery.
  • Projects with recurring patterns where templates reduce setup time.

How to Implement DABs

  1. Install Databricks CLI: Follow the installation instructions for your environment. Then set up authentication by running databricks auth login --host dbc-464ba720-0425.cloud.databricks.com and following the browser-based instructions. This creates a configuration profile you can reuse later; to use a profile, pass it with the -p flag: databricks clusters list -p <profile-name>

  2. Initialize Project: Synaptiq has a custom template (Synaptiq Bundle Template) to use when creating bundles for client projects; you can also use the default built-in templates. Run databricks bundle init synaptiq-dab-template to create a project scaffold with a databricks.yml file and directories for notebooks, code, jobs, and pipelines.

  3. Define Resources: Specify jobs or pipelines in databricks.yml. For example, a job might reference a notebook at ./notebooks/ingest.py.

    bundle:
      name: <project-name>
    resources:
      jobs:
        ingestion:
          tasks:
            - task_key: ingest
              notebook_task:
                notebook_path: ./notebooks/ingest.py

    You can also split resources into separate YAML files and include them:

    bundle:
      name: <project-name>
    include:
      - resources/jobs/*.yml

    See Databricks Asset Bundle Configuration for the full YAML specification.
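As a sketch, an included file under resources/jobs/ might look like the following (the file name and job key are illustrative; note that relative paths in bundle YAML are resolved relative to the file that declares them):

```yaml
# resources/jobs/ingestion_job.yml  (hypothetical file name)
resources:
  jobs:
    ingestion:
      name: ingestion
      tasks:
        - task_key: ingest
          notebook_task:
            # path is relative to this file, not to databricks.yml
            notebook_path: ../../notebooks/ingest.py
```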

  4. Validate and Deploy: Validate with databricks bundle validate. Deploy to a target workspace with databricks bundle deploy -t dev.
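The -t flag selects a deployment target defined in databricks.yml. A minimal sketch of a dev/prod targets block (the mode settings are real bundle options; the host reuses the workspace URL from step 1):

```yaml
targets:
  dev:
    # development mode prefixes deployed resource names with your
    # username and pauses job schedules, keeping dev runs isolated
    mode: development
    default: true
    workspace:
      host: https://dbc-464ba720-0425.cloud.databricks.com
  prod:
    mode: production
    workspace:
      host: https://dbc-464ba720-0425.cloud.databricks.com
```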

  5. Execute and Monitor: Run workflows with databricks bundle run -t dev <job-or-pipeline-key>. You can also run individual jobs or pipelines from the Databricks web UI.

  6. Customize for Clients: Use variables like ${var.client_catalog} for client-specific settings, such as schemas. See Substitutions and Variables in Databricks Asset Bundles for how to define and use variables in the bundle.
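For example, a client-specific catalog can be declared as a bundle variable with a default value and overridden per target (the variable name matches the text above; the catalog values are illustrative):

```yaml
variables:
  client_catalog:
    description: Unity Catalog catalog for client-specific schemas
    default: acme_dev        # illustrative value

targets:
  prod:
    variables:
      client_catalog: acme_prod   # illustrative override
```

Anywhere else in the bundle configuration, the value is referenced as ${var.client_catalog}.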

Best Practices

  • Project Structure: Use /notebooks for IPython notebooks, /<client_name> for client-specific code, /src for shared utilities, /resources for YAML files, and /tests for testing.
  • Version Control: Store DABs in Git with branches like feature/<project-name>.
  • CI/CD Automation: Automate validation and deployment with GitHub Actions or Azure DevOps.
  • Testing: Use pytest for unit tests and Nutter for integration tests in CI/CD pipelines.
  • Security: Store credentials in secret scopes or environment variables.
  • Templates: Create reusable DAB templates for common workflows, like ETL pipelines.
  • Documentation: Include a README.md with setup and client-specific details.
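As a sketch of the CI/CD bullet above, a minimal GitHub Actions workflow might validate the bundle on pull requests and deploy on merges to main. The workflow file path, secret names, and target are assumptions; databricks/setup-cli is the official action for installing the Databricks CLI:

```yaml
# .github/workflows/bundle.yml  (hypothetical path)
name: databricks-bundle
on:
  pull_request:
  push:
    branches: [main]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main   # installs the Databricks CLI
      - run: databricks bundle validate
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}

  deploy:
    if: github.ref == 'refs/heads/main'
    needs: validate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - run: databricks bundle deploy -t dev
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
```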