Databricks Asset Bundles are at the heart of productivity when using the VS Code extension. These bundles encapsulate files (scripts, notebooks), job configurations and dependencies in a single package defined via a databricks.yml file. Thanks to the extension’s deployment command, it is possible to apply CI/CD practices: a bundle can be pushed into a continuous integration pipeline (GitHub Actions, Azure DevOps) and deployed on a test environment before production. This approach improves repeatability and reduces human error.
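To make this concrete, a minimal `databricks.yml` could look like the following sketch. All names and paths here are hypothetical, and cluster settings are omitted for brevity:

```yaml
# Minimal, illustrative bundle definition (names and paths are hypothetical).
bundle:
  name: my_etl_project

resources:
  jobs:
    refresh_delta_table:
      name: refresh-delta-table
      tasks:
        - task_key: main
          spark_python_task:
            python_file: ./src/refresh.py

targets:
  dev:
    default: true
```

A CI/CD pipeline can then deploy this same definition unchanged to each environment, which is what makes the approach repeatable.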
The extension lets you run a local Python file on a remote Databricks cluster. The code then runs in the cluster’s Spark environment, facilitating resource-intensive testing of pipelines or transformations. It is also possible to run a .py file or notebook as a Databricks job from within VS Code. This feature is useful for automating recurring or scheduled processing (e.g., refreshing a Delta table or training a model). The extension manages authentication and the necessary dependencies.
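The same deploy-and-run workflow is also available from a terminal via the Databricks CLI, which is what a CI/CD pipeline would invoke. A hedged sketch, assuming a bundle containing a job resource with the hypothetical key `refresh_delta_table` and a target named `dev`:

```shell
# Validate the bundle definition, deploy it to the dev target, then run the
# job resource as a Databricks job (resource key and target are illustrative).
databricks bundle validate
databricks bundle deploy -t dev
databricks bundle run refresh_delta_table -t dev
```

These commands require an authenticated Databricks CLI and an existing workspace, so they are shown here only to illustrate the flow.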
Another major advantage is the ability to debug cell by cell using Databricks Connect. By configuring an authentication profile, you can run each cell of a notebook in the context of the cluster, inspecting variables and correcting errors in real time. This brings the development experience close to that of purely local development, while still benefiting from the power of the cluster.
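A hedged sketch of what such a debugging session can look like in code. It assumes the `databricks-connect` package is installed and an authentication profile is already configured; the data, column name, and function names are all illustrative:

```python
# Illustrative sketch of cell-by-cell debugging with Databricks Connect.
# Assumes `databricks-connect` is installed and an auth profile is configured.

def label_amount(amount: float, cutoff: float = 100.0) -> str:
    """Pure-Python rule: trivially inspectable in the VS Code debugger."""
    return "high" if amount >= cutoff else "low"

def debug_session():
    # Import inside the function so the sketch stays importable even on a
    # machine where databricks-connect is not installed.
    from databricks.connect import DatabricksSession

    # The session is created against the remote cluster via the profile.
    spark = DatabricksSession.builder.getOrCreate()
    df = spark.createDataFrame([(120.0,), (30.0,)], ["amount"])

    # A breakpoint on the next line lets you inspect `rows` as plain local
    # Row objects, even though the query itself ran on the remote cluster.
    rows = df.collect()
    return [label_amount(r["amount"]) for r in rows]
```

Keeping business rules in small pure-Python functions like `label_amount` makes each "cell" easy to step through locally before the heavier Spark calls run on the cluster.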
Bi-directional synchronization between the local folder and the Databricks workspace ensures that changes are reflected on both sides. Teams can collaborate via Git: when a developer pushes a commit, the CI/CD pipeline triggers deployment of the bundle to the shared environment. This approach maintains consistency between source code and notebooks used in production.
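One possible shape for such a pipeline, sketched as a GitHub Actions workflow. The workflow name, secret names, and target are assumptions, and the `databricks/setup-cli` action is used here for illustration:

```yaml
# Illustrative CI/CD workflow: deploy the bundle on every push to main.
name: deploy-bundle
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - name: Deploy bundle to the shared test target
        run: databricks bundle deploy -t test
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
```

Because the same `databricks.yml` drives both local deployments from VS Code and this automated pipeline, source code and production notebooks stay consistent.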
To make the most of the extension, follow these best practices:
Structure your projects with clear bundles, separating transformation scripts, notebooks and configurations.
Use version control to track changes and facilitate code reviews.
Define distinct environments (development, test, production) and automate deployments via CI/CD pipelines.
Monitor cluster usage to avoid unnecessary costs when running code from VS Code.
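The distinct-environments practice above maps directly onto bundle targets. A hedged sketch of what the `targets` section of a `databricks.yml` might contain (the path and environment names are illustrative):

```yaml
# Illustrative targets section: one target per environment.
targets:
  dev:
    mode: development   # per-user resource prefixes, easy to tear down
    default: true
  test:
    workspace:
      root_path: /Shared/.bundle/test/${bundle.name}
  prod:
    mode: production
```

A CI/CD pipeline then promotes the same bundle from `dev` to `test` to `prod` simply by changing the `-t` target flag at deploy time.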
How do asset bundles work? Asset Bundles are packages defined by a databricks.yml file, which group together code, notebooks and job configurations. They facilitate deployment and reproduction of environments.
Can I run local notebooks on a Databricks cluster? Yes. The extension supports the execution of notebooks or Python/R/Scala/SQL files as Databricks jobs.
What does cell-by-cell debugging offer? It lets you run and test each notebook cell directly on the cluster, inspect the results and quickly correct errors.
How do you integrate this extension into a CI/CD pipeline? By combining Asset Bundles with CI/CD tools (GitHub Actions, Azure DevOps), you can automate the deployment of your bundles to different environments and ensure continuous delivery.