API-first approach to scan for notebooks

Press/Media

Description

Databricks provides a robust set of APIs that enable programmatic management of accounts and workspaces. For this solution, we use the Workspace API to programmatically list and export the notebooks and folders in our workspace.
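As a rough illustration of the listing step, the sketch below walks a workspace tree and yields notebook paths. It assumes the Workspace API's `GET /api/2.0/workspace/list` endpoint, which returns an `objects` array whose entries carry `object_type` and `path`. The helper names (`make_list_fn`, `iter_notebooks`) and the `host` and `token` arguments are hypothetical and are not taken from the original post.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator

# A function that takes a workspace path and returns the objects found there.
ListFn = Callable[[str], list]


def make_list_fn(host: str, token: str) -> ListFn:
    """Build a lister backed by GET /api/2.0/workspace/list.

    `host` (e.g. the workspace URL) and `token` are supplied by the caller;
    both are placeholders in this sketch.
    """
    def list_path(path: str) -> list:
        query = urllib.parse.urlencode({"path": path})
        request = urllib.request.Request(
            f"{host}/api/2.0/workspace/list?{query}",
            headers={"Authorization": f"Bearer {token}"},
        )
        with urllib.request.urlopen(request) as response:
            # An empty folder comes back without an "objects" key.
            return json.load(response).get("objects", [])
    return list_path


def iter_notebooks(list_path: ListFn, root: str = "/") -> Iterator[str]:
    """Yield the path of every notebook under `root`, descending into folders."""
    pending = [root]
    while pending:
        for obj in list_path(pending.pop()):
            if obj["object_type"] == "DIRECTORY":
                pending.append(obj["path"])
            elif obj["object_type"] == "NOTEBOOK":
                yield obj["path"]
```

Because the lister is passed in as a function, the traversal can be exercised against an in-memory dictionary before it is pointed at a real workspace, for example `iter_notebooks(lambda p: fake_tree.get(p, []))`.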

We also parallelize the API calls to speed up the cataloging process, and make the parallelism configurable so that it stays within the Databricks rate limit of 30 requests per second. To avoid the “429: Too Many Requests” error, we have implemented a retry mechanism with exponential backoff, inspired by the Delta Sharing Apache Spark™ connector.
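The combination of parallel calls, a request-rate cap, and exponential retries can be sketched generically as follows. This is not the code from the original post or from the Delta Sharing connector: `TooManyRequests` is a hypothetical stand-in for an HTTP 429 response, the names `RateLimiter`, `call_with_retry`, and `fetch_all` are invented for illustration, and the random jitter added to each delay is an assumption of this sketch.

```python
import random
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class TooManyRequests(Exception):
    """Hypothetical stand-in for an HTTP 429 response."""


class RateLimiter:
    """Spaces out call starts so that at most `rate` begin per second."""

    def __init__(self, rate: float):
        self._interval = 1.0 / rate
        self._lock = threading.Lock()
        self._next_slot = time.monotonic()

    def wait(self) -> None:
        # Reserve the next free time slot under the lock, then sleep outside it.
        with self._lock:
            now = time.monotonic()
            slot = max(self._next_slot, now)
            self._next_slot = slot + self._interval
        if slot > now:
            time.sleep(slot - now)


def call_with_retry(fn, arg, limiter, max_retries=5, base_delay=0.5):
    """Call fn(arg), retrying on TooManyRequests with exponential backoff."""
    for attempt in range(max_retries + 1):
        limiter.wait()
        try:
            return fn(arg)
        except TooManyRequests:
            if attempt == max_retries:
                raise
            # Delay doubles each attempt; jitter (an assumption here) keeps
            # workers from retrying in lockstep.
            time.sleep(base_delay * 2 ** attempt * (1 + random.random()))


def fetch_all(fn, args, rate=30, workers=8, **retry_options):
    """Apply fn to every arg in parallel, returning results in input order."""
    limiter = RateLimiter(rate)

    def task(arg):
        return call_with_retry(fn, arg, limiter, **retry_options)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(task, args))
```

The default `rate=30` mirrors the limit quoted above, and `workers` is the configurable degree of parallelism. Retries pass through the same limiter as first attempts, so they also count toward the cap.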

Period: Aug 21 2022

Media contributions

  • Title: Databricks blog
    Country/Territory: United States
    Date: 08/21/22
    Persons: Darin McBeath