The Wayback Machine - https://web.archive.org/web/20230422132536/https://github.com/databrickslabs
@databrickslabs

Databricks Labs

Labs projects to accelerate use cases on the Databricks Unified Analytics Platform

Pinned

  1. dolly Public

    Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform

    Python 8.9k 874

  2. dbx Public

    🧱 Databricks CLI eXtensions (aka dbx) is a CLI tool for development and advanced management of Databricks workflows.

    Python 320 101

  3. mosaic Public

    An extension to the Apache Spark framework that allows easy and fast processing of very large geospatial datasets.

    Scala 177 39

Repositories

  • dolly Public

    Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform

    Python 8,904 Apache-2.0 874 4 0 Updated Apr 21, 2023
  • mosaic Public

    An extension to the Apache Spark framework that allows easy and fast processing of very large geospatial datasets.

    Scala 177 39 21 11 Updated Apr 21, 2023
  • dbldatagen Public

    Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) can be used to generate large simulated or synthetic data sets for tests, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.

    Python 147 31 4 4 Updated Apr 21, 2023
  • overwatch Public

    Capture deep metrics on one or all assets within a Databricks workspace

  • tempo Public

    API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, sum, count, etc.), AS OF joins, downsampling, and interpolation

    Jupyter Notebook 248 42 26 6 Updated Apr 20, 2023
  • splunk-integration Public

    Databricks Add-on for Splunk

    Python 19 12 11 6 Updated Apr 20, 2023
  • migrate Public

    Scripts to help customers with one-off migrations between Databricks workspaces.

    Python 129 102 36 3 Updated Apr 20, 2023
  • dlt-meta Public

    A metadata-driven, DLT-based framework for bronze/silver pipelines

    Python 26 7 0 0 Updated Apr 19, 2023
  • arcuate Public

    Delta Sharing + MLflow for ML model and experiment exchange (arcuate delta: a fan-shaped river delta)

    Python 12 1 0 7 Updated Apr 17, 2023
  • dbx Public

    🧱 Databricks CLI eXtensions (aka dbx) is a CLI tool for development and advanced management of Databricks workflows.

    Python 320 101 66 3 Updated Apr 17, 2023
