Apache DolphinScheduler is a modern data orchestration platform for agile creation of high-performance, low-code workflows.
An orchestration platform for the development, production, and observation of data assets.
🧙 The modern replacement for Airflow. Build, run, and manage data pipelines for integrating and transforming data.
Build data pipelines, the easy way 🛠️
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Lean and mean distributed stream processing system written in Rust and WebAssembly.
Open-source data observability for analytics engineers.
MLeap: Deploy ML Pipelines to Production
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
The first open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
The best place to learn data engineering. Built and maintained by the data engineering community.
Dataform is a framework for managing SQL-based data operations in BigQuery, Snowflake, and Redshift.
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
One framework to develop, deploy and operate data workflows with Python and SQL.
Data anomalies monitoring as dbt tests and dbt artifacts uploader.
Work with your web service, database, and streaming schemas in a single format.
Dataplane is an Airflow-inspired unified data platform with additional data mesh and RPA capabilities to automate, schedule, and design data pipelines and workflows. Dataplane is written in Golang with a React front end.
A curated list of awesome projects and resources related to Kubeflow (a CNCF incubating project)
Pipebird is open source infrastructure for securely sharing data with customers.
Relational data pipelines for the science lab