The Wayback Machine - https://web.archive.org/web/20200911181709/https://github.com/topics/etl-pipeline

etl-pipeline

Here are 271 public repositories matching this topic...

Jeffail / benthos

A stream processor for mundane tasks written in Go

go golang kafka cqrs etl rabbitmq amqp logs message-bus event-sourcing nats stream-processing message-queue streaming-data stream-processor etl-pipeline

Updated Sep 10, 2020
Go

InterestingLab / waterdrop

A product for massive-scale data computation in production environments. Documentation:

java spark hadoop spark-streaming flink sql-engine etl-framework etl-pipeline

Updated Sep 8, 2020
Java

san089 / goodreads_etl_pipeline

An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.

Updated Mar 9, 2020
Python

AlexIoannides / pyspark-example-project

Example project implementing best practices for PySpark ETL jobs and applications.

python data-science spark etl pyspark data-engineering etl-pipeline etl-job

Updated Jul 9, 2020
Python

san089 / Udacity-Data-Engineering-Projects

A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.

Updated Mar 5, 2020
Python

smooks / smooks

An extensible Java framework for building XML and non-XML (CSV, EDI, Java, etc.) streaming applications

java big-data etl pipeline-framework stream-processing etl-pipeline smooks enterprise-integration

Updated Sep 11, 2020
Java

YotpoLtd / metorikku

Open issue: Add the ability to override configuration using CLI params

RonBarabash commented Nov 27, 2017

Metorikku currently takes a simple YAML config file as input. We need to be able to override this configuration using CLI params.

good first issue help wanted

etlbox / etlbox

A lightweight ETL (extract, transform, load) library and data integration toolbox for .NET.

etl csharp-core etl-framework etl-pipeline etl-jobs

Updated Sep 10, 2020
C#

techascent / tech.ml.dataset

Clojure dataframe library and pipeline for data processing and machine learning

machine-learning clojure csv xlsx datascience dataset dataframe etl-pipeline

Updated Aug 14, 2020
Clojure

usc-isi-i2 / dig-etl-engine

Download DIG to run on your laptop or server.

search-engine crawling information-extraction information-visualization etl-framework etl-pipeline

Updated Jan 9, 2019

maxim2266 / csvplus

csvplus extends the standard Go encoding/csv package with fluent interface, lazy stream operations, indices and joins.

go csv etl stream-processing fluent-interface csv-format go-csv etl-framework etl-pipeline

Updated Mar 16, 2020
Go

SETL-Developers / setl

A simple Spark-powered ETL framework that just works 🍺

data-science machine-learning framework scala big-data spark pipeline etl data-transformation data-engineering dataset data-analysis modularization setl etl-pipeline

Updated Sep 9, 2020
Scala

sundios / SEO-Dashboard

SEO dashboard built from Search Console data using the Google Search API, a MySQL database, a Node.js REST API (Express.js), and a React dashboard

react mysql dashboard rest-api seo expressjs seotools node-js seo-monitor google-search-console etl-pipeline etl-kpi google-search-console-python

Updated Jul 7, 2020
JavaScript

Mmodarre / AzureDataFactoryHOL

Azure Data Factory Hands-On Lab: a comprehensive step-by-step tutorial on Azure Data Factory and Mapping Data Flow

azure azure-data-factory hands-on-lab azure-key-vault etl-pipeline adf-pipeline filter-activity lookup-activity foreach-activity metadata-activity mapping-dataflows hands-on-azure-data-factory azure-data-factory-tutorial azure-modern-data-warehous web-activity foreach-loop-activity

Updated May 27, 2020

sparsecode / DaFlow

Open issue: Update README.md of the project.

abhioncbr commented Sep 25, 2018

Update the README.md files of the project's various modules, and write a detailed README.md covering how to build, deploy, and run the project.

good first issue

toddbirchard / jira-database-etl

🚹 💾 Script to import issues from a JIRA instance into a database.

flask etl pandas python3 jira-rest-api flask-sqlalchemy etl-pipeline

Updated Sep 10, 2020
Python

tharwaninitin / etlflow

Functional, composable Scala library based on ZIO for writing ETL jobs on AWS and GCP: https://tharwaninitin.github.io/etlflow/site/

bigquery aws scala spark etl gcp zio etl-framework etl-pipeline

Updated Sep 10, 2020
Scala

xushiyan / kafka-connect-datagen

A Kafka Connect source connector that generates data for tests

java kafka etl kafka-connect data-generator performance-test integration-test etl-pipeline

Updated Jun 26, 2019
Java

visiologyofficial / vixtract

etl etl-framework etl-pipeline etl-components etl-job etl-automation

Updated Sep 7, 2020
HTML

BBVA / data-refinery

Data transformation

data-science data machine-learning etl datascience etl-pipeline

Updated Apr 9, 2018
Python

jjasghar / COBOL-on-k8s

Running an ETL pipeline with COBOL on Kubernetes

kubernetes yaml s3-bucket cobol etl-pipeline

Updated Sep 10, 2020
Shell

mdh266 / AirflowETL

Blog post on ETL pipelines with Airflow

python airflow sql database schedule etl postgresql data-engineering data-pipeline etl-pipeline

Updated Jun 7, 2020
Jupyter Notebook

cyber-drop / ethereum_analytical_db

Ethereum Analytical Database: an Ethereum data access solution for analytics and application development. It runs on ClickHouse, a fast database.

api etl clickhouse ethereum blockchain eth dex erc20 erc223 etl-pipeline erc721 ethereum-etl

Updated Oct 30, 2019
HTML

codingforentrepreneurs / Serverless-Python-Workflow-with-AWS-Lambda

A tutorial on setting up and deploying a simple serverless Python workflow with REST API endpoints in AWS Lambda.

python aws data-science aws-lambda serverless etl webscraping etl-pipeline

Updated Apr 22, 2020
Python

InterestingLab / waterdrop-example

Waterdrop plugin development examples.

spark spark-streaming flink sql-engine etl-framework waterdrop etl-pipeline

Updated Jun 11, 2020
Scala

amanjpro / greenish

Open issue: Add refresh state/group/job button to the dashboard UI

amanjpro opened Jul 17, 2020

enhancement good first issue

Open issue: Add an endpoint to request the current loaded config file

sanjeevai / disaster-response-pipeline

ETL pipeline combined with supervised learning and grid search to classify text messages sent during a disaster event

sqlite-database supervised-learning grid-search-hyperparameters etl-pipeline data-engineering-pipeline disaster-event

Updated Feb 24, 2019
Python

DAppBoard / dappboard-etl

ETL pipeline for the Ethereum blockchain

javascript etl blockchain ethereum-blockchain etl-pipeline

Updated Feb 13, 2019
JavaScript

aws-samples / amazon-sagemaker-predict-accessibility

Build an end-to-end machine learning pipeline to predict the accessibility of playgrounds in NYC

serverless athena glue autopilot etl-pipeline sagemaker sagemaker-deployment ml-engineering etl-solutions

Updated Jul 9, 2020
Jupyter Notebook

vertica / PSTL

Parallel Streaming Transformation Loader

data-science data-mining hadoop bigdata ingestion realtime-messaging vertica streaming-data etl-pipeline

Updated Apr 23, 2019
Java
