data-pipelines

🚨🚨 Feature Request

A new implementation (Improvement, Extension)

Is your feature request related to a problem?

Currently, if a user tries to access an index that is larger than the dataset length or tensor length, an internal error is thrown which is not easy to understand.

Description of the possible solution

We can catch the error and throw a more descriptive e
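A minimal sketch of the idea in this request, assuming the dataset behaves like a plain Python sequence. The names `DescriptiveIndexError` and `get_item` are hypothetical and are not part of Hub's API; the point is only to show an internal `IndexError` being caught and re-raised with a message that states the index and the length.

```python
class DescriptiveIndexError(IndexError):
    """Hypothetical error type used for illustration only."""


def get_item(samples, index):
    """Return samples[index]; on out-of-range access, raise a clearer error."""
    try:
        return samples[index]
    except IndexError as err:
        # Re-raise with the requested index and the actual length,
        # keeping the original error as the cause for debugging.
        raise DescriptiveIndexError(
            f"Index {index} is out of range for a dataset of length {len(samples)}."
        ) from err
```

Subclassing `IndexError` keeps existing `except IndexError` handlers working while giving the user a readable message.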

Is your feature request related to a problem? Please describe.
It's cumbersome to create the same step twice.

Describe the solution you'd like
Add a button to duplicate a step in the pipeline editor.

Ideas
We could combine this with some other ideas in a context menu (right click).

Credit to Serhii Ostapchuk for contributing this on Slack.

Sending a REST call to delete a job specification returns a 404, whereas the gRPC call works fine. Steps to reproduce:

curl -X DELETE "http://localhost:9100/v1/project/my-project/namespace/kush/helloworld" -H  "accept: application/json"

Support copy into queries

Describe the bug
If a user has been selected as a data entity owner, they should be excluded from subsequent ownership selections.

What is the feature request? What problem does it solve?

As employees leave the organization/company or users change their email addresses, the notification list configured for a job eventually starts to contain many invalid addresses. This causes issues with the SMTP relay (e.g. Postfix), which could be buffering all invalid requests until the queue is full, which causes all mails coming for all jobs to b

https://github.com/JPHaus/data-engineering-wiki/blob/main/Tools/Apache%20Spark.md

This note is a seedling and is a great place to make your first contribution!

Is your feature request related to a problem? Please describe.
Executing all tests already takes about 30 minutes. We should try to optimize that.

Describe the solution you'd like
Much of the time is spent preparing input data by writing test data to DataObjects (CSV or Hive). This could be significantly reduced by creating a custom DataObject where a DataFrame can be set as input data, which

Here are 97 public repositories matching this topic...

dagster-io / dagster

activeloopai / Hub

orchest / orchest

combust / mleap

odpf / optimus

elementary-data / elementary

dataform-co / dataform

opendatadiscovery / odd-platform

meltano / meltano

datajoint / datajoint-python

vmware / versatile-data-kit

koolreport / core

kevin-hanselman / dud

GoogleCloudPlatform / public-datasets-pipelines

shravan-kuchkula / udacity-data-eng-proj-1

basis-os / basis-devkit

JPHaus / data-engineering-wiki

smart-data-lake / smart-data-lake

beneath-hq / beneath

flipkart-incubator / spark-transformers

immu0001 / Udacity-Data-Engineer-nanodegree

DanilBaibak / ml-in-production

mdh266 / AirflowDataPipeline

elementary-data / dbt-data-reliability

arakat-community / arakat

electronick1 / stepist

pachyderm / neon-workshop

KentHsu / Udacity-Data-Engineering-Nanodgree

CogStack / CogStack-NiFi

RiveryIO / rivery_cli
