data-engineering

Is your feature request related to a problem? Please describe.
As of a couple of months ago, the Elasticsearch organization has made the official Python elasticsearch plugin incompatible with Amazon-supported OpenSearch. If you fire up Superset using the current helm chart and attempt to connect to a recently deployed AWS "Elasticsearch" - which is now an Apache 2.0 licensed OpenSearch - you wi

Opened from the Prefect Public Slack Community

michael.ball: Hey there. I’ve been playing around with Docker storage today, trying to get all source code packaged together with the flows each time they are registered, and am using the files and env_vars attributes as outlined in the Docs. But it seems that my .dockerignore file (in the directory from whic

Describe the bug
data docs columns shrink to 1 character width with long query

To Reproduce
Steps to reproduce the behavior:

1. Make a batch from a long query string
2. Run validation
3. Render result to data docs
4. See screenshot
<img width="1525" alt="Data_documentation_compiled_by_Great_Expectations" src="https://user-images.githubusercontent.com/928247/103230647-30eca500-4

Can't type anything in the below dialog box where "filter branches" is written.

If they are not class methods, then the method is invoked for every test and a session is created for each of those tests.

class PySparkTest(unittest.TestCase):
    @classmethod
    def suppress_py4j_logging(cls):
        logger = logging.getLogger('py4j')
        logger.setLevel(logging.WARN)

    @classmethod
    def create_testing_pyspark_session(cls):
        return Sp
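
The point about class methods can be illustrated without Spark. The sketch below assumes the class methods are meant to be called from `setUpClass`, which the excerpt does not show; `FakeSession`, `SharedSessionTest`, `PerTestSessionTest`, and `count_sessions` are hypothetical names used only to count how many times an expensive resource gets built.

```python
import io
import unittest


class FakeSession:
    """Hypothetical stand-in for an expensive resource such as a Spark session."""

    created = 0  # counts how many sessions have been constructed

    def __init__(self):
        FakeSession.created += 1


class SharedSessionTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once for the whole class, so both tests share one session.
        cls.session = FakeSession()

    def test_first(self):
        self.assertIsNotNone(self.session)

    def test_second(self):
        self.assertIsNotNone(self.session)


class PerTestSessionTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method, so each test builds its own session.
        self.session = FakeSession()

    def test_first(self):
        self.assertIsNotNone(self.session)

    def test_second(self):
        self.assertIsNotNone(self.session)


def count_sessions(test_case):
    """Run one TestCase class and return how many sessions it created."""
    FakeSession.created = 0
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
    return FakeSession.created


shared_count = count_sessions(SharedSessionTest)
per_test_count = count_sessions(PerTestSessionTest)
```

With two test methods per class, the class-level setup builds one session and the per-test setup builds two, and the gap grows with the number of tests.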

Brief Description of Fix

While reading the docs, I found the [get_features_targets page](https://pyjanitor-devs.github.io/pyjanitor/reference/janitor.functions/janitor.get_features_targe

We store the params for each task execution so that we re-run the task if any parameter changes. But since the metadata is a JSON file, we can only store JSON-serializable parameters. The current implementation is all-or-nothing: if any parameter is not JSON-serializable, nothing is saved. It would be better to ignore only the parameters we cannot serialize and save the rest.
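
One way the proposed behavior could look is sketched below. This is not the project's implementation: `is_json_serializable`, `split_serializable`, and the sample `params` are hypothetical, and the sketch assumes parameter names are strings and that serializability can be checked by attempting `json.dumps` on each value.

```python
import json
from typing import Any, Dict, List, Tuple


def is_json_serializable(value: Any) -> bool:
    """Return True if json.dumps can encode the value."""
    try:
        json.dumps(value)
    except (TypeError, ValueError, OverflowError):
        return False
    return True


def split_serializable(params: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Split params into the serializable subset and the names that were skipped."""
    kept: Dict[str, Any] = {}
    skipped: List[str] = []
    for name, value in params.items():
        if is_json_serializable(value):
            kept[name] = value
        else:
            skipped.append(name)
    return kept, skipped


# Hypothetical task parameters: the callable cannot be encoded as JSON.
params = {"alpha": 0.1, "columns": ["a", "b"], "callback": print}
kept, skipped = split_serializable(params)

# Only the serializable parameters end up in the stored metadata.
metadata = json.dumps({"params": kept})
```

Returning the skipped names alongside the kept values lets the caller warn that changes to those parameters will not trigger a re-run.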

https://githu

In the repository handler

removeEntity tries a delete and, if delete is not supported, issues a purge; the purge method writes an audit log entry.
There are two callers of purgeRelationship, only one of which writes an audit log entry.

This is inconsistent.
I suggest moving the relationship audit logging into the purge method, so that both callers are audit logged.
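
The suggestion can be sketched in a few lines. The sketch is in Python for brevity and is not the project's code: `RepositoryHandlerSketch`, its method names, and the `supports_delete` flag are all hypothetical, chosen only to show that logging inside the purge method covers every caller.

```python
from typing import List


class RepositoryHandlerSketch:
    """Hypothetical handler where the purge method owns the audit logging."""

    def __init__(self, supports_delete: bool) -> None:
        self.supports_delete = supports_delete
        self.audit_log: List[str] = []

    def purge_relationship(self, guid: str) -> None:
        # The audit entry is written here, so no caller can forget it.
        self.audit_log.append(f"purged relationship {guid}")

    def remove_relationship(self, guid: str) -> None:
        # First hypothetical caller: fall back to purge when delete is unsupported.
        if self.supports_delete:
            return  # the soft-delete path is out of scope for this sketch
        self.purge_relationship(guid)

    def purge_directly(self, guid: str) -> None:
        # Second hypothetical caller: no audit code of its own is needed.
        self.purge_relationship(guid)


handler = RepositoryHandlerSketch(supports_delete=False)
handler.remove_relationship("r1")
handler.purge_directly("r2")
```

After both calls the audit log holds one entry per purge, regardless of which caller triggered it.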



Here are 946 public repositories matching this topic...

apache / superset

eugeneyan / applied-ml

andkret / Cookbook

datastacktv / data-engineer-roadmap

PrefectHQ / prefect

great-expectations / great_expectations

Jeffail / benthos

awslabs / aws-data-wrangler

adilkhash / Data-Engineering-HowTo

treeverse / lakeFS

kantord / just-dashboard

quiltdata / quilt

GoogleCloudPlatform / data-science-on-gcp

benthecoder / yt-channels-DS-AI-ML-CS

san089 / goodreads_etl_pipeline

AlexIoannides / pyspark-example-project

pyjanitor-devs / pyjanitor

oleg-agapov / data-engineering-book

abhishek-ch / around-dataengineering

san089 / Udacity-Data-Engineering-Projects

ploomber / ploomber

gunnarmorling / awesome-opensource-data-engineering

automaticmode / active_workflow

odpi / egeria

dataform-co / dataform

sodadata / soda-sql

kevintpeng / Learn-Something-Every-Day

Cascading / cascading

sderosiaux / every-single-day-i-tldr

aiguofer / gspread-pandas
