data-engineering
Here are 1,101 public repositories matching this topic...
- Updated Mar 24, 2022
- Updated Jan 2, 2022
- Updated Jan 25, 2022
Description
Occasionally I get the following error when running sub-flows using create_flow_run and wait_for_flow_run
Task "wait_for_flow": Exception encountered during task execution!
Traceback (most recent call last):
File "/root/miniconda3/lib/python3.8/site-packages/prefect/engine/task_runner.py", line 876, in get_task_run_state
value = prefect.utilities.executor
Describe the bug
data docs columns shrink to 1 character width with long query
To Reproduce
Steps to reproduce the behavior:
- make a batch from a long query string
- run validation
- render result to data docs
- See screenshot
Tell us about the problem you're trying to solve
For some cases users have a more restricted access to their ElasticSearch cluster and need private keys.
Describe the solution you’d like
Add an option to upload the key file OR paste the key in a field in the destination configuration.
There are other destination examples using private keys already.
Describe the alternative you’ve
Under the hood, the Benthos csv input uses the csv.Reader struct from the standard encoding/csv package.
The current implementation of csv input doesn't allow setting the LazyQuotes field.
We have a use case where we need to set the LazyQuotes field in order to make things work correctly.
When we show data for a metric, we currently don't include the current day's data. Users who are just getting set up may only have events from today and want to test whether the query is working. Because events from 'today' are excluded, they can't see any results.
TODO:
- In packages/back-end/src/services/experiments.ts on line 329, instead of using the current date as the value
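To make the problem described above concrete, here is a minimal sketch of how an end bound at midnight today drops today's events, while an end bound at the start of tomorrow keeps them. This is only an illustration in Python, not the project's code: `metric_window`, `events_in_window`, and the sample timestamps are hypothetical names and data, and the actual change intended by the TODO is not stated in the text.

```python
from datetime import date, datetime, time, timedelta


def metric_window(today: date, days: int, include_today: bool):
    """Return a half-open [start, end) datetime window spanning `days` days.

    Hypothetical helper: the end bound is midnight at the start of today
    (today excluded) or midnight at the start of tomorrow (today included).
    """
    end_day = today + timedelta(days=1) if include_today else today
    end = datetime.combine(end_day, time.min)
    start = end - timedelta(days=days)
    return start, end


def events_in_window(events, start, end):
    """Keep only the event timestamps that fall inside [start, end)."""
    return [e for e in events if start <= e < end]


# Assumed sample data: one event yesterday, one event today.
today = date(2022, 3, 24)
events = [datetime(2022, 3, 23, 18, 0), datetime(2022, 3, 24, 9, 30)]

excluded = events_in_window(events, *metric_window(today, 7, include_today=False))
included = events_in_window(events, *metric_window(today, 7, include_today=True))
print(len(excluded), len(included))
```

A user whose only events are from today would get an empty result with the first window and a non-empty one with the second.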
- Updated Mar 24, 2022 - Python
(see #672 and #658 for context)
There are a few engines to convert .ipynb to .pdf; however, we only support the default one provided by nbconvert, which depends on pandoc and TeX. It has a few downsides, such as not being able to render embedded charts (see #658), so we should add support for other engines.
- Updated Feb 2, 2022
- Updated Mar 24, 2022 - Jupyter Notebook
(1) Add docstrings to methods
(2) Convert .format() calls to f-strings for readability
(3) Make sure we are using Python 3.8 throughout
(4) zip extract_all() in ingest_flights.py can be simplified with a Path parameter
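Items (2) and (4) could look roughly like the sketch below. It assumes nothing about the real ingest_flights.py: `extract_archive`, the file names, and the sample CSV content are made up for illustration. The only library behaviour relied on is that `zipfile.ZipFile` and `ZipFile.extractall(path=...)` accept `pathlib.Path` objects, which holds on Python 3.8.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import List


def extract_archive(archive: Path, dest: Path) -> List[str]:
    """Extract every member of `archive` into `dest` and return the member names."""
    with zipfile.ZipFile(archive) as zf:
        # A Path can be passed directly; no str() conversion or os.path.join needed.
        zf.extractall(path=dest)
        names = zf.namelist()
    # f-string instead of "Extracted {} file(s) to {}".format(len(names), dest)
    print(f"Extracted {len(names)} file(s) to {dest}")
    return names


# Build a throwaway archive so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    archive = tmp_path / "flights.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("flights.csv", "origin,dest\nJFK,LAX\n")

    out_dir = tmp_path / "raw"
    extracted = extract_archive(archive, out_dir)
    extracted_text = (out_dir / "flights.csv").read_text()
```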
- Updated Mar 24, 2022 - Scala
If they are not class methods, the method would be invoked for every test and a session would be created for each of those tests.
class PySparkTest(unittest.TestCase):
    @classmethod
    def suppress_py4j_logging(cls):
        logger = logging.getLogger('py4j')
        logger.setLevel(logging.WARN)

    @classmethod
    def create_testing_pyspark_session(cls):
        return Sp
Background
This thread is born out of the discussion in #968, in an effort to make documentation more beginner-friendly & more understandable.
One of the subtasks mentioned in that thread was to go through the function docstrings and include a minimal working example in each of the public functions in pyjanitor.
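As a rough idea of what a docstring with a minimal working example might look like, here is a sketch using a made-up helper, `snake_case_label`, which is not a pyjanitor function. The example is written in doctest format so it can be executed, and the snippet runs it with the standard `doctest` module. The issue's actual criteria may ask for something different.

```python
import doctest


def snake_case_label(name: str) -> str:
    """Return a lowercase, underscore-separated version of a column label.

    Example:
        >>> snake_case_label("Flight Number")
        'flight_number'
    """
    return name.strip().lower().replace(" ", "_")


# Collect and run the docstring example so it stays in sync with the code.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(
    snake_case_label,
    "snake_case_label",
    globs={"snake_case_label": snake_case_label},
):
    runner.run(test)
results = runner.summarize(verbose=False)
```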
Criteria reiterated here for the benefit of discussion:
It sh
- Updated Mar 22, 2022 - Dockerfile
Follow the implementation example of ingestion/tests/integration/ometa/test_ometa_database_service_api.py to implement the testing of the Python client for PipelineService.
- Updated Mar 24, 2022 - Python
- Updated Mar 5, 2020 - Python
- Updated Jun 2, 2021
- Updated Mar 22, 2022
- Updated Mar 24, 2022 - Python
- Updated Nov 6, 2021 - Ruby
When using Ubuntu 'ootb', both natively and within Windows WSL2, the asset consumer FVT has a tendency to fail with:
[INFO] --- maven-compiler-plugin:3.8.1:compile (default-compile) @ asset-consumer-fvt ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 7 source files to /home/nigel/src/egeria/open-metadata-test/open-metadata-fvt/access-services-fvt/asset-consumer-fvt/tar
- Updated Mar 16, 2022 - TypeScript
- Updated Mar 22, 2022 - Java


The Mixed Time-Series chart type allows for configuring the title of the primary and the secondary y-axis.
However, while the title of the primary axis is shown next to the axis, the title of the secondary one is placed at the upper end of the axis, where it gets hidden by bar values and zoom controls.
How to reproduce the bug