# data-quality

Here are 90 public repositories matching this topic...
Validation of local and remote data tables

Topics: mysql, r, spark, data-validation, sqlite, postgresql, data-frame, data-engineering, sparklyr, mssql, easy-to-use, data-quality, data-profiling, tibble, reporting-tools, email-reports, thresholds, database-tables, failure-thresholds

Updated Sep 7, 2020 - R
WeDataSphere is a financial-grade, one-stop, open-source suite for big data platforms. The source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere: Big Data Made Easy!

Updated Jul 2, 2020
Qualitis is a one-stop data quality management platform that supports quality verification, notification, and management for various data sources. It is used to solve data quality problems caused by data processing. https://github.com/WeBankFinTech/Qualitis

Topics: workflow, quality, compare, dss, data-quality, quality-improvement, quality-check, linkis, datashperestudio, data-quality-model

Updated Mar 20, 2020 - Java
An RDF Unit Testing Suite

Topics: unit-testing, schema, validation, rdf, data-validation, schema-validation, web-ontology-language, data-quality-checks, data-quality, shacl

Updated Feb 14, 2020 - Java
NBi is a testing framework (an add-on to NUnit) for Business Intelligence and Data Access. Its main goal is to let users create tests with a declarative approach based on an XML syntax. With NBi, you don't need to write C# or Java code to specify your tests, nor do you need Visual Studio or Eclipse to compile your test suite. Just create an XML file and let the framework interpret it and run your tests. The framework is designed as an add-on to NUnit, but can easily be ported to other testing frameworks.

Topics: database, etl, nunit, test-automation, test-framework, business-intelligence, cube, data-quality-checks, data-quality

Updated Jun 20, 2020 - C#
Jumbune, an open-source Big Data APM & Data Quality Management Platform for data clouds. An enterprise feature offering is available at http://jumbune.com. More details of the open-source offering are at:

Topics: yarn, hadoop, apm, developer-tools, data-analysis, hadoop-cluster, devops-tools, data-quality, optimization-framework, cluster-monitoring, monitoring-tool, hadoop-monitor, yarn-hadoop-cluster, aiops, hadoop-monitoring

Updated Jul 1, 2020 - Java
A lightweight library to write, orchestrate, and test your SQL ETL, with data integrity in mind.

Updated Aug 26, 2020 - Python
A library for evaluating data quality and interacting with the datos.gov.co portal

Updated Aug 27, 2020 - Python
A data governance and data quality inspection/monitoring platform (Django + jQuery + MySQL)

Updated Sep 5, 2020 - Python
DTCleaner: data cleaning using multi-target decision trees.

Updated Jun 21, 2016 - Java
A tool to help improve data quality standards in observational data science.
Updated Aug 19, 2020 - JavaScript
Migrated to: https://gitlab.com/Oslandia/osm-data-classification
Updated Sep 16, 2019 - Python
DataOps for Government
Updated Sep 13, 2018
hive_compared_bq compares and validates two (SQL-like) tables, and graphically shows the rows and columns that differ.

Updated Dec 13, 2017 - Python
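A minimal version of this kind of table diff can be sketched with pandas; this is an illustration of the idea, not hive_compared_bq's implementation, and the tables below are made up:

```python
import pandas as pd

# Two versions of the same table, e.g. one exported from Hive
# and one from BigQuery (illustrative data).
left = pd.DataFrame({"id": [1, 2, 3], "city": ["Rome", "Oslo", "Lima"]})
right = pd.DataFrame({"id": [1, 2, 3], "city": ["Rome", "Oslo", "Kiev"]})

# Align the tables on the key column, then report only the cells
# that differ: one row per mismatching key, with self/other values.
diff = left.set_index("id").compare(right.set_index("id"))
print(diff)
```

`DataFrame.compare` (pandas >= 1.1) keeps only the differing rows, so an empty result means the tables match on the compared columns.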
The PEDSnet Data Quality Assessment Toolkit (OMOP CDM)
Updated Jul 21, 2020 - R
Automated data quality suggestions and analysis with Amazon Deequ on AWS Glue
Updated Sep 3, 2020 - Scala
m1n0 commented Sep 3, 2020:

e.g. `sqlalchemy.exc.IdentifierError: Identifier 'quality_check_airport_report_next_interval_SNN_daily_unique_quality_check' exceeds maximum length of 63 characters`
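The 63-character cap is PostgreSQL's default identifier limit. One common workaround for generated names like the one above — a sketch of the general technique, not this project's actual fix — is to truncate deterministically and append a short hash so distinct long names never collide:

```python
import hashlib

MAX_IDENTIFIER_LEN = 63  # PostgreSQL's default limit (NAMEDATALEN - 1)

def shorten_identifier(name: str, max_len: int = MAX_IDENTIFIER_LEN) -> str:
    """Shorten an identifier that exceeds max_len.

    Keeps a readable prefix and appends an 8-character hash of the
    full name, so the result is stable across runs and unique per
    original name.
    """
    if len(name) <= max_len:
        return name
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()[:8]
    return f"{name[:max_len - 9]}_{digest}"

long_name = (
    "quality_check_airport_report_next_interval_"
    "SNN_daily_unique_quality_check"
)
print(len(long_name), "->", len(shorten_identifier(long_name)))  # 73 -> 63
```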
A Node.js tool to examine the correctness of Open Data Metadata and build custom dataset profiles
Updated Jun 20, 2018 - JavaScript
Data quality control tool built on spark and deequ
Updated Sep 6, 2020 - Scala
This Chrome Extension automatically performs SRM checks and flags potential data quality issues on supported experimentation platforms.

Topics: chrome-extension, statistics, statistical-analysis, experimentation, ab-testing, srm, data-quality, vwo, visual-website-optimizer, google-optimize, sample-ratio-mismatch

Updated Aug 13, 2020 - JavaScript
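A sample ratio mismatch (SRM) check of this kind is typically a chi-squared goodness-of-fit test on the assignment counts. The sketch below shows the idea for a two-variant experiment; the traffic numbers and the 0.001 threshold are illustrative assumptions, not the extension's actual logic:

```python
import math

def srm_check(control: int, treatment: int,
              expected_ratio: float = 0.5, alpha: float = 0.001) -> bool:
    """Flag a sample ratio mismatch between two variants.

    Runs a chi-squared goodness-of-fit test with 1 degree of
    freedom; for df=1 the p-value is erfc(sqrt(chi2 / 2)).
    Returns True when the observed split deviates significantly
    from the intended one.
    """
    total = control + treatment
    exp_c = total * expected_ratio
    exp_t = total * (1 - expected_ratio)
    chi2 = (control - exp_c) ** 2 / exp_c + (treatment - exp_t) ** 2 / exp_t
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return p_value < alpha

# Intended 50/50 split; the first case lost noticeable traffic
# in one arm, the second is well within random noise.
print(srm_check(50600, 48900))  # → True  (likely an SRM)
print(srm_check(49980, 50020))  # → False (no mismatch)
```

A very low alpha (here 0.001) is common for SRM alerts, since these checks run continuously and false alarms are costly.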
Tutorial and examples of Data Quality in Big Data System
Updated Apr 25, 2017
Data validation library for PySpark 3.0.0
Updated Jul 20, 2020 - Python
hooqu is a library built on top of Pandas-like DataFrames for defining "unit tests for data". It is a spiritual port of Apache Deequ to Python.

Updated Sep 6, 2020 - Python
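The "unit tests for data" idea can be sketched in plain pandas; this deliberately avoids guessing hooqu's actual API, and the column names and rules are made up for illustration:

```python
import pandas as pd

def check_orders(df: pd.DataFrame) -> list:
    """Run a few declarative data checks and collect failure messages."""
    failures = []
    if df["order_id"].duplicated().any():
        failures.append("order_id is not unique")
    if df["amount"].isna().any():
        failures.append("amount contains nulls")
    if (df["amount"] < 0).any():
        failures.append("amount contains negative values")
    return failures

# A tiny frame that violates all three rules.
orders = pd.DataFrame({"order_id": [1, 2, 2],
                       "amount": [9.99, None, -5.0]})
print(check_orders(orders))
```

Libraries in this space wrap such rules in a declarative check API and report all violations at once, rather than failing on the first one.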
A simple platform dedicated to data quality issues detection, especially in the context of online advertising.
Updated Jul 21, 2020 - Python


Describe the bug

This is basically one of the issues I called out in #1855:

When I run `datasource new` and exit the process at any point (e.g. Ctrl+C), I still get a block for the credentials in config_variables.yml. However, great_expectations.yml doesn't have the datasource entry. I would expect any kind of failure in the datasource creation process not to leave any artifacts.

To Reproduce