The Wayback Machine - https://web.archive.org/web/20200911183518/https://github.com/topics/data-catalog
#

data-catalog

Here are 34 public repositories matching this topic...

ttannis commented Sep 2, 2020

Expected Behavior or Use Case

We recently introduced an improved user interface for column types with nested/complex structures. Current support covers Hive and Presto nested types.
amundsen-io/amundsenfrontendlibrary#627

  1. We want to extend the UI so that each level can be expanded and collapsed.
  2. In addition we want to add a button that will to
vrajat commented Feb 14, 2020

It is not surprising that deep and shallow scans show different results. A shallow scan only looks at column names. A deep scan looks at a sample of the data. I've even noticed that two different runs of a deep scan show different results because the sampled rows are different. This is the challenge with not scanning all of the data. It's a trade-off between performance/cost and accuracy. There is no right answer.
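
The difference described above can be illustrated with a minimal Python sketch. This is not the project's actual code or API: `shallow_scan`, `deep_scan`, the two regular expressions, and the toy table are all hypothetical and exist only to show why a name-based scan and a sample-based scan can disagree, and why two sampled runs can disagree with each other.

```python
import random
import re

# Hypothetical patterns for illustration; a real scanner would use a richer set.
NAME_PATTERN = re.compile(r"email|e_mail", re.IGNORECASE)
VALUE_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def shallow_scan(columns):
    """Flag columns whose *name* suggests they hold email addresses."""
    return {c for c in columns if NAME_PATTERN.search(c)}


def deep_scan(rows, columns, sample_size, rng):
    """Flag columns where any *sampled* value looks like an email address."""
    sample = rng.sample(rows, min(sample_size, len(rows)))
    flagged = set()
    for row in sample:
        for col, value in zip(columns, row):
            if isinstance(value, str) and VALUE_PATTERN.match(value):
                flagged.add(col)
    return flagged


# Toy table: only the last row hides an email address in the "notes" column.
columns = ["id", "email", "notes"]
rows = [(i, f"user{i}@example.com", "n/a") for i in range(99)]
rows.append((99, "user99@example.com", "alt@example.com"))

by_name = shallow_scan(columns)                              # names only
by_sample = deep_scan(rows, columns, 5, random.Random())     # 5 random rows
by_full = deep_scan(rows, columns, len(rows), random.Random())  # every row
```

In this toy setup the shallow scan can never flag `notes`, the full scan always does, and the 5-row sample flags it only when the one offending row happens to be drawn, which is the run-to-run variation mentioned above.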

National Data Archive (NADA) is an open source data cataloging system that serves as a portal for researchers to browse, search, compare, apply for access, and download relevant census or survey information. It was originally developed to support the establishment of national survey data archives.

  • Updated Sep 11, 2020
  • PHP
