The Wayback Machine - https://web.archive.org/web/20210415085912/https://github.com/topics/parquet-files
parquet-files

Here are 31 public repositories matching this topic...

Jonathanpro commented Jan 2, 2019

Hello everyone,
Recently I tried to set up petastorm on my company's Hadoop cluster.
However, because the cluster uses Kerberos for authentication, using petastorm failed.
I figured out that petastorm relies on pyarrow, which does support Kerberos authentication.

I hacked "petastorm/petastorm/hdfs/namenode.py" at line 250
and replaced it with

driver = 'libhdfs'
return pyarrow.hdfs.c
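
The snippet above is cut off in this listing, so the following is only a hedged sketch of the general idea, not the commenter's actual patch. It assumes the legacy `pyarrow.hdfs.connect` API that was current in 2019 (later pyarrow releases deprecated it in favor of `pyarrow.fs.HadoopFileSystem`) and that a Kerberos ticket cache has already been created with `kinit`. The helper names `kerberos_hdfs_kwargs` and `connect_hdfs`, the host name, and the ticket cache path are hypothetical.

```python
def kerberos_hdfs_kwargs(host="default", port=0, user=None, ticket_cache=None):
    """Collect keyword arguments for a Kerberos-authenticated HDFS connection.

    ticket_cache is the path to an existing Kerberos ticket cache
    (hypothetical example: "/tmp/krb5cc_1000").
    """
    return {
        "host": host,
        "port": port,
        "user": user,
        "kerb_ticket": ticket_cache,
        "driver": "libhdfs",  # JNI-based driver, as named in the comment above
    }


def connect_hdfs(**overrides):
    """Open the connection. Needs a legacy pyarrow and a Hadoop client install."""
    # Imported lazily so the kwargs helper can be used without pyarrow present.
    from pyarrow import hdfs
    return hdfs.connect(**kerberos_hdfs_kwargs(**overrides))


if __name__ == "__main__":
    # Only prints the arguments; calling connect_hdfs() requires a real cluster.
    print(kerberos_hdfs_kwargs(host="namenode.example.com", port=8020,
                               ticket_cache="/tmp/krb5cc_1000"))
```

Keeping the argument construction separate from the connection call makes it possible to inspect what would be passed to pyarrow without a cluster at hand.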
