The Wayback Machine - https://web.archive.org/web/20210807003202/https://github.com/topics/big-data
#

big-data

Here are 2,552 public repositories matching this topic...

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated May 13, 2021
  • Python
kloczek commented Jun 9, 2021

After adding the patch that fixes #4209, I found that Sphinx emits some warnings.

+ /usr/bin/python3 setup.py build_sphinx -b man --build-dir build/sphinx
Unable to find pgen, not compiling formal grammar.
running build_sphinx
Running Sphinx v4.0.2
making output directory... done
loading intersphinx inventory from https://docs.python.org/3/objects.inv...
building [mo]: targets for 0 po

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

  • Updated Aug 6, 2021
  • Jupyter Notebook
electrum commented Aug 5, 2021

If the --server option is used without a protocol, then it should use https when on port 443. For example, these invocations would be equivalent, with the first one having the new behavior:

trino --server example.net:443
trino --server https://example.net:443
trino --server https://example.net

This will make the CLI consistent with the JDBC driver in this regard. While it's t
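
The defaulting rule proposed above can be sketched in a few lines. This is only an illustration, not the Trino CLI's actual code: the helper names `normalize_server` and `endpoint` are hypothetical, and falling back to `http` for any port other than 443 is an assumption that the issue text does not confirm.

```python
from urllib.parse import urlsplit


def normalize_server(server: str) -> str:
    """Hypothetical helper: add a scheme to a --server value that lacks one."""
    if "://" in server:
        # A protocol was given explicitly; leave the value untouched.
        return server
    # Prefix with "//" so urlsplit treats "host:port" as the network location.
    parts = urlsplit("//" + server)
    # Assumption: http remains the fallback for every port except 443.
    scheme = "https" if parts.port == 443 else "http"
    return f"{scheme}://{server}"


def endpoint(server: str) -> tuple:
    """Resolve a --server value to (scheme, host, port) for comparison."""
    parts = urlsplit(normalize_server(server))
    default_port = 443 if parts.scheme == "https" else 80
    return parts.scheme, parts.hostname, parts.port or default_port


for value in ("example.net:443", "https://example.net:443", "https://example.net"):
    print(value, "->", endpoint(value))
```

Under these assumptions, all three invocations from the issue resolve to the same endpoint.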

vespa
kkraune commented Apr 2, 2021

... to make it easier to read Vespa documentation on an e-reader / offline

Vespa documentation is generated using Jekyll from .md and .html files. Look into options for generating the artifact as part of site generation (there might be plugins we can use here).

shsab commented Jul 30, 2021

Hi,
I am running deltaTable = DeltaTable.convertToDelta(spark, f"parquet.{data_path}") to get a DeltaTable from the Parquet files, but it doesn't return one as the documentation suggests. It does convert the files successfully, however. If I read them again right after that line using forPath, I get the DeltaTable.
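
The workaround described in this report (convert first, then load with forPath) might look like the sketch below. The helper name `convert_then_load` and its `delta_table_cls` parameter are hypothetical, added only so the flow can be tried without a Spark session. The backtick-quoted identifier follows the form used in the Delta Lake documentation and is an assumption about what the original snippet contained.

```python
def convert_then_load(spark, data_path, delta_table_cls=None):
    """Hypothetical helper: convert Parquet files in place, then load the table.

    It deliberately ignores whatever convertToDelta returns and loads the
    table explicitly with forPath, as the report describes.
    """
    if delta_table_cls is None:
        # Requires the delta-spark package and a Delta-enabled Spark session.
        from delta.tables import DeltaTable as delta_table_cls
    # Assumption: the path is backtick-quoted inside the identifier string.
    delta_table_cls.convertToDelta(spark, f"parquet.`{data_path}`")
    return delta_table_cls.forPath(spark, data_path)
```

With a real session this would be called as `convert_then_load(spark, data_path)`.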


seut commented Jun 22, 2021

Use case:

1.) A user may want to back up all tables, but no metadata such as users or privileges, without explicitly defining each table inside the CREATE SNAPSHOT statement.

2.) A user may want to transfer users and privileges, custom analyzers, or user-defined functions from one cluster to another without backing up the complete cluster, including all data (tables).

*Feature description
