The Wayback Machine - https://web.archive.org/web/20210131115357/https://github.com/topics/big-data
#

big-data

Here are 2,331 public repositories matching this topic...

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated Jan 28, 2021
  • Python
ClickHouse
yunchat commented Nov 13, 2018

Currently, inserts and queries share the same resource limit (the max process count control). When queries run at high TPS, inserts fail with an error (“error: too many process”). I think it makes sense to separate the resources for inserts and queries, so that inserts always have enough. This would work like YARN, where inserts and queries would draw on different resource quotas.
Or, as a simpler approach, can we set a ratio for Insert and
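
The separation proposed in this issue can be sketched in a few lines of Python. This is only an illustration of the idea, not ClickHouse code: the class `SplitQuota`, its `insert_ratio` parameter, and the method names are hypothetical. It assumes a fixed total number of process slots that is divided once, by ratio, into an insert pool and a query pool.

```python
import threading


class SplitQuota:
    """Hypothetical sketch: divide a fixed slot budget between inserts and queries."""

    def __init__(self, max_processes, insert_ratio):
        if not 0 < insert_ratio < 1:
            raise ValueError("insert_ratio must be between 0 and 1")
        insert_slots = max(1, round(max_processes * insert_ratio))
        query_slots = max_processes - insert_slots
        if query_slots < 1:
            raise ValueError("max_processes is too small to split")
        self.slots = {"insert": insert_slots, "query": query_slots}
        # One semaphore per kind, so a flood of queries cannot use insert slots.
        self._pools = {
            kind: threading.BoundedSemaphore(count)
            for kind, count in self.slots.items()
        }

    def try_acquire(self, kind):
        # Non-blocking: returns False when this kind's own pool is exhausted.
        return self._pools[kind].acquire(blocking=False)

    def release(self, kind):
        self._pools[kind].release()
```

With `SplitQuota(4, 0.25)`, three queries can run at once and a fourth is rejected, while one insert slot stays available regardless of query load.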

pseudotensor commented Jan 12, 2021

Problem: the approximate method can still be slow for many trees
catboost version: master
Operating System: ubuntu 18.04
CPU: i9
GPU: RTX2080

It would be good to be able to specify how many trees to use when computing Shapley values. The model.predict and prediction_type versions allow this, and lgbm/xgb allow it as well.

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

  • Updated Jan 31, 2021
  • Jupyter Notebook
Holmistr commented Jan 20, 2021

Please describe the problem you are trying to solve
I would like to evict entries based on their creation time. I want to evict the oldest ones first.

Please describe the desired behavior
Basically FIFO eviction. I would like to specify directly in the configuration something like:

<eviction eviction-policy="FIFO" max-size-policy="PER_NODE" size="5000"/>
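
The requested behavior can be illustrated with a minimal Python sketch. It is not the project's implementation or API: `FifoCache` and `max_size` are hypothetical names, and the sketch assumes a single-node, single-threaded map in which eviction order depends only on when a key was first created.

```python
from collections import OrderedDict


class FifoCache:
    """Hypothetical bounded map that evicts the oldest-created entry first."""

    def __init__(self, max_size):
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self.max_size = max_size
        self._entries = OrderedDict()

    def put(self, key, value):
        # Overwriting an existing key keeps its original position, so
        # eviction order reflects creation time only, not later writes.
        if key not in self._entries and len(self._entries) >= self.max_size:
            self._entries.popitem(last=False)  # drop the oldest entry
        self._entries[key] = value

    def get(self, key, default=None):
        # Reads never reorder entries; this is what separates FIFO from LRU.
        return self._entries.get(key, default)

    def __len__(self):
        return len(self._entries)
```

With a capacity of 2, inserting `a`, `b`, then `c` evicts `a` even if `a` was read most recently, because only creation order matters.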

**Describe alte

vespa
