The Wayback Machine - https://web.archive.org/web/20200819185605/https://github.com/topics/big-data

big-data

Here are 2,154 public repositories matching this topic...

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated Jul 24, 2020
  • Python
ClickHouse
yunchat commented Nov 13, 2018

Currently, inserts and queries share the same resource limit (the max process count control). When queries run at high TPS, inserts fail with an error (“error: too many process”). I think separating the resources for inserts and queries would make sense, to ensure enough resources for inserts. This would be similar to using YARN, where inserts and queries use different resource quotas.
Or, as a simpler approach, can we set a ratio for Insert and

Open Source Fast Scalable Machine Learning Platform For Smarter Applications: Deep Learning, Gradient Boosting & XGBoost, Random Forest, Generalized Linear Modeling (Logistic Regression, Elastic Net), K-Means, PCA, Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

  • Updated Aug 19, 2020
  • Jupyter Notebook
sancar commented Jul 16, 2020

With the Java clients (3.12. and 4.x), it looks like I cannot specify a class override for the client load balancer. https://docs.hazelcast.org/docs/3.12.4/manual/html-single/#configuring-client-load-balancer
This could be a documentation error, or the documentation is right and the XSD schema is wrong.

<hazelcast-client>
    ...
    <load-balancer type="random">
        yourLoadBalancer
    </
vespa
