big-data
Here are 2,331 public repositories matching this topic...
The latest copy of the CPython grammar tests in test_grammar.py has several @skips and FIXMEs. Some of them look easy to fix (e.g. small parser bugs, or missing warnings that would be helpful to have); others cover entire features. We should fix the easy ones and make sure there are tickets for the rest.
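For context, a minimal sketch of the pattern the issue refers to: a test disabled with @unittest.skip and a FIXME note. The test name and the skipped behaviour below are hypothetical, not copied from test_grammar.py.

```python
# Hypothetical illustration of a skipped grammar test; the name and the missing
# warning are invented for this sketch, not taken from the actual file.
import unittest

class GrammarTests(unittest.TestCase):
    @unittest.skip("FIXME: the parser does not emit the expected SyntaxWarning yet")
    def test_assert_tuple_warning(self):
        # An "easy" fix is removing the skip once the missing warning lands;
        # the harder @skips cover whole grammar features.
        with self.assertWarns(SyntaxWarning):
            compile("assert (True, 'message')", "<test>", "exec")

if __name__ == "__main__":
    unittest.main()
```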
Problem: the approximate method can still be slow when the model has many trees
catboost version: master
Operating System: Ubuntu 18.04
CPU: i9
GPU: RTX 2080
It would be good to be able to specify how many trees to use when computing Shapley values. model.predict and the prediction_type variants already allow limiting the tree count, and LightGBM/XGBoost allow it as well (see the sketch below).
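A short sketch of the request, assuming the current Python API: predict() already accepts ntree_start/ntree_end, while get_feature_importance(type="ShapValues") does not; the commented-out ntree_end below is the hypothetical parameter being asked for.

```python
# Sketch only: the ntree_end argument to get_feature_importance() is the
# requested feature, not an existing CatBoost parameter.
import numpy as np
from catboost import CatBoostRegressor, Pool

rng = np.random.default_rng(0)
X, y = rng.normal(size=(500, 8)), rng.normal(size=500)

model = CatBoostRegressor(iterations=300, verbose=False).fit(X, y)

# Supported today: limit predictions to the first 100 trees.
preds = model.predict(X, ntree_end=100)

# Requested: limit the SHAP computation the same way.
shap_values = model.get_feature_importance(
    Pool(X, y),
    type="ShapValues",
    # ntree_end=100,  # hypothetical -- this is the knob being asked for
)
```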
- Remove wasted white space on either side of the table
- Support adding 1 or 2 more sub-fields
- Allow hiding columns
- Allow resizing columns
Please describe the problem you are trying to solve
I would like to evict entries based on their creation time, evicting the oldest ones first.
Please describe the desired behavior
Basically, FIFO eviction. I would like to specify it directly in the configuration, something like:
<eviction eviction-policy="FIFO" max-size-policy="PER_NODE" size="5000"/>
PrestoDB (https://prestodb.io) is widely used as a SQL frontend for many different data sources, including Elasticsearch and even files in S3. It would be very nice if a Connector were available for Vespa.
Hi, if my Spark app uses two storage types, both S3 and Azure Data Lake Storage Gen2, can I set spark.delta.logStore.class=org.apache.spark.sql.delta.storage.AzureLogStore,org.apache.spark.sql.delta.storage.S3SingleDriverLogStore?
Thanks in advance
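For reference, a minimal PySpark sketch of how the single-class setting is applied today; whether a comma-separated list of LogStore classes (as in the question above) is accepted is exactly what is being asked, so only one class is set here.

```python
# Minimal sketch: spark.delta.logStore.class with a single implementation.
# Combining two comma-separated LogStore classes is the open question above,
# so it is not assumed to work here.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("delta-s3-and-adls")
    .config("spark.delta.logStore.class",
            "org.apache.spark.sql.delta.storage.AzureLogStore")
    .getOrCreate()
)

# With this config, Delta commits on ADLS Gen2 paths go through AzureLogStore;
# it is unclear how S3 paths would use S3SingleDriverLogStore at the same time.
```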
Currently, we test Parquet with the default config, but not with the optimized writer.
See #6382.


Currently, inserts and queries share the same resource pool (the Max Process Count control). When queries run at high TPS, inserts fail with "error: too many process". I think separating the resources for inserts and queries would make sense, to ensure inserts always have enough resources; much like with YARN, inserts and queries would use different resource quotas.
Or, as a simpler approach, can we set a ratio for Insert and