big-data
Here are 2,441 public repositories matching this topic...
Problem: the approximate method can still be slow for models with many trees.
catboost version: master
Operating System: ubuntu 18.04
CPU: i9
GPU: RTX2080
It would be good to be able to specify how many trees to use for the Shapley value computation. The model.predict and prediction_type versions already allow this, and so do LightGBM and XGBoost.
x-arkime-cookies
Change all x-moloch-cookies to x-arkime-cookies in tests and middleware.
There is no technical difficulty in supporting the includeValue option; it looks like we are just missing it at the API level.
See SO question
... to make it easier to read Vespa documentation on an e-reader / offline
Vespa documentation is generated using Jekyll from .md and .html files. Look into options for generating the artifact as part of site generation (there might be plugins we can use here).
Currently, there is no authentication in our MongoDB tests.
Setting the MONGO_INITDB_ROOT_USERNAME and MONGO_INITDB_ROOT_PASSWORD environment variables enables auth in the MongoDB Docker image: https://hub.docker.com/_/mongo
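Once auth is enabled this way, the tests would need to connect with those credentials. A minimal sketch of how a test helper might build the connection URI from the same two environment variables follows. The function name `mongo_test_uri`, the default host and port, and the fallback to an unauthenticated URI are assumptions for illustration, not part of the project's test suite.

```python
import os
from urllib.parse import quote_plus


def mongo_test_uri(host: str = "localhost", port: int = 27017, env=os.environ) -> str:
    """Build a MongoDB URI for tests (hypothetical helper).

    Uses the credentials from the variables the Docker image reads,
    if both are present; otherwise returns an unauthenticated URI.
    """
    user = env.get("MONGO_INITDB_ROOT_USERNAME")
    password = env.get("MONGO_INITDB_ROOT_PASSWORD")
    if not user or not password:
        # No credentials configured: connect without authentication.
        return f"mongodb://{host}:{port}/"
    # Percent-encode credentials so characters like '@' or '/' stay valid in the URI.
    # The Docker image creates the root user in the "admin" database.
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/?authSource=admin"
    )
```

Passing `env` explicitly keeps the helper easy to test without touching the real process environment.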
Hi, if my Spark app uses two storage types, both S3 and Azure Data Lake Storage Gen2, could I set spark.delta.logStore.class=org.apache.spark.sql.delta.storage.AzureLogStore, org.apache.spark.sql.delta.storage.S3SingleDriverLogStore?
Thanks in advance
Use case:
Right now one can only use date_trunc() to easily define time buckets. date_trunc() only supports predefined time intervals like 1 minute, 1 hour, etc. In time-series use cases it is often necessary to define other time bucket sizes, e.g. '5 minutes' or '20 minutes'.
A workaround for this is the error-prone integer division on the timestamp, e.g.
S-
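To illustrate the integer-division idea behind that workaround, here is a minimal Python sketch that floors a timestamp to the start of an arbitrary-width bucket. The function name `time_bucket` and the choice to align buckets to the Unix epoch are assumptions for illustration; this is not the database's implementation or the SQL from the issue.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def time_bucket(ts: datetime, width: timedelta) -> datetime:
    """Return the start of the bucket of size `width` that contains `ts`.

    Buckets are aligned to the Unix epoch (an assumption of this sketch).
    `ts` must be timezone-aware.
    """
    # Dividing two timedeltas with // yields the whole number of buckets elapsed.
    buckets_elapsed = (ts - EPOCH) // width
    return EPOCH + buckets_elapsed * width


ts = datetime(2021, 5, 19, 10, 17, 42, tzinfo=timezone.utc)
five_min = time_bucket(ts, timedelta(minutes=5))     # 2021-05-19 10:15:00+00:00
twenty_min = time_bucket(ts, timedelta(minutes=20))  # 2021-05-19 10:00:00+00:00
```

Written by hand in a query, the same arithmetic is easy to get wrong (units, rounding direction, alignment), which is the case for a dedicated bucketing function.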


Lots of error cases are untested, which is worth improving but not urgently so.
Here are some chunks of code that should better be tested (line numbers as found in rev 0448bfbb9a1845a3cca5d07a32fdd4590f538713, [HTML coverage report](https://github.com/cython/cython/suites/2745137547/artifa