The Wayback Machine - https://web.archive.org/web/20210704045931/https://github.com/topics/data-science

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

Here are 19,772 public repositories matching this topic...

jnothman
jnothman commented May 12, 2021

We should use pkg_resources (or importlib.resources, if our minimum Python version is 3.7) instead of relying on __file__.

$ git grep '__file__' sklearn/
sklearn/__check_build/__init__.py:    local_dir = os.path.split(__file__)[0]
sklearn/datasets/_base.py:    module_path = dirname(__file__)
sklearn/datasets/_base.py:    module_path = dirname(__file__)
sklearn/datasets/_base.py:    
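
As a rough illustration of the suggestion above, here is a minimal sketch of reading a bundled data file through importlib.resources rather than building a path from __file__. Everything named here is an assumption made for the example: the throwaway package `demo_pkg`, the file `data/sample.csv`, and the helpers `make_demo_package`, `load_resource_text`, and `demo` are hypothetical and are not part of scikit-learn. The sketch uses `importlib.resources.files`, which needs Python 3.9 or newer; on Python 3.7 and 3.8 the older `read_text` and `path` functions played the same role.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def make_demo_package(root: Path) -> str:
    """Create a throwaway package (hypothetical name) with one data file."""
    data_dir = root / "demo_pkg" / "data"
    data_dir.mkdir(parents=True)
    (root / "demo_pkg" / "__init__.py").write_text("")
    (data_dir / "sample.csv").write_text("5.1,3.5,1.4,0.2\n")
    return "demo_pkg"


def load_resource_text(package: str, *parts: str) -> str:
    """Read a file shipped inside `package` without touching __file__."""
    resource = importlib.resources.files(package)
    for part in parts:
        # Traversable objects support "/" for descending into children.
        resource = resource / part
    return resource.read_text(encoding="utf-8")


def demo() -> str:
    """Build the demo package in a temp dir, read its data file, clean up."""
    with tempfile.TemporaryDirectory() as tmp:
        name = make_demo_package(Path(tmp))
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            return load_resource_text(name, "data", "sample.csv")
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(name, None)


if __name__ == "__main__":
    print(demo(), end="")
```

The lookup goes through the import system, so the same call can also work when a package is not laid out as plain files on disk, which is one common argument for preferring it over `dirname(__file__)`.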

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated May 13, 2021
  • Python
george-skal
george-skal commented Jun 28, 2021

Hi all!
I am trying a self-play scheme in the waterworld environment: two agents share a policy that is being trained (“shared_policy_1”), and the other 3 agents sample a policy from a menagerie (set) of the first two agents' previous policies (“shared_policy_2”).
My problem is that I see that the weights in the menagerie are overwritten in every iteration by the cur

dash
pytorch-lightning
gahdritz
gahdritz commented Jun 27, 2021

🐛 Bug

If the Trainer's profiler parameter is set to "pytorch" and the Trainer's logger is an instance of LoggerCollection, the profiler fails to write to a local file (with a warning).

The path for said file is derived from [this property](https://github.com/PyTorchLightning/pytorch-lightning/blob/28afc7a10d9f9c1160935fb5c81a1a8c0492b392/pytorch_lightning/trainer/properties.py#L22

gensim
danieldeutsch
danieldeutsch commented Jun 2, 2021

Is your feature request related to a problem? Please describe.
I typically use compressed datasets (e.g. gzipped) to save disk space. This works fine with AllenNLP during training because I can write my dataset reader to load the compressed data. However, the predict command opens the file and reads lines for the Predictor. This fails when it tries to load data from my compressed files.
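
To make the request concrete, here is a minimal sketch of a line reader that handles both plain and gzipped files. It is an illustration only and does not show how AllenNLP's predict command is implemented. The helper names `open_maybe_compressed` and `read_lines` are hypothetical, and choosing the opener from a `.gz` suffix is an assumption made for this example.

```python
import gzip
from pathlib import Path
from typing import IO, Iterator, Union


def open_maybe_compressed(path: Union[str, Path]) -> IO[str]:
    """Open `path` as text, decompressing when it ends in .gz.

    Assumption: compression is detected from the file suffix only.
    """
    path = Path(path)
    if path.suffix == ".gz":
        # "rt" makes gzip return decoded text lines instead of bytes.
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def read_lines(path: Union[str, Path]) -> Iterator[str]:
    """Yield lines without trailing newlines, for plain or gzipped input."""
    with open_maybe_compressed(path) as handle:
        for line in handle:
            yield line.rstrip("\n")
```

A caller that currently iterates over `open(path)` could iterate over `read_lines(path)` instead and get the same lines from `data.jsonl` or `data.jsonl.gz`.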

nni