reinforcement-learning
Here are 4,412 public repositories matching this topic...
Description
I'm running a simple test on IWSLT'15 English-Vietnamese:
t2t-trainer --problem=translate_envi_iwslt32k --model=transformer --hparams_set=transformer_base_single_gpu --data_dir=data/ --output_dir=output/
It works but I get different results each time -- even when running on CPU only. I guess there are some random seeds that need to be set (see #485) but t2t-trainer does
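For what it's worth, the generic mechanism behind such nondeterminism is unseeded RNGs; a minimal sketch of why pinning the seed restores reproducibility (for a TensorFlow trainer the analogous fix would be seeding Python, NumPy and TensorFlow, e.g. via tf.set_random_seed — that is an assumption here, not a documented t2t-trainer flag):

```python
import random

# Minimal sketch: an unseeded run differs between invocations,
# a seeded one does not.
def noisy_eval(seed=None):
    rng = random.Random(seed)    # local RNG; global state stays untouched
    return [rng.uniform(0, 1) for _ in range(3)]

seeded_a = noisy_eval(seed=42)
seeded_b = noisy_eval(seed=42)
assert seeded_a == seeded_b      # identical results once the seed is fixed
```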
The documentation of the YAML config parameters is missing an explanatory entry for the optional learning_rate_schedule property, described more here.
(Sorry, not a bug, but didn't find a b
Vcpkg is a C++ dependency management system that makes installing and consuming libraries as dependencies very easy. We should support this for VW so that the library can be consumed as easily as possible.
Instructions for creating a new package can be found here: https://github.com/microsoft/vcpkg/blob/master/docs/examples/packaging-github-repos.md
Hello,
I'm starting to work on building a StarCraft II AI using pysc2.
I followed the tutorial https://itnext.io/build-a-zerg-bot-with-pysc2-2-0-295375d2f58e.
It's clear, but it doesn't answer some important questions.
Is detailed information available somewhere?
What I want to do is split my army into several parts and send them to different locations.
The same goes for the drones.
The
Currently, I want to use the Ant, Hopper, Walker2D and HalfCheetah environments.
However, it's quite hard to work out the physical meaning of each dimension of the state and action spaces from the XML code. Does anyone know of a document that explains the state and action spaces in detail?
I understand that these two Python files show two different methods of constructing a model. The original n_epoch is 500, which works perfectly for both files. But if I change n_epoch to 20, only tutorial_mnist_mlp_static.py achieves a high test accuracy (~0.97); tutorial_mnist_mlp_static_2.py only reaches 0.47.
The models built by the two files look the same to me (the s
I tried some RNN regression learning based on the code in the "PyTorch-Tutorial/tutorial-contents/403_RNN_regressor.py" file, which did not work for me at all.
According to an accepted answer on Stack Overflow (https://stackoverflow.com/questions/52857213/recurrent-network-rnn-wont-learn-a-very-simple-function-plots-shown-in-the-q?noredirect=1#comment92916825_52857213), it turns out that the li
Since Trax is a successor of tensor2tensor (according to the release notes of tensor2tensor v1.15.0), it would be helpful if you could provide examples for more advanced machine learning tasks. An outstanding feature of tensor2tensor is its numerous (and useful) examples, which Trax currently lacks. Such examples would especi
Jupyter containers hosted by Coursera cause a lot of trouble. Perhaps more than they are worth.
- They are very limited in terms of lifetime and CPU. The docs say 90 minutes / 0.5-2 CPUs. That's definitely insufficient to train Breakout, for example.
- Updating them is inconvenient. We don't h
Lately I have been running into too many SageMaker issues. Is there any unambiguous documentation on SageMaker instances? I could glean the following from different sources:
- SageMaker instances, SageMaker being a managed service, have nothing to do with EC2 instances.
- Unlike the EC2 console, the SageMaker console has no option to view or increase limits. One has to go directly to the support page a
In one line
A study comparing AutoML algorithms against random search. The learned policy is compared with random sampling under matched conditions (random search uses multiple seeds, and the final models are trained for the same number of epochs). The result: nothing substantially outperformed random search. There is also the important finding that weight sharing degrades the search results.
Paper link
https://arxiv.org/abs/1902.08142
Authors / Affiliations
Christian Sciuto, Kaicheng Yu, Martin Jaggi, Claudiu Musat, Mathieu Salzmann
- AI Lab, Swisscom
- CV Lab, EPFL
- MLO Lab, EPFL
Submission date (yyyy/MM/d
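The seeded random-search baseline described in the summary above (multiple seeds, matched budget per run) can be sketched in a few lines of pure Python; the search space and the score function below are invented placeholders, not the paper's actual setup:

```python
import random

# Toy search space and score, standing in for "sample an architecture,
# train it for a fixed number of epochs, return validation accuracy".
SEARCH_SPACE = {"lr": [1e-4, 1e-3, 1e-2], "layers": [1, 2, 3]}

def sample_config(rng):
    return {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}

def score(config):
    return config["layers"] * 0.1   # deterministic toy metric

def random_search(seed, trials=10):
    rng = random.Random(seed)       # one RNG per search run, as the protocol requires
    return max((sample_config(rng) for _ in range(trials)), key=score)

# Run the search under several seeds and report all of them, rather than
# cherry-picking the best single run.
results = [random_search(seed) for seed in range(5)]
```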
The README.md doesn't indicate how to contribute. I made a local branch to fix issue #34 but when I try to push the branch I get the following error, which indicates I don't have permission.
Please indicate in the README how to contribute.
Thanks.
Documentation
Can you please add proper documentation, at least for src_cpp/elf?
I'm not asking for documentation for src_cpp/elfgames.
Can you describe what modifications need to be made to replace dynamic_rnn with tf.keras.RNN in the many-to-one example, since dynamic_rnn is now deprecated?
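A minimal sketch of a many-to-one model using tf.keras.layers.RNN in place of the deprecated tf.nn.dynamic_rnn; the shapes (28 timesteps of 28 features, 10 classes, 128 units) are illustrative assumptions, not taken from the example in question:

```python
import tensorflow as tf

# Many-to-one: the RNN layer keeps only the last timestep's output
# (return_sequences=False), which replaces taking outputs[:, -1, :]
# from tf.nn.dynamic_rnn.
model = tf.keras.Sequential([
    tf.keras.layers.RNN(tf.keras.layers.LSTMCell(128),
                        return_sequences=False,
                        input_shape=(28, 28)),   # (timesteps, features)
    tf.keras.layers.Dense(10),                   # logits for 10 classes
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
```

Unlike dynamic_rnn, which returned (outputs, state) and left you to slice out the final step, the Keras layer handles the unrolling and the many-to-one reduction itself.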
How do I use Watcher / WatcherClient over a TCP/IP network?
Watcher seems to be a ZMQ server, and WatcherClient a ZMQ client, but there is no API/interface to configure the server IP address.
Do I need to implement a class that inherits from WatcherClient?
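For reference, the generic pyzmq pattern for pointing a client at a server over TCP looks like the sketch below; whether Watcher/WatcherClient expose these endpoints (or the socket type they actually use) is an assumption, not something documented here:

```python
import zmq

# Hedged sketch of the underlying ZMQ plumbing, not the library's actual API:
# the server binds on an interface, the client connects to the server's IP.
ctx = zmq.Context.instance()

server = ctx.socket(zmq.PAIR)
port = server.bind_to_random_port("tcp://127.0.0.1")  # in practice: "tcp://*:<port>"

client = ctx.socket(zmq.PAIR)
client.connect(f"tcp://127.0.0.1:{port}")             # in practice: the server's IP

client.send_string("ping")
msg = server.recv_string()
```

If the client class hard-codes its endpoint, subclassing it to override the connect address (as the question suggests) is a reasonable workaround.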
Given a remote function, we switched from the old ._remote call syntax to .options(). Regarding deprecating ._remote, there are several steps.