reinforcement-learning
Here are 4,412 public repositories matching this topic...
Description
I'm running a simple test on IWSLT'15 English-Vietnamese:
t2t-trainer --problem=translate_envi_iwslt32k --model=transformer --hparams_set=transformer_base_single_gpu --data_dir=data/ --output_dir=output/
It works but I get different results each time -- even when running on CPU only. I guess there are some random seeds that need to be set (see #485) but t2t-trainer does
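For what it's worth, the generic mechanism behind such nondeterminism is unseeded RNGs; a minimal sketch of why pinning the seed restores reproducibility (for a TensorFlow trainer the analogous fix would be seeding Python, NumPy and TensorFlow, e.g. via tf.set_random_seed — that is an assumption here, not a documented t2t-trainer flag):

```python
import random

# Minimal sketch: an unseeded run differs between invocations,
# a seeded one does not.
def noisy_eval(seed=None):
    rng = random.Random(seed)    # local RNG; global state stays untouched
    return [rng.uniform(0, 1) for _ in range(3)]

seeded_a = noisy_eval(seed=42)
seeded_b = noisy_eval(seed=42)
assert seeded_a == seeded_b      # identical results once the seed is fixed
```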
The documentation of the YAML config parameters is missing an explanatory entry for the optional learning_rate_schedule property, described more here.
(Sorry, not a bug, but didn't find a b
Vcpkg is a C++ dependency management system that makes installing and consuming libraries as dependencies very easy. We should support this for VW so that the library can be consumed as easily as possible.
Instructions for creating a new package can be found here: https://github.com/microsoft/vcpkg/blob/master/docs/examples/packaging-github-repos.md
Hello,
I'm starting to work on building a StarCraft II AI using pysc2.
I followed the tutorial https://itnext.io/build-a-zerg-bot-with-pysc2-2-0-295375d2f58e.
It's clear, but it doesn't answer some important questions.
Is detailed information available somewhere?
What I want to do is split my army into several parts and send them to different locations.
The same goes for the drones.
The
Currently, I want to use the Ant, Hopper, Walker2D and HalfCheetah environments.
However, it's quite hard to work out the physical meaning of each dimension of the state and action spaces from the XML code. Does anyone know of a document that explains the state and action spaces in detail?
I understand that these two Python files show two different methods of constructing a model. The original n_epoch is 500, which works perfectly for both files. But if I change n_epoch to 20, only tutorial_mnist_mlp_static.py achieves a high test accuracy (~0.97); tutorial_mnist_mlp_static_2.py only reaches 0.47.
The models built by the two files look the same to me (the s
I tried some RNN regression learning based on the code in the "PyTorch-Tutorial/tutorial-contents/403_RNN_regressor.py" file, which did not work for me at all.
According to an accepted answer on Stack Overflow (https://stackoverflow.com/questions/52857213/recurrent-network-rnn-wont-learn-a-very-simple-function-plots-shown-in-the-q?noredirect=1#comment92916825_52857213), it turns out that the li
Since Trax is a successor of tensor2tensor (according to the release notes of tensor2tensor v1.15.0), it would be helpful if you could provide examples for more advanced machine learning tasks. An outstanding feature of tensor2tensor is its numerous (and useful) examples, which Trax currently lacks. Such examples would especi
Jupyter containers hosted by Coursera cause a lot of trouble. Perhaps more than they are worth.
- They are very limited in terms of lifetime and CPU. The docs say 90 minutes / 0.5-2 CPUs. That's definitely insufficient to train Breakout, for example.
- Updating them is inconvenient. We don't h
Lately I have been running into too many SageMaker issues. Is there any unambiguous documentation on SageMaker instances? I could glean the following from different sources:
- SageMaker instances, SageMaker being a managed service, have nothing to do with EC2 instances.
- Unlike the EC2 console, the SageMaker console has no option to view or increase limits. One has to go directly to the support page a
In one line
A study comparing AutoML algorithms against random search. The learned policy is compared with random sampling under matched conditions (random search uses multiple seeds, and the final models are trained for the same number of epochs). The result: nothing substantially outperformed random search. There is also the important finding that weight sharing degrades the search results.
Paper link
https://arxiv.org/abs/1902.08142
Authors / Affiliations
Christian Sciuto, Kaicheng Yu, Martin Jaggi, Claudiu Musat, Mathieu Salzmann
- AI Lab, Swisscom
- CV Lab, EPFL
- MLO Lab, EPFL
Submission date (yyyy/MM/d
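The seeded random-search baseline described in the summary above (multiple seeds, matched budget per run) can be sketched in a few lines of pure Python; the search space and the score function below are invented placeholders, not the paper's actual setup:

```python
import random

# Toy search space and score, standing in for "sample an architecture,
# train it for a fixed number of epochs, return validation accuracy".
SEARCH_SPACE = {"lr": [1e-4, 1e-3, 1e-2], "layers": [1, 2, 3]}

def sample_config(rng):
    return {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}

def score(config):
    return config["layers"] * 0.1   # deterministic toy metric

def random_search(seed, trials=10):
    rng = random.Random(seed)       # one RNG per search run, as the protocol requires
    return max((sample_config(rng) for _ in range(trials)), key=score)

# Run the search under several seeds and report all of them, rather than
# cherry-picking the best single run.
results = [random_search(seed) for seed in range(5)]
```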
The README.md doesn't indicate how to contribute. I made a local branch to fix issue #34 but when I try to push the branch I get the following error, which indicates I don't have permission.
Please indicate in the README how to contribute.
Thanks.
Documentation
Can you please add proper documentation, at least for src_cpp/elf?
I'm not asking for documentation for src_cpp/elfgames.
Can you describe what modifications need to be made to replace dynamic_rnn with tf.keras.RNN in the many-to-one example, since dynamic_rnn is now deprecated?
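A minimal sketch of a many-to-one model using tf.keras.layers.RNN in place of the deprecated tf.nn.dynamic_rnn; the shapes (28 timesteps of 28 features, 10 classes, 128 units) are illustrative assumptions, not taken from the example in question:

```python
import tensorflow as tf

# Many-to-one: the RNN layer keeps only the last timestep's output
# (return_sequences=False), which replaces taking outputs[:, -1, :]
# from tf.nn.dynamic_rnn.
model = tf.keras.Sequential([
    tf.keras.layers.RNN(tf.keras.layers.LSTMCell(128),
                        return_sequences=False,
                        input_shape=(28, 28)),   # (timesteps, features)
    tf.keras.layers.Dense(10),                   # logits for 10 classes
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
```

Unlike dynamic_rnn, which returned (outputs, state) and left you to slice out the final step, the Keras layer handles the unrolling and the many-to-one reduction itself.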
How do I use Watcher / WatcherClient over a TCP/IP network?
Watcher seems to be a ZMQ server, and WatcherClient a ZMQ client, but there is no API/interface to configure the server IP address.
Do I need to implement a class that inherits from WatcherClient?
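For reference, the generic pyzmq pattern for pointing a client at a server over TCP looks like the sketch below; whether Watcher/WatcherClient expose these endpoints (or the socket type they actually use) is an assumption, not something documented here:

```python
import zmq

# Hedged sketch of the underlying ZMQ plumbing, not the library's actual API:
# the server binds on an interface, the client connects to the server's IP.
ctx = zmq.Context.instance()

server = ctx.socket(zmq.PAIR)
port = server.bind_to_random_port("tcp://127.0.0.1")  # in practice: "tcp://*:<port>"

client = ctx.socket(zmq.PAIR)
client.connect(f"tcp://127.0.0.1:{port}")             # in practice: the server's IP

client.send_string("ping")
msg = server.recv_string()
```

If the client class hard-codes its endpoint, subclassing it to override the connect address (as the question suggests) is a reasonable workaround.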
Given a remote function, we switched from the old ._remote call syntax to .options(). Regarding deprecating ._remote, there are several steps.