dqn
Here are 532 public repositories matching this topic...
Updated May 29, 2020 - Python
I tried some RNN regression learning based on the code in the "PyTorch-Tutorial/tutorial-contents/403_RNN_regressor.py" file, which did not work for me at all.
According to an accepted answer on Stack Overflow (https://stackoverflow.com/questions/52857213/recurrent-network-rnn-wont-learn-a-very-simple-function-plots-shown-in-the-q?noredirect=1#comment92916825_52857213), it turns out that the li
Updated Feb 18, 2019 - Python
The OpenAI Gym installation instructions are missing a reference to the "Build Tools for Visual Studio 2019", which are available from the following site.
https://visualstudio.microsoft.com/downloads/
I also found this by reading the following article.
https://towardsdatascience.com/how-to-install-openai-gym-in-a-windows-environment-338969e24d30
Even though this is an issue in OpenAI Gym, a note in this RE
Updated Apr 21, 2020 - Jupyter Notebook
I was surprised to see this loss function, because it is generally used when the target is a distribution (i.e. sums to 1). This is not the case for the advantage estimate. However, I worked out the math, and it does appear to be doing the right thing, which is neat!
I think this trick should be mentioned in the code.
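The snippet does not name the loss, so the following is only a sketch under an assumption: that the loss is categorical cross-entropy and that the target is a one-hot action vector scaled by the advantage. The function names and the numbers are hypothetical, chosen for illustration. Under that assumption the cross-entropy collapses to the usual policy-gradient term, -advantage * log(pi(action)), even though the target does not sum to 1.

```python
import math


def categorical_cross_entropy(target, probs):
    # Standard definition: -sum_i target_i * log(probs_i).
    return -sum(t * math.log(p) for t, p in zip(target, probs))


def policy_gradient_loss(action, advantage, probs):
    # Policy-gradient surrogate loss for a single step.
    return -advantage * math.log(probs[action])


# Hypothetical values, for illustration only.
probs = [0.2, 0.5, 0.3]   # policy output over three actions
action = 1                # action that was taken
advantage = 2.5           # advantage estimate for that action

# Assumed target: one-hot action vector scaled by the advantage.
target = [0.0] * len(probs)
target[action] = advantage

ce = categorical_cross_entropy(target, probs)
pg = policy_gradient_loss(action, advantage, probs)
print(math.isclose(ce, pg))
```

Only the entry for the taken action is non-zero in the target, so every other term of the sum vanishes and the two losses coincide.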
Updated Mar 23, 2019 - Python
- I have marked all applicable categories:
  - exception-raising bug
  - RL algorithm bug
  - documentation request (i.e. "X is missing from the documentation.")
  - new feature request
- I have visited the [source website], and in particular read the [known issues]
- I have searched through the [issue tracker] for duplicates
- I have mentioned versio
Updated Jan 17, 2020 - Python
2019-09-17 15:58:06.381228 STDOUT 2012] | chainerrl/tests/links_tests/test_stateless_recurrent_sequential.py .F... [100%]
-- | --
2019-09-17 15:58:06.381228 STDOUT 2012] |
2019-09-17 15:58:06.381229 STDOUT 2012] | =================================== FAILURES ===================================
2019-09-17 15:58:06.381230 STDOUT 2012] | ___________ TestStatelessRecurrentSequential.test_n_
Updated Jun 9, 2020 - Python
Understanding the build process is currently quite difficult because it happens partly in the graph builder, in static and non-static parts of Component, and in various utils.
We should:
- Make the purpose of each build op fully clear
- Fully document the structure of the IR generated by the two builds (potentially reviving the visualisation project for this)
- Clarify the use of build ops in gra
In 1-grid-word ---> 1-policy-iteration,
the order of width and height is inconsistent throughout the code.
The code sets width=5, height=5, so it works,
but with width=5, height=6 it does not.
For example,
self.value_table = [[0.0] * env.width for _ in range(env.height)] # height x width
--->
self.value_table = [[0.0] * env.height for _ in range(env.width)] # width x height
I think the whole codebase needs some rework.
In the graphics
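The width/height mismatch reported in this issue can be reproduced with a small sketch. It assumes the surrounding code indexes the table as `table[x][y]`, with `x` ranging over the width and `y` over the height. That indexing convention, the `GridEnv` class, and the helper names below are hypothetical and not taken from the repository.

```python
class GridEnv:
    """Hypothetical stand-in for the environment; only width and height matter."""

    def __init__(self, width, height):
        self.width = width
        self.height = height


def height_major_table(env):
    # `height` rows of `width` entries: valid indices are [y][x].
    return [[0.0] * env.width for _ in range(env.height)]


def width_major_table(env):
    # `width` rows of `height` entries: valid indices are [x][y].
    return [[0.0] * env.height for _ in range(env.width)]


def supports_xy_indexing(table, env):
    """Return True if table[x][y] is valid for every cell of the grid."""
    try:
        for x in range(env.width):
            for y in range(env.height):
                table[x][y]
        return True
    except IndexError:
        return False


square = GridEnv(width=5, height=5)
tall = GridEnv(width=5, height=6)

# A square grid hides the mismatch; a non-square grid exposes it.
print(supports_xy_indexing(height_major_table(square), square))
print(supports_xy_indexing(height_major_table(tall), tall))
print(supports_xy_indexing(width_major_table(tall), tall))
```

Either layout works as long as the construction and every lookup agree on it. The error appears only when a table built in one order is indexed in the other.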


I understand that these two Python files show two different methods of constructing a model. The original n_epoch is 500, which works perfectly for both files. But if I change n_epoch to 20, only tutorial_mnist_mlp_static.py achieves a high test accuracy (~0.97). The other file, tutorial_mnist_mlp_static_2.py, only gets 0.47.
The models built from these two files look the same to me (the s