reinforcement-learning-algorithms
Here are 659 public repositories matching this topic...
The following applies to DDPG and TD3, and possibly to other models. These libraries were installed in a virtual environment:
numpy==1.16.4
stable-baselines==2.10.0
gym==0.14.0
tensorflow==1.14.0
Episode rewards do not appear to be updated in model.learn() before callback.on_step() is called. Depending on which callback.locals variable is used, this means that:
- episode rewards may n
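The ordering issue can be illustrated with a stdlib-only sketch (this is not actual stable-baselines code; the names and numbers are hypothetical). The point is that if the training loop fires the callback before appending the just-finished episode's reward, a callback reading `episode_rewards` from `locals` always sees a list that is one episode behind:

```python
# Hypothetical reduction of the reported ordering: the loop invokes the
# callback *before* appending the finished episode's reward, so the
# callback reads a stale `episode_rewards` list.

def train(num_episodes, on_step):
    locals_dict = {"episode_rewards": []}
    seen_by_callback = []
    for ep in range(num_episodes):
        ep_reward = float(ep + 1)                 # pretend this episode earned ep+1
        on_step(locals_dict, seen_by_callback)    # callback fires first ...
        locals_dict["episode_rewards"].append(ep_reward)  # ... reward appended after
    return seen_by_callback

def on_step(locals_dict, seen):
    # What a monitoring callback would observe at this point in the loop:
    rewards = locals_dict["episode_rewards"]
    seen.append(rewards[-1] if rewards else None)

print(train(3, on_step))  # [None, 1.0, 2.0] -- always one episode behind
```

If this matches the real loop order, a workaround is to read the reward one step later (or compute it from the raw step rewards instead of `episode_rewards`).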
gtrxl
How do I run GTrXL with a PPO policy? Can someone provide an example?
There seem to be some fragile parts of our code that could fail easily. I suggest adding more unit tests for the following:
- Custom agents (there are only VPG and PPO on CartPole-v0 as of now; we should preferably add more to cover discrete off-policy, continuous off-policy, and continuous on-policy)
- Evaluation for the Bandits and Classical agents
- Testing of convergence of agents as proposed i
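As a sketch of what a convergence test for the Bandits agents could look like: the `EpsGreedyAgent` below is a stdlib-only stand-in (not the repository's actual class), trained on a synthetic Bernoulli bandit and then asserted to have identified the best arm.

```python
# Hypothetical convergence unit test for a bandit agent.
# EpsGreedyAgent is a minimal stand-in, not the repo's real implementation.
import random

class EpsGreedyAgent:
    def __init__(self, n_arms, eps=0.1):
        self.eps = eps
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms

    def select(self):
        # Explore with probability eps, otherwise pick the current best arm.
        if random.random() < self.eps:
            return random.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, arm, reward):
        # Incremental mean update of the arm's value estimate.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

def test_bandit_convergence():
    random.seed(0)
    true_means = [0.1, 0.5, 0.9]          # arm 2 is the best arm
    agent = EpsGreedyAgent(n_arms=3)
    for _ in range(5000):
        arm = agent.select()
        reward = 1.0 if random.random() < true_means[arm] else 0.0
        agent.update(arm, reward)
    # After training, the agent's value estimates should rank arm 2 highest.
    assert max(range(3), key=lambda a: agent.values[a]) == 2

test_bandit_convergence()
```

A real test would instead import the repository's agent and environment, but the structure (seed, train, assert on the learned policy) would be the same.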
The documentation of the DQN agent (https://stable-baselines3.readthedocs.io/en/master/modules/dqn.html) specifies that the log_interval parameter is "The number of timesteps before logging". However, when it is set to 1 (or any other value), logging does not happen at that pace; it instead happens every log_interval episodes (not timesteps). In the example below, this occurs every 200 timesteps.
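The mismatch can be shown with a stdlib-only simulation (hypothetical numbers: fixed 200-step episodes, as in the report). Under the documented meaning, log_interval=1 would log every timestep; under the observed behaviour, logging happens every log_interval episodes, i.e. every 200 timesteps here:

```python
# Simulation of the observed episode-based logging (not SB3 code).
# Assumes fixed-length episodes of 200 timesteps, as in the report above.
EPISODE_LEN = 200
log_interval = 1

def observed_log_timesteps(total_timesteps):
    """Timesteps at which logging actually fires: every `log_interval` episodes."""
    logs = []
    episodes_done = 0
    for t in range(1, total_timesteps + 1):
        if t % EPISODE_LEN == 0:              # an episode just ended
            episodes_done += 1
            if episodes_done % log_interval == 0:
                logs.append(t)
    return logs

print(observed_log_timesteps(600))  # [200, 400, 600] -- every 200 timesteps
```

So either the docstring should say "episodes" rather than "timesteps", or the logging condition should be keyed on timesteps.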