🐛 Bug
The documentation of the DQN agent (https://stable-baselines3.readthedocs.io/en/master/modules/dqn.html) describes the log_interval parameter as "The number of timesteps before logging". However, when it is set to 1 (or any other value), logging does not happen at that pace; instead it happens every log_interval episodes (not timesteps). In the example below, logging happens every 200 timesteps.
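A minimal script along those lines might look like the following (a sketch; the environment, total timesteps, and hyperparameters are assumptions, not the original report's setup):

```python
import gym
from stable_baselines3 import DQN

env = gym.make("CartPole-v1")
model = DQN("MlpPolicy", env, verbose=1)

# The docs describe log_interval as a number of timesteps, but logging is
# actually triggered every `log_interval` episodes, so with log_interval=1
# a log line appears once per episode rather than once per timestep.
model.learn(total_timesteps=10_000, log_interval=1)
```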
There seem to be some fragile areas in our code that could fail easily. I suggest adding more unit tests for the following (a rough test skeleton is sketched after this list):
- Custom agents (there are only VPG and PPO on CartPole-v0 as of now; we should preferably add more to cover discrete off-policy, continuous off-policy, and continuous on-policy agents)
- Evaluation for the Bandits and Classical agents
- Testing of convergence of agents as proposed in …
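A rough skeleton of such a convergence test might look like this (the agent below is a hypothetical placeholder, not one of the library's classes; the real test would construct one of the discrete/continuous, on-/off-policy agents instead):

```python
import gym
import numpy as np


class PlaceholderAgent:
    """Stand-in for a real agent; swap in e.g. a continuous off-policy agent."""

    def __init__(self, env):
        self.env = env

    def train(self, timesteps):
        pass  # a real agent would learn here

    def select_action(self, obs):
        return self.env.action_space.sample()


def mean_episode_reward(agent, env, episodes=10):
    """Roll out the agent for a few episodes and average the returns."""
    totals = []
    for _ in range(episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done, _ = env.step(agent.select_action(obs))
            total += reward
        totals.append(total)
    return float(np.mean(totals))


def test_agent_converges_on_cartpole():
    env = gym.make("CartPole-v0")
    agent = PlaceholderAgent(env)  # replace with a library agent
    agent.train(timesteps=20_000)
    # A trained agent should clear a loose reward threshold on CartPole-v0.
    assert mean_episode_reward(agent, env) > 150
```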
A great repo and paper: https://github.com/golsun/deep-RL-trading
This could be useful for FinRL, perhaps as a helper / environment function. Training first on rather simple, idealized synthetic prices before feeding in real data might help teach the agent the "basics". It is also great for testing. A small generator sketch follows the list:
- Sine wave
- Trend curves
- Random walk
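A minimal sketch of such generators (the function names and parameters are illustrative, not part of FinRL):

```python
import numpy as np


def sine_wave(n=1000, period=100, amplitude=10.0, base=100.0):
    """Idealized cyclic price series."""
    t = np.arange(n)
    return base + amplitude * np.sin(2 * np.pi * t / period)


def trend_curve(n=1000, slope=0.05, base=100.0):
    """Simple linear drift."""
    return base + slope * np.arange(n)


def random_walk(n=1000, sigma=1.0, base=100.0, seed=0):
    """Gaussian random walk around a starting price."""
    rng = np.random.default_rng(seed)
    return base + np.cumsum(rng.normal(0.0, sigma, size=n))


# Example: start training on the sine wave, then move to harder series.
prices = sine_wave()
```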


This applies to DDPG and TD3, and possibly other models. The following libraries were installed in a virtual environment:
numpy==1.16.4
stable-baselines==2.10.0
gym==0.14.0
tensorflow==1.14.0
Episode rewards do not seem to be updated in model.learn() before callback.on_step() is called. Depending on which callback.locals variable is used, this means that:
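For reference, a minimal callback that inspects callback.locals during learn() might look like this (a sketch; which reward-related keys appear in locals depends on the algorithm and version, so the key names below are probes, not a confirmed API):

```python
from stable_baselines import TD3
from stable_baselines.common.callbacks import BaseCallback


class InspectLocalsCallback(BaseCallback):
    def _on_step(self) -> bool:
        # Print whatever reward-related entries this algorithm exposes.
        for key in ("episode_rewards", "episode_reward", "reward"):
            if key in self.locals:
                print(self.num_timesteps, key, self.locals[key])
        return True


model = TD3("MlpPolicy", "Pendulum-v0", verbose=0)
model.learn(total_timesteps=1000, callback=InspectLocalsCallback())
```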