PyTorch implementations of Advantage Actor-Critic (A2C), Proximal Policy Optimization (PPO), the scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR), and Generative Adversarial Imitation Learning (GAIL).
reinforcement-learning
deep-learning
deep-reinforcement-learning
pytorch
atari
hessian
second-order
continuous-control
actor-critic
ale
mujoco
proximal-policy-optimization
ppo
advantage-actor-critic
a2c
acktr
natural-gradients
roboschool
kfac
kronecker-factored-approximation
Updated Jan 17, 2021 - Python

