generalized-advantage-estimation

An implementation of Proximal Policy Optimization, a state-of-the-art family of reinforcement learning algorithms, using normalized Generalized Advantage Estimation and optional batch-mode training. The loss function incorporates an entropy bonus.

python machine-learning reinforcement-learning entropy deep-learning neural-network optimization gae pytorch rl actor-critic proximal-policy-optimization ppo open-ai open-ai-gym generalized-advantage-estimation ppo-pytorch

Updated Dec 26, 2022
Python
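
For readers unfamiliar with the topic tag, the following is a minimal sketch of normalized Generalized Advantage Estimation for a single rollout. It is not taken from any repository listed here; the function names `gae_advantages` and `normalize`, the default values for `gamma` and `lam`, and the list-based inputs are all assumptions chosen for illustration.

```python
from math import sqrt


def gae_advantages(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Hypothetical helper: GAE advantages for one rollout.

    rewards, values, dones are equal-length lists; last_value is the
    value estimate for the state that follows the final step.
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    # Walk the rollout backwards, accumulating discounted TD residuals.
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD residual: r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # A_t = delta_t + gamma * lambda * A_{t+1}, reset at episode ends.
        running = delta + gamma * lam * not_done * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def normalize(xs, eps=1e-8):
    """Hypothetical helper: shift to zero mean and scale to unit std."""
    mean = sum(xs) / len(xs)
    std = sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    return [(x - mean) / (std + eps) for x in xs]
```

With `lam=1` the estimate reduces to the discounted return minus the value baseline, and with `lam=0` it reduces to the one-step TD residual; intermediate values trade bias against variance.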

nslyubaykin / rnns_for_pomdp

Recurrent Policies for Handling Partially Observable Environments

reinforcement-learning gae lstm policy-gradient pomdp proximal-policy-optimization ppo reccurent-neural-network partially-observable-environment generalized-advantage-estimation

Updated Aug 29, 2022
Jupyter Notebook

nslyubaykin / relax_trpo_example

Example TRPO implementation with ReLAx

reinforcement-learning gae policy-gradient reinforcement-learning-algorithms continuous-control trpo generalized-advantage-estimation discrete-control

Updated Aug 29, 2022
Jupyter Notebook

nslyubaykin / relax_ppo_example

Example PPO implementation with ReLAx

reinforcement-learning gae policy-gradient reinforcement-learning-algorithms continuous-control proximal-policy-optimization ppo generalized-advantage-estimation discrete-control

Updated Aug 29, 2022
Jupyter Notebook

shaheennabi / Proximal-Policy-Optimization-PPO

Modular implementation of Proximal Policy Optimization (PPO). PPO is a policy gradient reinforcement learning algorithm introduced by OpenAI in 2017. It is designed to be a simpler, more stable, and more sample-efficient alternative to earlier policy gradient methods such as A3C and TRPO (Trust Region Policy Optimization).

reinforcement-learning policy-gradient proximal-policy-optimization generalized-advantage-estimation

Updated Jan 28, 2026
Python

Here are 9 public repositories matching this topic...

bentrevett / pytorch-rl

adik993 / ppo-pytorch

hcnoh / rl-collection-pytorch

leaderj1001 / Phasic-Policy-Gradient

tomasspangelo / proximal-policy-optimization

nslyubaykin / rnns_for_pomdp

nslyubaykin / relax_trpo_example

nslyubaykin / relax_ppo_example

shaheennabi / Proximal-Policy-Optimization-PPO
