generalized-advantage-estimation

Here are 9 public repositories matching this topic...

An implementation of Proximal Policy Optimization, a state-of-the-art family of reinforcement learning algorithms, using normalized Generalized Advantage Estimation and optional batch-mode training. The loss function incorporates an entropy bonus.

  • Updated Dec 26, 2022
  • Python
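
For readers new to the topic, normalized Generalized Advantage Estimation can be sketched roughly as follows. This is a minimal illustration, not code from the repository above: the function names `gae_advantages` and `normalize`, the argument layout, and the default `gamma` and `lam` values are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def gae_advantages(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Compute GAE advantages for one rollout (hypothetical helper).

    rewards, values, dones are equal-length sequences for steps 0..T-1;
    last_value is the critic's estimate for the state after the final step.
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    # Walk the rollout backwards, accumulating discounted TD residuals.
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD residual: r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # Recursion: A_t = delta_t + gamma * lam * A_{t+1}, reset at episode ends.
        running = delta + gamma * lam * not_done * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def normalize(advantages, eps=1e-8):
    """Shift to zero mean and scale to unit standard deviation."""
    mu = mean(advantages)
    sigma = pstdev(advantages)
    return [(a - mu) / (sigma + eps) for a in advantages]
```

Calling `normalize(gae_advantages(...))` on a batch of rollout data gives the normalized advantages; setting `lam=1.0` recovers discounted Monte Carlo returns minus the value baseline, while `lam=0.0` reduces to the one-step TD residual.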

Modular implementation of Proximal Policy Optimization (PPO), a policy gradient reinforcement learning algorithm introduced by OpenAI in 2017. PPO is designed to be a simpler, more stable, and more sample-efficient alternative to earlier policy gradient methods such as A3C and TRPO (Trust Region Policy Optimization).

  • Updated Jan 28, 2026
  • Python
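
The stability PPO aims for usually comes from a clipped surrogate objective. A minimal sketch of that objective is shown here; it is not taken from the repository above, and the name `ppo_clip_loss`, its arguments, and the default `clip_eps=0.2` are assumptions made for illustration.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate loss over a batch (hypothetical helper).

    new_log_probs / old_log_probs are per-sample log-probabilities of the
    taken actions under the current and the data-collecting policy.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        surrogate = min(ratio * adv, clipped * adv)
        # Negate because optimizers minimize, while the surrogate is maximized.
        total -= surrogate
    return total / len(advantages)
```

Because of the `min`, a sample whose ratio has already moved past the clip range in the direction favored by its advantage contributes a constant, so it adds no further gradient in that direction.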
