This project implements the Proximal Policy Optimization (PPO) algorithm with an Actor-Critic architecture to train an AI agent to play Super Mario Bros. The agent learns to navigate the game environment by processing visual input (frames from the game) and receiving rewards.
The primary goal is to create an autonomous agent capable of achieving high scores and completing levels in Super Mario Bros. The implementation uses:
- TensorFlow/Keras 🧠 for building and training the neural network models.
- OpenAI Gym and gym-super-mario-bros 🎮 for the game environment.
- PPO Algorithm 📈 for stable and efficient policy updates.
- Actor-Critic Architecture 🎭 where the Actor decides the action and the Critic evaluates the state.
- CNN (Convolutional Neural Network) 🖼️ to process game frames.
- Techniques like frame stacking, grayscale conversion, and image resizing to preprocess observations.
- Parallel environment interaction using multiple "actors" 🏃‍♂️💨 to gather diverse experiences.
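The preprocessing steps listed above (grayscale conversion, resizing, frame stacking) can be sketched with plain NumPy. This is a minimal illustration rather than the project's exact code; the 240×256 RGB input matches the gym-super-mario-bros frame size, while the 84×84 target and 4-frame stack are common choices assumed here:

```python
from collections import deque
import numpy as np

def preprocess(frame, size=84):
    """Convert an RGB game frame to a resized, normalized grayscale image."""
    gray = frame @ np.array([0.299, 0.587, 0.114])         # luminance grayscale
    h, w = gray.shape
    rows = np.arange(size) * h // size                     # nearest-neighbor resize
    cols = np.arange(size) * w // size
    return gray[rows][:, cols].astype(np.float32) / 255.0  # scale to [0, 1]

class FrameStack:
    """Keep the last k preprocessed frames as one (size, size, k) observation,
    giving the CNN temporal information (e.g. Mario's direction of motion)."""
    def __init__(self, k=4):
        self.frames = deque(maxlen=k)

    def reset(self, frame):
        # On reset, fill the stack with copies of the first frame
        for _ in range(self.frames.maxlen):
            self.frames.append(preprocess(frame))
        return np.stack(self.frames, axis=-1)

    def step(self, frame):
        self.frames.append(preprocess(frame))
        return np.stack(self.frames, axis=-1)

# Example with a dummy 240x256 RGB frame (gym-super-mario-bros frame size)
stack = FrameStack(k=4)
obs = stack.reset(np.zeros((240, 256, 3), dtype=np.uint8))
print(obs.shape)  # (84, 84, 4)
```

The stacked (84, 84, 4) array is what the CNN consumes as a single observation.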
- ✅ Proximal Policy Optimization (PPO)
- ✅ Actor-Critic Neural Network Model
- ✅ Convolutional Neural Network (CNN) for visual input processing
- ✅ Frame Stacking for temporal information
- ✅ Grayscale and Resized Image Observations for efficiency
- ✅ Parallel data collection with multiple game environments (actors)
- ✅ Model saving and loading capabilities 💾
- ✅ Separate modes for training a new model and running a pre-trained model
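The PPO update in the feature list above is built around a clipped surrogate objective, which limits how far each update can move the policy. Below is a minimal NumPy sketch of that objective for illustration only; the actual training code works on TensorFlow tensors with gradients, and the 0.2 clip range is a conventional default, not necessarily this project's setting:

```python
import numpy as np

def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Clipped PPO surrogate: mean of min(r*A, clip(r, 1-eps, 1+eps)*A),
    where r is the probability ratio between the new and old policies."""
    ratio = np.exp(new_logp - old_logp)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return np.mean(np.minimum(unclipped, clipped))

# A ratio far above 1+eps gets clipped, capping the incentive to overshoot
adv = np.array([1.0, 1.0])
old = np.log(np.array([0.5, 0.5]))
new = np.log(np.array([0.9, 0.5]))   # first action became much more likely
print(ppo_clip_objective(new, old, adv))  # ≈ 1.1: the 1.8 ratio is clipped to 1.2
```

Taking the minimum of the clipped and unclipped terms makes the objective pessimistic: large policy shifts stop earning extra reward, which is what gives PPO its stable updates.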
- Python 3.9 🐍 (required; other Python versions are not supported)
- pip

Clone the repository:

```bash
git clone https://github.com/omerjakoby/MARIO-RL-PPO.git
cd MARIO-RL-PPO
```

Install the required Python libraries. Make sure you are in the project's root directory (`MARIO-RL-PPO`), where `requirements.txt` is located:

```bash
pip install -r requirements.txt
```
The script `ppo_mario.py` handles both training new models and running pre-trained ones.
If you have pre-trained actor and critic models (e.g., named `actor_model_v550` and `critic_model_v550`):

- You can find the pre-trained actor and critic checkpoints here: [Google Drive folder with checkpoints](https://drive.google.com/drive/folders/1c5TaCSCHSHkR-eTzTFp5GB3ktPPWsLbu?usp=sharing)
- Ensure these model directories are present in the project's root directory or a known location.
- By default, the script tries to load models from relative paths like `r"actor_model_v550"` and `r"critic_model_v550"`.
- If the script cannot find the model directories ❗, provide the absolute paths to your `actor_model_v550` and `critic_model_v550` directories by modifying lines 256 and 257 in `ppo_mario.py`:

```python
# In ppo_mario.py around line 256:
actor = keras.models.load_model(r"C:\path\to\your\actor_model_v550")    # replace with your absolute actor path
critic = keras.models.load_model(r"C:\path\to\your\critic_model_v550")  # replace with your absolute critic path
```
If you want to train models from scratch:

- Comment out the model-loading lines (around lines 256-257) in `ppo_mario.py`:

```python
# actor = keras.models.load_model(r"actor_model_v550")
# critic = keras.models.load_model(r"critic_model_v550")
```

- When you run the script, choose option `2` to start training.
Execute the main Python script:

```bash
python ppo_mario.py
```