A simple and deeply commented PyTorch implementation of the RetinaNet paper, built as an educational resource. 📚 Demystify the core concepts of object detection with code that sticks closely to the original paper. 💡

RetinaNet Demystified Cover Image

Open In Colab


This project is a clean and commented PyTorch implementation of the RetinaNet paper, Focal Loss for Dense Object Detection (Lin et al., ICCV 2017).

While many RetinaNet implementations exist, this one is built from the ground up with a specific goal: to be a clear educational resource. It follows the original paper as closely as possible, stripping away production-level optimizations and boilerplate. This allows you to focus on the core concepts of RetinaNet without getting lost in the weeds.

It's perfect for anyone looking to understand how RetinaNet really works under the hood.
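
At the heart of the paper is the focal loss, FL(p_t) = -α_t (1 - p_t)^γ log(p_t), which down-weights easy, already well-classified examples so the dense detector isn't overwhelmed by the huge number of background anchors. A minimal PyTorch sketch of that idea (an illustration, not this repository's exact code) looks like this:

    import torch
    import torch.nn.functional as F

    def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
        """Binary focal loss over per-anchor, per-class predictions.

        logits:  raw class predictions, shape (num_anchors, num_classes)
        targets: one-hot labels (float), shape (num_anchors, num_classes)
        alpha, gamma: the paper's defaults (0.25 and 2.0)
        """
        ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
        p = torch.sigmoid(logits)
        p_t = p * targets + (1 - p) * (1 - targets)              # probability of the true class
        alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
        loss = alpha_t * (1 - p_t) ** gamma * ce                 # (1 - p_t)^gamma shrinks easy examples
        # The paper normalizes the summed loss by the number of foreground anchors.
        return loss.sum()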

Key Features

  • Paper-Focused: The implementation sticks closely to the concepts described in the original paper.
  • Deeply Commented: Code blocks are linked back to the specific sections of the paper they implement. This makes it easy to cross-reference and understand the "why" behind the code.
  • Multiple Backbones: Supports ResNet-18, ResNet-34, ResNet-50, ResNet-101, and ResNet-152 backbones right out of the box.
  • Simplicity by Design: Intentionally omits features like custom BatchNorm layers in the heads to keep the focus on the fundamental architecture.
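
For reference, the heads mentioned in the last bullet are, in the paper, small fully convolutional subnetworks shared across all pyramid levels: four 3x3 conv + ReLU blocks with 256 channels (and no normalization layers), followed by a final 3x3 conv. A minimal sketch, with illustrative names rather than the repository's exact ones:

    import math
    import torch.nn as nn

    def make_head(num_anchors, outputs_per_anchor, is_classification=False, prior=0.01):
        """One RetinaNet head: 4 x (3x3 conv, 256 channels, ReLU), then a final
        3x3 conv with num_anchors * outputs_per_anchor output channels."""
        layers = []
        for _ in range(4):
            layers += [nn.Conv2d(256, 256, kernel_size=3, padding=1), nn.ReLU()]
        final = nn.Conv2d(256, num_anchors * outputs_per_anchor, kernel_size=3, padding=1)
        if is_classification:
            # Paper's initialization trick: bias the classification output so every anchor
            # starts at ~1% foreground probability, keeping the focal loss stable early on.
            nn.init.constant_(final.bias, -math.log((1 - prior) / prior))
        layers.append(final)
        return nn.Sequential(*layers)

    # A = 9 anchors per location: K*A channels for classification, 4*A for box regression.
    cls_head = make_head(num_anchors=9, outputs_per_anchor=1, is_classification=True)  # K = 1 (raccoon)
    box_head = make_head(num_anchors=9, outputs_per_anchor=4)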

Quick Start: Training on Raccoons 🦝

Let's get this model trained! We'll use the fun Raccoon dataset to prove that our implementation, despite its simplicity, can learn to detect objects.

  1. Clone the repository:

    git clone https://github.com/Armaggheddon/retinanet_demystified.git
    cd retinanet_demystified
  2. Download the dataset:

    git clone https://github.com/datitran/raccoon_dataset
  3. Set up your environment (Recommended):

    python -m venv .venv
    source .venv/bin/activate  # On Windows use `.venv\Scripts\activate`
    pip install -r requirements.txt
  4. Start Training: Feel free to peek into train.py and tweak the HYPERPARAMETERS!

    python train.py

    After training, your model weights will be saved as retinanet_raccoon_rnXX.pth.

  5. Run Inference: Modify the IMAGE_PATH in load_trained.py to point to a test image.

    python load_trained.py

    Check out the output.png file to see your model in action!
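
If you'd rather script inference yourself instead of editing load_trained.py, note that the saved .pth file is an ordinary PyTorch state dict. The sketch below is only a starting point: the model class, its constructor arguments, and the output format are hypothetical placeholders, so adapt them to the actual code in this repository.

    import torch
    from PIL import Image
    from torchvision.transforms.functional import to_tensor

    # Hypothetical names: check this repository for the real model class, its
    # constructor arguments, and its output format before uncommenting.
    # from model import RetinaNet
    # model = RetinaNet(backbone="resnet18", num_classes=1)  # one class: raccoon

    weights = torch.load("retinanet_raccoon_rnXX.pth", map_location="cpu")  # XX = backbone depth
    # model.load_state_dict(weights)
    # model.eval()

    image = to_tensor(Image.open("path/to/test_image.jpg").convert("RGB")).unsqueeze(0)
    # with torch.no_grad():
    #     detections = model(image)  # exact output format depends on the repository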

Proof of Life: Does it Learn?

Absolutely! The goal here isn't to set new state-of-the-art records, but to demonstrate that the core architecture works and learns. The plots below come from a model with a ResNet-18 backbone trained for 20 epochs on the small Raccoon dataset.

You'll notice clear signs of overfitting, which is expected given the dataset's size. But more importantly, you'll see the loss decreasing and the model successfully identifying objects. It's alive!

Train/Eval trend
Average training and evaluation loss per epoch. The model is learning!

Train loss
Training total loss, classification loss, and box regression loss.

Eval loss
Evaluation total loss, classification loss, and box regression loss.

Paper References

The papers referenced throughout the code are:

Tip

When no paper reference is given, the comment refers to the main RetinaNet paper.

Did I miss something? 😅

Feel free to open an issue or submit a pull request. Contributions are welcome! Just remember, the goal is to keep things simple and educational.
