A lightweight, anchor-free object detection system built using MicroDet, optimized for aerial / drone imagery. This project focuses on efficient person detection with minimal computational overhead, making it suitable for edge devices and UAV applications.
- Lightweight MicroDet architecture
- Anchor-free detection (DFL-based regression)
- COCO-format dataset support
- Mixed-precision (AMP) training
- EMA (Exponential Moving Average) weights
- End-to-end pipeline: Train → Validate → Infer
- Bounding box visualization with NMS
- Config-driven (.toml) model & training setup
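To make the EMA feature concrete, here is a minimal sketch of the usual update rule, `ema = decay * ema + (1 - decay) * weights`. It is written in plain Python on a list of floats so it runs without PyTorch; the function name `ema_update` and the decay values are assumptions for illustration, not taken from this repository.

```python
def ema_update(ema_weights, model_weights, decay=0.999):
    """Blend the current weights into the running average and return a new list."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, model_weights)]


# Toy usage: the "model" is a list of floats that training nudges every step.
weights = [0.0, 0.0]
ema = list(weights)
for step in range(3):
    weights = [w + 1.0 for w in weights]       # stand-in for an optimizer step
    ema = ema_update(ema, weights, decay=0.5)  # small decay so the effect is visible

print(weights)  # raw weights after three steps
print(ema)      # smoothed weights lag behind the raw ones
```

The averaged copy changes more slowly than the raw weights, which is why EMA weights are often the ones used for validation and inference.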
microdet/
├── tmp/
│   ├── model/
│   │   ├── backbone/
│   │   ├── neck/
│   │   ├── detect/
│   │   ├── loss/
│   │   └── model_wrapper.py
│   ├── train/
│   │   ├── train.py
│   │   └── validate.py
│   ├── infer/
│   │   ├── run_infer.py
│   │   └── image.png
│   └── data/
│       ├── coco_dataset.py
│       └── collate.py
├── runs/
│   └── microdet_drone/
│       ├── weights/
│       │   ├── last.ckpt
│       │   └── best.ckpt
│       └── logs/
├── microdet.toml
├── requirements.txt
└── README.md

MicroDet is a one-stage, anchor-free detector designed for speed and efficiency.
- Backbone: Lightweight CNN for feature extraction
- Neck: Multi-scale feature aggregation
- Head:
  - Classification branch (Quality Focal Loss)
  - Regression branch (Distribution Focal Loss – DFL)

[8, 16, 32]

| Loss Type | Purpose |
|---|---|
| Quality Focal Loss (QFL) | Classification confidence |
| Distribution Focal Loss (DFL) | Bounding box regression |
| GIoU Loss | Box overlap accuracy |
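As an illustration of the box-overlap term in the table, the sketch below computes GIoU for two boxes in `[x1, y1, x2, y2]` form and the corresponding loss `1 - GIoU`. It is a plain-Python reference for the standard GIoU definition, not the loss code used in this repository; the function names are assumptions.

```python
def giou(box_a, box_b):
    """Generalized IoU for two [x1, y1, x2, y2] boxes; the result lies in (-1, 1]."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Smallest axis-aligned box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    if enclose <= 0:
        return iou
    return iou - (enclose - union) / enclose


def giou_loss(pred_box, target_box):
    """Loss is 0 for a perfect match and grows as the boxes drift apart."""
    return 1.0 - giou(pred_box, target_box)


print(giou_loss([0, 0, 2, 2], [0, 0, 2, 2]))  # identical boxes
print(giou_loss([0, 0, 1, 1], [2, 0, 3, 1]))  # disjoint boxes still give a gradient signal
```

Unlike plain IoU, GIoU stays informative for non-overlapping boxes because the enclosing-box penalty keeps changing as the boxes move closer.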
[model.head.loss]
config = [2.0, 0.25, 1.0, 7, "giou"]

- COCO-style annotation format
- Single class: person
- Input resolution: 640×640
- Supports training & validation splits
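For readers new to the COCO layout, the following sketch shows how person boxes could be pulled out of a COCO-style annotation file. COCO stores boxes as `[x, y, width, height]`, so the sketch converts them to corner form. The helper name `load_person_boxes` and the sample file names are hypothetical; the project's own loader lives in `tmp/data/coco_dataset.py` and may work differently.

```python
import json


def load_person_boxes(annotation_json, category_name="person"):
    """Return {file_name: [[x1, y1, x2, y2], ...]} for one category."""
    coco = json.loads(annotation_json)
    cat_ids = {c["id"] for c in coco["categories"] if c["name"] == category_name}
    id_to_file = {img["id"]: img["file_name"] for img in coco["images"]}

    boxes = {name: [] for name in id_to_file.values()}
    for ann in coco["annotations"]:
        if ann["category_id"] not in cat_ids:
            continue
        x, y, w, h = ann["bbox"]  # COCO format: top-left corner plus size
        boxes[id_to_file[ann["image_id"]]].append([x, y, x + w, y + h])
    return boxes


# Tiny in-memory annotation set (made-up values, for illustration only).
sample = json.dumps({
    "images": [{"id": 1, "file_name": "drone_0001.jpg", "width": 640, "height": 640}],
    "categories": [{"id": 1, "name": "person"}, {"id": 2, "name": "car"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10, 20, 30, 60]},
        {"id": 11, "image_id": 1, "category_id": 2, "bbox": [100, 100, 50, 40]},
    ],
})

person_boxes = load_person_boxes(sample)
print(person_boxes)
```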
[data.train]
config = [
"tmp/data/dataset/images",
"tmp/data/dataset/result.json",
[640, 640],
true,
{}
]

python -m tmp.train.train \
--config microdet.toml \
--device cuda

python -m tmp.train.train \
--config microdet.toml \
--device cuda \
--resume runs/microdet_drone/weights/last.ckpt

- Validation runs every val_interval epochs
- Metrics:
  - mAP
  - Confidence stability
  - Qualitative bounding box accuracy
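The validation schedule above can be sketched as a small loop. This is an illustration only: it assumes epochs are counted from 1 and that `best.ckpt` is refreshed whenever mAP improves, neither of which is confirmed here. The helper names and the mAP numbers are placeholders, not measured results.

```python
def should_validate(epoch, val_interval):
    """True on every val_interval-th epoch (epochs counted from 1)."""
    return val_interval > 0 and epoch % val_interval == 0


def update_best(best_map, current_map):
    """Return (new_best, improved) so the caller knows when to save a best checkpoint."""
    improved = current_map > best_map
    return (current_map if improved else best_map), improved


# Placeholder mAP values keyed by epoch; a real run would call the validator instead.
placeholder_map = {5: 0.21, 10: 0.34, 15: 0.31}

best = 0.0
saved_at = []
for epoch in range(1, 16):
    if should_validate(epoch, val_interval=5):
        best, improved = update_best(best, placeholder_map[epoch])
        if improved:
            saved_at.append(epoch)  # this is where best.ckpt would be written

print(saved_at, best)
```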
python tmp/infer/run_infer.py

- Bounding boxes drawn on original image
- Non-Maximum Suppression (NMS) applied

tmp/infer/output.png

✔ Detected persons from aerial view
✔ Bounding boxes after NMS
✔ Scaled correctly to original image resolution
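The two post-processing steps named above, NMS and rescaling to the original resolution, can be sketched in plain Python as follows. This is a reference version of greedy NMS, not the code in `run_infer.py`. The rescaling assumes the image was resized straight to 640×640 without letterbox padding, which is an assumption; the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def scale_boxes(boxes, input_size, original_size):
    """Map boxes from network input size (w, h) back to original image size (w, h)."""
    sx = original_size[0] / input_size[0]
    sy = original_size[1] / input_size[1]
    return [[x1 * sx, y1 * sy, x2 * sx, y2 * sy] for x1, y1, x2, y2 in boxes]


boxes = [[100, 100, 200, 200], [105, 105, 205, 205], [400, 400, 450, 450]]
scores = [0.9, 0.8, 0.7]

kept = nms(boxes, scores, iou_threshold=0.5)
final = scale_boxes([boxes[i] for i in kept], (640, 640), (1280, 720))
print(kept, final)
```

The second box overlaps the first heavily and is suppressed, so only the first and third boxes survive and are scaled.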
Early training may produce many low-confidence boxes; tuning the assigner radius and the confidence threshold reduces them.
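To show why the radius matters, here is a sketch of one common form of center-based assignment. A grid cell on a feature map of a given stride counts as positive when its center lies within `radius * stride` pixels of the ground-truth box center. Whether this project's `CenterAssigner` uses exactly this rule is an assumption; the function name and the grid layout are illustrative.

```python
def center_radius_positives(gt_box, stride, radius, feat_w, feat_h):
    """Return (gx, gy) cells whose centers fall within radius * stride of the GT center."""
    x1, y1, x2, y2 = gt_box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    reach = radius * stride

    positives = []
    for gy in range(feat_h):
        for gx in range(feat_w):
            # Cell center in input-image pixels.
            px = (gx + 0.5) * stride
            py = (gy + 0.5) * stride
            if abs(px - cx) <= reach and abs(py - cy) <= reach:
                positives.append((gx, gy))
    return positives


# 640x640 input at stride 32 gives a 20x20 grid; the GT box is centered at (320, 320).
gt = [300, 280, 340, 360]
small = center_radius_positives(gt, stride=32, radius=1.0, feat_w=20, feat_h=20)
large = center_radius_positives(gt, stride=32, radius=2.0, feat_w=20, feat_h=20)
print(len(small), len(large))
```

Under this rule a larger radius marks more cells as positive per object, which gives the head more training signal but can also increase duplicate detections, so radius and confidence threshold are worth tuning together.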
[model.head.assigner_cfg]
config = ["CenterAssigner", { "8" = 5.0, "16" = 5.0, "32" = 5.0 }]

[schedule.optimizer]
config = ["adamw", 0.0005, 0.05, true, true]

- Low confidence during early epochs
- Over-detection without NMS
- DFL decoding tensor contiguity issues
- Correct stride handling during inference
✔ All addressed through architectural tuning and post-processing.
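Since DFL decoding and stride handling are both listed above, a small reference sketch may help. For each box side the head predicts logits over discrete distance bins. A softmax turns them into a distribution, its expected value gives the distance in feature-map units, and multiplying by the stride converts it to pixels. The sketch uses 8 bins, assuming the `7` in the loss config is the maximum bin index; that reading and the function names are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dfl_expected_distance(logits):
    """Expected bin index under the predicted distribution (feature-map units)."""
    return sum(i * p for i, p in enumerate(softmax(logits)))


def decode_box(center, side_logits, stride):
    """Turn four per-side distributions (left, top, right, bottom) into a pixel box."""
    cx, cy = center
    left, top, right, bottom = (dfl_expected_distance(l) * stride for l in side_logits)
    return [cx - left, cy - top, cx + right, cy + bottom]


# Uniform logits over 8 bins give an expected distance of 3.5 cells on every side.
uniform = [0.0] * 8
box = decode_box(center=(100.0, 100.0), side_logits=[uniform] * 4, stride=8)
print(box)
```

Forgetting the stride multiplication yields boxes that are too small by a factor of 8, 16, or 32, depending on the level. In PyTorch, a related pitfall is calling `view` on a permuted tensor, which requires `.contiguous()` first (or `reshape` instead).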
- Language: Python
- Framework: PyTorch
- Vision: OpenCV
- Model: MicroDet
- Data Format: COCO
- Hardware: CUDA GPU

- Drone surveillance
- Crowd monitoring
- Search & rescue
- Smart city analytics
- Edge AI deployments

- Multi-class detection
- Video inference & tracking
- Model quantization
- Edge deployment (Jetson / TPU)
- Knowledge distillation
Developer • ML Enthusiast • Neovim Customizer • Linux Power User
Hi! I'm Ravindran S, an engineering student passionate about:
- Linux & System Engineering
- AIML (Artificial Intelligence & Machine Learning)
- Full-stack Web Development
- Hackathon-grade project development
You can reach me here: