This repository is a PyTorch implementation of YAMNet, an audio classification model originally released by Google in TensorFlow. It is adapted from Torch AudioSet, which supports only inference with pretrained weights; this version adds full support for training from scratch. It also supports an enhanced variant of YAMNet that replaces the MobileNetV1 backbone with MobileNetV3.
example.py contains example code that trains YAMNet on the ESC50 dataset. The project also includes dataloaders and download scripts for the ESC50 and FSD50K audio classification datasets. To train YAMNet on ESC50, run the following commands:
pip install -r requirements.txt
./download_esc50.sh
python3 example.py ./ESC-50-master ./log.txt ./ckpt.pt
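Once the dataset is downloaded, it can help to look at how ESC50 organizes its clips before training. The sketch below is not taken from example.py; it assumes the standard ESC50 layout, where meta/esc50.csv lists each clip with a filename, a fold number (1 to 5), and a target label. The helper names load_metadata and split_by_fold are hypothetical, and the sample rows are made up so the snippet runs without the dataset.

```python
import csv
import io


def load_metadata(csv_path):
    """Read an ESC50-style metadata CSV into a list of dict rows."""
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))


def split_by_fold(rows, test_fold):
    """Hold out one fold for evaluation and keep the rest for training."""
    train, test = [], []
    for row in rows:
        # The "fold" column is read as a string, so convert before comparing.
        if int(row["fold"]) == test_fold:
            test.append(row)
        else:
            train.append(row)
    return train, test


# Made-up stand-in for ./ESC-50-master/meta/esc50.csv (assumed path and columns).
SAMPLE = """filename,fold,target,category
a.wav,1,0,dog
b.wav,2,1,rooster
c.wav,5,0,dog
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
train_rows, test_rows = split_by_fold(rows, test_fold=5)
print(len(train_rows), "training clips,", len(test_rows), "held-out clips")
```

With the real dataset, rows = load_metadata("./ESC-50-master/meta/esc50.csv") would replace the in-memory sample. Splitting on the predefined folds, rather than at random, keeps clips cut from the same source recording on the same side of the split. Whether example.py uses this particular split is not stated here.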