🐦 BirdCLEF 2025 - Soundscape Classification

This repository contains my experiments and solutions for the BirdCLEF 2025 Kaggle Competition. The goal is to identify bird species from long soundscape recordings using machine learning and audio signal processing techniques.

Project Highlights

Custom PyTorch Model: Built on the seresnext26t_32x4d backbone from the timm library.
Voice Activity Detection (VAD): Filters out non-bird audio segments to focus on relevant parts.
Audio Preprocessing:
- 10-second chunking of soundscape recordings for efficient inference.
- Log-mel spectrograms computed and stored as .npz batches.
Semi-Supervised Learning: Pseudo-labeling applied using confident predictions on unlabeled soundscapes.
Efficient Data Handling: Custom DataLoader and Dataset with optional augmentations such as time/frequency masking and noise injection.
Clean Label Management: Generated pseudo-labels saved as pseudo_labels_soundscape.csv for downstream training.
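
The optional augmentations listed above can be sketched roughly as follows. This is a minimal NumPy illustration, not the code used in this repository: the function names, mask widths, fill value and noise level are all assumptions chosen for the example.

```python
import numpy as np


def mask_spectrogram(spec, max_freq_width=8, max_time_width=20,
                     fill_value=0.0, rng=None):
    """Mask one random band of mel bins and one random span of time frames.

    `spec` has shape (n_mels, n_frames). All default values are hypothetical.
    """
    if rng is None:
        rng = np.random.default_rng()
    out = spec.copy()  # leave the caller's array untouched
    n_mels, n_frames = out.shape

    # Frequency mask: a horizontal band of consecutive mel bins.
    f_width = int(rng.integers(0, min(max_freq_width, n_mels) + 1))
    f_start = int(rng.integers(0, n_mels - f_width + 1))
    out[f_start:f_start + f_width, :] = fill_value

    # Time mask: a vertical band of consecutive frames.
    t_width = int(rng.integers(0, min(max_time_width, n_frames) + 1))
    t_start = int(rng.integers(0, n_frames - t_width + 1))
    out[:, t_start:t_start + t_width] = fill_value
    return out


def add_gaussian_noise(waveform, noise_std=0.005, rng=None):
    """Add zero-mean Gaussian noise to a 1-D waveform (noise_std is assumed)."""
    if rng is None:
        rng = np.random.default_rng()
    noise = rng.normal(0.0, noise_std, size=waveform.shape)
    return (waveform + noise).astype(waveform.dtype)


# Synthetic inputs stand in for a real log-mel spectrogram and audio chunk.
spec = np.ones((128, 500), dtype=np.float32)
masked = mask_spectrogram(spec, rng=np.random.default_rng(0))

wave = np.zeros(1000, dtype=np.float32)
noisy = add_gaussian_noise(wave, rng=np.random.default_rng(0))
```

In a Dataset these would typically be applied only to training samples, each with some probability, so that validation data stays unaugmented.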

Dataset

The dataset consists of long-form audio recordings of bird soundscapes, provided by the BirdCLEF 2025 Kaggle Competition. Audio files are processed into 10-second chunks, and mel spectrogram features are extracted for model input.
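
As a rough sketch of the 10-second chunking step, the snippet below splits a waveform into fixed-length chunks and zero-pads the last one. The 32 kHz sample rate, the zero-padding choice and the function name are assumptions for illustration; the notebooks may handle the final partial chunk differently.

```python
import numpy as np

SAMPLE_RATE = 32_000  # assumed sample rate in Hz; check your audio files
CHUNK_SECONDS = 10    # chunk length stated in this README


def split_into_chunks(waveform, sample_rate=SAMPLE_RATE,
                      chunk_seconds=CHUNK_SECONDS):
    """Split a 1-D waveform into equal-length chunks, zero-padding the last."""
    chunk_len = sample_rate * chunk_seconds
    n_chunks = max(1, int(np.ceil(len(waveform) / chunk_len)))
    padded = np.zeros(n_chunks * chunk_len, dtype=waveform.dtype)
    padded[:len(waveform)] = waveform
    return padded.reshape(n_chunks, chunk_len)


# A synthetic 25-second signal stands in for a real soundscape recording.
audio = np.random.default_rng(0).standard_normal(25 * SAMPLE_RATE)
audio = audio.astype(np.float32)

chunks = split_into_chunks(audio)
print(chunks.shape)  # (3, 320000)
```

Each row of `chunks` can then be converted to a log-mel spectrogram independently.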

Installation

Clone the repository:

git clone https://github.com/SheemaMasood381/BirdCLEF-2025.git
cd BirdCLEF-2025

Install the required Python packages:

pip install -r requirements.txt

Dependencies include: PyTorch, torchaudio, timm, librosa, numpy, pandas, scikit-learn

Usage

Preprocess the audio recordings (chunking & spectrograms).
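
For the storage side of preprocessing, spectrograms can be grouped and written as compressed .npz batches. The sketch below shows one possible layout; the array keys (`specs`, `chunk_ids`), the chunk ID format and the spectrogram shape are assumptions, not the layout used by the notebooks. It writes to an in-memory buffer so it runs without touching disk.

```python
import io

import numpy as np


def save_batch(file, specs, chunk_ids):
    """Write a batch of equally shaped spectrograms and their IDs to .npz."""
    np.savez_compressed(
        file,
        specs=np.stack(specs),          # shape: (batch, n_mels, n_frames)
        chunk_ids=np.array(chunk_ids),  # fixed-width unicode array
    )


# Four dummy spectrograms stand in for real log-mel features.
specs = [np.full((128, 64), i, dtype=np.float32) for i in range(4)]
chunk_ids = [f"rec0_{i}" for i in range(4)]

buffer = io.BytesIO()  # replace with a file path such as "batch_000.npz"
save_batch(buffer, specs, chunk_ids)

buffer.seek(0)
with np.load(buffer) as batch:
    loaded_specs = batch["specs"]
    loaded_ids = batch["chunk_ids"]

print(loaded_specs.shape)  # (4, 128, 64)
```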
Train the model:

python train.py --config configs/train_config.yaml

Run inference on new soundscape data:

python inference.py --audio_path path/to/audio

Pseudo-labeling (optional, for semi-supervised learning):

python pseudo_labeling.py --unlabeled_data path/to/unlabeled
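
The idea behind this step is to keep only predictions the model is confident about and save them as labels. The sketch below illustrates that filter under stated assumptions: the 0.9 threshold, the column names, the chunk IDs and the single top-label simplification are hypothetical and do not describe what pseudo_labeling.py actually does.

```python
import csv
import io

CONFIDENCE_THRESHOLD = 0.9  # hypothetical cutoff; tune on validation data


def select_pseudo_labels(predictions, threshold=CONFIDENCE_THRESHOLD):
    """Keep chunks whose highest class probability reaches the threshold.

    `predictions` maps a chunk ID to a dict of {species: probability}.
    """
    rows = []
    for chunk_id, scores in predictions.items():
        species, prob = max(scores.items(), key=lambda kv: kv[1])
        if prob >= threshold:
            rows.append(
                {"chunk_id": chunk_id, "label": species, "confidence": prob}
            )
    return rows


# Made-up model outputs for three unlabeled chunks.
predictions = {
    "soundscape_a_0": {"species_1": 0.97, "species_2": 0.02},
    "soundscape_a_1": {"species_1": 0.55, "species_2": 0.40},
    "soundscape_b_0": {"species_1": 0.05, "species_2": 0.93},
}
rows = select_pseudo_labels(predictions)

# Written to memory here; for a real run, use
# open("pseudo_labels_soundscape.csv", "w", newline="") instead.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["chunk_id", "label", "confidence"])
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Only the first and third chunks pass the threshold here; the ambiguous second chunk is dropped rather than given a noisy label.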

Results

The custom seresnext26t_32x4d model achieved strong performance on the BirdCLEF validation data.
Pseudo-labeling improved accuracy by leveraging unlabeled soundscape recordings.

(Specific leaderboard scores can be included if permitted.)

Future Work

Experiment with transformer-based architectures for audio classification.
Implement ensemble models combining multiple backbones for higher accuracy.
Explore advanced data augmentation techniques to improve generalization.

Acknowledgements

BirdCLEF 2025 Kaggle Competition
PyTorch & TIMM library
Open-source audio processing tools: librosa, torchaudio

License

This project is licensed under the MIT License – see the LICENSE file for details.
