This repository contains a collection of scripts, utilities, and a Jupyter Notebook for exploring and processing audio signals. It is designed for educational purposes and provides a hands-on approach to understanding audio signal processing concepts such as sine waves, Fourier Transform, Short-Time Fourier Transform (STFT), and Mel Frequency Cepstral Coefficients (MFCC).
- audioprocessing.ipynb: A comprehensive Jupyter Notebook that walks through various audio processing techniques, including:
  - Generating sine waves.
  - Visualizing waveforms.
  - Applying Fourier Transform and STFT.
  - Extracting MFCCs.
- audiofiles/: Contains sample audio files for analysis and experimentation.
  - notes/: Includes .wav files such as C2_C4.wav and G6_A6.wav for pitch and frequency analysis.
- images/: Contains visual aids and diagrams used in the notebook, such as:
  - Fourier Transform illustrations.
  - Mel Scale and MFCC process diagrams.
- utils/: A collection of Python utility scripts for audio processing:
  - plot_waveform.py: Functions to plot waveforms.
  - read_audio_and_plot.py: Reads audio files and plots their waveforms.
  - record_audio_and_plot.py: Records audio and visualizes it.
  - save_and_play_signal.py: Saves and plays generated audio signals.
- pyproject.toml: Defines the project dependencies and metadata.
- README.md: This file, providing an overview of the repository.
- uv.lock: Dependency lock file.
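
To give a feel for what a script such as save_and_play_signal.py deals with, here is a minimal sketch that generates a sine wave and writes it to a WAV file. It is not the repository's code: the function names `sine_wave`, `save_wav`, and `read_wav_info`, the file name `a4_tone.wav`, and the choice of Python's standard-library `wave` module are all assumptions made for illustration.

```python
import math
import os
import struct
import tempfile
import wave


def sine_wave(frequency, duration, sample_rate=44100, amplitude=0.5, phase=0.0):
    """Return samples of amplitude * sin(2*pi*frequency*t + phase) as floats."""
    n_samples = round(duration * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * frequency * n / sample_rate + phase)
        for n in range(n_samples)
    ]


def save_wav(path, samples, sample_rate=44100):
    """Write float samples in [-1.0, 1.0] to a mono 16-bit PCM WAV file."""
    # Clip to the valid range, then scale to signed 16-bit integers.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes per sample = 16 bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))


def read_wav_info(path):
    """Return (channels, sample_width_bytes, sample_rate, n_frames) of a WAV file."""
    with wave.open(path, "rb") as wav_file:
        return (
            wav_file.getnchannels(),
            wav_file.getsampwidth(),
            wav_file.getframerate(),
            wav_file.getnframes(),
        )


if __name__ == "__main__":
    # Hypothetical output location: a file in the system temp directory.
    out_path = os.path.join(tempfile.gettempdir(), "a4_tone.wav")
    tone = sine_wave(frequency=440.0, duration=1.0)  # one second of A4
    save_wav(out_path, tone)
    print(read_wav_info(out_path))
```

Playback is left out on purpose, since it depends on an audio backend that this sketch does not assume.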
- Audio Signal Generation:
  - Generate sine and cosine waves with customizable frequencies, amplitudes, and phases.
  - Combine multiple signals to create complex waveforms.
- Visualization:
  - Plot time-domain signals.
  - Visualize frequency-domain representations using FFT and STFT.
- Audio Analysis:
  - Extract and analyze Mel Frequency Cepstral Coefficients (MFCC).
  - Understand the sensitivity of human hearing to different frequencies.
- Interactive Learning:
  - Step-by-step explanations and visualizations in the Jupyter Notebook.
  - Links to external resources for deeper understanding.
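
The signal-combination and frequency-domain features listed above can be illustrated with a small, self-contained sketch. It uses a naive discrete Fourier transform written with the standard library only, so it is far slower than a real FFT routine such as `numpy.fft.rfft`, and its frame-by-frame analysis is a crude stand-in for an STFT (no window function, no overlap). All function names, the 8000 Hz sample rate, and the test frequencies are assumptions chosen so that each tone falls exactly on a DFT bin.

```python
import cmath
import math


def sine_wave(frequency, duration, sample_rate, amplitude=1.0, phase=0.0):
    """Samples of amplitude * sin(2*pi*frequency*t + phase)."""
    n_samples = round(duration * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * frequency * n / sample_rate + phase)
        for n in range(n_samples)
    ]


def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0..N/2 of a real-valued frame (naive O(N^2))."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def dominant_frequency(frame, sample_rate):
    """Frequency in Hz of the strongest non-DC bin of the frame."""
    mags = dft_magnitudes(frame)
    peak_bin = max(range(1, len(mags)), key=mags.__getitem__)
    return peak_bin * sample_rate / len(frame)


def stft_dominant_frequencies(signal, sample_rate, frame_size=200, hop=200):
    """Dominant frequency of each consecutive frame of the signal."""
    return [
        dominant_frequency(signal[start:start + frame_size], sample_rate)
        for start in range(0, len(signal) - frame_size + 1, hop)
    ]


SAMPLE_RATE = 8000  # Hz, assumed for this example

# Combine two sine waves into one more complex waveform.
low = sine_wave(440, 0.05, SAMPLE_RATE, amplitude=1.0)
high = sine_wave(880, 0.05, SAMPLE_RATE, amplitude=0.5)
mixed = [a + b for a, b in zip(low, high)]

# Frequency domain: the two strongest bins recover both components.
mags = dft_magnitudes(mixed)
bin_width = SAMPLE_RATE / len(mixed)  # 20 Hz per bin for 400 samples
top_two = sorted(sorted(range(len(mags)), key=mags.__getitem__)[-2:])
peaks_hz = [k * bin_width for k in top_two]
print(peaks_hz)  # [440.0, 880.0]

# Time-frequency view: a signal whose pitch jumps from 440 Hz to 880 Hz.
melody = low + sine_wave(880, 0.05, SAMPLE_RATE)
frame_pitches = stft_dominant_frequencies(melody, SAMPLE_RATE)
print(frame_pitches)  # [440.0, 440.0, 880.0, 880.0]
```

A single transform over `melody` would show both frequencies but not when each one occurs; splitting the signal into frames is what recovers the timing.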
- Python 3.13 or higher.
- Recommended: A virtual environment to manage dependencies.
- Install the libraries using uv:

      uv sync

- Open audioprocessing.ipynb and follow the cells step by step.
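- Sine Waves:
  - Mathematical representation: $y(t) = A \sin(2\pi f t + \phi)$, where $A$ is the amplitude, $f$ the frequency in Hz, and $\phi$ the phase in radians.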
- Fourier Transform:
  - Convert time-domain signals to frequency-domain.
- Short-Time Fourier Transform (STFT):
  - Analyze how frequencies change over time.
- Mel Frequency Cepstral Coefficients (MFCC):
  - Extract features for audio classification and speech recognition.
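
One ingredient of MFCC extraction is the mel scale, which spaces frequencies the way human hearing roughly perceives them: finely at low frequencies and coarsely at high ones. The sketch below covers only that frequency-warping step, not a full MFCC pipeline. It assumes the commonly used formula $m = 2595 \log_{10}(1 + f/700)$; audio libraries may default to a different variant, and the function names are hypothetical.

```python
import math


def hz_to_mel(frequency_hz):
    """Convert Hz to mels using m = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_spaced_frequencies(low_hz, high_hz, n_points):
    """Frequencies in Hz that are evenly spaced on the mel scale."""
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (n_points - 1)
    return [mel_to_hz(low_mel + i * step) for i in range(n_points)]


# Ten points between 0 Hz and 8 kHz, equally far apart in mels.
centers = mel_spaced_frequencies(0.0, 8000.0, 10)
gaps = [b - a for a, b in zip(centers, centers[1:])]

for center in centers:
    print(f"{center:8.1f} Hz  ->  {hz_to_mel(center):7.1f} mel")
```

The gaps between neighboring points grow with frequency, which is why a mel filter bank devotes more filters to the low end of the spectrum.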
- Mel-Spectrogram and MFCCs | Lecture 72 (Part 1) | Applied Deep Learning
- Mel Frequency Cepstral Coefficients (MFCC) Explained