Guitar Chord Analysis

This project implements fast, robust chord recognition from raw guitar audio using Rust. Frequency-domain analysis is performed via FFT-based signal processing, and chord classification is structured around extendable music-theoretic templates. All chord samples were recorded on an unamplified acoustic guitar (4 s per sample), providing a controlled dataset for iterative algorithm development.

The architecture is designed to be modular and testable, with a clear path toward real-time analysis.




Background & Theory

Pitch Classes and the Cycle of Fifths

In the 12-tone equal temperament system, each pitch class of the chromatic scale can be represented by an integer modulo 12:

| Pitch | $\text{C}$ | $\text{C}\sharp$ | $\text{D}$ | $\text{D}\sharp$ | $\text{E}$ | $\text{F}$ | $\text{F}\sharp$ | $\text{G}$ | $\text{G}\sharp$ | $\text{A}$ | $\text{A}\sharp$ | $\text{B}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Integer | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |

The cycle of fifths is built by repeatedly moving up a perfect fifth. Since a perfect fifth equals 7 semitones, the transformation on pitch classes is given by:

$$f(n)=7n\bmod 12$$

By performing calculations modulo 12, any value that exceeds 11 wraps around to start again at 0, ensuring that the pitch classes form a continuous cyclic system. Starting with C (0), we obtain the following (generated programmatically in the sketch below):

  • $\text{C} \to 7 \times 0 \bmod 12 = 0 \ (\text{C})$
  • $\text{G} \to 7 \times 1 \bmod 12 = 7 \ (\text{G})$
  • $\text{D} \to 7 \times 2 \bmod 12 = 2 \ (\text{D})$
  • $\dots$ and so on (as illustrated in the LH figure below)
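
A minimal, self-contained Rust sketch of the mapping (illustrative only; the project keeps its own pitch-class utilities in theory/notes.rs):

/// Pitch classes of the cycle of fifths, as integers modulo 12.
fn cycle_of_fifths() -> Vec<u8> {
    (0..12).map(|n| (7 * n) % 12).collect()
}

// cycle_of_fifths() == [0, 7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5]
// i.e. C, G, D, A, E, B, F#, C#, G#, D#, A#, F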

[Figure: dodecagram of the chromatic pitch classes (LH) and the cycle of fifths (RH)]

The RH figure is a permutation of the 12 pitch classes mapped to the unit circle, ordered so that:

  • Keys with fewer accidentals (sharps or flats) are grouped near the top,
  • Clockwise movement around the circle accumulates sharps,
  • Counter-clockwise movement accumulates flats.

Two notes are said to be enharmonically equivalent if they have the same pitch on a tempered instrument. In this project, enharmonic spellings are collapsed to their sharp forms:

  • $\text{C}\sharp$ and $\text{D}\flat$ $\to \text{C}\sharp$
  • $\text{F}\sharp$ and $\text{G}\flat$ $\to \text{F}\sharp$
  • $\text{G}\sharp$ and $\text{A}\flat$ $\to \text{G}\sharp$
  • $\text{A}\sharp$ and $\text{B}\flat$ $\to \text{A}\sharp$

Chord Construction via Interval Templates

Chords are built by choosing specific pitches (or scale degrees) from a scale. For example, the key of C major uses the major scale $\text{C, D, E, F, G, A, B}$. The simplest chords are triads:

  • A major chord is constructed by taking the $1^{\text{st}}$, $3^{\text{rd}}$, and $5^{\text{th}}$ (e.g., $\text{C, E, G}$ in C major)
  • A minor chord is built using the $1^{\text{st}}$, $\flat 3^{\text{rd}}$, and $5^{\text{th}}$ degrees (e.g., $\text{C}, \text{E}\flat, \text{G}$ in C minor)

More complex chords include an additional note:

  • Dominant 7th: $1^{\text{st}}$, $3^{\text{rd}}$, $5^{\text{th}}$ and $\flat 7^{\text{th}}$ ($\text{C, E, G, B}\flat$ for C7)
  • Major 7th: $1^{\text{st}}$, $3^{\text{rd}}$, $5^{\text{th}}$ and $7^{\text{th}}$ ($\text{C, E, G, B}$ for Cmaj7)
  • Minor 7th: $1^{\text{st}}$, $\flat 3^{\text{rd}}$, $5^{\text{th}}$ and $\flat 7^{\text{th}}$ ($\text{C, E}\flat, \text{G, B}\flat$ for Cmin7)

The above interval structures can be represented by a list of semitone distances relative to a root note at index 0:

  • Major: [0, 4, 7]
  • Minor: [0, 3, 7]
  • Dominant 7th: [0, 4, 7, 10]
  • Major 7th: [0, 4, 7, 11]
  • Minor 7th: [0, 3, 7, 10]

In Rust, we define an enum ChordType to build a library of CHORD_TEMPLATES, which is instantiated at runtime as a HashMap (more details can be found in chords.md):

/// Types of chords supported, such as major, minor, dominant 7th, etc.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ChordType {
    Major,
    Minor,
    Dominant7,
    Major7,
    Minor7,
}

impl ChordType {
    /// Returns the semitone intervals defining the chord relative to its root.
    /// In code, each chord type is represented by its semitone intervals:
    pub fn intervals(&self) -> &'static [u8] {
        match self {
            ChordType::Major => &[0, 4, 7],
            ChordType::Minor => &[0, 3, 7],
            ChordType::Dominant7 => &[0, 4, 7, 10],
            ChordType::Major7 => &[0, 4, 7, 11],
            ChordType::Minor7 => &[0, 3, 7, 10],
        }
    }
}

CHORD_TEMPLATES is fully extendable and will support suspended chords in the future. Chord classification is a complex subject: Wikipedia - List of chords
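
As a rough illustration of how such a map might be assembled at startup (a sketch only: the real construction lives in theory/chords.rs and may differ in naming and layout, and LazyLock is just one possible lazy-initialisation mechanism):

use std::collections::HashMap;
use std::sync::LazyLock;

/// Illustrative template map: chord label -> pitch-class set.
pub static CHORD_TEMPLATES: LazyLock<HashMap<String, Vec<u8>>> = LazyLock::new(|| {
    let qualities = [
        ("maj", ChordType::Major),
        ("min", ChordType::Minor),
        ("7", ChordType::Dominant7),
        ("maj7", ChordType::Major7),
        ("min7", ChordType::Minor7),
    ];
    let notes = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"];
    let mut map = HashMap::new();
    for (root_pc, root) in notes.iter().enumerate() {
        for (suffix, quality) in &qualities {
            // Transpose the interval template onto this root, modulo 12
            let pcs: Vec<u8> = quality
                .intervals()
                .iter()
                .map(|i| (root_pc as u8 + i) % 12)
                .collect();
            map.insert(format!("{root}{suffix}"), pcs);
        }
    }
    map
});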


Discrete Pitch Classes from a Continuous Spectrum

When we analyse a sound using the FFT, we obtain a spectrum that shows energy spread across a continuous range of frequencies.

Music theory divides this continuous range into discrete steps, most commonly 12 semitones per octave. This division is logarithmic: each semitone is a frequency multiple of $2^{1/12}$. In other words, if a note has frequency $f$, the next semitone up has frequency $f \times 2^{1/12}$.

The MIDI tuning system formalises this idea by quantising the continuous frequency spectrum into discrete steps. If $f$ is a frequency in $\text{Hz}$, then the corresponding MIDI note number $N_{\text{MIDI}}$ is given by

$$N_{\text{MIDI}}=69 + 12\log_2\left(\frac{f}{440\,\text{Hz}}\right)$$

where $440 \ \text{Hz}$ is the standard tuning frequency for $\text{A}4$.

We define a function in Rust to quantise frequencies:

/// List of note names representing the 12 chromatic pitch classes.
pub static NOTES: [&str; 12] = [
    "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"
];

/// Maps a frequency in Hz to its nearest equal-tempered note name, e.g. "A4".
pub fn freq_to_note(freq: f32) -> String {
    let a4 = 440.0;
    // Round to the nearest MIDI note number
    let midi = (69.0 + 12.0 * (freq / a4).log2()).round() as i32;
    let note_name = NOTES[(midi as usize) % 12];
    let octave = (midi / 12) - 1; // MIDI note 0 is C-1
    format!("{}{}", note_name, octave)
}
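
For example (quick sanity checks that follow directly from the formula above):

assert_eq!(freq_to_note(440.0), "A4");  // concert-pitch reference
assert_eq!(freq_to_note(82.41), "E2");  // low E string of a guitar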

An inverse function note_to_freq was also developed. More information can be found in pitch.md


Audio I/O and FFT

The hound crate is used to read .wav files and extract audio samples as Vec<f32> along with the sample rate in Hz. Both 16-bit integer and 32-bit float formats are supported.

See read_wav_file() for details. Usage:

use std::path::Path;

let path = Path::new("assets/guitar_samples/A_chords/Amaj.wav");
let (sample_rate, samples) = read_wav_file(path).unwrap();
// Debug/sanity check
println!("Sample rate: {}", sample_rate);
println!("First 10 samples: {:?}", &samples[..10]);

Fast Fourier transform

The FFT $\widetilde{V}$ of some data array $V$ is a transformation from temporal space to frequency space. This operation is delegated to the rustfft crate via the wrapper function compute_fft, which follows NumPy's norm='forward' convention (the forward transform is scaled by $1/N$).

Usage:

let fft_output: Vec<Complex<f32>> = compute_fft(&samples);
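
A minimal sketch of such a wrapper (illustrative; the project's own compute_fft may differ in detail):

use rustfft::{num_complex::Complex, FftPlanner};

/// Forward FFT scaled by 1/N, matching NumPy's norm='forward'.
pub fn compute_fft(samples: &[f32]) -> Vec<Complex<f32>> {
    let n = samples.len();
    let mut buffer: Vec<Complex<f32>> =
        samples.iter().map(|&s| Complex::new(s, 0.0)).collect();
    FftPlanner::<f32>::new().plan_fft_forward(n).process(&mut buffer);
    // Apply the 1/N forward scaling
    let scale = 1.0 / n as f32;
    for c in &mut buffer {
        c.re *= scale;
        c.im *= scale;
    }
    buffer
}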

Energy Spectral Density (ESD)

The ESD (or power) of the fft_output is defined as,

$$\widetilde{S}_n=|\widetilde{V}_n|^2$$

The power vector has real components. Hence,

let power: Vec<f32> = fft_output.iter().map(|c| c.norm_sqr()).collect();

To plot $\widetilde{S}$ as a function of frequency, we need to generate the frequency bins from the sample spacing ($1/f_s$) and shift the zero-frequency component to the centre of the spectrum.

let freqs = fftfreq(samples.len(), 1.0 / sample_rate as f32);
let shifted_freqs = fftshift_real(&freqs);
let shifted_power = fftshift_real(&power);

Core helpers aim to replicate the numpy.fft backend. See fftfreq and fftshift for details.
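
A sketch of how these helpers can mirror NumPy's behaviour (illustrative; the project's versions may differ):

/// Bin-centre frequencies for an n-point FFT with sample spacing d,
/// in NumPy's fftfreq ordering: [0, 1, ..., n/2-1, -n/2, ..., -1] / (d*n).
pub fn fftfreq(n: usize, d: f32) -> Vec<f32> {
    let scale = 1.0 / (n as f32 * d);
    (0..n)
        .map(|i| {
            let k = if i < (n + 1) / 2 { i as isize } else { i as isize - n as isize };
            k as f32 * scale
        })
        .collect()
}

/// Rotates the zero-frequency bin to the centre of the spectrum.
pub fn fftshift_real(v: &[f32]) -> Vec<f32> {
    let mid = (v.len() + 1) / 2;
    let mut out = Vec::with_capacity(v.len());
    out.extend_from_slice(&v[mid..]);
    out.extend_from_slice(&v[..mid]);
    out
}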

Using the plotters crate, the figure below shows the ESD of Amaj7.wav. See plot_energy_spectral_density for details.

[Figure: energy spectral density of Amaj7.wav]

The resonance artefact in the above figure has approximate frequency $f\in (110, 115) \ \text{Hz}$, straddling $\text{A}$ and $\text{A}\sharp$. This range is filtered in ChordMatcher's peak-filtering step:

// Cull known broad guitar-body resonance (≈110–115 Hz)
let body_filtered: Vec<PeakData> = peaks
    .iter()
    .filter(|p| !(p.freq >= 110.0 && p.freq <= 115.00))
    .cloned()
    .collect();

Largest $M$ peaks of the ESD

To classify chords, we want to capture (frequency, amplitude) tuples $(f_n, A_n)$, where $A_n$ is the amplitude of $\widetilde{S}(f_n)$.

This process is documented ad nauseam in peak_detection.md.

At a high level, we:

  • Isolate spectral peaks whose amplitudes exceed a specified threshold_percentile.
  • Apply a min_spacing constraint to avoid closely spaced duplicates.
  • Select the top $M$ peaks by amplitude after filtering.
  • Map each frequency to its closest equal-tempered pitch class.
  • Return an ordered list of PeakData containing freq, amplitude, and note.

Usage:

pub struct PeakData {
    pub freq: f32,
    pub amplitude: f32,
    pub note: String,
}

let opts = RunOptions {
    top_n: 6,
    threshold_percentile: 99.5,
    min_spacing: 10.0,
};

let peaks = detect_peaks_from_fft(
    &shifted_freqs,
    &shifted_power,
    opts.threshold_percentile,   
    opts.min_spacing,            
    opts.top_n
);
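
Under the hood, the listed steps might look roughly like this (a sketch only; the real implementation in processing/peak_detection.rs also handles edge cases and logging, and reuses PeakData and freq_to_note from above):

pub fn detect_peaks_from_fft(
    freqs: &[f32],
    power: &[f32],
    threshold_percentile: f32,
    min_spacing: f32,
    top_n: usize,
) -> Vec<PeakData> {
    // 1. Amplitude threshold at the given percentile
    let mut sorted = power.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let idx = ((threshold_percentile / 100.0) * (sorted.len() - 1) as f32) as usize;
    let threshold = sorted[idx];

    // 2. Candidates above threshold (positive frequencies only)
    let mut candidates: Vec<(f32, f32)> = freqs
        .iter()
        .zip(power)
        .filter(|&(&f, &p)| f > 0.0 && p >= threshold)
        .map(|(&f, &p)| (f, p))
        .collect();

    // 3. Greedy min-spacing filter, loudest first, keeping the top N
    candidates.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    let mut kept: Vec<(f32, f32)> = Vec::new();
    for (f, p) in candidates {
        if kept.iter().all(|&(kf, _)| (kf - f).abs() >= min_spacing) {
            kept.push((f, p));
        }
        if kept.len() == top_n {
            break;
        }
    }

    // 4. Map each surviving frequency to its nearest pitch class
    kept.into_iter()
        .map(|(freq, amplitude)| PeakData {
            freq,
            amplitude,
            note: freq_to_note(freq),
        })
        .collect()
}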

For Amaj7.wav we have:

Top 6 peaks:

  Freq: 208.25 Hz, Amp: 1.56e-4, Note: G#3
  Freq: 108.75 Hz, Amp: 1.40e-4, Note: A2
  Freq: 164.50 Hz, Amp: 4.70e-5, Note: E3
  Freq: 220.50 Hz, Amp: 3.12e-5, Note: A3
  Freq: 278.00 Hz, Amp: 2.89e-5, Note: C#4
  Freq: 330.75 Hz, Amp: 8.91e-6, Note: E4

Short Time Fourier Transform (STFT)

src/analysis/spectrogram.rs module:

This module implements a windowed STFT to compute a dB-scaled spectrogram using a real-to-complex FFT (realfft) over overlapping frames.

The goal is to analyse how the spectral content of a signal evolves over time. Unlike a single FFT, which returns global frequency components, the STFT slides a window along the signal to produce a time–frequency representation.

Summary

  • Input: time-domain signal as Vec<f32>
  • Output: Spectrogram struct with
    • Time axis (Vec<f32>)
    • Frequency axis up to max_freq (Vec<f32>)
    • 2D dB matrix (Array2<f32>)
  • Crates used:
    • realfft: Efficient real-to-complex FFT
    • ndarray: Matrix representation

Methodology

  1. Frame-wise windowing

    The input signal is segmented into overlapping frames of length window_size. Successive frames are offset by hop_size samples. The number of frames is computed as:

    $$n_{\text{frames}} = \left\lfloor \frac{L - N}{H} \right\rfloor + 1$$

  • $L$ is total sample length
  • $N$ is window_size
  • $H$ is hop_size

  2. Hann window application

    Each frame is multiplied elementwise with a Hann window:

    $$w[n] = 0.5 \left(1 - \cos\left(\frac{2\pi n}{N-1}\right)\right)$$

    This reduces discontinuities at frame boundaries and mitigates spectral leakage.


  3. Real-to-complex FFT

    We compute the FFT of each windowed frame using:

    let mut planner = RealFftPlanner::<f32>::new();
    let r2c = planner.plan_fft_forward(window_size);

    Only the first $N/2 + 1$ frequency bins are kept due to Hermitian symmetry of the real FFT.


  4. Power spectrum → dB scale

    For each frequency bin $k$ of each frame, compute:

    $$P_k = |\widetilde{V}_k|^2$$

    The power at each frequency bin is converted to a decibel (dB) scale, with a small constant $\epsilon$ added beforehand to prevent taking the log of zero:

    $$P_{\text{dB}} = 10\log_{10}\left(P_k + \epsilon\right)$$


  5. Matrix construction

    All frame-wise dB spectra are stacked into an output matrix of shape:

    $$\text{shape} = \left[n_{\text{freq bins}},\ n_{\text{frames}}\right]$$

    The frequency axis is truncated to remove bins with $f > \text{max}_{\text{freq}}$; steps 1–5 are condensed in the sketch below.
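
Putting the five steps together, a minimal sketch of the frame loop (illustrative; the real spectrogram.rs also builds the time and frequency axes and performs the max_freq truncation):

use ndarray::Array2;
use realfft::RealFftPlanner;

/// dB-scaled STFT matrix of shape [freq_bins x frames].
pub fn stft_db(signal: &[f32], window_size: usize, hop_size: usize) -> Array2<f32> {
    let n_frames = (signal.len() - window_size) / hop_size + 1;
    let n_bins = window_size / 2 + 1; // Hermitian symmetry of the real FFT

    // Step 2: Hann window coefficients
    let hann: Vec<f32> = (0..window_size)
        .map(|n| {
            0.5 * (1.0
                - (2.0 * std::f32::consts::PI * n as f32 / (window_size as f32 - 1.0)).cos())
        })
        .collect();

    // Step 3: plan one real-to-complex FFT, reused for every frame
    let mut planner = RealFftPlanner::<f32>::new();
    let r2c = planner.plan_fft_forward(window_size);
    let mut spectrum = r2c.make_output_vec();

    // Step 5: [n_bins x n_frames] output matrix
    let mut out = Array2::<f32>::zeros((n_bins, n_frames));
    for frame in 0..n_frames {
        // Step 1: extract the frame; Step 2: apply the window
        let start = frame * hop_size;
        let mut windowed: Vec<f32> = signal[start..start + window_size]
            .iter()
            .zip(&hann)
            .map(|(s, w)| s * w)
            .collect();
        r2c.process(&mut windowed, &mut spectrum).unwrap();
        // Step 4: power -> dB, with epsilon guarding log(0)
        for (k, c) in spectrum.iter().enumerate() {
            out[[k, frame]] = 10.0 * (c.norm_sqr() + 1e-10).log10();
        }
    }
    out
}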


Output structure

pub struct Spectrogram {
    pub freqs: Vec<f32>,    // Frequency axis (Hz)
    pub times: Vec<f32>,    // Time axis (s)
    pub data: Array2<f32>,  // Spectrogram in dB [freq_bins x time_bins]
}

Notes

  • A reference STFT crate was considered but abandoned due to outdated or broken dependencies.
  • This implementation ensures full control over axis truncation, scale handling, and matrix layout.
  • The returned spectrogram is suitable for visualisation with plotters.

Parameters

| Parameter | Meaning | Example |
|---|---|---|
| window_size | Length of each FFT frame (samples) | 1024 |
| hop_size | Step size between frames (samples) | 256 |
| max_freq | Upper limit for frequency axis (Hz) | 3000.0 |

These correspond to nperseg and noverlap = nperseg - hop_size in SciPy’s spectrogram().

For Amaj7.wav we have:

[Figure: dB-scaled spectrogram of Amaj7.wav]


Chord Matching Engine

src/analysis/matching.rs module

ChordMatcher is a pluggable scoring engine that converts a small set of top-N spectral peaks into a chord label. Its design deliberately separates fast maths from music-theory heuristics, making it easy to extend. It is exposed via the public API identify_chord.

| Layer | File / Type | What it does | How to extend |
|---|---|---|---|
| Peak prep | PeakData (processing::peak_detection) | FFT peaks are de-noised → [PeakData]. | Tune top_n, threshold_percentile, or swap in a smarter peak-finder. |
| Metadata cache | ScoringMetadata | One pass builds pitch_classes, pc_counts, max_amplitudes, bass_pc. | Add new derived features (e.g., spectral centroid) without touching the scoring loop. |
| Template loop | ChordMatcher::score_chord | Iterates over CHORD_TEMPLATES; computes a weighted score. | Add a new weight term by editing one function; all templates inherit it automatically. |
| Heuristic knobs | ChordMatcher { amplitude_scale, penalty_extra, … } | Field values drive the formula. | Expose them via CLI flags or a TOML config for easy tuning. |
| Filtering | filter_peaks + Voicing enum | Drops impossible bass notes (Open, Barre, etc.). | Add Voicing::PowerChord, Voicing::DropD, etc. |
| Logging plug-in | Box<dyn MatchLogger> | Injects FileLogger, NullLogger, or your own implementation. | Swap in a JsonLogger or tracing subscriber without recompiling the matcher. |

Scoring formula (default weights)

total_score = base_score                                // template-note recall
            + amplitude_scale   × Σ loudness(matched)   // louder matches = better
            + order_bonus       (root in bass)          // voicing preference
            + repetition_weight × extra occurrences     // open-string drones, etc.
            − penalty_extra     × extraneous notes      // stray peaks hurt

All coefficients are public fields, so experiments in Python can overwrite these values via serde/TOML and re-run without code edits.

Usage:

use serde::Deserialize;
use std::fs;

#[derive(Deserialize)]
struct MatcherCfg {
    amplitude_scale:   f32,
    penalty_extra:     f32,
    order_bonus:       f32,
    repetition_weight: f32,
}

let cfg: MatcherCfg = toml::from_str(&fs::read_to_string("weights.toml")?)?;
let matcher = ChordMatcher {
    amplitude_scale:   cfg.amplitude_scale,
    penalty_extra:     cfg.penalty_extra,
    order_bonus:       cfg.order_bonus,
    repetition_weight: cfg.repetition_weight,
    ..Default::default()
};

Extending CHORD_TEMPLATES

New qualities (sus2, add9, diminished, …) live in theory/chords.rs:

ChordType::Sus2 => &[0, 2, 7],

The matcher auto-discovers them at run-time; no other changes needed.

TL;DR: ChordMatcher is a self-contained, hot-swappable component; adjust weights, add templates, or plug in alternative loggers without ripple effects across the codebase.


Batch-Matching Diagnostics & Logging Pipeline

The batch-matching toolchain ties together three key modules:

| Rust file | Responsibility |
|---|---|
| analysis/matching.rs | Core heuristic engine — converts FFT peak lists into a best-guess chord plus a rich MatchDetails breakdown. |
| utils/logging.rs | Writes per-sample logs, full Markdown diagnostics, and a CSV summary (results.csv). Location defaults to output/logs/, but tests can override via SA_LOG_BASE. |
| dev/debug_batch_matching.rs | Convenience runner that loops over a folder of samples, feeds each one through the matcher, and hands the results to the logger. |

A thin CLI wrapper lives in src/bin/test_batch_matching.rs. Run it from project root:

cargo run --bin test_batch_matching

What gets produced?

output/
├── diagrams/
│   └── group/              # SVG images of the chord diagram
└── logs/
    ├── group/              # one folder per chromatic root
    │  ├─ passes/           # .log files for correct predictions
    │  ├─ failures/         # .log files for mismatches
    │  └─ full_scores/      # Markdown tables of every candidate ≥2-note match
    └─ results.csv          # flat file with 1 row per sample

  • .log — one-liner score breakdown (quick grep-able view).
  • full_scores/<label>.md — rich Markdown report with peak lists and a sortable table of candidate chords.
  • results.csv — machine-readable dataset for later analysis.

Chord Diagrams

utils/chord_diagrams.rs

| What it does | Key points |
|---|---|
| Generates a guitar-fret SVG (e.g. F#sus2.svg) and drops it into output/diagrams/…. | • Built on a fork of whostolemyhat/chord-gen. • Added Rust-native API; no more CLI calls. • Optional background rectangle for light/dark integration. |
| Picks the correct shape family (Barre5, Barre6, Open), then transposes & sets the barre fret. | • assets/voicings.json holds base shapes. • BARRE5_OFF / BARRE6_OFF offset tables ensure the barre is drawn at the right fret. • Graceful fallback (Barre5 → Barre6 → Open) if a quality is missing. |
| Lets callers decide where the file goes via PathMode. | PathMode::Test → output/diagrams/test/{file}.svg; PathMode::Group → output/diagrams/{root}_group/{file}.svg |
| Integrates with the matcher in one line (only for Certain matches). | render_diagram(label, &name, false, PathMode::Group)?; |

Tiny API

use utils::chord_diagrams::{render_diagram, PathMode};

render_diagram(
    "Bmaj_sample",  // stem
    "Bmaj",         // chord label
    true,           // background on
    PathMode::Group, // or Test
)?;

Internal logic handles the creation of Open or Barre chord shapes.

[Figure: example rendered chord diagram (Amaj7)]

Fork vs Upstream

| Upstream chord-gen | Our fork |
|---|---|
| CLI-only (--frets …) | Added types::Chord + render_svg() (returns u64 tmp-name). |
| Random filenames | We rename to {label}.svg after render. |
| Fixed footer credit | Template stripped; MIT licence left intact. |

TL;DR: drop render_diagram() into any MatchOutcome branch and you'll get ready-to-embed SVG chord charts alongside the log files.


Next step: Python parameter tuning

The CSV is designed for painless import into pandas / Jupyter:

import pandas as pd

df = pd.read_csv("output/logs/results.csv")
df.head()

Possible investigations (02/05/2025):

  • Feature importance — correlate base_score, amplitude_bonus, etc. with is_match to spot overweight or under-weight terms.
  • Threshold sweeps — grid-search penalty_extra, order_bonus, etc. to maximise accuracy.
  • Harmonic library — mine recurring “extraneous” notes to seed a lookup table of expected overtones (e.g., open-string drones, body resonances).

Once tuned, we'll feed the new constants back into analysis/matching.rs (or promote them to a config file) and re-run the batch script. Rinse & repeat until the hit-rate nudges past the current 80 % success mark!


Update (08/05/2025)

A Python data analysis project was created to run grid searches, and we eventually entered a feedback loop of weights.toml updates. As a result, new scoring features were developed in ChordMatcher:

  1. detect_triad — Returns true if root–3rd–5th or root–2nd–5th are all present in the detected set.
  2. seventh_evidence
    • $+$score when 7th is confidently present,
    • $-$score when expected but weak / missing.
  3. Normalised peak amplitudes — Improve scoring transparency.

These optimisations led to a 100% success rate for the library of samples (now extended with sus2 chords). However, there were some borderline cases. Hence, it was necessary to measure confidence in a ChordMatcher match. Two parameters were defined:

  • DELTA_S_STAR (0.45): $\Delta\text{S}=\text{S}_1 - \text{S}_2$, where
    • $\text{S}_1=$ best_match score
    • $\text{S}_2=$ second_match score
  • CONF_STAR (0.75): $\Sigma_{\text{S}}=\exp(\text{S}_1) \,/\, \left(\exp(\text{S}_1) + \exp(\text{S}_2)\right)$
    • Soft-max confidence of the best chord over the runner-up

The constants come from the fifth‑percentile of each metric measured on the clean 100%‑accurate sample set.
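
As a sketch of how the gate might be expressed in code (a hypothetical helper; the real thresholding lives inside ChordMatcher):

const DELTA_S_STAR: f32 = 0.45;
const CONF_STAR: f32 = 0.75;

/// True when the best match clears both confidence gates.
fn clears_gate(s1: f32, s2: f32) -> bool {
    let delta = s1 - s2;                          // ΔS
    let conf = s1.exp() / (s1.exp() + s2.exp());  // soft-max confidence ΣS
    delta >= DELTA_S_STAR && conf >= CONF_STAR
}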

In practice

  • A loud rogue peak that makes a second template almost as plausible drops ΔS, failing the gate.
  • A low‑fret recording where amplitudes are flatter shrinks ΣS, also failing.
  • Clean, confident matches clear both thresholds and are logged as passes.

The result is captured by the MatchOutcome enum:

#[derive(Debug)]
pub enum MatchOutcome {
    Certain { name: String, details: MatchDetails, conf: f32 },
    Ambiguous { name: String, second: String, delta: f32, conf: f32 },
    NoMatch,
}

NoMatch is returned if and only if peak data is invalid. Therefore, a mechanism needs to be written to handle Ambiguous when transitioning to streaming audio.
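
For illustration, a caller might consume the outcome like this (a hypothetical call site; identify_chord's exact signature may differ):

match identify_chord(&peaks) {
    MatchOutcome::Certain { name, conf, .. } => {
        println!("{name} (confidence {conf:.2})");
    }
    MatchOutcome::Ambiguous { name, second, delta, conf } => {
        println!("{name}? runner-up: {second} (ΔS = {delta:.2}, conf = {conf:.2})");
    }
    MatchOutcome::NoMatch => println!("no usable peak data"),
}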

The chord library was recorded using 5th-fret barre shapes and a capo (to make it as painless as possible). $\text{G}$ and $\text{G}\sharp$ were the only groups with Ambiguous matches by the above metrics.

There's some distance between the highest and lowest chord total_scores. Therefore, further grid sweeps are required to push that confidence threshold north.


Project Structure

signal_analysis/
├── assets/ 
│    ├── figures/         
│    └── guitar_samples/            # Sample guitar recordings (WAV files)
├── notes/                          # Markdown notes documenting modules
│    ├── analysis/         
│    ├── processing/
│    ├── theory/
│    └── utils.md
├── output/                         # Plots and results (built by Rust)
├── src/
│    ├── analysis/                  # Visualisation and STFT
│    │    ├── mod.rs
│    │    ├── spectrogram.rs
│    │    └── visualisation.rs
│    ├── audio/                     # (Placeholder - future audio features)
│    │    └── mod.rs
│    ├── processing/                # FFT utilities, spectral analysis
│    │    ├── mod.rs
│    │    ├── fft_utils.rs
│    │    └── peak_detection.rs
│    ├── theory/                    # Static music theory templates
│    │    ├── mod.rs
│    │    ├── chords.rs
│    │    └── notes.rs
│    ├── utils/                     # Sample loading, WAV file reading
│    │    ├── mod.rs
│    │    ├── run_samples.rs
│    │    ├── sample_loader.rs
│    │    └── wav_reader.rs
│    ├── lib.rs                     # Library API (re-exports modules)
│    └── main.rs                    # CLI runner for experiments
├── Cargo.toml
└── README.md

Implemented Features

  • WAV file parsing — Reads normalised Vec<f32> samples + sample rate via hound
  • Static chord template library — Defines CHORD_TEMPLATES for major, minor, dominant7, major7 and minor7 structures
  • FFT computation — Forward transform using rustfft (NumPy norm='forward' convention)
  • Energy spectral density — Computes $|\widetilde{V}|^2$ from the complex FFT
  • Frequency bin generation — Replicates NumPy’s fftfreq
  • Spectrum shifting — Centres zero frequency using a real-valued fftshift
  • Peak detection — Extracts top-$N$ spectral peaks using percentile thresholding and spacing constraints
  • Pitch class quantisation — Logarithmic MIDI mapping with equal-tempered tuning
  • Short-Time Fourier Transform (STFT) — Hann-windowed, frame-wise time–frequency analysis
  • Spectrogram plotting — High-resolution dB-scaled figures via plotters + colorous
  • Unit tests — Coverage for FFT utilities, peak filtering and chord template logic

Project Status and Future Work

This project is currently a research and prototyping platform for chord recognition from raw guitar audio. Development is organised around the following milestones:

Batch Audio Analysis (Completed, tuning ongoing)

A full pipeline exists for loading .wav samples, computing FFTs, detecting peaks, matching chord templates and writing structured logs. Current work focuses on hyperparameter optimisation and confidence metrics.

Spectrogram Generation (Completed)

A self-contained STFT implementation produces dB-scaled spectrograms suitable for plotting or later real-time visualisation.

Command-Line Tooling (In development)

A dedicated CLI is being built to:

  • Load and process an audio file directly from the command line
  • Display classification results in the terminal
  • Optionally save plots and logs to disk

Future integration may include argument parsing (clap) and progress bars (indicatif).

Batch Processing and Folder Watching (Prototype complete)

Infrastructure exists to scan a directory of samples, process each file, and aggregate results to CSV/Markdown. JSON-based persistence via serde is supported.

Live Audio Input (Planned)

The architecture was designed with streaming support in mind. Planned extensions include:

  • Real-time audio capture via cpal
  • Incremental FFT and chord matching
  • Real-time confidence gating

GUI Application (Planned)

Longer-term, the goal is to provide a lightweight graphical interface for:

  • Drag-and-drop audio files
  • Real-time spectrogram visualisation
  • Live chord identification

Potential frameworks include egui/eframe or iced.


Contributing

Contributions are welcome. Areas of interest include:

  • Improved DSP algorithms (FFT/STFT performance, peak analysis)
  • Extensions to the chord template library (sus, add, diminished, etc.)
  • Additional heuristics or confidence measurements for the matcher
  • CLI improvements and configuration interfaces
  • Real-time or GUI-based features
  • Python bindings or integration examples

If you are interested in collaborating, please open an issue or submit a pull request. The repository is structured for clarity and extensibility, and internal modules are documented with the intention of lowering the barrier to entry for contributors.


Built with 🦀 Rust and 🎸 music in mind.

Licence: MIT (Pineapple Bois)


Brian Eno articulates well the difficulty of objectively quantifying musical experience.

“As far as your mind is concerned, nothing happens the same twice, even if in every technical sense, the thing is identical. Your perception is constantly shifting. It doesn’t stay in one place.” [1]

Footnotes

  1. More fabulous quotes from Eno
