This project performs forced alignment using the Montreal Forced Aligner (MFA) on an audio dataset. The goal is to generate time-aligned phone- and word-level transcriptions from audio files and their corresponding text transcripts.
- Conda environment (version used: 24.9.2)
- Instructions to install MFA.
- Create conda environment:
cd forced_alignment
mkdir envs
conda create --prefix envs/mfa_env python=3.11
conda activate envs/mfa_env
conda install -c conda-forge montreal-forced-aligner
- Used the latest montreal-forced-aligner (version: 3.3.8).
- Place your audio and text files in data/, such that the basenames of corresponding files are identical except for the extension:
data
|-- /example_corpus
| |-- F2BJRLP1.wav
| |-- F2BJRLP1.txt
| |-- F2BJRLP2.wav
| |-- F2BJRLP2.txt
| |-- F2BJRLP3.wav
| |-- F2BJRLP3.txt
...
- Note:
*.wav: Audio files.
*.txt: Corresponding text transcriptions.
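Since MFA pairs audio and transcripts by basename, mismatched pairs can be caught before alignment with a small check like the one below (a sketch; `check_corpus_pairs` and the demo file names are illustrative, not part of this repository):

```python
import os
import tempfile

def check_corpus_pairs(corpus_dir):
    """Return stems of .wav files missing a .txt, and .txt files missing a .wav."""
    names = os.listdir(corpus_dir)
    wavs = {os.path.splitext(n)[0] for n in names if n.endswith(".wav")}
    txts = {os.path.splitext(n)[0] for n in names if n.endswith(".txt")}
    return sorted(wavs - txts), sorted(txts - wavs)

# Demo on a throwaway corpus; F2BJRLP2.wav deliberately lacks a transcript.
demo = tempfile.mkdtemp()
for name in ("F2BJRLP1.wav", "F2BJRLP1.txt", "F2BJRLP2.wav"):
    open(os.path.join(demo, name), "w").close()

missing_txt, missing_wav = check_corpus_pairs(demo)
print(missing_txt, missing_wav)  # ['F2BJRLP2'] []
```

In practice the function would be pointed at data/example_corpus instead of the demo directory.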
- Decide the top dictionaries based on the number of OOV (out-of-vocabulary) words.
- Usage:
python src/validate_dicts.py --input_dir <input_dir> --out_dir <out_dir> --lang <lang>
- Example:
python src/validate_dicts.py --input_dir "data/example_corpus" --out_dir "egs/validation_output" --lang "english"
- Perform the forced alignment on the audio and transcription files using the top dictionaries obtained from validation.
- Usage:
python src/align_data.py --input_dir <input_dir> --oov_summary_file <oov_summary_file> --lang <lang> --top_k <top_k>
- Example:
python src/align_data.py --input_dir "data/example_corpus" --oov_summary_file "egs/validation_output/validation_oov_summary.txt" --lang "english" --top_k 3
- Perform validation and alignment using a single bash script.
- Example:
bash egs/align_data.sh
- As the corpus grows, the number of out-of-vocabulary (OOV) words also increases.
- The phoneme token assigned to OOV words in the alignments is spn.
- Using a pre-trained G2P (grapheme-to-phoneme) model, we can estimate pronunciations for those OOVs.
- The G2P-generated dictionary entries for the OOVs can be combined with the pre-trained dictionary to perform the alignments.
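The combination step can be sketched as a simple deduplicating merge, assuming the plain MFA dictionary layout of one `WORD PHONE1 PHONE2 ...` entry per line (the `merge_dictionaries` helper and the sample pronunciations below are illustrative, not part of this repository):

```python
import os
import tempfile

def merge_dictionaries(base_path, g2p_path, out_path):
    """Concatenate two pronunciation dictionaries, dropping duplicate entries.

    Assumes the plain MFA layout: one 'WORD PHONE1 PHONE2 ...' entry per line.
    """
    seen, merged = set(), []
    for path in (base_path, g2p_path):
        with open(path, encoding="utf-8") as f:
            for line in f:
                entry = line.strip()
                if entry and entry not in seen:
                    seen.add(entry)
                    merged.append(entry)
    with open(out_path, "w", encoding="utf-8") as f:
        f.write("\n".join(merged) + "\n")
    return merged

# Demo with tiny throwaway dictionaries (ARPABET pronunciations are made up).
tmp = tempfile.mkdtemp()
base = os.path.join(tmp, "base.dict")
g2p = os.path.join(tmp, "g2p_oov.dict")
with open(base, "w") as f:
    f.write("the\tDH AH0\ncat\tK AE1 T\n")
with open(g2p, "w") as f:
    f.write("cat\tK AE1 T\nzyzzyva\tZ IH1 Z IH0 V AH0\n")

combined = merge_dictionaries(base, g2p, os.path.join(tmp, "combined.dict"))
print(len(combined))  # 3 unique entries
```

src/train_dict.py below automates this workflow end to end; the sketch only shows the merge idea.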
- Usage:
python src/train_dict.py --txt_dir <txt_dir> --corpus_dir <corpus_dir> --base_dictionary <base_dictionary> --g2p_model <g2p_model> --acoustic_model <acoustic_model> --output_dir <output_dir>
- Example:
python src/train_dict.py --txt_dir "data/txt_corpus" --corpus_dir "data/example_corpus" --base_dictionary "dictionaries/english_us_arpa.dict" --g2p_model "english_us_arpa" --acoustic_model "english_us_arpa" --output_dir "egs/custom_dict/aligned_data"
- Note:
dictionaries/english_us_arpa.dict is the pre-trained dictionary, which can be downloaded from english_us_arpa.dict
- A short PDF is attached in docs/ with relevant screenshots showing that the G2P-based custom dictionary helps handle the OOVs, and that the ARPABET phoneme dictionary provides better alignments than the other dictionaries for the example corpus used.
- Output TextGrid files are visualized using the Praat software, available at PRAAT FOR LINUX
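Besides opening the TextGrids in Praat, the aligned intervals can also be read programmatically. A minimal sketch for the long TextGrid format is shown below; the regex approach and the embedded sample (with made-up timings) are illustrative only, and a real pipeline would more likely use a dedicated library such as praatio or textgrid:

```python
import re

# Minimal long-format TextGrid excerpt (timings and labels are made up).
sample = '''File type = "ooTextFile"
Object class = "TextGrid"
xmin = 0
xmax = 1.0
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "words"
        xmin = 0
        xmax = 1.0
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.42
            text = "hello"
        intervals [2]:
            xmin = 0.42
            xmax = 1.0
            text = "world"
'''

# Each interval block is xmin / xmax / text in sequence; the file and tier
# headers also carry xmin/xmax, but they are never followed by a text field,
# so the contiguous pattern below matches interval blocks only.
pattern = r'xmin = ([\d.]+)\s+xmax = ([\d.]+)\s+text = "([^"]*)"'
intervals = [(float(a), float(b), t) for a, b, t in re.findall(pattern, sample)]
print(intervals)  # [(0.0, 0.42, 'hello'), (0.42, 1.0, 'world')]
```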