This is a simple yet useful app that helps you manually correct automatic transcriptions made by Whisper. The app allows you to play audio files and edit their corresponding transcriptions.
- 🔊 Play audio files directly from the browser.
- ✍️ Edit and save transcriptions.
- 📁 Automatically saves the old version of the transcription with the suffix _oldversion.txt.
- 🐍 Python 3.x
- Clone this repository to your local machine.
- Place your audio dataset in the audios folder. For each .wav file, the transcription MUST have the same name but with a .txt extension.
- Run the Installation Script
  This script creates a virtual environment, installs the necessary dependencies, and runs the Flask application.
  ./install.sh
- Run the Application
  Use the following command to run the application if it is not already running from the installation script.
  ./run.sh
- Access the Application
  Open your web browser and go to http://127.0.0.1:5000/.
Your project directory should look like this:
project/
├── app.py
├── requirements.txt
├── install.sh
├── run.sh
├── README.md
├── static/
│   └── play.png # (Optional: sound icon if needed)
├── templates/
│   └── index.html
└── audios/ # Directory containing subfolders with .wav and .txt files
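
Because every .wav file needs a .txt file with the same name, it can help to check the dataset before starting the app. The following is a minimal sketch, not part of the repository: `find_unpaired_wavs` is a hypothetical helper, and it assumes the audio files live under audios/ (including subfolders) and use a lowercase .wav extension.

```python
from pathlib import Path


def find_unpaired_wavs(audio_root):
    """Return the .wav files under audio_root that have no same-named .txt file."""
    root = Path(audio_root)
    missing = []
    # rglob walks subfolders too, matching the layout described above.
    for wav in sorted(root.rglob("*.wav")):
        if not wav.with_suffix(".txt").is_file():
            missing.append(wav)
    return missing
```

Calling `find_unpaired_wavs("audios")` from the project directory returns a list of paths; an empty list means every audio file has a transcription next to it.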
- Place your .wav audio files and their corresponding .txt transcription files in the audios directory. Ensure that each .wav file has a corresponding .txt file with the same name.
- Run the application using the installation script.
- Use the web interface to play audio files and manually correct transcriptions.
- Press the "Next" button to save the edited transcription and proceed to the next file. The old version of the transcription will be saved with the suffix _oldversion.txt.
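
The save-and-backup behaviour of the "Next" button can be sketched roughly as follows. This is an illustration under assumptions, not the code in app.py: `save_with_backup` is a hypothetical function, and it assumes the backup is written next to the original as `<name>_oldversion.txt`.

```python
from pathlib import Path
import shutil


def save_with_backup(txt_path, new_text, suffix="_oldversion"):
    """Copy the current transcription aside, then write the edited text."""
    txt_path = Path(txt_path)
    # clip.txt -> clip_oldversion.txt (naming scheme is an assumption)
    backup_path = txt_path.with_name(f"{txt_path.stem}{suffix}{txt_path.suffix}")
    if txt_path.exists():
        shutil.copy2(txt_path, backup_path)
    txt_path.write_text(new_text, encoding="utf-8")
    return backup_path
```

In this sketch, saving the same file twice overwrites the backup with the previous edit, so only one earlier version is kept. Whether the app behaves the same way is not stated here.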
If you find this tool useful in your research or work, please consider citing or acknowledging it as follows:
Guerrero-López, Alejandro. 2024. Manual Transcription Editor: A tool for manually correcting automatic transcriptions. Available at https://github.com/aguerrerolopez/TranscriptionEditor.
Thanks to GPT for programming almost all of this project. Am I now a prompt engineer? 😉
This project is licensed under the MIT License.