This repo contains Scala CLI scripts for generating songdb TSV files used by audacious-uade.
An experimental Shazam-like tool is also included (see Audio Matching) for identifying music from audio files or via microphone.
- songdb/ - Scala CLI, SQL scripts and raw source TSVs to generate the final processed TSV files
- tsv/encoded/ - the songdb TSV files used by audacious-uade. The files are "encoded" to an almost-binary format to optimize for size and fast in-memory songdb initialization.
- tsv/pretty/ - pretty printed / clear text versions of the TSV files. See TSV Format Specification.
- misc/ - misc bash scripts
There are two alternative hashing methods provided and separate TSVs for each under md5 and xxh32 subfolders.
- MD5 - 48-bits (MSB) as hex, hash calculated from whole file
- XXH32+filesize - 48-bits as hex (32-bit + 16-bit). Calculated and concatenated as hex(XXH32(file)) + hex(filesize & 0xFFFF). XXH32 is calculated from at most the first 256k bytes of the file; filesize is the full file size.
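The two key layouts above can be sketched in Python as follows. This is a minimal illustration, not the code used by audacious-uade or the precalc tooling. The helper names are hypothetical, "256k" is assumed to mean 256 * 1024 bytes, and both parts of the XXH32 key are assumed to be zero-padded (8 + 4 hex digits), which matches the 12-character keys in the example TSV lines. The XXH32 value itself is taken as a parameter because Python's standard library has no XXH32 implementation; it would have to come from a separate XXH32 library, fed with the output of `xxh32_input`.

```python
import hashlib

# Assumption: "256k bytes" means 256 * 1024 bytes.
XXH32_MAX_INPUT = 256 * 1024


def md5_key(data: bytes) -> str:
    """Most significant 48 bits of the MD5 of the whole file, as 12 hex digits."""
    return hashlib.md5(data).hexdigest()[:12]


def xxh32_input(data: bytes) -> bytes:
    """The part of the file that is fed to XXH32 (at most the first 256k bytes)."""
    return data[:XXH32_MAX_INPUT]


def xxh32_key(xxh32_value: int, filesize: int) -> str:
    """Concatenate a 32-bit XXH32 value and the low 16 bits of the full file size.

    Assumption: both parts are zero-padded hex (8 + 4 digits).
    """
    return f"{xxh32_value & 0xFFFFFFFF:08x}{filesize & 0xFFFF:04x}"
```

Note that the file size used in `xxh32_key` is the size of the whole file, not of the truncated XXH32 input.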
- tsv/pretty/*/songlengths.tsv - subsong and songlengths info
- tsv/pretty/*/modinfos.tsv - module file format and channel info
- tsv/pretty/*/metadata.tsv - all metadata from different sources distilled to single TSV
- tsv/pretty/*/amp.tsv - author/album metadata sourced from AMP
- tsv/pretty/*/demozoo.tsv - author/publisher/album/year metadata sourced from Demozoo
- tsv/pretty/*/fujiology.tsv - author/publisher/album/year metadata sourced from Fujiology
- tsv/pretty/*/modland.tsv - author/album metadata sourced from Modland
- tsv/pretty/*/modsanthology.tsv - author/publisher/album/year metadata sourced from Mods Anthology
- tsv/pretty/*/oldexotica.tsv - author/publisher/album/year metadata sourced from ExoticA (old)
- tsv/pretty/*/unexotica.tsv - author/publisher/album/year metadata sourced from UnExoticA
- tsv/pretty/*/wantedteam.tsv - author/publisher/album/year metadata sourced from Wanted Team
- songdb/sources/amigamega.tsv - module infos and songlengths for AmigaMega
- songdb/sources/aminet.tsv - module infos and songlengths for Aminet
- songdb/sources/amp.tsv - module infos and songlengths for AMP
- songdb/sources/demozoo_leftovers.tsv - module infos and songlengths for Demozoo downloads (excluding duplicated sources). Link list is generated with songdb/scripts/sql/demozoo_leftovers.sql
- songdb/sources/demozoo_music.tsv - Demozoo metadata generated with SQL query (songdb/scripts/sql/demozoo_music.sql) from Demozoo postgres database dump
- songdb/sources/demozoo_prods.tsv - Demozoo metadata generated with SQL query (songdb/scripts/sql/demozoo_prods.sql) from Demozoo postgres database dump
- songdb/sources/fujiology.tsv - module infos and songlengths for Fujiology
- songdb/sources/mbnet.tsv - module infos and songlengths for MBnet Apaja
- songdb/sources/modland.tsv - module infos and songlengths for Modland
- songdb/sources/modland_incoming.tsv - module infos and songlengths for Modland incoming directory
- songdb/sources/modsanthology.tsv - module infos and songlengths for Mods Anthology
- songdb/sources/modarchive.tsv - module infos and songlengths for ModArchive
- songdb/sources/nostalgicplayer.tsv - module infos and songlengths for NostalgicPlayer
- songdb/sources/oldexotica.tsv - module infos and songlengths for ExoticA (old)
- songdb/sources/soamc.tsv - module infos and songlengths for SOAMC=
- songdb/sources/tosecmusic.tsv - module infos and songlengths for TOSEC Music
- songdb/sources/tosecmusic_unknown.tsv - module infos and songlengths for TOSEC Music Unknown
- songdb/sources/unexotica.tsv - module infos and songlengths for UnExoticA
- songdb/sources/wantedteam.tsv - module infos and songlengths for Wanted Team
- songdb/sources/zakalwe.tsv - module infos and songlengths for Zakalwe chip git repo
- songdb/sources/audio/*.tsv - audio fingerprints (chromaprint), zstd compressed in git. See scripts/sources/audio.sc for format.
Except for demozoo.tsv, the raw TSV source files are generated using the precalc binary and script from audacious-uade, run against my local copies (mirrors/snapshots) of the various sites and sources.
Note: Running the Scala songdb.sc script requires some additional files that are not included in GitHub, specifically local mirrors of some of the source web pages and/or database files.
Note: Only files playable by audacious-uade are included in the database. The script runs completely locally and does not download anything from the internet.
Here are example snippets and a short spec for the pretty printed TSVs. Example parsing code can be found in songdb/scripts/pretty.sc.
ff5c7b3227e0 0 65920,p 65920,p,!
fffd7a7d8547 1 250840,p+s
fffdc1d765c3 0 40880,l 117860,l 8780,s 79340,l 8080,s 19000,s
Format: [hash]<TAB>[minsubsong]<TAB>[[songlength(ms),songend[,!]]<SPACE>[songlength(ms),songend[,!]]<SPACE>[...]]
- Duplicate subsongs are denoted by !
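As an illustration of the songlengths format above, here is a minimal Python parsing sketch. It is independent of songdb/scripts/pretty.sc, and the type and function names are hypothetical. The songend value is kept as a raw string, since its codes are not described here.

```python
from typing import List, NamedTuple


class Subsong(NamedTuple):
    length_ms: int   # song length in milliseconds
    songend: str     # raw songend value, e.g. "p", "l", "s", "p+s"
    duplicate: bool  # True if the entry carries the "!" marker


class SongLengths(NamedTuple):
    key: str                # the 12 hex digit hash
    min_subsong: int
    subsongs: List[Subsong]


def parse_songlengths(line: str) -> SongLengths:
    """Parse one line: [hash]<TAB>[minsubsong]<TAB>[space separated subsong entries]."""
    fields = line.rstrip("\n").split("\t")
    entries = fields[2].split(" ") if len(fields) > 2 and fields[2] else []
    subsongs = []
    for entry in entries:
        parts = entry.split(",")
        subsongs.append(Subsong(int(parts[0]), parts[1], "!" in parts[2:]))
    return SongLengths(fields[0], int(fields[1]), subsongs)
```

For example, `parse_songlengths("ff5c7b3227e0\t0\t65920,p 65920,p,!")` yields two subsongs of 65920 ms each, the second one flagged as a duplicate.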
fffdc1d765c3 CustomPlay
fffdd3c2bef3 Scream Tracker 3.2x (GUS) 8
fffe869a7f8d AHX v2
Format: [hash]<TAB>[format]<TAB>[channels]
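A modinfos line can be read in a similar way. The sketch below assumes that the channels field may be empty or missing altogether (as in the CustomPlay example above); the names are hypothetical.

```python
from typing import NamedTuple, Optional


class ModInfo(NamedTuple):
    key: str                 # the 12 hex digit hash
    format: str              # module file format name
    channels: Optional[int]  # None when no channel count is given


def parse_modinfo(line: str) -> ModInfo:
    """Parse one line: [hash]<TAB>[format]<TAB>[channels]."""
    fields = line.rstrip("\n").split("\t")
    fields += [""] * (3 - len(fields))  # pad missing trailing fields
    return ModInfo(fields[0], fields[1], int(fields[2]) if fields[2] else None)
```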
feaa9d2a4869 Scorpik Alchemy Toxic Ziemniak 1992
feaba2f4c992 Jazz
feabaabf8a62 Mantronix~Tip Blue House Productions~Rebels~Sonic Projects Blue House 2 1991
Format: [hash]<TAB>[authors]<TAB>[publishers]<TAB>[album]<TAB>[year]
- Multiple authors or publishers are separated by ~
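The metadata format, including the ~ separator, could be parsed along these lines. This is a sketch with hypothetical names. It assumes that trailing empty fields may be missing from a line, and it keeps the year as a string rather than guessing at its exact form.

```python
from typing import List, NamedTuple


class Metadata(NamedTuple):
    key: str               # the 12 hex digit hash
    authors: List[str]
    publishers: List[str]
    album: str
    year: str              # kept as text; empty string when unknown


def split_multi(value: str) -> List[str]:
    """Split a ~ separated field; an empty field gives an empty list."""
    return value.split("~") if value else []


def parse_metadata(line: str) -> Metadata:
    """Parse one line: [hash]<TAB>[authors]<TAB>[publishers]<TAB>[album]<TAB>[year]."""
    fields = line.rstrip("\n").split("\t")
    fields += [""] * (5 - len(fields))  # pad missing trailing fields
    return Metadata(fields[0], split_multi(fields[1]), split_multi(fields[2]),
                    fields[3], fields[4])


if __name__ == "__main__":
    # Hypothetical line, not taken from the database.
    print(parse_metadata("000000000000\tAuthor A~Author B\tGroup X\tSome Album\t1991"))
```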
The TSV files use UTF-8 encoding.
Note: I reserve the right to change the format or location in GitHub of any of the TSV or other files at any time.
Identify Amiga exotic modules and tracker music from audio files or via microphone.
The tool uses a simple brute-force approach for chroma similarity matching, so it can be a bit slow. On an M4 Max, a match takes from half a minute to a minute or two depending on input length. All CPU cores are utilized.
A proper implementation should use something like https://github.com/acoustid/acoustid-index or https://github.com/acoustid/pg_acoustid
It's recommended to record at least 30s of audio, but the more the better. Accuracy can depend on many factors, like audio quality and the unique audio features available. For best results, use fpcalc and audio_match.sc directly with a chromaprint generated from the original audio file (like a YouTube rip) instead of using a microphone.
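To give a rough idea of what brute-force matching means here, the sketch below compares a query fingerprint against every candidate at every alignment and ranks the candidates by score. It is not the algorithm in audio_match.sc. It assumes fingerprints are available as lists of 32-bit integers (the raw form chromaprint can produce) and uses the fraction of equal bits as the similarity measure, which is one common choice. All names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def bit_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of equal bits over the overlapping 32-bit frames of a and b."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    diff = sum(bin((x ^ y) & 0xFFFFFFFF).count("1") for x, y in zip(a, b))
    return 1.0 - diff / (32.0 * n)


def best_match(query: Sequence[int], candidate: Sequence[int]) -> Tuple[float, int]:
    """Slide the query over the candidate; return (best score, frame offset)."""
    if not query or len(query) > len(candidate):
        return bit_similarity(query, candidate), 0
    best = (0.0, 0)
    for offset in range(len(candidate) - len(query) + 1):
        score = bit_similarity(query, candidate[offset:offset + len(query)])
        if score > best[0]:
            best = (score, offset)
    return best


def rank(query: Sequence[int], db: Dict[str, List[int]], top: int = 3) -> List[Tuple[float, str]]:
    """Score the query against every fingerprint in db, highest score first."""
    scored = [(best_match(query, fp)[0], name) for name, fp in db.items()]
    return sorted(scored, reverse=True)[:top]
```

The cost grows with the number of candidates times the number of alignments per candidate, which is why an index-based approach scales better.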
Requirements: scala-cli (https://scala-cli.virtuslab.org/), chromaprint (fpcalc), 8GB+ of memory. For microphone support: sox, (macOS) mic permission for terminal. Also make sure mic input volume is high enough.
Setup:
cd songdb
zstd -d sources/audio/audio_*.zst
./audio_match.sc
Usage:
./audio_match.sc # Prints usage
./audio_match.sc AQAAC1EShUokRcMfoT-OX8RfNKH... # Match specific chromaprint
fpcalc -plain somefile.wav | ./audio_match.sc - # Calculate and match chromaprint from audiofile
./record.sh # Prints usage
./record.sh 0 # Interactive recording and matching using microphone
./record.sh 30 # Record and match 30 seconds using microphone
See songdb/audio_match.sc and songdb/record.sh sources for more details.
Note: Decompress the files in sources/audio first with zstd -d sources/audio/audio_*.zst
Note: Run ./audio_match.sc once before running ./record.sh. It will fetch/install the Scala dependencies on first run, which takes a while.
Note: Only tested on macOS and Linux.
Output:
Score | MD5 | Sub | Authors | Album | Publishers | Year | Filenames
----------------------------------------------------------------------------------------------------------------------------
0,935 | 98d24339316c | 1 | Interphace | The Co-Operation Demo | Andromeda & Infernal Minds | 1990 | MOD.dawn, dawn.mod
0,787 | a241710e5f1f | 1 | | | | | lords of the boards.xm
0,781 | d3a158c9db44 | 0 | Slammy | | | | final voyage.it
List of top matched entries with match score, MD5, subsong and some metadata from songdb.
You can grep the MD5 from songdb/sources/*.tsv and tsv/pretty/md5/*.tsv to locate the matching mod file and all available metadata.
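As an alternative to grep, the same lookup can be sketched in Python. The function name is hypothetical, and the paths assume the script is run from the repository root. Since the column layout of the source TSVs is not described here, the sketch simply reports every line that contains the hash.

```python
import glob
from typing import Iterator, List, Tuple


def find_hash(md5: str, patterns: List[str]) -> Iterator[Tuple[str, str]]:
    """Yield (path, line) for every line containing the hash in the matching files."""
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            with open(path, encoding="utf-8", errors="replace") as f:
                for line in f:
                    if md5 in line:
                        yield path, line.rstrip("\n")


if __name__ == "__main__":
    # Hash taken from the first row of the example output above.
    patterns = ["songdb/sources/*.tsv", "tsv/pretty/md5/*.tsv"]
    for path, line in find_hash("98d24339316c", patterns):
        print(f"{path}: {line}")
```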
The Scala and SQL scripts are licensed under GPL-2.0-or-later.
For any applicable sui generis rights or copyrights I may have over the database files, they are provided under CC BY-NC-SA 4.0 license.
Sources used for the database:
- AmigaMega - https://amigamega.com/index.html
- Aminet - https://aminet.net/
- AMP - https://amp.dascene.net/
- Demozoo - https://demozoo.org/
- ExoticA (old) - http://old.exotica.org.uk/
- Fujiology - https://fujiology.untergrund.net/
- MBnet Apaja - https://archive.org/details/modit
- ModArchive - https://modarchive.org/
- Modland - http://ftp.modland.com/
- Mods Anthology - https://archive.org/details/cdrom-amiga-mods-anthology-1
- NostalgicPlayer - https://nostalgicplayer.dk/
- SOAMC= - https://www.paula8364.com
- TOSEC - https://www.tosecdev.org/
- TOSEC Music - https://www.nzbking.com/details:5191a2da123c8c751b3ffcbb/
- UnExoticA - https://www.exotica.org.uk/wiki/UnExoticA
- Wanted Team - http://wt.exotica.org.uk/
- WHDLoad-Database - https://github.com/MrV2K/WHDLoad-Database
- Wikipedia - https://wikipedia.org
- Zakalwe - git://zakalwe.fi/chip
This database is also used by:
- HippoPlayer - https://github.com/koobo/HippoPlayer