Converting the surprisingness of language into musical compositions
Live Demo: https://surprisal.onrender.com
Demo video: `cat.mp4`
Surprisal theory suggests that the more surprising a word is in context, the longer it takes the human brain to process it. Consider these sentences:
- "The man fed the cat some tuna." (low surprisal)
- "The lawyer presented the cat with a lawsuit." (high surprisal)
The word "cat" is far more surprising in the legal context! This "surprisingness" can be quantified using Claude Shannon's information theory formula:
Surprisal(x) = -log₂ P(x | context)
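For instance, here is a minimal sketch of how per-word surprisal can be computed with the Hugging Face transformers library and GPT-2. This is illustrative only, not the app's actual code:

```python
# Minimal sketch: -log2 P(word | context) with GPT-2 (illustrative, not the app's code).
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def surprisal_bits(context: str, word: str) -> float:
    """Return the surprisal of `word` given `context`, summed over the word's tokens."""
    ids = tokenizer(context, return_tensors="pt").input_ids
    word_ids = tokenizer(" " + word, add_special_tokens=False).input_ids
    total = 0.0
    with torch.no_grad():
        for token_id in word_ids:
            logits = model(ids).logits[0, -1]                 # next-token logits
            log_probs = torch.log_softmax(logits, dim=-1)
            total += -log_probs[token_id].item() / math.log(2)  # nats -> bits
            ids = torch.cat([ids, torch.tensor([[token_id]])], dim=-1)
    return total

print(surprisal_bits("The man fed the", "cat"))            # low surprisal
print(surprisal_bits("The lawyer presented the", "cat"))   # higher surprisal
```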
- Text → Music: Input text → Calculate word surprisal → Map the numeric values to musical pitches → Generate melody (a sketch of one such pitch mapping appears after this list)
- Music → Text: Play musical notes → Find words that would have a similar surprisal value in the given context → Generate text
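As a rough illustration of the Text → Music step, here is one possible way to map surprisal values (in bits) onto pitches. The scale, range, and octave span are assumptions made for this sketch and are not necessarily what the app uses:

```python
# One possible surprisal-to-pitch mapping (assumed values; the app's scheme may differ).
def surprisal_to_midi(surprisal: float, low: float = 0.0, high: float = 20.0,
                      base_note: int = 48, scale=(0, 2, 4, 5, 7, 9, 11)) -> int:
    """Map a surprisal value in bits to a MIDI note on a C-major scale.

    Higher surprisal -> higher pitch, clamped to [low, high] bits and spread
    over two octaves starting at base_note (C3 = MIDI 48).
    """
    clamped = max(low, min(high, surprisal))
    steps = int(round((clamped - low) / (high - low) * (2 * len(scale) - 1)))
    octave, degree = divmod(steps, len(scale))
    return base_note + 12 * octave + scale[degree]

# Example: turn per-word surprisal values for a sentence into a melody
melody = [surprisal_to_midi(s) for s in [2.1, 4.8, 13.5, 6.2]]
print(melody)
```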
This is meant as a fun experiment to help build intuition about how humans process natural language as well as how LLMs model the compositional features of communication. The surprisal data of a sentence could be abstracted and presented in many different ways, but we thought musical melody would be a form where the abstraction actually uses some similar properties of perception and processing.
The results vary with the model used, both because of each model's tokenization process and because of the statistical patterns it learns during training. Making these differences audible (and interactive) has been a fun way to build new intuition and make the "black box" of the models' inner workings more accessible.
We have chosen to focus on small models, partly to lower the computational overhead required, but also to get a sense of how these little guys are trying to squeeze as much coherence as possible out of their training. The live demo only exposes one model, but cloning the repo and running it locally would allow you to experiment with the other models we have selected or to choose your own!
```bash
# Clone and setup
git clone https://github.com/wobblybits/surprisal.git
cd surprisal
pip install -r requirements.txt
python app.py
```

The first time you run it, the transformers library will download and cache the model tensors, which combined total roughly 3 GB. You can disable certain models in config.py. If you want to add your own models, you will need to edit app.py to provide the configuration details as well as enable them in config.py.
```
├── app.py                    # Main Flask application
├── assets/js/
│   ├── config.js             # Configuration and presets
│   ├── surprisal-app.js      # Main application logic
│   └── utilities.js          # Helper functions and error handling
├── templates/wireframe.html  # Main UI template
├── requirements.txt          # Python dependencies
└── .env.example
```
| Model | Size | Description |
|---|---|---|
| GPT-2 | 124M | OpenAI's foundational model |
| DistilGPT-2 | 88M | A distilled version of GPT-2 |
| SmolLM | 135M | Hugging Face's optimized small model |
| Nano Mistral | 170M | Compact Mistral variant |
| Qwen 2.5 | 494M | Multilingual question-answering model |
| Flan T5 | 74M | Google's text-to-text transformer |
Each model has different tokenization and surprisal characteristics, leading to unique musical interpretations.
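A quick way to see where those differences come from is to compare how two of the models split the same sentence into tokens. The model IDs below are the Hugging Face hub names for GPT-2 and SmolLM; this snippet is illustrative and not part of the app:

```python
# Compare tokenizations of the same sentence across two models (illustrative).
from transformers import AutoTokenizer

sentence = "The lawyer presented the cat with a lawsuit."
for name in ["gpt2", "HuggingFaceTB/SmolLM-135M"]:
    tok = AutoTokenizer.from_pretrained(name)
    print(name, tok.tokenize(sentence))
```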
- Testing the Predictions of Surprisal Theory in 11 Languages (Wilcox et al., TACL 2023)
- Expectation-based syntactic comprehension (Levy, Cognition 2008)
- A mathematical theory of communication (Shannon, 1948)
- Language models from Hugging Face
- Audio synthesis with Tone.js
- Icons from Flaticon
- Fonts from Google Fonts and Old School PC Fonts
- Sound effects from Pixabay
More detailed attributions are included at the bottom of the main HTML file (templates/wireframe.html).
Built with ❤️ at the Recurse Center.
