
Emotion-Driven Audio-Visual Experience: Enhancing Human-Computer Interaction through Real-Time Multimodal Feedback

Introduction

This is the source code for our final project in CS5170 (Spring 2025), taught by Professor Stacey Marsella. The project explores an AI-driven system that generates immersive, personalized audio-visual experiences by detecting user emotions and intent in real time.
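At a high level the system runs a feedback loop: sense the user's current emotion, retrieve relevant memories and context (the RAG step), ask an LLM to turn that into generation prompts, and feed those prompts to the audio and visual generators. The sketch below only illustrates that loop; every class and function name in it is hypothetical and does not correspond to files in this repository.

```python
"""Minimal, hypothetical sketch of the emotion -> retrieval -> LLM -> media loop.
All names are illustrative; the real system uses Streamlit, TouchDesigner,
MusicGen, and Stable Diffusion rather than these stubs."""

from dataclasses import dataclass, field


@dataclass
class EmotionState:
    label: str          # e.g. "calm", "anxious", "excited"
    confidence: float


@dataclass
class Memory:
    history: list[str] = field(default_factory=list)

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        # Stand-in for a vector-store lookup (the retrieval-augmented step).
        return [h for h in self.history if query in h][:k]

    def add(self, entry: str) -> None:
        self.history.append(entry)


def detect_emotion(frame) -> EmotionState:
    # Stand-in for the real-time emotion recognizer (webcam / audio features).
    return EmotionState(label="calm", confidence=0.9)


def llm_generate_prompts(state: EmotionState, context: list[str]) -> dict:
    # Stand-in for the RAG-conditioned LLM call that produces generation prompts.
    return {
        "music": f"slow ambient piece matching a {state.label} mood",
        "visual": f"soft, desaturated scenery reflecting a {state.label} state",
    }


def step(frame, memory: Memory) -> dict:
    state = detect_emotion(frame)
    context = memory.retrieve(state.label)
    prompts = llm_generate_prompts(state, context)
    memory.add(f"{state.label}: {prompts['music']}")  # memory-augmented feedback
    return prompts


if __name__ == "__main__":
    print(step(frame=None, memory=Memory()))
```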

Notes

TouchDesigner files exceed GitHub's 100 MB file-size limit and are therefore hosted at this link

This repository acts as a dump of the source code only. Because of that, the conda environment setup, the file paths, and other specifics of setting up the TouchDesigner pipelines are not documented, so you may not be able to reproduce the results. We will add a step-by-step guide to this README later.

You will also need to supply your own LLM API keys.
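The repository does not document where the keys are read from; one common pattern, shown here purely as an assumption, is to take them from an environment variable rather than hard-coding them:

```python
import os

# Hypothetical variable name; check the code for the exact key it expects.
llm_api_key = os.environ.get("OPENAI_API_KEY")
if llm_api_key is None:
    raise RuntimeError("Set an LLM API key (e.g. export OPENAI_API_KEY=...) before launching the UI.")
```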

  • Streamlit UI: The project's user interface is built with the Streamlit framework. The paths are not currently set up correctly on GitHub, and Meta's MusicGen model must be downloaded and set up locally for the system to work properly (see the first sketch after this list). Proper setup instructions are still being written.
  • TouchDesigner: TouchDesigner is a node-based visual programming platform designed for real-time, interactive multimedia development. It enables rapid prototyping and deployment of audio-visual experiences by chaining modular operators in a dataflow graph. Some modules are configured with default paths that need to be adjusted. Before running this part of the system, a Stable Diffusion model needs to be downloaded and set up locally (see the second sketch after this list). Proper setup instructions are still being written.
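Since the setup steps are not documented yet, the next two snippets are only minimal sketches of what downloading and running the two generative models locally can look like. The checkpoints, prompts, and file names are assumptions, not the configuration this project actually uses.

Loading Meta's MusicGen through the audiocraft library and generating a short clip from a text prompt (the first run downloads the checkpoint):

```python
import torch
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

# Illustrative only: the repo may use a different checkpoint, duration, or prompts.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = MusicGen.get_pretrained("facebook/musicgen-small", device=device)
model.set_generation_params(duration=10)  # seconds of audio to generate

# In the real system the prompt would come from the emotion/LLM pipeline.
wav = model.generate(["calm ambient pads with soft piano, slow tempo"])

# Write the single batch item to disk as a loudness-normalized audio file.
audio_write("calm_scene", wav[0].cpu(), model.sample_rate, strategy="loudness")
```

Sanity-checking a local Stable Diffusion checkpoint with the diffusers library before wiring it into the TouchDesigner graph (TouchDesigner itself may expect the weights in a different format or folder):

```python
import torch
from diffusers import StableDiffusionPipeline

# Illustrative checkpoint; swap in whichever Stable Diffusion weights you set up locally.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# In the real system the prompt would be produced by the LLM from the detected emotion.
image = pipe("a serene forest at dawn, soft golden light", num_inference_steps=25).images[0]
image.save("scene_preview.png")
```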

Demos

Here are the demo links from our presentation. The demos were recorded on a system with an RTX 4080 Max-Q GPU (12 GB VRAM) and a Core Ultra 9 185H CPU. The videos are uploaded to YouTube as unlisted. Links:

Outcomes

About

Developed a real-time AI system combining emotion recognition, RAG-based LLM inference, and generative audio-visual media. Integrated memory-augmented feedback for dynamic emotional adaptation. Improved user emotional resonance by 56% and satisfaction by 67% in pilot studies, showcasing the power of personalized, adaptive multimedia environments.
