
SangeetAI -- Emotion Based Music Player

An Android application that recommends and plays songs based on the user's emotion detected from a facial image. Powered by a custom deep learning model (VGG19, TensorFlow Lite) trained on FER2013.

Screenshots

The app preprocesses a photo of the user taken with the camera, runs emotion classification using an on-device TFLite model, and then recommends and plays songs matching the detected mood.

Install & try the app: Download APK


Overall ML & App Flow

flowchart TD
    A[Dataset<br/>FER2013] --> B[Image Preprocessing]
    B --> C[Normalization]
    B --> D[Augmentation] 
    B --> E[Image Resize<br/>224x224]
    C --> F[Training of<br/>VGG19 Model]
    D --> F
    E --> F
    F --> G[Saving model in<br/>.h5 format]
    F --> H[Testing Model in<br/>Colab]
    F --> I[TensorFlow Lite]
    I --> J[Saving Model as<br/>tflite file]
    J --> K[Sangeet-AI<br/>Android App]
    K --> L[Emotion<br/>Detection using<br/>front camera]
    K --> M[Music/Media<br/>Recommendation<br/>based on user's<br/>mood]
    classDef goldBox fill:#B8860B,stroke:#8B7355,stroke-width:2px,color:#000
    classDef blackBox fill:#2F2F2F,stroke:#555,stroke-width:2px,color:#fff
    classDef grayBox fill:#D3D3D3,stroke:#A9A9A9,stroke-width:2px,color:#000
    classDef whiteBox fill:#fff,stroke:#000,stroke-width:2px,color:#000
    class B,G,H,I,J goldBox
    class F,K blackBox
    class A grayBox
    class C,D,E,L,M whiteBox

Deep Technical Overview

ML Development (ml-notebooks)

1. Dataset Acquisition and Setup

  • Direct Kaggle Integration: Dataset downloaded directly to Google Colab using Kaggle API
  • FER2013 Dataset: 35,887 grayscale images (48x48) across 7 emotion classes
    • Classes: Angry, Disgust, Fear, Happy, Neutral, Sad, Surprise
  • Data Split: 80% training, 10% validation, 10% testing using splitfolders (see the sketch below)
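
A minimal sketch of the split step, assuming the standard split-folders package and that the FER2013 class folders have already been downloaded into Colab via the Kaggle API (folder names here are illustrative):

import splitfolders

# Split the per-class image folders into 80% train / 10% val / 10% test
splitfolders.ratio("fer2013", output="fer2013_split",
                   seed=42, ratio=(0.8, 0.1, 0.1))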

2. Advanced Image Preprocessing Pipeline (CaseStudyImagePreprocessing.ipynb)

2.1 Normalization Implementation

import numpy as np

def normalization(image):
    # Min-max scaling: map pixel values into the [0, 1] range
    Imax = np.max(image)
    Imin = np.min(image)
    return (image - Imin) / (Imax - Imin)
  • Min-Max Scaling: Pixel values normalized to [0, 1] range
  • Batch Processing: All images processed through normalization pipeline
  • Format Conversion: Images converted from JPG to PNG for consistency (see the batch sketch below)
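
A minimal sketch of such a batch pass, assuming OpenCV reads each JPG, applies the normalization function above, and re-saves the result as PNG (paths are illustrative):

import glob
import os
import cv2
import numpy as np

for jpg_path in glob.glob("fer2013/train/*/*.jpg"):
    image = cv2.imread(jpg_path, cv2.IMREAD_GRAYSCALE).astype(np.float32)
    normalized = normalization(image)                      # function defined above
    # Rescale to 0-255 so the PNG stores valid 8-bit pixel values
    png_path = os.path.splitext(jpg_path)[0] + ".png"
    cv2.imwrite(png_path, (normalized * 255).astype(np.uint8))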

2.2 Data Augmentation Techniques

  • Rotation: Images rotated at 15-20 degree intervals using imutils.rotate_bound
  • Horizontal Flipping: Mirror images using numpy.fliplr
  • Affine Transformation: Custom transformation matrices for geometric variations
  • Output: 3x augmentation per original image (rotation + h_flip + affine_transform); see the sketch below
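
A minimal sketch of the three augmentations for a single image, assuming OpenCV, NumPy, and imutils are available; the rotation angle and affine points are illustrative:

import cv2
import imutils
import numpy as np

image = cv2.imread("face.png", cv2.IMREAD_GRAYSCALE)
rows, cols = image.shape

# 1. Rotation: rotate_bound keeps the whole rotated image inside the frame
rotated = imutils.rotate_bound(image, 15)

# 2. Horizontal flip (mirror image)
flipped = np.fliplr(image)

# 3. Affine transformation: map three source points to shifted destinations
src = np.float32([[0, 0], [cols - 1, 0], [0, rows - 1]])
dst = np.float32([[0, rows * 0.1], [cols * 0.9, 0], [cols * 0.1, rows * 0.9]])
M = cv2.getAffineTransform(src, dst)
warped = cv2.warpAffine(image, M, (cols, rows))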

2.3 Image Resizing and Preprocessing

  • Target Size: All images resized to 224x224 for VGG19 compatibility
  • Batch Processing: Efficient processing using OpenCV and NumPy arrays
  • Memory Optimization: Images saved as .npy files for faster loading
  • Pixel Normalization: Final rescaling by 1./255 for neural network input (see the sketch below)
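
A minimal sketch of this stage, assuming the images are resized with OpenCV, stacked into a NumPy array, rescaled, and cached as a .npy file (folder and file names are illustrative):

import glob
import cv2
import numpy as np

images = []
for path in sorted(glob.glob("fer2013_split/train/*/*.png")):
    img = cv2.imread(path)                       # 3-channel BGR, since VGG19 expects 3 channels
    img = cv2.resize(img, (224, 224))            # VGG19 input size
    images.append(img)

data = np.stack(images).astype(np.float32) / 255.0   # rescale pixels to [0, 1]
np.save("train_images.npy", data)                    # cache for faster reloading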

3. Model Architecture and Training (CaseStudyProject.ipynb)

3.1 VGG19 Transfer Learning Implementation

from tensorflow.keras.applications import VGG19
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

# Load the VGG19 convolutional base pre-trained on ImageNet, without its classifier
vgg = VGG19(input_shape=[224, 224, 3], weights='imagenet', include_top=False)

# Freeze pre-trained layers
for layer in vgg.layers:
    layer.trainable = False

# Custom classifier head: 7 emotion classes
x = Flatten()(vgg.output)
prediction = Dense(7, activation='softmax')(x)
model = Model(inputs=vgg.input, outputs=prediction)

3.2 Training Configuration

  • Loss Function: Sparse Categorical Crossentropy
  • Optimizer: Adam optimizer for adaptive learning rate
  • Metrics: Accuracy tracking for performance monitoring
  • Early Stopping: Patience=5 on validation loss to prevent overfitting
  • Batch Size: 32 for efficient GPU memory utilization (see the training sketch below)
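
A minimal training sketch matching this configuration; the generator names (train_set, val_set), epoch count, and saved file name are illustrative:

from tensorflow.keras.callbacks import EarlyStopping

model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# Stop training once validation loss has not improved for 5 epochs
early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

# Batch size (32) is configured on the data generators themselves
history = model.fit(train_set,
                    validation_data=val_set,
                    epochs=50,                   # upper bound; early stopping usually ends sooner
                    callbacks=[early_stop])

model.save('emotion_vgg19.h5')                   # saved in .h5 format, as in the flow above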

3.3 Model Evaluation and Metrics

  • Classification Report: Precision, recall, F1-score per emotion class
  • Confusion Matrix: Detailed error analysis across emotion categories
  • Accuracy Scoring: Final model performance validation
  • Visualization: Training/validation loss and accuracy curves (see the evaluation sketch below)
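
A minimal evaluation sketch, assuming preprocessed test images x_test with integer labels y_test and the history object returned by model.fit (names are illustrative):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Predicted class = index of the highest softmax probability
y_pred = np.argmax(model.predict(x_test), axis=1)

print(classification_report(y_test, y_pred))     # precision / recall / F1 per class
print(confusion_matrix(y_test, y_pred))          # error analysis across emotions
print("Accuracy:", accuracy_score(y_test, y_pred))

# Training/validation loss curves from the History object
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='val loss')
plt.legend()
plt.show()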

4. TensorFlow Lite Conversion and Optimization

import tensorflow as tf

# Convert the trained Keras model to a TensorFlow Lite flatbuffer
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tfmodel = converter.convert()
with open('linear.tflite', 'wb') as f:
    f.write(tfmodel)
  • Model Conversion: Keras model (.h5) converted to TensorFlow Lite (.tflite)
  • Mobile Optimization: Model optimized for on-device inference
  • Size Reduction: Significant model size reduction for mobile deployment
  • Inference Speed: Optimized for real-time on-device emotion detection (a quick interpreter sanity check is sketched below)
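
A minimal sketch for sanity-checking the converted linear.tflite in Colab with the TFLite Interpreter; preprocessed_image stands in for a single 224x224 RGB image already scaled to [0, 1]:

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='linear.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Add a batch dimension: shape (1, 224, 224, 3), dtype float32
sample = np.expand_dims(preprocessed_image, axis=0).astype(np.float32)

interpreter.set_tensor(input_details[0]['index'], sample)
interpreter.invoke()

probabilities = interpreter.get_tensor(output_details[0]['index'])[0]
print("Predicted class:", int(np.argmax(probabilities)))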

Android Integration

Key Android Libraries & Their Roles

1. Camera & Image Capture

  • Otalia Studios CameraView

    • Used for high-level camera integration (front/back camera switching, easy photo capture).
    • Handles permission checks, lifecycle management, and provides a simple API for camera events.
    • Allows for capturing images directly as Bitmap objects, suitable for ML preprocessing.
  • AndroidX AppCompat and Core Libraries

    • Standard Android compatibility, UI, and lifecycle management.

2. Image Preprocessing & ML Inference

  • TensorFlow Lite
    • Loads and runs the custom-trained emotion detection model.
    • Provides APIs for model loading, input/output tensor management, and fast on-device inference.
    • Used with TensorImage, ImageProcessor, ResizeOp, NormalizeOp, and TensorBuffer for preprocessing and inference.

3. Music Playback & UI

  • JcPlayer

    • Provides a modern, feature-rich music player interface for Android.
    • Supports playlists, notifications, background playback, and easy integration with URIs/URLs.
    • Used to stream songs from Firebase Storage.
    • Customizable UI components for play/pause, next/previous, and notifications.
  • Android MediaPlayer

    • Used internally (by JcPlayer or custom logic) for audio playback control.

4. UI Components & Navigation

  • ConstraintLayout, RelativeLayout, ListView, ImageView, TextView (Android SDK)

    • For flexible and responsive UI design.
    • Used to display detected emotion, emoji, and song lists.
  • Intent & Activity Navigation

    • Used to transition between the main camera/emotion detection screen and the song recommendation/playback screen.

5. Utility Libraries

  • Android Handler, Runnable

    • Used for timed UI effects (e.g., ripple animation) and asynchronous operations.
  • Android Permission Management

    • Ensures camera, microphone, and storage permissions are checked and requested as needed.

TFLite Model Loading and Inference

private int loadTfLiteModel(Bitmap croppedBitmap) throws IOException {
    // Load the TFLite model bundled with the app (asset name is illustrative)
    MappedByteBuffer modelBuffer = FileUtil.loadMappedFile(this, "emotion_model.tflite");
    Interpreter tflite = new Interpreter(modelBuffer);

    // Preprocess the image for the model: resize to 224x224 and scale pixels to [0, 1],
    // matching the training pipeline
    TensorImage tensorImage = TensorImage.fromBitmap(croppedBitmap);
    ImageProcessor processor = new ImageProcessor.Builder()
        .add(new ResizeOp(224, 224, ResizeOp.ResizeMethod.NEAREST_NEIGHBOR))
        .add(new NormalizeOp(0.0f, 255.0f))
        .build();
    TensorImage preprocessedImage = processor.process(tensorImage);

    // Prepare the output buffer: 1 x 7 emotion probabilities
    TensorBuffer outputBuffer = TensorBuffer.createFixedSize(new int[]{1, 7}, DataType.FLOAT32);

    // Run inference
    tflite.run(preprocessedImage.getBuffer(), outputBuffer.getBuffer());

    // Postprocessing: find the class with the highest probability
    float[] outputs = outputBuffer.getFloatArray();
    int index = 0;
    float max = 0.0f;
    for (int i = 0; i < outputs.length; i++) {
        if (outputs[i] > max) {
            max = outputs[i];
            index = i;
        }
    }

    tflite.close();
    return index; // index = predicted emotion class
}
  • Preprocessing matches the training pipeline: resize, normalize.
  • Output is a probability array [p1, p2, ..., p7]. The index with the max value is chosen as the emotion.
  • Postprocessing maps the index to human-readable emotion (e.g., 0=Angry, 1=Disgust, ..., 6=Neutral).

UI & Recommendation Logic

  • MainActivity:

    • Handles camera permissions and image capture.
    • Calls loadTfLiteModel() on the captured image.
    • Gets the predicted emotion index.
    • Sets AppController.currentMood accordingly.
  • ExpressionDisplayActivity:

    • Reads currentMood.
    • Sets emoji, background, and playlist based on emotion.
    • Plays a recommended song via JcPlayer.

Project Structure

  • app/src/main/java/blog/cosmos/home/sangeetai/: Main application logic
  • app/src/main/java/blog/cosmos/home/sangeetai/activity/: App screens/activities
  • app/src/main/java/blog/cosmos/home/sangeetai/utils/: Utility classes (image processing, TFLite inference)
  • app/src/main/java/blog/cosmos/home/sangeetai/constants/: Mood constants
  • app/src/main/java/blog/cosmos/home/sangeetai/interfaces/: Callbacks/interfaces
  • ml-notebooks/: Jupyter notebooks for data prep, training, and TFLite conversion

Example Song URIs

Sad Songs:

Happy Songs:


App Flow Summary

  1. User launches app.
  2. CameraView opens, user takes a selfie.
  3. TensorFlow Lite processes the image with the custom VGG19 model and outputs emotion probabilities.
  4. The predicted emotion index is mapped to a mood (e.g., Happy, Sad, Angry).
  5. The JcPlayer UI displays the matching playlist and playback controls.
  6. The user plays, pauses, or skips songs; the UI shows an emoji and background matching the emotion.

Permissions & Testing

  • Camera, microphone, and storage permissions managed and checked at runtime.
  • Unit tests and instrumented tests in app/src/test and app/src/androidTest.

Extensibility

  • New emotions: Retrain model with additional classes and update mapping logic in Android code.
  • More songs: Extend playlists/HashMaps in AppController.
  • Model upgrades: Replace .tflite file and update inference logic for better accuracy.

Contributions

Feel free to open issues, make feature requests, or submit pull requests!


This README covers comprehensive technical aspects of both the ML pipeline and Android implementation, providing detailed documentation for the complete SangeetAI system.
