An Android application that recommends and plays songs based on the user's emotion, detected from a facial image. Powered by a custom deep learning model (VGG19, TensorFlow Lite) trained on FER2013.
The app preprocesses a photo captured with the device camera, classifies the user's emotion with an on-device TFLite model, and then recommends and plays songs that match the detected mood.
Install & try the app: Download APK
```mermaid
flowchart TD
    A[Dataset<br/>FER2013] --> B[Image Preprocessing]
    B --> C[Normalization]
    B --> D[Augmentation]
    B --> E[Image Resize<br/>224x224]
    C --> F[Training of<br/>VGG19 Model]
    D --> F
    E --> F
    F --> G[Saving model in<br/>.h5 format]
    F --> H[Testing Model in<br/>Colab]
    F --> I[TensorFlow Lite<br/>Conversion]
    I --> J[Saving Model as<br/>.tflite file]
    J --> K[Sangeet-AI<br/>Android App]
    K --> L[Emotion<br/>Detection using<br/>front camera]
    K --> M[Music/Media<br/>Recommendation<br/>based on user's<br/>mood]

    classDef goldBox fill:#B8860B,stroke:#8B7355,stroke-width:2px,color:#000
    classDef blackBox fill:#2F2F2F,stroke:#555,stroke-width:2px,color:#fff
    classDef grayBox fill:#D3D3D3,stroke:#A9A9A9,stroke-width:2px,color:#000
    classDef whiteBox fill:#fff,stroke:#000,stroke-width:2px,color:#000

    class B,G,H,I,J goldBox
    class F,K blackBox
    class A grayBox
    class C,D,E,L,M whiteBox
```
- Direct Kaggle Integration: Dataset downloaded directly to Google Colab using Kaggle API
- FER2013 Dataset: 35,887 grayscale images (48x48) across 7 emotion classes
- Classes: Angry, Disgust, Fear, Happy, Neutral, Sad, Surprise
- Data Split: 80% training, 10% validation, 10% testing using `splitfolders` (see the sketch below)
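A minimal sketch of the download-and-split step, assuming the Kaggle CLI is configured in Colab; the dataset slug and directory names are illustrative:

```python
# Download FER2013 via the Kaggle API (slug and paths are illustrative).
# !pip install kaggle split-folders
# !kaggle datasets download -d msambare/fer2013 -p data/fer2013 --unzip

import splitfolders

# 80% train / 10% validation / 10% test, keeping one sub-folder per emotion class
splitfolders.ratio(
    "data/fer2013",
    output="data/fer2013_split",
    seed=42,
    ratio=(0.8, 0.1, 0.1),
)
```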
Pixel values are normalized with a custom min-max function:

```python
import numpy as np

def normalization(image):
    """Min-max scale pixel values to the [0, 1] range."""
    Imax = np.max(image)
    Imin = np.min(image)
    return (image - Imin) / (Imax - Imin)
```

- Min-Max Scaling: Pixel values normalized to the [0, 1] range
- Batch Processing: All images processed through the normalization pipeline (see the sketch below)
- Format Conversion: Images converted from JPG to PNG for consistency
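A minimal sketch of that batch loop, assuming OpenCV and the split layout above; the paths are illustrative and `normalization` is the function defined earlier:

```python
import glob
import os

import cv2

src_dir = "data/fer2013_split/train"            # illustrative input folder
dst_dir = "data/fer2013_normalized/train"       # illustrative output folder

for jpg_path in glob.glob(os.path.join(src_dir, "*", "*.jpg")):
    image = cv2.imread(jpg_path, cv2.IMREAD_GRAYSCALE)
    scaled = normalization(image)               # min-max scale to [0, 1]
    out = (scaled * 255).astype("uint8")        # back to 8-bit for writing
    png_path = jpg_path.replace(src_dir, dst_dir).replace(".jpg", ".png")
    os.makedirs(os.path.dirname(png_path), exist_ok=True)
    cv2.imwrite(png_path, out)                  # JPG -> PNG for consistency
```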
- Rotation: Images rotated at 15-20 degree intervals using `imutils.rotate_bound`
- Horizontal Flipping: Mirror images using `numpy.fliplr`
- Affine Transformation: Custom transformation matrices for geometric variations
- Output: 3x augmentation per original image (rotation + horizontal flip + affine transform), as sketched below
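A minimal sketch of one 3x augmentation pass using `imutils`, NumPy, and OpenCV; the rotation angle and affine points are illustrative:

```python
import cv2
import imutils
import numpy as np

def augment(image):
    """Return three augmented variants (rotation, horizontal flip, affine) of one image."""
    # Rotation; rotate_bound expands the canvas so nothing is clipped
    rotated = imutils.rotate_bound(image, 15)

    # Horizontal flip (mirror image)
    flipped = np.fliplr(image)

    # Affine transformation from three source/destination point pairs
    h, w = image.shape[:2]
    src = np.float32([[0, 0], [w - 1, 0], [0, h - 1]])
    dst = np.float32([[0, h * 0.1], [w * 0.9, 0], [w * 0.1, h * 0.9]])
    M = cv2.getAffineTransform(src, dst)
    affine = cv2.warpAffine(image, M, (w, h))

    return rotated, flipped, affine
```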
- Target Size: All images resized to 224x224 for VGG19 compatibility
- Batch Processing: Efficient processing using OpenCV and NumPy arrays
- Memory Optimization: Images saved as `.npy` files for faster loading
- Pixel Normalization: Final rescaling by 1./255 for neural network input (see the sketch below)
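A minimal sketch of the resize-and-cache step under the layout above; because VGG19 expects 3-channel input, the grayscale images are stacked to RGB here (paths and file names are illustrative):

```python
import glob

import cv2
import numpy as np

images, labels = [], []
class_dirs = sorted(glob.glob("data/fer2013_normalized/train/*"))

for label, class_dir in enumerate(class_dirs):
    for path in glob.glob(class_dir + "/*.png"):
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        img = cv2.resize(img, (224, 224))             # VGG19 input size
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)   # 1 channel -> 3 channels
        images.append(img)
        labels.append(label)

X_train = np.array(images, dtype="float32") / 255.0   # final 1./255 rescaling
y_train = np.array(labels)

np.save("X_train.npy", X_train)                       # .npy cache for fast loading
np.save("y_train.npy", y_train)
```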
The VGG19 backbone is loaded with ImageNet weights, frozen, and extended with a 7-class softmax head:

```python
from tensorflow.keras.applications import VGG19
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

# Pre-trained VGG19 backbone without its original classifier
vgg = VGG19(input_shape=[224, 224, 3], weights='imagenet', include_top=False)

# Freeze pre-trained layers
for layer in vgg.layers:
    layer.trainable = False

# Custom classifier head for the 7 emotion classes
x = Flatten()(vgg.output)
prediction = Dense(7, activation='softmax')(x)
model = Model(inputs=vgg.input, outputs=prediction)
```

- Loss Function: Sparse Categorical Crossentropy
- Optimizer: Adam optimizer for adaptive learning rate
- Metrics: Accuracy tracking for performance monitoring
- Early Stopping: Patience=5 on validation loss to prevent overfitting
- Batch Size: 32 for optimal GPU memory utilization (see the compile/fit sketch below)
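A minimal compile/fit sketch under those settings, assuming `X_train`/`y_train` and analogous validation arrays prepared as above (variable names, epochs, and the file name are illustrative):

```python
import tensorflow as tf

model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy'],
)

early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)

history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=50,            # illustrative upper bound; early stopping ends training
    batch_size=32,
    callbacks=[early_stop],
)

model.save('emotion_vgg19.h5')   # saved in .h5 format for later conversion
```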
- Classification Report: Precision, recall, F1-score per emotion class
- Confusion Matrix: Detailed error analysis across emotion categories
- Accuracy Scoring: Final model performance validation
- Visualization: Training/validation loss and accuracy curves (see the evaluation sketch below)
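A minimal evaluation sketch with scikit-learn and Matplotlib, assuming test arrays and the `history` object from the fit above (variable names are illustrative):

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

emotions = ['Angry', 'Disgust', 'Fear', 'Happy', 'Neutral', 'Sad', 'Surprise']

# Per-class precision/recall/F1, confusion matrix, and overall accuracy
y_pred = np.argmax(model.predict(X_test), axis=1)
print(classification_report(y_test, y_pred, target_names=emotions))
print(confusion_matrix(y_test, y_pred))
print('Accuracy:', accuracy_score(y_test, y_pred))

# Training/validation loss and accuracy curves
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='val loss')
plt.plot(history.history['accuracy'], label='train acc')
plt.plot(history.history['val_accuracy'], label='val acc')
plt.xlabel('epoch')
plt.legend()
plt.show()
```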
The trained Keras model is then converted to TensorFlow Lite:

```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tfmodel = converter.convert()

with open('linear.tflite', 'wb') as f:
    f.write(tfmodel)
```

- Model Conversion: Keras model (.h5) converted to TensorFlow Lite (.tflite)
- Mobile Optimization: Model optimized for on-device inference
- Size Reduction: Significant model size reduction for mobile deployment (see the optional quantization sketch below)
- Inference Speed: Optimized for real-time emotion detection
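The conversion shown earlier uses the converter defaults. One common way to obtain further size reduction, if desired, is dynamic-range quantization; this sketch shows the option and does not imply the shipped model uses it:

```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]   # dynamic-range quantization
quantized = converter.convert()

with open('linear_quantized.tflite', 'wb') as f:       # illustrative file name
    f.write(quantized)
```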
- CameraView
  - Used for high-level camera integration (front/back camera switching, easy photo capture).
  - Handles permission checks, lifecycle management, and provides a simple API for camera events.
  - Allows capturing images directly as `Bitmap` objects, suitable for ML preprocessing.
- AndroidX AppCompat and Core Libraries
  - Standard Android compatibility, UI, and lifecycle management.
- TensorFlow Lite
  - Loads and runs the custom-trained emotion detection model.
  - Provides APIs for model loading, input/output tensor management, and fast on-device inference.
  - Used with `TensorImage`, `ImageProcessor`, `ResizeOp`, `NormalizeOp`, and `TensorBuffer` for preprocessing and inference.
- JCPlayer
  - Provides a modern, feature-rich music player interface for Android.
  - Supports playlists, notifications, background playback, and easy integration with URIs/URLs.
  - Used to stream songs from Firebase Storage.
  - Customizable UI components for play/pause, next/previous, and notifications.
- Android MediaPlayer
  - Used internally (by JCPlayer or custom logic) for audio playback control.
- ConstraintLayout, RelativeLayout, ListView, ImageView, TextView (Android SDK)
  - For flexible and responsive UI design.
  - Used to display detected emotion, emoji, and song lists.
- Intent & Activity Navigation
  - Used to transition between the main camera/emotion detection screen and the song recommendation/playback screen.
- Android Handler, Runnable
  - Used for timed UI effects (e.g., ripple animation) and asynchronous operations.
- Android Permission Management
  - Ensures camera, microphone, and storage permissions are checked and requested at runtime as needed.
A simplified view of the on-device inference flow (asset and variable names are condensed for illustration):

```java
private int loadTfLiteModel(Bitmap croppedBitmap) throws IOException {
    // Load the bundled TFLite model (asset name simplified; the real app may differ)
    MappedByteBuffer modelBuffer = FileUtil.loadMappedFile(this, "linear.tflite");
    Interpreter tflite = new Interpreter(modelBuffer);

    // Preprocess the image to match training: resize to 224x224, rescale by 1/255
    TensorImage tensorImage = new TensorImage(DataType.FLOAT32);
    tensorImage.load(croppedBitmap);
    ImageProcessor processor = new ImageProcessor.Builder()
            .add(new ResizeOp(224, 224, ResizeOp.ResizeMethod.NEAREST_NEIGHBOR))
            .add(new NormalizeOp(0.0f, 255.0f))
            .build();
    TensorImage preprocessedImage = processor.process(tensorImage);

    // Prepare the output buffer: one probability per emotion class
    TensorBuffer outputBuffer = TensorBuffer.createFixedSize(new int[]{1, 7}, DataType.FLOAT32);

    // Run inference
    tflite.run(preprocessedImage.getBuffer(), outputBuffer.getBuffer().rewind());

    // Postprocessing: find the class with the highest probability
    float[] outputs = outputBuffer.getFloatArray();
    int index = 0;
    float max = 0.0f;
    for (int i = 0; i < outputs.length; i++) {
        if (outputs[i] > max) {
            max = outputs[i];
            index = i;
        }
    }

    tflite.close();
    return index; // index of the predicted emotion class
}
```

- Preprocessing matches the training pipeline: resize to 224x224 and rescale to [0, 1].
- Output is a probability array `[p1, p2, ..., p7]`; the index with the maximum value is chosen as the emotion.
- Postprocessing maps the index to a human-readable emotion (e.g., 0=Angry, 1=Disgust, ..., 6=Neutral).
- MainActivity:
  - Handles camera permissions and image capture.
  - Calls `loadTfLiteModel()` on the captured image.
  - Gets the predicted emotion index.
  - Sets `AppController.currentMood` accordingly.
- ExpressionDisplayActivity:
  - Reads `currentMood`.
  - Sets emoji, background, and playlist based on emotion.
  - Plays a recommended song via JCPlayer.
- `app/src/main/java/blog/cosmos/home/sangeetai/`: Main application logic
- `app/src/main/java/blog/cosmos/home/sangeetai/activity/`: App screens/activities
- `app/src/main/java/blog/cosmos/home/sangeetai/utils/`: Utility classes (image processing, TFLite inference)
- `app/src/main/java/blog/cosmos/home/sangeetai/constants/`: Mood constants
- `app/src/main/java/blog/cosmos/home/sangeetai/interfaces/`: Callbacks/interfaces
- `ml-notebooks/`: Jupyter notebooks for data prep, training, and TFLite conversion
Sad Songs:
- Jag Soona Soona Lage: https://firebasestorage.googleapis.com/v0/b/moodey-music.appspot.com/o/jag%20soona%20lage.mp3?alt=media&token=23158f8f-f376-4244-a09d-a85500675ebb
- Tujhe Bhula Diya: https://firebasestorage.googleapis.com/v0/b/moodey-music.appspot.com/o/tujhe%20bhula%20diya.mp3?alt=media&token=b9063733-08fa-4134-adc3-cdba117e4ad0
Happy Songs:
- User launches app.
- CameraView opens, user takes a selfie.
- TensorFlow Lite processes image with custom VGG19 model, outputs emotion probabilities.
- Predicted emotion index mapped to mood (e.g., Happy, Sad, Angry, etc.).
- JCPlayer UI displays matching playlist and controls.
- User plays, pauses, or skips songs. UI updates with emoji and background for emotion.
- Camera, microphone, and storage permissions managed and checked at runtime.
- Unit tests and instrumented tests in `app/src/test` and `app/src/androidTest`.
- New emotions: Retrain model with additional classes and update mapping logic in Android code.
- More songs: Extend playlists/HashMaps in AppController.
- Model upgrades: Replace the `.tflite` file and update inference logic for better accuracy.
Feel free to open issues, make feature requests, or submit pull requests!
This README documents the technical aspects of both the ML pipeline and the Android implementation that make up the complete SangeetAI system.

