HarshitMathur01/SHL-Hiring-Assessment---Kaggle

🧠 SHL Grammar Scoring - Kaggle Competition 2025

This repository contains my complete solution for the SHL Hiring Assessment hosted on Kaggle. The task is to predict grammar proficiency scores (0 to 5) from audio clips of spoken English by candidates. I explored multiple deep learning and machine learning approaches using both audio signals and transcribed text.

Final Best Approach: A multi-modal ensemble combining audio features + Whisper transcripts, fed into an MLP head.


📌 Problem Overview

Given a dataset of audio responses and grammar scores:

  • Predict a continuous grammar score for new audio clips.
  • Evaluation Metric: Mean Squared Error (MSE)
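
As a quick illustration of the metric, MSE is the average of the squared differences between true and predicted scores. The sketch below uses plain Python; the labels and predictions are made-up values on the 0 to 5 scale, not competition data, and `mean_squared_error` is an illustrative helper name.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between labels and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must be non-empty")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels and predictions on the 0-5 grammar scale.
labels = [2.0, 3.5, 5.0, 4.0]
preds = [2.5, 3.0, 4.0, 4.0]
mse = mean_squared_error(labels, preds)  # (0.25 + 0.25 + 1.0 + 0.0) / 4
```

Because errors are squared, one prediction that is off by a full point costs as much as four predictions that are each off by half a point.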

🗂️ Dataset

Each training sample includes:

  • .wav audio file (spoken answer)
  • label (grammar proficiency, float between 0 and 5)

🔍 Approaches Overview

🔹 Approach 1: Audio Features → XGBoost / MLP

  • Used librosa to extract:
    • Waveform
  • Models:
    • facebook/wav2vec2-base-960h for audio features
    • XGBoostRegressor with RandomizedSearchCV for tuning
    • Deep MLP regressor to predict the grammar score
  • Insights:
    • Fast to compute, but acoustic features alone capture little of the grammatical content

🔹 Approach 2: Transcript → Text-based Regression

  • Used Whisper to transcribe audio to text
  • Processed text with:
    • BERT tokenizer + embeddings (bert-base-uncased)
  • Fed into:
    • MLP
  • Strength: Captured syntactic and grammatical errors well

🔹 Approach 3: Audio Embeddings from Wav2Vec2 + Text Fusion

  • Used facebook/wav2vec2-base-960h to extract embeddings from raw waveforms
  • Pros:
    • Learned rich acoustic representations
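
An encoder such as wav2vec2 emits one vector per short audio frame, so a clip-level regressor needs a fixed-size summary of those frames. The README does not say how frames were pooled; the sketch below assumes simple mean pooling over the non-padded frames, with a small hand-made array standing in for real encoder output. `mean_pool` and `n_valid` are illustrative names, not part of the repository.

```python
import numpy as np


def mean_pool(frames, n_valid=None):
    """Average frame-level vectors (T, D) into one clip-level vector (D,).

    n_valid is the number of leading frames that hold real audio;
    any remaining frames are treated as padding and ignored.
    """
    frames = np.asarray(frames, dtype=np.float32)
    if frames.ndim != 2:
        raise ValueError("expected a 2-D array of shape (frames, dim)")
    if n_valid is None:
        n_valid = frames.shape[0]
    if not 0 < n_valid <= frames.shape[0]:
        raise ValueError("n_valid must be between 1 and the number of frames")
    return frames[:n_valid].mean(axis=0)


# Stand-in for encoder output: 4 frames of dimension 3, last frame is padding.
hidden = np.array([[1.0, 2.0, 3.0],
                   [3.0, 2.0, 1.0],
                   [2.0, 2.0, 2.0],
                   [0.0, 0.0, 0.0]])
clip_vector = mean_pool(hidden, n_valid=3)
```

Excluding padded frames matters when clips of different lengths are batched together, since zero-padding would otherwise pull the average toward zero for short clips.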

✅ Approach 4 (Best): Audio + Text Fusion

  • Combined:
    • WavLM audio embeddings
    • Whisper transcripts → BERT embeddings
  • Concatenated into a single feature vector
  • Fed into a custom MLP regressor
  • Result: Lowest MSE on validation set

🧪 Model Architecture

Input (Audio Features + Text Embeddings)
             ↓
          Concatenation
             ↓
         BatchNorm1d
             ↓
            MLP
        (ReLU + Dropout)
             ↓
          Linear Out
             ↓
        Grammar Score
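
The diagram above could be written in PyTorch roughly as follows. This is a minimal sketch, not the repository's actual code: the class name `GrammarScoreHead`, the 768-dimensional inputs (typical of base-sized audio and BERT encoders), the single hidden layer of 256 units, and the dropout rate of 0.3 are all assumptions chosen for illustration.

```python
import torch
from torch import nn


class GrammarScoreHead(nn.Module):
    """Concatenation -> BatchNorm1d -> MLP (ReLU + Dropout) -> Linear out."""

    def __init__(self, audio_dim=768, text_dim=768, hidden_dim=256, dropout=0.3):
        super().__init__()
        fused_dim = audio_dim + text_dim
        self.net = nn.Sequential(
            nn.BatchNorm1d(fused_dim),         # normalise the fused features
            nn.Linear(fused_dim, hidden_dim),  # MLP hidden layer (size assumed)
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, 1),          # single regression output
        )

    def forward(self, audio_feats, text_emb):
        # Join the two modalities along the feature axis.
        fused = torch.cat([audio_feats, text_emb], dim=1)
        return self.net(fused).squeeze(-1)     # shape: (batch,)


torch.manual_seed(0)
model = GrammarScoreHead()
model.eval()  # use running statistics in BatchNorm and disable Dropout
with torch.no_grad():
    # Random tensors stand in for real audio and text embeddings.
    scores = model(torch.randn(4, 768), torch.randn(4, 768))
```

For training, such a head would typically be paired with `nn.MSELoss()`, which matches the competition metric.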
