Builds a machine learning classification model to predict categories of interview responses using real-world conversational data.
This repository hosts a data-driven interview response classification project implemented in Python. It includes:
- A Jupyter notebook (`Interview_Response_Classifier_F.ipynb`) containing data exploration, feature engineering, model training, evaluation, and inference.
- CSV datasets: `train_IA_-_train.csv`, `test_IA_-_test.csv`, and `data_with_pred.csv`, which holds the test data with model predictions.
- Prediction of response categories (e.g., satisfactory, unsatisfactory) using supervised techniques such as Logistic Regression, Random Forest, and SVM.
├── Interview_Response_Classifier_F.ipynb   # Interactive notebook with full ML pipeline
├── train_IA_-_train.csv                    # Training dataset
├── test_IA_-_test.csv                      # Test dataset
├── data_with_pred.csv                      # Test results + model predictions
├── .gitignore                              # Files to exclude from Git
└── README.md                               # Project overview and instructions
- Data Loading & Exploration
  Insights into question-response patterns and class balance.
- Feature Engineering
  Techniques like TF-IDF text encoding, response length, keyword frequency, and sentiment scoring.
- Modeling & Classification
  Implements models such as Logistic Regression, Random Forest, and SVM for predicting response quality.
- Performance Evaluation
  Metrics include accuracy, precision, recall, F1-score, confusion matrix, and ROC-AUC.
- Result Analysis
  Predictions are saved in `data_with_pred.csv` for interpretation and further use.
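
To make the evaluation metrics listed above concrete, here is a minimal sketch of how accuracy, precision, recall, and F1-score follow from the confusion-matrix counts for a two-class problem. It is an illustration only, not the notebook's code: the function names `confusion_counts` and `binary_metrics` are hypothetical, the toy labels are invented, and the notebook may well rely on a library instead of computing these by hand. ROC-AUC is left out because it needs predicted scores rather than hard labels.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fp, fn, tn), treating `positive` as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def binary_metrics(y_true, y_pred, positive="satisfactory"):
    """Compute accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented toy labels, using the example category names from this README.
y_true = ["satisfactory", "satisfactory", "unsatisfactory",
          "unsatisfactory", "satisfactory"]
y_pred = ["satisfactory", "unsatisfactory", "unsatisfactory",
          "satisfactory", "satisfactory"]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Precision and recall are reported separately from accuracy because, when the classes are imbalanced, a model can reach high accuracy while still missing most examples of the rarer class.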