A machine learning project that predicts obesity categories using lifestyle and physical condition data. The model classifies individuals into different obesity levels based on various behavioral and demographic features.
This project implements a machine learning model that predicts obesity levels from eating habits, physical condition, and lifestyle choices. The dataset combines categorical and numerical features, and the model classifies each individual into one of seven obesity categories.
- Multi-class obesity prediction
- Comprehensive data preprocessing
- Feature importance analysis
- Model performance evaluation
- Cross-validation implementation
- Handling of imbalanced classes
- Interactive visualizations
- Python 3.9 or higher
- Required libraries:
  - numpy>=1.21.0
  - pandas>=1.3.0
  - scikit-learn>=1.0.0
  - seaborn>=0.11.0
  - matplotlib>=3.4.0
  - imbalanced-learn>=0.8.0 (imported as `imblearn`)

Clone the repository:

    git clone https://github.com/SnowAncestor/AI_Project.git
    cd AI_Project

Create and activate a virtual environment (recommended):

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate

Install required packages:

    pip install -r requirements.txt

- Usage instructions: [Later]
| Feature | Description | Values | Encoding |
|---|---|---|---|
| Gender | Gender of the individual | Female, Male | 0, 1 |
| family_history_with_overweight | Family history of overweight | no, yes | 0, 1 |
| FAVC | Frequent consumption of high-calorie food | no, yes | 0, 1 |
| CAEC | Eating between meals | Always, Frequently, Sometimes, no | 0, 1, 2, 3 |
| SMOKE | Smoking status | no, yes | 0, 1 |
| SCC | Calorie consumption monitoring | no, yes | 0, 1 |
| CALC | Alcohol consumption | Always, Frequently, Sometimes, no | 0, 1, 2, 3 |
| MTRANS | Transportation used | Automobile, Bike, Motorbike, Public_Transportation, Walking | 0, 1, 2, 3, 4 |
| Feature | Description | Categories | Encoding |
|---|---|---|---|
| NObeyesdad | Obesity level | Insufficient_Weight, Normal_Weight, Obesity_Type_I, Obesity_Type_II, Obesity_Type_III, Overweight_Level_I, Overweight_Level_II | 0, 1, 2, 3, 4, 5, 6 |
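
The encodings in the two tables above can be expressed as plain lookup lists, where each value's position is its integer code. The sketch below is only an illustration of that mapping; the names `CATEGORY_VALUES`, `OBESITY_LEVELS`, `encode_record`, and `decode_label` are hypothetical and are not taken from the project code.

```python
# Category values copied from the tables above; the list index is the code.
CATEGORY_VALUES = {
    "Gender": ["Female", "Male"],
    "family_history_with_overweight": ["no", "yes"],
    "FAVC": ["no", "yes"],
    "CAEC": ["Always", "Frequently", "Sometimes", "no"],
    "SMOKE": ["no", "yes"],
    "SCC": ["no", "yes"],
    "CALC": ["Always", "Frequently", "Sometimes", "no"],
    "MTRANS": ["Automobile", "Bike", "Motorbike", "Public_Transportation", "Walking"],
}

# Target classes (NObeyesdad), in code order 0..6.
OBESITY_LEVELS = [
    "Insufficient_Weight",
    "Normal_Weight",
    "Obesity_Type_I",
    "Obesity_Type_II",
    "Obesity_Type_III",
    "Overweight_Level_I",
    "Overweight_Level_II",
]


def encode_record(record):
    """Return a copy of `record` with categorical values replaced by codes."""
    encoded = dict(record)
    for feature, values in CATEGORY_VALUES.items():
        if feature in encoded:
            # list.index raises ValueError on an unknown category.
            encoded[feature] = values.index(encoded[feature])
    return encoded


def decode_label(code):
    """Map a predicted integer code back to its obesity level name."""
    return OBESITY_LEVELS[code]


# Hypothetical partial record, for illustration only.
sample = {"Gender": "Male", "CAEC": "Sometimes", "MTRANS": "Walking", "Age": 23}
encoded_sample = encode_record(sample)
print(encoded_sample)
print(decode_label(2))
```

The codes follow the sorted order of the category strings, which is what scikit-learn's `LabelEncoder` would also produce; whether the project uses `LabelEncoder` is an assumption.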
- Age (years)
- Height (meters)
- Weight (kilograms)
- FCVC (Frequency of vegetable consumption)
- NCP (Number of main meals)
- CH2O (Water consumption)
- FAF (Physical activity frequency)
- TUE (Time using technology devices)
Model performance metrics (not yet reported):
- Accuracy: [Later]
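
Until real results are filled in, the sketch below shows one way accuracy and per-class metrics could be computed with scikit-learn for the seven classes. The label arrays are made up purely for illustration and are not results from this project.

```python
from sklearn.metrics import accuracy_score, classification_report

CLASS_NAMES = [
    "Insufficient_Weight", "Normal_Weight", "Obesity_Type_I", "Obesity_Type_II",
    "Obesity_Type_III", "Overweight_Level_I", "Overweight_Level_II",
]

# Hypothetical encoded labels (0..6), invented for this example only.
y_true = [0, 1, 2, 3, 4, 5, 6, 1, 2, 5]
y_pred = [0, 1, 2, 3, 4, 6, 6, 1, 2, 5]

# Fraction of samples whose predicted class matches the true class.
accuracy = accuracy_score(y_true, y_pred)

# Precision, recall, and F1 per obesity category.
report = classification_report(
    y_true,
    y_pred,
    labels=list(range(7)),
    target_names=CLASS_NAMES,
    zero_division=0,
)

print(f"Accuracy: {accuracy:.2f}")
print(report)
```

With imbalanced classes, the per-class recall and macro-averaged F1 in the report are usually more informative than accuracy alone.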

Data Preprocessing:
- Handling missing values
- Feature encoding
- Feature scaling
- Class balancing using SMOTE

Feature Engineering:
- BMI calculation
- Feature interactions
- Polynomial features

Model Training:
- Cross-validation
- Hyperparameter tuning
- Model evaluation
- Train/test split: 85% training, 15% testing
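
The training steps above could be wired together as in the sketch below. This README does not name the model, so `RandomForestClassifier` and the small parameter grid are assumptions, and the data is synthetic (16 features and 7 classes, matching the shape of the feature tables).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split

# Synthetic stand-in for the encoded dataset: 16 features, 7 classes.
X, y = make_classification(
    n_samples=700,
    n_features=16,
    n_informative=8,
    n_classes=7,
    n_clusters_per_class=1,
    random_state=0,
)

# 85% training / 15% testing, stratified so each class keeps its share.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=0
)

# 5-fold stratified cross-validation inside a small hyperparameter search.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [25, 50], "max_depth": [None, 10]},
    cv=cv,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Evaluate the refit best model once on the held-out test split.
test_accuracy = search.score(X_test, y_test)

# Impurity-based importances of the best model, one value per feature.
importances = search.best_estimator_.feature_importances_

print("best params:", search.best_params_)
print(f"held-out accuracy (synthetic data): {test_accuracy:.3f}")
```

The test split is used only once, after the search has finished, so the reported score is not biased by hyperparameter selection.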

- Dataset source: see the data file in this repository