
Using demographic and behavioral data, this research constructs a machine learning model that can forecast how much each person's healthcare will cost. We use the Random Forest Regressor to train and test the model, and we use GridSearchCV to fine-tune its hyperparameters.


Mekusgood/healthcare-cost-prediction


Healthcare Cost Prediction with Machine Learning

This project builds a machine learning model to predict individual healthcare charges using demographic and behavioral data. The model is trained and evaluated using Random Forest Regressor with hyperparameter tuning via GridSearchCV.
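The training setup described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the schema of data.csv is not shown here, so the sketch generates a synthetic stand-in, and the column names (`age`, `bmi`, `smoker`, `charges`), the parameter grid, and the scoring choice are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for data.csv; column names and coefficients are assumptions.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "age": rng.integers(18, 65, size=n),
    "bmi": rng.normal(30.0, 6.0, size=n),
    "smoker": rng.integers(0, 2, size=n),
})
df["charges"] = (
    250.0 * df["age"]
    + 300.0 * df["bmi"]
    + 20000.0 * df["smoker"]
    + rng.normal(0.0, 2000.0, size=n)
)

X = df[["age", "bmi", "smoker"]]
y = df["charges"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hypothetical grid; the grid used in the notebook may differ.
param_grid = {"n_estimators": [50, 100], "max_depth": [4, None]}

search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    cv=3,
    scoring="neg_root_mean_squared_error",
)
search.fit(X_train, y_train)

best_model = search.best_estimator_
print("Best parameters:", search.best_params_)
# best_score_ is the negated RMSE, so flip the sign for reporting.
print("Cross-validated RMSE:", -search.best_score_)
```

To try this on the real dataset, replacing the synthetic frame with `pd.read_csv("data.csv")` and adjusting the column names should be enough, provided categorical columns are encoded numerically first.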

Python 3.8+ · NumPy · pandas · matplotlib · scikit-learn

🔧 Technologies

  • Python (pandas, scikit-learn, seaborn, matplotlib)
  • Jupyter Notebook
  • Streamlit (for optional dashboard)

πŸ“ Contents

  • Healthcare_Cost_Prediction.ipynb: Complete analysis, model training, and evaluation
  • data.csv: Sample dataset (or instructions to load it)
  • requirements.txt: Python dependencies

📊 Key Results

  • Best model RMSE: ~$4,800
  • R² Score: ~0.82
  • Major cost drivers: Smoking status, BMI, Age
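Metrics like the ones listed above are typically computed on a held-out test set, with cost drivers read from the forest's feature importances. The sketch below shows one way to do that; it runs on synthetic data with assumed column names (`age`, `bmi`, `smoker`, `charges`), so its numbers will not match the reported results.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data; column names and effect sizes are assumptions for illustration.
rng = np.random.default_rng(1)
n = 300
data = pd.DataFrame({
    "age": rng.integers(18, 65, size=n),
    "bmi": rng.normal(30.0, 6.0, size=n),
    "smoker": rng.integers(0, 2, size=n),
})
data["charges"] = (
    250.0 * data["age"]
    + 300.0 * data["bmi"]
    + 20000.0 * data["smoker"]
    + rng.normal(0.0, 2000.0, size=n)
)

features = ["age", "bmi", "smoker"]
X_train, X_test, y_train, y_test = train_test_split(
    data[features], data["charges"], test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# RMSE is in the same units as the target (dollars); R² is unitless.
rmse = float(np.sqrt(mean_squared_error(y_test, predictions)))
r2 = r2_score(y_test, predictions)

# Impurity-based importances sum to 1 across features.
importances = pd.Series(
    model.feature_importances_, index=features
).sort_values(ascending=False)

print(f"RMSE: {rmse:,.0f}")
print(f"R²: {r2:.2f}")
print(importances)
```

Impurity-based importances can overstate high-cardinality features, so a ranking such as smoking status, BMI, age is worth cross-checking with `sklearn.inspection.permutation_importance` on the test set.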

πŸš€ Run It

Clone the repository and open the notebook with Jupyter:

git clone https://github.com/Mekusgood/healthcare-cost-prediction.git
cd healthcare-cost-prediction
jupyter lab
