This work demonstrates how to apply linear regression using Python and scikit-learn, starting from a simple synthetic example and progressing to a real-world healthcare dataset.
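
The synthetic starting point can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data-generating line `y = 3x + 4`, the noise level, the random seed, and the helper name `plot_fit` are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical synthetic data: y = 3x + 4 plus Gaussian noise (assumed values).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))  # scikit-learn expects a 2-D feature array
y = 3.0 * X[:, 0] + 4.0 + rng.normal(0, 1.0, size=100)

# Fit the model and predict on the same points.
model = LinearRegression()
model.fit(X, y)
y_pred = model.predict(X)

# Evaluate with Mean Squared Error and R² score.
mse = mean_squared_error(y, y_pred)
r2 = r2_score(y, y_pred)
print(f"slope={model.coef_[0]:.2f}, intercept={model.intercept_:.2f}")
print(f"MSE={mse:.3f}, R2={r2:.3f}")


def plot_fit(X, y, model):
    """Scatter the data and overlay the fitted regression line (illustrative helper)."""
    import matplotlib.pyplot as plt

    order = np.argsort(X[:, 0])  # sort by x so the line is drawn left to right
    plt.scatter(X[:, 0], y, alpha=0.6, label="data")
    plt.plot(X[order, 0], model.predict(X[order]), color="red", label="fit")
    plt.xlabel("x")
    plt.ylabel("y")
    plt.legend()
    plt.show()
```

Calling `plot_fit(X, y, model)` draws the regression line over the data. The fitted slope and intercept should land close to the assumed values of 3 and 4.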
- Model training and prediction
- Evaluation using Mean Squared Error and R² Score
- Plotting the regression line
- Exploratory data analysis (EDA)
- Feature selection and preprocessing
- One-hot encoding of categorical variables
- Splitting into train/test sets
- Model training and evaluation
- Visualization of actual vs. predicted cost
- Python
- scikit-learn (LinearRegression, train_test_split, metrics, OneHotEncoder)
- pandas for data handling
- matplotlib and seaborn for visualization
- Extend to Polynomial Regression
- Try regularization (Ridge, Lasso)
- Explore healthcare datasets with more features or classification tasks
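
The preprocessing, train/test split, and evaluation steps listed above could look something like the sketch below. The real healthcare dataset is not reproduced here, so the code generates a small stand-in table. The column names (`age`, `bmi`, `smoker`, `region`, `cost`), the cost formula, and the 80/20 split are all assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for a healthcare cost table (all columns are assumed).
rng = np.random.default_rng(42)
n = 500
df = pd.DataFrame({
    "age": rng.integers(18, 65, size=n),
    "bmi": rng.normal(28, 5, size=n),
    "smoker": rng.choice(["yes", "no"], size=n, p=[0.2, 0.8]),
    "region": rng.choice(["north", "south", "east", "west"], size=n),
})
# Assumed cost relationship, used only to give the model something to learn.
df["cost"] = (
    250 * df["age"]
    + 300 * df["bmi"]
    + 20000 * (df["smoker"] == "yes")
    + rng.normal(0, 2000, size=n)
)

X = df[["age", "bmi", "smoker", "region"]]
y = df["cost"]

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# One-hot encode the categorical columns; numeric columns pass through unchanged.
preprocess = ColumnTransformer(
    transformers=[
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["smoker", "region"]),
    ],
    remainder="passthrough",
)

model = Pipeline([
    ("preprocess", preprocess),
    ("regressor", LinearRegression()),
])
model.fit(X_train, y_train)

# Evaluate on the held-out rows.
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MSE={mse:.1f}, R2={r2:.3f}")
```

Keeping the encoder inside a `Pipeline` means it is fitted on the training rows only, which avoids leaking information from the test set. Swapping `LinearRegression()` for scikit-learn's `Ridge` or `Lasso` in the same pipeline is one way to try the regularization idea, and a scatter plot of `y_test` against `y_pred` gives the actual vs. predicted view.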