This project predicts house sale prices with machine learning regression models, using features such as square footage, number of bedrooms and bathrooms, year built, and neighborhood. The models are intended to help real estate stakeholders make data-driven decisions.
- Records: 50,000 houses
- Features: Square footage, bedrooms, bathrooms, neighborhood, year built
- Target: House sale price
- Python (Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn)
- Jupyter Notebook for analysis and visualization

Data Preprocessing:
- Removed outliers using IQR method
- Encoded categorical features (Neighborhood) using Label Encoding
- Scaled numerical features using Min-Max scaling
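
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project's actual notebook code: the column names (`SquareFeet`, `Bedrooms`, `Bathrooms`, `Neighborhood`, `YearBuilt`, `Price`), the choice to filter outliers on `Price`, and the 1.5 multiplier for the IQR fences are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

# Synthetic stand-in data; column names are hypothetical, not the project's schema.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "SquareFeet": rng.integers(800, 3500, n),
    "Bedrooms": rng.integers(1, 6, n),
    "Bathrooms": rng.integers(1, 4, n),
    "Neighborhood": rng.choice(["Rural", "Suburb", "Urban"], n),
    "YearBuilt": rng.integers(1950, 2022, n),
})
df["Price"] = 100 * df["SquareFeet"] + rng.normal(0, 50_000, n)


def remove_iqr_outliers(frame, column, k=1.5):
    """Keep rows whose value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = frame[column].quantile([0.25, 0.75])
    iqr = q3 - q1
    mask = frame[column].between(q1 - k * iqr, q3 + k * iqr)
    return frame[mask]


# 1. Drop rows whose price falls outside the IQR fences (assumed target column).
clean = remove_iqr_outliers(df, "Price").copy()

# 2. Label-encode the categorical column into integer codes.
encoder = LabelEncoder()
clean["Neighborhood"] = encoder.fit_transform(clean["Neighborhood"])

# 3. Min-max scale the numerical features into the range [0, 1].
numeric_cols = ["SquareFeet", "Bedrooms", "Bathrooms", "YearBuilt"]
scaler = MinMaxScaler()
clean[numeric_cols] = scaler.fit_transform(clean[numeric_cols].astype(float))

print(clean.head())
```

Two caveats worth keeping in mind: label encoding assigns arbitrary integer codes, which linear models read as an ordering (one-hot encoding is a common alternative), and fitting the scaler only on the training split avoids leaking test-set statistics into training.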

Exploratory Data Analysis (EDA):
- Visualized price distributions and feature correlations
- Analyzed trends like square footage vs. price and year built vs. price
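
As a rough sketch of the EDA described above, the snippet below plots a price histogram and two scatter plots and computes a correlation matrix. It uses synthetic data with hypothetical column names (`SquareFeet`, `YearBuilt`, `Price`) and plain Matplotlib; the project's own plots may look different or use Seaborn.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data; column names are hypothetical.
rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "SquareFeet": rng.integers(800, 3500, n),
    "YearBuilt": rng.integers(1950, 2022, n),
})
df["Price"] = 100 * df["SquareFeet"] + rng.normal(0, 50_000, n)

# Pairwise Pearson correlations between the numeric columns.
corr = df.corr()
print(corr.round(2))

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Distribution of the target.
axes[0].hist(df["Price"], bins=30)
axes[0].set(title="Price distribution", xlabel="Price", ylabel="Count")

# Square footage vs. price.
axes[1].scatter(df["SquareFeet"], df["Price"], s=5, alpha=0.5)
axes[1].set(title="Square footage vs. price", xlabel="Square feet", ylabel="Price")

# Year built vs. price.
axes[2].scatter(df["YearBuilt"], df["Price"], s=5, alpha=0.5)
axes[2].set(title="Year built vs. price", xlabel="Year built", ylabel="Price")

fig.tight_layout()
# In a Jupyter notebook the figure renders inline; in a script, call plt.show().
```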

Model Training & Evaluation:
- Models Implemented:
  - ✔️ Linear Regression
  - ✔️ Ridge Regression
  - ✔️ Decision Tree Regressor
  - ✔️ Gradient Boosting Regressor
- Evaluation Metrics: R² Score, MAE, MSE, RMSE

| Model | R² Score | MAE | RMSE |
|---|---|---|---|
| Linear Regression | 0.57 | $39,866 | $49,681 |
| Ridge Regression | 0.57 | $39,866 | $49,680 |
| Decision Tree | 0.56 | $40,026 | $49,886 |
| Gradient Boosting | 0.57 | $39,890 | $49,730 |
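
A comparison like the one in the table can be produced with a loop along these lines. This is a sketch under stated assumptions: the data is synthetic, the column names are hypothetical, and the 80/20 split, `alpha=1.0`, `max_depth=5`, and random seeds are illustrative choices rather than the project's settings, so the numbers it prints will not match the table.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data; column names are hypothetical.
rng = np.random.default_rng(42)
n = 2000
X = pd.DataFrame({
    "SquareFeet": rng.integers(800, 3500, n),
    "Bedrooms": rng.integers(1, 6, n),
    "Bathrooms": rng.integers(1, 4, n),
    "YearBuilt": rng.integers(1950, 2022, n),
    "Neighborhood": rng.integers(0, 3, n),  # assumed to be label-encoded already
})
y = 100 * X["SquareFeet"] + rng.normal(0, 50_000, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hyperparameters here are illustrative assumptions.
models = {
    "Linear Regression": LinearRegression(),
    "Ridge Regression": Ridge(alpha=1.0),
    "Decision Tree": DecisionTreeRegressor(max_depth=5, random_state=42),
    "Gradient Boosting": GradientBoostingRegressor(random_state=42),
}

rows = []
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    mse = mean_squared_error(y_test, pred)
    rows.append({
        "Model": name,
        "R2": r2_score(y_test, pred),
        "MAE": mean_absolute_error(y_test, pred),
        "MSE": mse,
        "RMSE": np.sqrt(mse),  # RMSE is the square root of MSE
    })

results = pd.DataFrame(rows).set_index("Model")
print(results.round(2))
```

RMSE is computed as the square root of MSE rather than through a dedicated metric function, which keeps the snippet independent of the installed scikit-learn version.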
- Square footage is the most significant predictor of house prices
- Neighborhood has a strong impact on pricing trends
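- Gradient Boosting is the stronger of the two non-linear models (RMSE $49,730 vs. $49,886 for Decision Tree), although all four models perform almost identically (R² of 0.56 to 0.57)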
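
One possible way to check which features matter most is permutation importance on a fitted model. The sketch below is an assumption about how such a check could be done, not a description of how the insights above were derived. It uses synthetic data with hypothetical column names, and the target is deliberately constructed so that square footage dominates, so the ranking it prints only demonstrates the technique.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; column names are hypothetical.
rng = np.random.default_rng(7)
n = 1500
X = pd.DataFrame({
    "SquareFeet": rng.integers(800, 3500, n),
    "Bedrooms": rng.integers(1, 6, n),
    "Bathrooms": rng.integers(1, 4, n),
    "YearBuilt": rng.integers(1950, 2022, n),
    "Neighborhood": rng.integers(0, 3, n),  # assumed to be label-encoded already
})
# Target built so that square footage dominates; for illustration only.
y = 100 * X["SquareFeet"] + 5_000 * X["Neighborhood"] + rng.normal(0, 50_000, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=7
)
model = GradientBoostingRegressor(random_state=7).fit(X_train, y_train)

# Shuffle each feature in turn and measure how much the test score drops.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=7
)
importance = pd.Series(result.importances_mean, index=X.columns)
importance = importance.sort_values(ascending=False)
print(importance)
```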
- LinkedIn: Adithya Vardhan Reddy
- GitHub: Project Repository
© 2025 Adithya Vardhan Reddy