A lightweight implementation of linear regression written in Python for the 42 project.
The goal is to predict the price of a car from its mileage by training a simple linear model from scratch, without using machine learning libraries.
The project demonstrates a full ML workflow: data loading, model training, evaluation, and visualization.
The model uses the following equation:
price = θ₀ + θ₁ × mileage
Where:
- θ₀ is the intercept (base price)
- θ₁ is the slope (impact of mileage on price)
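As a minimal sketch, the equation above maps to a one-line function. The name `estimate_price` and the parameter values in the usage line are hypothetical, chosen only for illustration; they are not taken from the project's code.

```python
def estimate_price(mileage, theta0, theta1):
    """Return the price predicted by the linear model theta0 + theta1 * mileage."""
    return theta0 + theta1 * mileage


# Hypothetical parameters: a base price of 8000 and a loss of 0.02 per km.
print(estimate_price(100000, 8000.0, -0.02))
```

With both parameters at zero (for example, before any training has run), the function returns 0 for every mileage.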
The training algorithm uses gradient descent to minimize the mean squared error between predicted and actual prices.
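The training loop can be sketched roughly as follows. This is an illustration under assumptions, not the contents of train.py: the function name `train`, the learning rate, the iteration count, and the toy data are all made up, and the mileage values are assumed to be scaled to a small range already (raw mileages in the tens of thousands would typically need normalization for this learning rate to converge).

```python
def train(mileages, prices, learning_rate=0.1, iterations=2000):
    """Fit theta0 and theta1 by batch gradient descent on the mean squared error."""
    theta0, theta1 = 0.0, 0.0
    m = len(mileages)
    for _ in range(iterations):
        # Prediction error for every sample with the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(mileages, prices)]
        # Partial derivatives of the mean squared error.
        grad0 = (2.0 / m) * sum(errors)
        grad1 = (2.0 / m) * sum(e * x for e, x in zip(errors, mileages))
        # Update both parameters simultaneously.
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1


# Made-up toy data following price = 10 - 2 * mileage, mileage scaled to [0, 1].
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [10.0 - 2.0 * x for x in xs]
theta0, theta1 = train(xs, ys)
print(theta0, theta1)
```

Both gradients are computed from the same `errors` list before either parameter changes, which keeps the update simultaneous rather than sequential.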
After training, the computed θ₀ and θ₁ values are saved into a theta.json file.
predict.py then loads these values to estimate the price for any given mileage.
- train.py trains the model and saves thetas in theta.json
- predict.py loads them to make predictions
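The hand-off between the two scripts might look like the sketch below. The JSON key names (`theta0`, `theta1`), the helper names, and the fallback to zeros when the file is missing or malformed are assumptions made for illustration; the actual layout of theta.json may differ.

```python
import json


def save_thetas(theta0, theta1, path="theta.json"):
    """Write the trained parameters to a JSON file."""
    with open(path, "w") as f:
        json.dump({"theta0": theta0, "theta1": theta1}, f)


def load_thetas(path="theta.json"):
    """Read the parameters back; fall back to zeros if the file is unusable."""
    try:
        with open(path) as f:
            data = json.load(f)
        return float(data["theta0"]), float(data["theta1"])
    except (FileNotFoundError, KeyError, TypeError, ValueError):
        # Assumed behavior: an untrained model predicts 0 for every mileage.
        return 0.0, 0.0
```

Falling back to zeros lets a prediction script run before any training has happened, at the cost of silently returning 0 for every input.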
| File | Description |
|---|---|
| train.py | Train model and save parameters |
| predict.py | Predict price from mileage |
| plot.py | (bonus) Plot regression line and dataset |
| supervisor.py | (bonus) Show R² and mean error (in €) |
| reset_theta.py | (bonus) Reset theta.json |
| dataset_stats.py | (bonus) Show dataset statistics |
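The two metrics listed for supervisor.py could be computed along these lines. This is a sketch, not the script itself: the function names are hypothetical, and "mean error" is interpreted here as the mean absolute error, which is an assumption.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted prices, in the price unit."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

`r_squared` divides by the variance of the actual prices, so it raises `ZeroDivisionError` if every price in the dataset is identical.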
python3 train.py # Train the model
python3 predict.py # Predict price
python3 plot.py # (bonus) Visualize data
python3 supervisor.py # (bonus) Evaluate model

- Python 3.8+
- numpy, matplotlib
Install dependencies:
pip install numpy matplotlib