Real‑Time Bidding (RTB) — Click‑Through Rate (CTR) Prediction API
Predict ad click probability in real‑time bidding scenarios.
This project is a FastAPI‑based REST API designed to predict the Click‑Through Rate (CTR) for ad impressions in Real‑Time Bidding (RTB) systems.
Key features:
- /predict endpoint: returns the predicted CTR for a given ad impression.
- /train endpoint: trains a new LightGBM model on the provided dataset.
- /health endpoint: health check.
- Fully containerized with Docker.
Technologies: Python, FastAPI, LightGBM, Pandas, Docker, Docker‑Compose.
RTB-style_pet/
├── app/ # Main application code
│ ├── __init__.py
│ ├── main.py # FastAPI app & endpoints
│ ├── schemas.py # Pydantic request/response models
│ ├── utils.py # Helper functions
│ └── models/ # Directory for saved models
├── docker/ # Docker configurations
│ ├── Dockerfile
│ ├── Dockerfile.airflow
│ └── docker-compose.yml
├── data/ # Place your datasets here (ignored by .gitignore)
├── requirements.txt # Python dependencies
├── README.md
└── .gitignore
- Python 3.9+
- Docker (recommended for easiest setup)
- Clone the repository:
git clone https://github.com/your-username/RTB-style_pet.git
cd RTB-style_pet
- Install dependencies:
pip install -r requirements.txt
- Start the API server:
uvicorn app.main:app --reload --port 8000
- Open Swagger UI in your browser:
http://localhost:8000/docs
- Build and launch containers:
docker-compose up --build
- The API will be available at:
http://localhost:8000
Note:
- The train.csv file must be placed inside the data/ folder before training a model.
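After starting the server or the containers, the API may need a moment before it accepts requests. The sketch below polls the /health endpoint until it returns {"status": "healthy"}. The function names, the retry count, and the delay are illustrative assumptions and are not part of this project; only the endpoint path and the response shape come from this README.

```python
import json
import time
import urllib.request
from typing import Callable


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode its JSON body (standard library only)."""
    with urllib.request.urlopen(url, timeout=2) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_healthy(
    base_url: str = "http://localhost:8000",
    fetch: Callable[[str], dict] = http_get_json,  # injectable for testing
    attempts: int = 30,      # hypothetical default, tune as needed
    delay_s: float = 1.0,    # hypothetical default, tune as needed
) -> bool:
    """Poll GET /health until it reports {"status": "healthy"}.

    Returns True once the API is healthy, False if every attempt fails.
    """
    for _ in range(attempts):
        try:
            if fetch(f"{base_url}/health").get("status") == "healthy":
                return True
        except (OSError, ValueError):
            # Connection refused, timeout, or non-JSON body: not ready yet.
            pass
        time.sleep(delay_s)
    return False


if __name__ == "__main__":
    print("API ready:", wait_until_healthy())
```

Passing the fetch function as a parameter keeps the polling logic testable without a running server.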
POST http://localhost:8000/predict
Request Body (JSON):
{
"hour": 14102100,
"banner_pos": 0,
"site_id": "1fbe01fe",
"site_domain": "f3845767",
"site_category": "28905ebd",
"app_id": "ecad2386",
"app_domain": "7801e8d9",
"app_category": "07d7df22",
"device_id": "a99f214a",
"device_type": 1,
"device_conn_type": 2,
"C1": 1005
}
Response:
{
"predicted_ctr": 0.00123,
"is_weekend": false
}
GET http://localhost:8000/health
Response:
{"status": "healthy"}
GET http://localhost:8000/train
Triggers model training using data/train.csv.
After successful training, the model is saved to app/models/.
Response example:
{
"status": "success",
"auc": 0.92345,
"logloss": 0.12345,
"model_path": "/app/models/avazu_improved_auc_0.92345.txt"
}
Requirement:
train.csv must exist in data/.
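The /predict endpoint documented above can be called from any HTTP client. As a minimal sketch using only the Python standard library, the snippet below builds the POST request from the example impression and reads predicted_ctr from the response. The helper names (build_predict_request, predict_ctr) and the timeout are illustrative assumptions; the URL, field names, and response key are taken from this README.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default address used throughout this README

# Example impression copied from the /predict request body above.
EXAMPLE_IMPRESSION = {
    "hour": 14102100,
    "banner_pos": 0,
    "site_id": "1fbe01fe",
    "site_domain": "f3845767",
    "site_category": "28905ebd",
    "app_id": "ecad2386",
    "app_domain": "7801e8d9",
    "app_category": "07d7df22",
    "device_id": "a99f214a",
    "device_type": 1,
    "device_conn_type": 2,
    "C1": 1005,
}


def build_predict_request(impression: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for /predict."""
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=json.dumps(impression).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_ctr(impression: dict, base_url: str = BASE_URL) -> float:
    """Send the impression to /predict and return the predicted CTR."""
    request = build_predict_request(impression, base_url)
    with urllib.request.urlopen(request, timeout=5) as resp:
        return json.loads(resp.read())["predicted_ctr"]


if __name__ == "__main__":
    # Requires the API to be running and a trained model to be available.
    print("Predicted CTR:", predict_ctr(EXAMPLE_IMPRESSION))
```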
- Dockerfile: builds the FastAPI application image.
- docker-compose.yml: orchestrates the API service (and optionally Airflow).
To stop containers:
docker-compose down
When /train is called, the following steps are executed:
- Load train.csv (≈1.2 M rows).
- Feature engineering:
  - Extract day, is_weekend, and cyclic time features (hour_sin, hour_cos).
  - Frequency and target encoding for categorical fields.
  - Interaction features (hour_site, hour_app, site_app).
- Train/validation split (85%/15%).
- Train a LightGBM model with early stopping.
- Evaluate using ROC AUC and log loss.
- Save the model to app/models/.
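The feature engineering step above can be illustrated with a small standard-library sketch. It is not the project's actual code, which presumably works on Pandas DataFrames. It assumes that hour is an integer in YYMMDDHH form (so 14102100 is 2014-10-21, 00:00), that day means the day of the week, and that the interaction features join the hour of day, site_id, and app_id with an underscore. Target encoding is left out because it needs the click labels and careful handling to avoid leakage.

```python
import math
from collections import Counter
from datetime import datetime


def time_features(hour: int) -> dict:
    """Derive day, is_weekend and cyclic hour features from a YYMMDDHH integer."""
    yy, mm = hour // 1_000_000, hour // 10_000 % 100
    dd, hh = hour // 100 % 100, hour % 100
    ts = datetime(2000 + yy, mm, dd, hh)  # assumption: two-digit years are 20xx
    angle = 2 * math.pi * hh / 24         # map 0..23 onto the unit circle
    return {
        "day": ts.weekday(),              # 0 = Monday ... 6 = Sunday (assumed meaning)
        "is_weekend": ts.weekday() >= 5,
        "hour_sin": math.sin(angle),
        "hour_cos": math.cos(angle),
    }


def frequency_encode(values: list) -> list:
    """Replace each category with its relative frequency in the column."""
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


def interaction_features(row: dict) -> dict:
    """Combine pairs of fields into single categorical keys (assumed format)."""
    hour_of_day = row["hour"] % 100
    return {
        "hour_site": f"{hour_of_day}_{row['site_id']}",
        "hour_app": f"{hour_of_day}_{row['app_id']}",
        "site_app": f"{row['site_id']}_{row['app_id']}",
    }


if __name__ == "__main__":
    row = {"hour": 14102100, "site_id": "1fbe01fe", "app_id": "ecad2386"}
    print(time_features(row["hour"]))
    print(interaction_features(row))
    print(frequency_encode(["a", "b", "a", "a"]))
```

The sine and cosine pair keeps 23:00 and 00:00 close together in feature space, which a raw hour value from 0 to 23 would not.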
- Python 3.9
- FastAPI – web framework
- LightGBM – gradient boosting machine
- Pandas / NumPy – data manipulation
- Docker & Docker‑Compose – containerization
Tip:
Check the interactive API documentation at http://localhost:8000/docs after launching the server!