ScreenStats is a web application that allows users to compare the careers of movie actors side-by-side. It visualizes data such as box office performance, genre distribution, and critical reception, providing insights into an actor's filmography.
- Actor Comparison: Compare two actors to see who has higher box office gross, more movies, or better ratings.
- Data Visualization: Interactive charts powered by Plotly to visualize career trajectories.
- Comprehensive Data: Sourced from The Movie Database (TMDB), covering movies, genres, budgets, and revenue.
- Responsive Design: Built with Bootstrap 5 for a seamless experience on desktop and mobile.
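As a rough illustration of the Actor Comparison feature, the sketch below aggregates two filmographies into the three metrics named above and reports who leads on each. The `Movie` record, its field names, and the `summarize` and `compare` helpers are hypothetical and are not taken from `screen_stats/models.py` or `views.py`.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Movie:
    """Hypothetical record; the real models live in screen_stats/models.py."""
    title: str
    revenue: int   # box office gross in USD
    rating: float  # average rating on a 0-10 scale


def summarize(movies):
    """Aggregate one actor's filmography into the three compared metrics."""
    return {
        "total_gross": sum(m.revenue for m in movies),
        "movie_count": len(movies),
        # Guard against an empty filmography, where mean() would raise.
        "avg_rating": round(mean(m.rating for m in movies), 2) if movies else 0.0,
    }


def compare(name_a, movies_a, name_b, movies_b):
    """Return, per metric, the name of the leading actor (None on a tie)."""
    a, b = summarize(movies_a), summarize(movies_b)
    winners = {}
    for metric in a:
        if a[metric] == b[metric]:
            winners[metric] = None
        else:
            winners[metric] = name_a if a[metric] > b[metric] else name_b
    return winners
```

Calling `compare("Actor A", films_a, "Actor B", films_b)` returns a dictionary keyed by metric, which a view could pass to a template or chart.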
- Backend: Django 5 (Python)
- Database: PostgreSQL
- Frontend: Django Templates, Bootstrap 5, Plotly.js
- Data Collection: Custom Python pipeline with psycopg2 and requests
- Infrastructure: Google Cloud Platform (Cloud Run, Cloud SQL, Cloud Build, Secret Manager)
ScreenStats/
├── screen_stats/ # Main Django application
│ ├── models.py # Database models (Actor, Movie, etc.)
│ ├── views.py # View logic for comparisons
│ └── settings.py # Project settings
├── data_collection/ # Data pipeline scripts
│ ├── main.py # Main scraper script (TMDB -> Postgres)
│ └── testing/ # Data integrity tests
├── templates/ # HTML templates
├── static/ # Static assets (CSS, JS, Images)
├── build_and_deploy_app.sh # Cloud Run deployment script
└── deploy_local.sh # Script for running locally with Cloud SQL
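The tree describes `data_collection/main.py` as moving data from TMDB into Postgres. The sketch below shows one way such a step could look. It is not the project's actual script: it uses the standard library's `urllib` instead of `requests` so that it runs without extra packages, and the table name, column names, and helper functions are assumptions. The endpoint is TMDB's person movie credits route, authenticated with a Bearer token.

```python
import urllib.request

TMDB_API_BASE = "https://api.themoviedb.org/3"


def build_credits_request(person_id, access_token):
    """Prepare (but do not send) a request for one actor's movie credits."""
    return urllib.request.Request(
        f"{TMDB_API_BASE}/person/{person_id}/movie_credits",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
    )


def credits_to_rows(person_id, payload):
    """Flatten a movie_credits response into tuples ready for insertion."""
    rows = []
    for credit in payload.get("cast", []):
        rows.append((
            person_id,
            credit["id"],
            credit.get("title"),
            # TMDB may return an empty string for unknown dates; store NULL.
            credit.get("release_date") or None,
        ))
    return rows


# Hypothetical table and columns. With psycopg2, the rows above could be
# passed to cursor.executemany(INSERT_SQL, rows).
INSERT_SQL = (
    "INSERT INTO actor_movie (actor_id, movie_id, title, release_date) "
    "VALUES (%s, %s, %s, %s) ON CONFLICT DO NOTHING"
)
```

Keeping the transformation in a pure function such as `credits_to_rows` makes it possible to test without network or database access.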
- Python 3.10+
- PostgreSQL installed and running locally
- A TMDB API key (available from your TMDB account settings)
git clone https://github.com/SYusupov/ScreenStats.git
cd ScreenStats
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
Create a .env file in the root directory or export these variables:
export DB_NAME="screenstats_db"
export DB_USER="your_db_user"
export DB_PASS="your_db_password"
export DB_HOST="127.0.0.1"
export DB_PORT="5432"
export TMDB_ACCESS_TOKEN="your_tmdb_token"
export SECRET_KEY="your_django_secret_key"
export DEBUG="True"
# Create the database (if not exists)
createdb screenstats_db
# Run migrations
python manage.py migrate
# Populate initial data (Genres)
python manage.py populate_genres
To populate your local database with movie/actor data:
cd data_collection
# Ensure DB env vars are set for this script too
python main.py
Return to the project root and start the development server:
python manage.py runserver
Visit http://127.0.0.1:8000 in your browser.
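The environment variables exported during setup are presumably read in `screen_stats/settings.py`. The sketch below shows one way a settings module might turn them into Django's `DATABASES` mapping and a boolean `DEBUG` flag. The helper names `env_flag` and `database_settings` are hypothetical, and the defaults for host and port are assumptions that mirror the values shown above.

```python
import os


def env_flag(value):
    """Interpret common truthy strings such as "True", "1", or "yes"."""
    return str(value).strip().lower() in {"1", "true", "yes", "on"}


def database_settings(env=os.environ):
    """Build a Django DATABASES mapping from the DB_* variables."""
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": env["DB_NAME"],      # required: raises KeyError if unset
            "USER": env["DB_USER"],
            "PASSWORD": env["DB_PASS"],
            "HOST": env.get("DB_HOST", "127.0.0.1"),
            "PORT": env.get("DB_PORT", "5432"),
        }
    }
```

In a settings file this could be used as `DATABASES = database_settings()` and `DEBUG = env_flag(os.environ.get("DEBUG", "False"))`. Parsing the flag explicitly matters because any non-empty string, including "False", is truthy in Python.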
The project includes a comprehensive test suite for the data collection pipeline.
cd data_collection/testing
./run_tests.sh
See data_collection/testing/TEST_README.md for more details on testing.
The application is designed to be deployed on Google Cloud Run.
The build_and_deploy_app.sh script handles:
- Installing dependencies
- Submitting the build to Cloud Build (using Buildpacks)
- Running migrations (via Cloud Run Jobs)
- Deploying the service to Cloud Run
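The four steps above map roughly onto the commands assembled below. This is a sketch of what such a script might run, not the contents of `build_and_deploy_app.sh`: the project ID, service name, region, migration job name, and image path are all placeholders, and the function only builds the command lists without executing them.

```python
import shlex


def deployment_commands(project, service, region, migrate_job):
    """Return the deployment steps as argument lists (nothing is executed)."""
    image = f"gcr.io/{project}/{service}"  # assumed image location
    return [
        # 1. Install dependencies
        ["pip", "install", "-r", "requirements.txt"],
        # 2. Build the container image with Buildpacks via Cloud Build
        ["gcloud", "builds", "submit", "--pack", f"image={image}"],
        # 3. Run migrations as a Cloud Run Job and wait for completion
        ["gcloud", "run", "jobs", "execute", migrate_job,
         "--region", region, "--wait"],
        # 4. Deploy the new image to the Cloud Run service
        ["gcloud", "run", "deploy", service,
         "--image", image, "--region", region],
    ]


def render(commands):
    """Format the argument lists as shell-safe command lines for review."""
    return "\n".join(shlex.join(cmd) for cmd in commands)
```

Printing `render(deployment_commands(...))` gives a dry-run view of the sequence. To execute it, each list could be passed to `subprocess.run(cmd, check=True)` so that a failed step stops the deployment.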
./build_and_deploy_app.sh
To run locally but connect to the production Cloud SQL database:
./deploy_local.sh
Note: This requires GCP credentials and the Cloud SQL Auth Proxy.