Netflix RecSys is a modular, local-first Netflix-style movie recommendation system. It features:
- FastAPI backend
- Sentence-Transformers for embeddings
- ChromaDB for vector storage
- Reranking (cosine similarity, TensorFlow cross-encoder)
- LangChain for LLM explanations
- Simple HTML/JS frontend
- `README.md`: This file. Project overview, file walkthrough, and credits.
- `RUN_INSTRUCTIONS.md`: Step-by-step guide to running, training, and extending the system locally.
- `docker-compose.yml`: Docker Compose setup for ChromaDB, API, and frontend services.
- `data/`
  - `processed/sample_movies.jsonl`: Sample movie dataset (JSONL format) for ingestion and testing.
- `requirements.txt`: All Python dependencies for the API service.
- `app/`
  - `main.py`: FastAPI app entrypoint. Loads API routes and health check.
  - `api.py`: Registers and includes all API routers.
  - `routes/recommend.py`: `/api/recommend/query` endpoint. Accepts a query, retrieves candidates, reranks them, and returns recommendations.
- `embedder.py`: Embedding pipeline using sentence-transformers. Also provides a CLI for ingesting data into ChromaDB.
- `requirements.txt`: Dependencies for embedding and ingestion.
- `chroma_client.py`: ChromaDB client for upserting and searching movie vectors.
- `requirements.txt`: ChromaDB dependency.
- `model.py`: Lightweight in-memory reranker using cosine similarity.
- `tf_crossencoder.py`: TensorFlow-based cross-encoder reranker (trainable, optional).
- `requirements.txt`: NumPy dependency for reranking.
- `selenium_scraper.py`: Selenium-based IMDb plot fetcher (stub/example).
- `requirements.txt`: Selenium and requests dependencies.
- `rag.py`: LangChain-based RAG pipeline for LLM-powered recommendation explanations.
- `index.html`: Simple HTML/JS frontend for demoing the recommendation API.
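
Because the sample dataset is stored as JSONL (one JSON object per line), reading it for ingestion can be sketched roughly as below. This is a minimal sketch, not the project's actual loader: the helper name `load_movies` and the `title` and `plot` fields are assumptions made for illustration, and the real schema of `sample_movies.jsonl` may differ.

```python
import io
import json
from typing import Iterable, Iterator


def load_movies(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSONL line (hypothetical helper)."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"invalid JSON on line {line_number}: {exc}") from exc


# Illustrative records only; the field names are assumed, not taken from the dataset.
sample = io.StringIO(
    '{"title": "Movie A", "plot": "A heist goes wrong."}\n'
    "\n"
    '{"title": "Movie B", "plot": "Two friends cross a desert."}\n'
)
movies = list(load_movies(sample))
print(len(movies), movies[0]["title"])  # prints: 2 Movie A
```

With the real file, the same helper would take an open file handle, for example `open("data/processed/sample_movies.jsonl", encoding="utf-8")`, since a file object iterates line by line.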
- Ingest Data: Use `services/embeddings/embedder.py` to encode movie plots and store them in ChromaDB.
- API Service: `services/api/app/main.py` launches a FastAPI server exposing `/api/recommend/query`.
- Retrieval & Reranking: `services/retriever/chroma_client.py` retrieves candidates from ChromaDB. `services/reranker/model.py` reranks candidates by similarity to the query.
- Frontend: `services/frontend/index.html` provides a simple UI to test recommendations.
- Optional Extensions:
  - `services/reranker/tf_crossencoder.py`: Trainable reranker.
  - `services/langchain/rag.py`: LLM-based explanations for recommendations.
  - `services/scraper/selenium_scraper.py`: IMDb plot scraping utility.
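
The reranking step above (scoring retrieved candidates by cosine similarity to the query) can be illustrated with a small standard-library sketch. It is not the code in `services/reranker/model.py`: the function names `cosine_similarity` and `rerank`, the `(id, vector)` candidate shape, and the toy 3-dimensional vectors are all assumptions standing in for real sentence embeddings returned by the retriever.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(
    query_vec: Sequence[float],
    candidates: List[Tuple[str, Sequence[float]]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score each (id, vector) candidate against the query and keep the top_k best."""
    scored = [(movie_id, cosine_similarity(query_vec, vec)) for movie_id, vec in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # highest similarity first
    return scored[:top_k]


# Toy vectors only; real embeddings would have hundreds of dimensions.
query = [1.0, 0.0, 0.0]
candidates = [
    ("movie_a", [0.0, 1.0, 0.0]),  # orthogonal to the query, similarity 0
    ("movie_b", [1.0, 0.0, 0.0]),  # same direction as the query, similarity 1
    ("movie_c", [1.0, 1.0, 0.0]),  # 45 degrees from the query, similarity ~0.707
]
ranked = rerank(query, candidates, top_k=2)
print([movie_id for movie_id, _ in ranked])  # prints: ['movie_b', 'movie_c']
```

Cosine similarity ignores vector length and compares direction only, which is one reason it suits a lightweight in-memory reranker: no training is needed, and the cost is linear in the number of candidates.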
This project is open for suggestions and improvements!
Feel free to open issues or pull requests.
Made by Rashmin Gajera.
