This project is a real-time multimodal recommendation system built on top of Reddit data. It processes image-caption pairs using CLIP to create joint embeddings, stores them in Qdrant, and supports semantic retrieval based on text or image input.
- 🔄 Ingestion: Reddit image posts and captions pulled from multiple subreddits (ingestion sketch below)
- 🧠 Embedding: Featurization using open-clip-torch (embedding sketch below)
- 🗃️ Vector Storage: Stored and queried via Qdrant
- 🪄 Orchestration: Apache Airflow, deployed via Helm on Kubernetes (DAG sketch below)
- 🔍 Retrieval: ANN-based search against Qdrant with image or text queries (search sketch below)
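The ingestion step could look roughly like the sketch below. It uses PRAW, which is not confirmed as this project's Reddit client; the credentials, subreddit names, and the choice to treat post titles as captions are all placeholders.

```python
import praw

# Credentials are placeholders; supply your own Reddit app credentials
# (created at https://www.reddit.com/prefs/apps).
reddit = praw.Reddit(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET",
    user_agent="multimodal-recsys/0.1",
)

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")

def fetch_image_posts(subreddits: list[str], limit: int = 100):
    """Yield (image_url, caption) pairs from direct-image posts."""
    for submission in reddit.subreddit("+".join(subreddits)).new(limit=limit):
        if submission.url.lower().endswith(IMAGE_EXTENSIONS):
            # The post title serves as the caption.
            yield submission.url, submission.title

for url, caption in fetch_image_posts(["EarthPorn", "aww"], limit=10):
    print(url, caption)
```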
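A minimal sketch of the embedding step with open-clip-torch. The ViT-B-32 / LAION-2B checkpoint and the averaged image-text fusion are assumptions for illustration, not the project's confirmed configuration.

```python
import torch
import open_clip
from PIL import Image

# Model and weights are assumptions; any CLIP variant supported by
# open-clip-torch would work. ViT-B-32 produces 512-dim embeddings.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

def embed_pair(image_path: str, caption: str) -> torch.Tensor:
    """Embed an image-caption pair into the joint CLIP space.

    Averaging the two L2-normalized vectors is one simple fusion
    choice; the project's actual strategy may differ.
    """
    image = preprocess(Image.open(image_path)).unsqueeze(0)
    tokens = tokenizer([caption])
    with torch.no_grad():
        img_vec = model.encode_image(image)
        txt_vec = model.encode_text(tokens)
    img_vec = img_vec / img_vec.norm(dim=-1, keepdim=True)
    txt_vec = txt_vec / txt_vec.norm(dim=-1, keepdim=True)
    return ((img_vec + txt_vec) / 2).squeeze(0)
```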
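Continuing from the embedding sketch (it reuses `model`, `tokenizer`, and `embed_pair`), here is a hedged sketch of the storage and retrieval steps with qdrant-client; the host, collection name, and payload fields are illustrative. Cosine distance pairs naturally with the L2-normalized CLIP vectors.

```python
import torch
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

client = QdrantClient(host="localhost", port=6333)  # host/port assumed

# ViT-B-32 yields 512-dim vectors; adjust the size for other variants.
client.create_collection(
    collection_name="reddit_posts",  # collection name is hypothetical
    vectors_config=VectorParams(size=512, distance=Distance.COSINE),
)

# Store one embedded post alongside its metadata.
client.upsert(
    collection_name="reddit_posts",
    points=[
        PointStruct(
            id=1,
            vector=embed_pair("cat.jpg", "a cat on a windowsill").tolist(),
            payload={"subreddit": "aww", "caption": "a cat on a windowsill"},
        )
    ],
)

# Text query: embed with the same CLIP text tower, then ANN-search.
tokens = tokenizer(["cozy cat pictures"])
with torch.no_grad():
    q = model.encode_text(tokens)
q = (q / q.norm(dim=-1, keepdim=True)).squeeze(0).tolist()

for hit in client.search(collection_name="reddit_posts", query_vector=q, limit=5):
    print(hit.score, hit.payload)
```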
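Finally, a bare-bones sketch of how an Airflow DAG might chain the stages. The `dag_id`, schedule, and two-task split are assumptions, since the project's actual DAGs have not been published yet (see the note below).

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest():
    ...  # pull new image posts (see the ingestion sketch)

def embed_and_upsert():
    ...  # featurize with open-clip-torch and upsert into Qdrant

with DAG(
    dag_id="reddit_multimodal_pipeline",  # name is hypothetical
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",  # placeholder cadence
    catchup=False,
) as dag:
    ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
    embed_task = PythonOperator(
        task_id="embed_and_upsert", python_callable=embed_and_upsert
    )
    ingest_task >> embed_task
```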
⚠️ Work In Progress — Setup, DAGs, and usage instructions will be added soon.