Find research labs that match your interests using AI-powered search
A web application that helps students discover research opportunities by matching their interests with lab descriptions using semantic search technology.
College Lab Match uses AI to understand what you're interested in and finds research labs that align with your goals. Instead of manually browsing hundreds of lab websites, just describe your research interests and get personalized matches with similarity scores.
Example: Search for "machine learning computer vision" and get labs like Stanford AI Lab (85% match), MIT CSAIL (78% match), etc.
Local Development:
git clone <repository-url>
cd collegelabmatch.com
pip install -r requirements.txt
python run_server.py
Using Docker:
docker compose up
Create a .env file:
PINECONE_API_KEY=your_api_key_here
HUGGINGFACE_API_KEY=your_hf_token_here
Pinecone API Key:
- Sign up at pinecone.io
- Create a new project
- Create an index named collegelabmatch with dimension 384 and cosine similarity
- Copy your API key
Hugging Face API Token:
- Sign up at huggingface.co
- Go to Settings > Access Tokens
- Create a new token with "Read" permissions
- Copy your token
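Once both keys are in place, a quick sanity check can catch a missing or placeholder value before the server starts. The following is a minimal sketch, not part of the project: the helper names `parse_env_text` and `missing_keys` are hypothetical, and it assumes the .env file uses plain KEY=value lines as shown above.

```python
REQUIRED_KEYS = ("PINECONE_API_KEY", "HUGGINGFACE_API_KEY")

# Placeholder values copied from the .env template above.
PLACEHOLDERS = {"", "your_api_key_here", "your_hf_token_here"}


def parse_env_text(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return required keys that are absent or still hold a placeholder."""
    return [key for key in required if values.get(key, "") in PLACEHOLDERS]
```

To use it, read the file contents (for example with `open(".env").read()`), pass them to `parse_env_text`, and treat a non-empty result from `missing_keys` as a configuration error.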
python sample_labs.py # Populates database with sample labs
- Enter your interests: "deep learning", "cancer research", "robotics", etc.
- Choose number of results: 5, 10, or 20 labs
- Get matched labs with similarity scores and contact info
- Browse results and reach out to labs that interest you
Tech Stack: Python FastAPI + Vanilla JavaScript + Pinecone Vector DB + HuggingFace Inference API
Key Files:
- backend/main.py - FastAPI server
- frontend/index.html - Web interface
- backend/services/ - AI and database logic
API Endpoints:
- POST /api/search-labs - Search for matching labs
- GET /api/health - Health check
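For scripted access, a search request can be built with the Python standard library. This is a rough sketch only: the request body is not documented here, so the field names `query` and `top_k` are assumptions, as is the helper name `build_search_request` and the local address used in the usage note. Check backend/main.py for the actual schema.

```python
import json
import urllib.request


def build_search_request(base_url, interests, top_k=5):
    """Build (but do not send) a POST request to the search endpoint."""
    # Assumed field names; the real request schema may differ.
    payload = {"query": interests, "top_k": top_k}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/search-labs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With a server running locally (for example at http://localhost:8000, an assumed address), the request can be sent with `urllib.request.urlopen(build_search_request("http://localhost:8000", "robotics"))`.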
The project includes automated deployment to AWS Lightsail via GitHub Actions.
Required GitHub Secrets:
- DOCKERHUB_ACCESS_TOKEN
- LIGHTSAIL_SSH_KEY
- PINECONE_API_KEY
Required GitHub Variables:
- DOCKERHUB_USERNAME
- LIGHTSAIL_PUBLIC_IP
- LIGHTSAIL_USERNAME
- DOMAIN_NAME
Push to main branch to trigger automatic deployment.
- Text Processing: Your interests are converted to a vector (embedding) using the Hugging Face Inference API
- Similarity Search: Pinecone finds labs with similar vector representations
- Ranking: Results are ranked by similarity score (0-100%)
- Display: Get lab details, professor info, and contact information
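The similarity and ranking steps above can be illustrated with a small pure-Python sketch. It is not the project's implementation: in the app, Pinecone performs the search over 384-dimensional vectors, while this example uses tiny hand-written vectors. The function names and the conversion of a cosine score to a percentage (clamping negatives to 0, then multiplying by 100) are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_labs(query_vec, lab_vecs, top_k=5):
    """Return the top_k (lab name, percent match) pairs, best first."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in lab_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    # Assumed mapping from cosine score to a 0-100% match value.
    return [(name, round(max(score, 0.0) * 100)) for name, score in scored[:top_k]]


labs = {"Lab A": [1, 0, 0], "Lab B": [1, 1, 0], "Lab C": [0, 1, 0]}
print(rank_labs([1, 0, 0], labs, top_k=2))
```

A lab whose vector points in the same direction as the query scores 100%, and a lab with an orthogonal vector scores 0%.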
- Check that your .env file has the correct Pinecone API key
- Run python sample_labs.py if you get no search results
- Open an issue if you encounter bugs