An AI-powered document Q&A system that lets you upload PDFs, ask questions, and get answers with source citations, using Retrieval-Augmented Generation (RAG).
Upload a PDF → Ask a question → Get an answer + cited snippets
(Add screenshots / short GIF here)
- ✅ Upload PDF documents
- ✅ Ask natural-language questions
- ✅ Answers include source citations (document + chunk)
- ✅ Persistent vector store (data survives restart)
- ✅ Simple REST API (FastAPI)
- ✅ Optional lightweight web UI
- Backend: FastAPI (Python)
- Vector DB: ChromaDB
- Embeddings: OpenAI `text-embedding-3-small`
- LLM: `gpt-4o-mini`
- PDF → Text extraction
- Chunking (split into small overlapping text blocks)
- Embeddings (convert chunks into vectors)
- Store vectors + metadata in ChromaDB
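
The chunking step above can be illustrated with a minimal sketch. It is not the project's actual code: the function name `chunk_text`, the character-based sizes, and the default values of `chunk_size` and `overlap` are assumptions chosen for illustration, and a real pipeline might split on tokens or sentences instead.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[dict]:
    """Split text into overlapping character windows.

    Hypothetical helper: sizes are measured in characters, not tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for index, start in enumerate(range(0, len(text), step)):
        chunks.append({
            "chunk_id": index,  # position of the chunk, usable in citations
            "start": start,     # character offset in the source text
            "text": text[start:start + chunk_size],
        })
        # Stop once the window has reached the end of the text,
        # so no redundant tail chunk is produced.
        if start + chunk_size >= len(text):
            break
    return chunks
```

For example, `chunk_text("abcdefghij", chunk_size=4, overlap=1)` yields the texts `"abcd"`, `"defg"`, and `"ghij"`: each chunk repeats the last character of the previous one. Keeping `chunk_id` alongside the text is what later allows an answer to cite the document and chunk it came from.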
- User question → Embedding
- Similarity search in ChromaDB (top-k chunks)
- Retrieved chunks → injected as context into the LLM prompt
- LLM generates answer + citations
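
The query steps above can be sketched without any external services. This is a rough illustration under stated assumptions, not the project's implementation: a plain Python list stands in for the ChromaDB collection, short hand-written vectors stand in for OpenAI embeddings, cosine similarity is an assumed distance metric, and the names `cosine_similarity`, `top_k_chunks`, `build_prompt`, and the record fields are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec: list[float], records: list[dict], k: int = 3) -> list[dict]:
    """Return the k records whose embeddings are most similar to the query.

    `records` is an in-memory stand-in for the vector store (assumption).
    """
    ranked = sorted(
        records,
        key=lambda r: cosine_similarity(query_vec, r["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, chunks: list[dict]) -> str:
    """Inject retrieved chunks as numbered context so the model can cite them."""
    context = "\n\n".join(
        f"[{i}] ({c['document']}, chunk {c['chunk_id']})\n{c['text']}"
        for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "Cite sources by their bracketed number.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Toy data: two-dimensional vectors in place of real embeddings.
records = [
    {"document": "report.pdf", "chunk_id": 0, "text": "Revenue grew in Q1.", "embedding": [1.0, 0.0]},
    {"document": "report.pdf", "chunk_id": 1, "text": "The office moved.", "embedding": [0.0, 1.0]},
    {"document": "notes.pdf", "chunk_id": 0, "text": "Q1 revenue details.", "embedding": [0.9, 0.1]},
]
retrieved = top_k_chunks([1.0, 0.0], records, k=2)
prompt = build_prompt("How did revenue change?", retrieved)
```

Numbering the chunks in the prompt gives the model a stable handle for each source, so a citation such as `[1]` in the answer can be mapped back to a document and chunk.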
- Support more file types (DOCX, TXT)
- Better chunking strategies
- Conversation memory
- Queries across multiple documents