fahmiaziz98/rag-pgvector-api

📚 Retrieval-Augmented Generation (RAG) System with PGVector

This repository contains a RAG (Retrieval-Augmented Generation) system powered by an open-source LLM, PGVector as the vector database, and LangChain for orchestration. The project includes full documentation, setup instructions, and evaluation tools.


✅ Deliverables

  • RAG model serving system based on an open-source LLM.
  • Uses PGVector as the vector database (rubythalib/pgvector:latest).
  • Complete documentation for setup and installation.
  • Evaluation spreadsheet containing:
    • 25 questions
    • 25 ground-truth answers from SOP
    • 25 model-generated answers (LLM output)

⚙️ Tech Stack

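| Component     | Technology                                   |
|---------------|----------------------------------------------|
| LLM           | llama-3.1-8b-instant (Groq)                  |
| Embedding     | sentence-transformers/all-MiniLM-L6-v2 (HF)  |
| Vector Store  | PGVector                                     |
| Orchestration | LangChain                                    |
| API Serving   | FastAPI                                      |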

🚀 Installation & Setup

1. Install UV

curl -Ls https://astral.sh/uv/install.sh | bash

# Make sure ~/.local/bin is in PATH
export PATH="$HOME/.local/bin:$PATH"

2. Clone Repository

git clone https://github.com/fahmiaziz98/technical_test.git
cd technical_test

3. Create and Activate Virtual Environment

uv venv .venv
source .venv/bin/activate

4. Install Dependencies

uv pip install -r requirements.txt

5. Setup PGVector via Docker

docker run --name pgvector-container \
  -e POSTGRES_USER=user \
  -e POSTGRES_PASSWORD=user \
  -e POSTGRES_DB=SOP_perusahaan \
  -p 6024:5432 \
  -d rubythalib/pgvector:latest

6. Setup Environment Variables

cp .env.example .env

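Fill in the .env file with your local credentials; the database values must match the ones passed to the Docker command in step 5 (user, password, database name, and host port 6024). Generate a Groq API key at: https://console.groq.com/keys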
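As a minimal sketch of how the credentials from step 5 fit together, the snippet below builds a PostgreSQL connection URL from environment variables. The variable names (POSTGRES_HOST, POSTGRES_PORT, and so on) and the helper build_pg_url are assumptions for illustration, not necessarily the names used in .env.example; the defaults mirror the docker run command above. Depending on the installed driver, the scheme may need a suffix such as postgresql+psycopg.

```python
import os
from urllib.parse import quote_plus


def build_pg_url(env=os.environ):
    """Build a PostgreSQL URL from environment variables.

    Variable names are hypothetical; defaults mirror the docker run
    command in step 5 (user/user, database SOP_perusahaan, port 6024).
    """
    user = env.get("POSTGRES_USER", "user")
    password = env.get("POSTGRES_PASSWORD", "user")
    host = env.get("POSTGRES_HOST", "localhost")
    port = env.get("POSTGRES_PORT", "6024")  # host port mapped to 5432 in the container
    db = env.get("POSTGRES_DB", "SOP_perusahaan")
    # Quote credentials so special characters do not break the URL.
    return f"postgresql://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{db}"


if __name__ == "__main__":
    print(build_pg_url())
```

If the connection fails, a mismatch between the port in .env and the -p 6024:5432 mapping is a likely place to look first.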


📥 Index Documents into Vector DB

uv run etl/indexing.py

▶️ Run the API Service

uv run src/service.py

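API docs are available at http://0.0.0.0:8000/docs once the service is running. If your browser or OS does not resolve 0.0.0.0, use http://localhost:8000/docs instead.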

Example CURL request (set "method" to "hybrid" or "native"):

curl -X 'POST' \
  'http://0.0.0.0:8000/api/v1/ask' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "session_id": "123sh",
  "query": "What are the requirements for working overtime and how should it be reported?",
  "method": "hybrid"
}'
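The same request can be sent from Python. The sketch below uses only the standard library and assumes the service is reachable at http://localhost:8000; the helper names build_ask_request and ask are illustrative and not part of the repository.

```python
import json
import urllib.request

# Assumption: the service runs locally on port 8000.
API_URL = "http://localhost:8000/api/v1/ask"


def build_ask_request(session_id, query, method="hybrid", url=API_URL):
    """Build the POST request for the /api/v1/ask endpoint."""
    if method not in ("hybrid", "native"):
        raise ValueError("method must be 'hybrid' or 'native'")
    payload = {"session_id": session_id, "query": query, "method": method}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def ask(session_id, query, method="hybrid"):
    """Send the request and return the decoded JSON response."""
    request = build_ask_request(session_id, query, method)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling ask("123sh", "What are the requirements for working overtime and how should it be reported?") should return a dictionary shaped like the example response, provided the service is running.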

Example Response:

{
  "session_id": "123sh",
  "query": "What are the requirements for working overtime and how should it be reported?",
  "answer": "The requirement for working overtime is direct supervisor approval. Employees working beyond regular hours are entitled to overtime compensation. Overtime reports must be submitted no later than 1 business day after the overtime is performed.",
  "metadata": {
    "method": "hybrid",
    "model": "llama-3.1-8b-instant",
    "retriever_config": {
      "type": "hybrid",
      "collection": "doc_SOP_v2",
      "top_k": 3,
      "vector_store_top_k": 3,
      "bm25_top_k": 3,
      "weights": [0.5, 0.5],
      "rerank_top_n": 5
    }
  }
}
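The retriever_config above reports separate top-k values for the vector store and BM25 plus a weights pair of [0.5, 0.5]. This README does not state how the two result lists are merged; one common approach is weighted reciprocal rank fusion, sketched below as an illustration only. The function name, the constant c=60, and the document IDs are assumptions, not taken from the repository.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, top_k=3, c=60):
    """Fuse ranked lists of document IDs with weighted reciprocal rank fusion.

    Each document scores weight / (rank + c) per list it appears in;
    the scores are summed across lists.
    """
    if len(rankings) != len(weights):
        raise ValueError("need one weight per ranking")
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (rank + c)
    # Highest fused score first; ties keep first-seen order (stable sort).
    fused = sorted(scores, key=scores.get, reverse=True)
    return fused[:top_k]


# Hypothetical document IDs for illustration.
vector_hits = ["sop_overtime", "sop_leave", "sop_travel"]
bm25_hits = ["sop_overtime", "sop_payroll", "sop_leave"]
print(weighted_rrf([vector_hits, bm25_hits], weights=[0.5, 0.5], top_k=3))
# → ['sop_overtime', 'sop_leave', 'sop_payroll']
```

Documents found by both retrievers rise to the top, which is the usual motivation for combining keyword and embedding search.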

📊 Evaluate LLM Outputs

uv run evaluate.py --method native --delay 5 --input evaluasi_data.xlsx

Evaluation results will be saved into a spreadsheet containing:

  • Column A: Question
  • Column B: Ground-truth answer (SOP)
  • Column C: Model-generated answer (LLM)
  • Column D: Native output
  • Column E: Hybrid output
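The evaluation run can be pictured as a loop that asks each question once per retrieval method and pauses between calls. The sketch below assumes that --delay is a number of seconds to wait between requests (for example, to stay within API rate limits); the function run_evaluation and its parameters are hypothetical and do not reflect the actual evaluate.py.

```python
import time


def run_evaluation(questions, ask_fn, methods=("native", "hybrid"), delay=5.0, sleep=time.sleep):
    """Collect one answer per question and method, pausing between calls.

    ask_fn(question, method) is any callable that returns the model's
    answer as a string, e.g. a thin wrapper around the /ask endpoint.
    """
    rows = []
    calls_left = len(questions) * len(methods)
    for question in questions:
        row = {"question": question}
        for method in methods:
            row[method] = ask_fn(question, method)
            calls_left -= 1
            if calls_left > 0:
                sleep(delay)  # assumed purpose: space out API calls
        rows.append(row)
    return rows
```

Each returned row maps naturally onto the spreadsheet: the question in column A and the per-method answers in the native and hybrid output columns.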


🧹 Clean Up

Stop and remove PGVector container:

docker stop pgvector-container
docker rm pgvector-container

✅ TODO Checklist

  • RAG model serving based on company SOP documents
  • PGVector with image rubythalib/pgvector:latest
  • API service with FastAPI
  • Documentation for installation & usage
  • Script for indexing documents into vector DB
  • /ask endpoint with hybrid & native retrieval
  • Spreadsheet-based evaluation of answers
  • CURL example for manual testing
  • .env and Groq API key handling
  • Document extraction using SmolDocling VLM (GPU-based for accuracy)
  • Hybrid retrieval + reranker for better contextual answers
  • Improved embeddings for better retrieval quality, e.g.:

