Sovereign Retrieval-Augmented Generation for Secure Code Analysis
🚀 Open-source sovereign RAG engine for security code auditing.
- 100% offline
- Runs on personal hardware (Apple M1 fully supported)
- Uses LLMs like phi3, mistral, codellama via Ollama
- Vector DB via ChromaDB
- Preprocessing with spaCy for efficient semantic chunking (see the sketch below)
- BSD 3-Clause License
Sovereign AI - private, secure, under full control.
This project is part of a broader experiment applying sovereign AI pipelines to security code analysis: a fully private, offline pipeline for security code auditing, free from cloud lock-in, vendor tracking, and corporate surveillance.
Build your own lab. Own your models. Control your data.
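As a rough illustration of the spaCy-based chunking mentioned above, here is a minimal sketch. The function name, the spaCy model, and the 400-character budget are assumptions for illustration, not the project's actual implementation:

```python
# Illustrative sentence-aware chunking with spaCy (names and sizes assumed,
# not the project's actual code).
import spacy

nlp = spacy.load("en_core_web_sm")  # requires: python -m spacy download en_core_web_sm

def semantic_chunks(text: str, max_chars: int = 400) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters."""
    chunks: list[str] = []
    current = ""
    for sent in nlp(text).sents:  # spaCy sentence segmentation
        if current and len(current) + len(sent.text) > max_chars:
            chunks.append(current.strip())
            current = ""
        current += sent.text + " "
    if current.strip():
        chunks.append(current.strip())
    return chunks
```

Chunking on sentence boundaries keeps each embedded chunk semantically coherent, which tends to improve retrieval quality over fixed-size splits.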
This project is designed to run via Docker. The Makefile wraps the common Docker Compose commands.
Requirements:
- Docker and Docker Compose
Optional (local dev only):
- Python 3.10+
- Ollama
Optional (local dev only, CLI entry point):
```bash
pip install -e .
sovereign-rag --help
```

Build images:

```bash
make build
```

SovereignRAG provides a unified CLI with colored output for better readability. There are two main commands: `ingest` and `query`.
To ingest security-related PDF documents into the vector database:
```bash
make ingest PDF_DIR=./raw_pdfs MODEL=all-MiniLM-L6-v2
```

Options:

- `--pdf-dir`: Directory containing PDF files to index (default: `./raw_pdfs/`)
- `--model`: Sentence-transformer model to use (default: `all-MiniLM-L6-v2`)
Example:
```bash
make ingest PDF_DIR=./security_pdfs MODEL=all-MiniLM-L6-v2
```
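Conceptually, ingestion follows the standard RAG indexing flow: extract text from each PDF, chunk it, embed the chunks, and store them in ChromaDB. Here is a self-contained sketch of that flow; the collection name, pypdf usage, and naive fixed-size chunking are assumptions for illustration, not the project's exact code (the project chunks with spaCy):

```python
# Sketch of the assumed ingestion flow; not the project's exact code.
from pathlib import Path

import chromadb
from pypdf import PdfReader
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("security_docs")  # name assumed

for pdf_path in Path("./raw_pdfs").glob("*.pdf"):
    # Extract raw text page by page.
    text = "\n".join(page.extract_text() or "" for page in PdfReader(str(pdf_path)).pages)
    # Naive fixed-size chunking for brevity; the project uses spaCy-based chunking.
    chunks = [text[i : i + 1000] for i in range(0, len(text), 1000)]
    if not chunks:
        continue
    collection.add(
        ids=[f"{pdf_path.stem}-{i}" for i in range(len(chunks))],
        documents=chunks,
        embeddings=model.encode(chunks).tolist(),
    )
```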
To analyze a source code file for security vulnerabilities:

```bash
make query PATH=./src/sovereign_rag/query.py EXT=py MODEL=qwen2.5:3b-instruct
```

Options (via Makefile vars):

- `PATH`: Path to the source code file or directory to analyze (required)
- `EXT`: File extension filter when `PATH` is a directory (optional)
- `MODEL`: Ollama model to use (default: `qwen2.5:3b-instruct`)
- `OLLAMA_URL`: Ollama API URL (default: `http://ollama:11434`)
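Under the hood, the query step is a classic retrieve-then-generate loop: pull the most relevant reference chunks from ChromaDB, then ask the Ollama model to audit the code with that context. A rough sketch of that flow; the prompt wording and collection name are assumptions:

```python
# Sketch of the assumed retrieve-then-generate flow; not the project's exact code.
import chromadb
import requests

client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("security_docs")  # name assumed

with open("./src/sovereign_rag/query.py") as f:
    source = f.read()

# Retrieve the reference chunks semantically closest to the code under review.
hits = collection.query(query_texts=[source], n_results=5)
context = "\n\n".join(hits["documents"][0])

prompt = (
    "You are a security auditor. Using the reference material below, "
    f"identify vulnerabilities in the code.\n\n{context}\n\nCode:\n{source}"
)
# Ollama's REST API: POST /api/generate returns {"response": ...} when stream=False.
resp = requests.post(
    "http://ollama:11434/api/generate",
    json={"model": "qwen2.5:3b-instruct", "prompt": prompt, "stream": False},
    timeout=300,
)
print(resp.json()["response"])
```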
Common tasks:
```bash
make up
make pull-model MODEL=qwen2.5:3b-instruct
make ingest PDF_DIR=./raw_pdfs MODEL=all-MiniLM-L6-v2
make query PATH=./src EXT=py MODEL=qwen2.5:3b-instruct
make down
```

If you prefer raw Docker Compose commands instead of the Makefile:
```bash
docker compose build
docker compose up -d ollama
docker compose exec ollama ollama pull qwen2.5:3b-instruct
docker compose run --rm app env PYTHONPATH=src python -m sovereign_rag.cli ingest --pdf-dir ./raw_pdfs --model all-MiniLM-L6-v2
docker compose run --rm app env PYTHONPATH=src python -m sovereign_rag.cli query --path ./src --extension py --model qwen2.5:3b-instruct --ollama-url http://ollama:11434
docker compose down
```

Outputs and data persistence:
- ChromaDB data: `./chroma_db` (host) is mounted to `/app/chroma_db` (container)
- Reports: `./output` (host) is mounted to `/app/output` (container)
- PDFs: `./raw_pdfs` (host) is mounted to `/app/raw_pdfs` (container)
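Because these are bind mounts, the vector index survives container restarts. A quick host-side check (collection name assumed, matching the sketches above):

```python
import chromadb

client = chromadb.PersistentClient(path="./chroma_db")
print(client.get_collection("security_docs").count())  # number of stored chunks
```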
Stop services:
```bash
docker compose down
```

Use the dev image to get pytest, ruff, and other developer tools from `requirements/requirements_dev.txt`.
Build and open a shell in the dev container:
```bash
docker compose build app-dev
docker compose run --rm app-dev bash
```

Inside the dev container, run common tasks:
```bash
# Format
ruff format .

# Lint
ruff check .

# Tests
pytest -q

# App commands (same as prod), pointing to the Ollama service
PYTHONPATH=src python -m sovereign_rag.cli ingest --pdf-dir ./raw_pdfs/ --model all-MiniLM-L6-v2
PYTHONPATH=src python -m sovereign_rag.cli query --path ./src --extension py --model qwen2.5:3b-instruct --ollama-url http://ollama:11434
```

This project uses ruff for code formatting. To format all Python files in the project, run:
```bash
ruff format .
```

This will automatically format your code according to the style defined in `pyproject.toml`.

