Brain Box is a Retrieval-Augmented Generation (RAG) chat bot. The backend ingests documents, builds a vector store, and serves a chat API that answers questions using retrieved context.
- Vector embedding and storage using ChromaDB
- Semantic retrieval and prompt orchestration with LangChain
- LLM and embedding models powered by Azure AI Foundry
- Document ingestion and vector store rebuild on upload
- Pluggable LLM / embedding configuration via environment variables
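At its core, the retrieval step works like the following sketch. This is a toy illustration of semantic retrieval with cosine similarity, not the project's actual `vector_store.py`; real embeddings would come from the configured embedding model rather than hand-written vectors.

```python
# Toy illustration of RAG retrieval: rank stored chunks by cosine
# similarity to the query embedding, then build a grounded prompt.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, store, k=2):
    """Return the k document chunks most similar to the query vector."""
    ranked = sorted(store, key=lambda d: cosine(query_vec, d["vector"]), reverse=True)
    return [d["text"] for d in ranked[:k]]

# Toy vector store: (text, embedding) pairs.
store = [
    {"text": "Paris is the capital of France.", "vector": [0.9, 0.1, 0.0]},
    {"text": "The mitochondria is the powerhouse of the cell.", "vector": [0.0, 0.2, 0.9]},
    {"text": "France borders Spain and Germany.", "vector": [0.8, 0.3, 0.1]},
]

context = retrieve([1.0, 0.0, 0.0], store, k=2)
prompt = "Answer using only this context:\n" + "\n".join(context)
```

In the real app, ChromaDB plays the role of `store` and LangChain orchestrates the prompt assembly.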
- `backend/` - FastAPI app and RAG logic (`app.py`, `main.py`, `vector_store.py`, etc.)
- `frontend/` - Streamlit UI
- `data/` - Uploaded documents and vector store files
- `templates/` - HTML templates (if used)
- `docker-compose.yml` - Docker Compose file for local deployment
- Python 3.10+
- See `backend/requirements.txt` and `frontend/requirements.txt` for project-specific dependencies
Run with Docker:

```bash
docker compose up --build
```

Or set up manually:

- Create and activate a virtual environment:

  ```bash
  python -m venv .venv
  source .venv/bin/activate
  ```

- Install backend dependencies:

  ```bash
  pip install -r backend/requirements.txt
  ```

- Install frontend dependencies (if using the Streamlit UI):

  ```bash
  pip install -r frontend/requirements.txt
  ```

- Create a `.env` file in the project root (see the next section for variables).

- Run the backend (development):

  ```bash
  # from project root
  python backend/app.py
  # or
  uvicorn backend.app:app --host 0.0.0.0 --port 8000 --reload
  ```

- Run the frontend UI (Streamlit):

  ```bash
  streamlit run frontend/streamlit_app.py --server.port 8501
  ```

Create a `.env` file at the project root. The backend reads configuration from environment variables (via `os.environ` in the code). Below are recommended variables and example values; adapt these to your environment and provider.
Example `.env` (the values shown are placeholders):

```
# Azure AI Foundry
AZURE_API_KEY=your-azure-api-key
API_VERSION=your-api-version
AZURE_ENDPOINT=your-azure-endpoint

# Models
MODEL_NAME=gpt-4.1-mini  # or any supported model name
EMBEDDING_DEPLOYMENT_NAME=text-embedding-3-small
DEPLOYMENT_NAME=gpt-4.1-mini
```
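A sketch of how the backend might read these values via `os.environ`. The `setdefault` calls stand in for a dotenv loader, and the fallback values are illustrative placeholders, not real credentials or the project's actual defaults.

```python
import os

# Stand-in for a dotenv loader: seed placeholder values if unset.
os.environ.setdefault("AZURE_API_KEY", "your-azure-api-key")
os.environ.setdefault("AZURE_ENDPOINT", "your-azure-endpoint")
os.environ.setdefault("API_VERSION", "your-api-version")

# Required settings: raise KeyError if truly missing.
AZURE_API_KEY = os.environ["AZURE_API_KEY"]
AZURE_ENDPOINT = os.environ["AZURE_ENDPOINT"]
API_VERSION = os.environ["API_VERSION"]

# Optional settings with illustrative defaults.
MODEL_NAME = os.environ.get("MODEL_NAME", "gpt-4.1-mini")
EMBEDDING_DEPLOYMENT_NAME = os.environ.get("EMBEDDING_DEPLOYMENT_NAME", "text-embedding-3-small")
```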
How to use the `.env` file:

- Place `.env` in the repository root. Your shell or a dotenv loader will expose these values to the Python process.
- If the codebase uses `python-dotenv`, the app will load `.env` automatically; otherwise, export the variables before starting the app.
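For example, exporting the variables in a POSIX shell (all values below are placeholders; substitute your own):

```bash
export AZURE_API_KEY="your-azure-api-key"
export AZURE_ENDPOINT="your-azure-endpoint"
export API_VERSION="your-api-version"
export MODEL_NAME="gpt-4.1-mini"
```

Then start the backend in the same shell session so the process inherits the variables.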
- Ensure your LLM provider keys (e.g. `AZURE_API_KEY`) are valid and have the required permissions.
- If the app cannot initialize the bot, check the logs for missing environment variables or missing dependencies.
- Uploaded documents are stored in `data/`; if vector files are stored there too, keep backups before deleting anything.
- Swap embedding or LLM models by changing `EMBEDDING_MODEL` and `LLM_MODEL` in `.env`.
- Add provider-specific secrets (Pinecone, Azure, etc.) to `.env` and update `backend/vector_store.py` accordingly.
- For production, add authentication in front of the API endpoints.
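A small config helper makes the model swap explicit. This sketch assumes the variable names `EMBEDDING_MODEL` and `LLM_MODEL` from the tip above; the default values are hypothetical, not the project's actual settings.

```python
import os

def get_model_config() -> dict:
    """Resolve model choices from the environment, with illustrative defaults."""
    return {
        "llm": os.environ.get("LLM_MODEL", "gpt-4.1-mini"),
        "embedding": os.environ.get("EMBEDDING_MODEL", "text-embedding-3-small"),
    }

config = get_model_config()
```

Changing either value in `.env` then switches models without touching the retrieval code.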