DocuMind is a privacy-focused AI assistant that lets you chat with your documents and the web, designed to avoid the risks of uploading sensitive financial or legal documents to cloud LLMs like GPT-4. Built with a "local-first" philosophy, it leverages Ollama to run powerful LLMs directly on your machine, ensuring your data never leaves your control.
Whether you need to summarize a research paper, extract insights from a financial report, or query a GitHub repository, DocuMind provides a seamless, chat-based interface to interact with your content.
- 📄 Chat with PDFs: Upload PDF documents and ask questions. The system uses RAG (Retrieval-Augmented Generation) to provide accurate answers with source citations (a minimal sketch of this flow follows this list).
- 🌍 Universal Ingestion: Paste URLs from Wikipedia, GitHub, or any other website. DocuMind scrapes and processes the content, making it instantly queryable.
- 🎧 Audio Summaries: Generate podcast-style audio summaries of your documents to listen on the go.
- 🔒 Local & Private: All processing happens locally through Ollama. No API keys required, no data leakage.
- 🧠 Model Agnostic: Switch between different LLMs (Llama 3, Mistral, Gemma) on the fly directly from the chat interface.
- ⚡ Real-time Streaming: Enjoy a smooth, typewriter-style chat experience with low latency.
- 🎨 Modern UI: A clean, responsive interface built with Next.js and Shadcn UI, featuring dark mode support and syntax highlighting for code.
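Under the hood, the PDF flow is a standard RAG loop: extract text, chunk it, embed the chunks into a local vector store, retrieve the most relevant chunks for a question, and let the local model answer from them. The sketch below is illustrative rather than DocuMind's actual service code; it assumes the stack listed below (PyMuPDF, LangChain, ChromaDB, Ollama) and a pulled `llama3` model.

```python
# Minimal RAG sketch (illustrative, not DocuMind's actual code).
import fitz  # PyMuPDF
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_ollama import ChatOllama, OllamaEmbeddings

# 1. Extract text from the uploaded PDF.
pdf = fitz.open("report.pdf")
text = "\n".join(page.get_text() for page in pdf)

# 2. Chunk the text and index it in a local Chroma collection.
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_text(text)
store = Chroma.from_texts(chunks, OllamaEmbeddings(model="llama3"))

# 3. Retrieve the most relevant chunks and answer with the local model.
question = "What are the key findings?"
context = "\n\n".join(doc.page_content for doc in store.similarity_search(question, k=4))
answer = ChatOllama(model="llama3").invoke(
    f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
)
print(answer.content)
```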
- Framework: Next.js 16 (App Router)
- Styling: Tailwind CSS
- Components: Shadcn UI
- Icons: Lucide React
- Markdown: `react-markdown` with `react-syntax-highlighter`
- API: FastAPI
- Vector Store: ChromaDB
- LLM Orchestration: LangChain
- PDF Processing: PyMuPDF (fitz)
- Web Scraping: `beautifulsoup4` & `WebBaseLoader`
- Audio Generation: `gTTS` (Google Text-to-Speech)
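The audio summaries combine the last two pieces above: the LLM drafts a short script and gTTS renders it to an MP3 the backend can serve. A minimal sketch, assuming text already extracted from a document and a pulled `llama3` model; variable names and paths are illustrative:

```python
# Illustrative audio-summary sketch: summarize locally, synthesize with gTTS.
from gtts import gTTS
from langchain_ollama import ChatOllama

document_text = "..."  # text previously extracted from the uploaded PDF

script = ChatOllama(model="llama3").invoke(
    "Write a short, podcast-style summary of this document:\n\n" + document_text
).content

# gTTS writes an MP3; the backend serves generated audio from its static/ folder.
gTTS(script, lang="en").save("static/summary.mp3")
```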
Before you begin, ensure you have the following installed:
- Node.js (v18 or higher)
- Python (v3.10 or higher)
- Ollama: Download and install from ollama.com.
- Pull a model: `ollama pull llama3` (or your preferred model).
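Before going further, you can confirm Ollama is running and the model is available; Ollama exposes a local REST API on port 11434 by default. A quick optional check using `httpx` (which the backend already depends on):

```python
# List locally installed Ollama models via its default REST endpoint.
import httpx

tags = httpx.get("http://localhost:11434/api/tags").json()
print([m["name"] for m in tags.get("models", [])])  # expect something like ['llama3:latest']
```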
- Clone the repository:

  ```bash
  git clone https://github.com/iamdanwi/pdf-assitant.git documind
  cd documind
  ```

- Navigate to the server directory and set up the Python environment:

  ```bash
  cd server
  python -m venv .venv
  source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  pip install -r requirements.txt
  ```

  Note: If requirements.txt is missing, install the core dependencies:

  ```bash
  pip install fastapi "uvicorn[standard]" langchain-ollama langchain-community chromadb pymupdf python-multipart httpx gTTS beautifulsoup4
  ```

- Navigate to the client directory and install dependencies:

  ```bash
  cd ../client
  npm install
  ```
- Start the Backend Server:

  ```bash
  # In /server directory
  source .venv/bin/activate
  uvicorn app.main:app --reload
  ```

  The API will be available at http://localhost:8000.

- Start the Frontend Client:

  ```bash
  # In /client directory
  npm run dev
  ```

  The application will be available at http://localhost:3000.
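As a quick backend smoke test, FastAPI serves its auto-generated interactive docs at `/docs` by default, so a simple request confirms the server is reachable (no DocuMind-specific routes are assumed here):

```python
# Verify the backend is up by hitting FastAPI's built-in docs page.
import httpx

assert httpx.get("http://localhost:8000/docs").status_code == 200
```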
- Select a Model: Use the dropdown in the chat input to select your installed Ollama model.
- Add Content:
- Upload PDF: Click the paperclip icon to upload a document.
- Add URL: Click "Add from URL" to paste a link (e.g., a Wikipedia article).
- Chat: Type your questions. The AI answers based on the uploaded context, streaming the response token by token (see the sketch after this list).
- Listen: Click "Generate Audio" to hear a summary of the active document.
- New Chat: Click the `+` icon in the sidebar to clear the context and start fresh.
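The typewriter effect comes from streaming tokens as the model generates them rather than waiting for the full reply. A minimal backend-side sketch (illustrative; the actual transport to the UI may differ), assuming a pulled `llama3` model:

```python
# Stream tokens from the local model as they are generated.
from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3")
for chunk in llm.stream("Summarize Retrieval-Augmented Generation in one paragraph."):
    print(chunk.content, end="", flush=True)  # each chunk carries a few tokens
```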
```
documind/
├── client/                # Next.js Frontend
│   ├── app/
│   │   ├── components/    # React Components (ChatInterface, MessageBubble)
│   │   ├── hooks/         # Custom Hooks (useChat)
│   │   └── lib/           # Utilities
│   └── public/
├── server/                # FastAPI Backend
│   ├── app/
│   │   ├── api/           # API Routes (chat, ingest, audio)
│   │   ├── core/          # Configuration
│   │   └── services/      # Business Logic (Vector Store)
│   └── static/            # Generated audio files
└── README.md
```
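A hypothetical sketch of how the backend pieces above can fit together in `server/app/main.py`; the router and function names below are illustrative stubs, not DocuMind's actual modules:

```python
# Hypothetical wiring of the FastAPI app; stubs keep the sketch self-contained.
from fastapi import APIRouter, FastAPI
from fastapi.staticfiles import StaticFiles

chat_router = APIRouter(prefix="/chat", tags=["chat"])

@chat_router.post("/")
async def chat(message: str) -> dict:
    # In the real app this would retrieve context from the vector store
    # and stream an Ollama response; here it simply echoes the input.
    return {"answer": f"echo: {message}"}

app = FastAPI(title="DocuMind API")
app.include_router(chat_router)
# Generated audio files live under server/static/ and are served statically.
app.mount("/static", StaticFiles(directory="static", check_dir=False), name="static")
```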
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository.
- Create your feature branch (`git checkout -b feature/AmazingFeature`).
- Commit your changes (`git commit -m 'Add some AmazingFeature'`).
- Push to the branch (`git push origin feature/AmazingFeature`).
- Open a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.