Turn your documents into Claude's memory.
Textrawl is a personal knowledge base that lets Claude search through your emails, PDFs, notes, and web pages. Ask questions about your own documents right from Claude Desktop - no copy-pasting, no context limits.
How it works: Claude Desktop searches your knowledge base (emails, PDFs, notes) through a hybrid search engine that combines semantic and keyword matching, backed by PostgreSQL + pgvector on Supabase. Ask things like "What did Sarah say about the project?" and Claude queries your documents directly. Content gets into the knowledge base through the desktop app (drag & drop) or the CLI tools (batch import).
Beyond keyword search. Most search tools only match exact words. Textrawl combines semantic understanding (finds "automobile" when you search "car") with traditional keyword matching - so you get relevant results without missing exact phrases.
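The merging step behind this is Reciprocal Rank Fusion (listed in the feature table below). A minimal sketch of the idea, not Textrawl's actual code:

```typescript
// Reciprocal Rank Fusion (RRF): merge two ranked result lists by summing
// 1 / (k + rank) per document. k = 60 is the conventional smoothing constant.
type Ranked = { id: string }[];

function reciprocalRankFusion(semantic: Ranked, keyword: Ranked, k = 60): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  const accumulate = (results: Ranked) =>
    results.forEach(({ id }, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  accumulate(semantic); // ranks from vector similarity
  accumulate(keyword);  // ranks from full-text search
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```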
Your data, your choice. Use OpenAI's embeddings for best accuracy, or run completely locally with Ollama - no API costs, no data leaving your machine.
Import everything. Emails from Gmail exports, PDFs from your research, saved web pages, Google Takeout archives - Textrawl converts them all into searchable knowledge.
| Feature | Description |
|---|---|
| Hybrid Search | Vector similarity + full-text search with Reciprocal Rank Fusion |
| Desktop App | Drag-and-drop file conversion and upload (macOS, Windows, Linux) |
| Multi-Format | PDF, DOCX, XLSX, PPTX, HTML, MBOX/EML emails, Google Takeout |
| MCP Integration | Works natively with Claude Desktop and other MCP clients |
| Flexible Embeddings | OpenAI (cloud) or Ollama (free, local) |
| Smart Chunking | Paragraph-aware splitting with overlap for context (see the sketch below) |
| CLI Tools | Batch processing for large archives |
| Cloud Ready | Deploy to Docker, Cloud Run, or any container platform |
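The "Smart Chunking" row refers to paragraph-aware splitting with overlap. Here is a minimal sketch of that general idea (not Textrawl's actual implementation; the chunk size and overlap policy are illustrative):

```typescript
// Paragraph-aware chunking with overlap: split on blank lines, pack paragraphs
// into chunks up to maxChars, and carry the last paragraph into the next chunk
// so neighboring chunks share context. Sizes here are illustrative.
function chunkText(text: string, maxChars = 1200): string[] {
  const paragraphs = text.split(/\n\s*\n/).map((p) => p.trim()).filter(Boolean);
  const chunks: string[] = [];
  let current: string[] = [];
  let length = 0;

  for (const para of paragraphs) {
    if (length + para.length > maxChars && current.length > 0) {
      chunks.push(current.join("\n\n"));
      const overlap = current[current.length - 1]; // repeat the last paragraph as overlap
      current = [overlap, para];
      length = overlap.length + para.length;
    } else {
      current.push(para);
      length += para.length;
    }
  }
  if (current.length > 0) chunks.push(current.join("\n\n"));
  return chunks;
}
```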
git clone https://github.com/jeffgreendesign/textrawl.git
cd textrawl
pnpm install
pnpm setup # Interactive setup for credentials
pnpm dev # Start the server

- Create a free project at supabase.com
- Run `scripts/setup-db.sql` in the SQL Editor (or `setup-db-ollama.sql` for Ollama)
- (Optional) For memory tools, also run `scripts/setup-db-memory.sql` (or `setup-db-memory-ollama.sql`)
- (Optional) For conversation tools, also run `scripts/setup-db-conversation.sql` (or `setup-db-conversation-ollama.sql` / `setup-db-conversation-ollama-v2.sql`)
- Run `scripts/security-rls.sql` for security hardening
- Copy your project URL and service role key to `.env`
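Once `pnpm dev` is running and the database is set up, a quick way to confirm the server is reachable (this uses the `/health` endpoint documented under the health checks below and assumes the default port 3000):

```typescript
// Sanity check: ping the server's health endpoint (default port 3000).
// Run on Node 20+ as an ES module (top-level await).
const res = await fetch("http://localhost:3000/health");
console.log(`health: ${res.status} ${await res.text()}`);
```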
Add to your Claude config (~/Library/Application Support/Claude/claude_desktop_config.json). Create this file if it doesn't exist:
{
"mcpServers": {
"textrawl": {
"command": "npx",
"args": [
"mcp-remote",
"http://localhost:3000/mcp",
"--header",
"Accept: application/json, text/event-stream"
]
}
}
}

Note: Requires Node.js 20+. If using nvm, ensure your default is set: `nvm alias default 22`
If you've set `API_BEARER_TOKEN` in `.env`, add the auth header:

"--header",
"Authorization: Bearer YOUR_TOKEN_HERE"

Restart Claude Desktop - you'll now see Textrawl's tools available.
ChatGPT Desktop supports MCP servers natively (Pro/Plus required):
- Open Settings → Connectors → Advanced → Developer mode
- Add a new connector with your server URL: `http://localhost:3000/mcp`
- If using auth, add the `Authorization: Bearer YOUR_TOKEN` header
See OpenAI MCP documentation for details.
Option A: Desktop App (easiest)
pnpm desktop:dev

Drag files onto the window to convert and upload.
Option B: CLI (for batch imports)
pnpm convert -- mbox ~/Mail/archive.mbox
pnpm upload -- ./converted/

| Guide | Description |
|---|---|
| CLI Tools | Batch conversion and upload from command line |
| Security | Row Level Security and access controls |
| Variable | Required | Description |
|---|---|---|
| `SUPABASE_URL` | Yes | `https://your-project.supabase.co` |
| `SUPABASE_SERVICE_KEY` | Yes | Service role key |
| `EMBEDDING_PROVIDER` | No | `openai` (default) or `ollama` |
| `OPENAI_API_KEY` | If OpenAI | For `text-embedding-3-small` |
| `OLLAMA_BASE_URL` | If Ollama | Default: `http://localhost:11434` |
| `OLLAMA_MODEL` | If Ollama | Default: `nomic-embed-text` |
| `API_BEARER_TOKEN` | Prod only | Min 32 chars (`openssl rand -hex 32`) |
| `PORT` | No | Default: 3000 |
| `LOG_LEVEL` | No | `debug`, `info`, `warn`, `error` |
| `ALLOWED_ORIGINS` | No | Comma-separated CORS origins |
| `ENABLE_MEMORY` | No | Enable memory tools (default: true); requires `setup-db-memory.sql` or `setup-db-memory-ollama.sql` |
| `ENABLE_CONVERSATIONS` | No | Enable conversation memory tools (default: true); requires `setup-db-conversation.sql`, `setup-db-conversation-ollama.sql`, or `setup-db-conversation-ollama-v2.sql` |
| `ENABLE_INSIGHTS` | No | Enable proactive insight tools (default: true) |
| `ENABLE_MEMORY_EXTRACTION` | No | Enable LLM-based memory extraction (default: false) |
| `ANTHROPIC_API_KEY` | If extraction | Required for `extract_memories` tool |
| `EXTRACTION_MODEL` | No | Model for extraction (default: `claude-3-haiku-20240307`) |
| `COMPACT_RESPONSES` | No | Token-efficient responses (default: true) |
| Tool | Description |
|---|---|
| `search_knowledge` | Hybrid semantic + full-text search |
| `get_document` | Retrieve document by ID |
| `list_documents` | List with pagination and filtering |
| `update_document` | Update title and/or tags |
| `add_note` | Add markdown note to knowledge base |
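Beyond Claude Desktop, any MCP client can call these tools. A minimal sketch using the MCP TypeScript SDK (import paths and the `Client` constructor signature may differ slightly across `@modelcontextprotocol/sdk` versions; the URL and arguments are illustrative):

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

// Connect to the local Textrawl server (default port 3000) and run a search.
const transport = new StreamableHTTPClientTransport(new URL("http://localhost:3000/mcp"));
const client = new Client({ name: "textrawl-example", version: "1.0.0" });

await client.connect(transport);
const result = await client.callTool({
  name: "search_knowledge",
  arguments: { query: "what did Sarah say about the project?", limit: 5 },
});
console.log(JSON.stringify(result, null, 2));
await client.close();
```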
Enable with ENABLE_MEMORY=true (default). Requires scripts/setup-db-memory.sql or setup-db-memory-ollama.sql.
| Tool | Description |
|---|---|
| `remember_fact` | Store facts about entities (people, projects, concepts) |
| `recall_memories` | Semantic search across stored memories |
| `relate_entities` | Create relationships between entities |
| `get_entity_context` | Get all memories and relations for an entity |
| `list_entities` | List all known entities |
| `forget_entity` | Delete an entity and all its memories |
| `memory_stats` | Get memory statistics |
| `extract_memories` | Extract entities and facts from text using LLM |
Enable with ENABLE_CONVERSATIONS=true (default). Requires running one of the conversation schema scripts:
- `scripts/setup-db-conversation.sql` (OpenAI embeddings)
- `scripts/setup-db-conversation-ollama.sql` (Ollama v1 - nomic-embed-text, 1024d)
- `scripts/setup-db-conversation-ollama-v2.sql` (Ollama v2 - nomic-embed-text-v2-moe, 768d)
| Tool | Description |
|---|---|
| `save_conversation_context` | Save conversation summary and turns for recall |
| `recall_conversation` | Semantic search across past conversations |
| `list_conversations` | List recent conversation sessions |
| `get_conversation` | Get full conversation by session ID or key |
| `delete_conversation` | Delete a conversation session |
| `conversation_stats` | Get conversation storage statistics |
| Tool | Description |
|---|---|
| `search_with_context` | Search across documents, memories, and conversations simultaneously |
| `knowledge_stats` | Get statistics about the knowledge base |
Enable with ENABLE_INSIGHTS=true (default).
| Tool | Description |
|---|---|
| `get_insights` | View discovered cross-source connections and patterns |
| `discover_connections` | Trigger an insight scan across the knowledge base |
| `dismiss_insight` | Dismiss an insight from the queue |
| `insight_stats` | Get insight queue and processing statistics |
| Parameter | Type | Default | Description |
|---|---|---|---|
| `query` | string | required | Search query |
| `limit` | number | 10 | Max results (1-50) |
| `fullTextWeight` | number | 1.0 | Keyword weight (0-2) |
| `semanticWeight` | number | 1.0 | Semantic weight (0-2) |
| `minScore` | number | 0 | Min relevance threshold (0-1) |
| `tags` | string[] | - | Filter by tags (AND logic) |
| `sourceType` | string | - | `note`, `file`, or `url` |
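As an illustration, a search that favors exact keyword matches within a tagged subset of notes might pass arguments like these (the values are made up; the parameter names are the ones in the table above):

```typescript
// Illustrative search_knowledge arguments: boost keyword matches,
// require a modest relevance score, and restrict to tagged notes.
const args = {
  query: "quarterly budget figures",
  limit: 20,
  fullTextWeight: 1.5,  // favor exact phrases
  semanticWeight: 0.8,  // still allow related wording
  minScore: 0.2,
  tags: ["finance", "2024"],
  sourceType: "note",
};
```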
curl -X POST http://localhost:3000/api/upload \
-H "Authorization: Bearer YOUR_TOKEN" \
-F "[email protected]" \
-F "title=Optional Title" \
-F "tags=tag1,tag2"

Limits: 10MB max file size, 10 uploads/min
Formats: .pdf, .docx, .txt, .md
Response:
{
"success": true,
"documentId": "uuid",
"title": "Document Title",
"tags": ["tag1", "tag2"],
"chunksCreated": 12
}

- `GET /health` - Basic health
- `GET /health/ready` - Readiness probe (checks DB)
- `GET /health/live` - Liveness probe
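If you prefer scripting uploads instead of curl, here is a minimal Node 20+ sketch against the upload endpoint shown above (the token and file path are placeholders; the form field names mirror the curl example):

```typescript
import { readFile } from "node:fs/promises";

// Upload a document to POST /api/upload using the global fetch/FormData in Node 20+.
const form = new FormData();
form.append(
  "file",
  new Blob([await readFile("document.pdf")], { type: "application/pdf" }),
  "document.pdf",
);
form.append("title", "Optional Title");
form.append("tags", "tag1,tag2");

const res = await fetch("http://localhost:3000/api/upload", {
  method: "POST",
  headers: { Authorization: "Bearer YOUR_TOKEN" },
  body: form,
});
console.log(await res.json());
```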
docker-compose up -d
docker-compose logs -f

# Create secrets in Secret Manager first
export GCP_PROJECT_ID=your-project-id
./scripts/deploy.sh

pnpm dev # Watch mode
pnpm build # Production build
pnpm start # Run production
pnpm typecheck # Type check
pnpm lint # Biome lint check
pnpm quality # Lint + typecheck combined
pnpm inspector # MCP Inspector
pnpm setup # Generate .env with secure token
pnpm desktop:dev # Run desktop app
pnpm docs:dev # Run docs site

Run PostgreSQL + pgvector locally instead of using Supabase:
# Start local Postgres with pgvector
docker-compose -f docker-compose.local.yml up -d
# Initialize the database schema
docker exec -i textrawl-postgres psql -U postgres -d textrawl < scripts/setup-db.sql
# Optional: Start pgAdmin at http://localhost:5050
docker-compose -f docker-compose.local.yml --profile tools up -d

Run embeddings locally with Ollama instead of OpenAI:
# Start Postgres + Ollama
docker-compose -f docker-compose.local.yml --profile ollama up -d
# Pull the embedding model (~274MB)
docker exec textrawl-ollama ollama pull nomic-embed-text
# Use the Ollama-specific schema (1024 dimensions)
docker exec -i textrawl-postgres psql -U postgres -d textrawl < scripts/setup-db-ollama.sql

Set in `.env`:
EMBEDDING_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=nomic-embed-text

Supported Ollama models: `nomic-embed-text` (recommended), `mxbai-embed-large`
Note: OpenAI uses 1536-dimension embeddings, Ollama models use 1024. Use `setup-db.sql` for OpenAI or `setup-db-ollama.sql` for Ollama. You cannot mix providers without re-embedding all documents.
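To double-check which schema your local model needs, you can ask Ollama for an embedding and inspect its length. A sketch against Ollama's `/api/embeddings` endpoint, using the default URL and model from the `.env` snippet above:

```typescript
// Request one embedding from the local Ollama server and print its dimension,
// so you can confirm it matches the vector column size in your setup-db-*.sql.
const res = await fetch("http://localhost:11434/api/embeddings", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ model: "nomic-embed-text", prompt: "hello world" }),
});
const { embedding } = await res.json();
console.log(`dimensions: ${embedding.length}`);
```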
| Issue | Solution |
|---|---|
| Invalid Supabase URL | Format: https://your-project.supabase.co (no trailing slash) |
| Missing service role key | Use service role key from Settings > API, not anon key |
| No search results | Check chunks table has embeddings; lower minScore |
| MCP tools not in Claude | Restart Claude Desktop; check curl http://localhost:3000/health |
| Rate limit exceeded | API: 100/min, Upload: 10/min |
Contributions welcome! See CONTRIBUTING.md for guidelines.
MIT - see LICENSE