A local-first AI assistant for your codebase. No cloud, no API keys, just you and your repo.
DevForge keeps your repository context synced and lets you ask questions about your code instantly - all running locally with Ollama.
Born from frustration with cloud tools and constant copy-pasting. Built during weekend vibe-coding sessions (name inspired by Skyrim armor crafting).
- Incremental Context Updates - only re-embeds changed files (hash-based delta detection)
- Smart Retrieval - cosine similarity search with deduplication and freshness scoring
- 100% Local - everything runs on your machine; no API keys, no data leaves your laptop
- Fast Queries - vector embeddings for semantic code search
- Cited Answers - every response includes source references `[file:start-end]`
- Privacy First - your code stays yours
Install Ollama from [ollama.ai](https://ollama.ai) and start the server:

```shell
# Start Ollama server
ollama serve
```

Pull the required models:

```shell
ollama pull gemma3:latest
ollama pull nomic-embed-text:latest
```
```shell
# Clone the repo
git clone <your-repo-url>
cd devforge

# Install dependencies
npm install

# Make it globally available (optional)
npm link
```

Navigate to any Git repository and run:

```shell
devforge read
```

This will:
- ✅ Check that Ollama is running and the models are available
- ✅ Create a `.devforge/` directory (config, ignore patterns, manifest, index)
- ✅ Scan your repo, respecting ignore patterns
- ✅ Chunk files into ~2000-character segments
- ✅ Generate embeddings for all code
- ✅ Store vectors in a local index
Output:

```text
Preflight: Ollama & models
✅ Ollama running, required models found.
Initializing DevForge at: /path/to/repo/.devforge
✅ Created .devforge/config.json
✅ Created .devforge/ignore
Scanning repository (respecting config & ignore)…
✅ Refreshed manifest: 42 files
   Languages: js(15), json(8), md(3)
Embedding repo content (added/changed files only)…
· (1/42) src/chat.js … ok (3 chunks)
· (2/42) src/read.js … ok (5 chunks)
...
✅ Embedded 42 file(s), 156 chunk(s)
```
```shell
# Simple query
devforge ask "where is JWT verified?"

# Show sources
devforge ask "how does the embedding work?" --show-sources

# Use more context chunks
devforge ask "explain the authentication flow" -k 20

# Skip refresh if repo hasn't changed
devforge ask "what database is used?" --no-refresh

# Longer responses
devforge ask "summarize the API architecture" --max-tokens 1024
```

Example output:
```text
Retrieving relevant code… ok (12 chunk(s))

JWT verification happens in the authentication middleware. The token is
extracted from the Authorization header and verified using the JWT_SECRET
from the environment [src/middleware/auth.js:23-45]. If verification fails,
it returns a 401 Unauthorized response.

Sources:
- src/middleware/auth.js:23-45
- src/config/auth.js:10-15
```
Made changes to your code? Just run:

```shell
devforge read
```

DevForge will:
- Detect added, changed, and removed files (using SHA-256 hashes)
- Only re-embed the delta
- Update the index incrementally
Incremental Update Example:

```text
Scanning repository…
✅ Refreshed manifest: 43 files
Δ changes → added: 1, changed: 2, removed: 0
➕ Added:
   src/new-feature.js
✏️ Changed:
   src/auth.js
   src/config.js
Embedding repo content (added/changed files only)…
· (1/3) src/new-feature.js … ok (4 chunks)
· (2/3) src/auth.js … ok (3 chunks)
· (3/3) src/config.js … ok (2 chunks)
✅ Embedded 3 file(s), 9 chunk(s)
```
```shell
devforge read --force
```

Clears the cache and rebuilds the entire index from scratch.
DevForge creates `.devforge/config.json` on first run:

```json
{
  "airgap": false,
  "max_file_kb": 256,
  "models": {
    "chat": "gemma3:latest",
    "embed": "nomic-embed-text:latest"
  },
  "paths": {
    "include": ["**/*"],
    "exclude": [
      "node_modules/**",
      "dist/**",
      ".git/**",
      "*.lock",
      "*.bin",
      "*.jpg",
      "*.png",
      "*.pdf",
      ".env",
      "**/*.key",
      "**/*.pem"
    ]
  },
  "retrieval": {
    "bm25_k": 80,
    "vec_k": 80,
    "final_k": 20
  }
}
```

Edit `config.json` to use different models:
```jsonc
{
  "models": {
    "chat": "llama3.2:latest",  // bigger model if you have RAM
    "embed": "nomic-embed-text:latest"
  }
}
```

Edit `.devforge/ignore` to exclude specific files:

```text
# Custom ignores
secrets.txt
internal/**
*.tmp
```
Test that DevForge is working.

Initialize or refresh the repository context. Options:

- `--force` - Clear the cache and rebuild from scratch

Query your codebase. Options:

- `-k, --topk <n>` - Number of chunks to retrieve (default: 12)
- `--max-tokens <n>` - Max tokens for the answer (default: 512)
- `--model <name>` - Override the chat model
- `--show-sources` - Print source references
- `--no-refresh` - Skip the context refresh (faster; uses the existing index)
DevForge uses `fast-glob` to find files matching your include/exclude patterns, then computes a SHA-256 hash of each file for change detection.
Files are split into ~2000 character chunks with small overlap to preserve context across boundaries.
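A minimal sketch of that chunking strategy (the overlap value is an assumption, not DevForge's exact parameter):

```javascript
// Split text into ~`size`-character chunks, carrying `overlap` characters
// over between consecutive chunks so code split across a boundary still
// appears whole in at least one chunk.
function chunkText(text, size = 2000, overlap = 200) {
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break;
    start += size - overlap;
  }
  return chunks;
}
```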
Each chunk is embedded using `nomic-embed-text` (producing 768-dimensional vectors), and the vectors are stored in `.devforge/index.json`.
When you ask a question:
- Your question is embedded into the same vector space
- Cosine similarity scores all chunks
- Top-k most relevant chunks are retrieved
- Deduplication ensures diverse file coverage
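In code, the retrieval core is roughly this (a sketch; the per-file cap standing in for DevForge's deduplication is an assumption):

```javascript
// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Score every chunk against the query vector and keep the top k, allowing
// at most `perFile` chunks per file so one file cannot crowd out the rest
// of the context (a simple form of deduplication).
function topK(queryVec, chunks, k = 12, perFile = 3) {
  const scored = chunks
    .map((c) => ({ ...c, score: cosine(queryVec, c.vector) }))
    .sort((x, y) => y.score - x.score);
  const counts = {};
  const picked = [];
  for (const c of scored) {
    counts[c.file] = (counts[c.file] || 0) + 1;
    if (counts[c.file] <= perFile) picked.push(c);
    if (picked.length === k) break;
  }
  return picked;
}
```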
The context (retrieved chunks) + your question are sent to the chat model, which generates an answer with source citations.
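That final step can be pictured as simple prompt assembly (the wording here is hypothetical; only the `[file:start-end]` citation format comes from the source):

```javascript
// Label each retrieved chunk with its file:start-end location so the chat
// model can cite sources verbatim, then append the user's question.
function buildPrompt(question, chunks) {
  const context = chunks
    .map((c) => `[${c.file}:${c.start}-${c.end}]\n${c.text}`)
    .join("\n\n");
  return (
    "Answer using only the context below. " +
    "Cite sources as [file:start-end].\n\n" +
    `Context:\n${context}\n\nQuestion: ${question}`
  );
}
```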
On each subsequent `devforge read`, DevForge:
- Compares current file hashes to previous manifest
- Only re-embeds changed files
- Prunes removed files from index
- Keeps everything in sync
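The delta detection amounts to a three-way diff of the manifest maps (illustrative function name, not DevForge's actual code):

```javascript
// Compare the previous manifest (path → hash) against a fresh scan to
// decide what to (re-)embed and what to prune from the index.
function diffManifests(prev, next) {
  const added = [], changed = [], removed = [];
  for (const [path, hash] of Object.entries(next)) {
    if (!(path in prev)) added.push(path);
    else if (prev[path] !== hash) changed.push(path);
  }
  for (const path of Object.keys(prev)) {
    if (!(path in next)) removed.push(path);
  }
  return { added, changed, removed };
}
```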
Performance:

- Start with `gemma3:latest` (efficient, good quality)
- Upgrade to `llama3.2:latest` or `mixtral` if you have 16GB+ RAM
- Use `--no-refresh` for quick queries when the repo hasn't changed
Better Results:

- Be specific in questions: "How does X validate Y?" vs "Tell me about X"
- Increase `-k` for complex queries that span multiple files
- Use `--show-sources` to verify context relevance
Keep Context Fresh:

- Run `devforge read` after pulling changes
- Consider adding a git post-merge hook:

```shell
#!/bin/bash
# .git/hooks/post-merge
devforge read --silent
```
"Ollama is not running"

```shell
# Start Ollama server
ollama serve
```

"Missing model(s)"

```shell
ollama pull gemma3:latest
ollama pull nomic-embed-text:latest
```

"No vectors found"

```shell
# Initialize the repo first
devforge read
```

Slow embeddings on large repos:

- Adjust `max_file_kb` in the config to skip huge files
- Add more patterns to `.devforge/ignore`
- Embeddings are a one-time cost; subsequent updates are incremental

Poor answer quality:

- Try increasing `-k` to retrieve more context
- Use `--show-sources` to check whether relevant code was retrieved
- Consider using a larger chat model
```text
.devforge/
├── config.json    # Configuration
├── ignore         # Additional ignore patterns
├── manifest.json  # File metadata + hashes
└── index.json     # Vector embeddings
```

```text
src/
├── cli.js         # Command-line interface
├── read.js        # Init, scan, embed
├── chat.js        # Query & retrieval
├── scan.js        # File scanning + hashing
├── embeddings.js  # Chunking + embedding
└── ollama.js      # Ollama API checks
```
This started as a weekend project, but contributions are welcome!
- Fork the repo
- Create a feature branch
- Make your changes
- Submit a PR
ISC
Built out of frustration with cloud tools that never stayed in sync with my local repo. Wanted something dead simple that just works - no APIs, no copy-pasting, no nonsense.
The name "DevForge" came to me while crafting armor in Skyrim. Sometimes the best tools are the ones you build while vibe-coding on weekends.
Made with ❤️ and weekend energy
