A comprehensive Docker-based Bitcoin blockchain analysis platform combining Bitcoin Core, Neo4j graph database, GraphQL API, and Python analysis tools for private blockchain research and chain analysis.
- Bitcoin Core Full Node: Private node with minimal network participation
- Neo4j Graph Database: Transaction graph for network analysis
- GraphQL API: Unified query interface for Bitcoin + Neo4j data
- Electrs: Fast UTXO indexing and queries
- Jupyter Notebooks: Interactive analysis environment
- Python Analysis Tools: Pre-built scripts for address clustering and chain analysis
    ┌─────────────────┐      ┌──────────────┐      ┌─────────────┐
    │  Bitcoin Core   │◄────►│  Neo4j Graph │◄────►│   GraphQL   │
    │  (Full Node)    │      │  (Tx Graph)  │      │  API Server │
    └────────┬────────┘      └──────────────┘      └──────┬──────┘
             │                                            │
             │               ┌──────────────┐             │
             └──────────────►│   Electrs    │             │
                             │  (Indexer)   │             │
                             └──────────────┘             │
                                                          │
             ┌────────────────────────────────────────────┘
             │
             ▼
    ┌─────────────────────────────────────────────────┐
    │       Jupyter Notebooks + Analysis Tools        │
    │  • Address clustering   • UTXO tracking         │
    │  • Transaction flow     • Network visualization │
    └─────────────────────────────────────────────────┘
- Docker & Docker Compose (v2.0+)
- Storage: ~2TB for full setup (600GB Bitcoin + 600GB Electrs + 600GB Neo4j + 200GB overhead)
- RAM: 16GB minimum, 32GB recommended
- CPU: 4+ cores recommended
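
Before starting a multi-day sync it can be worth checking the host against these figures. The following is a minimal sketch, not part of the repository: `check_resources` is a hypothetical helper, the thresholds are copied from the list above, and RAM is left out because the Python standard library has no portable way to read it.

```python
import os
import shutil

# Figures taken from the requirements list above.
REQUIRED_DISK_GB = 2000  # ~2TB for the full setup
MIN_CPU_CORES = 4


def check_resources(free_disk_gb: float, cpu_cores: int) -> list[str]:
    """Return human-readable warnings; an empty list means the host looks adequate."""
    warnings = []
    if free_disk_gb < REQUIRED_DISK_GB:
        warnings.append(
            f"Only {free_disk_gb:.0f} GB free; about {REQUIRED_DISK_GB} GB is needed."
        )
    if cpu_cores < MIN_CPU_CORES:
        warnings.append(f"Only {cpu_cores} CPU cores; {MIN_CPU_CORES}+ recommended.")
    return warnings


if __name__ == "__main__":
    # "." is an assumption: point this at the disk that will hold the Docker volumes.
    free_gb = shutil.disk_usage(".").free / 1e9
    for line in check_resources(free_gb, os.cpu_count() or 1):
        print("WARNING:", line)
```

Keeping the comparison in a pure function makes it easy to reuse with figures gathered some other way, for example from a remote host.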
# Clone the repository
git clone <your-repo-url>
cd bitcoin-analysis-stack
# Copy environment template
cp .env.example .env
# Edit configuration (change passwords!)
nano .env

# Start all services
docker-compose up -d
# Check status
docker-compose ps
# View logs
docker-compose logs -f

Bitcoin Core will take several days to sync the entire blockchain. Monitor progress:
# Check Bitcoin sync status
docker-compose exec bitcoin bitcoin-cli getblockchaininfo
# Check Neo4j importer progress
docker-compose logs -f btc-importer

Once synced, access:
- Jupyter Notebooks: http://localhost:8888
- Neo4j Browser: http://localhost:7474 (login: neo4j/bitcoin123)
- GraphQL Playground: http://localhost:8000/graphql
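
The `getblockchaininfo` output used to monitor the sync can also be condensed into a one-line progress report. This is a sketch under assumptions: `summarize_sync` and `fetch_sync_summary` are hypothetical helpers, and the RPC URL reuses the default credentials from `.env.example`. The fields read (`blocks`, `headers`, `verificationprogress`, `initialblockdownload`) are part of Bitcoin Core's `getblockchaininfo` result.

```python
def summarize_sync(info: dict) -> str:
    """Summarise a getblockchaininfo result as a one-line progress report."""
    blocks = info["blocks"]
    headers = info["headers"]
    # verificationprogress is Bitcoin Core's own estimate in the range 0..1.
    percent = info["verificationprogress"] * 100
    state = "syncing" if info["initialblockdownload"] else "synced"
    return f"{state}: block {blocks}/{headers} ({percent:.2f}% verified)"


def fetch_sync_summary(rpc_url: str = "http://btcuser:btcpass@localhost:8332") -> str:
    """Query the node; needs python-bitcoinrpc installed and a reachable RPC port."""
    from bitcoinrpc.authproxy import AuthServiceProxy

    return summarize_sync(AuthServiceProxy(rpc_url).getblockchaininfo())
```

Calling `print(fetch_sync_summary())` from a notebook or the host should give a quick answer to "is it done yet" without reading the full JSON.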
from bitcoinrpc.authproxy import AuthServiceProxy
btc = AuthServiceProxy("http://btcuser:btcpass@localhost:8332")
# Get blockchain info
info = btc.getblockchaininfo()
print(f"Blocks: {info['blocks']}")
# Get specific transaction
tx = btc.getrawtransaction("txid_here", True)

// Find most active addresses
MATCH (a:Address)<-[r:OUTPUTS_TO]-(t:Transaction)
RETURN a.address, count(t) as tx_count, sum(r.value) as total_received
ORDER BY tx_count DESC
LIMIT 10;
// Find transaction path between addresses
MATCH path = shortestPath(
(a1:Address {address: 'addr1'})-[:OUTPUTS_TO|SPENT_IN*..10]-(a2:Address {address: 'addr2'})
)
RETURN path;
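// Cluster addresses by common spending
// (assumes SPENT_IN points from a funding transaction to the transaction that spends it)
MATCH (a1:Address)<-[:OUTPUTS_TO]-(:Transaction)-[:SPENT_IN]->
(:Transaction)<-[:SPENT_IN]-(:Transaction)-[:OUTPUTS_TO]->(a2:Address)
WHERE a1 <> a2
RETURN a1.address, collect(DISTINCT a2.address) as cluster
LIMIT 10;

query {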
blockchainInfo {
blocks
chain
difficulty
}
block(height: 800000) {
hash
time
txCount
}
addressInfo(address: "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa") {
balance
txCount
firstSeen
}
addressConnections(address: "...", limit: 10) {
fromAddress
toAddress
totalAmount
txCount
}
}

# Analyze specific address
docker-compose exec jupyter python /home/jovyan/scripts/analyze_address.py 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
# Or from host (if you have dependencies installed)
python scripts/analyze_address.py <address>

    bitcoin-analysis-stack/
    ├── docker-compose.yml            # Main orchestration file
    ├── .env.example                  # Environment template
    ├── config/
    │   └── bitcoin.conf              # Bitcoin Core configuration
    ├── services/
    │   ├── importer/                 # Bitcoin → Neo4j importer
    │   │   ├── Dockerfile
    │   │   ├── importer.py
    │   │   └── requirements.txt
    │   ├── graphql/                  # GraphQL API server
    │   │   ├── Dockerfile
    │   │   ├── server.py
    │   │   └── requirements.txt
    │   └── blocksci/                 # BlockSci analysis (optional)
    │       └── Dockerfile
    ├── scripts/
    │   └── analyze_address.py        # Address analysis tool
    ├── notebooks/
    │   └── 01_getting_started.ipynb  # Tutorial notebook
    └── README.md
# Bitcoin RPC
BITCOIN_RPC_USER=btcuser
BITCOIN_RPC_PASSWORD=btcpass
# Neo4j
NEO4J_USER=neo4j
NEO4J_PASSWORD=bitcoin123
NEO4J_HEAP_SIZE=4G
# Importer
IMPORT_START_BLOCK=0
IMPORT_BATCH_SIZE=100
IMPORT_MODE=continuous

Key settings:
- listen=0: Don't accept incoming connections
- maxconnections=8: Minimal network participation
- txindex=1: Required for full transaction lookup
- prune=0: Keep the full blockchain (Bitcoin Core does not allow pruning together with txindex=1, so setting prune=550 also requires txindex=0)
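
Since a pruned node cannot also run with `txindex=1`, checking `config/bitcoin.conf` before a restart can save a failed startup. The sketch below is illustrative only: `parse_conf` and `check_conf` are hypothetical helpers, and the parser handles just simple `key=value` lines with `#` comments.

```python
def parse_conf(text: str) -> dict[str, str]:
    """Parse key=value lines, ignoring blanks, comments, and [section] headers."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("[") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def check_conf(settings: dict[str, str]) -> list[str]:
    """Flag combinations that conflict with the key settings listed above."""
    problems = []
    pruned = settings.get("prune", "0") != "0"
    if pruned and settings.get("txindex", "0") == "1":
        problems.append("prune and txindex=1 cannot be combined")
    if settings.get("listen", "1") != "0":
        problems.append("listen is not 0: the node accepts incoming connections")
    return problems
```

For example, `check_conf(parse_conf(open("config/bitcoin.conf").read()))` returns an empty list when the file matches the settings described above.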
# Start all services
docker-compose up -d
# Start specific service
docker-compose up -d bitcoin neo4j
# Stop all services
docker-compose down
# Restart service
docker-compose restart btc-importer
# View logs
docker-compose logs -f bitcoin
docker-compose logs -f neo4j
docker-compose logs -f btc-importer

# Backup Bitcoin data
docker-compose stop bitcoin
docker run --rm -v bitcoin-analysis-stack_bitcoin_data:/data -v $(pwd)/backups:/backup alpine tar czf /backup/bitcoin-backup.tar.gz /data
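# Backup Neo4j data
# (neo4j-admin dump needs the database to be offline; on Neo4j 5 the equivalent command is neo4j-admin database dump)
docker-compose exec neo4j neo4j-admin dump --database=neo4j --to=/data/neo4j-backup.dump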
# Clean up everything (⚠️ DELETES ALL DATA)
docker-compose down -v

# Bitcoin Core CLI
docker-compose exec bitcoin bitcoin-cli getblockcount
docker-compose exec bitcoin bitcoin-cli getpeerinfo
# Neo4j Cypher Shell
docker-compose exec neo4j cypher-shell -u neo4j -p bitcoin123
# GraphQL health check
curl http://localhost:8000/health

Edit config/bitcoin.conf:
dbcache=4096 # Increase for faster sync (MB)
par=8 # Parallel script verification threads
maxmempool=1000 # Max mempool size (MB)

Edit .env:
NEO4J_HEAP_SIZE=8G # Increase for better performance
NEO4J_PAGECACHE=4G # Cache for graph data

Edit .env:
IMPORT_BATCH_SIZE=500 # Process more blocks at once
IMPORT_START_BLOCK=800000 # Skip old blocks

Identify addresses controlled by the same entity using the common-input-ownership heuristic:
from py2neo import Graph
graph = Graph("bolt://localhost:7687", auth=("neo4j", "bitcoin123"))
# Find co-spent addresses (likely same wallet)
query = """
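// Both funding transactions must feed the same spending transaction.
// Assumes SPENT_IN points from a funding transaction to the one that spends it.
MATCH (a1:Address)<-[:OUTPUTS_TO]-(:Transaction)-[:SPENT_IN]->
(spend:Transaction)<-[:SPENT_IN]-(:Transaction)-[:OUTPUTS_TO]->(a2:Address)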
WHERE a1 <> a2
RETURN a1.address, collect(DISTINCT a2.address) as cluster
"""
result = graph.run(query).data()

Track BTC flow through the network:
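# Find payment paths from one address to another (up to 10 spending hops).
# OUTPUTS_TO points from a transaction to an address, so the path leaves a1
# against that edge and then follows SPENT_IN forward.
query = """
MATCH path = (a1:Address {address: $from})<-[:OUTPUTS_TO]-(:Transaction)
      -[:SPENT_IN*1..10]->(:Transaction)-[:OUTPUTS_TO]->(a2:Address {address: $to})
RETURN path
LIMIT 10
"""
# 'addr1' and 'addr2' are placeholders; reuses the graph connection created above
result = graph.run(query, {"from": "addr1", "to": "addr2"}).data()

Query unspent outputs via Electrs or Bitcoin Core RPC.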
Use NetworkX or Pyvis to visualize transaction graphs (see notebooks).
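
For the address clustering use case, the pairwise co-spend results still need to be merged into entity-level clusters before they are visualized. One common way to do that is union-find. The following is a minimal sketch: `cluster_addresses` is a hypothetical helper, not something shipped in `scripts/`.

```python
def cluster_addresses(pairs):
    """Merge co-spent address pairs into clusters using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    # Union the two sides of every co-spend pair.
    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Group every address under its final root.
    clusters = {}
    for addr in list(parent):
        clusters.setdefault(find(addr), set()).add(addr)
    return list(clusters.values())
```

Assuming the clustering query returns rows keyed `a1.address` and `cluster`, the input pairs could be built with `[(row["a1.address"], other) for row in result for other in row["cluster"]]`. Each resulting set can then be drawn as one node group in NetworkX or Pyvis.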
- Initial sync time: 3-7 days for full Bitcoin blockchain
- Storage: ~2TB required for complete setup (Bitcoin + Electrs + Neo4j)
- Neo4j size: Graph database is similar in size to blockchain (~600GB) due to relationship storage
- BlockSci: Requires manual compilation (Dockerfile is placeholder)
- Privacy: While minimizing network participation, your node still connects to peers
- Change default passwords in .env
- Don't expose RPC/GraphQL ports to the public internet
- Use firewalls to restrict access
- This is for research only, not production use
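
Leftover default passwords are easy to miss after copying `.env.example`. As a rough sketch, the check below scans the contents of `.env` for the two default values shown in the configuration section; `find_default_passwords` is a hypothetical helper and only understands plain `KEY=value` lines.

```python
# Defaults as they appear in the environment template above.
DEFAULT_SECRETS = {
    "BITCOIN_RPC_PASSWORD": "btcpass",
    "NEO4J_PASSWORD": "bitcoin123",
}


def find_default_passwords(env_text: str) -> list[str]:
    """Return the names of variables that still hold a default password."""
    found = []
    for raw in env_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop a trailing " # comment" and surrounding quotes, if any.
        value = value.split(" #", 1)[0].strip().strip("'\"")
        if DEFAULT_SECRETS.get(key.strip()) == value:
            found.append(key.strip())
    return found
```

Running `find_default_passwords(open(".env").read())` before `docker-compose up -d` gives an empty list once both passwords have been changed.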
# Check logs
docker-compose logs bitcoin
# Verify connectivity
docker-compose exec bitcoin bitcoin-cli getpeerinfo
# Increase connections
# Edit config/bitcoin.conf: maxconnections=16

# Increase heap size in .env
NEO4J_HEAP_SIZE=8G
# Restart
docker-compose restart neo4j

# Check if Bitcoin is synced
docker-compose exec bitcoin bitcoin-cli getblockchaininfo
# Check Neo4j connection
docker-compose logs btc-importer
# Restart importer
docker-compose restart btc-importer

# Check service status
curl http://localhost:8000/health
# Check logs
docker-compose logs graphql
# Restart
docker-compose restart graphql

Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Submit a pull request
MIT License - See LICENSE file for details
| Service | Port | Purpose |
|---|---|---|
| Bitcoin Core RPC | 8332 | Blockchain queries |
| Neo4j Browser | 7474 | Graph UI |
| Neo4j Bolt | 7687 | Graph queries |
| GraphQL API | 8000 | Unified API |
| Jupyter | 8888 | Analysis notebooks |
| Electrs | 50001 | UTXO indexer |
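
A quick way to see which of these services are up is to try a TCP connection to each port. This is a sketch only: `port_open` and `check_services` are hypothetical helpers, and it assumes every port in the table is published on `localhost`.

```python
import socket

# Ports copied from the table above.
SERVICE_PORTS = {
    "Bitcoin Core RPC": 8332,
    "Neo4j Browser": 7474,
    "Neo4j Bolt": 7687,
    "GraphQL API": 8000,
    "Jupyter": 8888,
    "Electrs": 50001,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="localhost", ports=SERVICE_PORTS, probe=port_open):
    """Map each service name to whether its port accepted a connection."""
    return {name: probe(host, port) for name, port in ports.items()}
```

`check_services()` returns a dictionary such as `{"Jupyter": True, ...}`. An open port only shows that something is listening, so the service-specific checks (for example the GraphQL `/health` endpoint) are still useful.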
- Start with notebooks/01_getting_started.ipynb
- Explore Neo4j Browser with sample queries
- Try GraphQL Playground queries
- Run analyze_address.py on known addresses
- Build custom analysis scripts
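
Queries tried in the GraphQL Playground can also be sent from a script. The sketch below uses only the standard library and assumes the server accepts the usual GraphQL-over-HTTP convention (a JSON POST with a `query` field) at the Playground URL. `build_graphql_request` and `run_query` are hypothetical helpers.

```python
import json
import urllib.request

GRAPHQL_URL = "http://localhost:8000/graphql"  # same URL as the Playground


def build_graphql_request(query: str, variables=None, url: str = GRAPHQL_URL):
    """Build a JSON POST request carrying a GraphQL query."""
    payload = json.dumps({"query": query, "variables": variables or {}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query: str, variables=None) -> dict:
    """Send the query and return the decoded JSON response."""
    request = build_graphql_request(query, variables)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())
```

For example, `run_query("query { blockchainInfo { blocks chain } }")` should return the same data the Playground shows for that query, provided the stack is running.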
Note: This stack is designed for research and educational purposes. Use responsibly and respect privacy considerations when analyzing blockchain data.