A Python implementation of Bitcask, a log-structured hash table for fast key-value storage, originally developed by Basho for Riak.
- Fast writes: Append-only design ensures O(1) disk writes
- Fast reads: In-memory hash table provides O(1) lookups
- Crash recovery: Append-only logs enable consistent recovery
- Automatic compaction: Merge process reclaims space from deleted/overwritten keys
- Hint files: Speed up startup time by avoiding full data file scans
- TTL Support: Automatic key expiration with time-to-live settings
- String Support: Automatic conversion between strings and bytes with configurable encoding
- Snapshots: Create thread-safe read-only point-in-time views of the database
- Single writer, multiple readers: Ensures consistency while allowing concurrent reads
- Configurable modes: Read-only, write-only, and sync-on-put options
- MutableMapping interface: Dict-like API for ease of use
```bash
# Clone the repository
git clone <repository-url>
cd bitcask

# Install with development dependencies
pip install -e ".[dev,test]"

# Or just install the package
pip install -e .

# Run tests to verify installation
make test
```

```python
from bitcask import Bitcask

# Basic usage - supports both bytes and strings
with Bitcask("mydb") as db:
    # Write data (strings are auto-converted to bytes)
    db.put("key", "value")

    # Read data (returns bytes)
    value = db.get("key")
    print(value)  # b"value"

    # Can also use bytes directly
    db.put(b"binary_key", b"binary\x00value")

    # Delete data
    db.delete("key")

    # List all keys (returns bytes)
    for key in db.keys():
        print(key)
```

Store a key-value pair with optional TTL (time-to-live) in seconds. Strings are automatically converted to bytes. Returns the write timestamp.
```python
# Using strings (auto-converted to bytes)
timestamp = db.put("user:123", "John Doe")
timestamp = db.put("session:456", "data", ttl=3600)  # Expires in 1 hour

# Using bytes directly
timestamp = db.put(b"binary_key", b"binary\x00value")
```

Retrieve a value by key. Strings are auto-converted to bytes for lookup. Always returns bytes. Raises KeyError if not found.
```python
try:
    value = db.get("user:123")  # Returns b"John Doe"
    # Decode if you need a string
    text = value.decode('utf-8')
except KeyError:
    print("Key not found")
```

Mark a key as deleted (tombstone). Strings are auto-converted to bytes. Raises KeyError if not found.
db.delete("user:123") # String key
db.delete(b"binary_key") # Bytes keyReturn all active keys (excludes deleted keys).
all_keys = db.keys()Fold over all key-value pairs.
```python
def sum_values(key, value, acc):
    return acc + int(value)

total = db.fold(sum_values, 0)
```

Set TTL on an existing key. Strings are auto-converted to bytes. Returns the write timestamp.

```python
db.expire("user:123", 3600)  # Expire in 1 hour
```

Get remaining TTL in seconds. Strings are auto-converted to bytes. Returns -1 if the key has no TTL, and -2 if the key doesn't exist.
```python
remaining = db.ttl("session:456")
if remaining > 0:
    print(f"Expires in {remaining} seconds")
elif remaining == -1:
    print("No TTL set")  # key is persistent
# remaining == -2 means the key does not exist
```

Remove TTL from a key, making it persistent. Strings are auto-converted to bytes.

```python
db.persist("session:456")  # Remove expiration
```

Create a read-only snapshot of the current database state. Returns a new Bitcask instance with a deep copy of the keydir, providing a consistent point-in-time view that remains unchanged even as the original database is modified. Each snapshot has its own file handle and lock for thread-safe concurrent access.
with Bitcask("mydb") as db:
db.put("key", "value1")
# Create point-in-time snapshot
snap = db.snapshot()
# Modify original
db.put("key", "value2")
# Snapshot still sees original value
assert snap.get("key") == b"value1"
assert db.get("key") == b"value2"
# Clean up snapshot when done
snap.close()Bitcask(
path="mydb", # Database directory
read_only=False, # Open in read-only mode
write_only=False, # Open in write-only mode (no reads)
sync_on_put=False, # Sync to disk after each write
max_file_size=1048576, # Max size before file rotation (1MB default)
encoding='utf-8' # Encoding for string to bytes conversion
)# Automatic open/close with context manager
with Bitcask("mydb") as db:
db.put("key", "value") # Strings are auto-converted
# Manual open/close
db = Bitcask("mydb")
db.open()
db.put("key", "value")
db.close()Bitcask implements Python's MutableMapping protocol:
with Bitcask("mydb") as db:
# Dict-like operations with strings or bytes
db["key"] = "value" # Strings auto-converted
value = db["key"] # Returns bytes
del db["key"] # Delete
# Mix strings and bytes
db[b"binary"] = b"data"
db["text"] = "content"
# Iteration (keys are always bytes)
for key in db:
print(key, db[key])
# Length
print(len(db))
# Membership test (strings auto-converted)
if "key" in db:
print("Key exists")Perfect for read-heavy workloads with shared access:
with Bitcask("mydb", read_only=True) as reader:
value = reader.get("key") # Strings work here too
# reader.put("key", "value") # Raises ReadOnly exceptionOptimized for high-throughput append operations (logging, ingestion):
with Bitcask("mydb", write_only=True) as writer:
writer.put("log1", "data") # Fast, no keydir overhead
# writer.get("log1") # Raises WriteOnly exceptionEnsures durability for critical data:
with Bitcask("mydb", sync_on_put=True) as db:
db.put("critical", "data") # Immediately synced to diskReclaim space by removing old versions and deleted keys:
db = Bitcask("mydb")
db.merge() # Must be called when database is closedmydb/
├── bitcask.data # Active write file
├── bitcask.data.lock # Write lock file
├── 1.bitcask.data # Immutable data file
├── 1.bitcask.hint # Hint file for faster loading
├── 2.bitcask.data # Immutable data file
└── 2.bitcask.hint # Hint file
Each data-file record is laid out as:

```text
+-----+--------+-------+-------+-----+-------+
| CRC | Tstamp | KeySz | ValSz | Key | Value |
+-----+--------+-------+-------+-----+-------+
| 4B  | 4B     | 2B    | 4B    | var | var   |
+-----+--------+-------+-------+-----+-------+
```
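For illustration, a record with this layout can be assembled with Python's struct module. This is a sketch, not the package's actual serializer; the big-endian byte order and the assumption that the CRC covers everything after the CRC field are mine:

```python
import struct
import time
import zlib

def pack_entry(key: bytes, value: bytes) -> bytes:
    """Sketch: CRC | Tstamp | KeySz | ValSz | Key | Value (big-endian assumed)."""
    ts = int(time.time())
    body = struct.pack(">IHI", ts, len(key), len(value)) + key + value
    crc = zlib.crc32(body)  # assumed: CRC32 over everything after the CRC field
    return struct.pack(">I", crc) + body

entry = pack_entry(b"user:123", b"John Doe")
assert len(entry) == 14 + 8 + 8  # 14-byte fixed header plus key and value bytes
```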
Each hint-file record stores the value's position instead of the value itself:

```text
+-----+--------+-------+-------+----------+-----+
| CRC | Tstamp | KeySz | ValSz | ValuePos | Key |
+-----+--------+-------+-------+----------+-----+
| 4B  | 4B     | 2B    | 4B    | 8B       | var |
+-----+--------+-------+-------+----------+-----+
```
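Hint records are what let startup skip scanning full data files: they carry enough information to rebuild the keydir directly. A minimal sketch, assuming the same big-endian layout; `scan_hints` and the keydir tuple shape are hypothetical, not the package's API:

```python
import struct

HINT_HEADER = ">IIHIQ"  # CRC | Tstamp | KeySz | ValSz | ValuePos, per the table above

def scan_hints(buf: bytes, file_id: int, keydir: dict) -> None:
    """Rebuild keydir entries from a hint file's raw bytes (illustrative only)."""
    header_size = struct.calcsize(HINT_HEADER)  # 22 bytes
    offset = 0
    while offset + header_size <= len(buf):
        _crc, ts, ksz, vsz, vpos = struct.unpack_from(HINT_HEADER, buf, offset)
        key = buf[offset + header_size : offset + header_size + ksz]
        # Assumed keydir shape: key -> (file_id, value_size, value_position, timestamp)
        keydir[key] = (file_id, vsz, vpos, ts)
        offset += header_size + ksz
```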
- Thread-Safe: All operations are protected by an internal RLock
- Multi-threaded: Safe to share a single Bitcask instance across threads
- Performance: ~8% overhead for writes, ~20% for reads vs single-threaded
- Single Writer: Only one process can write at a time (enforced via file lock)
- Multiple Readers: Any number of processes can read simultaneously
- Snapshot Isolation: Readers see a point-in-time snapshot
```python
import threading

from bitcask import Bitcask

# Thread-safe: single instance shared by multiple threads
db = Bitcask("mydb")
db.open()

def worker(thread_id):
    for i in range(1000):
        db.put(f"key_{thread_id}_{i}".encode(), b"value")
        value = db.get(f"key_{thread_id}_{i}".encode())

# Safe to use from multiple threads
threads = [threading.Thread(target=worker, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
db.close()
```

```python
# Process-safe: multiple processes

# Writer process
with Bitcask("mydb").open_context() as writer:
    writer.put(b"key", b"value")

# Multiple reader processes (can run simultaneously)
with Bitcask("mydb", read_only=True).open_context() as reader:
    value = reader.get(b"key")
```

- Keydir stores all keys in memory: ~145 bytes of overhead per key in Python (see the sizing sketch after this list)
- 1 million keys ≈ 145MB RAM
- Use write-only mode for append-only workloads to avoid keydir overhead
- Snapshots deep-copy the keydir, doubling memory for keys (data files are shared)
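A quick back-of-envelope sizing helper based on the ~145-bytes-per-key figure above; the constant is an approximation and `estimate_keydir_mb` is illustrative, not part of the API:

```python
KEYDIR_BYTES_PER_KEY = 145  # rough per-key overhead in Python, per the note above

def estimate_keydir_mb(num_keys: int, live_snapshots: int = 0) -> float:
    """Estimate keydir RAM in MB; each live snapshot deep-copies the keydir."""
    return num_keys * KEYDIR_BYTES_PER_KEY * (1 + live_snapshots) / 1_000_000

print(estimate_keydir_mb(1_000_000))                    # ~145 MB
print(estimate_keydir_mb(1_000_000, live_snapshots=1))  # ~290 MB
```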
- Sequential writes to append-only log
- Configurable file size for rotation
- Optional sync-on-put for durability vs speed trade-off
- Thread-safe with ~8% overhead vs single-threaded
- O(1) key lookups via in-memory hash table
- Single disk seek per read (see the read-path sketch below)
- Hint files speed up startup time
- Thread-safe with ~20% overhead vs single-threaded
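To make the read path concrete, here is a minimal sketch of the model described above; the function and the keydir tuple shape are illustrative, not the module's internals:

```python
def read_value(keydir: dict, data_files: dict, key: bytes) -> bytes:
    """One hash lookup, one seek, one read: the whole read path."""
    file_id, value_size, value_pos, _ts = keydir[key]  # O(1) in-memory lookup
    f = data_files[file_id]    # already-open handle to an immutable data file
    f.seek(value_pos)          # the single disk seek
    return f.read(value_size)  # read exactly the value bytes
```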
On a typical development machine:
- Sequential writes: ~50,000 ops/sec
- Random reads: ~100,000 ops/sec
- Mixed workload: ~30,000 ops/sec
Run `make benchmark` to test on your hardware.
The project uses pytest for testing with comprehensive test coverage:
```bash
# Run all tests with pytest
make test

# Run with verbose output
make test-verbose

# Run with coverage report
make test-coverage

# Run tests in parallel (faster)
make test-parallel

# Run a specific test file
env/bin/python -m pytest bitcask/__tests__/test_basic_operations.py

# Clean up test artifacts
make clean

# Run benchmarks
make benchmark           # Standard benchmark
make benchmark-threaded  # Multi-threaded benchmark

# Show all available targets
make help
```

Test coverage includes:
- ✅ Basic CRUD operations (9 tests)
- ✅ File rotation and merging (8 tests)
- ✅ Corruption recovery (10 tests)
- ✅ Concurrent access and thread safety (14 tests)
- ✅ Edge cases (empty values, large keys) (10 tests)
- ✅ Mode restrictions (8 tests)
- ✅ Error handling (10 tests)
- ✅ TTL support (12 tests)
- ✅ String conversion (10 tests)
- ✅ Snapshot functionality (11 tests)
- ✅ Legacy compatibility tests (24 tests)
Total: 128 tests with ~85% code coverage
```python
import threading

from bitcask import Bitcask

def analyze_snapshot(snap):
    """Analyze data in a snapshot without affecting the live database"""
    total_keys = len(snap.keys())
    print(f"Snapshot has {total_keys} keys")
    # Perform analysis on a consistent view
    for key in snap.keys():
        value = snap.get(key)
        # Process data...

with Bitcask("production_db") as db:
    # Add initial data
    for i in range(1000):
        db.put(f"key{i}", f"value{i}")

    # Create snapshot for analysis
    snap = db.snapshot()

    # Multiple threads can safely read from the same snapshot
    threads = []
    for i in range(5):
        thread = threading.Thread(target=analyze_snapshot, args=(snap,))
        threads.append(thread)
        thread.start()

    # Continue modifying the database while snapshot analysis runs
    for i in range(1000):
        db.put(f"key{i}", f"modified{i}")

    # Wait for analysis to complete
    for thread in threads:
        thread.join()

    # The snapshot saw the original state, not the modifications;
    # close the shared snapshot once all readers are done
    snap.close()
```

A simple cache built on Bitcask:

```python
class Cache:
    def __init__(self, path):
        self.db = Bitcask(path)
        self.db.open()

    def set(self, key: str, value: str, ttl=None):
        # Strings are auto-converted, TTL is supported
        self.db.put(key, value, ttl=ttl)

    def get(self, key: str):
        try:
            # get() returns bytes, decode to string
            return self.db.get(key).decode('utf-8')
        except KeyError:
            return None

    def close(self):
        self.db.close()
```

High-speed log ingestion, where strings work seamlessly:

```python
import time

with Bitcask("logs", write_only=True) as logs:
    for i in range(1_000_000):
        timestamp = time.time()
        logs.put(f"log:{timestamp}", log_data)  # String key; log_data is the application's payload

# Later, process logs
with Bitcask("logs", read_only=True) as logs:
    for key in logs.keys():  # Keys are bytes
        if key.startswith(b"log:"):
            process_log(logs.get(key))
```

A session store with automatic expiry:

```python
import json

class SessionStore:
    def __init__(self):
        self.db = Bitcask("sessions", sync_on_put=True)
        self.db.open()

    def save_session(self, session_id: str, data: dict, ttl=3600):
        key = f"session:{session_id}"  # String key
        value = json.dumps(data)       # String value
        # Sessions automatically expire after TTL seconds
        self.db.put(key, value, ttl=ttl)

    def get_session(self, session_id: str):
        key = f"session:{session_id}"
        try:
            value = self.db.get(key)  # Returns bytes
            return json.loads(value.decode('utf-8'))
        except KeyError:
            return None  # Session expired or doesn't exist

    def extend_session(self, session_id: str, ttl=3600):
        key = f"session:{session_id}"
        self.db.expire(key, ttl)  # Strings work with all methods

    def delete_session(self, session_id: str):
        key = f"session:{session_id}"
        self.db.delete(key)
```

A cache with a configurable encoding:

```python
class Cache:
def __init__(self, path="cache", encoding='utf-8'):
# Pass custom encoding if needed
self.db = Bitcask(path, encoding=encoding)
self.db.open()
def set(self, key: str, value: str, ttl=300):
# Default 5-minute TTL, strings auto-converted
self.db.put(key, value, ttl=ttl)
def get(self, key: str) -> str:
try:
return self.db.get(key).decode(self.db._encoding)
except KeyError:
return None # Cache miss or expired
def close(self):
self.db.close()- Key Size: Maximum 65KB (2-byte length field)
- Value Size: Maximum 4GB (4-byte length field)
- Memory: All keys must fit in memory
- No Transactions: Single key operations only
- Single Machine: No built-in replication/distribution
This implementation follows the Bitcask paper with some Python-specific adaptations:
- CRC Validation: All entries are CRC32-protected (see the sketch after this list)
- Crash Recovery: Corrupted entries are skipped, with logging
- File Locking: OS-level locks prevent multiple writers
- Tombstones: Deleted keys marked with special value
- Hint Files: Created during merge for faster startup
- Python Optimizations: Uses bytearray for efficient I/O
- Snapshots: Each snapshot gets its own file handle and lock for thread-safe access
- TTL Support: Value-embedded expiration timestamps with lazy deletion
- String Support: Automatic encoding/decoding with configurable character sets
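As a sketch of how CRC-protected recovery can work (illustrative only; it assumes the data-file layout shown earlier and that the CRC covers the rest of the record):

```python
import struct
import zlib

def read_record(buf: bytes, offset: int):
    """Validate one record during a recovery scan; a corrupt entry raises and is skipped."""
    crc, ts, ksz, vsz = struct.unpack_from(">IIHI", buf, offset)
    end = offset + 14 + ksz + vsz
    if zlib.crc32(buf[offset + 4 : end]) != crc:
        raise ValueError(f"corrupt entry at offset {offset}")
    key = buf[offset + 14 : offset + 14 + ksz]
    value = buf[offset + 14 + ksz : end]
    return key, value, ts, end  # caller resumes scanning at `end`
```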
To contribute:

- Install development dependencies: `pip install -e ".[dev,test]"`
- Run existing tests: `make test`
- Add tests for new features in `bitcask/__tests__/`
- Ensure all tests pass and coverage is maintained
- Follow the existing code style (no comments unless necessary)
- Run benchmarks to ensure performance is not degraded
See LICENSE
Based on the Bitcask paper by Justin Sheehy and David Smith at Basho Technologies.