Bitcask - Python Implementation

A Python implementation of Bitcask, a log-structured hash table for fast key-value storage, originally developed at Basho Technologies for Riak.

Features

  • Fast writes: Append-only design ensures O(1) disk writes
  • Fast reads: In-memory hash table provides O(1) lookups
  • Crash recovery: Append-only logs enable consistent recovery
  • Automatic compaction: Merge process reclaims space from deleted/overwritten keys
  • Hint files: Speed up startup time by avoiding full data file scans
  • TTL support: Automatic key expiration with time-to-live settings
  • String support: Automatic conversion between strings and bytes with configurable encoding
  • Snapshots: Create thread-safe read-only point-in-time views of the database
  • Single writer, multiple readers: Ensures consistency while allowing concurrent reads
  • Configurable modes: Read-only, write-only, and sync-on-put options
  • MutableMapping interface: Dict-like API for ease of use

Installation

# Clone the repository
git clone <repository-url>
cd bitcask

# Install with development dependencies
pip install -e ".[dev,test]"

# Or just install the package
pip install -e .

# Run tests to verify installation
make test

Quick Start

from bitcask import Bitcask

# Basic usage - supports both bytes and strings
with Bitcask("mydb") as db:
    # Write data (strings are auto-converted to bytes)
    db.put("key", "value")
    
    # Read data (returns bytes)
    value = db.get("key")
    print(value)  # b"value"
    
    # Can also use bytes directly
    db.put(b"binary_key", b"binary\x00value")
    
    # Delete data
    db.delete("key")
    
    # List all keys (returns bytes)
    for key in db.keys():
        print(key)

API Reference

Core Operations

put(key: bytes|str, value: bytes|str, ttl: int|None = None) -> int

Store a key-value pair with optional TTL (time-to-live) in seconds. Strings are automatically converted to bytes. Returns timestamp.

# Using strings (auto-converted to bytes)
timestamp = db.put("user:123", "John Doe")
timestamp = db.put("session:456", "data", ttl=3600)  # Expires in 1 hour

# Using bytes directly
timestamp = db.put(b"binary_key", b"binary\x00value")

get(key: bytes|str) -> bytes

Retrieve a value by key. Strings are auto-converted to bytes for lookup. Always returns bytes. Raises KeyError if not found.

try:
    value = db.get("user:123")  # Returns b"John Doe"
    # Decode if you need a string
    text = value.decode('utf-8')
except KeyError:
    print("Key not found")

delete(key: bytes|str)

Mark a key as deleted (tombstone). Strings are auto-converted to bytes. Raises KeyError if not found.

db.delete("user:123")  # String key
db.delete(b"binary_key")  # Bytes key

keys() -> List[bytes]

Return all active keys (excludes deleted keys).

all_keys = db.keys()

fold(function, initial) -> Any

Fold over all key-value pairs.

def sum_values(key, value, acc):
    # value is bytes; int() accepts ASCII digits in a bytes object
    return acc + int(value)

total = db.fold(sum_values, 0)

expire(key: bytes|str, ttl: int) -> int

Set TTL on an existing key. Strings are auto-converted to bytes. Returns timestamp.

db.expire("user:123", 3600)  # Expire in 1 hour

ttl(key: bytes|str) -> int

Get remaining TTL in seconds. Strings are auto-converted to bytes. Returns -1 if no TTL, -2 if key doesn't exist.

remaining = db.ttl("session:456")
if remaining > 0:
    print(f"Expires in {remaining} seconds")

persist(key: bytes|str) -> int

Remove TTL from a key, making it persistent. Strings are auto-converted to bytes. Returns timestamp.

db.persist("session:456")  # Remove expiration

snapshot() -> Bitcask

Create a read-only snapshot of the current database state. Returns a new Bitcask instance with a deep copy of the keydir, providing a consistent point-in-time view that remains unchanged even as the original database is modified. Each snapshot has its own file handle and lock for thread-safe concurrent access.

with Bitcask("mydb") as db:
    db.put("key", "value1")
    
    # Create point-in-time snapshot
    snap = db.snapshot()
    
    # Modify original
    db.put("key", "value2")
    
    # Snapshot still sees original value
    assert snap.get("key") == b"value1"
    assert db.get("key") == b"value2"
    
    # Clean up snapshot when done
    snap.close()

Configuration Options

Bitcask(
    path="mydb",              # Database directory
    read_only=False,          # Open in read-only mode
    write_only=False,         # Open in write-only mode (no reads)
    sync_on_put=False,        # Sync to disk after each write
    max_file_size=1048576,    # Max size before file rotation (1MB default)
    encoding='utf-8'          # Encoding for string to bytes conversion
)

Context Manager

# Automatic open/close with context manager
with Bitcask("mydb") as db:
    db.put("key", "value")  # Strings are auto-converted

# Manual open/close
db = Bitcask("mydb")
db.open()
db.put("key", "value")
db.close()

Dictionary Interface

Bitcask implements Python's MutableMapping protocol:

with Bitcask("mydb") as db:
    # Dict-like operations with strings or bytes
    db["key"] = "value"        # Strings auto-converted
    value = db["key"]          # Returns bytes
    del db["key"]              # Delete
    
    # Mix strings and bytes
    db[b"binary"] = b"data"
    db["text"] = "content"
    
    # Iteration (keys are always bytes)
    for key in db:
        print(key, db[key])
    
    # Length
    print(len(db))
    
    # Membership test (strings auto-converted)
    if "key" in db:
        print("Key exists")

Operating Modes

Read-Only Mode

Perfect for read-heavy workloads with shared access:

with Bitcask("mydb", read_only=True) as reader:
    value = reader.get("key")  # Strings work here too
    # reader.put("key", "value")  # Raises ReadOnly exception

Write-Only Mode

Optimized for high-throughput append operations (logging, ingestion):

with Bitcask("mydb", write_only=True) as writer:
    writer.put("log1", "data")  # Fast, no keydir overhead
    # writer.get("log1")  # Raises WriteOnly exception

Sync-On-Put Mode

Ensures durability for critical data:

with Bitcask("mydb", sync_on_put=True) as db:
    db.put("critical", "data")  # Immediately synced to disk

File Management

Merge (Compaction)

Reclaim space by removing old versions and deleted keys:

db = Bitcask("mydb")
db.merge()  # The database must be closed while merge runs
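
A fuller lifecycle, as a sketch against the API documented above: write while open, close, compact, then reopen.

from bitcask import Bitcask

with Bitcask("mydb") as db:
    for i in range(10_000):
        db.put(f"key{i}", "v1")
    for i in range(10_000):
        db.put(f"key{i}", "v2")   # old versions become garbage on disk
    for i in range(5_000):
        db.delete(f"key{i}")      # tombstones remain until merged

# The context manager has closed the database, so merging is safe
db = Bitcask("mydb")
db.merge()  # rewrites live entries; drops old versions and tombstones

with Bitcask("mydb", read_only=True) as db:
    print(len(db))  # 5000 live keys remain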

File Structure

mydb/
├── bitcask.data           # Active write file
├── bitcask.data.lock      # Write lock file
├── 1.bitcask.data         # Immutable data file
├── 1.bitcask.hint         # Hint file for faster loading
├── 2.bitcask.data         # Immutable data file
└── 2.bitcask.hint         # Hint file

Data Format

Data File Entry

+-----+----------+--------+--------+-----+-------+
| CRC | Tstamp   | KeySz  | ValSz  | Key | Value |
+-----+----------+--------+--------+-----+-------+
| 4B  | 4B       | 2B     | 4B     | var | var   |
+-----+----------+--------+--------+-----+-------+

Hint File Entry

+-----+----------+--------+--------+----------+-----+
| CRC | Tstamp   | KeySz  | ValSz  | ValuePos | Key |
+-----+----------+--------+--------+----------+-----+
| 4B  | 4B       | 2B     | 4B     | 8B       | var |
+-----+----------+--------+--------+----------+-----+
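
To make the layout concrete, here is a hypothetical encoder/decoder for a data file entry using struct. The field widths match the tables above; the byte order and the exact span the CRC covers are assumptions for illustration, not guarantees about this implementation's on-disk format.

import struct
import zlib

# Assumed: big-endian fields, CRC32 computed over everything after the CRC
HEADER = ">IIHI"  # crc:4B, tstamp:4B, keysz:2B, valsz:4B

def encode_entry(timestamp: int, key: bytes, value: bytes) -> bytes:
    body = struct.pack(">IHI", timestamp, len(key), len(value)) + key + value
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return struct.pack(">I", crc) + body

def decode_entry(buf: bytes):
    crc, tstamp, keysz, valsz = struct.unpack_from(HEADER, buf, 0)
    start = struct.calcsize(HEADER)
    key = buf[start:start + keysz]
    value = buf[start + keysz:start + keysz + valsz]
    if zlib.crc32(buf[4:start + keysz + valsz]) & 0xFFFFFFFF != crc:
        raise ValueError("CRC mismatch: corrupted entry")
    return tstamp, key, value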

Concurrency Model

Thread Safety

  • Thread-Safe: All operations are protected by an internal RLock
  • Multi-threaded: Safe to share a single Bitcask instance across threads
  • Performance: ~8% overhead for writes, ~20% for reads vs single-threaded

Process Safety

  • Single Writer: Only one process can write at a time (enforced via file lock)
  • Multiple Readers: Any number of processes can read simultaneously
  • Snapshot Isolation: Readers see a point-in-time snapshot

import threading

# Thread-safe: Single instance, multiple threads
db = Bitcask("mydb")
db.open()

def worker(thread_id):
    for i in range(1000):
        db.put(f"key_{thread_id}_{i}".encode(), b"value")
        value = db.get(f"key_{thread_id}_{i}".encode())

# Safe to use from multiple threads
threads = [threading.Thread(target=worker, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

db.close()

# Process-safe: Multiple processes
# Writer process
with Bitcask("mydb").open_context() as writer:
    writer.put(b"key", b"value")

# Multiple reader processes (can run simultaneously)
with Bitcask("mydb", read_only=True).open_context() as reader:
    value = reader.get(b"key")

Performance Considerations

Memory Usage

  • Keydir stores all keys in memory: ~145 bytes overhead per key in Python
  • 1 million keys ≈ 145MB RAM (see the estimate below)
  • Use write-only mode for append-only workloads to avoid keydir overhead
  • Snapshots deep-copy the keydir, doubling memory for keys (data files are shared)
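
A back-of-the-envelope check of the numbers above (145 bytes is the README's per-key estimate, not a measured constant):

KEYDIR_OVERHEAD = 145  # approximate per-key overhead in bytes, per the estimate above

def estimate_keydir_mb(num_keys: int, avg_key_len: int = 0) -> float:
    """Rough keydir RAM estimate in decimal megabytes."""
    return num_keys * (KEYDIR_OVERHEAD + avg_key_len) / 1_000_000

print(estimate_keydir_mb(1_000_000))  # 145.0 MB, matching the figure above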

Write Performance

  • Sequential writes to append-only log
  • Configurable file size for rotation
  • Optional sync-on-put for durability vs speed trade-off
  • Thread-safe with ~8% overhead vs single-threaded

Read Performance

  • O(1) key lookups via in-memory hash table
  • Single disk seek per read (see the sketch below)
  • Hint files speed up startup time
  • Thread-safe with ~20% overhead vs single-threaded
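
Conceptually, a read is one in-memory dict lookup followed by one positioned file read. A simplified sketch of that path; the entry fields mirror the Bitcask design, but the names are illustrative rather than this implementation's internals.

from dataclasses import dataclass
from typing import BinaryIO

@dataclass
class KeydirEntry:
    file_id: int     # which data file holds the latest value
    value_pos: int   # byte offset of the value within that file
    value_size: int  # length of the value in bytes
    timestamp: int

def read_value(keydir: dict, data_files: dict[int, BinaryIO], key: bytes) -> bytes:
    entry = keydir[key]            # O(1) hash lookup, entirely in RAM
    f = data_files[entry.file_id]  # already-open handle to an immutable file
    f.seek(entry.value_pos)        # the single disk seek
    return f.read(entry.value_size)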

Benchmark Results

On a typical development machine:

  • Sequential writes: ~50,000 ops/sec
  • Random reads: ~100,000 ops/sec
  • Mixed workload: ~30,000 ops/sec

Run make benchmark to test on your hardware.

Testing

The project uses pytest for testing with comprehensive test coverage:

# Run all tests with pytest
make test

# Run with verbose output
make test-verbose

# Run with coverage report
make test-coverage

# Run tests in parallel (faster)
make test-parallel

# Run specific test file
env/bin/python -m pytest bitcask/__tests__/test_basic_operations.py

# Clean up test artifacts
make clean

# Run benchmarks
make benchmark            # Standard benchmark
make benchmark-threaded   # Multi-threaded benchmark

# Show all available targets
make help

Test coverage includes:

  • ✅ Basic CRUD operations (9 tests)
  • ✅ File rotation and merging (8 tests)
  • ✅ Corruption recovery (10 tests)
  • ✅ Concurrent access and thread safety (14 tests)
  • ✅ Edge cases (empty values, large keys) (10 tests)
  • ✅ Mode restrictions (8 tests)
  • ✅ Error handling (10 tests)
  • ✅ TTL support (12 tests)
  • ✅ String conversion (10 tests)
  • ✅ Snapshot functionality (11 tests)
  • ✅ Legacy compatibility tests (24 tests)

Total: 126 tests with ~85% code coverage

Examples

Point-in-Time Snapshots

from bitcask import Bitcask
import threading

def analyze_snapshot(snap):
    """Analyze data in snapshot without affecting live database"""
    total_keys = len(snap.keys())
    print(f"Snapshot has {total_keys} keys")
    
    # Perform analysis on consistent view
    for key in snap.keys():
        value = snap.get(key)
        # Process data...
    # Note: the shared snapshot is closed by the caller, not per thread

with Bitcask("production_db") as db:
    # Add initial data
    for i in range(1000):
        db.put(f"key{i}", f"value{i}")
    
    # Create snapshot for analysis
    snap = db.snapshot()
    
    # Multiple threads can safely read from the same snapshot
    threads = []
    for i in range(5):
        thread = threading.Thread(target=analyze_snapshot, args=(snap,))
        threads.append(thread)
        thread.start()
    
    # Continue modifying the database while snapshot analysis runs
    for i in range(1000):
        db.put(f"key{i}", f"modified{i}")
    
    # Wait for analysis to complete
    for thread in threads:
        thread.join()
    
    # Snapshots saw the original state, not the modifications
    snap.close()  # Close the shared snapshot once, after all readers finish

Simple Cache

class Cache:
    def __init__(self, path):
        self.db = Bitcask(path)
        self.db.open()
    
    def set(self, key: str, value: str, ttl=None):
        # Strings are auto-converted, TTL is supported
        self.db.put(key, value, ttl=ttl)
    
    def get(self, key: str) -> str | None:
        try:
            # Get returns bytes, decode to string
            return self.db.get(key).decode('utf-8')
        except KeyError:
            return None
    
    def close(self):
        self.db.close()

Log Storage

# High-speed log ingestion - strings work seamlessly
import time

with Bitcask("logs", write_only=True) as logs:
    for i in range(1_000_000):
        timestamp = time.time()
        logs.put(f"log:{timestamp}", log_data)  # String key; log_data is your payload

# Later, process logs
with Bitcask("logs", read_only=True) as logs:
    for key in logs.keys():  # Keys are bytes
        if key.startswith(b"log:"):
            process_log(logs.get(key))

Session Store with TTL

import json

class SessionStore:
    def __init__(self):
        self.db = Bitcask("sessions", sync_on_put=True)
        self.db.open()
    
    def save_session(self, session_id: str, data: dict, ttl=3600):
        key = f"session:{session_id}"  # String key
        value = json.dumps(data)  # String value
        # Sessions automatically expire after TTL seconds
        self.db.put(key, value, ttl=ttl)
    
    def get_session(self, session_id: str) -> dict | None:
        key = f"session:{session_id}"
        try:
            value = self.db.get(key)  # Returns bytes
            return json.loads(value.decode('utf-8'))
        except KeyError:
            return None  # Session expired or doesn't exist
    
    def extend_session(self, session_id: str, ttl=3600):
        key = f"session:{session_id}"
        self.db.expire(key, ttl)  # Strings work with all methods
    
    def delete_session(self, session_id: str):
        key = f"session:{session_id}"
        self.db.delete(key)

Cache with TTL

class Cache:
    def __init__(self, path="cache", encoding='utf-8'):
        # Pass custom encoding if needed
        self.db = Bitcask(path, encoding=encoding)
        self.encoding = encoding  # keep a copy rather than reading private state
        self.db.open()
    
    def set(self, key: str, value: str, ttl=300):
        # Default 5-minute TTL, strings auto-converted
        self.db.put(key, value, ttl=ttl)
    
    def get(self, key: str) -> str | None:
        try:
            return self.db.get(key).decode(self.encoding)
        except KeyError:
            return None  # Cache miss or expired
    
    def close(self):
        self.db.close()

Limitations

  • Key Size: Maximum 65KB (2-byte length field; see the sketch after this list)
  • Value Size: Maximum 4GB (4-byte length field)
  • Memory: All keys must fit in memory
  • No Transactions: Single key operations only
  • Single Machine: No built-in replication/distribution
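
The key and value caps fall directly out of the 2-byte and 4-byte length fields in the entry header. A quick demonstration of the key-size boundary, reusing the struct layout assumed in the Data Format section:

import struct

MAX_KEY_SIZE = 2**16 - 1    # 65,535 bytes: the largest count a 2-byte field holds
MAX_VALUE_SIZE = 2**32 - 1  # ~4GB: the largest count a 4-byte field holds

struct.pack(">H", MAX_KEY_SIZE)      # fits
try:
    struct.pack(">H", MAX_KEY_SIZE + 1)
except struct.error as e:
    print(e)  # 'H' format requires 0 <= number <= 65535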

Implementation Details

This implementation follows the Bitcask paper with some Python-specific adaptations:

  1. CRC Validation: All entries are CRC32 protected
  2. Crash Recovery: Corrupted entries are skipped with logging
  3. File Locking: OS-level locks prevent multiple writers
  4. Tombstones: Deleted keys marked with special value
  5. Hint Files: Created during merge for faster startup
  6. Python Optimizations: Uses bytearray for efficient I/O
  7. Snapshots: Each snapshot gets its own file handle and lock for thread-safe access
  8. TTL Support: Value-embedded expiration timestamps with lazy deletion (sketched after this list)
  9. String Support: Automatic encoding/decoding with configurable character sets
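
As an illustration of items 4 and 8, a minimal sketch of how tombstones and value-embedded TTLs could be resolved lazily at read time; the sentinel constant and field names are hypothetical, not this implementation's:

import time

TOMBSTONE = b"__bitcask_tombstone__"  # hypothetical sentinel marking a deleted key

def resolve(raw_value: bytes, expires_at: float | None) -> bytes:
    """Decide whether a stored entry is still live (lazy deletion)."""
    if raw_value == TOMBSTONE:
        raise KeyError("key was deleted")
    if expires_at is not None and time.time() >= expires_at:
        # Expired entries read as missing; their space is reclaimed at merge
        raise KeyError("key has expired")
    return raw_value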

Contributing

  1. Install development dependencies: pip install -e ".[dev,test]"
  2. Run existing tests: make test
  3. Add tests for new features in bitcask/__tests__/
  4. Ensure all tests pass and coverage is maintained
  5. Follow existing code style (no comments unless necessary)
  6. Run benchmarks to ensure performance is not degraded

License

See LICENSE

Acknowledgments

Based on the Bitcask paper by Justin Sheehy and David Smith at Basho Technologies.
