Skip to content

jsherer/skiplist

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SkipList

A Python implementation of an indexed Skip List data structure with O(log n) operations for insert, find, and remove.

Features

  • O(log n) average complexity for insertions, deletions, and searches
  • Index-based access - access elements by position in O(log n) time
  • Range queries - efficiently find all elements within a key range
  • Nearest neighbor search - find the closest elements to a given key
  • Iterator support - iterate over elements in sorted order
  • Flexible key/value storage - supports any comparable keys and arbitrary values

Installation

# Clone the repository
git clone <repository-url>
cd skiplist

# Install development dependencies
make install

Quick Start

from skiplist import SkipList

# Create a new skiplist
sl = SkipList()

# Insert key-value pairs
sl.insert(3, "three")
sl.insert(1, "one")
sl.insert(4, "four")
sl.insert(2, "two")

# Find elements
node = sl.find(3)  # Returns node with key=3
value = sl.get(3)  # Returns "three"
value = sl.get(5, default="not found")  # Returns "not found"

# Check membership
if 3 in sl:
    print("Found!")

# Access by index (elements are kept sorted)
first = sl[0]  # Returns node with key=1
second = sl[1]  # Returns node with key=2

# Remove elements
sl.remove(2)

# Iterate in sorted order
for node in sl:
    print(f"{node.key}: {node.value}")

Advanced Features

Finding Closest Elements

# Find predecessor, exact match, and successor
pred, exact, succ = sl.find_closest(3.5)
# pred: largest key < 3.5
# exact: key == 3.5 (None if not found)
# succ: smallest key > 3.5

Range Queries

# Find all nodes with keys in range [10, 20]
for node in sl.find_range(10, 20, inclusive=True):
    print(node.key, node.value)

# Find all nodes with keys in range [10, 20)
for node in sl.find_range(10, 20, inclusive=False):
    print(node.key, node.value)

Running Tests

# Run all tests
make test

# Run specific test suites
make test-basic      # Basic operations
make test-indexing   # Index-based access
make test-closest    # Nearest neighbor and range queries
make test-edge       # Edge cases
make test-performance # Performance benchmarks

# Run with coverage
make test-coverage

Benchmarks

Compare SkipList performance against Python's sorted list (using bisect):

# Run standard benchmark
make benchmark

# Quick benchmark with small datasets
make benchmark-small

# Extended benchmark with large datasets (up to 1M elements)
make benchmark-large

Performance Characteristics

Operation SkipList Sorted List (bisect) Python list dict
Insert (random) O(log n) O(n) O(1)* O(1)
Find/Search O(log n) O(log n) O(n) O(1)
Remove O(log n) O(n) O(n) O(1)
Index access O(log n) O(1) O(1) N/A
Range query O(log n + k) O(log n + k) O(n) N/A
Maintain order Yes Yes No No

* Append only; maintaining order requires O(n) insertion

When to Use SkipList

Use SkipList when you need:

  • A sorted collection with frequent insertions and deletions
  • Range queries and nearest-neighbor searches
  • Index-based access to sorted elements
  • A self-balancing data structure

Consider alternatives when:

  • You need O(1) index access (use Python list)
  • You need O(1) key lookup without ordering (use dict)
  • Data is mostly static after initial load (use sorted list with bisect)
  • Working with small datasets where Python's built-in optimizations excel

Project Structure

skiplist/
├── skiplist/              # Main package
│   ├── __init__.py       # Package exports
│   ├── skiplist.py       # Core implementation
│   └── __tests__/        # Test suite
│       ├── test_basic_operations.py
│       ├── test_indexing.py
│       ├── test_closest_and_range.py
│       ├── test_edge_cases.py
│       └── test_performance.py
├── tools/
│   └── benchmark.py      # Performance comparison tool
├── Makefile             # Build automation
└── README.md           # This file

Implementation Details

The SkipList is implemented with:

  • Probabilistic balancing using geometric distribution (p=0.5 by default)
  • Indexed nodes for O(log n) positional access
  • Sentinel nodes for clean boundary handling
  • Configurable maximum height (default 16 levels)

API Reference

SkipList Class

Constructor

SkipList(p=0.5, distribution=random_height)
  • p: Probability for level promotion (default 0.5)
  • distribution: Function to generate random heights

Methods

Method Description Time Complexity
insert(key, value=None) Insert a key-value pair O(log n)
find(key) Find node with given key O(log n)
get(key, default=None) Get value for key O(log n)
remove(key) Remove key from list O(log n)
find_closest(key) Find nearest neighbors O(log n)
find_range(start, end, inclusive=True) Find nodes in range O(log n + k)
__len__() Get number of elements O(1)
__contains__(key) Check if key exists O(log n)
__getitem__(index) Get node at index O(log n)
__iter__() Iterate in sorted order O(n)

License

See LICENSE file for details

Contributing

Contributions are welcome! Please ensure all tests pass and add tests for new features:

# Run tests before submitting
make test

# Check code style (if configured)
make lint

Acknowledgments

Based on the Skip List data structure invented by William Pugh in 1989.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published