Stereo Vision - Disparity Map Estimation

This project implements two fundamental approaches for computing disparity maps from stereo image pairs: Block Matching and Dynamic Programming. These methods are used to estimate depth information from stereo images, a core problem in computer vision.



🎯 Overview

Stereo vision is the process of extracting 3D depth information from two or more 2D images taken from different viewpoints. By finding corresponding pixels between stereo image pairs, we can calculate disparity (horizontal displacement), which is inversely proportional to depth.

Applications:

  • Autonomous vehicles (depth perception)
  • 3D reconstruction
  • Robot navigation
  • Augmented reality
  • Medical imaging

🧮 Algorithms Implemented

1. Block Matching (block_matching.ipynb)

A sliding window approach that matches rectangular blocks between left and right images.

Features:

  • Methods: Sum of Absolute Differences (SAD) and Sum of Squared Differences (SSD)
  • Window sizes: 1×1, 5×5, 9×9 pixels
  • Max disparity: 64 pixels
  • Advantages: Simple, parallelizable, efficient
  • Limitations: Sensitive to texture-less regions, produces noisy results
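The sliding-window search described above can be sketched as follows. This is an illustrative minimal implementation, not the notebook's code; the function name and signature are assumptions, and no sub-pixel refinement or border handling beyond a simple skip is included.

```python
import numpy as np

def block_matching(left, right, max_disparity=64, window=5, method="SAD"):
    """Naive block matching: for each pixel, try every disparity shift
    up to max_disparity and keep the one with the lowest matching cost."""
    h, w = left.shape
    half = window // 2
    left = left.astype(np.float32)
    right = right.astype(np.float32)
    disparity = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half, w - half):
            best_cost, best_d = np.inf, 0
            patch_l = left[y - half:y + half + 1, x - half:x + half + 1]
            # Only search shifts that keep the right patch inside the image.
            for d in range(min(max_disparity, x - half) + 1):
                patch_r = right[y - half:y + half + 1,
                                x - d - half:x - d + half + 1]
                diff = patch_l - patch_r
                cost = np.abs(diff).sum() if method == "SAD" else (diff ** 2).sum()
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disparity[y, x] = best_d
    return disparity
```

The two nested pixel loops make the naive version slow in pure Python; the per-pixel searches are independent, which is what makes the method easy to parallelize or vectorize.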

2. Dynamic Programming (dynamic_programming.ipynb)

An optimization-based approach that finds the optimal pixel correspondence along scanlines.

Features:

  • Cost function: Squared intensity difference normalized by σ²
  • Occlusion handling: Penalty-based model (c₀ = 1)
  • Ordering constraint: Preserves pixel order within scanlines
  • Visualization: Alignment path plots showing pixel correspondences
  • Advantages: Handles occlusions, globally optimal per scanline
  • Limitations: O(N²) complexity, no inter-scanline consistency

🔧 Installation

Prerequisites

  • Python 3.8+
  • Jupyter Notebook or JupyterLab

Install Dependencies

pip install opencv-python numpy matplotlib tqdm

Or using conda:

conda install -c conda-forge opencv numpy matplotlib tqdm

🚀 Usage

1. Block Matching

Open and run block_matching.ipynb:

jupyter notebook block_matching.ipynb

The notebook will:

  • Load 3 stereo image pairs from images/
  • Compute disparity maps using SAD and SSD with different window sizes
  • Save results to outputs/pair*_*.png

2. Dynamic Programming

Open and run dynamic_programming.ipynb:

jupyter notebook dynamic_programming.ipynb

The notebook will:

  • Load stereo image pairs
  • Compute disparity maps using DP algorithm
  • Generate visualizations with multiple colormaps
  • Create alignment path plots (bonus section)
  • Save all outputs to outputs/

Parameters

Block Matching:

  • max_disparity: Maximum horizontal search range (default: 64)
  • window_size: Matching window size (1, 5, or 9)
  • method: "SAD" or "SSD"

Dynamic Programming:

  • sigma: Pixel noise standard deviation (default: 2)
  • c0: Occlusion penalty cost (default: 1)

📊 Results

Output Files

  1. Disparity Maps:
     • Grayscale images where brightness indicates disparity (depth)
     • Brighter pixels = larger disparity = closer to camera
  2. Colormap Visualizations:
     • Gray: traditional representation
     • Hot: warm colors highlight large disparities
     • Plasma: perceptually uniform colormap
  3. Alignment Paths (DP only):
     • Shows the optimal pixel matching for the middle scanline
     • Diagonal segments = matched pixels
     • Horizontal/vertical segments = occlusions

Quality Comparison

| Method               | Smoothness | Edge Preservation | Occlusion Handling | Speed |
|----------------------|------------|-------------------|--------------------|-------|
| Block Matching (SAD) | Low        | Medium            | Poor               | Fast  |
| Block Matching (SSD) | Low        | Medium            | Poor               | Fast  |
| Dynamic Programming  | High       | Good              | Excellent          | Slow  |

📚 Theory

Disparity-Depth Relationship

depth = (baseline × focal_length) / disparity

Where:

  • baseline: Distance between camera centers
  • focal_length: Camera focal length
  • disparity: Horizontal pixel displacement
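A quick numeric illustration of the relationship (the baseline, focal length, and disparity values here are made up for the example, not taken from the project's image pairs):

```python
baseline = 0.12       # metres between camera centres (example value)
focal_length = 700.0  # focal length in pixels (example value)
disparity = 42.0      # measured horizontal displacement in pixels

depth = (baseline * focal_length) / disparity
print(depth)  # → 2.0 metres
```

Note the inverse relationship: halving the disparity doubles the estimated depth, which is why distant objects (small disparities) have much larger depth uncertainty.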

Block Matching Cost Functions

Sum of Absolute Differences (SAD):

cost(x,y,d) = Σ |Il(x+i, y+j) - Ir(x+i-d, y+j)|

Sum of Squared Differences (SSD):

cost(x,y,d) = Σ (Il(x+i, y+j) - Ir(x+i-d, y+j))²
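The two cost functions above, written out for a pair of equally sized patches (a minimal sketch; the function names are illustrative):

```python
import numpy as np

def sad(patch_l, patch_r):
    # Sum of Absolute Differences between two equally sized patches
    diff = patch_l.astype(np.float32) - patch_r.astype(np.float32)
    return np.abs(diff).sum()

def ssd(patch_l, patch_r):
    # Sum of Squared Differences: squares penalize large errors more heavily
    diff = patch_l.astype(np.float32) - patch_r.astype(np.float32)
    return (diff ** 2).sum()
```

SSD's squaring makes it more sensitive to outliers (e.g. a single saturated pixel) than SAD, which is one reason the two methods can produce visibly different disparity maps on the same pair.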

Dynamic Programming Formulation

Cost Matrix:

d(i,j) = (Il(i) - Ir(j))² / σ²

DP Recurrence:

D(i,j) = min {
    D(i-1,j-1) + d(i,j)    // match pixels
    D(i-1,j) + c₀           // skip left pixel (occluded)
    D(i,j-1) + c₀           // skip right pixel (occluded)
}
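The recurrence can be filled in for one scanline as below. This is a sketch of the table-filling step only (no backtracking to recover the alignment path), with σ and c₀ matching the defaults listed earlier; the function name is an assumption, not the notebook's.

```python
import numpy as np

def dp_scanline(left_row, right_row, sigma=2.0, c0=1.0):
    """Fill the DP cost table D for one scanline pair.
    D[i, j] = minimal cost of aligning the first i left pixels
    with the first j right pixels."""
    n, m = len(left_row), len(right_row)
    D = np.zeros((n + 1, m + 1), dtype=np.float64)
    # Base cases: occluding the first i (or j) pixels costs i*c0 (or j*c0).
    D[:, 0] = np.arange(n + 1) * c0
    D[0, :] = np.arange(m + 1) * c0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = (float(left_row[i - 1]) - float(right_row[j - 1])) ** 2 / sigma ** 2
            D[i, j] = min(D[i - 1, j - 1] + match,  # match pixels
                          D[i - 1, j] + c0,         # left pixel occluded
                          D[i, j - 1] + c0)         # right pixel occluded
    return D  # D[n, m] is the optimal alignment cost for the scanline
```

Backtracking from D[n, m] along the argmin choices recovers the alignment path that the notebook's bonus section plots: diagonal steps are matches, horizontal/vertical steps are occlusions.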

Complexity:

  • Time: O(H × W²) for H×W image
  • Space: O(W²) per scanline

🔍 Key Concepts

Why Grayscale?

  • Computational efficiency: 1 channel vs 3 RGB channels
  • Structural focus: Edges and textures visible without color
  • Robustness: Less sensitive to lighting/color calibration differences
  • Standard practice: Most stereo algorithms use intensity

Ordering Constraint

Pixels maintain their relative order along a scanline: if i₁ < i₂ in the left image, their matches satisfy j₁ < j₂ in the right image.

Occlusion

Pixels visible in one image but hidden in the other due to:

  • Depth discontinuities
  • Object boundaries
  • Different viewpoints
