IAnuragMahapatra/Image-Spectrum-Analysis

Advanced Image Spectrum Analysis and Reconstruction

Mathematical Foundation and Implementation

Overview

This project implements a comprehensive frequency domain analysis system for digital images using the Discrete Fourier Transform (DFT). The system decomposes images into distinct frequency bands, reconstructs images from various frequency combinations, and quantitatively evaluates reconstruction quality using standard image quality metrics.


1. Theoretical Framework

1.1 The Discrete Fourier Transform (DFT)

The 2D Discrete Fourier Transform converts a spatial domain image f(x, y) of size M × N into its frequency domain representation F(u, v):

F(u, v) = Σ(x=0 to M-1) Σ(y=0 to N-1) f(x, y) · exp[-j·2π·(ux/M + vy/N)]

Where:

  • f(x, y): Input image intensity at spatial coordinates (x, y)
  • F(u, v): Complex-valued frequency coefficient at frequency coordinates (u, v)
  • j: Imaginary unit (√-1)
  • M, N: Image dimensions (height, width)
  • u, v: Frequency domain coordinates (0 ≤ u < M, 0 ≤ v < N)

Physical Interpretation:

  • Each F(u, v) represents the amplitude and phase of a sinusoidal wave with frequency (u, v)
  • Low frequencies (near center after fftshift) represent gradual intensity changes
  • High frequencies (far from center) represent rapid intensity changes (edges, textures)
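
As a sanity check on the definition above, the following minimal sketch evaluates the double sum in matrix form and compares it with NumPy's FFT. The helper name `dft2_direct` and the random test array are illustrative assumptions, not part of the project code.

```python
import numpy as np

def dft2_direct(f):
    """Evaluate the 2D DFT definition in matrix form (small inputs only)."""
    M, N = f.shape
    x = np.arange(M)
    y = np.arange(N)
    # W_M[u, x] = exp(-j*2*pi*u*x/M), W_N[v, y] = exp(-j*2*pi*v*y/N)
    W_M = np.exp(-2j * np.pi * np.outer(x, x) / M)
    W_N = np.exp(-2j * np.pi * np.outer(y, y) / N)
    # F[u, v] = sum_x sum_y W_M[u, x] * f[x, y] * W_N[v, y]
    return W_M @ f @ W_N.T

rng = np.random.default_rng(0)
f = rng.random((6, 8))            # stand-in for a tiny grayscale image

F_direct = dft2_direct(f)
F_fft = np.fft.fft2(f)            # same convention: no scaling on the forward transform
max_abs_diff = np.max(np.abs(F_direct - F_fft))
```

The two results should agree to within floating point round-off, which confirms that `np.fft.fft2` uses the same sign and scaling convention as the formula.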

1.2 The Inverse Discrete Fourier Transform (IDFT)

The IDFT reconstructs the spatial domain image from its frequency representation:

f(x, y) = (1/MN) · Σ(u=0 to M-1) Σ(v=0 to N-1) F(u, v) · exp[j·2π·(ux/M + vy/N)]

Reconstruction Property: If F(u, v) is the complete DFT of f(x, y), then IDFT[F(u, v)] = f(x, y) exactly (within numerical precision).
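
The reconstruction property can be checked numerically. This is a small sketch on a random array (an assumed stand-in for an image), using NumPy's `ifft2`, which applies the 1/MN factor.

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.random((16, 12))                 # stand-in for a normalized image

F = np.fft.fft2(f)
f_back = np.fft.ifft2(F)                 # complex result; imaginary part is round-off

round_trip_error = np.max(np.abs(f_back.real - f))
max_imag = np.max(np.abs(f_back.imag))
```

Both `round_trip_error` and `max_imag` should be on the order of machine precision.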

1.3 Spectral Representations

Magnitude Spectrum

The magnitude spectrum represents the amplitude of each frequency component:

|F(u, v)| = √[Re[F(u, v)]² + Im[F(u, v)]²]

Properties:

  • Always non-negative: |F(u, v)| ≥ 0
  • Symmetric for real-valued images: |F(u, v)| = |F(-u, -v)|
  • Contains information about "how much" of each frequency is present
  • Does not contain phase information

Phase Spectrum

The phase spectrum represents the spatial distribution of frequency components:

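φ(u, v) = atan2(Im[F(u, v)], Re[F(u, v)])

Here atan2 is the four-quadrant arctangent; a plain arctan(Im/Re) only covers (-π/2, π/2) and is undefined where Re[F(u, v)] = 0.

Properties:

  • Range: -π < φ(u, v) ≤ π
  • Anti-symmetric for real images: φ(u, v) = -φ(-u, -v) (except where F(u, v) is real and negative, where both phases equal π)
  • Critical for preserving spatial structure
  • Often carries more of the perceptually relevant structure than magnitude: an image rebuilt from the correct phase and a flat magnitude usually stays recognizable, while the reverse usually does not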

Log-Scaled Magnitude

For visualization, we apply logarithmic scaling:

L(u, v) = log(1 + |F(u, v)|)

This compresses the dynamic range, making both small and large frequency components visible.
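
The three representations in this section can be computed in a few lines. The sketch below uses a random array in place of a real image; `np.angle` returns the four-quadrant phase and `np.log1p(x)` computes log(1 + x). The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
f = rng.random((32, 32))            # stand-in for a normalized grayscale image

F = np.fft.fft2(f)
F_shifted = np.fft.fftshift(F)      # zero frequency moved to the center

magnitude = np.abs(F_shifted)       # |F(u, v)|
phase = np.angle(F_shifted)         # four-quadrant phase in radians
log_magnitude = np.log1p(magnitude) # log(1 + |F(u, v)|) for display

# Magnitude and phase together determine F completely
F_rebuilt = magnitude * np.exp(1j * phase)

# F(-u, -v), with indices taken modulo M and N on the unshifted spectrum
F_negated = np.roll(F[::-1, ::-1], shift=(1, 1), axis=(0, 1))
```

For a real-valued input, `F` equals the complex conjugate of `F_negated`, which is where the magnitude symmetry and phase anti-symmetry come from.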


2. Frequency Band Filtering

2.1 Ideal Filter Design

We partition the frequency domain into three non-overlapping regions using ideal (sharp cutoff) filters.

Distance Metric

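For a frequency point (u, v) in the shifted spectrum, distance is measured from the center (M/2, N/2), which is where fftshift places the zero-frequency component (at integer index (⌊M/2⌋, ⌊N/2⌋) for odd sizes):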

D(u, v) = √[(u - M/2)² + (v - N/2)²]

Low-Pass Filter (LPF)

Passes only low frequencies within radius D_L:

H_low(u, v) = { 1,  if D(u, v) ≤ D_L
              { 0,  otherwise

Effect: Retains smooth variations, removes fine details

  • Preserves: Overall shape, gradual shading, large structures
  • Removes: Edges, textures, noise

High-Pass Filter (HPF)

Passes only high frequencies beyond radius D_H:

H_high(u, v) = { 1,  if D(u, v) > D_H
               { 0,  otherwise

Effect: Retains fine details, removes smooth variations

  • Preserves: Edges, textures, rapid changes
  • Removes: Smooth regions, overall brightness

Mid-Pass Filter (MPF)

Passes frequencies in the annular region between D_L and D_H:

H_mid(u, v) = { 1,  if D_L < D(u, v) ≤ D_H
              { 0,  otherwise

Effect: Retains intermediate-scale features

  • Preserves: Medium-scale structures, moderate edges
  • Removes: Both very smooth and very fine details

Complete Partition Property

The three filters form a complete partition of the frequency domain:

H_low(u, v) + H_mid(u, v) + H_high(u, v) = 1  ∀(u, v)

This ensures that every frequency component belongs to exactly one band, provided 0 ≤ D_L ≤ D_H.
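
A minimal sketch of the three ideal masks and the partition check follows. The function name `ideal_band_masks` and the radii used are assumptions for illustration; the project script may build its masks differently.

```python
import numpy as np

def ideal_band_masks(shape, d_low, d_high):
    """Return boolean low, mid and high masks for a shifted spectrum."""
    M, N = shape
    u = np.arange(M)[:, None]
    v = np.arange(N)[None, :]
    # (M // 2, N // 2) is the zero-frequency position after fftshift
    D = np.sqrt((u - M // 2) ** 2 + (v - N // 2) ** 2)
    h_low = D <= d_low
    h_mid = (D > d_low) & (D <= d_high)
    h_high = D > d_high
    return h_low, h_mid, h_high

h_low, h_mid, h_high = ideal_band_masks((64, 48), d_low=4.0, d_high=10.0)

# Each frequency bin should be claimed by exactly one mask
coverage = h_low.astype(int) + h_mid.astype(int) + h_high.astype(int)
```

The DC bin at the center always falls in the low band, since D = 0 there.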

2.2 Frequency Domain Filtering

Filtering in the frequency domain is performed by element-wise multiplication:

F_low(u, v)  = F(u, v) · H_low(u, v)
F_mid(u, v)  = F(u, v) · H_mid(u, v)
F_high(u, v) = F(u, v) · H_high(u, v)

Linearity Property:

F(u, v) = F_low(u, v) + F_mid(u, v) + F_high(u, v)

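This decomposition is exact and reversible. Because the IDFT is linear, the three band images IDFT[F_low], IDFT[F_mid], and IDFT[F_high] also sum to the original image.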
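
The decomposition can be verified in both domains. The sketch below uses a random array and assumed radii (5 and 12 frequency bins); only NumPy calls are used.

```python
import numpy as np

rng = np.random.default_rng(3)
f = rng.random((64, 64))                       # stand-in for a normalized image
F = np.fft.fftshift(np.fft.fft2(f))

M, N = f.shape
u, v = np.ogrid[:M, :N]
D = np.hypot(u - M // 2, v - N // 2)           # distance from the spectrum center
d_low, d_high = 5.0, 12.0                      # assumed cutoff radii

F_low = F * (D <= d_low)
F_mid = F * ((D > d_low) & (D <= d_high))
F_high = F * (D > d_high)

def to_spatial(F_band):
    """Undo the shift and invert; the real part is the band image."""
    return np.fft.ifft2(np.fft.ifftshift(F_band)).real

spatial_sum = to_spatial(F_low) + to_spatial(F_mid) + to_spatial(F_high)
```

Here `F_low + F_mid + F_high` reproduces `F`, and `spatial_sum` reproduces `f`, up to round-off.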


3. Image Reconstruction

3.1 Reconstruction Cases

We generate seven reconstructions by applying IDFT to different frequency band combinations:

| Case | Frequency Bands | Formula | Expected Visual Characteristics |
|---|---|---|---|
| Full | Low + Mid + High | IDFT[F_low + F_mid + F_high] | Perfect reconstruction |
| Low + Mid | Low + Mid | IDFT[F_low + F_mid] | Smooth with moderate detail |
| Low + High | Low + High | IDFT[F_low + F_high] | Smooth regions with sharp edges |
| Mid + High | Mid + High | IDFT[F_mid + F_high] | Detailed but lacking overall structure |
| Low Only | Low | IDFT[F_low] | Blurred, smooth approximation |
| Mid Only | Mid | IDFT[F_mid] | Medium-scale features only |
| High Only | High | IDFT[F_high] | Edge map, high contrast |

3.2 Reconstruction Process

For any frequency combination F_combined(u, v):

  1. Unshift: Apply inverse fftshift to move zero-frequency back to corners

    F_unshifted = ifftshift[F_combined]
    
  2. Apply IDFT: Transform back to spatial domain

    f_complex(x, y) = IDFT[F_unshifted]
    
  3. Extract Real Part: Discard negligible imaginary components

    f_real(x, y) = Re[f_complex(x, y)]
    
  4. Denormalize: Scale back to [0, 255] range

    f_reconstructed(x, y) = clip(f_real(x, y) · 255, 0, 255)
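
The four steps above can be sketched as a single function. The name `reconstruct` and the random test image are assumptions; the input is expected to be a centered (shifted) spectrum of an image that was normalized to [0, 1].

```python
import numpy as np

def reconstruct(F_combined_shifted):
    """Turn a centered spectrum back into an image in the [0, 255] range."""
    F_unshifted = np.fft.ifftshift(F_combined_shifted)  # step 1: unshift
    f_complex = np.fft.ifft2(F_unshifted)               # step 2: IDFT
    f_real = np.real(f_complex)                         # step 3: real part
    return np.clip(f_real * 255.0, 0, 255)              # step 4: denormalize

rng = np.random.default_rng(4)
image = rng.integers(0, 256, size=(32, 32)).astype(np.float64)

F_shifted = np.fft.fftshift(np.fft.fft2(image / 255.0))
restored = reconstruct(F_shifted)
```

With the full spectrum, `restored` matches `image` up to round-off. To store the result as 8-bit data, rounding before the cast (for example `np.rint(restored).astype(np.uint8)`) avoids truncation bias.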
    

4. Quality Evaluation Metrics

4.1 Mean Squared Error (MSE)

MSE measures the average squared difference between original and reconstructed images:

MSE = (1/MN) · Σ(x=0 to M-1) Σ(y=0 to N-1) [f(x, y) - f̂(x, y)]²

Properties:

  • Range: [0, ∞)
  • Lower is better (0 = perfect reconstruction)
  • Units: squared intensity values
  • Sensitive to outliers
  • Does not correlate well with perceived quality

4.2 Peak Signal-to-Noise Ratio (PSNR)

PSNR expresses MSE on a logarithmic decibel scale:

PSNR = 10 · log₁₀(MAX² / MSE)  [dB]

Where MAX = 255 for 8-bit images.

Properties:

  • Range: [0, ∞) dB (typically 20-50 dB for images)
  • Higher is better (∞ = perfect reconstruction)
  • Logarithmic scale makes differences more interpretable
  • PSNR > 40 dB: Excellent quality
  • PSNR 30-40 dB: Good quality
  • PSNR 20-30 dB: Acceptable quality
  • PSNR < 20 dB: Poor quality

Relationship to MSE:

PSNR increases by exactly 10 dB when MSE decreases by a factor of 10, since 10 · log₁₀(10) = 10.
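
A minimal NumPy sketch of both metrics follows. The function names `mse` and `psnr` are illustrative. Inputs are converted to float64 first, because subtracting uint8 arrays directly would wrap around.

```python
import math
import numpy as np

def mse(reference, test):
    """Mean squared error between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    return float(np.mean((reference - test) ** 2))

def psnr(reference, test, max_value=255.0):
    """PSNR in dB; infinite when the images are identical."""
    err = mse(reference, test)
    if err == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)

reference = np.full((8, 8), 100, dtype=np.uint8)

# Constant error of 5 gives MSE = 25 and PSNR of about 34.15 dB
noisy = reference.astype(np.float64) + 5.0
psnr_a = psnr(reference, noisy)

# Error of sqrt(2.5) gives MSE = 2.5, ten times smaller, so PSNR rises by 10 dB
less_noisy = reference.astype(np.float64) + math.sqrt(2.5)
psnr_b = psnr(reference, less_noisy)
```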

4.3 Structural Similarity Index (SSIM)

SSIM measures perceptual similarity by comparing luminance, contrast, and structure:

SSIM(x, y) = [l(x, y)]^α · [c(x, y)]^β · [s(x, y)]^γ

Where:

  • l(x, y): Luminance comparison

    l(x, y) = (2μ_x μ_y + C₁) / (μ_x² + μ_y² + C₁)
    
  • c(x, y): Contrast comparison

    c(x, y) = (2σ_x σ_y + C₂) / (σ_x² + σ_y² + C₂)
    
  • s(x, y): Structure comparison

    s(x, y) = (σ_xy + C₃) / (σ_x σ_y + C₃)
    

Notation:

  • μ_x, μ_y: Mean intensities
  • σ_x, σ_y: Standard deviations
  • σ_xy: Covariance
  • C₁, C₂, C₃: Small positive stability constants that prevent division by near-zero values (commonly C₃ = C₂/2)
  • α, β, γ: Exponents (typically all = 1)

Properties:

  • Range: [-1, 1] (typically [0, 1] for similar images)
  • Higher is better (1 = perfect structural similarity)
  • Correlates better with human perception than MSE/PSNR
  • SSIM > 0.95: Excellent similarity
  • SSIM 0.90-0.95: Good similarity
  • SSIM 0.80-0.90: Moderate similarity
  • SSIM < 0.80: Poor similarity
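
To make the three terms concrete, here is a simplified sketch that evaluates them once over the whole image. This is an assumption made for illustration: library implementations such as scikit-image's `structural_similarity` compute the statistics in local windows and average the result, so their values will differ. The constants follow the common choice C₁ = (0.01·L)², C₂ = (0.03·L)², C₃ = C₂/2, with L the data range and α = β = γ = 1.

```python
import numpy as np

def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM over the whole image (illustrative, not windowed)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    c3 = c2 / 2.0

    mu_x, mu_y = x.mean(), y.mean()
    sigma_x, sigma_y = x.std(), y.std()
    sigma_xy = np.mean((x - mu_x) * (y - mu_y))

    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sigma_x * sigma_y + c2) / (sigma_x ** 2 + sigma_y ** 2 + c2)
    structure = (sigma_xy + c3) / (sigma_x * sigma_y + c3)
    return float(luminance * contrast * structure)

rng = np.random.default_rng(7)
img = rng.integers(0, 256, size=(32, 32)).astype(np.float64)

ssim_same = global_ssim(img, img)            # identical images
ssim_inverted = global_ssim(img, 255 - img)  # negative covariance
```

Identical images give a score of 1, and the inverted image gives a negative score because its covariance with the original is negative.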

5. Implementation Details

5.1 Algorithm Pipeline

Input: Grayscale image f(x, y)

Step 1: Preprocessing
    f_norm(x, y) ← f(x, y) / 255

Step 2: Forward Transform
    F(u, v) ← FFT2D[f_norm(x, y)]
    F_shifted(u, v) ← fftshift[F(u, v)]

Step 3: Spectrum Analysis
    |F(u, v)| ← √[Re² + Im²]
    φ(u, v) ← atan2(Im, Re)
    L(u, v) ← log(1 + |F(u, v)|)

Step 4: Filter Design
    Compute D(u, v) for all (u, v)
    H_low ← (D ≤ D_L)
    H_high ← (D > D_H)
    H_mid ← (D_L < D ≤ D_H)

Step 5: Frequency Decomposition
    F_low ← F_shifted · H_low
    F_mid ← F_shifted · H_mid
    F_high ← F_shifted · H_high

Step 6: Reconstruction (for each combination)
    F_combined ← Σ selected bands
    F_unshifted ← ifftshift[F_combined]
    f_complex ← IFFT2D[F_unshifted]
    f_reconstructed ← clip(Re[f_complex] · 255, 0, 255)

Step 7: Quality Evaluation
    For each reconstruction:
        MSE ← mean[(f - f̂)²]
        PSNR ← 10·log₁₀(255²/MSE)
        SSIM ← structural_similarity(f, f̂)

Output: Reconstructed images + quality metrics
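
The pipeline above can be sketched end to end with NumPy alone. This is not the project's `image_fft_analyzer.py`; the function name `analyze`, the synthetic test image, and the radii are assumptions, and SSIM is left out to keep the example dependency-free.

```python
import numpy as np

def analyze(image, d_low, d_high):
    """Return {case: (reconstruction, mse, psnr)} for the seven band combinations."""
    f = image.astype(np.float64)
    F = np.fft.fftshift(np.fft.fft2(f / 255.0))        # steps 1-2

    M, N = f.shape
    u, v = np.ogrid[:M, :N]
    D = np.hypot(u - M // 2, v - N // 2)               # step 4
    bands = {                                          # step 5
        "low": F * (D <= d_low),
        "mid": F * ((D > d_low) & (D <= d_high)),
        "high": F * (D > d_high),
    }
    combos = {
        "full": ("low", "mid", "high"),
        "low+mid": ("low", "mid"),
        "low+high": ("low", "high"),
        "mid+high": ("mid", "high"),
        "low": ("low",),
        "mid": ("mid",),
        "high": ("high",),
    }
    results = {}
    for name, parts in combos.items():                 # steps 6-7
        F_combined = sum(bands[p] for p in parts)
        spatial = np.fft.ifft2(np.fft.ifftshift(F_combined)).real
        recon = np.clip(spatial * 255.0, 0, 255)
        err = float(np.mean((f - recon) ** 2))
        psnr = float("inf") if err == 0 else float(10 * np.log10(255.0 ** 2 / err))
        results[name] = (recon, err, psnr)
    return results

# Synthetic image: mid-gray level, one slow cosine, and mild noise
rng = np.random.default_rng(8)
x = np.arange(64)[:, None]
image = 128 + 40 * np.cos(2 * np.pi * 2 * x / 64) + rng.uniform(-5, 5, (64, 64))

results = analyze(image, d_low=4.0, d_high=10.0)
```

On this synthetic input the full case is exact up to round-off, while any combination without the low band loses the mean brightness and therefore shows a very large MSE.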

5.2 Computational Complexity

  • FFT/IFFT: O(MN log(MN)) per transform
  • Filter Application: O(MN) per filter
  • Metric Calculation: O(MN) per metric

Total Complexity: O(MN log(MN)), dominated by FFT operations

5.3 Numerical Considerations

  1. Floating Point Precision: Use float64 for FFT to minimize numerical errors
  2. Normalization: Input images normalized to [0, 1] before FFT
  3. Centering: fftshift centers the DC component for intuitive visualization
  4. Clipping: Output values clipped to [0, 255], because partial-band reconstructions can overshoot or go negative (ringing, missing DC component) in addition to small numerical errors


6. Usage

6.1 Requirements

pip install numpy opencv-python matplotlib scikit-image

6.2 Running the Analysis

python image_fft_analyzer.py

6.3 Configuration

Edit constants in image_fft_analyzer.py:

INPUT_IMAGE_PATH = 'input_image.jpg'        # Input image path
LOW_PASS_RADIUS_PERCENT = 0.05              # D_L = 5% of image height
HIGH_PASS_RADIUS_PERCENT = 0.15             # D_H = 15% of image height
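
Going by the comments on the two constants, the radii are fractions of the image height. The sketch below shows that conversion; the helper `cutoff_radii` is hypothetical and the script may round or validate differently.

```python
import math

LOW_PASS_RADIUS_PERCENT = 0.05    # D_L as a fraction of image height
HIGH_PASS_RADIUS_PERCENT = 0.15   # D_H as a fraction of image height

def cutoff_radii(image_height,
                 low_percent=LOW_PASS_RADIUS_PERCENT,
                 high_percent=HIGH_PASS_RADIUS_PERCENT):
    """Convert fractional settings into cutoff radii in frequency bins."""
    if not 0 <= low_percent <= high_percent:
        raise ValueError("expected 0 <= low_percent <= high_percent")
    return low_percent * image_height, high_percent * image_height

# For a 512-pixel-high image: D_L = 25.6 and D_H = 76.8
d_low, d_high = cutoff_radii(512)
```

Keeping the low radius at or below the high radius is what keeps the three bands non-overlapping.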

6.4 Output Files

  1. spectrum_analysis.png: Original image and log-magnitude spectrum
  2. frequency_filters.png: Visualization of LPF, MPF, HPF masks
  3. reconstruction_results.png: All 7 reconstructions with metrics table

7. Mathematical Insights

7.1 Parseval's Theorem

Energy is conserved between spatial and frequency domains:

Σ(x,y) |f(x, y)|² = (1/MN) · Σ(u,v) |F(u, v)|²

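Because the three bands do not overlap, the energy of F is the sum of the band energies. Dropping a band therefore introduces a spatial-domain squared error equal to that band's energy (before clipping), so filtering in the frequency domain affects image energy predictably.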
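
A quick numerical check of Parseval's theorem and of its consequence for band removal, using a random array and an assumed single cutoff radius of 4 bins:

```python
import numpy as np

rng = np.random.default_rng(5)
f = rng.random((20, 30))
F = np.fft.fft2(f)

spatial_energy = np.sum(np.abs(f) ** 2)
frequency_energy = np.sum(np.abs(F) ** 2) / f.size     # the 1/MN factor

# Remove everything outside an assumed radius and compare error energies
F_shifted = np.fft.fftshift(F)
M, N = f.shape
u, v = np.ogrid[:M, :N]
keep = np.hypot(u - M // 2, v - N // 2) <= 4.0

f_low = np.fft.ifft2(np.fft.ifftshift(F_shifted * keep)).real
spatial_error = np.sum((f - f_low) ** 2)
removed_energy = np.sum(np.abs(F_shifted[~keep]) ** 2) / f.size
```

`spatial_energy` matches `frequency_energy`, and `spatial_error` matches `removed_energy`, up to round-off.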

7.2 Convolution Theorem

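Circular convolution in the spatial domain ↔ multiplication in the frequency domain:

f(x, y) ⊗ h(x, y) ↔ F(u, v) · H(u, v)

For the DFT, ⊗ denotes circular (periodic) convolution. Our frequency domain filtering is therefore equivalent to circular convolution of the image with the filter's impulse response h(x, y) = IDFT[H(u, v)].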
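
The theorem can be confirmed on a small example by computing the circular convolution directly and comparing it with the FFT route. The helper `circular_convolve2d` and the random arrays are illustrative assumptions.

```python
import numpy as np

def circular_convolve2d(f, h):
    """Direct circular convolution: sum of shifted copies of f weighted by h."""
    M, N = f.shape
    out = np.zeros((M, N))
    for dx in range(M):
        for dy in range(N):
            # np.roll(f, (dx, dy))[x, y] equals f[(x - dx) % M, (y - dy) % N]
            out += h[dx, dy] * np.roll(f, shift=(dx, dy), axis=(0, 1))
    return out

rng = np.random.default_rng(9)
f = rng.random((8, 6))
h = rng.random((8, 6))

direct = circular_convolve2d(f, h)
via_fft = np.fft.ifft2(np.fft.fft2(f) * np.fft.fft2(h)).real
```

The two arrays agree to round-off. The wrap-around in `np.roll` is the reason frequency domain filtering treats the image as periodic.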

7.3 Frequency Localization

  • DC Component F(0, 0): Sum of all pixel values, i.e. MN times the average image intensity
  • Low Frequencies: Smooth variations, overall shape
  • Mid Frequencies: Medium-scale structures, moderate edges
  • High Frequencies: Sharp edges, fine textures, noise
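
The DC relationship is easy to confirm: with the unnormalized forward DFT used here, F(0, 0) is the sum of the pixel values, so dividing it by MN gives the average intensity. A short sketch on a random array:

```python
import numpy as np

rng = np.random.default_rng(6)
f = rng.random((10, 14))
F = np.fft.fft2(f)

dc = F[0, 0]                          # zero-frequency coefficient
mean_from_dc = dc.real / f.size       # divide by M*N to get the mean

# After fftshift the same coefficient sits at the center index
F_shifted = np.fft.fftshift(F)
center = (f.shape[0] // 2, f.shape[1] // 2)
```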

7.4 Ideal vs. Practical Filters

Ideal Filters (Used Here):

  • Sharp cutoff in frequency domain
  • Cause ringing artifacts (Gibbs phenomenon) in spatial domain
  • Mathematically simple, computationally efficient

Practical Alternatives:

  • Butterworth: Smooth transition, less ringing
  • Gaussian: No ringing, but less selective
  • Chebyshev: Sharper cutoff with controlled ripple
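
For comparison with the ideal masks, the sketch below builds Butterworth and Gaussian low-pass transfer functions in their standard textbook forms, H = 1 / (1 + (D/D₀)^(2n)) and H = exp(-D² / (2·D₀²)). These are not part of the project code; the function names, cutoff, and order are assumptions.

```python
import numpy as np

def distance_grid(shape):
    """Distance of every bin from the center of a shifted spectrum."""
    M, N = shape
    u, v = np.ogrid[:M, :N]
    return np.hypot(u - M // 2, v - N // 2)

def butterworth_lowpass(shape, cutoff, order=2):
    """Smooth roll-off; the response is 0.5 at D = cutoff."""
    D = distance_grid(shape)
    return 1.0 / (1.0 + (D / cutoff) ** (2 * order))

def gaussian_lowpass(shape, cutoff):
    """Gaussian roll-off; the response is exp(-0.5) at D = cutoff."""
    D = distance_grid(shape)
    return np.exp(-(D ** 2) / (2.0 * cutoff ** 2))

H_b = butterworth_lowpass((64, 64), cutoff=8.0, order=2)
H_g = gaussian_lowpass((64, 64), cutoff=8.0)
```

Either array can be multiplied with a shifted spectrum in place of a binary mask. The matching high-pass response is `1 - H`.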

7.5 Expected Reconstruction Quality

Rough expectations based on which frequency content is preserved (actual values depend on the image and the chosen radii; the Full case is exact and is not listed):

Highest Quality (PSNR > 35 dB, SSIM > 0.95):

  • Low + Mid (preserves most visual information)

Good Quality (PSNR 25-35 dB, SSIM 0.85-0.95):

  • Low + High (preserves structure and edges)

Poor Quality (PSNR < 25 dB, SSIM < 0.85):

  • Mid + High (preserves details, but the DC component and overall brightness are lost)
  • Low Only (too blurred)
  • Mid Only (lacks structure)
  • High Only (edge map only)

8. Applications

8.1 Image Compression

Low-frequency bands contain most visual information with fewer coefficients, enabling compression.

8.2 Noise Reduction

Attenuating high frequencies (low-pass filtering) reduces noise while preserving overall image structure, at the cost of some fine detail.

8.3 Edge Detection

High-pass filtering isolates edges and fine details.

8.4 Image Enhancement

Selective frequency amplification enhances specific features.

8.5 Frequency Analysis

Understanding which frequencies contribute to image perception and quality.


9. References

9.1 Foundational Theory

  1. Cooley, J. W., & Tukey, J. W. (1965). "An algorithm for the machine calculation of complex Fourier series." Mathematics of Computation, 19(90), 297-301.

  2. Gonzalez, R. C., & Woods, R. E. (2018). Digital Image Processing (4th ed.). Pearson. Chapter 4: Filtering in the Frequency Domain.

  3. Oppenheim, A. V., & Schafer, R. W. (2009). Discrete-Time Signal Processing (3rd ed.). Prentice Hall.

9.2 Quality Metrics

  1. Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. (2004). "Image quality assessment: from error visibility to structural similarity." IEEE Transactions on Image Processing, 13(4), 600-612.

  2. Huynh-Thu, Q., & Ghanbari, M. (2008). "Scope of validity of PSNR in image/video quality assessment." Electronics Letters, 44(13), 800-801.

9.3 Frequency Domain Filtering

  1. Butterworth, S. (1930). "On the theory of filter amplifiers." Wireless Engineer, 7, 536-541.

  2. Lim, J. S. (1990). Two-Dimensional Signal and Image Processing. Prentice Hall. Chapter 6: Image Enhancement.


10. Mathematical Notation Summary

| Symbol | Meaning |
|---|---|
| f(x, y) | Spatial domain image |
| F(u, v) | Frequency domain representation |
| M, N | Image dimensions (height, width) |
| j | Imaginary unit (√-1) |
| ⊗ | Convolution operator |
| Σ | Summation |
| Re[·] | Real part |
| Im[·] | Imaginary part |
| \|F(u, v)\| | Magnitude spectrum |
| φ(u, v) | Phase spectrum |
| D(u, v) | Distance from frequency center |
| D_L | Low-pass cutoff radius |
| D_H | High-pass cutoff radius |
| H_low, H_mid, H_high | Filter transfer functions |
| f̂(x, y) | Reconstructed image |
| μ | Mean |
| σ | Standard deviation |
| σ_xy | Covariance |

11. Conclusion

This implementation provides a complete framework for understanding how different frequency components contribute to image structure and quality. By decomposing images into low, mid, and high frequency bands and reconstructing from various combinations, we gain insight into:

  1. Frequency Importance: Low frequencies carry most structural information
  2. Perceptual Quality: Mid frequencies are critical for perceived sharpness
  3. Detail Preservation: High frequencies contain fine details but contribute less to overall quality
  4. Metric Behavior: SSIM correlates better with perception than MSE/PSNR

The mathematical rigor ensures reproducible results and provides a foundation for advanced image processing applications.


Author: Image Processing Analysis Tool
Date: November 19, 2025
License: MIT
Version: 1.0.0
