This project implements a comprehensive frequency domain analysis system for digital images using the Discrete Fourier Transform (DFT). The system decomposes images into distinct frequency bands, reconstructs images from various frequency combinations, and quantitatively evaluates reconstruction quality using standard image quality metrics.
The 2D Discrete Fourier Transform converts a spatial domain image f(x, y) of size M × N into its frequency domain representation F(u, v):
F(u, v) = Σ(x=0 to M-1) Σ(y=0 to N-1) f(x, y) · exp[-j·2π·(ux/M + vy/N)]

Where:
- f(x, y): Input image intensity at spatial coordinates (x, y)
- F(u, v): Complex-valued frequency coefficient at frequency coordinates (u, v)
- j: Imaginary unit (√-1)
- M, N: Image dimensions (height, width)
- u, v: Frequency domain coordinates (0 ≤ u < M, 0 ≤ v < N)
Physical Interpretation:
- Each F(u, v) represents the amplitude and phase of a sinusoidal wave with frequency (u, v)
- Low frequencies (near center after fftshift) represent gradual intensity changes
- High frequencies (far from center) represent rapid intensity changes (edges, textures)
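
As a sanity check on the definition above, the double sum can be evaluated directly and compared with NumPy's FFT. This is a minimal sketch: `dft2_naive` is a hypothetical helper written for illustration (it is not taken from `image_fft_analyzer.py`), and the 8 × 6 random test image is an assumption.

```python
import numpy as np


def dft2_naive(f):
    """Evaluate the 2D DFT sum directly; O((MN)^2), for illustration only."""
    M, N = f.shape
    x = np.arange(M)
    y = np.arange(N)
    # W_M[u, x] = exp(-j*2*pi*u*x/M) and W_N[v, y] = exp(-j*2*pi*v*y/N)
    W_M = np.exp(-2j * np.pi * np.outer(x, x) / M)
    W_N = np.exp(-2j * np.pi * np.outer(y, y) / N)
    # F[u, v] = sum_x sum_y W_M[u, x] * f[x, y] * W_N[v, y]
    return W_M @ f @ W_N.T


rng = np.random.default_rng(0)
f = rng.random((8, 6))            # small random test image (assumption)

F_naive = dft2_naive(f)
F_fft = np.fft.fft2(f)            # same coefficients, computed with the FFT
max_diff = np.max(np.abs(F_naive - F_fft))
print("max |naive - fft2| =", max_diff)
```

Both routes produce the same coefficients up to rounding; the FFT simply reaches them in O(MN log(MN)) operations instead of O((MN)²).
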
The Inverse DFT (IDFT) reconstructs the spatial domain image from its frequency representation:

f(x, y) = (1/MN) · Σ(u=0 to M-1) Σ(v=0 to N-1) F(u, v) · exp[j·2π·(ux/M + vy/N)]

Reconstruction Property: If F(u, v) is the complete DFT of f(x, y), then IDFT[F(u, v)] = f(x, y), exactly in theory and to within floating-point rounding in practice.
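
The reconstruction property can be checked numerically with a forward and inverse transform pair. The sketch below assumes NumPy's default normalization, in which `np.fft.ifft2` applies the 1/(MN) factor; the random test image is an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.random((16, 12))          # real-valued test image (assumption)

F = np.fft.fft2(f)                # forward DFT, no scaling
f_back = np.fft.ifft2(F)          # inverse DFT, includes the 1/(MN) factor

# The round trip should match the input up to floating-point rounding,
# and the imaginary part of the result should be negligible.
roundtrip_error = np.max(np.abs(f_back.real - f))
max_imag = np.max(np.abs(f_back.imag))
print(roundtrip_error, max_imag)
```
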
The magnitude spectrum represents the amplitude of each frequency component:
|F(u, v)| = √[Re[F(u, v)]² + Im[F(u, v)]²]

Properties:
- Always non-negative: |F(u, v)| ≥ 0
- Symmetric for real-valued images: |F(u, v)| = |F(-u, -v)|
- Contains information about "how much" of each frequency is present
- Does not contain phase information
The phase spectrum represents the spatial distribution of frequency components:
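φ(u, v) = atan2(Im[F(u, v)], Re[F(u, v)])

The four-quadrant arctangent is required here: a plain arctan of the ratio Im/Re only covers (-π/2, π/2) and is undefined when Re[F(u, v)] = 0.

Properties:
- Range: -π < φ(u, v) ≤ π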
- Anti-symmetric for real images: φ(u, v) = -φ(-u, -v)
- Critical for preserving spatial structure
- More important than magnitude for human perception
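
The magnitude and phase definitions and their symmetry properties can be illustrated with a few NumPy calls. This is a sketch under stated assumptions: the 8 × 10 random image is illustrative, and `np.angle` is used as the four-quadrant phase.

```python
import numpy as np

rng = np.random.default_rng(2)
f = rng.random((8, 10))           # real-valued test image (assumption)
F = np.fft.fft2(f)

magnitude = np.abs(F)             # sqrt(Re^2 + Im^2)
phase = np.angle(F)               # four-quadrant arctangent of (Im, Re)

# In the unshifted spectrum, F(-u, -v) is stored at ((-u) mod M, (-v) mod N).
M, N = F.shape
u = np.arange(M)[:, None]
v = np.arange(N)[None, :]
F_neg = F[(-u) % M, (-v) % N]

# Magnitude symmetry for a real image: |F(u, v)| = |F(-u, -v)|.
magnitude_is_symmetric = np.allclose(magnitude, np.abs(F_neg))

# Phase anti-symmetry is checked through conjugate symmetry, which avoids
# the ambiguity between +pi and -pi at purely real negative coefficients.
is_conjugate_symmetric = np.allclose(F, np.conj(F_neg))

# Magnitude and phase together determine the spectrum completely.
F_rebuilt = magnitude * np.exp(1j * phase)
rebuild_error = np.max(np.abs(F_rebuilt - F))
print(magnitude_is_symmetric, is_conjugate_symmetric, rebuild_error)
```
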
For visualization, we apply logarithmic scaling:
L(u, v) = log(1 + |F(u, v)|)

This compresses the dynamic range, making both small and large frequency components visible.
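
A small synthetic example shows what the logarithmic scaling does. The test image (a constant offset plus one cosine along the column axis) and the final division by the maximum for display are assumptions made for illustration.

```python
import numpy as np

M, N = 64, 64
cols = np.arange(N)
# Constant offset 0.5 plus a cosine with 4 cycles across the width.
row = 0.5 + 0.25 * np.cos(2 * np.pi * 4 * cols / N)
f = np.tile(row, (M, 1))

F_shifted = np.fft.fftshift(np.fft.fft2(f))   # DC moved to (M/2, N/2)
magnitude = np.abs(F_shifted)
log_spectrum = np.log1p(magnitude)            # log(1 + |F|)

# The DC term dominates the raw spectrum; after the log the cosine peaks at
# (32, 28) and (32, 36) have a comparable display level.
peak_index = np.unravel_index(np.argmax(log_spectrum), log_spectrum.shape)
display = log_spectrum / log_spectrum.max()   # scaled to [0, 1] for plotting
print(peak_index, magnitude[32, 32], magnitude[32, 36])
```

`np.log1p` evaluates log(1 + x) directly, so coefficients with zero magnitude map to zero instead of negative infinity.
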
We partition the frequency domain into three non-overlapping regions using ideal (sharp cutoff) filters.
For a frequency point (u, v) relative to the center (M/2, N/2):
D(u, v) = √[(u - M/2)² + (v - N/2)²]

The low-pass filter passes only low frequencies within radius D_L:

H_low(u, v) = { 1, if D(u, v) ≤ D_L
{ 0, otherwise

Effect: Retains smooth variations, removes fine details
- Preserves: Overall shape, gradual shading, large structures
- Removes: Edges, textures, noise
The high-pass filter passes only high frequencies beyond radius D_H:

H_high(u, v) = { 1, if D(u, v) > D_H
{ 0, otherwise

Effect: Retains fine details, removes smooth variations
- Preserves: Edges, textures, rapid changes
- Removes: Smooth regions, overall brightness
The mid-pass (band-pass) filter passes frequencies in the annular region between D_L and D_H:

H_mid(u, v) = { 1, if D_L < D(u, v) ≤ D_H
{ 0, otherwise

Effect: Retains intermediate-scale features
- Preserves: Medium-scale structures, moderate edges
- Removes: Both very smooth and very fine details
Provided D_L ≤ D_H, the three filters form a complete partition of the frequency domain:
H_low(u, v) + H_mid(u, v) + H_high(u, v) = 1 ∀(u, v)
This ensures that every frequency component belongs to exactly one band.
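
The three ideal masks and the partition property can be sketched as follows. `make_band_masks` is a hypothetical helper, and the image shape and radii are illustrative values rather than the project's configuration.

```python
import numpy as np


def make_band_masks(shape, d_low, d_high):
    """Build boolean low/mid/high masks for a centered (fftshift-ed) spectrum."""
    if d_low > d_high:
        raise ValueError("d_low must not exceed d_high, or the bands overlap")
    M, N = shape
    u = np.arange(M)[:, None]
    v = np.arange(N)[None, :]
    # Distance of every frequency sample from the center (M/2, N/2).
    D = np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)
    h_low = D <= d_low
    h_mid = (D > d_low) & (D <= d_high)
    h_high = D > d_high
    return h_low, h_mid, h_high


h_low, h_mid, h_high = make_band_masks((100, 120), d_low=5.0, d_high=15.0)

# Every frequency sample should belong to exactly one band.
coverage = h_low.astype(int) + h_mid.astype(int) + h_high.astype(int)
print(coverage.min(), coverage.max(), h_low.sum(), h_mid.sum(), h_high.sum())
```
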
Filtering in the frequency domain is performed by element-wise multiplication:
F_low(u, v) = F(u, v) · H_low(u, v)
F_mid(u, v) = F(u, v) · H_mid(u, v)
F_high(u, v) = F(u, v) · H_high(u, v)
Linearity Property:
F(u, v) = F_low(u, v) + F_mid(u, v) + F_high(u, v)
This decomposition is exact and reversible.
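
The decomposition can be verified by multiplying a centered spectrum with the three masks and adding the results back together. The image, the radii, and the variable names below are assumptions chosen for a compact demonstration.

```python
import numpy as np

rng = np.random.default_rng(3)
f = rng.random((32, 32))                      # test image (assumption)
F = np.fft.fftshift(np.fft.fft2(f))           # centered spectrum

M, N = F.shape
u = np.arange(M)[:, None]
v = np.arange(N)[None, :]
D = np.hypot(u - M / 2, v - N / 2)

d_low, d_high = 4.0, 10.0                     # illustrative cutoff radii
H_low = (D <= d_low).astype(float)
H_mid = ((D > d_low) & (D <= d_high)).astype(float)
H_high = (D > d_high).astype(float)

# Element-wise multiplication selects one band of coefficients each.
F_low = F * H_low
F_mid = F * H_mid
F_high = F * H_high

# Because the masks sum to 1 everywhere, the bands add back to F.
recombined = F_low + F_mid + F_high
bands_sum_to_original = np.allclose(recombined, F)
print(bands_sum_to_original)
```
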
We generate seven reconstructions by applying IDFT to different frequency band combinations:
| Case | Frequency Bands | Formula | Expected Visual Characteristics |
|---|---|---|---|
| Full | Low + Mid + High | IDFT[F_low + F_mid + F_high] | Perfect reconstruction |
| Low + Mid | Low + Mid | IDFT[F_low + F_mid] | Smooth with moderate detail |
| Low + High | Low + High | IDFT[F_low + F_high] | Smooth regions with sharp edges |
| Mid + High | Mid + High | IDFT[F_mid + F_high] | Detailed but lacking overall structure |
| Low Only | Low | IDFT[F_low] | Blurred, smooth approximation |
| Mid Only | Mid | IDFT[F_mid] | Medium-scale features only |
| High Only | High | IDFT[F_high] | Edge map, high contrast |
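
The seven cases in the table are exactly the non-empty subsets of the three bands (2³ − 1 = 7), so they can be enumerated instead of listed by hand. In this sketch `BANDS`, `CASES`, and `combine_bands` are hypothetical names, and the toy 2 × 2 spectra only demonstrate the summation.

```python
from itertools import combinations

import numpy as np

BANDS = ("low", "mid", "high")

# All non-empty subsets, largest first, in the same order as the table.
CASES = [combo for size in (3, 2, 1) for combo in combinations(BANDS, size)]


def combine_bands(band_spectra, case):
    """Sum the spectra of the selected bands (F_combined in the text)."""
    return sum(band_spectra[name] for name in case)


# Toy spectra so the sums are easy to read.
band_spectra = {
    "low": np.full((2, 2), 1 + 0j),
    "mid": np.full((2, 2), 10 + 0j),
    "high": np.full((2, 2), 100 + 0j),
}
low_plus_high = combine_bands(band_spectra, ("low", "high"))

for case in CASES:
    print(" + ".join(case))
```
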
For any frequency combination F_combined(u, v):
1. Unshift: Apply the inverse fftshift to move the zero-frequency component back to index (0, 0).

   F_unshifted = ifftshift[F_combined]

2. Apply IDFT: Transform back to the spatial domain.

   f_complex(x, y) = IDFT[F_unshifted]

3. Extract Real Part: Discard the imaginary components. For a real input image and a mask that is symmetric about the DC term, these are only rounding error.

   f_real(x, y) = Re[f_complex(x, y)]

4. Denormalize: Scale back to the [0, 255] range.

   f_reconstructed(x, y) = clip(f_real(x, y) · 255, 0, 255)
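
The four reconstruction steps map directly onto NumPy calls. The function name `reconstruct` and the random 16 × 16 test image are assumptions; the result is kept as floating point, and whether to round to `uint8` afterwards is left open because the text does not specify it.

```python
import numpy as np


def reconstruct(F_combined_shifted):
    """Turn a centered spectrum of a [0, 1]-normalized image into [0, 255] pixels."""
    F_unshifted = np.fft.ifftshift(F_combined_shifted)   # step 1: unshift
    f_complex = np.fft.ifft2(F_unshifted)                # step 2: IDFT
    f_real = np.real(f_complex)                          # step 3: real part
    return np.clip(f_real * 255.0, 0.0, 255.0)           # step 4: denormalize


rng = np.random.default_rng(5)
image = rng.integers(0, 256, size=(16, 16)).astype(np.float64)

# Forward path as described in the text: normalize, transform, center.
F_shifted = np.fft.fftshift(np.fft.fft2(image / 255.0))

# With all coefficients kept, the reconstruction should match the input.
restored = reconstruct(F_shifted)
max_error = np.max(np.abs(restored - image))
print(max_error)
```
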
The Mean Squared Error (MSE) measures the average squared difference between the original image f and the reconstructed image f̂:
MSE = (1/MN) · Σ(x=0 to M-1) Σ(y=0 to N-1) [f(x, y) - f̂(x, y)]²
Properties:
- Range: [0, ∞)
- Lower is better (0 = perfect reconstruction)
- Units: squared intensity values
- Sensitive to outliers
- Does not correlate well with perceived quality
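
A direct implementation of the MSE formula is short, with one practical caveat: 8-bit arrays should be converted to floating point first, otherwise the subtraction wraps around. The helper name `mse` and the 2 × 2 example values are assumptions.

```python
import numpy as np


def mse(original, reconstructed):
    """Mean squared error between two images of equal shape."""
    a = np.asarray(original, dtype=np.float64)       # avoid uint8 wraparound
    b = np.asarray(reconstructed, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    return float(np.mean((a - b) ** 2))


original = np.array([[0, 255], [100, 200]], dtype=np.uint8)
reconstructed = np.array([[10, 250], [100, 190]], dtype=np.uint8)

# Differences are -10, 5, 0, 10, so the squared errors are 100, 25, 0, 100.
error = mse(original, reconstructed)
print(error)  # 56.25
```
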
The Peak Signal-to-Noise Ratio (PSNR) expresses MSE on a logarithmic decibel scale:
PSNR = 10 · log₁₀(MAX² / MSE) [dB]
Where MAX = 255 for 8-bit images.
Properties:
- Range: [0, ∞) dB (typically 20-50 dB for images)
- Higher is better (∞ = perfect reconstruction)
- Logarithmic scale makes differences more interpretable
- PSNR > 40 dB: Excellent quality
- PSNR 30-40 dB: Good quality
- PSNR 20-30 dB: Acceptable quality
- PSNR < 20 dB: Poor quality
Relationship to MSE:
Since PSNR = 10 · log₁₀(MAX²) - 10 · log₁₀(MSE), PSNR increases by exactly 10 dB each time MSE decreases by a factor of 10.
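
The PSNR formula and its relationship to MSE can be checked with a few lines. `psnr_from_mse` is a hypothetical helper; returning infinity for MSE = 0 is a convention assumed here so that a perfect reconstruction does not trigger a division by zero.

```python
import math


def psnr_from_mse(mse_value, max_value=255.0):
    """PSNR in dB for a given MSE; infinite for a perfect reconstruction."""
    if mse_value == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse_value)


psnr_a = psnr_from_mse(56.25)     # 255^2 / 56.25 = 1156
psnr_b = psnr_from_mse(5.625)     # MSE reduced by a factor of 10
gain = psnr_b - psnr_a            # expected to be 10 dB

print(round(psnr_a, 2), round(gain, 6))
```
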
The Structural Similarity Index (SSIM) measures perceptual similarity by comparing luminance, contrast, and structure. In this section x and y denote the two images (or local windows) being compared, not spatial coordinates:

SSIM(x, y) = [l(x, y)]^α · [c(x, y)]^β · [s(x, y)]^γ

Where:
- l(x, y): Luminance comparison

  l(x, y) = (2μ_x μ_y + C₁) / (μ_x² + μ_y² + C₁)

- c(x, y): Contrast comparison

  c(x, y) = (2σ_x σ_y + C₂) / (σ_x² + σ_y² + C₂)

- s(x, y): Structure comparison

  s(x, y) = (σ_xy + C₃) / (σ_x σ_y + C₃)

Notation:
- μ_x, μ_y: Mean intensities
- σ_x, σ_y: Standard deviations
- σ_xy: Covariance
- C₁, C₂, C₃: Stability constants
- α, β, γ: Exponents (typically all = 1)
Properties:
- Range: [-1, 1] (typically [0, 1] for similar images)
- Higher is better (1 = perfect structural similarity)
- Correlates better with human perception than MSE/PSNR
- SSIM > 0.95: Excellent similarity
- SSIM 0.90-0.95: Good similarity
- SSIM 0.80-0.90: Moderate similarity
- SSIM < 0.80: Poor similarity
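
To make the formula concrete, the sketch below evaluates SSIM once over the whole image with α = β = γ = 1. Several things here are assumptions and not taken from the project: the constants C₁ = (0.01·MAX)², C₂ = (0.03·MAX)², C₃ = C₂/2 follow the common choice from the SSIM literature, and the single global window is a simplification. Practical implementations, such as `skimage.metrics.structural_similarity`, average SSIM over local windows and will give different values.

```python
import numpy as np


def ssim_global(x, y, max_value=255.0):
    """Single-window SSIM with alpha = beta = gamma = 1 and C3 = C2 / 2."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (0.01 * max_value) ** 2      # assumed stability constants
    c2 = (0.03 * max_value) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    # With C3 = C2 / 2 the contrast and structure terms merge into
    # (2 * cov_xy + C2) / (var_x + var_y + C2).
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)


rng = np.random.default_rng(6)
img = rng.integers(0, 256, size=(16, 16)).astype(np.float64)

same = ssim_global(img, img)              # identical images
inverted = ssim_global(img, 255.0 - img)  # negative covariance
print(same, inverted)
```

The inverted image illustrates why the range extends below zero: its covariance with the original is negative, so the structure term changes sign.
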
Input: Grayscale image f(x, y)
Step 1: Preprocessing
f_norm(x, y) ← f(x, y) / 255
Step 2: Forward Transform
F(u, v) ← FFT2D[f_norm(x, y)]
F_shifted(u, v) ← fftshift[F(u, v)]
Step 3: Spectrum Analysis
|F(u, v)| ← √[Re² + Im²]
φ(u, v) ← atan2(Im, Re)
L(u, v) ← log(1 + |F(u, v)|)
Step 4: Filter Design
Compute D(u, v) for all (u, v)
H_low ← (D ≤ D_L)
H_high ← (D > D_H)
H_mid ← (D_L < D ≤ D_H)
Step 5: Frequency Decomposition
F_low ← F_shifted · H_low
F_mid ← F_shifted · H_mid
F_high ← F_shifted · H_high
Step 6: Reconstruction (for each combination)
F_combined ← Σ selected bands
F_unshifted ← ifftshift[F_combined]
f_complex ← IFFT2D[F_unshifted]
f_reconstructed ← clip(Re[f_complex] · 255, 0, 255)
Step 7: Quality Evaluation
For each reconstruction:
MSE ← mean[(f - f̂)²]
PSNR ← 10·log₁₀(255²/MSE)
SSIM ← structural_similarity(f, f̂)
Output: Reconstructed images + quality metrics
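
The steps above can be sketched end to end in NumPy. This is an illustrative sketch, not the contents of `image_fft_analyzer.py`: the function name `analyze`, the random test image, and the radii are assumptions. Step 3 (spectrum visualization) and SSIM are omitted to keep the example dependency-free.

```python
import numpy as np


def analyze(image, d_low, d_high):
    """Return {case name: (mse, psnr)} for the seven band combinations."""
    f = np.asarray(image, dtype=np.float64)
    M, N = f.shape

    # Steps 1-2: normalize to [0, 1], transform, center the spectrum.
    F_shifted = np.fft.fftshift(np.fft.fft2(f / 255.0))

    # Step 4: ideal masks from the distance to the center (M/2, N/2).
    u = np.arange(M)[:, None]
    v = np.arange(N)[None, :]
    D = np.hypot(u - M / 2, v - N / 2)
    masks = {
        "low": D <= d_low,
        "mid": (D > d_low) & (D <= d_high),
        "high": D > d_high,
    }

    # Step 5: split the spectrum into bands.
    bands = {name: F_shifted * mask for name, mask in masks.items()}

    cases = [("low", "mid", "high"), ("low", "mid"), ("low", "high"),
             ("mid", "high"), ("low",), ("mid",), ("high",)]
    results = {}
    for case in cases:
        # Step 6: combine, unshift, invert, denormalize.
        F_combined = sum(bands[name] for name in case)
        f_complex = np.fft.ifft2(np.fft.ifftshift(F_combined))
        f_hat = np.clip(np.real(f_complex) * 255.0, 0.0, 255.0)

        # Step 7: metrics against the original image.
        mse = float(np.mean((f - f_hat) ** 2))
        psnr = float("inf") if mse == 0 else float(10 * np.log10(255.0 ** 2 / mse))
        results["+".join(case)] = (mse, psnr)
    return results


rng = np.random.default_rng(7)
image = rng.integers(0, 256, size=(32, 32))     # stand-in for a grayscale image
results = analyze(image, d_low=3.0, d_high=8.0)
for name, (mse, psnr) in results.items():
    print(f"{name:>14}: MSE={mse:10.3f}  PSNR={psnr:7.2f} dB")
```
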
### 5.2 Computational Complexity
**FFT/IFFT:** O(MN log(MN)) per transform
**Filter Application:** O(MN) per filter
**Metric Calculation:** O(MN) per metric
**Total Complexity:** O(MN log(MN)) dominated by FFT operations
### 5.3 Numerical Considerations
1. **Floating Point Precision:** Use float64 for FFT to minimize numerical errors
2. **Normalization:** Input images normalized to [0, 1] before FFT
3. **Centering:** fftshift centers DC component for intuitive visualization
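4. **Clipping:** Output values clipped to [0, 255], because partial-band reconstructions can overshoot this range (for example through ringing) in addition to small rounding errors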
---
## 6. Usage
### 6.1 Requirements
```bash
pip install numpy opencv-python matplotlib scikit-image
```

Run the analyzer:

```bash
python image_fft_analyzer.py
```

Edit constants in image_fft_analyzer.py:

```python
INPUT_IMAGE_PATH = 'input_image.jpg'   # Input image path
LOW_PASS_RADIUS_PERCENT = 0.05         # D_L = 5% of image height
HIGH_PASS_RADIUS_PERCENT = 0.15        # D_H = 15% of image height
```

Output files:

- spectrum_analysis.png: Original image and log-magnitude spectrum
- frequency_filters.png: Visualization of LPF, MPF, HPF masks
- reconstruction_results.png: All 7 reconstructions with metrics table

Energy is conserved between spatial and frequency domains (Parseval's theorem):

Σ(x,y) |f(x, y)|² = (1/MN) · Σ(u,v) |F(u, v)|²

Zeroing a frequency band therefore removes exactly that band's share of the image energy.
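
The energy relation can be checked numerically, both for the full spectrum and for a single band. The test image and the low-pass radius of 4 are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
f = rng.random((16, 20))                      # test image (assumption)
F = np.fft.fft2(f)

# Parseval: spatial energy equals spectral energy divided by MN.
spatial_energy = np.sum(f ** 2)
frequency_energy = np.sum(np.abs(F) ** 2) / f.size

# Build an ideal low-pass mask in centered coordinates, then move it back to
# the unshifted layout so it lines up with F.
M, N = f.shape
u = np.arange(M)[:, None]
v = np.arange(N)[None, :]
D = np.hypot(u - M / 2, v - N / 2)
mask = np.fft.ifftshift(D <= 4.0)

# The same relation holds for the filtered image and its retained coefficients.
f_low = np.fft.ifft2(F * mask)
low_energy_spatial = np.sum(np.abs(f_low) ** 2)
low_energy_frequency = np.sum(np.abs(F * mask) ** 2) / f.size
low_fraction = low_energy_frequency / frequency_energy
print(spatial_energy, frequency_energy, low_fraction)
```
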
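Spatial domain convolution ↔ Frequency domain multiplication:

f(x, y) ⊗ h(x, y) ↔ F(u, v) · H(u, v)

For the DFT, ⊗ is circular (periodic) convolution. Our frequency domain filtering is therefore equivalent to circular convolution of the image with the filter's impulse response h(x, y) = IDFT[H(u, v)].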
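
The convolution theorem for the DFT can be demonstrated by comparing a direct circular convolution with the product of two spectra. `circular_convolve2d` is a hypothetical helper written for clarity, not speed, and the 3 × 3 box kernel is an assumption.

```python
import numpy as np


def circular_convolve2d(f, h):
    """Circular convolution: out[x, y] = sum_{m,n} h[m, n] * f[x-m, y-n] (mod M, N)."""
    out = np.zeros(f.shape, dtype=np.float64)
    M, N = h.shape
    for m in range(M):
        for n in range(N):
            if h[m, n] != 0:
                # np.roll shifts f so that entry [x, y] holds f[x-m, y-n].
                out += h[m, n] * np.roll(f, shift=(m, n), axis=(0, 1))
    return out


rng = np.random.default_rng(8)
f = rng.random((8, 8))                        # test image (assumption)

h = np.zeros((8, 8))
h[:3, :3] = 1.0 / 9.0                         # 3x3 box blur, zero-padded

spatial = circular_convolve2d(f, h)
spectral = np.real(np.fft.ifft2(np.fft.fft2(f) * np.fft.fft2(h)))
max_diff = np.max(np.abs(spatial - spectral))
print(max_diff)
```
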
- DC Component F(0, 0): Sum of all pixel values, i.e. MN times the average image intensity
- Low Frequencies: Smooth variations, overall shape
- Mid Frequencies: Edges, moderate details
- High Frequencies: Fine textures, noise
Ideal Filters (Used Here):
- Sharp cutoff in frequency domain
- Cause ringing artifacts (Gibbs phenomenon) in spatial domain
- Mathematically simple, computationally efficient
Practical Alternatives:
- Butterworth: Smooth transition, less ringing
- Gaussian: No ringing, but less selective
- Chebyshev: Sharper cutoff with controlled ripple
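
For comparison with the ideal masks, the smooth low-pass alternatives can be written as transfer functions on the same centered distance grid. The formulas used are the textbook forms H = 1 / (1 + (D/D₀)^(2n)) for Butterworth and H = exp(-D² / (2·D₀²)) for Gaussian; the function names, the 64 × 64 grid, the cutoff of 8, and the order n = 2 are assumptions. These filters are not part of the implementation described here.

```python
import numpy as np


def distance_grid(shape):
    """Distance of each frequency sample from the center (M/2, N/2)."""
    M, N = shape
    u = np.arange(M)[:, None]
    v = np.arange(N)[None, :]
    return np.hypot(u - M / 2, v - N / 2)


def butterworth_lowpass(shape, cutoff, order=2):
    """Butterworth low-pass: smooth roll-off, value 0.5 at D = cutoff."""
    D = distance_grid(shape)
    return 1.0 / (1.0 + (D / cutoff) ** (2 * order))


def gaussian_lowpass(shape, cutoff):
    """Gaussian low-pass: value exp(-0.5) at D = cutoff."""
    D = distance_grid(shape)
    return np.exp(-(D ** 2) / (2.0 * cutoff ** 2))


H_bw = butterworth_lowpass((64, 64), cutoff=8.0)
H_g = gaussian_lowpass((64, 64), cutoff=8.0)

# Sample the response at the center and at the cutoff radius.
print(H_bw[32, 32], H_bw[32, 40], H_g[32, 32], H_g[32, 40])
```

Unlike the ideal masks, these responses take values between 0 and 1, so a matching high-pass can be formed as 1 − H if a set of bands that still sums to 1 is wanted.
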
Expected ranking based on frequency content preservation (rough guidance; actual values depend on the image and on the chosen D_L and D_H):
Highest Quality (PSNR > 35 dB, SSIM > 0.95):
- Low + Mid (preserves most visual information)
Good Quality (PSNR 25-35 dB, SSIM 0.85-0.95):
- Low + High (preserves structure and edges)
- Mid + High (preserves details, but drops the DC term F(0, 0), so overall brightness is lost and PSNR can fall below this range)
Poor Quality (PSNR < 25 dB, SSIM < 0.85):
- Low Only (too blurred)
- Mid Only (lacks structure)
- High Only (edge map only)
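
Typical applications of frequency-band analysis:
- Compression: Low-frequency bands carry most of the visual information in relatively few coefficients, which enables compression.
- Noise Reduction: Attenuating high frequencies (low-pass filtering) removes noise while preserving image structure.
- Edge Detection: High-pass filtering isolates edges and fine details.
- Enhancement: Selective frequency amplification enhances specific features.
- Quality Analysis: Band-wise reconstruction shows which frequencies contribute to image perception and quality.
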
| Symbol | Meaning |
|---|---|
| f(x, y) | Spatial domain image |
| F(u, v) | Frequency domain representation |
| M, N | Image dimensions (height, width) |
| j | Imaginary unit (√-1) |
| ⊗ | Convolution operator |
| Σ | Summation |
| Re[·] | Real part |
| Im[·] | Imaginary part |
| \|F(u, v)\| | Magnitude spectrum |
| φ(u, v) | Phase spectrum |
| D(u, v) | Distance from frequency center |
| D_L | Low-pass cutoff radius |
| D_H | High-pass cutoff radius |
| H_low, H_mid, H_high | Filter transfer functions |
| f̂(x, y) | Reconstructed image |
| μ | Mean |
| σ | Standard deviation |
| σ_xy | Covariance |
This implementation provides a complete framework for understanding how different frequency components contribute to image structure and quality. By decomposing images into low, mid, and high frequency bands and reconstructing from various combinations, we gain insight into:
- Frequency Importance: Low frequencies carry most structural information
- Perceptual Quality: Mid frequencies are critical for perceived sharpness
- Detail Preservation: High frequencies contain fine details but contribute less to overall quality
- Metric Behavior: SSIM correlates better with perception than MSE/PSNR
Because every step is defined by an explicit formula, the results are reproducible, and the framework provides a foundation for more advanced image processing applications.

- Author: Image Processing Analysis Tool
- Date: November 19, 2025
- License: MIT
- Version: 1.0.0