LFIC-DRASC: Deep Light Field Image Compression Using Disentangled Representation and Asymmetrical Strip Convolution

LFIC-DRASC - Light Field Image Compression Neural Network

LFIC-DRASC: Deep Light Field Image Compression Using Disentangled Representation and Asymmetrical Strip Convolution
[paper] [code]
Shiyu Feng, Yun Zhang, Linwei Zhu, Sam Kwong
IEEE Transactions on Broadcasting (TBC), 2025

Project Introduction

LFIC-DRASC is an end-to-end deep learning model for light field image compression. It pairs a disentangled light field representation network with horizontal and vertical asymmetrical strip convolutions that capture long-range correlations in the feature space, improving coding efficiency.

Abstract

A Light-Field (LF) image is an emerging form of 4D light-ray data that can realistically present the spatial and angular information of a 3D scene. However, the large data volume of LF images is the most challenging issue in real-time processing, transmission, and storage. In this paper, we propose an end-to-end deep LF Image Compression method Using Disentangled Representation and Asymmetrical Strip Convolution (LFIC-DRASC) to improve coding efficiency. Firstly, we formulate the LF image compression problem as learning a disentangled LF representation network and an image encoding-decoding network. Secondly, we propose two novel feature extractors that leverage the structural prior of LF data by integrating features across different dimensions. Meanwhile, a disentangled LF representation network is proposed to enhance LF feature disentangling and decoupling. Thirdly, we propose the LFIC-DRASC for LF image compression, where two Asymmetrical Strip Convolution (ASC) operators, i.e., horizontal and vertical, are proposed to capture long-range correlation in the LF feature space. These two ASC operators can be combined with the square convolution to further decouple LF features, which enhances the model’s ability to represent intricate spatial relationships. Experimental results demonstrate that the proposed LFIC-DRASC achieves an average bit rate reduction of 20.5% compared with the state-of-the-art methods. Source code and pre-trained models of LFIC-DRASC are available at https://github.com/SYSU-Video/LFIC-DRASC.
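To make the idea of combining strip convolutions with a square convolution concrete, here is a minimal PyTorch sketch. It is not the ASC operator from the paper or this repository: the class name `StripConvBlock`, the strip length of 7, and the fusion by plain summation are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class StripConvBlock(nn.Module):
    """Hypothetical block: a square conv plus horizontal and vertical strip convs."""

    def __init__(self, channels: int, strip_len: int = 7):
        super().__init__()
        pad = strip_len // 2  # odd strip_len keeps the spatial size unchanged
        # Ordinary 3x3 square convolution for local context.
        self.square = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # 1 x k kernel: long receptive field along the width.
        self.horizontal = nn.Conv2d(
            channels, channels, kernel_size=(1, strip_len), padding=(0, pad)
        )
        # k x 1 kernel: long receptive field along the height.
        self.vertical = nn.Conv2d(
            channels, channels, kernel_size=(strip_len, 1), padding=(pad, 0)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumed fusion: sum the three branches.
        return self.square(x) + self.horizontal(x) + self.vertical(x)


x = torch.randn(1, 48, 64, 64)  # 48 channels, matching --N 48 in the training command
y = StripConvBlock(48)(x)
```

A 1×k or k×1 kernel covers a long span along one axis with k weights per channel pair, whereas a k×k square kernel covering the same span needs k² weights.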

Framework Overview

[Figure: LFIC-DRASC framework overview]

Environment Configuration

Main dependencies:

torch==2.0.1
torchvision==0.15.2
compressai==1.1.5
pytorch-msssim==1.0.0

Complete dependencies can be installed with the following command:

pip install -r requirements.txt
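After installation, a quick check that the pinned versions are the ones actually in use can save debugging time. The helper below is a sketch using only the standard library; `check_pins` and `PINNED` are hypothetical names, and ignoring local build tags such as `+cu118` is an assumption about what should count as a match.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the "Main dependencies" list above.
PINNED = {
    "torch": "2.0.1",
    "torchvision": "0.15.2",
    "compressai": "1.1.5",
    "pytorch-msssim": "1.0.0",
}


def check_pins(pins, lookup=version):
    """Return {package: installed_version_or_None} for every pin that does not match."""
    problems = {}
    for name, wanted in pins.items():
        try:
            found = lookup(name)
        except PackageNotFoundError:
            found = None
        # Ignore local build tags such as "+cu118" when comparing.
        if found is None or found.split("+")[0] != wanted:
            problems[name] = found
    return problems


print(check_pins(PINNED))  # an empty dict means every pin is satisfied
```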

Usage

Training Model

python train.py -d dataset --N 48 --angRes 13 --n_blocks 1 -e 100 -lr 1e-4 -n 8 --lambda 3e-3 --batch-size 16 --test-batch-size 8 --aux-learning-rate 1e-3 --patch-size 832 832 --cuda --save --seed 1926 --gpu-id 0,1,2,3 --savepath ./checkpoint

Main parameters:

  • -d dataset: Training dataset path
  • --N 48: Number of channels
  • --angRes 13: Angular resolution
  • --n_blocks 1: Number of iteration blocks
  • --lambda 3e-3: Rate-distortion parameter
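Two of these values are easier to read with a little arithmetic. The sketch below assumes that a training patch tiles `angRes × angRes` views, so that `--patch-size 832 832` with `--angRes 13` corresponds to 64 × 64 pixels per view. It also assumes the rate-distortion objective has the form used in CompressAI's example training script for MSE, `bpp + lambda * 255^2 * mse`. Neither assumption has been checked against `train.py`, and the function names are hypothetical.

```python
def view_size(patch_hw, ang_res):
    """Per-view height and width, assuming the patch tiles ang_res x ang_res views."""
    height, width = patch_hw
    if height % ang_res or width % ang_res:
        raise ValueError("patch size must be a multiple of the angular resolution")
    return height // ang_res, width // ang_res


def rd_loss(bpp, mse, lmbda):
    """Rate-distortion objective in the assumed form: rate + lambda * 255^2 * MSE."""
    return bpp + lmbda * 255 ** 2 * mse


print(view_size((832, 832), 13))  # (64, 64)
```

Under this form, a larger `--lambda` weights distortion more heavily, which pushes training toward higher quality at a higher bit rate.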

Update Entropy Model

python updata.py checkpoint_path -n checkpoint_name

Model Testing

python Inference.py --dataset test_directory --output_path output_directory -p checkpoint.pth.tar

Note: We have retrained the models on an RTX 4090 and updated the checkpoints, which are provided in the ckpt folder.
