DT-JRD


DT-JRD: Deep Transformer-based Just Recognizable Difference Prediction Model for Video Coding for Machines
[paper] [code]
Junqi Liu, Yun Zhang, Xiaoqi Wang, Long Xu, Sam Kwong
IEEE Transactions on Multimedia (TMM), 2025

Abstract

Just Recognizable Difference (JRD) represents the minimum visual difference that is detectable by machine vision, which can be exploited to promote machine vision-oriented visual signal processing. In this paper, we propose a Deep Transformer-based JRD (DT-JRD) prediction model for Video Coding for Machines (VCM), where the accurately predicted JRD can be used to reduce the coding bit rate while maintaining the accuracy of machine tasks. Firstly, we model JRD prediction as multi-class classification and propose a DT-JRD prediction model that integrates an improved embedding, content and distortion feature extraction, multi-class classification, and a novel learning strategy. Secondly, inspired by the perceptual property that machine vision exhibits a similar response to distortions near the JRD, we propose an asymptotic JRD loss using Gaussian Distribution-based Soft Labels (GDSL), which significantly extends the number of training labels and relaxes classification boundaries. Finally, we propose a DT-JRD-based VCM scheme that reduces coding bits while maintaining object detection accuracy. Extensive experimental results demonstrate that the mean absolute error of the JRD predicted by DT-JRD is 5.574, outperforming the state-of-the-art JRD prediction model by 13.1%. Coding experiments show that, compared with VVC, the DT-JRD-based VCM achieves an average bit rate reduction of 29.58% while maintaining object detection accuracy.
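
For intuition, below is a minimal sketch of how Gaussian Distribution-based Soft Labels could be formed for the multi-class JRD formulation described above. The function names, sigma value, and class count are illustrative assumptions, not the exact implementation used in the paper.

import torch

# Sketch of GDSL: spread each ground-truth JRD class index into a Gaussian
# distribution over all classes, then train with soft-label cross-entropy.
# num_classes=64 and sigma=2.0 are assumed values for illustration.
def gaussian_soft_labels(jrd_index, num_classes=64, sigma=2.0):
    classes = torch.arange(num_classes, dtype=torch.float32)        # (K,)
    diff = classes.unsqueeze(0) - jrd_index.float().unsqueeze(1)    # (B, K)
    weights = torch.exp(-diff.pow(2) / (2 * sigma ** 2))            # Gaussian kernel
    return weights / weights.sum(dim=1, keepdim=True)               # rows sum to 1

def gdsl_loss(logits, jrd_index):
    soft = gaussian_soft_labels(jrd_index, num_classes=logits.size(1))
    return -(soft * torch.log_softmax(logits, dim=1)).sum(dim=1).mean()

Compared with one-hot targets, the Gaussian spreads supervision to neighboring JRD classes, which matches the observation that machine vision responds similarly to distortions near the JRD.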

Framework Overview

Requirements

🧩 This project was trained and tested with:

  • 🐍 Python 3.10.14

📦 To install required packages, simply run:

pip install -r requirements.txt

🗂️ Project Directory Structure

DT-JRD
├── jsonfiles/
│   ├── all_GT_classes.json
│   ├── all_objects_infos.json
│   ├── coco80_indices.json
│   ├── JRD_info.json
│   ├── test_names.json
│   ├── train_names.json
│   └── val_names.json
├── data/
│   ├── original/
│   └── distorted/
├── pre_weights/
│   └── pretrained_vit.pth
├── train.py
├── test.py
├── dataset.py
├── model.py
└── utils.py

📥 The pretrained ViT weights can be downloaded from:

📊 Dataset

(1) In this work and our previous study ([paper] [code]), we use the OW-JRD (Object-wise Just Recognizable Distortion) dataset. The dataset consists of original and distorted images of objects detected in the COCO test set. The prediction models presented in this paper are trained and validated on OW-JRD. DT-JRD is a no-reference model, so only the original data is required; in contrast, BC-JRD is a full-reference model, which requires both the original and distorted data for training and validation.
(2) In addition, we provide the JRD image dataset (extraction code: hrrn), which consists of 8,471 original images selected from the COCO test set, each with 64 corresponding distorted versions. The OW-JRD dataset is constructed from the JRD image dataset, which also supports secondary development such as customized dataset construction and experiments involving VVC encoding.
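
For orientation, a split file such as train_names.json can be read as sketched below; the exact JSON schemas are assumptions (a list of sample names and a name-to-JRD mapping) and should be checked against the actual files.

import json

# Hypothetical loader: assumes train_names.json holds a list of sample names
# and JRD_info.json maps each name to its ground-truth JRD level.
with open("jsonfiles/train_names.json") as f:
    train_names = json.load(f)

with open("jsonfiles/JRD_info.json") as f:
    jrd_info = json.load(f)

labels = [jrd_info[name] for name in train_names if name in jrd_info]
print(f"{len(labels)} training samples with JRD labels")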

Train

python train.py --size 384 --epochs 20 --batchsize 32 --lr 0.01 --gpus 0,1 --device cuda:0

Test

python test.py --train_weights your_checkpoint_path
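
The paper reports JRD prediction quality as mean absolute error (MAE). As a reference, the metric itself is just the following; pred_jrd and gt_jrd are hypothetical example arrays, not outputs of test.py.

import numpy as np

# MAE between predicted and ground-truth JRD levels (example values only).
pred_jrd = np.array([30, 42, 25])
gt_jrd = np.array([33, 40, 28])
mae = np.mean(np.abs(pred_jrd - gt_jrd))
print(f"MAE = {mae:.3f}")   # DT-JRD reaches an MAE of 5.574 on OW-JRD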

📖 Citation

If you find our work useful or relevant to your research, please cite our paper:

@article{liu2024dt,
  title={DT-JRD: Deep Transformer based Just Recognizable Difference Prediction Model for Video Coding for Machines},
  author={Liu, Junqi and Zhang, Yun and Wang, Xiaoqi and Xu, Long and Kwong, Sam},
  journal={arXiv preprint arXiv:2411.09308},
  year={2024}
}

