HEViTPose: towards high-accuracy and efficient 2D human pose estimation with cascaded group spatial reduction attention
Chengpeng Wu
With the code contained in this repo, you should be able to reproduce the following results.
| Method | Test set | Input size | Params | GFLOPs | Head | Shoulder | Elbow | Wrist | Hip | Knee | Ankle | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HEViTPose-T | MPII val | 256×256 | 3.21M | 1.75G | 95.9 | 94.9 | 87.4 | 81.6 | 87.4 | 81.6 | 77.2 | 87.2 |
| HEViTPose-S | MPII val | 256×256 | 5.88M | 3.64G | 96.3 | 95.2 | 88.7 | 83.3 | 88.5 | 83.9 | 79.5 | 88.5 |
| HEViTPose-B | MPII val | 256×256 | 10.63M | 5.58G | 96.5 | 95.6 | 89.5 | 84.5 | 89.1 | 85.7 | 81.1 | 89.4 |
| HEViTPose-T | MPII test-dev | 256×256 | 3.21M | 1.75G | 97.6 | 95.1 | 89.0 | 83.6 | 89.1 | 83.9 | 79.1 | 88.7 |
| HEViTPose-S | MPII test-dev | 256×256 | 5.88M | 3.64G | 97.8 | 95.9 | 90.5 | 86.0 | 89.7 | 86.0 | 81.7 | 90.1 |
| HEViTPose-B | MPII test-dev | 256×256 | 10.63M | 5.58G | 98.0 | 96.1 | 91.3 | 86.5 | 90.2 | 86.6 | 83.0 | 90.7 |
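The per-joint MPII numbers above are PCKh scores: a keypoint counts as correct when its predicted location falls within a fraction of the ground-truth head segment length. As a rough illustration of the metric (this is a minimal sketch with toy data, not the repo's evaluation code):

```python
import numpy as np

def pckh(pred, gt, head_sizes, thr=0.5):
    """Percentage of keypoints whose prediction lies within
    thr * head-segment-length of the ground truth (PCKh@thr)."""
    # pred, gt: (N, K, 2) keypoint coordinates; head_sizes: (N,)
    dist = np.linalg.norm(pred - gt, axis=-1)   # (N, K) Euclidean distances
    norm = head_sizes[:, None] * thr            # per-person threshold
    return 100.0 * (dist <= norm).mean()

# toy example: 1 person, 3 joints, head segment length 10
gt = np.array([[[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]])
pred = np.array([[[1.0, 0.0], [10.0, 4.0], [0.0, 30.0]]])
print(pckh(pred, gt, np.array([10.0])))  # ≈ 66.7: two of three joints within 5 px
```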
| Method | Test set | Input size | AP | AP@0.5 | AP@0.75 | AP (M) | AP (L) | AR |
|---|---|---|---|---|---|---|---|---|
| HEViTPose-B | COCO val | 256×256 | 75.4 | 93.6 | 83.5 | 72.4 | 79.6 | 78.2 |
| HEViTPose-B | COCO test-dev | 256×256 | 72.6 | 92.0 | 80.9 | 69.2 | 78.2 | 78.0 |
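The COCO AP/AR values above are computed from Object Keypoint Similarity (OKS), which plays the role IoU plays in detection: per-keypoint distances are turned into a similarity in [0, 1], weighted by per-keypoint falloff constants and the object area. A minimal sketch of the OKS formula (the sigma values shown are only a three-keypoint subset used for illustration; the official evaluation lives in the COCO API):

```python
import numpy as np

def oks(pred, gt, area, sigmas, visible):
    """Object Keypoint Similarity between one predicted and one GT pose.
    pred, gt: (K, 2) coordinates; area: GT object area;
    sigmas: (K,) per-keypoint constants; visible: (K,) 0/1 flags."""
    d2 = np.sum((pred - gt) ** 2, axis=-1)      # squared distances, (K,)
    var = (2 * sigmas) ** 2                      # per-keypoint variance
    e = d2 / (2 * var * (area + np.spacing(1)))  # normalized error
    return np.sum(np.exp(-e) * visible) / max(np.sum(visible), 1)

sigmas = np.array([0.026, 0.025, 0.025])  # e.g. nose and eyes in COCO
gt = np.array([[50.0, 50.0], [60.0, 45.0], [40.0, 45.0]])
print(oks(gt, gt, area=400.0, sigmas=sigmas, visible=np.array([1, 1, 1])))  # 1.0
```

AP@0.5 then counts a prediction as a match when its OKS with a ground-truth pose exceeds 0.5, and the overall AP averages over OKS thresholds 0.50:0.05:0.95.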
Example predictions of the HEViTPose model on the MPII (top) and COCO (bottom) datasets, covering challenging cases such as occlusion, multiple people, and viewpoint and appearance changes.
git clone https://github.com/T1sweet/HEViTPose
cd ./HEViTPose
conda create -n HEViTPose python=3.9
conda activate HEViTPose

Our model is trained on a GPU platform and relies on the following versions: torch==1.10.1+cu113, torchvision==0.11.2+cu113.

conda install pytorch torchvision cudatoolkit=11.3 -c pytorch

Our code is based on the MMPose 0.29.0 codebase, and dependencies can be installed through the methods provided by MMPose. Install MMCV using MIM:

pip install -U openmim
mim install mmcv-full==1.4.5

Install the other dependencies:

pip install -r requirements.txt

Download MPII and COCO from their websites and put the files under the data directory following the structure below; (xxx.json) denotes the original file name.
./data
├── coco
│   ├── annotations
│   │   ├── coco_train.json (person_keypoints_train2017.json)
│   │   ├── coco_val.json (person_keypoints_val2017.json)
│   │   └── coco_test.json (image_info_test-dev2017.json)
│   └── images
│       ├── train2017
│       │   └── 000000000009.jpg
│       ├── val2017
│       │   └── 000000000139.jpg
│       └── test2017
│           └── 000000000001.jpg
└── mpii
    ├── annotations
    │   ├── mpii_train.json (refer to DEKR, link: https://github.com/HRNet/DEKR)
    │   ├── mpii_val.json
    │   ├── mpii_test.json
    │   └── mpii_gt_val.mat
    └── images
        └── 100000.jpg
Change the checkpoint path by modifying pretrained in HEViTPose-B_mpii_256x256.py, and run the following command:

python tools/test.py config checkpoint

The config argument specifies the configuration file and must be set.
The checkpoint argument specifies the trained weight file and must be set.
# evaluate HEViTPose-B on mpii val set
python tools/test.py ../configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/HEViTPose-B_mpii_256x256.py /work_dir/HEViTPose/HEViTPose-B.pth
# evaluate HEViTPose-S on mpii val set
python tools/test.py ../configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/HEViTPose-S_mpii_256x256.py /work_dir/HEViTPose/HEViTPose-S.pth
# evaluate HEViTPose-T on mpii val set
python tools/test.py ../configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/HEViTPose-T_mpii_256x256.py /work_dir/HEViTPose/HEViTPose-T.pth
# evaluate HEViTPose-B on coco val set
python tools/test.py ../configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/HEViTPose-B_coco_256x256.py /work_dir/HEViTPose/HEViTPose-B_coco.pth

To train, change the checkpoint path by modifying pretrained in HEViTPose-B_mpii_256x256.py, and run the following commands:

# train HEViTPose-B on mpii train set
python tools/train.py ../configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/HEViTPose-B_mpii_256x256.py
# train HEViTPose-B on coco train2017 set
python tools/train.py ../configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/HEViTPose-B_coco_256x256.py

If you have any questions about this code or paper, feel free to contact me at [email protected].
If you find this code useful for your research, please cite our paper:
@misc{wu2024hevitpose,
    title = {HEViTPose: High-Efficiency Vision Transformer for Human Pose Estimation},
    author = {Chengpeng Wu and Guangxing Tan and Chunyu Li},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2023},
    eprint = {2311.13615},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

This algorithm is based on the MMPose codebase, and its main ideas are inspired by EfficientViT, PVTv2, Swin, and other papers.
@misc{mmpose2020,
title={OpenMMLab Pose Estimation Toolbox and Benchmark},
author={MMPose Contributors},
howpublished = {\url{https://github.com/open-mmlab/mmpose}},
year={2020}
}
