This repository is the official GitHub page of MLBCAP, the first-place winner of the 2nd SciCap Challenge. MLBCAP has been accepted for presentation at AI4Research @AAAI 2025.

teamreboott/MLBCAP

MLBCAP

This repository is the official GitHub page of MLBCAP, the first-place winner of the 2nd SciCap Challenge. MLBCAP has been accepted for presentation at AI4Research @ AAAI 2025.

Paper: https://arxiv.org/abs/2501.02552
Dataset (HuggingFace): Link

📌 Introduction

Scientific figure captioning is a challenging task that demands contextually accurate descriptions of visual content. Existing approaches often oversimplify the task by treating it as either an image-to-text conversion or text summarization problem, leading to suboptimal results. Furthermore, commonly used datasets derived from arXiv papers are plagued with low-quality captions, making them unsuitable for effectively training large language models (LLMs).

MLBCAP addresses these challenges by leveraging a multi-LLM collaborative approach to generate high-quality captions. 🚀

Figure: MLBCAP diagram


📊 Dataset Overview

This dataset stems from the results of the 2nd SciCap Challenge and uses the competition's hidden test dataset. It consists of synthetic, high-quality captions generated by MLBCAP.

Note: This dataset is based on the hidden test dataset from the challenge, and the original captions from arXiv papers are not publicly available.


🏆 2nd SciCap Challenge

The 2nd SciCap Challenge was held during IJCAI 2024 (August 3-9, Jeju Island, South Korea). The competition featured two tracks based on caption length constraints:

  • Short Caption Track: At least 30% of the generated captions must be shorter than the author-written captions.
  • Long Caption Track: At least 30% of the generated captions must be longer than the author-written captions.
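The two length constraints can be checked with a few lines of Python. This is a minimal sketch, not the challenge's official scoring script: the source does not say how caption length is measured, so counting whitespace-separated words is an assumption, and the names `word_count`, `fraction_where`, `meets_short_track`, and `meets_long_track` are hypothetical.

```python
from typing import Callable, Sequence


def word_count(text: str) -> int:
    """Length measure assumed for this sketch: whitespace-separated tokens."""
    return len(text.split())


def fraction_where(
    generated: Sequence[str],
    reference: Sequence[str],
    compare: Callable[[int, int], bool],
    length: Callable[[str], int] = word_count,
) -> float:
    """Fraction of caption pairs whose lengths satisfy `compare`."""
    if len(generated) != len(reference):
        raise ValueError("generated and reference must have the same length")
    if not generated:
        return 0.0
    hits = sum(
        1 for g, r in zip(generated, reference) if compare(length(g), length(r))
    )
    return hits / len(generated)


def meets_short_track(
    generated: Sequence[str], reference: Sequence[str], threshold: float = 0.30
) -> bool:
    """True if at least `threshold` of generated captions are strictly shorter."""
    return fraction_where(generated, reference, lambda g, r: g < r) >= threshold


def meets_long_track(
    generated: Sequence[str], reference: Sequence[str], threshold: float = 0.30
) -> bool:
    """True if at least `threshold` of generated captions are strictly longer."""
    return fraction_where(generated, reference, lambda g, r: g > r) >= threshold
```

Captions of equal length count toward neither track here. A different length measure, such as characters or model tokens, can be passed to `fraction_where` through the `length` argument.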

✨ Features of the Dataset

The dataset includes the following features:

  • figure_type: Extracted from the SciCap dataset
  • ocr: Extracted from the SciCap dataset
  • paragraph: Extracted from the SciCap dataset
  • mention: Extracted from the SciCap dataset
  • categories: Extracted from the SciCap dataset
  • figure_description: Generated by GPT-4o
  • mlbcap_long: Captions generated by MLBCAP-long
  • mlbcap_short: Captions generated by MLBCAP-short
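The feature list above can be treated as a simple record schema. The sketch below assumes each dataset row is available as a dictionary keyed by these feature names. It makes no assumption about the value types or about how the rows are loaded. The helpers `missing_fields` and `caption_for` are hypothetical names introduced for illustration.

```python
from typing import Any, List, Mapping

# Feature names exactly as listed in this README.
EXPECTED_FIELDS = (
    "figure_type",
    "ocr",
    "paragraph",
    "mention",
    "categories",
    "figure_description",
    "mlbcap_long",
    "mlbcap_short",
)

# Maps a track name (an assumption of this sketch) to its caption field.
CAPTION_FIELD_BY_TRACK = {"long": "mlbcap_long", "short": "mlbcap_short"}


def missing_fields(record: Mapping[str, Any]) -> List[str]:
    """Return the expected feature names that are absent from `record`."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def caption_for(record: Mapping[str, Any], track: str) -> Any:
    """Return the MLBCAP caption stored for the given track ("long" or "short")."""
    if track not in CAPTION_FIELD_BY_TRACK:
        raise ValueError(f"unknown track: {track!r}")
    return record[CAPTION_FIELD_BY_TRACK[track]]
```

For example, `caption_for(row, "short")` returns the `mlbcap_short` value of a row, and `missing_fields(row)` returns an empty list when every listed feature is present.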

🌟 Quality of MLBCAP's Captions

Human evaluation within the SciCap Challenge confirms the high quality of MLBCAP-generated captions. Three judges evaluated the captions, with the following results:

  • MLBCAP-long: Demonstrated higher quality than the original captions written by arXiv authors. 💪
  • MLBCAP-short: Achieved quality similar to the original captions written by authors. 🤝

Figure: Quality evaluation


📎 Citation

If you use MLBCAP in your research, please cite our paper:

@misc{kim2025multillmcollaborativecaptiongeneration,
      title={Multi-LLM Collaborative Caption Generation in Scientific Documents}, 
      author={Jaeyoung Kim and Jongho Lee and Hong-Jun Choi and Ting-Yao Hsu and Chieh-Yang Huang and Sungchul Kim and Ryan Rossi and Tong Yu and Clyde Lee Giles and Ting-Hao 'Kenneth' Huang and Sungchul Choi},
      year={2025},
      eprint={2501.02552},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.02552}, 
}
