[Benchmark] Support ScienceOlympiad Galaxy10DECaLS VRSBench #1410

Open
zhouyujin wants to merge 2 commits into open-compass:main from zhouyujin:add-benchmark

@zhouyujin

Add three datasets: ScienceOlympiad, Galaxy10DECaLS, VRSBench

  1. ScienceOlympiad
    TSV link: https://huggingface.co/datasets/YuJJJJin/ScienceOlympiad.tsv
    ScienceOlympiad focuses on competitive‑level physics and chemistry problems with multimodal content. It evaluates models on scientific reasoning and visual comprehension.
  2. Galaxy10DECaLS
    TSV link: https://huggingface.co/datasets/YuJJJJin/Galaxy10DECaLS.tsv
    Galaxy10DECaLS is a curated image classification dataset with 1,774 galaxy images across 10 classes. It evaluates models’ ability to classify astronomical objects based on visual features.
  3. VRSBench
    TSV link: https://huggingface.co/datasets/YuJJJJin/VRSBench.tsv
    VRSBench is derived from the VQA test set of the VRSBench benchmark and evaluates multimodal understanding of remote‑sensing imagery.
    Two variants are provided:
    • VRSBench.tsv: Full evaluation set with 37,409 VQA samples.
    • VRSBench_MINI.tsv: Compact evaluation set with 3,735 samples (a 10% stratified sample of the full set, seed=42).
    Both variants cover 12 question categories and assess a model’s ability to answer remote‑sensing questions through visual analysis and reasoning.
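
The sampling script for VRSBench_MINI is not part of this description. As a rough illustration only, a seeded 10% stratified sample could be drawn along these lines. The `category` column name, the `stratified_sample` helper, and the round-down rule per category are all assumptions made for this sketch, so it is not expected to reproduce the exact 3,735 rows.

```python
import random
from collections import defaultdict


def stratified_sample(rows, key="category", fraction=0.1, seed=42):
    """Draw roughly `fraction` of the rows from every group defined by `key`.

    Hypothetical helper: `key="category"` and rounding down per group are
    assumptions, not details confirmed by the PR.
    """
    rng = random.Random(seed)  # local RNG so the result is reproducible

    # Bucket the rows by their category value.
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    sampled = []
    # Iterate categories in sorted order so the output does not depend on
    # the order in which categories first appear in the input.
    for category in sorted(groups):
        members = groups[category]
        k = max(1, int(len(members) * fraction))  # keep at least one per group
        sampled.extend(rng.sample(members, k))
    return sampled


# Toy stand-in for the full TSV: 300 rows spread evenly over 3 categories.
rows = [{"index": i, "category": f"cat{i % 3}"} for i in range(300)]
mini = stratified_sample(rows)
```

Sampling within each category, rather than over the whole table, keeps the category proportions of the compact set close to those of the full set, and fixing the seed makes the subset reproducible.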
