
Enhancing Reasoning Skills in Small Persian Medical Language Models Can Outperform Large-Scale Data Training

Mehrdadghassabi/Gaokerena-R

📃 Paper | 🤗 Hugging Face repository


📍 Overview

We present gaokerena-R, a model trained with a limited-data approach to enhance the Persian medical reasoning capabilities of the aya-expanse-8b model. Despite using less data, gaokerena-R outperforms our previous model, gaokerena-V, which was trained on a much larger dataset. This demonstrates the effectiveness of our reasoning-focused training strategy under data-constrained conditions.

🏃 Training process

We propose two methods to enhance the reasoning capabilities of the baseline model. In both, a teacher model guides the baseline (student) model, and the student is then trained with Direct Preference Optimization (DPO). We primarily used Method A because Method B is considerably more time-consuming.
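For readers unfamiliar with DPO, the per-example objective can be sketched as below. This is the standard DPO loss, not code from this repository; the function name, the argument names, and the default `beta=0.1` are illustrative assumptions, and the inputs are assumed to be summed sequence log-probabilities under the trained (policy) model and a frozen reference model.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard per-example DPO loss (illustrative sketch, not the repo's code).

    Each argument is the summed log-probability of a full response given the
    prompt. `beta` controls how far the policy may drift from the reference;
    0.1 is an assumed placeholder, not a value reported by the authors.
    """
    # Implicit rewards: how much more likely each response became vs. the reference.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # -log(sigmoid(margin)), computed as a numerically stable softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

The loss equals `log(2)` when the policy and reference agree, and falls below that as the policy assigns relatively more probability to the preferred response.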

Method A

In this method, a teacher model tries to correct the student model’s reasoning errors.

fig1
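One plausible way to turn Method A into DPO training data is sketched below. The README does not specify how preference pairs are built, so the pairing rule (teacher correction as `chosen`, the student's flawed reasoning as `rejected`) is an assumption, and `PreferencePair`, `build_pair_method_a`, and `teacher_correct` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class PreferencePair:
    prompt: str    # the medical question
    chosen: str    # preferred response (assumed: the teacher's corrected reasoning)
    rejected: str  # dispreferred response (assumed: the student's original reasoning)

def build_pair_method_a(
    question: str,
    student_answer: str,
    teacher_correct: Callable[[str, str], Optional[str]],
) -> Optional[PreferencePair]:
    """Build one preference pair, or None if the teacher made no correction.

    `teacher_correct` is a hypothetical callable wrapping the teacher model:
    it returns corrected reasoning, or None when it finds nothing to fix.
    """
    corrected = teacher_correct(question, student_answer)
    if corrected is None or corrected.strip() == student_answer.strip():
        return None  # no reasoning error found, so there is no preference signal
    return PreferencePair(prompt=question, chosen=corrected, rejected=student_answer)
```

Questions the student already answers correctly yield no pair under this sketch, which is one reason such a dataset stays small.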

Method B

In this method, a teacher model critiques the student’s answer and guides it through a conversation to reach the correct answer.

fig2
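The conversational loop of Method B might look roughly like the sketch below. Everything here is assumed for illustration: `student`, `teacher`, and `is_correct` are hypothetical callables, `max_turns=3` is an arbitrary cap, and pairing the final guided answer against the first attempt is one possible choice rather than the authors' documented procedure.

```python
def guide_student_method_b(question, reference_answer, student, teacher,
                           is_correct, max_turns=3):
    """Let a teacher critique the student over several turns (illustrative sketch).

    student(history) -> str and teacher(history, reference_answer) -> str are
    hypothetical model wrappers; history is a list of (role, text) tuples.
    Returns a preference record, or None if no improved answer was reached.
    """
    history = [("user", question)]
    first_answer = student(history)
    answer = first_answer
    history.append(("student", answer))

    for _ in range(max_turns):
        if is_correct(answer, reference_answer):
            break
        # The teacher critiques the latest attempt; the student then retries.
        history.append(("teacher", teacher(history, reference_answer)))
        answer = student(history)
        history.append(("student", answer))

    if answer == first_answer or not is_correct(answer, reference_answer):
        return None  # already correct at the start, or never reached the answer
    return {"prompt": question, "chosen": answer, "rejected": first_answer}
```

Each example can require several model calls from both teacher and student, which is consistent with Method B being the slower of the two approaches.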

📊 Results

| | gaokerena-R + aya-expanse-8b (verifier) | gaokerena-V | aya-expanse-8b |
|---|---|---|---|
| MMLU-anatomy(fa) | 47.40 | 48.14 | 40.74 |
| MMLU-medicalgenetics(fa) | 56.0 | 53.0 | 49.0 |
| MMLU-collegemedicine(fa) | 50.28 | 43.93 | 44.51 |
| MMLU-clinicalknowledge(fa) | 58.86 | 55.47 | 52.07 |
| MMLU-professionalmedicine(fa) | 48.89 | 47.05 | 45.58 |
| MMLU-collegebiology(fa) | 54.86 | 47.22 | 45.14 |
| MMLU(avg) | 52.98 | 49.31 | 46.64 |
| IBMSEE Sept 2023 | 46.42 | 38.69 | 34.52 |
| Prompt | CoT for the main model, straight for the verifier | Straight | Straight |
| Inference time | $\approx 5 \times 35 + 10 + 8\ \text{s}$ | $\approx 10\ \text{s}$ | $\approx 10\ \text{s}$ |

⚠️ Risks and Limitations

While Gaokerena aims to provide relatively accurate information, it is not a substitute for professional medical advice. The model may have limitations in:

  • Handling medical emergencies.
  • Addressing highly specialized or rare medical conditions.
  • Offering region-specific guidance, as the training data does not include localized Persian medical practices.

⛔️ License

CC BY-NC-SA 4.0 (non-commercial use only)

🤝 Collaborators

  1. Mehrdad Ghassabi
  2. Sadra Hakim
  3. Dr. Hamid Reza Baradaran Kashani
  4. Pedram Rostami
  5. Zahra Kazemi
