📃 Paper | 🤗 Hugging Face repository
- 📒 Table of Contents
- 📍 Overview
- 🏃 Training process
- 📊 Results
- ⚠️ Risks and Limitations
- ⛔️ License
- 🤝 Collaborators
We present gaokerena-R, a model trained with a limited-data approach to enhance the Persian medical reasoning capabilities of the aya-expanse-8b model. Despite using less data, gaokerena-R outperforms our previous model, gaokerena-V, which was trained on a much larger dataset. This demonstrates the effectiveness of our reasoning-focused training strategy under data-constrained conditions.
We propose two methods, Method A and Method B, to enhance the reasoning capabilities of the baseline model. In both, a teacher model guides the baseline (student) model, which is then trained with Direct Preference Optimization (DPO). We primarily used Method A because Method B is time-consuming.
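To make the DPO objective concrete, here is a minimal sketch of the standard DPO loss for a single preference pair. It is illustrative only and not the training code used for this model: the function name `dpo_loss`, its parameters, and the default `beta=0.1` are assumptions. Inputs are the summed log-probabilities of the preferred ("chosen") and dispreferred ("rejected") responses under the model being trained and under a frozen reference model.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative sketch).

    All arguments except beta are summed sequence log-probabilities.
    beta=0.1 is an assumed value, not one reported for this model.
    """
    # Implicit rewards: how far the policy has moved from the reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)), computed stably as softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

The loss equals `log(2)` when the policy matches the reference and shrinks as the policy assigns relatively more probability to the chosen response than to the rejected one.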
In Method A, a teacher model corrects the errors in the student model's reasoning.
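One plausible way to turn the teacher-correction step into DPO training data is sketched below. This is an assumption about the data layout, not a description of the actual pipeline: `student_generate` and `teacher_correct` are hypothetical callables standing in for the two models, and treating the corrected reasoning as "chosen" and the student's original reasoning as "rejected" is our reading of the method. The `prompt`/`chosen`/`rejected` keys follow a common convention for preference datasets.

```python
from typing import Callable, Optional

def build_correction_pair(
    question: str,
    student_generate: Callable[[str], str],               # hypothetical student model call
    teacher_correct: Callable[[str, str], Optional[str]],  # hypothetical teacher model call
) -> Optional[dict]:
    """Build one preference pair from a teacher correction (illustrative sketch)."""
    student_reasoning = student_generate(question)
    corrected = teacher_correct(question, student_reasoning)
    # If the teacher finds nothing to fix, there is no preference signal.
    if corrected is None or corrected == student_reasoning:
        return None
    return {
        "prompt": question,
        "chosen": corrected,            # assumed: teacher-corrected reasoning is preferred
        "rejected": student_reasoning,  # assumed: original flawed reasoning is dispreferred
    }
```

Returning `None` when the teacher makes no change keeps uninformative pairs out of the dataset.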
In Method B, a teacher model critiques the student's answer and guides the student through a conversation until it reaches the correct answer.
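The conversational method can be pictured as a short critique loop. The sketch below is a hedged illustration, not the authors' implementation: `student_answer`, `teacher_critique`, and `is_correct` are hypothetical callables, `max_turns=3` is an arbitrary choice, and pairing the final correct answer ("chosen") with the student's first attempt ("rejected") is an assumption about how such a dialogue could feed DPO.

```python
from typing import Callable, List, Optional, Tuple

History = List[Tuple[str, str]]  # (student answer, teacher critique) per turn

def guided_dialogue(
    question: str,
    correct_answer: str,
    student_answer: Callable[[str, History], str],  # hypothetical student model call
    teacher_critique: Callable[[str, str], str],    # hypothetical teacher model call
    is_correct: Callable[[str, str], bool],         # hypothetical answer checker
    max_turns: int = 3,                             # assumed turn limit
) -> Optional[dict]:
    """Run a teacher-guided dialogue and return a preference pair (sketch)."""
    history: History = []
    first = student_answer(question, history)
    answer = first
    for _ in range(max_turns):
        if is_correct(answer, correct_answer):
            break
        critique = teacher_critique(question, answer)
        history.append((answer, critique))
        answer = student_answer(question, history)
    # Keep only dialogues where guidance turned a wrong answer into a right one.
    if not history or not is_correct(answer, correct_answer):
        return None
    return {"prompt": question, "chosen": answer, "rejected": first}
```

Each extra turn requires another call to both models, which is consistent with this method being the more time-consuming of the two.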
| | gaokerena-R + aya-expanse-8b (verifier) | gaokerena-V | aya-expanse-8b |
|---|---|---|---|
| MMLU-anatomy(fa) | 47.40 | 48.14 | 40.74 |
| MMLU-medicalgenetics(fa) | 56.00 | 53.00 | 49.00 |
| MMLU-collegemedicine(fa) | 50.28 | 43.93 | 44.51 |
| MMLU-clinicalknowledge(fa) | 58.86 | 55.47 | 52.07 |
| MMLU-professionalmedicine(fa) | 48.89 | 47.05 | 45.58 |
| MMLU-collegebiology(fa) | 54.86 | 47.22 | 45.14 |
| MMLU(avg) | 52.98 | 49.31 | 46.64 |
| IBMSEE Sept 2023 | 46.42 | 38.69 | 34.52 |
| Prompt | COT for the main model & Straight for the verifier | Straight | Straight |
| Inference_time |
While Gaokerena aims to provide relatively accurate information, it is not a substitute for professional medical advice. The model may have limitations in:
- Handling medical emergencies.
- Addressing highly specialized or rare medical conditions.
- Offering region-specific guidance, as the training data does not include localized Persian medical practices.
CC BY-NC-SA 4.0 (non-commercial use only)
- Mehrdad Ghassabi
- Sadra Hakim
- Dr. Hamid Reza Baradaran Kashani
- Pedram Rostami
- Zahra Kazemi
