Description
Hi team,
Thanks for the great work. I have a question about the SFT training: the model performance after running the official SFT training seems to be lower than the reported numbers, as shown in the table below. Although I'm temporarily using Qwen3-VL-8B-Instruct as the judge, there is still a large gap between the released model and the re-trained model, especially on benchmarks like MathVista, MathVision, MathVerse, and MMMU-Pro. Some of the gaps fall outside the stderr reported by lmms-eval, which is concerning.
I haven't run the RL stage or compared its downstream results yet, but I'd like to know whether this behavior is expected. Looking forward to your guidance. Thanks!
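For reference, the "outside the stderr" check mentioned above can be sketched as follows; the function name, the multiplier, and the numbers are illustrative, not taken from the actual eval runs:

```python
def outside_stderr(reported: float, reproduced: float, stderr: float, k: float = 2.0) -> bool:
    """Return True if the reproduced score differs from the reported
    score by more than k standard errors (a rough significance check)."""
    return abs(reported - reproduced) > k * stderr

# Hypothetical scores for illustration only:
print(outside_stderr(68.0, 63.5, 1.2))  # gap of 4.5 exceeds 2 * 1.2
```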