【Evaluation】Best practice for evaluating Qwen3 with EvalScope !! #1305
wangxingjun778 started this conversation in Show and tell
【Evaluation】Best practice for evaluating Qwen3 !!
For more details, please refer to: https://evalscope.readthedocs.io/en/latest/best_practice/qwen3.html
Powered by: EvalScope https://github.com/modelscope/evalscope
Speed Benchmark


Benchmark collection (for evaluating abilities such as code, understanding, instruction following, math, ...)
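
The results reported for this collection are computed on a sample of each benchmark rather than on the full set. As a rough illustration of what that means, here is a minimal sketch that is not EvalScope's own code: `evaluate_subset`, `toy_benchmark`, and `toy_model` are hypothetical names, and the sketch assumes that a limit simply keeps the first N samples.

```python
from typing import Callable, Optional, Sequence, Tuple

Sample = Tuple[str, str]  # (question, expected answer)


def evaluate_subset(
    samples: Sequence[Sample],
    predict: Callable[[str], str],
    limit: Optional[int] = None,
) -> float:
    """Return accuracy of `predict` on at most `limit` samples.

    Assumption: a limit keeps the first `limit` samples; None means all.
    """
    subset = samples if limit is None else samples[:limit]
    if not subset:
        return 0.0
    correct = sum(1 for question, answer in subset if predict(question) == answer)
    return correct / len(subset)


# Toy benchmark standing in for a real dataset.
toy_benchmark = [("1+1", "2"), ("2+2", "4"), ("3+3", "6"), ("4+4", "8")]


def toy_model(question: str) -> str:
    # Hypothetical stand-in for a model call; it fails on the last question.
    return {"1+1": "2", "2+2": "4", "3+3": "6"}.get(question, "?")


full_score = evaluate_subset(toy_benchmark, toy_model)
sampled_score = evaluate_subset(toy_benchmark, toy_model, limit=2)
print(f"full: {full_score:.2f}, sampled: {sampled_score:.2f}")
```

In this toy case the sampled score differs from the full score, which is why sampled results are best read as estimates and not as official benchmark numbers.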

NOTE: The results are based on samples of the original benchmarks, selected with the eval argument --limit.

Thinking efficiency of Qwen3


Run Gradio visualization
Get started and have fun! :)