Could you provide evaluation code?

Could you please provide a complete evaluation script that shows how the model is evaluated? The inference_chat.py you provided seems to support inference on only a single sample and doesn't handle large-scale batch evaluation.

Also, I'm not sure whether the problem is in my environment or in your code, but after downloading the latest model from Hugging Face, only the first inference produces a valid result. All subsequent results are just "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!", which is quite strange. I would appreciate it if you could check the code thoroughly or let me know what might be causing this.
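
To make the request concrete, here is a rough sketch of the kind of batch loop I have in mind. It is only an illustration under assumptions: `run_inference` is a hypothetical stand-in for whatever inference_chat.py does for one sample (it is not a function from the repo), the `prompt`/`answer` keys and the exact-match scoring are placeholders, and `is_degenerate` is just a helper I made up to count outputs like the "!!!!" strings.

```python
from typing import Callable, Iterable


def is_degenerate(text: str, min_len: int = 8) -> bool:
    """Return True for a long run of one repeated character, e.g. '!!!!!!!!'."""
    stripped = text.strip()
    return len(stripped) >= min_len and len(set(stripped)) == 1


def evaluate(samples: Iterable[dict], run_inference: Callable[[str], str]) -> dict:
    """Run every sample through the model and tally simple statistics.

    `run_inference` is a hypothetical per-sample wrapper (assumption);
    exact string match is a placeholder metric, not the authors' method.
    """
    total = correct = degenerate = 0
    for sample in samples:
        prediction = run_inference(sample["prompt"])
        total += 1
        if is_degenerate(prediction):
            # Count broken outputs separately so they are visible in the report.
            degenerate += 1
            continue
        if prediction.strip() == sample["answer"].strip():
            correct += 1
    accuracy = correct / total if total else 0.0
    return {
        "total": total,
        "correct": correct,
        "degenerate": degenerate,
        "accuracy": accuracy,
    }
```

With a loop like this, the behavior I described would show up as `degenerate` being equal to `total - 1`, which would make it easier to tell whether the problem is on my side.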

Could you provide evaluation code? #9
