Conversation

@qimcis
Collaborator

@qimcis qimcis commented Jan 8, 2026

Description

Added the CS537 Fall 2021 final exam. The original exam can be found here.

Testing

Ran benchmarks/courseexam_bench/tests/test_data_schema.py

Checklist

  • Tests pass locally
  • Code follows project style guidelines
  • Documentation updated (if needed)

@tareknaser tareknaser self-requested a review January 8, 2026 16:55
Collaborator

@tareknaser tareknaser left a comment

Thanks for putting this together! I left a few comments, but it's in good shape overall.

Comment on lines 11 to 14
"score_max": 60.0,
"score_avg": 51.86,
"score_median": 52.5,
"score_standard_deviation": 9.69,
Collaborator

I may have missed this, but I'm not sure where these numbers came from. I couldn't find them in the PDF.

Collaborator Author

I originally got them from this website, which collects data from UW-Madison's Registrar Office. Looking at it again, however, these are the students' final course grades for that year, not the grades on that year's final exam.

I don't think it will be possible to find these values for any of the UWM exams. Would it be okay to exclude them for the time being, or should I look for a more suitable exam that has this information available?

Screenshot from 2026-01-08 20-01-28

Collaborator

Yes, Chi. If you cannot find them, you can leave them out. @tareknaser, should we leave the fields blank or use some other default value?

Collaborator

We can omit the fields completely. See benchmarks/courseexam_bench/data/cs107_computer_organization_&_systems_fall_2023_final/exam.md for a similar case.
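As a rough illustration of leaving the fields out, here is a minimal Python sketch that strips the four score statistics from a metadata dictionary. The field names come from the snippet under review; the helper name `drop_unverified_score_stats` and the example record are hypothetical and are not part of the benchmark's code.

```python
# Field names are copied from the snippet under review.
# The helper and the example record below are illustrative assumptions only.
SCORE_STAT_FIELDS = (
    "score_max",
    "score_avg",
    "score_median",
    "score_standard_deviation",
)


def drop_unverified_score_stats(metadata: dict) -> dict:
    """Return a copy of the metadata without the score-statistics fields."""
    return {key: value for key, value in metadata.items()
            if key not in SCORE_STAT_FIELDS}


example = {
    "tags": ["operating-systems"],
    "score_max": 60.0,
    "score_avg": 51.86,
    "score_median": 52.5,
    "score_standard_deviation": 9.69,
}
cleaned = drop_unverified_score_stats(example)
```

Returning a copy rather than mutating the input keeps the original record available for comparison during review.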

"tags": [
"operating-systems"
],
"answer": "B"
Collaborator

I'm curious where you got the ground truth for the answers here. The exam PDF doesn't seem to include the answers.

Collaborator Author

The ground-truth solutions are provided by the professor here. Solutions for other past CS537 exams can be found here.

Collaborator

That's great. @qimcis, can you please also upload the PDF with the solutions? Then we can double-check the exam.md together. Thanks a lot!

Comment on lines 115 to 120
Given the disk to the right (which rotates counter-clockwise), and assuming FIFO disk scheduling, what request would be serviced last, assuming the requests are: 13, 4, 35, 34, 5
A) 13
B) 4
C) 35
D) 34
E) 5
Collaborator

Questions 5 to 8 seem to rely on the figure on page 3, which language models cannot access. Could you take a look at the guidelines here?

Collaborator Author

I've now added a detailed textual description of the figure for questions 5-8, following the guidelines. Let me know if this works!

Collaborator

Thanks for that. I still think it's a bit confusing, mostly because the figure is hard to describe.

For these questions, let's just leave them out. Please make sure to update the exam metadata so the total score and question count reflect the removal.
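The bookkeeping for that removal could look something like the following Python sketch. It assumes each question is a dictionary with a `problem_id` and a `points` value, and that the metadata totals are called `num_questions` and `score_total`. All four names are hypothetical stand-ins for whatever the benchmark schema actually uses.

```python
# Hypothetical record layout: "problem_id", "points", "num_questions" and
# "score_total" are assumed names, not taken from the benchmark schema.
def drop_questions(questions: list[dict], ids_to_drop: set[int]) -> tuple[list[dict], dict]:
    """Remove the given questions and recompute the exam-level totals."""
    kept = [q for q in questions if q["problem_id"] not in ids_to_drop]
    totals = {
        "num_questions": len(kept),
        "score_total": sum(q["points"] for q in kept),
    }
    return kept, totals


# Toy exam: ten questions worth one point each (illustrative values only).
exam = [{"problem_id": i, "points": 1.0} for i in range(1, 11)]
kept, totals = drop_questions(exam, {5, 6, 7, 8})
```

Deriving the totals from the kept questions, rather than subtracting by hand, avoids the metadata drifting out of sync with the question list.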

Comment on lines 299 to 304
Thus, we can conclude that the maximum bandwidth obtained during sequential writing to a 2-way mirrored array is ______ [Assume here there are N disks, and that a single disk delivers S MB/s of disk bandwidth]
A) S MB/s
B) 2 x S MB/s
C) N x S MB/s
D) N x S / 2 MB/s
E) N x S x S MB/s
Collaborator

Questions like 13 seem to be continuations of previous questions. In such cases, we should either include the full context in each question or treat them as a single question.

Collaborator Author

I've opted to treat them as a single question.
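One way to picture treating a chain of dependent parts as a single question is the sketch below. It assumes each part is a dictionary with `problem`, `answer`, and `points` keys; only `answer` appears in the reviewed JSON, the other keys and the comma-joined answer format are assumptions made for illustration.

```python
# Assumed part layout: {"problem": str, "answer": str, "points": float}.
# Only "answer" is visible in the reviewed JSON; the rest is hypothetical.
def merge_parts(parts: list[dict]) -> dict:
    """Combine consecutive, dependent parts into one self-contained question."""
    return {
        # Keep the parts in order so later prompts can refer to earlier ones.
        "problem": "\n\n".join(part["problem"] for part in parts),
        # One answer per part, in the same order as the prompts.
        "answer": ", ".join(part["answer"] for part in parts),
        "points": sum(part["points"] for part in parts),
    }


# Placeholder prompts and answers; these are not taken from the exam.
parts = [
    {"problem": "Part 1: <setup and first blank>", "answer": "A", "points": 1.0},
    {"problem": "Part 2: Thus, we can conclude <second blank>", "answer": "D", "points": 1.0},
]
merged = merge_parts(parts)
```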

@xuafeng xuafeng requested a review from paizhangliu January 8, 2026 18:18
@qimcis qimcis requested a review from tareknaser January 9, 2026 01:51
@tareknaser
Collaborator

If #61 gets merged first, we’ll need some small updates to ExactMatch questions, mainly adding the options in the JSON for the LLM to choose from.

@qimcis
Collaborator Author

qimcis commented Jan 10, 2026

> If #61 gets merged first, we’ll need some small updates to ExactMatch questions, mainly adding the options in the JSON for the LLM to choose from

added! should be good to go once #61 is merged
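For reference, pulling the options out of a question's text could be sketched as follows. The exact format that #61 expects is not shown in this thread, so the label-to-text mapping returned here is an assumption, and `extract_options` is a hypothetical helper. The `A) ...` line pattern matches how options are written in the questions quoted above.

```python
import re

# Matches option lines such as "A) 13" as they appear in the quoted questions.
OPTION_LINE = re.compile(r"^([A-Z])\)\s*(.+)$")


def extract_options(problem_text: str) -> dict[str, str]:
    """Map each option label to its text, e.g. {"A": "13", ...}.

    The output shape is an assumption; adapt it to whatever #61 requires.
    """
    options = {}
    for line in problem_text.splitlines():
        match = OPTION_LINE.match(line.strip())
        if match:
            options[match.group(1)] = match.group(2).strip()
    return options


question_text = """Given the disk to the right (which rotates counter-clockwise), and assuming FIFO disk scheduling, what request would be serviced last, assuming the requests are: 13, 4, 35, 34, 5
A) 13
B) 4
C) 35
D) 34
E) 5"""
options = extract_options(question_text)
```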

4 participants