-
Notifications
You must be signed in to change notification settings - Fork 200
1545: Make max weight matching deterministic for graphs with multiple correct solutions #1546
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
1545: Make max weight matching deterministic for graphs with multiple correct solutions #1546
Conversation
|
|
IvanIsCoding
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I think determinism is welcome. But I don't think BTreeMap is the solution I want merged.
Another exercise before merging is revisiting other points of the algorithm to check if they are deterministic. Matthew wrote this ages ago. At the time we hadn't gotten the feedback from users about determinism. From time to time we still get that feedback (#1501).
|
Thanks for the feedback and suggestions @IvanIsCoding ! Nothing else stands out to me as a source of additional nondeterminism. Just to be sure, I've been running the script I used to come up with the nondeterministic example graphs I added in the tests, and ever since switching to IndexMap it has run through 15 million random graphs so far without finding any further examples. |
Pull Request Test Coverage Report for Build 21552760185Details
💛 - Coveralls |
Addresses #1545.
max_weight_matchingwas nondeterministic for some graphs with multiple valid solutions. See the tests I added, or the linked issue, for some examples of this.I evaluated two choices:
I did some benchmarking on various problem sizes and found that sorting the values of a HashMap caused a roughly 19% slowdown overall, especially for smaller problems. BTreeMap performance though looks pretty much identical modulo some noise:
I think determinism is a really nice property to have for a lot of different applications, and if we can achieve it without a performance penalty, it's worth it.
AI disclaimer:
I had claude come up with a script to find small examples of nondeterministic graphs; the examples in the tests come from there. (The example that I originally detected nondeterminism with had 60 nodes and thousands of edges and would have made a poor test case!)
I also had claude write the benchmarking script I used. I'd be happy to share it if there's interest. Nothing fancy; it just came up with some random graphs with different numbers of nodes and edges, and used matplotlib to generate a visualization.