Ensure Repeatable Results by Preserving Insert Order in Unique Array Generation Methods #4586
Jason5Lee wants to merge 1 commit into dotnet:main
adamsitnik left a comment
It's true that HashSet does not provide any guarantees about ordering. On the other hand, I would expect the current implementation to always produce the same order, and if we take this change, that order can change. If this method is used by benchmarks for which input order matters (sorting, for example), the change could affect the time those benchmarks report. We try to avoid that, as our automation could flag it as a regression or an improvement. More: https://github.com/dotnet/performance/blob/main/docs/microbenchmark-design-guidelines.md#benchmarks-are-immutable
Having said that, I am hesitant to take this change, as it could affect existing benchmarks.
If the current owners of the repo are fine with it, I would recommend ensuring that the affected methods are not used by benchmarks where execution time is dependent on the order of inputs.
Thank you for your contribution @Jason5Lee !
Context:
The methods `ArrayOfUniqueValues` and `ArrayOfUniqueStrings` aim to generate repeatable sequences of unique values or strings. The repeatability is achieved using a fixed `Seed` (as noted in the comment on line 16). However, the current implementation relies on the traversal order of `HashSet`, which does not guarantee a consistent order.

Issue:
The assumption that a fixed insertion sequence into a `HashSet` ensures a repeatable traversal order is flawed:
- `HashSet` does not provide any guarantees about order semantics in its documentation.
- […] .NET, may introduce variations in `Set` behavior where traversal order is inconsistent even with a fixed insertion sequence.

This discrepancy could lead to non-repeatable results, breaking the method's intended behavior.
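
For illustration, the pattern described under "Issue" can be sketched roughly as follows. This is a minimal sketch, not the repository's code: the class name, the method name `UniqueIntsViaCopyTo`, and the seed value are hypothetical, and the real methods may differ in signature and element type.

```csharp
using System;
using System.Collections.Generic;

public static class CopyToPatternSketch
{
    // Hypothetical seed, chosen only for this illustration.
    private const int Seed = 12345;

    // Hypothetical stand-in for a generator that fills a set first
    // and copies it into the result array afterwards.
    public static int[] UniqueIntsViaCopyTo(int count)
    {
        var random = new Random(Seed);
        var set = new HashSet<int>();

        // Keep drawing until the set holds `count` distinct values.
        while (set.Count < count)
        {
            set.Add(random.Next());
        }

        // CopyTo writes the elements in whatever order the set yields them.
        // The HashSet<T> documentation does not specify that order, so the
        // array layout depends on an implementation detail.
        var result = new int[count];
        set.CopyTo(result);
        return result;
    }
}
```

The values are fully determined by the seed, but their positions in the array are decided by the set rather than by the order in which they were drawn.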
Solution:
- Instead of using `HashSet.CopyTo` for transferring elements to the result array, we now directly assign values to the array in the order they are added to the `HashSet`. This ensures the output matches the insertion order and remains repeatable with a fixed seed.
- […] `ArrayOfUniqueValues` and `ArrayOfUniqueStrings` methods to reflect this approach.

Changes:
- `ArrayOfUniqueValues`: […] `HashSet.CopyTo`.
- `ArrayOfUniqueStrings`: […] `HashSet`.

Benefits:
- […] `HashSet` traversal.
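
The approach described under "Solution" could look roughly like the sketch below, where the set is used only for duplicate detection and each accepted value is written straight into the array. The class name, the method name `ArrayOfUniqueInts`, and the seed value are assumptions made for this example; the PR's actual diff is not reproduced here.

```csharp
using System;
using System.Collections.Generic;

public static class UniqueArraySketch
{
    // Hypothetical seed, chosen only for this illustration.
    private const int Seed = 12345;

    // Hypothetical stand-in for an insertion-order-preserving generator.
    public static int[] ArrayOfUniqueInts(int count)
    {
        var random = new Random(Seed);
        var seen = new HashSet<int>();
        var result = new int[count];
        int filled = 0;

        while (filled < count)
        {
            int candidate = random.Next();

            // HashSet<T>.Add returns false for duplicates, so only values
            // seen for the first time are written to the array.
            if (seen.Add(candidate))
            {
                result[filled++] = candidate;
            }
        }

        // The array order is the order in which values were drawn;
        // the set's traversal order is never consulted.
        return result;
    }
}
```

With this shape, two calls with the same seed return the same sequence as long as `Random` produces the same values for that seed. As the review comment points out, the resulting order may still differ from what the `CopyTo`-based version produced, which matters for benchmarks whose timing depends on input order.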