[NMX] Reference data comparison test.#546
[NMX] Reference data comparison test.#546YooSunYoung wants to merge 2 commits intonmx-output-t-unitfrom
Conversation
| @@ -0,0 +1,107 @@ | |||
| # SPDX-License-Identifier: BSD-3-Clause | |||
| # Copyright (c) 2026 Scipp contributors (https://github.com/scipp) | |||
| """Compare the workflow output to be consistent with the reference output file. | |||
There was a problem hiding this comment.
I'm wondering if we could simplify this by saving the reference solution in the Scipp hdf5 format.
That way, loading and cmoparing to the reference would just be a
ref = sc.io.load_hdf5('reference.h5')
assert sc.identical(da, ref)However, this would not be testing that the data in the NX lauetof format has not changed.
So if something gets updated in the reduced file writer that breaks things, it would not be caught by such an implementation...
Can we somehow think of a way to combine both the simplicity and full coverage?
There was a problem hiding this comment.
I can't really think of simpler solution than this...
and I think this test should stay naive and simple and hard-coded as possible so that anyone can understand what should not change.
| # We use np.allclose instead of np.all(left==right) | ||
| # because we run ray-simulation on the fly | ||
| # and the time-dependent values might slightly change. | ||
| assert np.allclose(values_left, values_right), _cur_path |
There was a problem hiding this comment.
How are you going to deal with the uncertainties we get from the lookup tables that are used to compute the wavelength? Just make sure the same seeds are used everywhere?
If we change the LUT, the test will fail, but maybe that is a good thing? Because we want to know that the results have changed.
In the future, we are thinking of computing the LUT on the fly instead of saving it to disk. So it could potentially change slightly every time you run it, but we can ensure the results stay the same using a seed parameter.
There was a problem hiding this comment.
How are you going to deal with the uncertainties we get from the lookup tables that are used to compute the wavelength? Just make sure the same seeds are used everywhere?
I really don't know...
If we change the LUT, the test will fail, but maybe that is a good thing? Because we want to know that the results have changed.
Yeah I think so. Because ideally it shouldn't change too much that affects the reduction result.
In the future, we are thinking of computing the LUT on the fly instead of saving it to disk. So it could potentially change slightly every time you run it, but we can ensure the results stay the same using a seed parameter.
NMX is already computing the LUT on the fly by default and the seed is also fixed (configurable with default value). Actually.... why does it have slightly different results...? hmmmm
There was a problem hiding this comment.
If we use a seed the results should not change.
There was a problem hiding this comment.
Just an additional comment to say that if we go ahead with computing the lut with chopper_cascade, then there should no longer be randomness in the tables. They should always give the same results.
| excluded_paths = ( | ||
| entry_path / 'reducer/program', # version should be different | ||
| ) | ||
| ref_file_path = get_small_nmx_reduced() |
There was a problem hiding this comment.
I would really like to find a good way to make updating reference solutions easy.
This has been one of the main reasons why I have not been motivated to introduce a whole load of regression tests, because of past experience where we just spent so much time constantly updating reference solutions.
Something like a flag when running pytest that, instead of failing the test, it would make a new file and suggest a patch to the data registry to make updating less cumbersome.
But this was mostly just to get a discussion going. It is beyond the scope of this PR.
|
I'll merge it together with #545 |
I tried finding python API similar to h5diff but couldn't find one...
And we will need some custom comparison tool anyway so I wrote a naive one for now.
It depends on the change in #545 .