Fix problems with higher order gradients in probability test framework #2042
Conversation
stan/math/prim/prob/frechet_lpdf.hpp
```cpp
T_partials_return, T_y>
    inv_y(size(y));
for (size_t i = 0; i < stan::math::size(y); i++) {
  inv_y[i] = 1.0 / value_of(y_vec[i]);
```
The problem here was this second value_of (inv_y has already had one value_of).

This is probably unnecessary once my frechet PR is in.

Yeah, whichever goes first is fine with me.

Now it's in and you have a merge conflict.
```diff
 for (int i = 0; i <= n; i++) {
-  cdf *= binomial_coefficient<double>(N, i)
+  cdf += binomial_coefficient<double>(N, i)
```
This code is the same as the log cdf code without the log: https://github.com/stan-dev/math/blob/develop/test/prob/binomial/binomial_cdf_log_test.hpp
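To illustrate why the accumulation operator matters, here is a self-contained sketch of a binomial CDF computed the way the corrected test does it. The helper `binom` is made up for this example (it stands in for `stan::math::binomial_coefficient`, using `std::lgamma` so it needs nothing beyond the standard library):

```cpp
#include <cmath>

// Hypothetical helper standing in for stan::math::binomial_coefficient:
// computes C(N, i) via lgamma to avoid overflow for large N.
double binom(int N, int i) {
  return std::exp(std::lgamma(N + 1.0) - std::lgamma(i + 1.0)
                  - std::lgamma(N - i + 1.0));
}

// The CDF is a sum of probability mass terms, so each term must be
// accumulated with += (the old *= turned the sum into a product).
double binomial_cdf(int n, int N, double theta) {
  double cdf = 0.0;
  for (int i = 0; i <= n; i++) {
    cdf += binom(N, i) * std::pow(theta, i) * std::pow(1.0 - theta, N - i);
  }
  return cdf;
}
```

With `+=`, summing all terms up to n = N gives 1, which is an easy sanity check the `*=` version fails.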
```diff
-param[0] = 16.0;  // y
 param[1] = 3.0;   // alpha
 param[2] = 3.0;   // beta
+param[0] = 9.0;   // y
```
I made the test case easier because the test implementations were struggling numerically. See: #2006
The Stan implementations of the ccdf didn't seem to have this problem.
```diff
 using stan::math::log1m;

-return log(gamma_q(alpha, beta * y));
+return log1m(gamma_p(alpha, beta * y));
```
Doing log1m(cdf) worked better than log(ccdf) for forward mode. Check here for the lcdf: https://github.com/stan-dev/math/blob/develop/test/prob/gamma/gamma_cdf_log_test.hpp#L84
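A minimal sketch of the `log1m` building block, assuming only that `stan::math::log1m(x)` computes log(1 - x) (the comment's forward-mode benefit is about derivative propagation, but the precision angle is easy to show with plain doubles via `std::log1p`):

```cpp
#include <cmath>

// log1m(x) = log(1 - x), computed via log1p so a tiny x is not
// rounded away in the subtraction 1 - x.
double log1m(double x) { return std::log1p(-x); }

// Naive version for comparison: for x below machine epsilon,
// 1.0 - x rounds to exactly 1.0 and the result collapses to 0.
double log1m_naive(double x) { return std::log(1.0 - x); }
```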
```diff
 const T_y& y, const T_loc& mu, const T_scale& beta, const T3&, const T4&,
 const T5&) {
-  return -log(beta) - (y - mu) / beta + exp((mu - y) / beta);
+  return -log(beta) - (y - mu) / beta - exp((mu - y) / beta);
```
Gumbel pdf here: https://en.wikipedia.org/wiki/Gumbel_distribution
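For reference, the sign fix matches taking the log of the Gumbel density with $z = (y - \mu)/\beta$:

```latex
f(y \mid \mu, \beta) = \frac{1}{\beta} \exp\!\left(-\left(z + e^{-z}\right)\right)
\quad\Rightarrow\quad
\log f(y \mid \mu, \beta) = -\log \beta - \frac{y - \mu}{\beta} - e^{-(y-\mu)/\beta}
```

Since $e^{-(y-\mu)/\beta} = \exp((\mu - y)/\beta)$, the exponential term enters with a minus sign, which is exactly what the changed line does.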
```diff
 using stan::math::gamma_p;

-return log(1.0 - gamma_p(0.5 * nu, 0.5 / y));
+return log(gamma_p(0.5 * nu, 0.5 / y));
```
Here's the math version: https://github.com/stan-dev/math/blob/develop/stan/math/prim/prob/inv_chi_square_lccdf.hpp#L92
(this is checked with finite differences so I assume it's the correct one)
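The sign of the change also follows from the inverse chi-square CDF. Writing $P$ and $Q$ for the lower and upper regularized incomplete gamma functions (i.e. `gamma_p` and `gamma_q`), and using that $X \sim \text{inv-}\chi^2(\nu)$ means $1/X \sim \chi^2(\nu)$:

```latex
F(y \mid \nu) = \Pr(X \le y) = \Pr\!\left(\tfrac{1}{X} \ge \tfrac{1}{y}\right)
             = Q\!\left(\tfrac{\nu}{2}, \tfrac{1}{2y}\right)
\quad\Rightarrow\quad
1 - F(y \mid \nu) = P\!\left(\tfrac{\nu}{2}, \tfrac{1}{2y}\right)
```

so the lccdf is $\log P(\nu/2,\, 1/(2y))$, i.e. `log(gamma_p(0.5 * nu, 0.5 / y))`, matching the fixed line.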
```cpp
param[0] = -2;  // y
param[1] = 0;   // mu
param[2] = 1;   // sigma
```
I changed this test because we test the gradients with absolute error. The gradients here were around 17.1234, so an error in the 4th digit past the decimal would have forced me to loosen the tolerance to 1e-3 even though the function has more precision than that.
This error happened with fvar<var> variables. Ideally we should be using relative tolerances where they make sense, plus autodiff-mode-aware tolerances, but I'm skipping that to keep this simple.
Anyway, I changed the parameters here and the function works better again.
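A sketch of what the relative comparison mentioned above could look like. `near_rel` is a hypothetical helper, not part of the test framework; it uses a relative tolerance at large magnitudes with an absolute floor so values near zero don't make the relative check blow up:

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical check: compare against a relative tolerance scaled by the
// larger magnitude, falling back to an absolute floor near zero.
bool near_rel(double expected, double actual, double rtol = 1e-8,
              double atol = 1e-12) {
  double scale = std::max(std::abs(expected), std::abs(actual));
  return std::abs(expected - actual) <= std::max(atol, rtol * scale);
}
```

Under a scheme like this, a gradient of 17.1234 with an error of 1e-4 would fail at rtol = 1e-8 but could be accepted with a per-mode rtol, without touching tests whose gradients are tiny.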
@serban-nicusor-toptal had an out of memory disk space error here on the distribution tests: https://jenkins.mc-stan.org/blue/organizations/jenkins/Math%20Pipeline/detail/PR-2042/15/pipeline/40/ Does that mean anything specific or does it just look like a random failure?
Jenkins Console Log, machine information: Mac OS X 10.11.6 (BuildVersion 15G22010)
@bbbales2 Thanks a lot for letting me know, looks like some machines had a different physical disk name under …
@serban-nicusor-toptal I'm glad to Edit: I'm glad to hear* -- my mom is an English teacher, she wouldn't like this :(
@t4c1 This is ready to review. It's the bugfix part of #1989. I'm also putting the allocation stuff from #2041 here since that separate branch wasn't passing tests (I don't know why). There are a bunch of small, weird changes. I tried to add explanations inline. Ask if it's not clear why something is okay or what happened.
t4c1 left a comment:
I don't see any obvious bugs. However, I am hesitant to say this is OK, as I don't know the code. I am also not a mathematician, so there is some confusion regarding the math changes.
Maybe it would be better if someone with more knowledge about prob testing and math reviewed this? Or I can persist, but there will need to be quite some explaining.
stan/math/prim/prob/frechet_lpdf.hpp
```cpp
T_partials_return, T_y>
    inv_y(size(y));
for (size_t i = 0; i < stan::math::size(y); i++) {
  inv_y[i] = 1.0 / value_of(y_vec[i]);
```
This is probably unnecessary once my frechet PR is in.
```diff
 for (size_t n = 0; n < parameters.size(); n++)
   if (p < parameters[0].size())
-    param(n) = parameters[n][p];
+    param(n) = get_param<stan::scalar_type_t<T>>(parameters[n], p);
```
Why/where do you need this change?
This is important for forward mode autodiff when we initialize a vector vs. when we initialize scalars.
If the test scalar type is fvar<var>, then the scalars get initialized with get_param. This looks like:

```cpp
param = params[n];
param.d_ = 1.0;
```

Previously, when initializing vectors, the initialization looked like:

```cpp
param(n) = parameters[n][p];
```

I think that leaves the .d_ initialized to zero, which means forward mode autodiff tests fail. Anyway, now we use the same code to initialize scalars as to initialize each element of a vector, so we don't have these problems.
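The bug is easy to reproduce with a toy forward-mode type. `fvar` here is a two-field stand-in for `stan::math::fvar` (value plus tangent), and `get_param`/`copy_value_only` are hypothetical functions mimicking the two initialization paths described above:

```cpp
// Toy stand-in for a forward-mode autodiff scalar: value plus tangent.
struct fvar {
  double val_ = 0.0;
  double d_ = 0.0;
};

// Mimics the scalar initialization path: set the value AND seed the
// tangent so forward mode has a direction to differentiate in.
fvar get_param(double value) {
  fvar p;
  p.val_ = value;
  p.d_ = 1.0;
  return p;
}

// Mimics the old vector path: only the value is copied, d_ stays 0,
// so every forward-mode derivative silently comes out zero.
fvar copy_value_only(double value) {
  fvar p;
  p.val_ = value;
  return p;
}
```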
Hmm, do you know why the forward tests were passing before this PR, then? Actually this PR seems to fix quite a number of serious bugs. Any idea why nothing failed before?
I'm not sure, and given the seriousness of the bugs I kinda worry there's more.
Well, I know there are more problems lol. There will be a follow-up pull request to this that tests that cdf([a, b, c, d]) evaluates to the same thing as cdf(a) * cdf(b) * cdf(c) * cdf(d). Lack of those tests caused a bug previously. I also know in the other branch where I put this all together there were problems with discrete_range which didn't show up here. Not sure what the deal is. Maybe adding that other test will uncover them again so I can put the fixes back in?
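The identity the follow-up tests would check, sketched here with a toy exponential CDF standing in for a real Stan cdf (the overloads are made up for illustration):

```cpp
#include <cmath>
#include <vector>

// Toy scalar CDF (exponential distribution), standing in for a Stan cdf.
double exp_cdf(double y, double lambda) {
  return 1.0 - std::exp(-lambda * y);
}

// Vectorized cdf: the convention is that cdf([a, b, c]) equals the
// product of the elementwise cdfs, which is the identity to assert.
double exp_cdf(const std::vector<double>& y, double lambda) {
  double p = 1.0;
  for (double yi : y) {
    p *= exp_cdf(yi, lambda);
  }
  return p;
}
```

A missing or wrong product accumulation in a vectorized cdf would slip past tests that only exercise scalar arguments, which is why comparing the two forms directly catches it.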
Fixes #2036, #2037, and #2038
Release notes
Fix problems with higher order gradients in probability test framework
Checklist
Math issues: Probability test framework doesn't clean up memory properly #2036, Probability test framework not computing some gradients correctly #2037, High order functions not being tested in test_repeat_as_vector #2038
Copyright holder: Columbia University
The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
- the basic tests are passing
  - unit tests pass (to run, use: ./runTests.py test/unit)
  - header checks pass (make test-headers)
  - dependency checks pass (make test-math-dependencies)
  - docs build (make doxygen)
  - code passes the built-in C++ parser (make cpplint)
- the code is written in idiomatic C++ and changes are documented in the doxygen
- the new changes are tested