-
Notifications
You must be signed in to change notification settings - Fork 260
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support approximately-equal floating point assertions using ULPs #690
Comments
I was confident that I would find that our approximate equality was introduced by @PeteGillin , but it appears not :) (internally: CL 82886021, way back in 2014, so no API Review doc that I can see) Pete did extend approximate equality to Fuzzy Truth, so he may have thoughts. I recall reading the Random ASCII post linked to by GoogleTest during at least some discussions of approximate equality. I think that at least somewhat discouraged me from exploring ULPs, given the "Know what you’re doing" section's statement that "There is no silver bullet," which mentions specific weaknesses of ULPs. That said, we could dig into how GoogleTest's approach has worked out in practice. I have always felt bad that most users are probably just typing "0.000001" with as many zeros as they feel like typing, without actually having much idea of what the "right" value is. And see also https://errorprone.info/bugpattern/FloatingPointAssertionWithinEpsilon.... |
Yeah, I added the support for floating point arrays ( Like you, I can only speculate about how that decision was made in the first place. I would, however, agree with the assertion that there is no silver bullet. That Random ASCII post looks like it goes into loads of detail and seems pretty good. The short version is that the kinds of errors you'd consider acceptable depend on the calculations behind the number and no assumption will always be right. I do think that it's possible that research into how these things are actually used could result in something that would be right often enough to be useful, but I think we'd want to do that research rather than guess. For what it's worth, we do have some rules of thumb in our docs. Again, if anyone can produce better advice with evidence that it's better, I'm happy to LGTM that. |
This may be a weak, "easy-way-out" position to take, but I'd be inclined to simply mirror what googletest is doing, barring some evidence that they regret adding this assertion. It seems better to offer an out-of-the-box usually reasonable approach than force everyone to come up with a magic constant for each usage that doesn't have any semantic meaning to their tests. |
I am sympathetic. This is just one of those things that is not on fire, just a slow bleed of inconvenience :\ One thing that might be fun is to edit Caveats:
It might also be interesting to compare the results against overriding the tolerance to always be zero: I imagine that that "works" fairly often, and we might get more interesting data by comparing the results of that with the results of the ULP change. |
(Vaguely related, even less of a priority: Another thing we could do is, when an approximate comparison fails but is "close," is to output a tolerance that would have let it pass.) |
That might be fun, indeed. (I would suggest doing the experiment with You are correct that other platforms could see different behaviour. With f.p., it's even theoretically possible that behaviour could change on the same exact machine according to load, although I don't know whether that's a big deal in practice. But it'd still be instructive, we'd just have to take the results with that caveat. One other thing worth bearing in mind is that we might want to change |
I ran a test that overrides all The next question is what happens if we change it to check "within 4 ULPs" instead of "within a tolerance of 0." I'm not sure if that's as trivial to implement. To follow googletest, maybe we'd use |
First of all, I haven't looked at the googletest implementation, and I'm not even sure exactly what "within four ULPs of one another" means. If x and y are not close to a power of two then ULP(x) = ULP(y) and the semantic is obvious. In that case, I think you can do If x and y are close enough to a power of two then ULP(x) != ULP(y) and "four ULPs" is ambiguous. If I had to guess, I'd guess that it uses the larger of the two ULPs, maybe? That would seem intuitively more reasonable. But I'd have to check the code to be sure. Anyway, this is not the same as calling I'm being vague about how close to a power of two you have to be there for two reasons. One is that there's a couple of different definitions of ULP floating around, and they differ close to powers of two (per wikipedia). (I guess that maybe IEEE 754 must give a definition since it specifies required accuracies in terms of them, but I don't know.) The other reason is that I'd have to think more than I have done :-). TL;DR is that, if it's important to be consistent with googletest then I think we need to port the googletest code into Java rather than trying to infer the semantic from documentation. |
P.S. There are some things in DoubleUtils that might help with the impl. |
(Actually, I suppose it's possible that the meaning of "within four ULPs of one another" when ULP(x) != ULP(y) is exactly the mix of the two ULPs that corresponds to calling |
Google's C++ test framework supports approximate comparison of floating point numbers without requiring the user specify a precision (though that's available too). The
ASSERT_DOUBLE_EQ
/ASSERT_FLOAT_EQ
macros assert that two numbers are within four ULPs of one another, which (the documentations claims) should work well enough for most use-cases.I'm sure this has been discussed before, so apologies if I'm dredging up something that's already been long-decided. But if ULP-based assertions are intentionally absent from
DoubleSubject
andFloatSubject
I think the documentation should explain why that decision was made and what users should do instead. Currently the class makes no mention of ULPs, and expects callers to always pick an appropriatetolerance
.The text was updated successfully, but these errors were encountered: