-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change _plogpdf function to accept four arguments #199
Conversation
Could you please provide a description for the PR? At first glance, it’s not clear why this change is necessary. Additionally, please include tests that were failing before this change and are passing afterward. Thanks! |
This is needed because many of the projections that are failing are due to presence of factorial, gamma, beta etc. type of functions in the log-partition and base measure. Because these functions are numerically challenging tests in ExponentialFamilyProjections either fail or display strange behavior such as extra factor of 2 when projecting into Chisq. I noticed a strange behavior when writing tests for projecting into Poisson. When the lambda parameter of Poisson is large, due to presence of factorial in the basemeasure, projection is wrong. When the projection is done with unnormalized density these problems are solved. Therefore, the test suit in ExponentialFamilyProjection needs to be implemented with _plogpdf removing all the basemeasure and log-partition factors. In the current implementation of _plogpdf only logpartition is excluded. By excluding basemeasure we can correctly project into Chisq and Poisson. This why it is needed to include the 4th argument. |
@ismailsenoz @bvdmitri I will add a small test for this functionality |
AFAIK none of the tests in ExponentialFamily were failing and there is no explicit tests for _plogpdf functions beforehand anyways. This function was implicitly tested with all logpdf and pdf calls. |
Yeah, you are right, but if it is supposed to be used somewhere, this interface from a hidden optimisation for big containers becomes functionality. _plogpdf was initially added just to faster compute logpdf for big containers. |
I see, that clarifies it for me. Your description makes sense, and to avoid any confusion in the future, it would be beneficial to open a PR with a comprehensive explanation of the change. It would also be helpful to mention that the existing tests should cover this modification. However, I’m still unclear about how this change would assist with projection, as we don’t seem to call Perhaps you refer to the |
It's assisting from the perspective of the target function. If you try to project x -> logpdf(Poisson, x) where the Poisson distribution has a very large natural parameter, the projection fails to recover the correct value. The base measure has factorial in its computations, and small and large natural parameters become indistinguishable. However, if you project x -> log(x) * η, it works fine and correctly recovers the η parameter. The proposal is to use unnormalized densities as the target function inside projection procedures for tests to make them more stable, at least for the distributions where to compute logpdf is numerically problematic such as Chisq, Poisson, Gamma. |
The function in the test setups for exponential family projection It is not related to the method whatsoever. It is just related to the tests. |
Yes, we unfortunately can not manipulate what user provides us, but we can decide what we are testing. And @ismailsenoz point is that it does not make much of a sense to test for Poisson or Chisq projection of them trough their logpdf, it just gives unreasonable values. |
I understand the proposal for the tests, but how would you practically implement this during actual inference? In practice,
But this is precisely what will happen during real inference.
True, but we shouldn’t simplify our tests just to ensure they pass. There are many ways to set up tests—for example, adjusting If tests for |
The problem of the test suit is it is trying to check whether the projection gets finer with increasing number of samples and iterations. In order to test this property properly we need to ensure that the projection is correct. In practice we can not control what kind of function is passed. But in practice no one is going to try projecting normal into normal or poisson into poisson as well. So the fact that some of projection tests passing doesn't mean they are useful neither. If someone tries to project unnormalized density involving factorials or gamma it is a problem but that is not the case often.
We are not trying to simplify the tests. We are actually trying to show that these tests are not really testing what you intend them to test unless you ensure that the projection is correct. I do not think there is a remedy for the generic case. If a user is really interested in passing target functions that involve numerically unstable subroutines we can not do much about that. |
It depends on the use case. In ReactiveMP, we can use that for example if our node is inside the exponential family, we will use _plogpdf to send a message with fixed log base measures and log partition function set to 0. In theory, these values should not change the result, so there is no reason to compute them. For arbitrary use cases, there is no cure. If a user provides a numerically poor implemented target function, we cannot recover anything reasonable from it. However, we can include this example in our documentation to show users that it will work one way but not another. It's important to emphasize that this is not a problem with the method itself, but rather with the target function. Users should be careful about what they provide. |
Thanks for the explanations. It's good that you've observed this behaviour, and its great that you're testing this scenarios. I would also try to investigate (examples are below):
import SpecialFunctions: loggamma
logbasemeasure(ef::ExponentialFamilyDistribution{Chisq}, x) = -x / 2
function logpdf_chisq(ef::ExponentialFamilyDistribution{Chisq}, x)
η = getnaturalparameters(ef)
_statistics = sufficientstatistics(ef, x)
_logpartition = logpartition(ef, x)
_logbasemeasure = logbasemeasure(ef, x)
return _logbasemeasure + ExponentialFamily._scalarproduct(Chisq, η, _statistics) - _logpartition
end
function logpdf_chisq_without_logbasemeasure(ef::ExponentialFamilyDistribution{Chisq}, x)
η = getnaturalparameters(ef)
_statistics = sufficientstatistics(ef, x)
_logpartition = logpartition(ef, x)
return ExponentialFamily._scalarproduct(Chisq, η, _statistics) - _logpartition
end
function logpdf_chisq_without_logpartition(ef::ExponentialFamilyDistribution{Chisq}, x)
η = getnaturalparameters(ef)
_statistics = sufficientstatistics(ef, x)
_logbasemeasure = logbasemeasure(ef, x)
return _logbasemeasure + ExponentialFamily._scalarproduct(Chisq, η, _statistics)
end
function logpdf_chisq_without_logbasemeasure_and_logpartition(ef::ExponentialFamilyDistribution{Chisq}, x)
η = getnaturalparameters(ef)
_statistics = sufficientstatistics(ef, x)
return ExponentialFamily._scalarproduct(Chisq, η, _statistics)
end
function logpdf_chisq_with_big_floats(ef::ExponentialFamilyDistribution{Chisq}, x)
η = getnaturalparameters(ef)
(η1,) = BigFloat.(ExponentialFamily.unpack_parameters(Chisq, η))
_statistics = sufficientstatistics(ef, x)
o = one(η1)
_logpartition = loggamma(BigFloat(η1 + o)) + (η1 + o) * logtwo
_logbasemeasure = -x / 2
return _logbasemeasure + ExponentialFamily._scalarproduct(Chisq, η, _statistics) - _logpartition
end dist = Chisq(100.0)
ef = convert(ExponentialFamilyDistribution, dist)
f1 = (x) -> logpdf(ef, x)
f2 = (x) -> logpdf_chisq(ef, x)
f3 = (x) -> logpdf_chisq_without_logbasemeasure(ef, x)
f4 = (x) -> logpdf_chisq_without_logpartition(ef, x)
f5 = (x) -> logpdf_chisq_without_logbasemeasure_and_logpartition(ef, x)
f6 = (x) -> logpdf_chisq_with_big_floats(ef, x) project_to(ProjectedTo(Chisq), f1) = Chisq{Float64}(ν=50.251845186537395)
project_to(ProjectedTo(Chisq), f2) = Chisq{Float64}(ν=12.415111710651729)
project_to(ProjectedTo(Chisq), f3) = Chisq{Float64}(ν=13.682445714425805)
project_to(ProjectedTo(Chisq), f4) = Chisq{Float64}(ν=50.27226218442978)
project_to(ProjectedTo(Chisq), f5) = Chisq{Float64}(ν=99.33590564858314)
project_to(ProjectedTo(Chisq), f6) = Chisq{Float64}(ν=50.25184518653741) Another question: Does this pattern repeat for all exponential family distributions that have a non-constant base measure? Perhaps the issue is not directly related to the stability or instability of both terms but rather to the fact that the base measure is not constant, which could be influencing the projection process. Both I understand your point, and @Nimrais may be correct in suggesting that this approach works for certain nodes in ReactiveMP, though not universally. However, I have a feeling that there might be a larger issue here. It feels like we might be overlooking something or seeing a different bug. I’m just trying to verify our assumptions. I have this feeling because replacing |
Extra thought, we actually use (gradient of) ‘logpartition’ term in the projection step, which is presumably unstable for certain distributions, but if we remove it only from logpdf we indeed get a correct estimate. But the projection still uses this term to compute the gradient. Which makes me think again that the problem is not in stability/instability but rather something more complex :/ |
We still can merge this, maybe we would need it in the future, just saying. The tests are failing though |
I can take a look at the failing tests after we merge logbasemeasure PR. And then we can merge perhaps. |
Looks like the tests are failing due to the changed API of |
No description provided.