Adds nlatent and nclass to Categorical #68
Isn't this information known anyway when you model something? I.e., if you choose to work with a heteroscedastic Gaussian likelihood, you should know that you need two latent GPs?
Right, sure you do. But the motivation is to be able to know this information when building the model itself, without having to specify it explicitly. If we later build an API where we get something like
The reason I need this at the moment is AugmentedGPLikelihoods.jl, where I need to specify the initial size of the augmented variables based on the likelihood: see https://github.com/JuliaGaussianProcesses/AugmentedGPLikelihoods.jl/blob/d976cc0a37639872a520d3818f44f566c58d39f6/src/likelihoods/categorical.jl#L20
I also want to mention that in AugmentedGPs.jl I initially had the categorical likelihood implemented such that it would directly infer the number of classes and other quantities, and this proved to be a bad decision. Among other problems, it is incompatible with the online setting and it fails when not all classes are present in a given minibatch.
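To make the minibatch problem concrete, here is a minimal sketch in plain Julia. The helper infer_nclasses is hypothetical and is not the AugmentedGPs.jl implementation; it only assumes that the class count is guessed from the labels seen so far.

```julia
# Hypothetical helper (not from any package): guess the class count from the labels seen.
infer_nclasses(labels) = length(unique(labels))

full_labels = [1, 2, 3, 1, 2, 3]        # the full data set has three classes
minibatch = full_labels[[1, 2, 4, 5]]   # this batch happens to miss class 3

n_full = infer_nclasses(full_labels)    # class count seen in the full data
n_batch = infer_nclasses(minibatch)     # class count seen in the minibatch only
```

Anything sized from n_batch would be one class short compared with n_full, which is the kind of failure described above.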
Codecov Report
@@            Coverage Diff             @@
##           master      #68      +/-   ##
==========================================
+ Coverage   96.50%   96.62%   +0.11%
==========================================
  Files          10       11       +1
  Lines         143      148       +5
==========================================
+ Hits          138      143       +5
  Misses          5        5
Hmm, it's still not really clear to me why this information should be encoded in the likelihood, in particular for the categorical likelihood. Currently you can evaluate it with a latent vector of arbitrary size, and it will automatically return a categorical distribution with the corresponding number of classes. I guess I'm just not familiar enough with your use case, and my confusion arises from the fact that I don't see how one would build a model without already knowing the number of GPs, classes, etc.
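The behaviour described in this comment can be sketched roughly as follows. This is an illustration only: softmax is written by hand and class_probs is a hypothetical stand-in for evaluating the likelihood, so the package's actual link and return type may differ.

```julia
# Hand-written softmax, assumed here only for illustration.
function softmax(f::AbstractVector{<:Real})
    e = exp.(f .- maximum(f))   # subtract the maximum for numerical stability
    return e ./ sum(e)
end

# Hypothetical stand-in for evaluating a categorical likelihood at a latent vector.
class_probs(f) = softmax(f)

p2 = class_probs([0.1, -0.3])             # two latent values give two class probabilities
p4 = class_probs([0.1, -0.3, 1.2, 0.0])   # four latent values give four class probabilities
```

In this sketch the number of classes is never stored anywhere; it simply follows from the length of the latent vector passed in.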
Maybe let me give a very precise example where this would matter.
Thanks for the clear example! Now the point I don't understand is: why would you want to encode this information in the likelihood if it works fine without specifying the number of classes? If you know the number of classes and you know you want to use a categorical likelihood, with a bijective or a non-bijective link, then you already know the number of latent GPs. The number of classes is needed only at that point, not in the likelihood. Or, put differently: you have to know the number of classes, and hence latent GPs, when coding your model. So why not just use this information where it matters, i.e. when coding the GP?
OK, I think our discussion here comes down to, when creating a model,
My argument for my point of view is that it puts less burden on the user in the case of a high-level interface.
I am just going to close this PR as it went stale.
As discussed in #58, I am adding an nlatent function that indicates how many latent GPs are needed by the likelihood. It also means that CategoricalLikelihood needs to know how many classes are represented.
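As a rough sketch of the idea, and not the code in this PR, an nlatent function could dispatch on the likelihood type as below. All type names here are hypothetical, as is the init_latents helper. The convention that a bijective link uses one latent GP fewer than the number of classes is an assumption made for illustration.

```julia
# Hypothetical types for illustration; these are not the package's definitions.
abstract type SketchLikelihood end

struct SketchHeteroscedasticGaussian <: SketchLikelihood end

struct SketchCategorical <: SketchLikelihood
    nclass::Int        # number of classes, stored explicitly
    bijective::Bool    # assumed flag: does the link fix one latent value?
end

# A heteroscedastic Gaussian needs two latent GPs.
nlatent(::SketchHeteroscedasticGaussian) = 2

# Assumption: a bijective link needs nclass - 1 latent GPs, a non-bijective one nclass.
nlatent(l::SketchCategorical) = l.bijective ? l.nclass - 1 : l.nclass

# Hypothetical downstream use: allocate one vector of n values per latent GP.
init_latents(l::SketchLikelihood, n::Int) = [zeros(n) for _ in 1:nlatent(l)]
```

With this, downstream code can size its variables from the likelihood alone, for example init_latents(SketchCategorical(3, false), 10), without the user passing the number of latent GPs separately.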