Adds nlatent and nclass to Categorical #68

theogf · 2022-03-10T11:06:39Z

As discussed in #58, I am adding an nlatent function that indicates how many latent GPs are needed by the likelihood.
It also means that CategoricalLikelihood needs to know how many classes are represented.

devmotion · 2022-03-10T11:38:38Z

Isn't this information known anyway when you model something? Ie, if you choose to work with a heteroscedastic Gaussian likelihood you should know that you need two latent GPs?

theogf · 2022-03-10T12:28:45Z

> Isn't this information known anyway when you model something? Ie, if you choose to work with a heteroscedastic Gaussian likelihood you should know that you need two latent GPs?

Right, sure you do. But the motivation is to make this information available when building the model itself, without having to specify it explicitly.

If we later build an API where we get something like build_model(X, y, kernel, likelihood), one can infer what kind of underlying GP structure is needed just from the likelihood.

The reason I need this at the moment is AugmentedGPLikelihoods.jl, where I need to specify the initial size of the augmented variables based on the likelihood; see https://github.com/JuliaGaussianProcesses/AugmentedGPLikelihoods.jl/blob/d976cc0a37639872a520d3818f44f566c58d39f6/src/likelihoods/categorical.jl#L20

theogf · 2022-03-10T12:58:12Z

I also want to mention that in AugmentedGPs.jl I initially implemented the categorical likelihood so that it would directly infer the number of classes (among other things), and this proved to be a bad decision. It is incompatible with the online setting, and it fails when not all classes are present in a given minibatch, among other problems.
I ended up requiring the user to give this information to the likelihood, the minimum being how many classes are present.

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

codecov · 2022-03-10T13:12:55Z

Codecov Report

Merging #68 (812e5c9) into master (e9b7da9) will increase coverage by 0.11%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #68      +/-   ##
==========================================
+ Coverage   96.50%   96.62%   +0.11%     
==========================================
  Files          10       11       +1     
  Lines         143      148       +5     
==========================================
+ Hits          138      143       +5     
  Misses          5        5

| Impacted Files | Coverage Δ |
|---|---|
| src/GPLikelihoods.jl | `100.00% <100.00%> (ø)` |
| src/TestInterface.jl | `100.00% <100.00%> (ø)` |
| src/likelihoods/categorical.jl | `100.00% <100.00%> (ø)` |
| src/likelihoods/gaussian.jl | `100.00% <100.00%> (ø)` |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

devmotion · 2022-03-10T23:04:25Z

Hmm it's still not really clear to me why this information should be encoded in the likelihood, in particular for the categorical likelihood. Currently you can evaluate it with a latent vector of arbitrary size and it will return a categorical distribution with the corresponding number of classes automatically. I guess I'm just not familiar enough with your use case and my confusion arises from the fact that I don't see how one would build a model without knowing the number of GPs, classes, etc. already.

theogf · 2022-03-11T09:47:00Z

Let me give a very precise example where this would matter.
Assume we want to do multi-class classification in an online setting.
You need to know in advance how many classes there are, because even if you planned to infer that from a first batch, there is no guarantee that all classes are present in this batch, and you might end up with the wrong number of latent GPs.
So, as you also said, we need to know the number of classes in advance, and it seems natural to me to incorporate this information directly in the likelihood.
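
The failure mode in this example can be made concrete with a small sketch. This is illustrative Python rather than code from this PR or from GPLikelihoods.jl; the function name, the label values, and the batch contents are all made up for the illustration.

```python
def infer_num_classes(labels):
    """Guess the number of classes from the labels seen in a single batch."""
    return len(set(labels))


# Hypothetical online stream with 4 classes, labelled 0..3 (assumed data).
NUM_CLASSES = 4
first_batch = [0, 2, 2, 1, 0]   # class 3 happens to be absent here
later_batch = [3, 1, 0, 3]      # class 3 shows up only later

# Sizing the model from the first batch undercounts the classes.
inferred = infer_num_classes(first_batch)

# Labels in the later batch that a model sized from the first batch cannot represent.
unseen = sorted(set(later_batch) - set(first_batch))

print(f"declared: {NUM_CLASSES}, inferred from first batch: {inferred}, unseen later: {unseen}")
```

A model sized from `inferred` would have too few latent GPs once `later_batch` arrives, which is why the number of classes has to be supplied up front in one form or another.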

devmotion · 2022-03-11T10:41:44Z

Thanks for the clear example! Now the point I don't understand is: why would you want to encode this information in the likelihood if it works fine without specifying the number of classes? If you know the number of classes and you know you want to use a categorical likelihood, with a bijective or a non-bijective link, then you already know the number of latent GPs? And the number of classes is only needed at that point, not in the likelihood.

Or, put differently: you have to know the number of classes, and hence latent GPs, when coding your model. So why not just use this information where it matters, i.e. when coding the GP?
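
The relation between the number of classes and the number of latent GPs under the two kinds of link can be sketched as follows. This is illustrative Python, not the GPLikelihoods.jl implementation; it assumes the bijective variant fixes one reference logit at 0 (so K classes need K-1 latents) while the plain softmax uses one latent per class, and all function names are hypothetical.

```python
import math


def softmax(f):
    """Numerically stable softmax over a list of latent values."""
    m = max(f)
    exps = [math.exp(x - m) for x in f]
    total = sum(exps)
    return [e / total for e in exps]


def bijective_softmax(f):
    """Softmax with a reference logit of 0 appended (assumed convention):
    K-1 latent values map to K class probabilities."""
    return softmax(list(f) + [0.0])


def num_latents(num_classes, bijective):
    """Number of latent GPs implied by the class count and the link type."""
    return num_classes - 1 if bijective else num_classes


latents = [0.3, -1.2, 0.8]            # a latent vector of arbitrary size
p_plain = softmax(latents)            # one probability per latent value
p_bij = bijective_softmax(latents)    # one more probability than latent values
```

Both links accept a latent vector of any length and return a distribution whose number of classes follows from that length, which is the behaviour described above; going the other way, from a class count to a latent count, requires knowing which link is in use.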

theogf · 2022-03-11T11:31:12Z

Ok, I think our discussion here boils down to whether, when creating a model:

1. The number of latent GPs should be inferable from the likelihood(s) itself (my point of view).
2. The number of latent GPs should be provided by the user, and the number of categories should be inferred from it (depending on the link as well) (your point of view, if I get it correctly).

My argument for my POV is that it puts less burden on the user in the case of a high-level interface.
Give me a likelihood, give me some data, and I will automatically build the right latent model. This allows for a very general interface no matter what the likelihood is (and whether it needs multiple latent GPs or not).
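
A minimal sketch of what such an interface could look like, assuming each likelihood exposes an nlatent method. This is illustrative Python, not the API proposed in this PR: the classes are hypothetical stand-ins, build_model returns a plain dictionary instead of a real GP model, and the bijective default for the categorical case is an assumption.

```python
from dataclasses import dataclass


class GaussianLikelihood:
    def nlatent(self):
        return 1  # a single latent GP for the mean


class HeteroscedasticGaussianLikelihood:
    def nlatent(self):
        return 2  # one latent GP for the mean, one for the noise level


@dataclass(frozen=True)
class CategoricalLikelihood:
    num_classes: int          # must be supplied by the user
    bijective: bool = True    # assumed default: K classes need K-1 latents

    def nlatent(self):
        return self.num_classes - 1 if self.bijective else self.num_classes


def build_model(X, y, kernel, likelihood):
    """Hypothetical high-level constructor: one latent GP per likelihood.nlatent()."""
    latent_gps = [{"kernel": kernel, "inputs": X} for _ in range(likelihood.nlatent())]
    return {"likelihood": likelihood, "latent_gps": latent_gps, "targets": y}


X, y = [[0.0], [1.0], [2.0]], [0, 3, 1]
model = build_model(X, y, "rbf", CategoricalLikelihood(num_classes=4))
print(len(model["latent_gps"]))
```

The caller never states the number of latent GPs; it follows from the likelihood alone, at the cost of the categorical likelihood having to carry the class count.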

theogf · 2024-04-07T15:49:56Z

I am just going to close this PR as it went stale.

Adds nlatent and nclass to Categorical

08c31f7

github-actions bot reviewed Mar 10, 2022

theogf and others added 3 commits March 10, 2022 13:58

Apply suggestions from code review

69f976e

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Fixed various issues

3424cc9

Formatting

a606ac7

Merge branch 'master' into nlatent

812e5c9

theogf closed this Apr 7, 2024
