
Online Learning #136

Open
caxelrud opened this issue Nov 27, 2024 · 4 comments

Comments

@caxelrud

Is it possible to do Online Learning with LaplaceRedux?
In other words, is it possible to use a previously calculated posterior as the prior for a new evaluation?

@pat-alt
Member

pat-alt commented Nov 28, 2024

Hi there @caxelrud 👋🏽 since both the prior and posterior are Gaussian, I don't see why this couldn't work. Apparently it's been done before, but I'm not familiar with the details: https://proceedings.neurips.cc/paper_files/paper/2018/file/f31b20466ae89669f9741e047487eb37-Paper.pdf

@caxelrud
Author

caxelrud commented Nov 29, 2024

Hi!
I am checking now whether the code already has this functionality.
Please share any comments on the existing code and its functionality as they relate to this feature.
Regards,

@caxelrud
Author

caxelrud commented Dec 10, 2024

Hi,
At this point I am interested in training an existing model with more data, i.e. in using a previously calculated posterior as the prior for a new evaluation.
Looking into the documentation, the LaplaceRedux.Posterior Type has:

  • posterior_mean::AbstractVector: the MAP estimate of the parameters
  • P::Union{AbstractArray,AbstractDecomposition,Nothing}: the posterior precision matrix

The LaplaceRedux.Prior has:

  • prior_mean::Real: the prior mean
  • prior_precision_matrix::Union{Nothing,AbstractMatrix,UniformScaling}: the prior precision matrix

Since prior_mean in the Prior type is a scalar, I can't use the posterior_mean vector directly.
So, let me know your thoughts on how to overcome this limitation.
Thanks!

@pat-alt
Member

pat-alt commented Dec 10, 2024

The prior_mean field is only used in prior optimization (optimize_prior), which in the current implementation is done through marginal likelihood maximization. Still, this could be worth addressing in the future (#138).

You can still use the posterior mean as a prior, of course, by using it as a regularizer when training on new data: the Gaussian posterior now acting as your Gaussian prior is equivalent to training with weight decay (see Daxberger (2021) and also here and here). The standard Ridge penalty in Flux corresponds to a zero-mean prior, but that should be straightforward to adjust in your code. This way the posterior mean will act as a prior that pulls your MAP estimate when training on new data.
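To make the non-zero-mean weight decay concrete, here is a minimal sketch of the underlying math (written in Python/numpy for illustration only; none of these names come from LaplaceRedux or Flux, and the quadratic "data loss" is a toy stand-in for your negative log-likelihood). The penalty $(\tau/2)\lVert\theta - \mu_0\rVert^2$ is the negative log of a Gaussian prior centred at the old posterior mean $\mu_0$, so the regularized optimum is pulled towards $\mu_0$ rather than towards zero as with standard Ridge:

```python
import numpy as np

def penalised_loss(theta, data_loss, mu_prior, tau):
    """data_loss: callable returning the loss on the new data.
    Adds (tau/2)||theta - mu_prior||^2, i.e. weight decay centred
    at mu_prior instead of at zero."""
    diff = theta - mu_prior
    return data_loss(theta) + 0.5 * tau * diff @ diff

# Toy setup: a quadratic data loss whose unregularized optimum is theta = 1,
# and an old posterior mean (MAP from a previous fit) acting as the prior mean.
mu_prior = np.array([2.0, -1.0])
tau = 1.0
data_loss = lambda th: 0.5 * np.sum((th - 1.0) ** 2)

# For this toy case the penalised MAP has a closed form: a precision-weighted
# average of the data optimum (1) and the prior mean.
theta_star = (1.0 * 1.0 + tau * mu_prior) / (1.0 + tau)
```

In a Flux training loop the same idea amounts to adding `0.5 * tau * sum(abs2, theta - mu_prior)` to the loss instead of the usual zero-centred penalty.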

As for using the posterior precision matrix as your new prior, it's worth noting that prior_precision_matrix actually does enter computations elsewhere (not just in optimize_prior), e.g. here in calculating the posterior precision. So here it is indeed crucial to also supply that value when instantiating your new Laplace object. Of course, you should then also use it when training with weight decay (so the old posterior precision becomes $\mathbf{H}_0$ here).
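For intuition on why feeding the posterior precision back in as $\mathbf{H}_0$ is the right recursion, here is a sketch in the one case where it can be checked exactly: a linear-Gaussian model, where the Laplace approximation is exact and "posterior becomes prior" reproduces the full-batch posterior. This is Python/numpy illustrating the math only, not LaplaceRedux API:

```python
import numpy as np

def bayes_linear_update(mu0, H0, X, y, noise_var=1.0):
    """One conjugate update for y ~ N(X @ theta, noise_var * I) with
    prior N(mu0, H0^-1). Returns the posterior mean and precision."""
    H_new = H0 + X.T @ X / noise_var            # prior precision + data curvature
    mu_new = np.linalg.solve(H_new, H0 @ mu0 + X.T @ y / noise_var)
    return mu_new, H_new

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true + 0.1 * rng.normal(size=20)

mu0, H0 = np.zeros(3), np.eye(3)                # initial prior

# Posterior from all 20 observations at once
mu_full, H_full = bayes_linear_update(mu0, H0, X, y)

# Online: fit the first 10, then feed that posterior back in as the prior
mu1, H1 = bayes_linear_update(mu0, H0, X[:10], y[:10])
mu2, H2 = bayes_linear_update(mu1, H1, X[10:], y[10:])
# mu2/H2 match mu_full/H_full exactly in this linear-Gaussian case
```

With a neural network the Laplace posterior is only a Gaussian approximation, so the recursion is approximate rather than exact, which matches the online Laplace paper linked above.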

Please be aware of some limitations here: our package was never designed to be a training framework. It merely ships the functionality for fitting the LA to neural networks trained in Flux in a post-hoc fashion. I should also flag that my own research is in a different field, so I'm by no means an expert on LA and am just brainstorming my thoughts here about your problem setup.
