
Add the GTPSA.jl backend #329

Open
wants to merge 47 commits into base: main

Conversation

mattsignorelli

@gdalle
Member

gdalle commented Jun 25, 2024

@mattsignorelli thank you for the contribution!
I fixed a few things and defined AutoGTPSA in the main package for now. Types should not be defined in package extensions because it makes them hard to import.
The main part missing is the pushforward operator. If you don't have a dedicated mechanism for that, you can always deduce it from derivative, gradient or jacobian depending on the function type (array/scalar -> array/scalar). It will be slow but it will work.
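
For reference, a pushforward deduced from a Jacobian in the array-to-array case is just a Jacobian-vector product. A minimal sketch (the name fallback_pushforward and the my_jacobian argument are illustrative placeholders, not DI API):

# Sketch: deduce a pushforward from a Jacobian (array-in, array-out case).
# `my_jacobian` stands in for whatever Jacobian routine the backend provides.
function fallback_pushforward(f, x::AbstractVector, dx::AbstractVector, my_jacobian)
    J = my_jacobian(f, x)  # full Jacobian of f at x
    return J * dx          # directional derivative J(x) * dx
end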

@gdalle gdalle linked an issue Jun 25, 2024 that may be closed by this pull request
@gdalle gdalle added the backend label (Related to one or more autodiff backends) Jun 25, 2024
@codecov-commenter

codecov-commenter commented Jun 25, 2024

Codecov Report

Attention: Patch coverage is 0% with 416 lines in your changes missing coverage. Please review.

Project coverage is 54.75%. Comparing base (d51fc0a) to head (77e49d6).

Files with missing lines                                   Patch %   Lines
...ace/ext/DifferentiationInterfaceGTPSAExt/onearg.jl        0.00%   287 Missing ⚠️
...ace/ext/DifferentiationInterfaceGTPSAExt/twoarg.jl        0.00%   119 Missing ⚠️
...face/ext/DifferentiationInterfaceGTPSAExt/utils.jl        0.00%     9 Missing ⚠️
...erfaceGTPSAExt/DifferentiationInterfaceGTPSAExt.jl        0.00%     1 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (d51fc0a) and HEAD (77e49d6): HEAD has 38 fewer uploads than BASE.

Flag   BASE (d51fc0a)   HEAD (77e49d6)
DI                 41               11
DIT                10                2
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #329       +/-   ##
===========================================
- Coverage   98.51%   54.75%   -43.76%     
===========================================
  Files         108      103        -5     
  Lines        4297     4597      +300     
===========================================
- Hits         4233     2517     -1716     
- Misses         64     2080     +2016     
Flag   Coverage           Δ
DI     52.44% <0.00%>   (-46.14%) ⬇️
DIT    60.01% <ø>       (-38.37%) ⬇️

Flags with carried forward coverage won't be shown.

@gdalle gdalle marked this pull request as draft June 25, 2024 09:05
@gdalle gdalle marked this pull request as ready for review July 8, 2024 04:41
@gdalle gdalle marked this pull request as draft July 30, 2024 14:05
@mattsignorelli
Author

Sorry for the delay on this; I've had my hands tied with several different projects. Two weeks ago I pushed major updates, fixes, and finishing touches to GTPSA.jl, which is now in its first stable release. I'll get back to this shortly.

@gdalle
Member

gdalle commented Aug 7, 2024

The structure of the tests has changed a little since the last time you checked. Can you merge main into this PR?

@gdalle gdalle marked this pull request as draft September 18, 2024 08:37
@gdalle
Member

gdalle commented Sep 18, 2024

Sorry for taking so long to get back to you @mattsignorelli! I have brought your PR up to date with the rest of the package and taken into account the new version of ADTypes, which includes AutoGTPSA.
The last thing I'd like to do before merging is clean up the code a little bit by removing duplication in favor of clear auxiliary functions. For example, my last commit cuts a lot of code from pushforward thanks to

function initialize!(xt::TPS, x::Number, dx::Number)
    xt[0] = x   # zeroth-order (value) part
    xt[1] = dx  # first-order coefficient of the first variable (the seed)
    return xt
end

function initialize!(xt::AbstractArray{<:TPS}, x::AbstractArray, dx::AbstractArray)
    for i in eachindex(xt, x, dx)
        initialize!(xt[i], x[i], dx[i])
    end
    return xt
end

Can you maybe create similar functions to shorten the other operators? Just like TPS initialization, extraction from TPS into numbers and arrays could also be streamlined.
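
As an illustration, such extraction helpers could mirror initialize! (the names extract and extract! are hypothetical; only the t[0] value accessor used elsewhere in this thread is assumed):

extract(yt::TPS) = yt[0]  # zeroth-order (value) part of a TPS

function extract!(y::AbstractArray, yt::AbstractArray{<:TPS})
    for i in eachindex(y, yt)
        y[i] = yt[i][0]  # copy the value part of each TPS into a plain array
    end
    return y
end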

@mattsignorelli
Author

mattsignorelli commented Sep 18, 2024

Thanks for your work on this!

function initialize!(xt::TPS, x::Number, dx::Number)
    xt[0] = x
    xt[1] = dx
    return xt
end

I think we need to be careful here. This would only apply for single-variable functions: xt[1] = dx basically sets the dual part of the first variable to dx. However, if there are two variables, then to set the dual part of the second variable, you would need to do xt[2] = dx. E.g.

julia> d = Descriptor(2,1); # two variables to first order

julia> f = TPS(use=d);

julia> f[0] = 1; f[1] = 2; f[2] = 3;

julia> print(f) # corresponds to f(dx1,dx2) = 1 + 2*dx1 + 3*dx2
TPS64:
 Coefficient                Order   Exponent
  1.0000000000000000e+00      0      0   0
  2.0000000000000000e+00      1      1   0
  3.0000000000000000e+00      1      0   1

Also, because the TPS type is mutable, we really only want to evaluate the function once. I'll make some changes, try to simplify and remove code duplication using utility functions, and push for review.

@mattsignorelli
Author

I think I understand what Tangents are now. Just to check: to take the pushforward of a 3-variable function with seed [dx1, dx2, dx3], your Tangents object will be Tangents{1, Vector{Float64}}(([dx1, dx2, dx3]))? And so Tangents is there to store multiple seeds?

@gdalle
Member

gdalle commented Sep 18, 2024

Tangents is there to store multiple seeds?

The expression "multiple seeds" is a bit unclear. For instance, for a function f(x::AbstractVector), DI considers dx::AbstractVector to be "a single seed". It used to be possible to call pushforward(f, backend, x, dx) but that is no longer the case.
From v0.6 you will need to use the wrapper pushforward(f, backend, x, Tangents(dx)), and recover a Tangents{1} as the result. On the other hand, if you have several seeds dxa, dxb, dxc (perhaps clearer than numbers?) inside Tangents(dxa, dxb, dxc), then your batch size is 3 and you will get a Tangents{3} as the result.
Sorry if this is not well documented, you're the first outside person to ever contribute an extension so I didn't plan for that case ^^ But if there are unclear aspects I should definitely improve the docs.
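
To make that concrete, here is a rough sketch of both situations, assuming the v0.6 API described above (the function, point, and seeds are made up for illustration; the backend choice is arbitrary):

using DifferentiationInterface
using ADTypes: AutoGTPSA

f(x) = [sum(x), prod(x)]
x = [1.0, 2.0, 3.0]
backend = AutoGTPSA()

# one seed: batch size 1, the result is a Tangents{1}
ty = pushforward(f, backend, x, Tangents([1.0, 0.0, 0.0]))

# three seeds at once: batch size 3, the result is a Tangents{3}
tys = pushforward(f, backend, x, Tangents([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]))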

@gdalle
Member

gdalle commented Sep 18, 2024

I think we need to be careful here. This would only apply for single-variable functions

That's the setting considered in DI. Even with all the latest changes, there will only ever be one active variable x, either in f(x) or in f!(y, x). Of course that variable can itself be an array.

@mattsignorelli
Author

Ah OK, I think I understand better. I'm admittedly not super familiar with the usual conventions used by most AD packages in Julia, only those in GTPSA, so I'm trying to connect the dots here. Let me explain how it works in GTPSA, and then maybe you can help me see how to fit it into DI:

Basically, GTPSA calculates truncated power series in the variables, assuming the variables are small. For example, to calculate a Jacobian of a function $F: \mathbb{R}^n \to \mathbb{R}^m$, you would define a Descriptor(n, 1) for $n$ variables to order 1, and then use x = vars(d), which gives you a vector that in the usual AD language is actually $[\Delta x_1, \Delta x_2, \ldots, \Delta x_n]$ ("variables" in GTPSA are small, so this might be better understood as dx = vars(d)). Then, to calculate the multivariable Taylor expansion of the function $F$ around some point $\vec{a}$, you would just do y = F(a .+ x). Here y is the multivariable Taylor expansion, and you can extract the Jacobian from its first-order monomial coefficients.

Alternatively, one could do this with one variable at a time, e.g. calculate $F([a_1+\Delta x_1, a_2, ..., a_n])$, then $F([a_1, a_2+\Delta x_1, ..., a_n])$, etc., and extract the first-order part of the TPS result for each component of each calculation. However, this will be slower than propagating all of $F(\vec{a}+[\Delta x_1, \Delta x_2, \ldots, \Delta x_n])$ at once, because TPSs are mutable and we only want to evaluate the function once.
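
Concretely, that workflow looks roughly like this (F and the expansion point a are made up; vars and the integer TPS indexing mirror the setter usage shown earlier in this thread):

using GTPSA

F(v) = [v[1] * v[2], v[1] + 3 * v[2]^2]   # toy map from R^2 to R^2

d  = Descriptor(2, 1)   # two variables, truncated at first order
dx = vars(d)            # the "variables", i.e. [Δx1, Δx2]
a  = [1.0, 2.0]         # expansion point

y = F(a .+ dx)          # first-order Taylor expansion of F around a

# the Jacobian entries are the first-order monomial coefficients:
J = [y[i][j] for i in 1:2, j in 1:2]   # y[i][j] = coefficient of Δxj in y[i]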

@gdalle
Member

gdalle commented Sep 19, 2024

That makes sense, thank you. But then I think the initialize! function I wrote earlier is fine? At least it passes the tests ^^
Also, this makes me wonder if we could compute HVPs by considering both x and dx as variables somehow. With higher-level differentiation utilities like those in GTPSA, there must be something better to do than computing the full Hessian first.

@gdalle gdalle closed this Sep 19, 2024
@gdalle gdalle reopened this Sep 19, 2024
@gdalle gdalle removed this from the v0.6 milestone Sep 23, 2024
@gdalle gdalle mentioned this pull request Sep 25, 2024
@mattsignorelli
Author

OK, sorry for the big delay; I have been completely consumed by my other project, but I'm ready to get back to this now.

Also, this makes me wonder if we could compute HVPs by considering both x and dx as variables somehow. With higher-level differentiation utilities like those in GTPSA, there must be something better to do than computing the full Hessian first.

That's a great idea. I'll look into this and see if I can come up with anything.

I'll take a look at the latest code and dev guide to reorient myself, and be in touch soon.


codecov bot commented Jan 6, 2025

Codecov Report

Attention: Patch coverage is 98.50374% with 6 lines in your changes missing coverage. Please review.

Project coverage is 57.45%. Comparing base (6604be2) to head (731e9da).
Report is 1 commit behind head on main.

Files with missing lines                                   Patch %   Lines
...ace/ext/DifferentiationInterfaceGTPSAExt/onearg.jl       98.10%   6 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (6604be2) and HEAD (731e9da): HEAD has 31 fewer uploads than BASE.

Flag   BASE (6604be2)   HEAD (731e9da)
DI                 35                6
DIT                 8                6
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #329       +/-   ##
===========================================
- Coverage   97.61%   57.45%   -40.16%     
===========================================
  Files         112      107        -5     
  Lines        5609     5888      +279     
===========================================
- Hits         5475     3383     -2092     
- Misses        134     2505     +2371     
Flag   Coverage            Δ
DI     44.07% <98.50%>   (-54.70%) ⬇️
DIT    85.76% <ø>        (-9.59%) ⬇️

Flags with carried forward coverage won't be shown.

@mattsignorelli
Author

@gdalle This should be ready soon. For HVP, I'm having difficulty understanding the tangents. For example, one of the tests gives a 6x6 Hessian matrix, but the tangent is a scalar number. Shouldn't the tangent be a vector?

@mattsignorelli
Author

OK, other than hvp we are pretty much done; all other tests pass. In order to achieve full code coverage, I'm adding the kwarg unsafe_inbounds to GTPSA's getter functions, so that I can run the tests with AutoGTPSA{Descriptor} and just let the Descriptor have many variables to second order (before, I was just running the tests with AutoGTPSA{Nothing}). Once this is released I will bump the version.

I'm having a bit of trouble with hvp. It's basically the same code that worked a few months ago, but now it's showing disagreements. Any help or clarification on my comment above would be much appreciated :) On that note, I looked into whether there is a faster way to calculate the hvp without materializing the full Hessian, and unfortunately, because TPSs cannot be nested, there isn't really any other option than materializing the full Hessian. There are some other ways I found, but none of them are faster.
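
In other words, once the full Hessian is materialized the HVP reduces to a plain matrix-vector contraction, roughly like this (sketch; my_hessian is a placeholder for however the extension builds the Hessian from a second-order TPS):

# Sketch: HVP by materializing the full Hessian first, then contracting with v.
function fallback_hvp(f, x::AbstractVector, v::AbstractVector, my_hessian)
    H = my_hessian(f, x)  # full (symmetric) Hessian of f at x
    return H * v          # Hessian-vector product
end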

@mattsignorelli
Author

mattsignorelli commented Jan 8, 2025

Actually, I realized I made a mistake about which tests were being run when I wrote my previous comment. I found the problem.

@mattsignorelli mattsignorelli marked this pull request as ready for review January 8, 2025 14:48
@mattsignorelli mattsignorelli requested a review from gdalle as a code owner January 8, 2025 14:48
Labels: backend (Related to one or more autodiff backends)
Successfully merging this pull request may close these issues: Adding GTPSA.jl to the interface
3 participants