
add Ops.convolution #428

Draft · wants to merge 3 commits into main
Conversation

glou-nes
Contributor

@glou-nes glou-nes commented Dec 27, 2024

Add convolution, using the StableHLO function that computes the output tensor type (the one used by the verifier/interpreter). I add a structure because I hit a limit with ccall:

extern "C" void f(long a, long b, long c, long d, long e, MlirAttribute dn) {
    auto pp = unwrap(dn);
    pp.dump();
}

extern "C" void f_bad(long a, long b, long c, long d, long e, long f, MlirAttribute dn) {
    auto pp = unwrap(dn);
    pp.dump();
}

The second one segfaults.
@wsmoses do you think this approach is viable?
It will probably need a new file when #421 lands.


Seems reasonable to me but yeah I am confused about the arg thing

@mofeing
Collaborator

mofeing commented Dec 27, 2024

Why do we need to add a function to the C-API?

@glou-nes
Contributor Author

It's using the StableHLO C++ interpreter function to compute the output tensor size, which isn't directly exposed anywhere else. So ReactantExtra seems to be the best place for this.

@mofeing
Collaborator

mofeing commented Dec 28, 2024

i know computing the size of the result of stablehlo.convolution is hard, but can't it be computed from the notes in the spec? the specs are not super clear about how to compute the result shape, but i managed to do it for dot_general and i think i now understand how to do the "shape inference" for fft. it takes time, but i believe it's better to avoid putting stuff in the C-API if we can do it in Julia. if it's not doable, totally in favor of adding stuff to the C-API ofc.

in the spec of stablehlo.convolution, i've found the following constraints that are useful for computing the resulting shape:

  • (C20) is telling you that output_dimensions = [output_batch_dimension] + output_spatial_dimensions + [output_feature_dimension], so you can focus on the different dimensions.
  • (C25) says that for dim(result, result_dim)
    • dim(lhs, input_batch_dimension) / batch_group_count if result_dim = output_batch_dimension
    • dim(rhs, kernel_output_feature_dimension) if result_dim = output_feature_dimension
    • num_windows otherwise (i.e. for output_spatial_dimensions), and there are some equations on how to compute num_windows

so, if you know all the input args and attributes of stablehlo.convolution, you should be able to compute the result shape in Julia, right? correct me if i'm wrong or if there is some extra complexity i'm missing.

@wsmoses i guess we could open an issue in the StableHLO repo to ask for better docs on how to compute result shapes? it would be very useful for other ops too. at least some coherent naming would improve the current situation.

@glou-nes
Contributor Author

Yes, one can completely infer the result size from Julia. I started from Paul's implementation in ReactantNNlibExt.jl, but it was missing several attributes of the generic operation! It's probably not the hardest operation, but it's complex enough: ~100 LoC in C++. I read the spec and it's quite useful, and complex. I don't see why we should redo the computation on the Julia side: this code, inferConvolutionOp, is already executed during the verification pass. The advantages of this approach are getting the verifier error directly from Julia without rewriting the 34 constraints, and getting the full set of features.

@wsmoses
Member

wsmoses commented Dec 28, 2024

... advantages of this approach is to directly get verifier error from Julia without rewriting the 34 constraints and get the fullset of features.

I could go either way on this; to be clear, I'm fine with either approach. The advantage of having a Julia re-implementation of the rules is that it'll be easier to debug shape-mismatch issues on the Julia side. The con is that it requires re-implementation, and of course if things change that's annoying. That said, StableHLO shouldn't really change much, especially here.

Another thing you could do is export this function in the stablehlo C bindings upstream.

@mofeing
Collaborator

mofeing commented Dec 28, 2024

... without rewriting the 34 constraints and get the fullset of features.

but this is the thing: you don't need to write the 34 constraints in Julia. actually, some of these constraints are not coded as assertions in the already available Ops, but you already use them to compute the input arguments/attributes.

in the Ops i wrote myself, i didn't write all the constraints. what i did is write easy checks so that we can quickly catch superficial problems in Julia (e.g. mismatched eltypes or shapes), and i left the deep checks to XLA, which will raise an exception. now that we have propagation of XLA errors to Julia, we can also manage these XLA exceptions.

Con is it requires re implementation, and of course if things change that’s annoying. That said stablehlo shouldn’t really change much especially here.

if there are changes, i'm confident they will most likely be in these deep checks left to XLA, so we won't need to make modifications (probably).

but not computing the result shape is gonna make debugging more difficult when it's sth we can do ourselves.

having said this, i'm not against this PR and i think this is really needed, so thanks @glou-nes for pushing this forward. we can merge it and refactor it in the future if we see that it's a problem.
just wanna make it clear that there are always tradeoffs, and anything we put on the C side will make it harder to debug on the Julia side (not only for us, but also for users)


@wsmoses one thing to keep in mind is that the XLA exceptions will be thrown during the verification step, not during MLIR emission, so unless you've used with_debug, you won't be able to link the XLA exception to a region of user code. also, i don't like the current implementation of mlir_backtrace and with_debug too much: there's too much overhead in calling backtrace, and maybe it's easier if we do something like Profiler.jl does: take a snapshot of the callstack and process it later or in parallel. also, AFAIK the XLA exception will probably be seen as thrown from run_pass_pipeline!, and not from the faulting user code.

@glou-nes
Contributor Author

Thanks @mofeing for the explanation! I just want to make one point: with this, one gets the verification error directly at MLIR emission, for instance:

function bad_conv(x, ker)
    pp = dimension_number(4, 3, 2, 3, 4)
    Reactant.Ops.convolution(x, ker, pp; padding=[1 2; 3 4; 5 6]) # wrong padding here
end
x = Reactant.to_rarray(randn(Float64, 224, 224, 1, 1))
ker = Reactant.to_rarray(randn(Float64, 10, 10, 1, 1))
@code_hlo optimize=false bad_conv(x, ker)

We have:

loc("convolution"("/depot/dev/Reactant/src/Ops.jl":622:0)): error: expects padding-entries to have same dimension-size as size of window dimensions (2), but got: 3.
ERROR: AssertionError: cannot infer result type #assertion in Ops
[stacktrace]

@mofeing
Collaborator

mofeing commented Dec 29, 2024

... with this, one get the verification error, directly in MLIR emission, for instance:

loc("convolution"("/depot/dev/Reactant/src/Ops.jl":622:0)): error: expects padding-entries to have same dimension-size as size of window dimensions (2), but got: 3.
ERROR: AssertionError: cannot infer result type #assertion in Ops
[stacktrace]

you're getting the location of the Ops.jl method, but not the location of the user code that calls Ops.convolution. that's what i'm referring to.

for that, you need to use Ops.with_debug, which currently adds overhead, and the print is quite ugly.

@glou-nes
Contributor Author

glou-nes commented Dec 29, 2024

Sure, adding the stacktrace to every location would make the verifier output hard to read, for instance. Here we get the stacktrace at [stacktrace]; I didn't include it, to keep the example simple. Isn't that what you called "the location of the user code that calls Ops.convolution"?

  [1] convolution(input::TracedRArray{…}, kernel::TracedRArray{…}, dimension_numbers::Attribute; feature_group_count::Int64, batch_group_count::Int64, window_strides::Vector{…}, padding::Matrix{…}, lhs_dilation::Vector{…}, rhs_dilation::Vector{…}, window_reversal::Vector{…}, location::Location)
    @ Reactant.Ops /depot/dev/Reactant/src/Ops.jl:643
  [2] bad_conv
    @ ./REPL[19]:3 [inlined]
  [3] bad_conv(none::TracedRArray{Float64, 4}, none::TracedRArray{Float64, 4})
    @ Reactant ./<missing>:0
  [4] bad_conv
    @ ./REPL[19]:2 [inlined]
  [5] call_with_reactant(::typeof(v), ::TracedRArray{Float64, 4}, ::TracedRArray{Float64, 4})
[...]

3 participants