forked from llvm/torch-mlir
-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/backport_ea1_ops #105
Draft
mgehre-amd
wants to merge
1,376
commits into
main
Choose a base branch
from
feature/backport_ea1_ops
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Closed
As of Sep 14, the torch-mlir binary [wheels](https://github.com/llvm/torch-mlir-release/releases/tag/dev-wheels) got renamed to `torch-mlir-core` from `torch-mlir`: ![image](https://github.com/user-attachments/assets/152e4977-71ef-4f57-8757-6dc75f72b670) This was an unintended side-effect of the recent change of `TORCH_MLIR_ENABLE_ONLY_MLIR_PYTHON_BINDINGS=True` (llvm#3711) which skips setting `NAME = "torch-mlir"` in [setup.py](https://github.com/llvm/torch-mlir/blob/main/setup.py#L226-L232). To avoid having multiple downstreams fix their pip deps, this change allows using the same `torch-mlir` name for binaries, and reserves a separate `torch-mlir-ext` name for the (less popular) binaries with extensions enabled.
Issue: nod-ai/SHARK-ModelDev#856 Related tests: ![Screenshot 2024-10-01 175305](https://github.com/user-attachments/assets/0dc0901b-058f-427c-a596-9e806fd38836)
Bumps [externals/llvm-project](https://github.com/Xilinx/llvm-project) from `daa3383` to `09ddec3`. - [Commits](Xilinx/llvm-project@daa3383...09ddec3) --- updated-dependencies: - dependency-name: externals/llvm-project dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>
Add the operation with lowering to linalg. Includes a test for end-to-end correctness.
Addresses ~200 onnx model compile failures in <https://github.com/nod-ai/SHARK-TestSuite> related to <iree-org/iree#18631>. This change simplifies the result of the generated broadcast op substantially, but reduces the case coverage slightly. The case which will become unsupported: - trying to actually broadcast a dynamic dim that is secretly 1. When does this case appear in practical scenarios? - for a model where onnx shape inference cannot figure out that a dim should be 1. Why do I think we should not support this case for now? 1. For all models with dynamic dim expand ops, the previous path uniformly generates uglier linalg IR (making it harder for IREE to fuse properly with other ops). 2. For models failing shape inference castastrophically enough to fail to see a dim is statically 1, we can try to apply constant folding in the onnx model before importing. Leaving this as a draft PR, since it may be more appropriate to fix the compilation failure in IREE rather than torch-mlir. ### Example of broadcast required in previous path: ```mlir %300 = linalg.generic {indexing_maps = [#map11], iterator_types = ["parallel", "parallel", "parallel", "parallel"]} outs(%299 : tensor<?x12x?x?xi1>) { ^bb0(%out: i1): %306 = linalg.index 0 : index %307 = linalg.index 3 : index %308 = arith.index_cast %285 : i64 to index %309 = arith.cmpi eq, %308, %c1 : index %310 = arith.select %309, %c0, %306 : index %311 = arith.index_cast %286 : i64 to index %312 = arith.cmpi eq, %311, %c1 : index %313 = arith.select %312, %c0, %307 : index %extracted_79 = tensor.extract %reshape_78[%310, %c0, %c0, %313] : tensor<?x1x1x?xi1> linalg.yield %extracted_79 : i1 } -> tensor<?x12x?x?xi1> ``` ### Example of broadcast with simplified shape list: ```mlir %409 = linalg.generic {indexing_maps = [#map15, #map11], iterator_types = ["parallel", "parallel", "parallel", "parallel"]} ins(%reshape_135 : tensor<?x1x1x?xi1>) outs(%408 : tensor<?x12x?x?xi1>) { ^bb0(%in: i1, %out: i1): linalg.yield %in : i1 } -> tensor<?x12x?x?xi1> ```
…lvm-project-09ddec3 Bump externals/llvm-project from `daa3383` to `09ddec3`
… generic ops (llvm#3762) This is motivated by the fact that shapes are stored as tensors in ONNX, and IREE tries to perform tensor arithmetic on the device. This causes unnecessary dispatches, and makes it harder for the compiler to reason about shapes. Here is a small snippet of torch-IR that is typical seen coming from ONNX models: ```mlir module { func.func @main_graph(%arg0: !torch.vtensor<[?,?,768],f32>, %arg1: !torch.vtensor<[?,?,768],f32>) -> !torch.vtensor<[],si64> { %int0 = torch.constant.int 0 %0 = torch.vtensor.literal(dense<0> : tensor<1xsi64>) : !torch.vtensor<[1],si64> %1 = torch.aten._shape_as_tensor %arg1 : !torch.vtensor<[?,?,768],f32> -> !torch.vtensor<[3],si64> %2 = torch.aten.index_select %1, %int0, %0 : !torch.vtensor<[3],si64>, !torch.int, !torch.vtensor<[1],si64> -> !torch.vtensor<[1],si64> %3 = torch.aten.squeeze.dim %2, %int0 : !torch.vtensor<[1],si64>, !torch.int -> !torch.vtensor<[],si64> %4 = torch.aten.item %3 : !torch.vtensor<[],si64> -> !torch.int %5 = torch.aten.eq.int %4, %int0 : !torch.int, !torch.int -> !torch.bool %6 = torch.aten.Int.bool %5 : !torch.bool -> !torch.int %7 = torch.aten.size.int %arg0, %int0 : !torch.vtensor<[?,?,768],f32>, !torch.int -> !torch.int %8 = torch.prim.NumToTensor.Scalar %6 : !torch.int -> !torch.vtensor<[],i1> %9 = torch.prim.NumToTensor.Scalar %7 : !torch.int -> !torch.vtensor<[],si64> %10 = torch.prim.NumToTensor.Scalar %4 : !torch.int -> !torch.vtensor<[],si64> %11 = torch.aten.where.self %8, %9, %10 : !torch.vtensor<[],i1>, !torch.vtensor<[],si64>, !torch.vtensor<[],si64> -> !torch.vtensor<[],si64> return %11 : !torch.vtensor<[],si64> } } ``` Without the change in this PR, the result would be: ```mlir #map = affine_map<() -> ()> module { ml_program.global private mutable @global_seed(dense<0> : tensor<i64>) : tensor<i64> func.func @main_graph(%arg0: tensor<?x?x768xf32>, %arg1: tensor<?x?x768xf32>) -> tensor<i64> { %c0_i64 = arith.constant 0 : i64 %c0 = arith.constant 0 : index %dim = tensor.dim %arg1, %c0 : tensor<?x?x768xf32> %0 = arith.index_cast %dim : index to i64 %1 = tensor.empty() : tensor<1xi64> %collapsed = tensor.collapse_shape %1 [] : tensor<1xi64> into tensor<i64> %2 = linalg.fill ins(%0 : i64) outs(%collapsed : tensor<i64>) -> tensor<i64> %extracted = tensor.extract %2[] : tensor<i64> %3 = arith.cmpi eq, %extracted, %c0_i64 : i64 %dim_0 = tensor.dim %arg0, %c0 : tensor<?x?x768xf32> %4 = arith.index_cast %dim_0 : index to i64 %5 = tensor.empty() : tensor<i1> %6 = linalg.fill ins(%3 : i1) outs(%5 : tensor<i1>) -> tensor<i1> %7 = tensor.empty() : tensor<i64> %8 = linalg.fill ins(%4 : i64) outs(%7 : tensor<i64>) -> tensor<i64> %9 = linalg.fill ins(%extracted : i64) outs(%7 : tensor<i64>) -> tensor<i64> %10 = linalg.generic {indexing_maps = [#map, #map, #map, #map], iterator_types = []} ins(%6, %8, %9 : tensor<i1>, tensor<i64>, tensor<i64>) outs(%7 : tensor<i64>) { ^bb0(%in: i1, %in_1: i64, %in_2: i64, %out: i64): %11 = arith.select %in, %in_1, %in_2 : i64 linalg.yield %11 : i64 } -> tensor<i64> return %10 : tensor<i64> } } ``` With the change in this PR, we would instead get: ```mlir module { ml_program.global private mutable @global_seed(dense<0> : tensor<i64>) : tensor<i64> func.func @main_graph(%arg0: tensor<?x?x768xf32>, %arg1: tensor<?x?x768xf32>) -> tensor<i64> { %c0_i64 = arith.constant 0 : i64 %c0 = arith.constant 0 : index %dim = tensor.dim %arg1, %c0 : tensor<?x?x768xf32> %0 = arith.index_cast %dim : index to i64 %1 = tensor.empty() : tensor<1xi64> %collapsed = tensor.collapse_shape %1 [] : tensor<1xi64> into tensor<i64> %2 = linalg.fill ins(%0 : i64) outs(%collapsed : tensor<i64>) -> tensor<i64> %extracted = tensor.extract %2[] : tensor<i64> %3 = arith.cmpi eq, %extracted, %c0_i64 : i64 %dim_0 = tensor.dim %arg0, %c0 : tensor<?x?x768xf32> %4 = arith.index_cast %dim_0 : index to i64 %5 = arith.select %3, %4, %extracted : i64 %6 = tensor.empty() : tensor<i64> %7 = linalg.fill ins(%5 : i64) outs(%6 : tensor<i64>) -> tensor<i64> return %7 : tensor<i64> } } ``` Some related issues for context: 1. <iree-org/iree#18677> 2. <iree-org/iree#18631>
- Add Torch to TOSA legalization for aten.index_select - Fix createOneDimTfIndices function in TosaLegalizeCommon.cpp to correctly convert Torch indices to TF-style indices, which is used in convertGatherNdOp - Update e2e tests in xfail_sets.py - Update basic.mlir with new LIT test for aten.index_select Signed-off-by: Justin Ngo <justin.ngo@arm.com> Change-Id: I52519246183949353a3cf22f0a685fe3df8ec8ff Signed-off-by: Justin Ngo <justin.ngo@arm.com>
…to tm_tensor/linalg_ext dialect (llvm#3754) - To fix issue onnx.ScatterElements: nod-ai/SHARK-ModelDev#823 - E2E test: nod-ai/SHARK-TestSuite#363
Bumps [externals/llvm-project](https://github.com/Xilinx/llvm-project) from `09ddec3` to `9d48ee6`. - [Commits](Xilinx/llvm-project@09ddec3...9d48ee6) --- updated-dependencies: - dependency-name: externals/llvm-project dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>
…lvm-project-9d48ee6 Bump externals/llvm-project from `09ddec3` to `9d48ee6`
- Add Torch to TOSA lowering for aten.fill.Scalar/Tensor, aten.flip, and aten.round - Fix torchScalarToTosaTensor function to correctly convert Torch scalar input to TOSA tensor - Update xfail_sets.py with new e2e results - Update basic.mlir with LIT tests for new ops Change-Id: If1e42c2e582710dd8ad0465eed29806fbcdbde41 Signed-off-by: Justin Ngo <justin.ngo@arm.com>
…lvm#3763) This commit adds the support for negative step values in aten.slice.Tensor op. Although, PyTorch does not allow negative step value for slice op but the Onnx.Slice op supports negative step value which eventually lowers to torch.aten.slice.Tensor op. Hence, the support is added for handling those kind of values during the Torch->Linalg lowering of aten.slice.Tensor op. Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
This commit adds the support for the 1-d depthwise convolution as a special case of 1-d group convolution. Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
Address nod-ai/SHARK-ModelDev#846 Assume the dynamic squeezed dim is 1.
Bumps [externals/llvm-project](https://github.com/Xilinx/llvm-project) from `9d48ee6` to `81b017a`. - [Commits](Xilinx/llvm-project@9d48ee6...81b017a) --- updated-dependencies: - dependency-name: externals/llvm-project dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>
…lvm-project-81b017a Bump externals/llvm-project from `9d48ee6` to `81b017a`
Bumps [externals/llvm-project](https://github.com/Xilinx/llvm-project) from `81b017a` to `b04eab8`. - [Commits](Xilinx/llvm-project@81b017a...b04eab8) --- updated-dependencies: - dependency-name: externals/llvm-project dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>
…lvm-project-b04eab8 Bump externals/llvm-project from `81b017a` to `b04eab8`
[AutoBump] Merge with fixes of 2374b9e (Oct 04, needs LLVM bump) (68)
Bumps [externals/llvm-project](https://github.com/Xilinx/llvm-project) from `b51a5a5` to `bada367`. - [Commits](Xilinx/llvm-project@b51a5a5...bada367) --- updated-dependencies: - dependency-name: externals/llvm-project dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>
…lvm-project-bada367 Bump externals/llvm-project from `b51a5a5` to `bada367`
asan: replace used python in various lit.cfg's with shim script
Bumps [externals/llvm-project](https://github.com/Xilinx/llvm-project) from `bada367` to `c6d34c5`. - [Commits](Xilinx/llvm-project@bada367...c6d34c5) --- updated-dependencies: - dependency-name: externals/llvm-project dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>
…lvm-project-c6d34c5 Bump externals/llvm-project from `bada367` to `c6d34c5`
[AutoBump] Merge with fixes of 53f7532 (Oct 04) (70)
[AutoBump] Merge with f4840ed (Oct 06) (71)
[AutoBump] Merge with fixes of b08d086 (Oct 07) (72)
[AutoBump] Merge with f6721e5 (Oct 08) (73)
[AutoBump] Merge with fixes of 58489fa (Oct 08) (75)
[AutoBump] Merge with fixes of 614fcdd (Oct 08) (74)
[AutoBump] Merge with fixes of e9ed4af (69)
Bumps [externals/llvm-project](https://github.com/Xilinx/llvm-project) from `c6d34c5` to `2f5bd8b`. - [Commits](Xilinx/llvm-project@c6d34c5...2f5bd8b) --- updated-dependencies: - dependency-name: externals/llvm-project dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>
…lvm-project-2f5bd8b Bump externals/llvm-project from `c6d34c5` to `2f5bd8b`
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is not to be merged; just an easy way to see the accumulated changes.