Feature/backport_ea1_ops #105

Draft
wants to merge 1,376 commits into
base: main
mgehre-amd
Collaborator

This is not to be merged; just an easy way to see the accumulated changes.

@mgehre-amd mgehre-amd mentioned this pull request Jul 4, 2023
samutamm and others added 29 commits October 2, 2024 08:17
As of Sep 14, the torch-mlir binary
[wheels](https://github.com/llvm/torch-mlir-release/releases/tag/dev-wheels)
got renamed to `torch-mlir-core` from `torch-mlir`:
![image](https://github.com/user-attachments/assets/152e4977-71ef-4f57-8757-6dc75f72b670)

This was an unintended side-effect of the recent change of
`TORCH_MLIR_ENABLE_ONLY_MLIR_PYTHON_BINDINGS=True`
(llvm#3711) which skips setting `NAME
= "torch-mlir"` in
[setup.py](https://github.com/llvm/torch-mlir/blob/main/setup.py#L226-L232).

To avoid having multiple downstreams fix their pip deps, this change
allows using the same `torch-mlir` name for binaries, and reserves a
separate `torch-mlir-ext` name for the (less popular) binaries with
extensions enabled.
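
A minimal sketch of the naming rule described above, assuming the flag is read from the environment and parsed as a boolean. The helper `package_name`, the default value, and the accepted truthy spellings are hypothetical and are not taken from `setup.py`:

```python
def package_name(env):
    """Pick the wheel name from a mapping of environment variables (hypothetical helper)."""
    # Assumption: the flag defaults to off and these spellings count as true.
    raw = env.get("TORCH_MLIR_ENABLE_ONLY_MLIR_PYTHON_BINDINGS", "0")
    only_bindings = raw.strip().lower() in {"1", "true", "on", "yes"}
    # Bindings-only builds keep the established name; builds with
    # extensions enabled get the separate "-ext" name.
    return "torch-mlir" if only_bindings else "torch-mlir-ext"
```

With this rule, downstream projects that install the bindings-only wheels can keep `torch-mlir` in their pip requirements unchanged.
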
Bumps [externals/llvm-project](https://github.com/Xilinx/llvm-project) from `daa3383` to `09ddec3`.
- [Commits](Xilinx/llvm-project@daa3383...09ddec3)

---
updated-dependencies:
- dependency-name: externals/llvm-project
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Add the operation with lowering to linalg. Includes a test for
end-to-end correctness.
Addresses ~200 onnx model compile failures in
<https://github.com/nod-ai/SHARK-TestSuite> related to
<iree-org/iree#18631>.

This change simplifies the result of the generated broadcast op
substantially, but reduces the case coverage slightly.

The case which will become unsupported: 
- trying to actually broadcast a dynamic dim that is secretly 1. 

When does this case appear in practical scenarios?
- for a model where onnx shape inference cannot figure out that a dim
should be 1.

Why do I think we should not support this case for now?
1. For all models with dynamic dim expand ops, the previous path
uniformly generates uglier linalg IR (making it harder for IREE to fuse
properly with other ops).
2. For models failing shape inference catastrophically enough to fail
to see a dim is statically 1, we can try to apply constant folding in
the onnx model before importing.

Leaving this as a draft PR, since it may be more appropriate to fix the
compilation failure in IREE rather than torch-mlir.

### Example of broadcast required in previous path:

```mlir
    %300 = linalg.generic {indexing_maps = [#map11], iterator_types = ["parallel", "parallel", "parallel", "parallel"]} outs(%299 : tensor<?x12x?x?xi1>) {
    ^bb0(%out: i1):
      %306 = linalg.index 0 : index
      %307 = linalg.index 3 : index
      %308 = arith.index_cast %285 : i64 to index
      %309 = arith.cmpi eq, %308, %c1 : index
      %310 = arith.select %309, %c0, %306 : index
      %311 = arith.index_cast %286 : i64 to index
      %312 = arith.cmpi eq, %311, %c1 : index
      %313 = arith.select %312, %c0, %307 : index
      %extracted_79 = tensor.extract %reshape_78[%310, %c0, %c0, %313] : tensor<?x1x1x?xi1>
      linalg.yield %extracted_79 : i1
    } -> tensor<?x12x?x?xi1>
```

### Example of broadcast with simplified shape list:

```mlir
    %409 = linalg.generic {indexing_maps = [#map15, #map11], iterator_types = ["parallel", "parallel", "parallel", "parallel"]} ins(%reshape_135 : tensor<?x1x1x?xi1>) outs(%408 : tensor<?x12x?x?xi1>) {
    ^bb0(%in: i1, %out: i1):
      linalg.yield %in : i1
    } -> tensor<?x12x?x?xi1>
```
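
The trade-off between the two paths can be sketched in plain Python on rank-2 nested lists. This is only an analogue of the two IR snippets above, not the torch-mlir implementation; the function names are hypothetical, and `None` stands in for a dynamic (`?`) dim in the static shape:

```python
def expand_runtime_check(src, out_shape):
    """Previous path: test every dim at run time; a size-1 dim reads index 0."""
    rows, cols = len(src), len(src[0])
    out = []
    for i in range(out_shape[0]):
        si = 0 if rows == 1 else i
        out.append([src[si][0 if cols == 1 else j] for j in range(out_shape[1])])
    return out


def expand_static(src, static_src_shape, out_shape):
    """Simplified path: only dims statically known to be 1 are broadcast.

    A dynamic dim (None) is assumed to already match the output dim.
    """
    out = []
    for i in range(out_shape[0]):
        si = 0 if static_src_shape[0] == 1 else i
        row = []
        for j in range(out_shape[1]):
            sj = 0 if static_src_shape[1] == 1 else j
            row.append(src[si][sj])
        out.append(row)
    return out
```

In this sketch, calling `expand_static([[7]], (None, 1), (2, 3))` raises `IndexError`, because the dynamic dim is secretly 1 but is indexed as if it matched the output. That is the case the description lists as becoming unsupported.
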
…lvm-project-09ddec3

Bump externals/llvm-project from `daa3383` to `09ddec3`
… generic ops (llvm#3762)

This is motivated by the fact that shapes are stored as tensors in ONNX,
and IREE tries to perform tensor arithmetic on the device. This causes
unnecessary dispatches, and makes it harder for the compiler to reason
about shapes.

Here is a small snippet of torch-IR that is typically seen coming from
ONNX models:

```mlir
module {
  func.func @main_graph(%arg0: !torch.vtensor<[?,?,768],f32>, %arg1: !torch.vtensor<[?,?,768],f32>) -> !torch.vtensor<[],si64> {
    %int0 = torch.constant.int 0
    %0 = torch.vtensor.literal(dense<0> : tensor<1xsi64>) : !torch.vtensor<[1],si64>
    %1 = torch.aten._shape_as_tensor %arg1 : !torch.vtensor<[?,?,768],f32> -> !torch.vtensor<[3],si64>
    %2 = torch.aten.index_select %1, %int0, %0 : !torch.vtensor<[3],si64>, !torch.int, !torch.vtensor<[1],si64> -> !torch.vtensor<[1],si64>
    %3 = torch.aten.squeeze.dim %2, %int0 : !torch.vtensor<[1],si64>, !torch.int -> !torch.vtensor<[],si64>
    %4 = torch.aten.item %3 : !torch.vtensor<[],si64> -> !torch.int
    %5 = torch.aten.eq.int %4, %int0 : !torch.int, !torch.int -> !torch.bool
    %6 = torch.aten.Int.bool %5 : !torch.bool -> !torch.int
    %7 = torch.aten.size.int %arg0, %int0 : !torch.vtensor<[?,?,768],f32>, !torch.int -> !torch.int
    %8 = torch.prim.NumToTensor.Scalar %6 : !torch.int -> !torch.vtensor<[],i1>
    %9 = torch.prim.NumToTensor.Scalar %7 : !torch.int -> !torch.vtensor<[],si64>
    %10 = torch.prim.NumToTensor.Scalar %4 : !torch.int -> !torch.vtensor<[],si64>
    %11 = torch.aten.where.self %8, %9, %10 : !torch.vtensor<[],i1>, !torch.vtensor<[],si64>, !torch.vtensor<[],si64> -> !torch.vtensor<[],si64>
    return %11 : !torch.vtensor<[],si64>
  }
}
```

Without the change in this PR, the result would be:

```mlir
#map = affine_map<() -> ()>
module {
  ml_program.global private mutable @global_seed(dense<0> : tensor<i64>) : tensor<i64>
  func.func @main_graph(%arg0: tensor<?x?x768xf32>, %arg1: tensor<?x?x768xf32>) -> tensor<i64> {
    %c0_i64 = arith.constant 0 : i64
    %c0 = arith.constant 0 : index
    %dim = tensor.dim %arg1, %c0 : tensor<?x?x768xf32>
    %0 = arith.index_cast %dim : index to i64
    %1 = tensor.empty() : tensor<1xi64>
    %collapsed = tensor.collapse_shape %1 [] : tensor<1xi64> into tensor<i64>
    %2 = linalg.fill ins(%0 : i64) outs(%collapsed : tensor<i64>) -> tensor<i64>
    %extracted = tensor.extract %2[] : tensor<i64>
    %3 = arith.cmpi eq, %extracted, %c0_i64 : i64
    %dim_0 = tensor.dim %arg0, %c0 : tensor<?x?x768xf32>
    %4 = arith.index_cast %dim_0 : index to i64
    %5 = tensor.empty() : tensor<i1>
    %6 = linalg.fill ins(%3 : i1) outs(%5 : tensor<i1>) -> tensor<i1>
    %7 = tensor.empty() : tensor<i64>
    %8 = linalg.fill ins(%4 : i64) outs(%7 : tensor<i64>) -> tensor<i64>
    %9 = linalg.fill ins(%extracted : i64) outs(%7 : tensor<i64>) -> tensor<i64>
    %10 = linalg.generic {indexing_maps = [#map, #map, #map, #map], iterator_types = []} ins(%6, %8, %9 : tensor<i1>, tensor<i64>, tensor<i64>) outs(%7 : tensor<i64>) {
    ^bb0(%in: i1, %in_1: i64, %in_2: i64, %out: i64):
      %11 = arith.select %in, %in_1, %in_2 : i64
      linalg.yield %11 : i64
    } -> tensor<i64>
    return %10 : tensor<i64>
  }
}
```

With the change in this PR, we would instead get:

```mlir
module {
  ml_program.global private mutable @global_seed(dense<0> : tensor<i64>) : tensor<i64>
  func.func @main_graph(%arg0: tensor<?x?x768xf32>, %arg1: tensor<?x?x768xf32>) -> tensor<i64> {
    %c0_i64 = arith.constant 0 : i64
    %c0 = arith.constant 0 : index
    %dim = tensor.dim %arg1, %c0 : tensor<?x?x768xf32>
    %0 = arith.index_cast %dim : index to i64
    %1 = tensor.empty() : tensor<1xi64>
    %collapsed = tensor.collapse_shape %1 [] : tensor<1xi64> into tensor<i64>
    %2 = linalg.fill ins(%0 : i64) outs(%collapsed : tensor<i64>) -> tensor<i64>
    %extracted = tensor.extract %2[] : tensor<i64>
    %3 = arith.cmpi eq, %extracted, %c0_i64 : i64
    %dim_0 = tensor.dim %arg0, %c0 : tensor<?x?x768xf32>
    %4 = arith.index_cast %dim_0 : index to i64
    %5 = arith.select %3, %4, %extracted : i64
    %6 = tensor.empty() : tensor<i64>
    %7 = linalg.fill ins(%5 : i64) outs(%6 : tensor<i64>) -> tensor<i64>
    return %7 : tensor<i64>
  }
}
```

Some related issues for context:
1. <iree-org/iree#18677>
2. <iree-org/iree#18631>
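
For orientation, the computation in the torch-IR snippet above is entirely scalar, which is why a single `arith.select` suffices. A hedged Python analogue operating on shape tuples (the tuple-based signature is an assumption for illustration; the comments map each step to the op it mirrors):

```python
def main_graph(arg0_shape, arg1_shape):
    """Scalar analogue of the torch IR above, operating on shape tuples."""
    dim1 = arg1_shape[0]   # _shape_as_tensor + index_select + squeeze + item
    is_zero = dim1 == 0    # aten.eq.int
    dim0 = arg0_shape[0]   # aten.size.int
    # aten.where.self on rank-0 tensors reduces to a scalar select.
    return dim0 if is_zero else dim1
```

For example, `main_graph((4, 7, 768), (0, 7, 768))` picks the first dim of `arg0`, while a nonzero leading dim in `arg1` is returned as-is.
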
- Add Torch to TOSA legalization for aten.index_select
- Fix createOneDimTfIndices function in TosaLegalizeCommon.cpp to
correctly convert Torch indices to TF-style indices, which is used in
convertGatherNdOp
- Update e2e tests in xfail_sets.py
- Update basic.mlir with new LIT test for aten.index_select

Signed-off-by: Justin Ngo <justin.ngo@arm.com>
Change-Id: I52519246183949353a3cf22f0a685fe3df8ec8ff

Signed-off-by: Justin Ngo <justin.ngo@arm.com>
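
As background for the index conversion mentioned above, an `index_select` along one dim can be phrased as a gather over full coordinate tuples, which is the kind of index a GatherNd-style op consumes. The sketch below is illustrative only: `gather_nd` and `index_select_coords` are hypothetical helpers on nested lists and do not reproduce the logic of `createOneDimTfIndices` or `convertGatherNdOp`.

```python
from itertools import product


def gather_nd(x, coords):
    """Look up one scalar per full coordinate tuple in a nested list."""
    out = []
    for coord in coords:
        value = x
        for i in coord:
            value = value[i]
        out.append(value)
    return out


def index_select_coords(shape, dim, index):
    """Build the output shape and the coordinate list for index_select."""
    out_shape = list(shape)
    out_shape[dim] = len(index)
    coords = []
    for pos in product(*(range(s) for s in out_shape)):
        coord = list(pos)
        coord[dim] = index[pos[dim]]  # remap only the selected dim
        coords.append(tuple(coord))
    return out_shape, coords
```

The flat list returned by `gather_nd` would then be reshaped to `out_shape` to recover the `index_select` result.
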
…e linalg generic ops (llvm#3762)" (llvm#3767)

Reverted due to downstream model changes. Will reland with fixes post
integration.

This reverts commit 6e8c7be.
…to tm_tensor/linalg_ext dialect (llvm#3754)

- To fix issue onnx.ScatterElements: nod-ai/SHARK-ModelDev#823
- E2E test: nod-ai/SHARK-TestSuite#363
Bumps [externals/llvm-project](https://github.com/Xilinx/llvm-project) from `09ddec3` to `9d48ee6`.
- [Commits](Xilinx/llvm-project@09ddec3...9d48ee6)

---
updated-dependencies:
- dependency-name: externals/llvm-project
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
…lvm-project-9d48ee6

Bump externals/llvm-project from `09ddec3` to `9d48ee6`
- Add Torch to TOSA lowering for aten.fill.Scalar/Tensor, aten.flip, and
aten.round
- Fix torchScalarToTosaTensor function to correctly convert Torch scalar
input to TOSA tensor
- Update xfail_sets.py with new e2e results
- Update basic.mlir with LIT tests for new ops


Change-Id: If1e42c2e582710dd8ad0465eed29806fbcdbde41

Signed-off-by: Justin Ngo <justin.ngo@arm.com>
…lvm#3763)

This commit adds support for negative step values in the
aten.slice.Tensor op. PyTorch does not allow a negative step for the
slice op, but Onnx.Slice does, and it eventually lowers to
torch.aten.slice.Tensor. Support for such values is therefore added to
the Torch->Linalg lowering of aten.slice.Tensor.

Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
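
One way to see that a negative step can be expressed with existing building blocks is to flip the input and slice with the positive step. Whether the lowering in this commit uses this exact strategy is an assumption; the sketch below only demonstrates the equivalence on a Python list, with `start` and `end` already normalized (`end` is exclusive, and `-1` means "stop before index 0" rather than Python's wrap-around):

```python
def slice_negative_step(xs, start, end, step):
    """Slice with step < 0 via flip + positive-step slice.

    Assumes 0 <= start <= len(xs) - 1 and -1 <= end <= start.
    """
    assert step < 0
    n = len(xs)
    flipped = xs[::-1]
    # Index j in xs corresponds to index n - 1 - j in the flipped list.
    new_start = n - 1 - start
    new_end = n - 1 - end
    return flipped[new_start:new_end:-step]


def reference(xs, start, end, step):
    """Direct definition: visit start, start + step, ... while > end."""
    return [xs[i] for i in range(start, end, step)]
```
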
This commit adds support for 1-d depthwise convolution as a
special case of 1-d group convolution.

Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
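
The relationship between the two ops can be sketched in plain Python. This is an illustration on nested lists under stated assumptions (stride 1, no padding, no dilation, PyTorch weight layout `[C_out][C_in // groups][K]`), not the Torch->Linalg lowering itself; both function names are hypothetical:

```python
def conv1d_grouped(x, w, groups):
    """Grouped 1-d convolution. x: [C_in][L], w: [C_out][C_in // groups][K]."""
    c_in, length = len(x), len(x[0])
    c_out, k = len(w), len(w[0][0])
    in_per_group = c_in // groups
    out_per_group = c_out // groups
    out = []
    for oc in range(c_out):
        g = oc // out_per_group  # group this output channel belongs to
        row = []
        for t in range(length - k + 1):
            acc = 0
            for ic in range(in_per_group):
                for j in range(k):
                    acc += x[g * in_per_group + ic][t + j] * w[oc][ic][j]
            row.append(acc)
        out.append(row)
    return out


def depthwise_conv1d(x, w):
    """Depthwise case: one group per input channel, w: [C][1][K]."""
    return conv1d_grouped(x, w, groups=len(x))
```

With `groups` equal to the channel count, each output channel reads exactly one input channel, which is what makes the depthwise case a specialization rather than a separate algorithm.
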
Bumps [externals/llvm-project](https://github.com/Xilinx/llvm-project) from `9d48ee6` to `81b017a`.
- [Commits](Xilinx/llvm-project@9d48ee6...81b017a)

---
updated-dependencies:
- dependency-name: externals/llvm-project
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
…lvm-project-81b017a

Bump externals/llvm-project from `9d48ee6` to `81b017a`
Bumps [externals/llvm-project](https://github.com/Xilinx/llvm-project) from `81b017a` to `b04eab8`.
- [Commits](Xilinx/llvm-project@81b017a...b04eab8)

---
updated-dependencies:
- dependency-name: externals/llvm-project
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
…lvm-project-b04eab8

Bump externals/llvm-project from `81b017a` to `b04eab8`
mgehre-amd and others added 30 commits January 3, 2025 19:25
[AutoBump] Merge with fixes of 2374b9e (Oct 04, needs LLVM bump) (68)
Bumps [externals/llvm-project](https://github.com/Xilinx/llvm-project) from `b51a5a5` to `bada367`.
- [Commits](Xilinx/llvm-project@b51a5a5...bada367)

---
updated-dependencies:
- dependency-name: externals/llvm-project
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
…lvm-project-bada367

Bump externals/llvm-project from `b51a5a5` to `bada367`
asan: replace used python in various lit.cfg's with shim script
Bumps [externals/llvm-project](https://github.com/Xilinx/llvm-project) from `bada367` to `c6d34c5`.
- [Commits](Xilinx/llvm-project@bada367...c6d34c5)

---
updated-dependencies:
- dependency-name: externals/llvm-project
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
…lvm-project-c6d34c5

Bump externals/llvm-project from `bada367` to `c6d34c5`
[AutoBump] Merge with fixes of 53f7532 (Oct 04) (70)
[AutoBump] Merge with f4840ed (Oct 06) (71)
[AutoBump] Merge with fixes of b08d086 (Oct 07) (72)
[AutoBump] Merge with f6721e5 (Oct 08) (73)
[AutoBump] Merge with fixes of 58489fa (Oct 08) (75)
[AutoBump] Merge with fixes of 614fcdd (Oct 08) (74)
[AutoBump] Merge with fixes of e9ed4af (69)
Bumps [externals/llvm-project](https://github.com/Xilinx/llvm-project) from `c6d34c5` to `2f5bd8b`.
- [Commits](Xilinx/llvm-project@c6d34c5...2f5bd8b)

---
updated-dependencies:
- dependency-name: externals/llvm-project
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
…lvm-project-2f5bd8b

Bump externals/llvm-project from `c6d34c5` to `2f5bd8b`