Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump to f355cd6f6c51 [Nr 5] #242

Merged
merged 124 commits into from
Aug 8, 2024
Merged

Bump to f355cd6f6c51 [Nr 5] #242

merged 124 commits into from
Aug 8, 2024

Conversation

josel-amd
Copy link
Collaborator

Small conflict resolution

marbre and others added 30 commits March 7, 2024 15:48
This adds the `CExpression` trait to additional ops to allow to use
these ops within the expression operation. Furthermore, the operator
precedence is defined for those ops.
Summary:
Some dependencies on the standard C extensions are added transitively.
This patch adds the new values.
…lvm#80378)

This patch is stacked on
llvm#80372,
llvm#80307, and
llvm#80306.

ShuffleVector on scalable vector types gets IRTranslate'd to
G_SPLAT_VECTOR since a ShuffleVector that has operates on scalable
vectors is a splat vector where the value of the splat vector is the 0th
element of the first operand, because the index mask operand is the
zeroinitializer (undef and poison are treated as zeroinitializer here).
This is analogous to what happens in SelectionDAG for ShuffleVector.

`buildSplatVector` is renamed to`buildBuildVectorSplatVector`. I did not
make this a separate patch because it would cause problems to revert
that change without reverting this change too.
…4270)

This switches the default of `LLDB_TEST_USE_VENDOR_PACKAGES` from `ON`
to `OFF` in preparation for eventually deleting it. All known LLDB
buildbots have this package installed, so flipping the default will
uncover any other users.

If this breaks anything, the preferred fix is to install `pexpect` on
the host system. The second fix is to build with cmake option
`-DLLDB_TEST_USE_VENDOR_PACKAGES=ON` as a temporary measure until
`pexpect` can be installed. If neither of those work, reverting this
patch is OK.
Details in the linked issue. Might fail on other architectures
but I can't confirm, they can add to this if it does.
…r types" (llvm#84330)

Reverts llvm#80378

causing Buildbot failures that did not show up with check-llvm or CI.
This improves overall analysis for minbitwidth in SLP. It allows to
analyze the trees with store/insertelement root nodes. Also, instead of
using single minbitwidth, detected from the very first analysis stage,
it tries to detect the best one for each trunc/ext subtree in the graph
and use it for the subtree.
Results in better code and less vector register pressure.

Metric: size..text

Program                                                                                                                                                size..text
                                                                                                                                                       results     results0    diff
                                                                      test-suite :: SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant.test    92549.00    92609.00  0.1%
                                                                                  test-suite :: External/SPEC/CINT2017speed/625.x264_s/625.x264_s.test   663381.00   663493.00  0.0%
                                                                                   test-suite :: External/SPEC/CINT2017rate/525.x264_r/525.x264_r.test   663381.00   663493.00  0.0%
                                                                                               test-suite :: MultiSource/Benchmarks/Bullet/bullet.test   307182.00   307214.00  0.0%
                                                                             test-suite :: External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test  1394420.00  1394484.00  0.0%
                                                                              test-suite :: External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test  1394420.00  1394484.00  0.0%
                                                                                test-suite :: External/SPEC/CFP2017rate/510.parest_r/510.parest_r.test  2040257.00  2040273.00  0.0%

                                                                              test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 12396098.00 1239585.00 -0.0%
                                                                                         test-suite :: External/SPEC/CINT2006/445.gobmk/445.gobmk.test   909944.00   909768.00 -0.0%

SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant - 4 scalar
instructions remain scalar (good).
Spec2017/x264 - the whole function idct4x4dc is vectorized using <16
x i16> instead of <16 x i32>, also zext/trunc are removed. In other
places last vector zext/sext removed and replaced by
extractelement + scalar zext/sext pair.
MultiSource/Benchmarks/Bullet/bullet - reduce or <4 x i32> replaced by
reduce or <4 x i8>
Spec2017/imagick - Removed extra zext from 2 packs of the operations.
Spec2017/parest - Removed extra zext, replaced by extractelement+scalar
zext
Spec2017/blender - the whole bunch of vector zext/sext replaced by
extractelement+scalar zext/sext, some extra code vectorized in smaller
types.
Spec2006/gobmk - fixed cost estimation, some small code remains scalar.

Reviewers: RKSimon

Pull Request: llvm#84334
Summary:
The clang-offload-bundler uses an empty file to control the bundles made
for embedding. Previously this still used `/dev/null` by mistake even on
Windows.
…llvm#84223)

This is in preparation for an upcoming commit that will add "funclet"
OpBundle to the inserted runtime calls where the function's EH
personality requires it.

See PR llvm#82533
…lvm#84142)

Arithmetic constants for vector types can be constructed from objects
implementing Python buffer protocol such as `array.array`. Note that
until Python 3.12, there is no typing support for buffer protocol
implementers, so the annotations use array explicitly.

Reverts llvm#84103
…vm#83643)

As per the OpenMP standard, "If a variable appears in a link clause on a
declare target directive that does not have a device_type clause with
the nohost device-type-description then it is treated as if it had
appeared in a map clause with a map-type of tofrom" is an implicit
mapping rule. Before this change, such variables were mapped as to by
default.
Instead of the longer ArrayElemPtr + Load.
llvm#82985)

This adds a rewrite that converts illegal 2D unit-dim `shape_casts` into
`vector.transpose` ops.

E.g.

```mlir
// Case 1:
%a = vector.shape_cast %0 : vector<[4]x1xf32> to vector<1x[4]xf32>
// Case 2:
%b = vector.shape_cast %1 : vector<[4]x1xf32> to vector<[4]xf32>
```

Becomes:

```mlir
// Case 1:
%a = vector.transpose %0 : [1, 0] vector<[4]x1xf32> to vector<1x[4]xf32>
// Case 2:
%t = vector.transpose %1 : [1, 0] vector<[4]x1xf32> to vector<1x[4]xf32>
%b = vector.shape_cast %t : vector<1x[4]xf32> to vector<[4]xf32>
```

Various lowerings and drop unit-dims patterns add such shape_casts,
however, if they do not cancel out (which they likely won't if we've
reached the vector-legalization pass) they will prevent lowering the IR.

Rewriting them as a transpose gives `LiftIllegalVectorTransposeToMemory`
a chance to eliminate the illegal types.
…4332)

Summary:
The new driver does not need this hash and it can lead to redefined
symbol errors when the CUID hash isn't set.
This would have helped identify problems with llvm#83905 which only showed
up in an LLVM_ENABLE_EXPENSIVE_CHECKS build.
…lvm#80378)

Recommits llvm#80378 which was reverted in
llvm#84330. The problem was that the change in
llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir used
217 as an opcode instead of a regex.
This PR adds the following to the mlir c api:

- The disctinct mlir builtin attribute.
- LLVM attributes (mostly debug related ones)
Targets the dynamic realignment pattern of `(Ptr + Align - 1) & -Align;`
as implemented by gep then ptrmask.

Specifically, when the pointer already has alignment information,
dynamically realigning it to less than is already known should be a
no-op. Discovered while writing test cases for another patch.

For the zero low bits of a known aligned pointer, adding the gep index
then removing it with a mask is a no-op. Folding the ptrmask effect
entirely into the gep is the ideal result as that unblocks other
optimisations that are not aware of ptrmask.

In some other cases the gep is known to be dead and is removed without
changing the ptrmask.

In the least effective case, this transform creates a new gep with a
rounded-down index and still leaves the ptrmask unchanged. That
simplified gep is still a minor improvement, geps are cheap and ptrmask
occurs in address calculation contexts so I don't think it's worth
special casing to avoid the extra instruction.
The buildbot seems to complain about `strcmp` function not available in
the vfork patch (llvm#81564):

https://lab.llvm.org/buildbot/#/builders/68/builds/70093/steps/6/logs/stdio

Unfortunately, I can't reproduce the failure on my linux machine so this
is a guessing fix. If anyone has a way to reproduce and very this fix,
please feel free to merge this change.

Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
)

Summary:
The HIP toolchain has support for compressing the final output. We
should respect that when we create the executable.
Currently, progress events reported by the ProgressManager and broadcast
to eBroadcastBitProgressCategory always specify they're complete. The
problem is that the ProgressManager reports kNonDeterministicTotal for
both the total and the completed number of (sub)events. Because the
values are the same, the event reports itself as complete.

This patch fixes the issue by reporting 0 as the completed value for the
start event and kNonDeterministicTotal for the end event.
Combined constructs are decomposed into separate operations. However,
this does not adhere to `acc` dialect's goal to be able to regenerate
semantically equivalent clauses as user's intent. Thus, add an attribute
to keep track of the combined constructs.
Removes an unused field. Retypes unshared smart pointers to `unique_ptr`.
MaskRay and others added 25 commits March 7, 2024 16:39
Allows linalg structured operations to be handled during spmdization and
sharding propagation.

There is only support for projected permutation indexing maps.
…m#83922)

During block signature conversion, a new block is inserted and ops are
moved from the old block to the new block. This commit changes the
implementation such that ops are moved in bulk (`splice`) instead of
one-by-one; that's what `splitBlock` is doing.

This also makes it possible to pass the new block argument types
directly to `createBlock` instead of using `addArgument` (which bypasses
the rewriter). This doesn't change anything from a technical point of
view (there is no rewriter API for adding arguments at the moment), but
the implementation reads a bit nicer.
…3425)

This commit adds listener support to the dialect conversion. Similarly
to the greedy pattern rewrite driver, an optional listener can be
specified in the configuration object.

Listeners are notified only if the dialect conversion succeeds. In case
of a failure, where some IR changes are first performed and then rolled
back, no notifications are sent.

Due to the fact that some kinds of rewrite are reflected in the IR
immediately and some in a delayed fashion, there are certain limitations
when attaching a listener; these are documented in `ConversionConfig`.
To summarize, users are always notified about all rewrites that
happened, but the notifications are sent all at once at the very end,
and not interleaved with the actual IR changes.

This change is in preparation improvements to
`transform.apply_conversion_patterns`, which currently invalidates all
handles. In the future, it can use a listener to update handles
accordingly, similar to `transform.apply_patterns`.
llvm-project/mlir/lib/Dialect/Linalg/Transforms/MeshShardingInterfaceImpl.cpp:96:8:
error: unused variable 'resultElementType' [-Werror,-Wunused-variable]
  Type resultElementType =
       ^
llvm-project/mlir/lib/Dialect/Linalg/Transforms/MeshShardingInterfaceImpl.cpp:122:1:
error: non-void function does not return a value in all control paths [-Werror,-Wreturn-type]
}
^
2 errors generated.
Close llvm#71034

See

https://discourse.llvm.org/t/rfc-c-20-modules-introduce-thin-bmi-and-decls-hash/74755

This patch introduces reduced BMI, which doesn't contain the definitions
of functions and variables if its definitions won't contribute to the
ABI.

Testing is a big part of the patch. We want to make sure the reduced BMI
contains the same behavior with the existing and relatively stable
fatBMI. This is pretty helpful for further reduction.

The user interfaces part it left to following patches to ease the
reviewing.
…umer (NFC)

llvm-project/clang/lib/Frontend/FrontendActions.cpp:213:10:
error: moving a local object in a return statement prevents copy elision [-Werror,-Wpessimizing-move]
  213 |   return std::move(Consumers);
      |          ^
/Users/jiefu/llvm-project/clang/lib/Frontend/FrontendActions.cpp:213:10: note: remove std::move call here
  213 |   return std::move(Consumers);
      |          ^~~~~~~~~~         ~
1 error generated.
The test case demonstrates how meta variables are needlessly renamed,
making diffs harder to read.
The match argument isn't used.
Prior to this change, running UTC on larger tests, especially tests
with unnamed IR values, often resulted in a spuriously large diff
because e.g. TMPnn variables in the CHECK lines were renumbered. This
change attempts to reduce the diff by keeping those variable names the
same.

There are cases in which this "drift" of variable names can end up being
more confusing. The old behavior can be re-enabled with the
--reset-variable-names command line argument.

The improvement may not be immediately apparent in the diff of this change.
The point is that the diff of stable_ir_values.ll against
stable_ir_values.ll.expected after this change is smaller.

Ideally, we'd also keep meta variables for "global" objects stable, e.g.
for attributes (#nn) and metadata (!nn). However, that would require a
much more substantial refactoring of how we generate check lines, so I
left it for future work.
PR llvm#75982 had been created before these tests were added, therefore
some test were not updated.
This reverts commit fb02f9a.

Looks like some Python version incompatibility, will investigate.
Resubmitting this after previous revert with the following changes:

- Split table into table_rhs_idx and table_candidate_idx so that
  bisect.bisect_left can be used without the `key` argument, which
  was introduced in Python 3.10
- Remove a re.Pattern type annotation

Original commit message:

Prior to this change, running UTC on larger tests, especially tests
with unnamed IR values, often resulted in a spuriously large diff
because e.g. TMPnn variables in the CHECK lines were renumbered. This
change attempts to reduce the diff by keeping those variable names the
same.

There are cases in which this "drift" of variable names can end up being
more confusing. The old behavior can be re-enabled with the
--reset-variable-names command line argument.

The improvement may not be immediately apparent in the diff of this change.
The point is that the diff of stable_ir_values.ll against
stable_ir_values.ll.expected after this change is smaller.

Ideally, we'd also keep meta variables for "global" objects stable, e.g.
for attributes (#nn) and metadata (!nn). However, that would require a
much more substantial refactoring of how we generate check lines, so I
left it for future work.
Older versions of clang do not have __builtin_complex, but they
may define `__GNUC__`.
…s. (llvm#84187)

With the many pseudos used in SVE codegen it can be too easy to miss
instructions. This enables the existing test we have for checking the
scheduling info of the pseudos matches the real instructions, and
adjusts the scheduling info in the NeoverseV1 model to make sure all are
handled. In the cases I could I opted to use the same info as in the
NeoverseV2 model, to keep the differences smaller.
…this` with values. (llvm#84164)

This is the constructor's job, and we want to be able to test that it
does this.
…y value. (llvm#84317)

I'm making some changes to `Environment::getResultObjectLocation()`,
with the
ultimate goal of eliminating `RecordValue` entirely, and I'd like to
make sure
I don't break this behavior (and I've realized we don't have a test for
it yet).
These checks have been broken since 6afe972.
The check_cxx_compiler_flag macro only takes two arguments and passing
three to it ends up calling
`cmake_check_compiler_flag(CXX "${_FLAG}" ${_RESULT})` with ${_FLAG}
equal to `-Werror` and the result variable being the actually tested
compiler flag.

I noticed this because some of the flags that I know should be supported
were being flagged as not supported. `--debug-trycompile` shows the
following surprising line in the generated CMakeLists.txt:
`add_definitions([==[-D-Wempty-body]==] [==[-Werror]==])` which then
results in the following error while running the check:
```
FAILED: CMakeFiles/cmTC_72736.dir/src.cxx.o
tmp/upstream-llvm-readonly/bin/clang++   -nodefaultlibs -std=c++17 -fcolor-diagnostics   -D-Wempty-body -Werror -MD -MT CMakeFiles/cmTC_72736.dir/src.cxx.o -MF CMakeFiles/cmTC_72736.dir/src.cxx.o.d -o CMakeFiles/cmTC_72736.dir/src.cxx.o -c .../cmake-build-all-sanitizers/CMakeFiles/CMakeScratch/TryCompile-nyh3QR/src.cxx
In file included from <built-in>:450:
<command line>:1:9: error: macro name must be an identifier
    1 | #define -Wempty-body 1
      |         ^
   1 error generated.
```

It would be great if CMake could be a bit more helpful here so I've
filed https://gitlab.kitware.com/cmake/cmake/-/issues/25735.

See also https://reviews.llvm.org/D146920.

Reviewed By: nikic

Pull Request: llvm#83779
@josel-amd josel-amd changed the title Bump to f355cd6f6c51 Bump to f355cd6f6c51 [Nr 5] Aug 8, 2024
@josel-amd josel-amd merged commit 334b09f into jose.bump_6 Aug 8, 2024
9 checks passed
@josel-amd josel-amd deleted the jose.bump_7 branch August 8, 2024 12:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.