[AutoBump] Merge with be6248a4 (Jun 11) (68) #332

mgehre-amd · 2024-09-09T16:15:48Z

No description provided.

…2591) This fixes two problems with the 2-stage PGO builds.

The first problem was that the stage2-instrumented and stage2 targets would not be built on the second ninja invocation. For example:

This would work as expected.
$ ninja -v -C build stage2-instrumented-generate-profdata

Edit a file.
$ touch llvm/lib/Support/StringExtras.cpp

This would rebuild stage1 only and not build stage2-instrumented or regenerate the profile data.
$ ninja -v -C build stage2-instrumented-generate-profdata

The second problem was that in some cases, the profile data would be regenerated, but not all of the stage2 targets would be rebuilt. One common scenario where this would happen is:

This would work as expected.
$ ninja -C build stage2-check-all

This would regenerate the profile data, but then only build the targets that were needed for install, but weren't needed for check-all. This would cause errors like:
ld.lld: error: Function Import: link error: linking module flags 'ProfileSummary': IDs have conflicting values in ...
This is because the executables being built for the install target used the new profile data, but they were linking with libraries that used the old profile data.
$ ninja -C build stage2-install

With this change we can re-enable PGO for the release builds.

`blendv` instructions are very similar to `select`. We will add support for them in followup patches.

…hWeights (llvm#89465) It does not look like 2000 is needed here in particular. Follow up to llvm#89464

A block represents a chunk of memory used by the freelist allocator. It contains header information denoting the usable space and pointers as offsets to the next and previous block. On its own, this doesn't do much. This is a part of llvm#94270 to land in smaller patches. This is a subset of pigweed's freelist allocator implementation.
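To make the layout described above concrete, here is a minimal sketch. It is not the code from llvm#94270 or from pigweed: `BlockHeader`, its field names, and `init_two_blocks` are hypothetical names chosen for illustration, and the sketch assumes 32-bit offsets with `0` meaning "no neighbour".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical header stored at the start of every block. Neighbours are
// recorded as byte offsets relative to this header, not as raw pointers.
struct BlockHeader {
  std::uint32_t usable; // payload bytes that follow this header
  std::uint32_t next;   // distance forward to the next header, 0 if none
  std::uint32_t prev;   // distance back to the previous header, 0 if none
};

// Follow the forward offset; returns nullptr for the last block.
inline BlockHeader *next_block(BlockHeader *b) {
  if (b->next == 0)
    return nullptr;
  return reinterpret_cast<BlockHeader *>(
      reinterpret_cast<unsigned char *>(b) + b->next);
}

// Follow the backward offset; returns nullptr for the first block.
inline BlockHeader *prev_block(BlockHeader *b) {
  if (b->prev == 0)
    return nullptr;
  return reinterpret_cast<BlockHeader *>(
      reinterpret_cast<unsigned char *>(b) - b->prev);
}

// Carve a buffer of `total` bytes into two linked blocks, the first of
// which occupies `split` bytes (header included). Returns the first header.
inline BlockHeader *init_two_blocks(unsigned char *buf, std::uint32_t total,
                                    std::uint32_t split) {
  const std::uint32_t header = sizeof(BlockHeader);
  assert(split % alignof(BlockHeader) == 0);
  assert(split >= header && total >= split + header);
  BlockHeader *first = new (buf) BlockHeader{split - header, split, 0};
  new (buf + split) BlockHeader{total - split - header, 0, split};
  return first;
}
```

Storing offsets instead of pointers keeps the header small and position independent, which is presumably why the description mentions them; the real implementation may encode them differently.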

Fix llvm#88955

Generalize logic to set the result type for ops where the result type and the types of all operands match. Use it to support any unary and binops.

…lvm#84726) Doxygen allows for the `@throw`, `@throws`, and `@exception` commands to have an attached argument indicating the type being thrown. Currently, Clang's AST parsing doesn't support parsing out this argument from doc comments. The result is missing compatibility with Doxygen. This PR implements parsing of arguments for the `@throw`, `@throws`, and `@exception` commands. Each command can only have one argument, matching the semantics of Doxygen.

…94652) This patch lowers the `REDUCE` intrinsic call to the runtime equivalent for scalar results. Call with array result will follow.

…lround}f16 (llvm#94473)

DIEStreamer no longer needs Rewriter, so we can remove the constructor parameter and clean up the callers.

llvm#95038) …IB_FREE_H

…lvm#95037)

…m#89467) It does not look like the particular value is important. However, there is a comment, but the current implementation of `create{Unlikely,Likely}BranchWeights` uses the same value. Follow up to llvm#89464

…ll` (NFC) (llvm#95009) It should be `x86-registered-target` because we only need the X86 target in this case. `x86_64-linux` will be too strict here as it puts a prerequisite on the default target triple.

Make it more feasible to replace the fragment representation, which might yield a large peak RSS win.

llvm#95056) This reverts commit 2cf1439 since it broke the llvm test suite: SingleSource/UnitTests/AArch64/acle-fmv-features.c:59:9: error: instruction requires: altnzcv SingleSource/UnitTests/AArch64/acle-fmv-features.c:117:10: error: instruction requires: aes ... Looks like the FMV dependencies were used in the target attribute and now features that are FMVOnly (have AEK_NONE) cannot be expanded in parseTargetAttr using the ExtensionSet. This suggests that either the tests are wrong (they are using an FMVOnly feature in a target attribute), or that we need to turn the FMVOnly features into Extensions (these two are tablegen classes).

Presently, if name starts with a symbol it's converted to hex which may cause the result to be invalid by starting with a digit. Address this and add a small test. Co-authored-by: Will Dietz <w@wdtz.org>
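The failure mode is easy to reproduce outside MLIR. The sketch below is an assumption-laden stand-in, not the MLIR code: `sanitize_identifier` is a hypothetical helper that hex-encodes every character outside `[A-Za-z0-9_]` and then prefixes `_` when the result would begin with a digit.

```cpp
#include <cctype>
#include <string>

// Hypothetical sanitizer: keeps alphanumerics and '_', replaces any other
// character with its two-digit uppercase hex code, and guards against a
// leading digit in the final result.
inline std::string sanitize_identifier(const std::string &name) {
  static const char kHex[] = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : name) {
    if (std::isalnum(c) || c == '_') {
      out.push_back(static_cast<char>(c));
    } else {
      out.push_back(kHex[c >> 4]);
      out.push_back(kHex[c & 0xF]);
    }
  }
  // '$' encodes to "24", so "$foo" would otherwise become "24foo",
  // which starts with a digit and is not a valid identifier.
  if (!out.empty() && std::isdigit(static_cast<unsigned char>(out.front())))
    out.insert(out.begin(), '_');
  return out;
}
```

With this sketch, `sanitize_identifier("$foo")` yields `_24foo`, while names that already start with a letter pass through unchanged.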

This implements a traditional freelist to be used by the freelist allocator. It operates on spans of bytes which can be anything. The freelist allocator will store Blocks inside them. This is a part of llvm#94270 to land in smaller patches.

…rinsic (llvm#94559) Relanding this PR now that llvm#90503 has merged, with `FTAN` landing in [TargetLoweringBase.cpp:L1021](https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/TargetLoweringBase.cpp#L1020C23-L1021C63). There is now an LLVM tan intrinsic 32\64\128 Expand case for all LLVM backends.

In LLVM, the `llvm.experimental.constrained.cos` and `llvm.experimental.constrained.sin` intrinsics are used for performing cosine and sine calculations with additional constraints on floating-point operations. This behavior is expected for all floating-point math intrinsics. This change adds these constraints for the `tan` intrinsic.

- `Builtins.td` - replace TanF128 with F16F128MathTemplate
- `CGBuiltin.cpp` - map existing tan builtins to `tan` and `constrained_tan` intrinsic
- `ConstrainedOps.def` - map tan and constrained_tan to an ISDOpcode.

resolves llvm#91421

---------

Co-authored-by: Farzon Lotfi <farzon@farzon.com>

…vm#95051) Add sanitize_numerical_stability attribute.

…m#94557)" (llvm#94734) This reverts commit c007883.

This patch updates the lowering of `LaunchFuncOp` in GPU to LLVM to only legalize the operation with the converted operands, effectively removing the lowering used by the old serialization pipeline. It also removes all remaining uses of the old gpu serialization infrastructure in `gpu-to-llvm`. See [Compilation overview | 'gpu' Dialect - MLIR docs](https://mlir.llvm.org/docs/Dialects/GPU/#compilation-overview) for additional information on the target attributes compilation pipeline that replaced the old serialization pipeline.

…94987) This addresses a clang-tidy suggestion.

) They were assigned from calls to find_chunk_ptr_for_size, which returns size_t now.

…lvm#95067) Add comments and a test for delay-init libraries on macOS. I originally added the support in 954d00e a month ago, but without these additional clarifications. rdar://126885033

The case lists of the switches generated by this pass were not "deterministic" (they depended on allocation patterns). This is because the CaseList order relied on an unordered_set order. Using the sorted exit target list for those should solve the problem. Fixes llvm#94961 Signed-off-by: Nathan Gauër <brioche@google.com>
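The underlying issue can be shown in isolation. In the sketch below, `Target` and `case_order` are hypothetical names, and sorting by a stable `id` is an assumption standing in for the pass's sorted exit target list. Iterating an `unordered_set` of pointers visits elements in an order that depends on their addresses, so the code copies them out and sorts by a key that does not depend on allocation.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical exit target with an id that is stable across runs.
struct Target {
  int id;
  std::string name;
};

// Returns the targets in a reproducible order, regardless of how the
// unordered_set happens to iterate over the pointer values.
inline std::vector<const Target *>
case_order(const std::unordered_set<const Target *> &exits) {
  std::vector<const Target *> sorted(exits.begin(), exits.end());
  std::sort(sorted.begin(), sorted.end(),
            [](const Target *a, const Target *b) { return a->id < b->id; });
  return sorted;
}
```

Emitting switch cases from the sorted vector instead of the set makes the output independent of where the targets were allocated.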

llvm#95098) MLIR's LLVM dialect does not internally support debug records, only converting to/from debug intrinsics. To smooth the transition from intrinsics to records, there is a step prior to IR->MLIR translation that switches the IR module to intrinsic-form; this patch adds the equivalent conversion to record-form at MLIR->IR translation, and also modifies the flang front end to use the WriteNewDbgInfoFormat flag when it is emitting LLVM IR.

…ession (llvm#94356) This commit reimplements the functionality of the Clang Static Analyzer checker `alpha.core.SizeofPointer` within clang-tidy by adding a new (off-by-default) option to bugprone-sizeof-expression which activates reporting all the `sizeof(ptr)` expressions (where ptr is an expression that produces a pointer).

The main motivation for this change is that `alpha.core.SizeofPointer` was an AST-based checker, which did not rely on the path sensitive capabilities of the Static Analyzer, so there was no reason to keep it in the Static Analyzer instead of the more lightweight clang-tidy. After this commit I'm planning to create a separate commit that deletes `alpha.core.SizeofPointer` from Clang Static Analyzer.

It was natural to place this moved logic in bugprone-sizeof-expression, because that check already provided several heuristics that reported various especially suspicious classes of `sizeof(ptr)` expressions. The new mode `WarnOnSizeOfPointer` is off-by-default, so it won't surprise the existing users; but it can provide a more thorough coverage for the vulnerability CWE-467 ("Use of sizeof() on a Pointer Type") than the existing partial heuristics.

Previously this checker had an exception that the RHS of a `sizeof(array) / sizeof(array[0])` expression is not reported; I generalized this to an exception that the check doesn't report `sizeof(expr[0])` and `sizeof(*expr)`. This idea is taken from the Static Analyzer checker `alpha.core.SizeofPointer` (which had an exception for `*expr`), but analysis of open source projects confirmed that this indeed eliminates lots of unwanted results. Note that the suppression of `sizeof(expr[0])` and `sizeof(*expr)` reports also affects the "old" mode `WarnOnSizeOfPointerToAggregate` which is enabled by default.

This commit also replaces the old message "suspicious usage of 'sizeof(A*)'; pointer to aggregate" with two more concrete messages; but I feel that this tidy check would deserve a thorough cleanup of all the diagnostic messages that it can produce. (I added a FIXME to mark one outright misleading message.)
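For readers unfamiliar with CWE-467, the following sketch contrasts the two shapes of expression discussed above. The function names are hypothetical, and the comments describe the language semantics only; exactly which lines the check reports under each option is not asserted here.

```cpp
#include <cstddef>

// sizeof applied to a real array: the classic element-count idiom.
// Dividing by sizeof(a[0]) is the form the description says is exempted.
template <std::size_t N>
inline std::size_t element_count(const int (&a)[N]) {
  return sizeof(a) / sizeof(a[0]);
}

// sizeof applied to a pointer: yields the size of the pointer itself,
// not of the buffer it points to. This is the CWE-467 pattern.
inline std::size_t size_seen_through_pointer(const int *values) {
  return sizeof(values);
}
```

Given `int data[8]`, `element_count(data)` is 8, whereas `size_seen_through_pointer(data)` is `sizeof(const int *)` no matter how large the array is.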

Those BitVectors get expensive on targets like AMDGPU with thousands of registers, and RegAliasIterator is also expensive. We can move all liveness calculations to use RegUnits instead to speed it up for targets where RegAliasIterator is expensive, like AMDGPU. On targets where RegAliasIterator is cheap, this alternative can be a little more expensive, but I believe the tradeoff is worth it.

This PR is required to fix the `std/algorithms/alg.nonmodifying/mismatch/mismatch.pass.cpp` test for big-endian platforms such as z/OS.

…anslation (llvm#95098)" Reverted due to failure on buildbot due to missing use of the WriteNewDbgInfoFormat flag in MLIR. This reverts commit ca920bb.

Add a section about fence & address spaces that covers amdgpu-as.

Some cases should be legal for gfx940.

…ranslation (llvm#95098)" Reapplies the original patch with some additional conversion layers added to the MLIR translator, to ensure that we don't write the new debug info format unless WriteNewDbgInfoFormat is set. This reverts commit 8c5d9c7.

Following from the previous commit, this patch converts to the appropriate debug info format before printing LLVM IR. See: llvm#95098

The main goal of this PR (and subsequent PRs) is to add more tests with scalable vectors to:
* vector-transfer-collapse-inner-most-dims.mlir

There are quite a few cases to consider, hence this is split into multiple PRs. In this PR, the very first test is complemented with all the possible combinations:
* scalable (rather than fixed) unit trailing dim,
* dynamic (rather than static) trailing dim in the source memref.

Also, `@leading_scalable_dimension_transfer_read` and `@trailing_scalable_one_dim_transfer_read` are replaced with `@contiguous_inner_most_scalable_inner_dim` and `@negative_scalable_unit_dim`, respectively, and added to the list above (i.e. alongside other variations for the very first test).

In addition:
* "_view" is removed from function names (it's not clear to me what it was meant to signify)
* extra comments are added to separate tests for vector.transfer_read and vector.transfer_write

NOTE: This PR is limited to tests for `vector.transfer_read`.

… as standard library functions in misc-include-cleaner (llvm#94923) Fixes: llvm#93335 For a declaration with a body, we should also provide physical locations, because it may be a function that has the same name as a standard library function.

A caveat here is that we can only preserve nusw if the offset additions did not overflow. Proofs: https://alive2.llvm.org/ce/z/u56z_u

As we approach the state where support for debug intrinsics is dropping and we print and use debug records by default, the documentation should be updated to refer to debug records as the primary debug info representation, with debug intrinsics being relegated to an optional alternative. This patch performs a few updates:
- Replace references to intrinsics with references to records across all the documentation.
- Replace intrinsics with records in code examples.
- Move debug records prior to debug intrinsics in the SourceLevelDebugging document, and change text to refer to them as the primary representation.
- Add release notes describing the change.

This is a printf style variadic function. If using a "%s" format, we should pass "const char *" rather than "StringRef". The use of data() here is safe because we know that the StringRef was originally derived from a null-terminated string.
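A small sketch of why the distinction matters. `Ref` below is a hypothetical stand-in for a pointer-plus-length string view, not LLVM's `StringRef`, and the two formatting helpers are illustrative only. `%s` reads up to the first NUL byte and ignores the stored length, so passing `data()` is only correct when the underlying buffer is null-terminated at the intended end.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical non-owning view: a pointer and a length, no terminator implied.
struct Ref {
  const char *ptr;
  std::size_t len;
  const char *data() const { return ptr; }
  std::size_t size() const { return len; }
};

// "%s" needs a const char *; it prints until the first NUL byte.
inline std::string format_name(Ref name) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "name=%s", name.data());
  return buf;
}

// "%.*s" takes an explicit length, so it also works for views that are not
// null-terminated at their logical end.
inline std::string format_name_bounded(Ref name) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "name=%.*s",
                static_cast<int>(name.size()), name.data());
  return buf;
}
```

When the view covers a whole null-terminated string, both helpers agree; when it covers only a prefix, only the bounded form respects the length.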

…s result (llvm#94571) Prepare for new pass manager version of `MachineDominatorTreeAnalysis`. We may need a machine dominator tree version of `DomTreeUpdater` to handle `SplitCriticalEdge` in some CodeGen passes.

…5117) Co-authored-by: Joseph Huber <huberjn@outlook.com>

…translation (llvm#95098)" Also reverts "[MLIR][Flang][DebugInfo] Convert debug format in MLIR translators" The patch above introduces behaviour controlled by an LLVM flag into the Flang driver, which is incorrect behaviour. This reverts commits: 3cc2710. 460408f.

This makes sure we try to process declaration DIEs that are erroneously present in the index. Until bd5c636, clang was emitting index entries for declaration DIEs with DW_AT_signature attributes. This makes sure to avoid returning those DIEs as the definitions of a type, but also makes sure to pass through DIEs referring to static constexpr member variables, which is a (probably nonconforming) extension used by dsymutil. It adds test cases for both of the scenarios. It is essentially a recommit of llvm#91808.

…sn't drop below threshold Upcoming SimplifyDemandedBits support for CMOV will simplify the code and reduce the critical path below the threshold if we stick with i32 multiplies

…ling Add basic pass through handling - we could extend this to truncate CMOVQ to CMOVL in a future patch

tstellar and others added 30 commits June 10, 2024 13:01

[NFC][msan] Prepare function to extract main logic (llvm#94880)

4f41698

[NFC][msan] Extract handleSelectLikeInst (llvm#94881)

983bf65

`blendv` instructions are very similar to `select`. We will add support for them in followup patches.

[NFCI][metadata][LibCallsShrinkWrap] Use create{Unlikely,Likely}Branc…

d3c0ed3

…hWeights (llvm#89465) It does not look like 2000 is needed here in particular. Follow up to llvm#89464

[mlir][sparse] fix missing cmake dependencies. (llvm#95034)

d474976

Fix llvm#88955

[VPlan] Generalize type inference for binary VPInstructions (NFC).

83da21a

Generalize logic to set the result type for ops where the result type and the types of all operands match. Use it to support any unary and binops.

[flang] Lower REDUCE intrinsic with no DIM argument and rank 1 (llvm#…

0babff9

…94652) This patch lowers the `REDUCE` intrinsic call to the runtime equivalent for scalar results. Call with array result will follow.

[libc][math][c23] Add MPFR unit tests for {rint,lrint,llrint,lround,l…

f50656c

…lround}f16 (llvm#94473)

[BOLT] Clean up DIEStreamer (NFC) (llvm#95028)

3c8e0b8

DIEStreamer no longer needs Rewriter, so we can remove the constructor parameter and clean up the callers.

[NFC][libc][stdlib] LLVM_LIBC_SRC_STDLIB_LDIV_H -> LLVM_LIBC_SRC_STDL… (

076a50a

llvm#95038) …IB_FREE_H

[libc][stdlib] Move LIBC_INLINE after template and before constexpr (l…

fd4a740

…lvm#95037)

[flang][NFC] Remove debug printing

f3b212c

[NFCI][metadata][clang] Use create{Unlikely,Likely}BranchWeights (llv…

bb2bf3a

…m#89467) It does not look like the particular value is important. However, there is a comment, but the current implementation of `create{Unlikely,Likely}BranchWeights` uses the same value. Follow up to llvm#89464

[Inliner][test] Fix incorrect REQUIRE line in `inline-switch-default.…

1fe4f2d

…ll` (NFC) (llvm#95009) It should be `x86-registered-target` because we only need the X86 target in this case. `x86_64-linux` will be too strict here as it puts a prerequisite on the default target triple.

[X86,MC] Remove two getPrevNode

3aa41e0

Make it more feasible to replace the fragment representation, which might yield a large peak RSS win.

[memprof] Fix comment typos (NFC)

e2d539b

[mlir] Sanitize identifiers with leading symbol. (llvm#94795)

46e41c8

Presently, if name starts with a symbol it's converted to hex which may cause the result to be invalid by starting with a digit. Address this and add a small test. Co-authored-by: Will Dietz <w@wdtz.org>

[LLVM][IR][Sanitizers] Add sanitize_numerical_stability attribute (ll…

c4f8ae6

…vm#95051) Add sanitize_numerical_stability attribute.

MCSection: Remove unused reverse iterators

abbb24b

Reland "[X86] Assign AVX10_1 feature priority to align with gcc. (llv…

5275aed

…m#94557)" (llvm#94734) This reverts commit c007883.

[MC] Remove getFragmentList uses. NFC

cb63abc

[clang] Replace X && isa<Y>(X) with isa_and_nonnull<Y>(X). NFC (llvm#…

69e9e77

…94987) This addresses a clang-tidy suggestion.

[libc][stdlib] Change old unsigned short variables to size_t (llvm#95065

39cf880

) They were assigned from calls to find_chunk_ptr_for_size, which returns size_t now.

[lldb] NFC add comments and test case for ObjectFileMachO delay-init (l…

1934208

…lvm#95067) Add comments and a test for delay-init libraries on macOS. I originally added the support in 954d00e a month ago, but without these additional clarifications. rdar://126885033

Keenuts and others added 28 commits June 11, 2024 13:57

[libc++] Fix endianness for algorithm mismatch (llvm#93082)

ffc3a6b

This PR is required to fix the `std/algorithms/alg.nonmodifying/mismatch/mismatch.pass.cpp` test for big-endian platforms such as z/OS.

Revert "[MLIR][Flang][DebugInfo] Set debug info format in MLIR->IR tr…

8c5d9c7

…anslation (llvm#95098)" Reverted due to failure on buildbot due to missing use of the WriteNewDbgInfoFormat flag in MLIR. This reverts commit ca920bb.

[AMDGPU] Document amdgpu-as in AMDGPUUsage (llvm#94335)

a45080f

Add a section about fence & address spaces that covers amdgpu-as.

AMDGPU: Add more tests for vector typed atomicrmw fadd

a2bc50a

Some cases should be legal for gfx940.

[MLIR][Flang][DebugInfo] Convert debug format in MLIR translators

3cc2710

Following from the previous commit, this patch converts to the appropriate debug info format before printing LLVM IR. See: llvm#95098

[ConstantFolding] Preserve nowrap flags in gep of gep fold

da5f45f

A caveat here is that we can only preserve nusw if the offset additions did not overflow. Proofs: https://alive2.llvm.org/ce/z/u56z_u

Fix test to have correct requirements (llvm#95106)

32add24

[clang][Interp] Support ObjCEncodeExprs

e805b77

[CodeGen][NewPM] Split MachineDominatorTree into a concrete analysi…

837dc54

…s result (llvm#94571) Prepare for new pass manager version of `MachineDominatorTreeAnalysis`. We may need a machine dominator tree version of `DomTreeUpdater` to handle `SplitCriticalEdge` in some CodeGen passes.

[X86] early-ifcvt-remarks.ll - add codegen checks

a7d28f5

Updated the annotations of Python bindings (llvm#92733)

bc5ced5

[bazel] Add missing dependency for 3cc2710

37e9bf9

[Offload][NFCI] Initialize the KernelArgsTy to default values (llvm#9…

2eb60e2

…5117) Co-authored-by: Joseph Huber <huberjn@outlook.com>

[X86] early-ifcvt-remarks.ll - use i64 arithmetic to ensure ifcvt doe…

1df3798

…sn't drop below threshold Upcoming SimplifyDemandedBits support for CMOV will simplify the code and reduce the critical path below the threshold if we stick with i32 multiplies

[X86] SimplifyDemandedBitsForTargetNode - add basic X86ISD::CMOV hand…

464eb64

…ling Add basic pass through handling - we could extend this to truncate CMOVQ to CMOVL in a future patch

[Bazel] Layering fix for 65310f3

be6248a

[AutoBump] Merge with be6248a (Jun 11)

a5e2e47

cferry-AMD approved these changes Sep 11, 2024


mgehre-amd closed this Sep 11, 2024
