[AutoBump] Merge with be6248a4 (Jun 11) (68) #332

Closed
wants to merge 97 commits into from

Conversation

mgehre-amd
Collaborator

No description provided.

tstellar and others added 30 commits June 10, 2024 13:01
…2591)

This fixes two problems with the 2-stage PGO builds. The first problem
was that the stage2-instrumented and stage2 targets would not be built
on the second ninja invocation. For example:

This would work as expected.
$ ninja -v -C build stage2-instrumented-generate-profdata

Edit a file.
$ touch llvm/lib/Support/StringExtras.cpp

This would rebuild stage1 only and not build stage2-instrumented or
regenerate the profile data.
$ ninja -v -C build stage2-instrumented-generate-profdata

The second problem was that in some cases, the profile data would be
regenerated, but not all of the stage2 targets would be rebuilt. One
common scenario where this would happen is:

This would work as expected.
$ ninja -C build stage2-check-all

This would regenerate the profile data, but then only build the
targets that were needed for install, but weren't needed for
check-all.  This would cause errors like:
ld.lld: error: Function Import: link error: linking module flags
'ProfileSummary': IDs have conflicting values in ...
This is because the executables being built for the install target
used the new profile data, but they were linking with libraries that
used the old profile data.
$ ninja -C build stage2-install

With this change we can re-enable PGO for the release builds.
`blendv` instructions are very similar to `select`.
We will add support for them in followup patches.
…hWeights (llvm#89465)

It does not look like 2000 is needed here in particular.

Follow up to llvm#89464
A block represents a chunk of memory used by the freelist allocator. It
contains header information denoting the usable space and pointers as
offsets to the next and previous block.

On its own, this doesn't do much. This is a part of
llvm#94270 to land in smaller
patches.

This is a subset of pigweed's freelist allocator implementation.
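
As a rough illustration of the layout described above, a block header can record its neighbours as byte offsets instead of raw pointers. This is only a sketch: `BlockHeader`, `initBlock`, `usableSpace`, and `splitBlock` are hypothetical names chosen for this example and are not the API of the patch or of pigweed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical header stored at the start of each chunk of memory.
// Neighbours are recorded as byte offsets from this header, not as pointers.
struct BlockHeader {
  std::size_t next; // bytes forward to the next block, 0 if there is none
  std::size_t prev; // bytes back to the previous block, 0 if there is none
  std::size_t size; // total bytes owned by this block, header included
};

// Create a single block that covers the whole region.
inline BlockHeader *initBlock(unsigned char *region, std::size_t bytes) {
  assert(bytes > sizeof(BlockHeader) && "region too small for a header");
  assert(reinterpret_cast<std::uintptr_t>(region) % alignof(BlockHeader) == 0);
  return new (region) BlockHeader{0, 0, bytes};
}

// Bytes available to the caller once the header is accounted for.
inline std::size_t usableSpace(const BlockHeader *b) {
  return b->size - sizeof(BlockHeader);
}

// Shrink `b` to `keep` usable bytes (rounded up to the header alignment) and
// turn the remainder into a new block. Returns nullptr if `b` is too small.
inline BlockHeader *splitBlock(BlockHeader *b, std::size_t keep) {
  const std::size_t align = alignof(BlockHeader);
  keep = (keep + align - 1) / align * align;
  const std::size_t first = sizeof(BlockHeader) + keep;
  if (b->size <= first + sizeof(BlockHeader))
    return nullptr;

  auto *base = reinterpret_cast<unsigned char *>(b);
  auto *rest = new (base + first) BlockHeader{0, first, b->size - first};
  if (b->next != 0) {
    // Re-link the block that used to follow `b` so it now follows `rest`.
    auto *oldNext = reinterpret_cast<BlockHeader *>(base + b->next);
    rest->next = b->next - first;
    oldNext->prev = rest->next;
  }
  b->next = first;
  b->size = first;
  return rest;
}
```

Storing offsets keeps the header meaningful even if the whole region is moved or mapped at a different address, which is one common reason for this design.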
Generalize logic to set the result type for ops where the result type
and the types of all operands match. Use it to support any unary and
binops.
…lvm#84726)

Doxygen allows for the `@throw`, `@throws`, and `@exception` commands to
have an attached argument indicating the type being thrown. Currently,
Clang's AST parsing doesn't support parsing out this argument from doc
comments. The result is missing compatibility with Doxygen.

This PR implements parsing of arguments for the `@throw`, `@throws`, and
`@exception` commands. Each command can only have one argument, matching
the semantics of Doxygen.
…94652)

This patch lowers the `REDUCE` intrinsic call to the runtime equivalent
for scalar results. Calls with array results will follow.
DIEStreamer no longer needs Rewriter, so we can remove the constructor
parameter and clean up the callers.
…m#89467)

It does not look like the particular value is important.
However, there is a comment, but the current implementation
of `create{Unlikely,Likely}BranchWeights` uses the same value.

Follow up to llvm#89464
…ll` (NFC) (llvm#95009)

It should be `x86-registered-target` because we only need the X86 target
in this case. `x86_64-linux` will be too strict here as it puts a
prerequisite on the default target triple.
Make it more feasible to replace the fragment representation, which might
yield a large peak RSS win.
llvm#95056)

This reverts commit 2cf1439 since it
broke the llvm test suite:

SingleSource/UnitTests/AArch64/acle-fmv-features.c:59:9:
  error: instruction requires: altnzcv
SingleSource/UnitTests/AArch64/acle-fmv-features.c:117:10:
  error: instruction requires: aes
...

Looks like the FMV dependencies were used in the target attribute and
now features that are FMVOnly (have AEK_NONE) cannot be expanded in
parseTargetAttr using the ExtensionSet.

This suggests that either the tests are wrong (they are using an FMVOnly
feature in a target attribute), or that we need to turn the FMVOnly
features into Extensions (these two are tablegen classes).
Presently, if a name starts with a symbol, the symbol is converted to hex,
which may make the result invalid because it then starts with a digit.

Address this and add a small test.

Co-authored-by: Will Dietz <w@wdtz.org>
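
The failure mode described above can be reproduced with a small stand-alone sanitizer. This is a minimal sketch, assuming a scheme where every character other than a letter, digit, or `_` is replaced by its two-digit hex code; `sanitizeName` is a hypothetical helper and not the function touched by the patch.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical sanitizer: hex-encode disallowed characters, then make sure
// the result does not start with a digit.
inline std::string sanitizeName(std::string_view name) {
  static const char hexDigits[] = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : name) {
    if (std::isalnum(c) || c == '_') {
      out.push_back(static_cast<char>(c));
    } else {
      out.push_back(hexDigits[c >> 4]);
      out.push_back(hexDigits[c & 0xF]);
    }
  }
  // '$' is 0x24, so "$x" encodes to "24x", which begins with a digit.
  // Prefixing an underscore keeps the result a valid identifier.
  if (!out.empty() && std::isdigit(static_cast<unsigned char>(out.front())))
    out.insert(out.begin(), '_');
  assert(out.empty() || !std::isdigit(static_cast<unsigned char>(out.front())));
  return out;
}
```

With this sketch, `sanitizeName("$x")` yields `_24x` rather than `24x`.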
This implements a traditional freelist to be used by the freelist
allocator. It operates on spans of bytes which can be anything. The
freelist allocator will store Blocks inside them.

This is a part of llvm#94270 to land in smaller patches.
…rinsic (llvm#94559)

Relanding this PR now that llvm#90503 has merged. With `FTAN` landing in
[TargetLoweringBase.cpp:L1021](https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/TargetLoweringBase.cpp#L1020C23-L1021C63),
there is now an LLVM tan intrinsic 32/64/128 Expand case for all LLVM
backends.

In LLVM, the `llvm.experimental.constrained.cos` and
`llvm.experimental.constrained.sin` intrinsics are used for performing
cosine and sine calculations with additional constraints on
floating-point operations. This behavior is expected for all
floating-point math intrinsics. This change adds these constraints for
the `tan` intrinsic.

- `Builtins.td` - replace TanF128 with F16F128MathTemplate
- `CGBuiltin.cpp` - map existing tan builtins to `tan` and
  `constrained_tan` intrinsic
- `ConstrainedOps.def` - map tan and constrained_tan to an ISDOpcode

resolves llvm#91421

---------

Co-authored-by: Farzon Lotfi <farzon@farzon.com>
This patch updates the lowering of `LaunchFuncOp` in GPU to LLVM to only
legalize the operation with the converted operands, effectively removing
the lowering used by the old serialization pipeline.
It also removes all remaining uses of the old gpu serialization
infrastructure in `gpu-to-llvm`.

See [Compilation overview | 'gpu' Dialect - MLIR
docs](https://mlir.llvm.org/docs/Dialects/GPU/#compilation-overview) for
additional information on the target attributes compilation pipeline
that replaced the old serialization pipeline.
)

They were assigned from calls to find_chunk_ptr_for_size, which returns
size_t now.
…lvm#95067)

Add comments and a test for delay-init libraries on macOS. I originally
added the support in 954d00e a month
ago, but without these additional clarifications.

rdar://126885033
Keenuts and others added 28 commits June 11, 2024 13:57
The case lists of the switches generated by this pass were not
"deterministic" (they depended on allocation patterns).
This is because the CaseList order relied on an unordered_set order.
Using the sorted exit target list for those should solve the problem.

Fixes llvm#94961

Signed-off-by: Nathan Gauër <brioche@google.com>
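
The general idea can be sketched outside the pass: iterate over a sorted copy of the targets instead of over the hash set itself. This is an illustration only; `orderedCaseList` is a hypothetical helper, and plain integers stand in for the exit targets, which in the real pass are presumably pointers whose hash order varies with allocation.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical helper: produce the case list in a stable order that does not
// depend on the hash set's iteration order.
inline std::vector<int> orderedCaseList(const std::unordered_set<int> &targets) {
  std::vector<int> cases(targets.begin(), targets.end());
  std::sort(cases.begin(), cases.end());
  // A set has no duplicates, so the sorted list must not have any either.
  assert(std::adjacent_find(cases.begin(), cases.end()) == cases.end());
  return cases;
}
```

Two sets holding the same targets then always produce the same case list, regardless of insertion order or bucket layout.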
llvm#95098)

MLIR's LLVM dialect does not internally support debug records, only
converting to/from debug intrinsics. To smooth the transition from
intrinsics to records, there is a step prior to IR->MLIR translation
that switches the IR module to intrinsic-form; this patch adds the
equivalent conversion to record-form at MLIR->IR translation, and also
modifies the flang front end to use the WriteNewDbgInfoFormat flag when
it is emitting LLVM IR.
…ession (llvm#94356)

This commit reimplements the functionality of the Clang Static Analyzer
checker `alpha.core.SizeofPointer` within clang-tidy by adding a new
(off-by-default) option to bugprone-sizeof-expression which activates
reporting all the `sizeof(ptr)` expressions (where ptr is an expression
that produces a pointer).

The main motivation for this change is that `alpha.core.SizeofPointer`
was an AST-based checker, which did not rely on the path sensitive
capabilities of the Static Analyzer, so there was no reason to keep it
in the Static Analyzer instead of the more lightweight clang-tidy.

After this commit I'm planning to create a separate commit that deletes
`alpha.core.SizeofPointer` from Clang Static Analyzer.

It was natural to place this moved logic in bugprone-sizeof-expression,
because that check already provided several heuristics that reported
various especially suspicious classes of `sizeof(ptr)` expressions.

The new mode `WarnOnSizeOfPointer` is off-by-default, so it won't
surprise the existing users; but it can provide a more thorough coverage
for the vulnerability CWE-467 ("Use of sizeof() on a Pointer Type") than
the existing partial heuristics.

Previously this checker had an exception that the RHS of a
`sizeof(array) / sizeof(array[0])` expression is not reported; I
generalized this to an exception that the check doesn't report
`sizeof(expr[0])` and `sizeof(*expr)`. This idea is taken from the
Static Analyzer checker `alpha.core.SizeofPointer` (which had an
exception for `*expr`), but analysis of open source projects confirmed
that this indeed eliminates lots of unwanted results.

Note that the suppression of `sizeof(expr[0])` and `sizeof(*expr)`
reports also affects the "old" mode `WarnOnSizeOfPointerToAggregate`
which is enabled by default.

This commit also replaces the old message "suspicious usage of
'sizeof(A*)'; pointer to aggregate" with two more concrete messages; but
I feel that this tidy check would deserve a thorough cleanup of all the
diagnostic messages that it can produce. (I added a FIXME to mark one
outright misleading message.)
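
To make the patterns concrete, the sketch below contrasts a `sizeof(ptr)` expression with the two forms the description says are not reported. The type and function names (`Packet`, `bytesToClearBuggy`, and so on) are invented for this example; which lines the check actually flags depends on its configuration.

```cpp
#include <cassert>
#include <cstddef>

struct Packet {
  int fields[16];
};

// sizeof(p) is the size of the pointer, not of the Packet it points to.
// This is the kind of expression the WarnOnSizeOfPointer mode is described
// as reporting.
inline std::size_t bytesToClearBuggy(const Packet *p) { return sizeof(p); }

// sizeof(*p) measures the pointee; the description lists this form as exempt.
inline std::size_t bytesToClearFixed(const Packet *p) { return sizeof(*p); }

// The element-count idiom; sizeof(expr[0]) is also listed as exempt.
inline std::size_t fieldCount(const Packet &p) {
  return sizeof(p.fields) / sizeof(p.fields[0]);
}

// Example use of the count: a bounds-checked accessor.
inline int &fieldAt(Packet &p, std::size_t i) {
  assert(i < fieldCount(p) && "index out of range");
  return p.fields[i];
}
```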
Those BitVectors get expensive on targets like AMDGPU with thousands of
registers, and RegAliasIterator is also expensive.

We can move all liveness calculations to use RegUnits instead to speed
it up for targets where RegAliasIterator is expensive, like AMDGPU.
On targets where RegAliasIterator is cheap, this alternative can be a little more expensive, but I believe the tradeoff is worth it.
This PR is required to fix
`std/algorithms/alg.nonmodifying/mismatch/mismatch.pass.cpp` test for
big endian platforms such as z/OS.
…anslation (llvm#95098)"

Reverted due to failure on buildbot due to missing use of the
WriteNewDbgInfoFormat flag in MLIR.

This reverts commit ca920bb.
Add a section about fence & address spaces that covers amdgpu-as.
Some cases should be legal for gfx940.
…ranslation (llvm#95098)"

Reapplies the original patch with some additional conversion layers added
to the MLIR translator, to ensure that we don't write the new debug info
format unless WriteNewDbgInfoFormat is set.

This reverts commit 8c5d9c7.
Following from the previous commit, this patch converts to the
appropriate debug info format before printing LLVM IR.

See: llvm#95098
The main goal of this PR (and subsequent PRs), is to add more tests with
scalable vectors to:
  * vector-transfer-collapse-inner-most-dims.mlir

There's quite a few cases to consider, hence this is split into multiple
PRs. In this PR, the very first test is complemented with all the
possible combinations:
  * scalable (rather than fixed) unit trailing dim,
  * dynamic (rather than static) trailing dim in the source memref.

Also,
  * `@leading_scalable_dimension_transfer_read` and
    `@trailing_scalable_one_dim_transfer_read`,

are replaced with:
  * `@contiguous_inner_most_scalable_inner_dim` and
    `@negative_scalable_unit_dim`,

respectively, and added to the list above (i.e. alongside other
variations for the very first test).

In addition:
  * "_view" is removed from function names (it's not clear to me what it
    was meant to signify)
  * extra comments are added to separate tests for vector.transfer_read
    and vector.transfer_write

NOTE: This PR is limited to tests for `vector.transfer_read`.
… as standard library functions in misc-include-cleaner (llvm#94923)

Fixes: llvm#93335
For a decl with a body, we should also provide physical locations, because
it may be a function that has the same name as a standard library function.
A caveat here is that we can only preserve nusw if the offset
additions did not overflow.

Proofs: https://alive2.llvm.org/ce/z/u56z_u
As we approach the state where support for debug intrinsics is dropping and
we print and use debug records by default, the documentation should be updated
to refer to debug records as the primary debug info representation, with
debug intrinsics being relegated to an optional alternative.

This patch performs a few updates:
- Replace references to intrinsics with references to records across all
the documentation.
- Replace intrinsics with records in code examples.
- Move debug records prior to debug intrinsics in the
SourceLevelDebugging document, and change text to refer to them as the
primary representation.
- Add release notes describing the change.
This is a printf style variadic function. If using a "%s" format, we
should pass "const char *" rather than "StringRef".

The use of data() here is safe because we know that the StringRef was
originally derived from a null-terminated string.
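
The same pitfall can be shown with standard types. In this sketch `std::string_view` stands in for `StringRef`, and `formatMessage` and `describe` are hypothetical helpers rather than the functions changed by the patch.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical printf-style helper. Output longer than the buffer is truncated.
inline std::string formatMessage(const char *fmt, ...) {
  assert(fmt != nullptr && "format string is required");
  char buf[128];
  va_list args;
  va_start(args, fmt);
  int n = std::vsnprintf(buf, sizeof(buf), fmt, args);
  va_end(args);
  return n < 0 ? std::string() : std::string(buf);
}

inline std::string describe(std::string_view name) {
  // "%s" expects a `const char *`. Passing the view object itself through
  // the variadic call would be undefined behaviour.
  // ASSUMPTION: data() is only safe here because the caller guarantees the
  // view was created from a null-terminated string.
  return formatMessage("symbol: %s", name.data());
}
```

If the view could refer to a substring, a precision-limited format such as `"%.*s"` with an explicit length would be the safer choice.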
…s result (llvm#94571)

Prepare for new pass manager version of `MachineDominatorTreeAnalysis`.
We may need a machine dominator tree version of `DomTreeUpdater` to
handle `SplitCriticalEdge` in some CodeGen passes.
…translation (llvm#95098)"

Also reverts "[MLIR][Flang][DebugInfo] Convert debug format in MLIR translators"

The patch above introduces behaviour controlled by an LLVM flag into the
Flang driver, which is incorrect behaviour.

This reverts commits:
  3cc2710.
  460408f.
This makes sure we try to process declaration DIEs that are erroneously
present in the index. Until bd5c636, clang was emitting index
entries for declaration DIEs with DW_AT_signature attributes. This makes
sure to avoid returning those DIEs as the definitions of a type, but
also makes sure to pass through DIEs referring to static constexpr
member variables, which is a (probably nonconforming) extension used by
dsymutil.

It adds test cases for both of the scenarios. It is essentially a
recommit of llvm#91808.
…sn't drop below threshold

Upcoming SimplifyDemandedBits support for CMOV will simplify the code and reduce the critical path below the threshold if we stick with i32 multiplies
…ling

Add basic pass through handling - we could extend this to truncate CMOVQ to CMOVL in a future patch
@mgehre-amd mgehre-amd closed this Sep 11, 2024