forked from llvm/llvm-project
[AutoBump] Merge with be6248a4 (Jun 11) (68) #332
Closed
…2591) This fixes two problems with the 2-stage PGO builds.
The first problem was that the stage2-instrumented and stage2 targets would not be built on the second ninja invocation. For example:
This would work as expected.
$ ninja -v -C build stage2-instrumented-generate-profdata
Edit a file.
$ touch llvm/lib/Support/StringExtras.cpp
This would rebuild stage1 only and not build stage2-instrumented or regenerate the profile data.
$ ninja -v -C build stage2-instrumented-generate-profdata
The second problem was that in some cases, the profile data would be regenerated, but not all of the stage2 targets would be rebuilt. One common scenario where this would happen is:
This would work as expected.
$ ninja -C build stage2-check-all
This would regenerate the profile data, but then only build the targets that were needed for install, but weren't needed for check-all. This would cause errors like:
ld.lld: error: Function Import: link error: linking module flags 'ProfileSummary': IDs have conflicting values in ...
This is because the executables being built for the install target used the new profile data, but they were linking with libraries that used the old profile data.
$ ninja -C build stage2-install
With this change we can re-enable PGO for the release builds.
`blendv` instructions are very similar to `select`. We will add support for them in followup patches.
…hWeights (llvm#89465) It does not look like 2000 is needed here in particular. Follow up to llvm#89464
A block represents a chunk of memory used by the freelist allocator. It contains header information denoting the usable space and pointers as offsets to the next and previous block. On its own, this doesn't do much. This is a part of llvm#94270 to land in smaller patches. This is a subset of pigweed's freelist allocator implementation.
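As a rough illustration of the layout described above, the following sketch stores neighbor links as byte offsets inside a header at the start of each block. The `Block` fields, `init_block`, and `split` are hypothetical names chosen for this example; they are not the API of the patch or of pigweed.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical block header living at the start of each block in a byte region.
struct Block {
  std::size_t next; // bytes from this header to the next header (the whole block size)
  std::size_t prev; // bytes from the previous header to this one; 0 for the first block
  bool last;        // true if no block follows in the region

  // Bytes available to the user, i.e. the block size minus the header.
  std::size_t usable_space() const { return next - sizeof(Block); }

  Block *next_block() {
    return last ? nullptr
                : reinterpret_cast<Block *>(
                      reinterpret_cast<unsigned char *>(this) + next);
  }
  Block *prev_block() {
    return prev == 0 ? nullptr
                     : reinterpret_cast<Block *>(
                           reinterpret_cast<unsigned char *>(this) - prev);
  }
};

// Formats `region` as a single block. Assumes `region` is aligned for Block.
inline Block *init_block(unsigned char *region, std::size_t size) {
  assert(size >= sizeof(Block));
  return new (region) Block{size, 0, true};
}

// Shrinks `b` to `payload` usable bytes and returns a new block built from the
// remainder, or nullptr if the remainder cannot hold an aligned header.
inline Block *split(Block *b, std::size_t payload) {
  const std::size_t head = sizeof(Block) + payload;
  if (head % alignof(Block) != 0 || b->next < head + sizeof(Block))
    return nullptr;
  Block *tail = new (reinterpret_cast<unsigned char *>(b) + head)
      Block{b->next - head, head, b->last};
  b->next = head;
  b->last = false;
  if (Block *after = tail->next_block())
    after->prev = tail->next; // keep the following block's back link consistent
  return tail;
}
```

Offsets rather than raw pointers keep the header valid if the whole region is copied or mapped at a different address, which is one plausible reason for the design the message describes.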
Generalize logic to set the result type for ops where the result type and the types of all operands match. Use it to support any unary and binops.
…lvm#84726) Doxygen allows for the `@throw`, `@throws`, and `@exception` commands to have an attached argument indicating the type being thrown. Currently, Clang's AST parsing doesn't support parsing out this argument from doc comments. The result is missing compatibility with Doxygen. This PR implements parsing of arguments for the `@throw`, `@throws`, and `@exception` commands. Each command can only have one argument, matching the semantics of Doxygen.
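For reference, a doc comment using these commands with a type argument might look like the sketch below. The function `parse_count` and its behavior are invented for illustration; only the `@throws` and `@exception` command syntax comes from Doxygen.

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>
#include <string>

/// Parses a non-negative decimal integer.
///
/// @param text Digits to parse.
/// @returns The parsed value.
/// @throws std::invalid_argument if \p text is empty or contains a non-digit.
/// @exception std::out_of_range if the value does not fit in an int.
int parse_count(const std::string &text) {
  if (text.empty())
    throw std::invalid_argument("empty input");
  long long value = 0;
  for (char c : text) {
    if (c < '0' || c > '9')
      throw std::invalid_argument("not a digit");
    value = value * 10 + (c - '0');
    if (value > std::numeric_limits<int>::max())
      throw std::out_of_range("value too large");
  }
  assert(value <= std::numeric_limits<int>::max());
  return static_cast<int>(value);
}
```

In each command, the first word after the command name (`std::invalid_argument`, `std::out_of_range`) is the single type argument that the message says is now parsed.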
…94652) This patch lowers the `REDUCE` intrinsic call to the runtime equivalent for scalar results. Calls with array results will follow.
DIEStreamer no longer needs Rewriter, so we can remove the constructor parameter and clean up the callers.
…m#89467) It does not look like the particular value is important. However, there is a comment, but the current implementation of `create{Unlikely,Likely}BranchWeights` uses the same value. Follow up to llvm#89464
…ll` (NFC) (llvm#95009) It should be `x86-registered-target` because we only need the X86 target in this case. `x86_64-linux` will be too strict here as it puts a prerequisite on the default target triple.
Make it more feasible to replace the fragment representation, which might yield a large peak RSS win.
llvm#95056) This reverts commit 2cf1439 since it broke the llvm test suite: SingleSource/UnitTests/AArch64/acle-fmv-features.c:59:9: error: instruction requires: altnzcv SingleSource/UnitTests/AArch64/acle-fmv-features.c:117:10: error: instruction requires: aes ... Looks like the FMV dependencies were used in the target attribute and now features that are FMVOnly (have AEK_NONE) cannot be expanded in parseTargetAttr using the ExtensionSet. This suggests that either the tests are wrong (they are using an FMVOnly feature in a target attribute), or that we need to turn the FMVOnly features into Extensions (these two are tablegen classes).
Presently, if name starts with a symbol it's converted to hex which may cause the result to be invalid by starting with a digit. Address this and add a small test. Co-authored-by: Will Dietz <w@wdtz.org>
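The failure mode can be sketched as follows. `sanitize_identifier` is a hypothetical helper, and both the two-digit hex escape and the underscore prefix are assumptions made for this example; the message does not say which escape format or fix the patch uses.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical sanitizer: keeps [A-Za-z0-9_], replaces every other byte with
// two uppercase hex digits, and prefixes '_' if the result starts with a digit.
std::string sanitize_identifier(const std::string &name) {
  assert(!name.empty());
  static const char hex[] = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : name) {
    if (std::isalnum(c) || c == '_') {
      out += static_cast<char>(c);
    } else {
      out += hex[c >> 4];
      out += hex[c & 0xF];
    }
  }
  // A leading symbol such as '$' (0x24) becomes "24...", which is not a valid
  // identifier start, so guard the first character.
  if (std::isdigit(static_cast<unsigned char>(out[0])))
    out.insert(out.begin(), '_');
  return out;
}
```

With this sketch, `$foo` would first become `24foo` and then `_24foo`, while `foo` passes through unchanged.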
This implements a traditional freelist to be used by the freelist allocator. It operates on spans of bytes which can be anything. The freelist allocator will store Blocks inside them. This is a part of llvm#94270 to land in smaller patches.
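A freelist over arbitrary byte spans can be sketched as an intrusive singly linked list whose nodes live inside the free chunks themselves. The `ByteSpan` and `FreeList` types below are hypothetical, and the single first-fit list is a simplification chosen for brevity; the patch may organize chunks differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical view of a chunk of raw bytes.
struct ByteSpan {
  unsigned char *data;
  std::size_t size;
};

class FreeList {
  // The node is written into the first bytes of each free chunk.
  struct Node {
    Node *next;
    std::size_t size;
  };
  Node *head_ = nullptr;

public:
  // Adds a chunk; rejects chunks too small or misaligned to hold a Node.
  bool add(ByteSpan chunk) {
    if (chunk.size < sizeof(Node) ||
        reinterpret_cast<std::uintptr_t>(chunk.data) % alignof(Node) != 0)
      return false;
    head_ = new (chunk.data) Node{head_, chunk.size};
    return true;
  }

  // First fit: unlinks and returns the first chunk with at least `size`
  // bytes, or {nullptr, 0} if no chunk is large enough.
  ByteSpan find_and_remove(std::size_t size) {
    for (Node **link = &head_; *link != nullptr; link = &(*link)->next) {
      Node *n = *link;
      if (n->size >= size) {
        *link = n->next;
        return {reinterpret_cast<unsigned char *>(n), n->size};
      }
    }
    return {nullptr, 0};
  }
};
```

Because the list only sees bytes, a caller such as an allocator is free to lay out its own structures (for example block headers) inside each chunk once it has been removed from the list.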
…rinsic (llvm#94559) Relanding this PR now that llvm#90503 has merged, with `FTAN` landing in [TargetLoweringBase.cpp:L1021](https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/TargetLoweringBase.cpp#L1020C23-L1021C63). There is now an LLVM tan intrinsic 32\64\128 Expand case for all LLVM backends. In LLVM, the `llvm.experimental.constrained.cos` and `llvm.experimental.constrained.sin` intrinsics are used for performing cosine and sine calculations with additional constraints on floating-point operations. This behavior is expected for all floating-point math intrinsics. This change adds these constraints for the `tan` intrinsic.
- `Builtins.td` - replace TanF128 with F16F128MathTemplate
- `CGBuiltin.cpp` - map existing tan builtins to `tan` and `constrained_tan` intrinsic
- `ConstrainedOps.def` - map tan and constrained_tan to an ISDOpcode.
resolves llvm#91421 --------- Co-authored-by: Farzon Lotfi <farzon@farzon.com>
…vm#95051) Add sanitize_numerical_stability attribute.
…m#94557)" (llvm#94734) This reverts commit c007883.
This patch updates the lowering of `LaunchFuncOp` in GPU to LLVM to only legalize the operation with the converted operands, effectively removing the lowering used by the old serialization pipeline. It also removes all remaining uses of the old gpu serialization infrastructure in `gpu-to-llvm`. See [Compilation overview | 'gpu' Dialect - MLIR docs](https://mlir.llvm.org/docs/Dialects/GPU/#compilation-overview) for additional information on the target attributes compilation pipeline that replaced the old serialization pipeline.
…94987) This addresses a clang-tidy suggestion.
…lvm#95067) Add comments and a test for delay-init libraries on macOS. I originally added the support in 954d00e a month ago, but without these additional clarifications. rdar://126885033
The case lists of the switches generated by this pass were not deterministic (their order depended on allocation patterns). This is because the CaseList order relied on an unordered_set order. Using the sorted exit target list for those should solve the problem. Fixes llvm#94961 Signed-off-by: Nathan Gauër <brioche@google.com>
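The general issue can be illustrated with a small sketch: iterating an `unordered_set` of pointers yields an order that depends on where the objects were allocated, so output derived from it can vary between runs. The `ExitTarget` type and its `id` field are hypothetical, and sorting by a stable key here stands in for the pass's existing sorted exit target list.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an exit target with a stable numbering.
struct ExitTarget {
  int id;
};

// Returns the targets ordered by their stable id, independent of the hash
// order of the set (which depends on the pointer values).
std::vector<const ExitTarget *>
ordered_cases(const std::unordered_set<const ExitTarget *> &targets) {
  std::vector<const ExitTarget *> cases(targets.begin(), targets.end());
  std::sort(cases.begin(), cases.end(),
            [](const ExitTarget *a, const ExitTarget *b) {
              assert(a != nullptr && b != nullptr);
              return a->id < b->id;
            });
  return cases;
}
```

Emitting switch cases from the returned vector rather than from the set itself gives the same case order on every run.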
llvm#95098) MLIR's LLVM dialect does not internally support debug records, only converting to/from debug intrinsics. To smooth the transition from intrinsics to records, there is a step prior to IR->MLIR translation that switches the IR module to intrinsic-form; this patch adds the equivalent conversion to record-form at MLIR->IR translation, and also modifies the flang front end to use the WriteNewDbgInfoFormat flag when it is emitting LLVM IR.
…ession (llvm#94356) This commit reimplements the functionality of the Clang Static Analyzer checker `alpha.core.SizeofPointer` within clang-tidy by adding a new (off-by-default) option to bugprone-sizeof-expression which activates reporting all the `sizeof(ptr)` expressions (where ptr is an expression that produces a pointer). The main motivation for this change is that `alpha.core.SizeofPointer` was an AST-based checker, which did not rely on the path sensitive capabilities of the Static Analyzer, so there was no reason to keep it in the Static Analyzer instead of the more lightweight clang-tidy. After this commit I'm planning to create a separate commit that deletes `alpha.core.SizeofPointer` from Clang Static Analyzer. It was natural to place this moved logic in bugprone-sizeof-expression, because that check already provided several heuristics that reported various especially suspicious classes of `sizeof(ptr)` expressions. The new mode `WarnOnSizeOfPointer` is off-by-default, so it won't surprise the existing users; but it can provide a more thorough coverage for the vulnerability CWE-467 ("Use of sizeof() on a Pointer Type") than the existing partial heuristics. Previously this checker had an exception that the RHS of a `sizeof(array) / sizeof(array[0])` expression is not reported; I generalized this to an exception that the check doesn't report `sizeof(expr[0])` and `sizeof(*expr)`. This idea is taken from the Static Analyzer checker `alpha.core.SizeofPointer` (which had an exception for `*expr`), but analysis of open source projects confirmed that this indeed eliminates lots of unwanted results. Note that the suppression of `sizeof(expr[0])` and `sizeof(*expr)` reports also affects the "old" mode `WarnOnSizeOfPointerToAggregate` which is enabled by default.
This commit also replaces the old message "suspicious usage of 'sizeof(A*)'; pointer to aggregate" with two more concrete messages; but I feel that this tidy check would deserve a thorough cleanup of all the diagnostic messages that it can produce. (I added a FIXME to mark one outright misleading message.)
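The expression classes discussed above can be sketched as follows. The function names are invented for this example, and the comments about what is reported restate the commit message rather than verified tool output.

```cpp
#include <cassert>
#include <cstddef>

// sizeof applied to a pointer: yields the size of the pointer itself, not of
// the buffer it points to (CWE-467). This is the kind of expression the new
// WarnOnSizeOfPointer mode is described as reporting.
std::size_t bytes_via_pointer(const int *p) { return sizeof(p); }

// sizeof applied to a real array (passed by reference so it does not decay):
// yields the size of the whole array.
template <std::size_t N>
std::size_t bytes_via_array(const int (&a)[N]) {
  return sizeof(a);
}

// sizeof(*expr) and sizeof(expr[0]): the element size. Per the message, these
// forms are exempt from reporting.
std::size_t element_size_deref(const int *p) { return sizeof(*p); }
std::size_t element_size_index(const int *p) { return sizeof(p[0]); }

// The common idiom whose right-hand side motivated the exemption.
template <std::size_t N>
std::size_t element_count(const int (&a)[N]) {
  assert(sizeof(a[0]) != 0);
  return sizeof(a) / sizeof(a[0]);
}
```

For an `int data[16]`, `bytes_via_array(data)` is `16 * sizeof(int)` while `bytes_via_pointer(data)` is only `sizeof(int *)`, which is the mismatch the check is meant to surface.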
Those BitVectors get expensive on targets like AMDGPU with thousands of registers, and RegAliasIterator is also expensive. We can move all liveness calculations to use RegUnits instead to speed it up for targets where RegAliasIterator is expensive, like AMDGPU. On targets where RegAliasIterator is cheap, this alternative can be a little more expensive, but I believe the tradeoff is worth it.
This PR is required to fix the `std/algorithms/alg.nonmodifying/mismatch/mismatch.pass.cpp` test for big endian platforms such as z/OS.
…anslation (llvm#95098)" Reverted due to failure on buildbot due to missing use of the WriteNewDbgInfoFormat flag in MLIR. This reverts commit ca920bb.
Add a section about fence & address spaces that covers amdgpu-as.
Some cases should be legal for gfx940.
…ranslation (llvm#95098)" Reapplies the original patch with some additional conversion layers added to the MLIR translator, to ensure that we don't write the new debug info format unless WriteNewDbgInfoFormat is set. This reverts commit 8c5d9c7.
Following from the previous commit, this patch converts to the appropriate debug info format before printing LLVM IR. See: llvm#95098
The main goal of this PR (and subsequent PRs) is to add more tests with scalable vectors to:
* vector-transfer-collapse-inner-most-dims.mlir
There are quite a few cases to consider, hence this is split into multiple PRs. In this PR, the very first test is complemented with all the possible combinations:
* scalable (rather than fixed) unit trailing dim,
* dynamic (rather than static) trailing dim in the source memref.
Also,
* `@leading_scalable_dimension_transfer_read` and `@trailing_scalable_one_dim_transfer_read`, are replaced with:
* `@contiguous_inner_most_scalable_inner_dim` and `@negative_scalable_unit_dim`, respectively, and added to the list above (i.e. alongside other variations for the very first test).
In addition:
* "_view" is removed from function names (it's not clear to me what it was meant to signify)
* extra comments are added to separate tests for vector.transfer_read and vector.transfer_write
NOTE: This PR is limited to tests for `vector.transfer_read`.
… as standard library functions in misc-include-cleaner (llvm#94923) Fixes: llvm#93335 For a declaration with a body, we should also provide physical locations, because it may be a function that has the same name as a standard library function.
A caveat here is that we can only preserve nusw if the offset additions did not overflow. Proofs: https://alive2.llvm.org/ce/z/u56z_u
As we approach the state where support for debug intrinsics is dropping and we print and use debug records by default, the documentation should be updated to refer to debug records as the primary debug info representation, with debug intrinsics being relegated to an optional alternative. This patch performs a few updates:
- Replace references to intrinsics with references to records across all the documentation.
- Replace intrinsics with records in code examples.
- Move debug records prior to debug intrinsics in the SourceLevelDebugging document, and change text to refer to them as the primary representation.
- Add release notes describing the change.
This is a printf style variadic function. If using a "%s" format, we should pass "const char *" rather than "StringRef". The use of data() here is safe because we know that the StringRef was originally derived from a null-terminated string.
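A minimal sketch of the pattern, using a hypothetical `StrRef` type as a stand-in for `StringRef` (a pointer plus a length, with no guarantee of null termination in general):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical stand-in for a non-owning string reference.
struct StrRef {
  const char *ptr;
  std::size_t len;
  // Built from a C string here, so the bytes are followed by '\0'.
  explicit StrRef(const char *cstr) : ptr(cstr), len(std::strlen(cstr)) {}
  const char *data() const { return ptr; }
};

// Formats "name=<value>" through a printf-style variadic function.
std::string format_name(StrRef name) {
  assert(name.data() != nullptr);
  char buf[64];
  // "%s" expects a `const char *`. Passing `name` itself through the varargs
  // would be undefined behavior. Passing name.data() is only safe because
  // this StrRef is known to come from a null-terminated string.
  std::snprintf(buf, sizeof(buf), "name=%s", name.data());
  return buf;
}
```

If the reference could point into the middle of a larger buffer, a length-limited format such as `"%.*s"` with an explicit length would be the safer choice; that alternative is outside what the message describes.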
…s result (llvm#94571) Prepare for new pass manager version of `MachineDominatorTreeAnalysis`. We may need a machine dominator tree version of `DomTreeUpdater` to handle `SplitCriticalEdge` in some CodeGen passes.
…5117) Co-authored-by: Joseph Huber <huberjn@outlook.com>
…translation (llvm#95098)" Also reverts "[MLIR][Flang][DebugInfo] Convert debug format in MLIR translators" The patch above introduces behaviour controlled by an LLVM flag into the Flang driver, which is incorrect behaviour. This reverts commits: 3cc2710. 460408f.
This makes sure we try to process declaration DIEs that are erroneously present in the index. Until bd5c636, clang was emitting index entries for declaration DIEs with DW_AT_signature attributes. This makes sure to avoid returning those DIEs as the definitions of a type, but also makes sure to pass through DIEs referring to static constexpr member variables, which is a (probably nonconforming) extension used by dsymutil. It adds test cases for both of the scenarios. It is essentially a recommit of llvm#91808.
…sn't drop below threshold Upcoming SimplifyDemandedBits support for CMOV will simplify the code and reduce the critical path below the threshold if we stick with i32 multiplies
…ling Add basic pass through handling - we could extend this to truncate CMOVQ to CMOVL in a future patch
cferry-AMD
approved these changes
Sep 11, 2024
No description provided.