forked from llvm/llvm-project
-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AutoBump] Merge with 507e59aa (13) #270
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Prepare migration for dag-isel.
Add Passes in dependency list
This patch adjusts argument passing for `APInt` to improve the compile-time. Compile-time improvement: https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=32d6611af69bf4e76373f9bc7d9649650f760e48&stat=instructions:u
This patch adjusts argument passing for `APInt` to improve the compile-time. Compile-time improvement: https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=ba3e326def3a6e5cd6d72ff5a49c74fba18de1df&stat=instructions:u
Prepare for dag-isel, also migrate some test case
The zero-value DT_JMPREL is benign but not needed. This is also code simplification available after https://reviews.llvm.org/D65651
…)" This reverts commit 01b1b0c. Breaks following AArch64 SVE buildbots: https://lab.llvm.org/buildbot/#/builders/184/builds/11363 https://lab.llvm.org/buildbot/#/builders/176/builds/9331
Here we introduce three new GMIR instructions to cover a set of trap intrinsics. The idea behind it is that generic intrinsics shouldn't be used with G_INTRINSIC opcode. These new instructions can match perfectly with existing trap ISD nodes. It allows X86, AArch64, RISCV and Mips to reuse SelectionDAG patterns for selection and avoid manual selection. However AMDGPU is an exception. It selects traps during legalization regardless SelectionDAG or GlobalISel. Since there are not many places where traps are used, this change attempts to clean up all the usages of G_INTRINSIC with trap intrinsics. So, there is no stage when both G_TRAP and G_INTRINSIC_W_SIDE_EFFECTS(@llvm.trap) are allowed.
…odules (llvm#85917) Clang modules take a significant compile time hit when pushing and popping diagnostics. Since all the headers are marked as system headers in the modulemap, we can simply disable this pushing and popping when building with clang modules.
Remove getSizeOrUnknown call when MachineMemOperand is created. For Scalable TypeSize, the MemoryType created becomes a scalable_vector. 2 MMOs that have scalable memory access can then use the updated BasicAA that understands scalable LocationSize. Original Patch by Harvin Iriawan Co-authored-by: David Green <david.green@arm.com>
``` --------------------------------------------------- Benchmark old new --------------------------------------------------- bm_mismatch<char>/1 0.835 ns 2.37 ns bm_mismatch<char>/2 1.44 ns 2.60 ns bm_mismatch<char>/3 2.06 ns 2.83 ns bm_mismatch<char>/4 2.60 ns 3.29 ns bm_mismatch<char>/5 3.15 ns 3.77 ns bm_mismatch<char>/6 3.82 ns 4.17 ns bm_mismatch<char>/7 4.29 ns 4.52 ns bm_mismatch<char>/8 4.78 ns 4.86 ns bm_mismatch<char>/16 9.06 ns 7.54 ns bm_mismatch<char>/64 31.7 ns 19.1 ns bm_mismatch<char>/512 249 ns 8.16 ns bm_mismatch<char>/4096 1956 ns 44.2 ns bm_mismatch<char>/32768 15498 ns 501 ns bm_mismatch<char>/262144 123965 ns 4479 ns bm_mismatch<char>/1048576 495668 ns 21306 ns bm_mismatch<short>/1 0.710 ns 2.12 ns bm_mismatch<short>/2 1.03 ns 2.66 ns bm_mismatch<short>/3 1.29 ns 3.56 ns bm_mismatch<short>/4 1.68 ns 4.29 ns bm_mismatch<short>/5 1.96 ns 5.18 ns bm_mismatch<short>/6 2.59 ns 5.91 ns bm_mismatch<short>/7 2.86 ns 6.63 ns bm_mismatch<short>/8 3.19 ns 7.33 ns bm_mismatch<short>/16 5.48 ns 13.0 ns bm_mismatch<short>/64 16.6 ns 4.06 ns bm_mismatch<short>/512 130 ns 13.8 ns bm_mismatch<short>/4096 985 ns 93.8 ns bm_mismatch<short>/32768 7846 ns 1002 ns bm_mismatch<short>/262144 63217 ns 10637 ns bm_mismatch<short>/1048576 251782 ns 42471 ns bm_mismatch<int>/1 0.716 ns 1.91 ns bm_mismatch<int>/2 1.21 ns 2.49 ns bm_mismatch<int>/3 1.38 ns 3.46 ns bm_mismatch<int>/4 1.71 ns 4.04 ns bm_mismatch<int>/5 2.00 ns 4.98 ns bm_mismatch<int>/6 2.43 ns 5.67 ns bm_mismatch<int>/7 3.05 ns 6.38 ns bm_mismatch<int>/8 3.22 ns 7.09 ns bm_mismatch<int>/16 5.18 ns 12.8 ns bm_mismatch<int>/64 16.6 ns 5.28 ns bm_mismatch<int>/512 129 ns 25.2 ns bm_mismatch<int>/4096 1009 ns 201 ns bm_mismatch<int>/32768 7776 ns 2144 ns bm_mismatch<int>/262144 62371 ns 20551 ns bm_mismatch<int>/1048576 254750 ns 90097 ns ```
Use `LINK_COMPONENTS` parameter of `add_llvm_library` rather than passing LLVM components directly to `target_link_libraries`, in order to ensure that LLVM dylib is linked correctly when used. Otherwise, CMake insists on linking to static libraries that aren't present on distributions doing pure dylib installs, such as Gentoo. This fixes a regression introduced in dcbddc2.
If any return from overwriteChangedFiles is true some fixes were not applied.
Commit f44db24 (2015) enabled this simplication.
MIPS is different and should better off use separate code.
…lvm#84464) Instead of keeping a mapping of Inst->VPValues (of their corresponding recipes) in VPlan's Value2VPValue mapping, keep it in VPRecipeBuilder instead. After recently replacing the last user of this mapping after initial construction, this mapping is only needed for recipe construction (to map IR operands to VPValue operands). By moving the mapping, VPlan's VPValue tracking can be simplified and limited only to live-ins. It also allows removing disableValue2VPValue and associated machinery & asserts. PR: llvm#84464
…ion (llvm#86386) Move the code adding top-level cmake/Modules directory to CMAKE_MODULE_PATH prior to including `GetDarwinLinkerVersion`, in order to fix standalone builds. Fixes a regression introduced by 3bc71c2.
Hide the implementations of `FuncHashes` and `BBHashMap` classes, getting rid of `at` accessors that could throw an exception. Test Plan: NFC Reviewers: ayermolo, maksfb, dcci, rafaelauler Reviewed By: rafaelauler Pull Request: llvm#86353
Attach branch counters to YAML profile, covering intra-function control flow. Depends on: llvm#86353 Test Plan: Updated bolt/test/X86/bolt-address-translation-yaml.test Reviewers: rafaelauler, dcci, ayermolo, maksfb Reviewed By: rafaelauler Pull Request: llvm#76911
…non-splat vector SREM expansion when we aren't hitting the special case. (llvm#86238) Fixes llvm#84830 Introduced in llvm#82706
The indexed MemProf file has a huge amount of redundancy. In a large internal application, 82% of call stacks, stored in IndexedAllocationInfo::CallStack, are duplicates. We should work toward deduplicating call stacks by referring to them with unique IDs with actual call stacks stored in a separate data structure, much like we refer to memprof::Frame with memprof::FrameId. At the same time, we need to facilitate a graceful transition from the current version of the MemProf format to the next. We should be able to read (but not write) the current version of the MemProf file even after we move onto the next one. With those goals in mind, I propose to have an integer ID next to CallStack in IndexedAllocationInfo to refer to a call stack in a succinct manner. We'll gradually increase the areas of the compiler where IDs and call stacks have one-to-one correspondence and eventually remove the existing CallStack field. This patch adds call stack ID, named CSId, to IndexedAllocationInfo and teaches the raw profile reader to compute unique call stack IDs and store them in the new field. It does not introduce any user of the call stack IDs yet, except in verifyFunctionProfileData.
This commit adds the `BufferViewFlowOpInterface` to the bufferization dialect. This interface can be implemented by ops that operate on buffers to indicate that a buffer op result and/or region entry block argument may be the same buffer as a buffer operand (or a view thereof). This interface is queried by the `BufferViewFlowAnalysis`. The new interface has two interface methods: * `populateDependencies`: Implementations use the provided callback to declare dependencies between operands and op results/region entry block arguments. E.g., for `%r = arith.select %c, %m1, %m2 : memref<5xf32>`, the interface implementation should declare two dependencies: %m1 -> %r and %m2 -> %r. * `mayBeTerminalBuffer`: An SSA value is a terminal buffer if the buffer view flow analysis stops at the specified value. E.g., because the value is a newly allocated buffer or because no further information is available about the origin of the buffer. Ops that implement the `RegionBranchOpInterface` or `BranchOpInterface` do not have to implement the `BufferViewFlowOpInterface`. The buffer dependencies can be inferred from those two interfaces. This commit makes the `BufferViewFlowAnalysis` more accurate. For unknown ops, it conservatively used to declare all combinations of operands and op results/region entry block arguments as dependencies (false positives). This is no longer the case. While the analysis is still a "maybe" analysis with false positives (e.g., when analyzing ops such as `arith.select` or `scf.if` where the taken branch is not known at compile time), results and region entry block arguments of unknown ops are now marked as terminal buffers. This commit addresses a TODO in `BufferViewFlowAnalysis.cpp`: ``` // TODO: We should have an op interface instead of a hard-coded list of // interfaces/ops. ``` It is no longer needed to hard-code ops.
…lvm#86416) Reverting llvm#85188 with follow up patches. This reverts commit 362d263. This reverts commit c9bdeab. This reverts commit 6bc6e1a. This reverts commit 01fa550. This reverts commit ddcbab3.
Fixes llvm#83561. When a thread is blocked on a mutex and we send an async signal to that mutex, it never arrives because tsan thinks that `pthread_mutex_lock` is not a blocking function. This patch marks `pthread_*_lock` functions as blocking so we can successfully deliver async signals like `SIGPROF` when the thread is blocked on them. See the issue also for more details. I also added a test, which is a simplified version of the compiler explorer example I posted in the issue. Please let me know if you have any other ideas or things to improve! Happy to work on them. Also I filed llvm#83844 which is more tricky because we don't have a libc wrapper for `SYS_futex`. I'm not sure how to intercept this yet. Please let me know if you have ideas on that as well. Thanks!
…tsan (llvm#86537) Fixes llvm#83844. This PR adds callbacks to mark futex syscalls as blocking. Unfortunately we didn't have a mechanism before to mark syscalls as a blocking call, so I had to implement it, but it mostly reuses the `BlockingCall` implementation [here](https://github.com/llvm/llvm-project/blob/96819daa3d095cf9f662e0229dc82eaaa25480e8/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp#L362-L380). The issue includes some information but this issue was discovered because Rust uses futexes directly. So most likely we need to update Rust as well to use these callbacks. Also see the latest comments in llvm#85188 for some context. I also sent another PR llvm#84162 to mark `pthread_*_lock` calls as blocking.
The latest ACLE allows it and further clarifies the following in regards to the combination of the two attributes: "If the `default` matches with another explicitly provided version in the same translation unit, then the compiler can emit only one function instead of the two. The explicitly provided version shall be preferred." ("default" refers to the default clone here) ARM-software/acle#310
…aming functions. (llvm#85388) Similar to how we protected FP/fixed-vector arguments and results from calls, we should do the same for arguments/results from locally-streaming functions such that those are not spilled/filled as ZPR registers. This may cause a small regression (additional spills/fills), which is addressed by llvm#85386.
…86535) Testing with MSVC link.exe showed that it respects such options, while LLD currently discards them.
…es (llvm#86595) Summary: We have a plugin singleton that implements the Plugin interface. This then spawns separate device and kernels. Previously when these needed to reach into the global singleton they would use the `PluginTy::get` routine to get access to it. In the future we will move away from this as the lifetime of the plugin will be handled by `libomptarget` directly. This patch removes uses of this inside of the plugin implementaion themselves by simply keeping a reference to the plugin inside of the device. The external `__tgt_rtl` functions still use the global method, but will be removed later.
…ed .def files. (llvm#86564) It's similar to llvm#86535, but for export specified in .def files.
llvm#86486) This attribute tells the compiler that the variable must have its exit-time destructor run, so it makes sense that it would silence the warning telling users that an exit-time destructor is required. Fixes llvm#68686
llvm-project/clang/lib/Sema/SemaDecl.cpp:11653:20: error: unused variable 'OldMVKind' [-Werror,-Wunused-variable] MultiVersionKind OldMVKind = OldFD->getMultiVersionKind(); ^ 1 error generated.
This patch removes APIs that creating NUW neg. It is a trivial case because `sub nuw 0, X` always gets simplified into zero. I believe there is no optimization opportunities in the real-world applications that we can take advantage of the nuw flag. Motivated by llvm#84792 (comment). Compile-time improvement: https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=da7b7478b7cbb32c09d760f6b8d0e67901e0d533&stat=instructions:u
Fixed the printing of templated argument list and added test case.
…NFC. (llvm#86259) This patch passes APInt by const reference in m_SpecificInt instead of by value. Specifically, it refactors `m_SpecificInt(uint64_t V)` to avoid APInt construction and dangling reference. I believe it is safe to pass the APInt by const reference into `m_SpecificInt` even if it is a temporary. See also https://en.cppreference.com/w/cpp/language/lifetime > All temporary objects are destroyed as the last step in evaluating the [full-expression](https://en.cppreference.com/w/cpp/language/expressions#Full-expressions) that (lexically) contains the point where they were created Compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=7edf459b95ab2be33b70ec67faf87b3b8cc84f09&stat=instructions:u
…5911) Currently patchpoints can only have two result types, `void` and `i64`. This limits the result to general purpose registers. This patch makes `patchpoint.i64` an overloadable intrinsic, allowing result values that can fit in a single register (e.g. integers, pointers, floats).
Need to include initial sext/zext/trunc nodes to the list of the demoted root values to correctly calculate the cost and handle the vectorization.
…tadata (llvm#86431) Similar to llvm#80829 for GlobalISel.
Added a new variant of the CHECK() function that takes a custom message as a parameter. This is useful for more meaninful error messages when the compiler is expected to crash. Fixes llvm#78931
…lvm#84864) This change: - Updates the existing Clang User's Manual section on SPGO so that it describes how to use llvm-profgen to perform SPGO on Windows. This is new functionality implemented in llvm#83972. - Fixes a minor typo in the existing llvm-profgen invocation example. - Adds an LLVM release note on this new functionality in llvm-profgen.
…#86655) This matches the CMake targets and reduces the number of headers that need to be included in multiple targets.
…5378) Using remove() on DeclContext::lookup_result list invalidates iterators. This assertion failure was one (fortunate) symptom: ``` clang/include/clang/AST/DeclBase.h:1337: reference clang::DeclListNode::iterator::operator*() const: Assertion `Ptr && "dereferencing end() iterator"' failed. ```
…ich are needed to authenticate signed pointers (llvm#67454)" (llvm#86674) This reverts commit 8bd1f91. It appears that the commit broke msan bots.
These tests show invalid tbaa.struct metadata that is currently accepted in preparation for a change to the IR Verifier that will then reject it. PR: llvm#86167
cmcgirr-amd
approved these changes
Aug 16, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.