forked from llvm/llvm-project
-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
First batch post March 7th [Nr 0] #237
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This adds a `CExpression` trait and replaces the `isCExpression()` function.
…nyext load. This load isn't selected by tablegen due to the anyext, but wasn't generating a subreg_to_reg. Maybe it shouldn't be formed at all during the combiner but to stop crashes later in codegen select it manually for now.
This reverts commit 2a13422. It was causing test failures on the expensive check builders.
Changing SPIR-V backend meeting day and removing my office hours
llvm#77753) Per https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2448r2.html function/constructor/destructor can be marked `constexpr` even though it never produces a constant expression. Non-literal types as return types and parameter types of functions marked `constexpr` are also allowed. Since this is not a DR, the diagnostic messages are still preserved for C++ standards older than C++23.
This is the concat_vector equivalent of llvm#81312, in that we recursively split concat_vectors with more than two operands into smaller concat_vectors. This allows us to break up the chain of vslideups, as well as perform the vslideups at a smaller LMUL, which in turn reduces register pressure as the previous lowering performed N vslideups at the highest result LMUL. For now, it stops splitting past MF2. This is done as a DAG combine so that any undef operands are combined away: If we do this during lowering then we end up with unnecessary vslideups of undefs.
…ences within trivial statements (llvm#82229) This PR makes alpha.webkit.UncountedLocalVarsChecker ignore raw references and pointers to a ref counted type which appears within "trival" statements. To do this, this PR extends TrivialFunctionAnalysis so that it can also analyze "triviality" of statements as well as that of functions Each Visit* function is now augmented with withCachedResult, which is responsible for looking up and updating the cache for each Visit* functions. As this PR dramatically improves the false positive rate of the checker, it also deletes the code to ignore raw pointers and references within if and for statements.
Try to make this editable by using functions for the number of wait states as a function of the number of passes. I'm assuming the current hazard test coverage is comprehensive. This could probably use another round to further simplify it. Alternatively, I believe this could all be expressed in a constant table indexed by an instruction classify function and number of passes.
…vm#84292) This trips the verifier changes added in llvm#83251 Stimulated by llvm/test/MC/WebAssembly/extern-functype-intrinsic.ll
Currently, the SLS hardening pass is run before the machine outliner, which means that the outliner creates new functions and calls which do not have the SLS hardening applied. The fix for this is to move the SLS passes to after the outliner, as has recently been done for the return address signing pass. This also avoids a bug where the SLS outliner emits code with instructions after a return, which the outliner doesn't correctly handle.
Clang returns an error when compiling this file with c++20 ``` error: ISO C++20 does not permit initialization of char array with UTF-8 string literal ``` It seems like c++20 treats u8strings differently than strings (probably needs char8_t). Make this a string to fix the error.
) The standard specifies that it it UB to specialize the following traits: - `std::is_integral` - `std::is_unsigned` - `std::make_unsigned` - `std::make_signed` This patch: - Removes specializations for `BigInt` - Transforms SFINAE for `bit.h` functions from template parameter to return type (This makes specialization easier). - Adds `BigInt` specialization for `bit.h` functions. - Fixes code depending on previous specializations.
Following the pattern used for SOP instructions, we can use the same multiclass with a default argument to define renamed and non-renamed instructions.
This allows us to configure the pass to emit linalg.copy instead of memref.copy. This is consistent with one-shot-bufferize, which also allows to configure the `memCpyFn`, see https://discord.com/channels/636084430946959380/642426447167881246/1211698722438783087
ACLE suggests: ARM-software/acle#308. GCC emits diagnostics for attribute strings which contain duplicate features, but for now let's follow the SPEC in regards to mangling rules and we can change the semantic behavior of the compiler later if there's value to it.
This adds patterns and a pass to convert the Arith dialect to EmitC. For now, this covers arithemtic binary ops operating on floating point types. It is not checked within the patterns whether the types, such as the Tensor type, are supported in the respective EmitC operations. If unsupported types should be converted, the conversion will fail anyway because no legal EmitC operation can be created. This can clearly be improved in a follow up, also resulting in better error messages. Functions for such checks should not solely be used in the conversions and should also be (re)used in the verifier.
…lvm#84299) Note: This is a reland of llvm#84035. The standard specifies that it it UB to specialize the following traits: - `std::is_integral` - `std::is_unsigned` - `std::make_unsigned` - `std::make_signed` This patch: - Removes specializations for `BigInt` - Transforms SFINAE for `bit.h` functions from template parameter to return type (This makes specialization easier). - Adds `BigInt` specialization for `bit.h` functions. - Fixes code depending on previous specializations.
…th changing shuffles (llvm#84156) Fix gap in the cost estimation for length changing shuffles, by adjusting the shuffle mask and either widening the shuffle inputs or extracting the lower elements of the result. A small step towards moving some of this implementation inside improveShuffleKindFromMask and/or target getShuffleCost handlers (and reduce the diffs in cost estimation depending on whether coming from a ShuffleVectorInst or the raw operands / mask components)
Reverts llvm#84291 The patch broke Windows builds.
Pass the Pseudo (instead of its name) into EXP_Real_Row and EXP_Real_ComprVM since it is already available in all subclasses.
Rename getNumVGPRBlocks to getEncodedNumVGPRBlocks, to clarify that it's using the encoding granule. This is used to program the hardware. In practice, the hardware will use the alloc granule instead, so this patch also adds a new helper, getAllocatedNumVGPRBlocks, which can be useful when driving heuristics.
See also discussion on llvm#84291.
…l-expression. (llvm#82611) In llvm#72985, I made a change to discard expression state (`ExprToLoc` and `ExprToVal`) at the beginning of each basic block. I did so with the claim that "we never need to access entries from these maps outside of the current basic block", noting that there are exceptions to this claim when control flow happens inside a full-expression (the operands of `&&`, `||`, and the conditional operator live in different basic blocks than the operator itself) but that we already have a mechanism for retrieving the values of these operands from the environment for the block they are computed in. It turns out, however, that the operands of these operators aren't the only expressions whose values can be accessed from a different basic block; when control flow happens within a full-expression, that control flow can be "interposed" between an expression and its parent. Here is an example: ```cxx void f(int*, int); bool cond(); void target() { int i = 0; f(&i, cond() ? 1 : 0); } ``` ([godbolt](https://godbolt.org/z/hrbj1Mj3o)) In the CFG[^1] , note how the expression for `&i` is computed in block B4, but the parent of this expression (the `CallExpr`) is located in block B1. The the argument expression `&i` and the `CallExpr` are essentially "torn apart" into different basic blocks by the conditional operator in the second argument. In other words, the edge between the `CallExpr` and its argument `&i` straddles the boundary between two blocks. I used to think that this scenario -- where an edge between an expression and one of its children straddles a block boundary -- could only happen between the expression that triggers the control flow (`&&`, `||`, or the conditional operator) and its children, but the example above shows that other expressions can be affected as well; the control flow is still triggered by `&&`, `||` or the conditional operator, but the expressions affected lie outside these operators. Discarding expression state too soon is harmful. For example, an analysis that checks the arguments of the `CallExpr` above would not be able to retrieve a value for the `&i` argument. This patch therefore ensures that we don't discard expression state before the end of a full-expression. In other cases -- when the evaluation of a full-expression is complete -- we still want to discard expression state for the reasons explained in llvm#72985 (avoid performing joins on boolean values that are no longer needed, which unnecessarily extends the flow condition; improve debuggability by removing clutter from the expression state). The impact on performance from this change is about a 1% slowdown in the Crubit nullability check benchmarks: ``` name old cpu/op new cpu/op delta BM_PointerAnalysisCopyPointer 71.9µs ± 1% 71.9µs ± 2% ~ (p=0.987 n=15+20) BM_PointerAnalysisIntLoop 190µs ± 1% 192µs ± 2% +1.06% (p=0.000 n=14+16) BM_PointerAnalysisPointerLoop 325µs ± 5% 324µs ± 4% ~ (p=0.496 n=18+20) BM_PointerAnalysisBranch 193µs ± 0% 192µs ± 4% ~ (p=0.488 n=14+18) BM_PointerAnalysisLoopAndBranch 521µs ± 1% 525µs ± 3% +0.94% (p=0.017 n=18+19) BM_PointerAnalysisTwoLoops 337µs ± 1% 341µs ± 3% +1.19% (p=0.004 n=17+19) BM_PointerAnalysisJoinFilePath 1.62ms ± 2% 1.64ms ± 3% +0.92% (p=0.021 n=20+20) BM_PointerAnalysisCallInLoop 1.14ms ± 1% 1.15ms ± 4% ~ (p=0.135 n=16+18) ``` [^1]: ``` [B5 (ENTRY)] Succs (1): B4 [B1] 1: [B4.9] ? [B2.1] : [B3.1] 2: [B4.4]([B4.6], [B1.1]) Preds (2): B2 B3 Succs (1): B0 [B2] 1: 1 Preds (1): B4 Succs (1): B1 [B3] 1: 0 Preds (1): B4 Succs (1): B1 [B4] 1: 0 2: int i = 0; 3: f 4: [B4.3] (ImplicitCastExpr, FunctionToPointerDecay, void (*)(int *, int)) 5: i 6: &[B4.5] 7: cond 8: [B4.7] (ImplicitCastExpr, FunctionToPointerDecay, _Bool (*)(void)) 9: [B4.8]() T: [B4.9] ? ... : ... Preds (1): B5 Succs (2): B2 B3 [B0 (EXIT)] Preds (1): B1 ```
…llvm#84296) I did not know how `-mmlir` flag works and was deferring the addition of `--openm-enabled-delayed-privatization` until later because I thought some work needs to be done to do that. This commit just adds some extra `RUN` lines to delayed privatization tests to run them from `flang` as well.
…llvm#79818) This patch adds support for parsing the proposed non-instruction debug info ("RemoveDIs") from textual IR, and adds a test for the parser as well as a set of verifier tests that are dependent on parsing to fire. An important detail of this patch is the fact that although we can now parse in the RemoveDIs (new) and Intrinsic (old) debug info formats, we will always convert back to the old format at the end of parsing - this is done for two reasons: firstly to ensure that every tool is able to process IR printed in the new format, regardless of whether that tool has had RemoveDIs support added, and secondly to maintain the effect of the existing flags: for the tools where support for the new format has been added, we will run LLVM passes in the new format iff `--try-experimental-debuginfo-iterators=true`, and we will print in the new format iff `--write-experimental-debuginfo-iterators=true`; the format of the textual IR input should have no effect on either of these features.
…are local to the inner context (llvm#84150) Make TopLevelStmtDecl a DeclContext so that variables defined in statements are attached to the TopLevelDeclContext. This fixes redefinition errors from variables declared in if conditions and for-init statements. These must be local to the inner context (C++ 3.3.2p4), but they had generated definitions on global scope instead. This PR makes the TopLevelStmtDecl looking more like a FunctionDecl and that's fine because the FunctionDecl is very close in terms of semantics. Additionally, ActOnForStmt() requires a CompoundScope when processing a NullStmt body. --------- Co-authored-by: Vassil Vassilev <v.g.vassilev@gmail.com>
This reverts commit fb02f9a. Looks like some Python version incompatibility, will investigate.
Resubmitting this after previous revert with the following changes: - Split table into table_rhs_idx and table_candidate_idx so that bisect.bisect_left can be used without the `key` argument, which was introduced in Python 3.10 - Remove a re.Pattern type annotation Original commit message: Prior to this change, running UTC on larger tests, especially tests with unnamed IR values, often resulted in a spuriously large diff because e.g. TMPnn variables in the CHECK lines were renumbered. This change attempts to reduce the diff by keeping those variable names the same. There are cases in which this "drift" of variable names can end up being more confusing. The old behavior can be re-enabled with the --reset-variable-names command line argument. The improvement may not be immediately apparent in the diff of this change. The point is that the diff of stable_ir_values.ll against stable_ir_values.ll.expected after this change is smaller. Ideally, we'd also keep meta variables for "global" objects stable, e.g. for attributes (#nn) and metadata (!nn). However, that would require a much more substantial refactoring of how we generate check lines, so I left it for future work.
Older versions of clang do not have __builtin_complex, but they may define `__GNUC__`.
…s. (llvm#84187) With the many pseudos used in SVE codegen it can be too easy to miss instructions. This enables the existing test we have for checking the scheduling info of the pseudos matches the real instructions, and adjusts the scheduling info in the NeoverseV1 model to make sure all are handled. In the cases I could I opted to use the same info as in the NeoverseV2 model, to keep the differences smaller.
…this` with values. (llvm#84164) This is the constructor's job, and we want to be able to test that it does this.
…y value. (llvm#84317) I'm making some changes to `Environment::getResultObjectLocation()`, with the ultimate goal of eliminating `RecordValue` entirely, and I'd like to make sure I don't break this behavior (and I've realized we don't have a test for it yet).
These checks have been broken since 6afe972. The check_cxx_compiler_flag macro only takes two arguments and passing three to it ends up calling `cmake_check_compiler_flag(CXX "${_FLAG}" ${_RESULT})` with ${_FLAG} equal to `-Werror` and the result variable being the actually tested compiler flag. I noticed this because some of the flags that I know should be supported were being flagged as not supported. `--debug-trycompile` shows the following surprising line in the generated CMakeLists.txt: `add_definitions([==[-D-Wempty-body]==] [==[-Werror]==])` which then results in the following error while running the check: ``` FAILED: CMakeFiles/cmTC_72736.dir/src.cxx.o tmp/upstream-llvm-readonly/bin/clang++ -nodefaultlibs -std=c++17 -fcolor-diagnostics -D-Wempty-body -Werror -MD -MT CMakeFiles/cmTC_72736.dir/src.cxx.o -MF CMakeFiles/cmTC_72736.dir/src.cxx.o.d -o CMakeFiles/cmTC_72736.dir/src.cxx.o -c .../cmake-build-all-sanitizers/CMakeFiles/CMakeScratch/TryCompile-nyh3QR/src.cxx In file included from <built-in>:450: <command line>:1:9: error: macro name must be an identifier 1 | #define -Wempty-body 1 | ^ 1 error generated. ``` It would be great if CMake could be a bit more helpful here so I've filed https://gitlab.kitware.com/cmake/cmake/-/issues/25735. See also https://reviews.llvm.org/D146920. Reviewed By: nikic Pull Request: llvm#83779
mgehre-amd
approved these changes
Aug 7, 2024
josel-amd
changed the title
First batch post March 7th
First batch post March 7th [Nr 0]
Aug 8, 2024
Auto-merged enabled. Should merge itself soon. Thanks for the approval! |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.