-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for operators on Core.IntLiteral
.
#4716
base: trunk
Are you sure you want to change the base?
Add support for operators on Core.IntLiteral
.
#4716
Conversation
Also add support for mixed comparison between different integer types.
Stop trying to allow operations on IntLiterals to be lowered. That's not possible in general because we don't necessarily have a value at runtime for the IntLiteral.
toolchain/check/eval.cpp
Outdated
@@ -769,6 +766,15 @@ static auto DiagnoseDivisionByZero(Context& context, SemIRLoc loc) -> void { | |||
context.emitter().Emit(loc, CompileTimeDivisionByZero); | |||
} | |||
|
|||
// Get an integer at a suitable bit-width: either its actual width if it has a | |||
// fixed width, or the canonical width from the value store if not. | |||
static auto GetIntAtSuitableWidth(Context& context, IntId bit_width_id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might it make sense to make this behavior part of GetAtWidth
, rather than a separate function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. I think the need to deal with the "unknown width" case in exactly this way is probably specific to the evaluation logic, and other users of IntStore
are unlikely to want it -- especially in something with such an innocuous name :) It looks like all the current calls to GetAtWidth
are in this file, but I think that's just because lower/constant.cpp hasn't been updated to use it yet -- this is precisely GetAtWidth
written longhand, and doesn't want this special casing for an invalid width.
toolchain/check/eval.cpp
Outdated
static auto GetIntAtSuitableWidth(Context& context, IntId bit_width_id, | ||
IntId int_id) -> llvm::APInt { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems potentially pretty confusing that this function takes its parameters in the opposite order from GetAtWidth
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, especially given the parameters are all the same types. I'm a little bit uneasy with having the bit width last given the signatures of GetIntAtSuitableWidth
/ GetIntsAtSuitableWidth
-- having the "variadic" part last feels better to me -- but matching GetAtWidth
and making the parameter order match the order in which the things are mentioned in the name is probably reasonable. Done.
toolchain/check/eval.cpp
Outdated
} | ||
|
||
default: | ||
// Break to do additional setup for other builtin kinds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite follow: if we break here, we don't just do "additional setup", we actually perform the operations, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Pre-existing.) Yeah, this seems confusing -- and in fact the shift logic is completely different to the rest of this, so I've split that out into a different function.
if (result.overflow && !lhs_bit_width_id.is_valid()) { | ||
// Retry with a larger bit width. Most operations can only overflow by one | ||
// bit, but signed n-bit multiplication can overflow to 2n-1 bits. | ||
int new_width = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not do this before the first attempt, and avoid the need to retry? If the concern is that 2n bits could be too expensive if it's not needed, it seems like we could compute a tighter upper bound pretty efficiently (something like lhs_val.ceilLogBase2() + rhs_val.ceilLogBase2()
?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally planned to do that, but @chandlerc suggested it'd be better to do it this way -- IIUC the rationale is that we'll almost never need to go to a wider size than the inputs (because they've already been rounded up to a multiple of 64 bits by the IntStore
), so it's better to speculatively assume that the result will fit than to spend time computing a width -- especially because any wider upper bound will require a heap allocation (APInt
heap allocates integers wider than 64 bits) and wider operations get more expensive pretty quickly in APInt
at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep.
But it seems good to capture this rationale in the comments as otherwise it is a bit mysterious why we wait to see the overflow before doing this.
var a_lit: Core.IntLiteral() = 12; | ||
var an_i32: i32 = 34; | ||
|
||
// This can't be valid: we don't have a compile-time or runtime integer value for `n`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does "n" refer to in this context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, missed this when I renamed the variable. Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is less_eq
meaningfully different from greater_eq
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I'd just already updated both before I decided it wasn't worth it. I can revert the changes to one of them if you like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume the thinking here is that while sdiv
has a special rule about 0, that rule is sufficiently orthogonal to fixed-vs-variable width that we don't need separate coverage of non-fixed-width cases?
In cases like this where we're relying on tests on another operation to provide coverage indirectly, I wonder if it might make sense to have a comment explaining that, and pointing to the tests that are believed to provide that coverage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. Thinking about this again, I think it probably is worth testing the weird case where sdiv can overflow -- but not for IntLiteral. And while testing that I think it also makes sense to explicitly test division by zero, which is another overflow-like case but one that can happen for IntLiteral. Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might it make sense to have a test of non-fixed-width overflow, or would that be too slow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just too slow. I tried forming a value somewhat near our bit limit using a left-shift (which I think is the fastest way we have to do that) and it ran for a very long time just doing the shift. I think the APInt
multiply algorithm is probably quadratic in the length of the int, too... (I can't imagine it's doing a Fourier transform to speed it up!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know LLVM IR yet, so it could take me a while to review the changes in this directory, especially since IIUC the golden outputs are supposed to be reviewed more carefully in lower
than in check
. Feel free to find another reviewer for this part if you want to expedite things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've asked @chandlerc to take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The LLVM IR looks pretty good. Left some comments throughout.
A higher level meta comment though: I think we need a way to suppress the SemIR from test splits where we're able to fully validate the behavior with the type system as you're doing with Expect(<some value>)
, or where we're just testing diagnostics. The SemIR created by these file splits is huge and completely unhelpful given that they are self enforcing.
Not sure that's strictly necessary prior to landing this PR, but it was already hard to just navigate the SemIR added by this PR.
And if anything, we should be leveraging all the opportunities we have to directly test things the way you are and bypass the more complex SemIR-based testing and only do that in a few places where we really want to zoom into how this is represented, not how it behaves.
if (result.overflow && !lhs_bit_width_id.is_valid()) { | ||
// Retry with a larger bit width. Most operations can only overflow by one | ||
// bit, but signed n-bit multiplication can overflow to 2n-1 bits. | ||
int new_width = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep.
But it seems good to capture this rationale in the comments as otherwise it is a bit mysterious why we wait to see the overflow before doing this.
// Note that this can in theory still overflow if we limited `new_width` to | ||
// `MaxIntWidth`. In that case we fall through to the signed overflow | ||
// diagnostic below. | ||
result = ComputeBinaryIntOpResult(builtin_kind, lhs_val, rhs_val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe:
result = ComputeBinaryIntOpResult(builtin_kind, lhs_val, rhs_val); | |
result = ComputeBinaryIntOpResult(builtin_kind, lhs_val, rhs_val); | |
CARBON_CHECK(new_width == IntStore::MaxIntWidth || !result.overflow); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the new tests in this file result in a pretty staggering amount of SemIR being printed with unusually low value.
These tests actually check that they work correctly intrinsically or in the diagnostics being emitted, and so the SemIR is especially unfortunate here. Not sure there is anything to do in this PR but something to think about in terms of whene the SemIR isn't as helpful.
// TODO: This might be an awkward size, such as 33 or 65 bits, for a | ||
// signed/unsigned comparison. Would it be better to round this up to a | ||
// "nicer" bit width? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My memory is that we experimented with this when considering these semantics for C++ and LLVM did a good job of handling the "odd" size and producing the expected result... But yeah, we can always revisit if practice proves otherwise.
// Weirdly, LLVM requires the operands of bit shift operators to be of the | ||
// same type. We can always use the width of the LHS, because if the RHS | ||
// doesn't fit in that then the cast is out of range anyway. | ||
// | ||
// TODO: In a development build we should trap in that case. | ||
rhs = context.builder().CreateZExtOrTrunc(rhs, lhs->getType(), "rhs"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe worth commenting here that LLVM always treats the shift amount as an unsigned amount, hence ZExt is correct? And maybe add checking for negative shifts to the TODO?
// TODO: We could allow these in the case where the operand has a | ||
// compile-time value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about when the integer literal value can be represented without change in the non-literal operand's type?
Or do we handle that by converting the literal?
I would expect fn F(i: i32) -> bool { return i < 32; }
to work fine... If this is handled by conversion, maybe just mention that in the comment to avoid confusion?
// Shifts by an integer literal amount are compile-time only. We don't | ||
// have a value for the shift amount at runtime in general. | ||
// TODO: We could allow these in the case where the shift amount has a | ||
// compile-time value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that the shift amount has to fit into the shifted operand's type (and LLVM even makes us use that!), why not support there?
Or maybe this is what you mean by "compile-time value" -- but don't integer literals always have a compile time value?
Sorry if the comment is just confusing me (see below as well).
// the RHS with the LHS bit width. | ||
CARBON_CHECK(rhs.type_id == lhs.type_id, "Heterogeneous builtin integer op!"); | ||
llvm::APInt rhs_val = context.ints().GetAtWidth(rhs.int_id, lhs_bit_width_id); | ||
return {.lhs = context.ints().GetAtWidth(lhs_id, bit_width_id), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment on APIntBinaryOperands talks about RVO, but doesn't this function defeat that by returning a named variable in one path and a temporary in another?
// Retry with a larger bit width. Most operations can only overflow by one | ||
// bit, but signed n-bit multiplication can overflow to 2n-1 bits. | ||
int new_width = | ||
builtin_kind == SemIR::BuiltinFunctionKind::IntSMul |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this not seem to imply that IntUMul maybe have introduced wrapping on the first ComputeBinaryIntOpResult()
attempt, and wouldn't if we'd increased its bitwidth? Is that desirable?
Fixes integer builtins to produce the correct values (and not CHECK-fail) when used on integer literals. Also adds impls to the prelude to use the new builtins to perform operations on integer literals.
Perhaps most importantly, this allows directly initializing
i32
values with negative numbers, as the negation operation on integer literals now works.For testing I've added tests for use of literals with one operator in each class (addition, multiplication, ordering, bitwise, etc) for which there are distinct rules or overflow behavior, rather than exhaustively testing all the combinations. This is aimed at finding a good tradeoff between maintainability of the tests and thorough test coverage.
Also fixes lowering of heterogeneous shifts and comparisons. These are currently disabled when one of the operands is an integer literal, but we may want to allow that when the integer literal operand has a known constant value.