-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New evaluator that doesn't depend on flattening #1495
Conversation
This test case describes a scenario where we don't get the desired result. The behavior changed, not for the better nor for the worse.
Hi @romac @MahtabNorouzi! I realize this is way too big to review. Unfortunately, the nature of this work and my current resources made it so this is the only feasible option. @josef-widder has been using a version of this since last week as a beta tester, and I'll also be using this heavily over this week. I've also been testing this with our set of examples and I'm confident with its current state. So I'd like to ask that you give this a look, however it fits in your time and priorities right now. Let me know if you think this is worth merging/releasing, or if there are any other requirements I should meet before putting it out. I'm also happy to answer any questions or go through stuff in detail either in here, slack or zoom. |
@konnov if you want to take a look at this, I'm sure you'll have valuable feedback. I think this is an evolution of what you wrote, and I would not be able to come up with it without the base you provided through your implementation. |
It's a massive change. I think it would make more sense to break it down into much smaller PRs. Reviewing such a big change in the simulator is a full-time job for a week. If you need help with that, Informal management knows my hourly rates. |
Of course, yes. I've always be very in favor of small PRs and feel bad that this is so big. However, in this particular case, I don't think the effort taken (probably more than a week) to break this down would be compensated in significantly more informed reviews, considering the people available to review it at this point. On top of that, I mentioned you here to make you aware of this change. I've tested the new version in many of our internal and public Quint projects and it's all good. I know you are also writing Quint specs, and I'd rather have you informed than surprised by such meaningful change in the simulator. And sorry if I implied that I expected you to work for free! I'm just trying to be considerate of your participation in this as best as I can. |
Hey @bugarela! No worries! I am just saying that I could have a look at smaller PRs. Understanding a massive change is quite hard without having context. Normally the context exchange happens during regular meetings. Since we don't have those anymore, reviewing such a big change would be similar to a security audit to me, of course, the focus would not be on the security bugs. If you have no other choice to refactor this big PR into smaller chunks, I would recommend you to introduce a flag to run both the old simulator and the new one in a lockstep, so you would be able to quickly detect when they start to deviate. |
// Update the definition to point to the expression being overriden | ||
constDef.id = ex.id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a kind of a hack so the compiler would evaluate the constant as the expression it was overridden to. We now do it on the level of registers on the evaluator.
decl.overrides.forEach(([_param, expr]) => { | ||
this.graphAddAll(expr.id, Set([decl.id])) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is from a reverted commit, which added a hack for something we don't need anymore
different version numbers
Hello
This is a huge PR because it changes the core of our evaluation process to not depend on flattening anymore. I've moved a lot of things around and attempted many different things throughout this process, and I finally have a version that I think is good enough to ship.
Since this is pretty much a rewrite, we are getting rid of a lot of tech debt and fixing many bugs. Here's a list of what I could find:
get
with no key inside a fold #1471init
action #1439Either
instead ofOption
inRuntimeValue
#1058compileFromCode
and use more appropriate interfaces #1052<undefined value>
in the REPL when evaluating #924_lastTrace
when evaluating actions #288init
andstep
params in the CLI have invalid syntax. #1151The following issues are consequences of flattening bugs. After this PR, the issues stop happening for
test
,run
, andrepl
, and only happen inverify
andcompile
commands, as we still need flattening for Apalache. So the impact of these issues was significantly reduced.Change overview
The main change is that, while the old evaluator can evaluate a single flattened module, this version can evaluate expressions, definitions, simulations and tests over the original modules, avoiding the flattening pass (which is slow and buggy).
Related to that, the new version compiles from a starting point: either an expression/definition typed into the REPL, a combiation of init/step/inv given to
simulate
, or from a test definition. It won't even touch code that is not a dependency of this entrypoint. The old evaluator would go through the entire flattened module and compile everything beforehand.The biggest challenge is to handle constants. This was previously done by the flattener, and now we need to take care of them while simulating. A good example of this is the
instances.qnt
file:quint/examples/language-features/instances.qnt
Lines 38 to 45 in 351b52a
Evaluating
V1::init
andV2::init
are two different things, even though both point to the same definition in the lookup table. To make sure we use different constant values for different constants, and that we use different variable registers for variables that come from the same module but through different instances, we need to track a some extra information while building (compiling) expressions.Structure
The new evaluator has two main steps:
builder.ts
). Take Quint expressions and build typescript arrow functions that can be called to obtain the value of that expression under a certain contextPerformance
Although performance changes are not consistent, my evaluation is that it it overall better.
For the REPL, it is a complete game changer. For the malachite project, evaluating any expression in the REPL would take around 20 seconds, while now it is completely instant.
For runs and tests, it is faster for most cases, sometimes taking half of the time. For some cases, it is slower, but I couldn't find any case where it takes double of the time or more. I want to further improve this version so we can get faster results for all scenarios, but I think this version is good enough to release, specially considering the REPL performance and how many bugs it fixes. See benchmark details below.
Benchmarks
"old version" in the benchmarks is the last released version. Ideally I should benchmark against main, but this is easier. Also, we are measuring simulation/testing time, and performance changes outside of this execution scope are not accounted.Cases where the new version is better than the old one:
Cases where the new version is worse than the old one:
CHANGELOG.md
for any new functionalityREADME.md
updated for any listed functionality