-
I wonder if there have been efforts to verify the Alive-to-C++ transpiler itself, in addition to the empirical testing presented on page 9.
-
Doing the tasks in this class gave me the impression that writing compiler optimizations for a feature-rich programming language is really tedious, with a plethora of edge cases. I think this paper is a step in the right direction for optimization development: using machines to find these edge cases for us and, even better, to prove that our optimizations are correct. This makes me wonder if there are more classes of optimizations we can write general formal verifiers for.
-
I was looking up what peephole optimizations were, and this Wikipedia article provides many examples. They seem like a small subset of local optimizations where we replace a contiguous sequence of instructions in a block with something computationally less expensive.
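For a concrete picture (my own sketch, not Alive or LLVM code), here is a minimal peephole pass over a made-up three-address IR; the tuple format and the rewrite rules are assumptions chosen purely for illustration:

```python
# Minimal peephole-pass sketch over a toy IR (illustrative only).
# Each instruction is a tuple: (opcode, dest, operand1, operand2).

def peephole(block):
    """Rewrite individual instructions in place when a cheaper equivalent exists."""
    out = []
    for op, dst, a, b in block:
        # Strength reduction: multiplying by a power of two becomes a shift.
        if op == "mul" and isinstance(b, int) and b > 0 and b & (b - 1) == 0:
            out.append(("shl", dst, a, b.bit_length() - 1))
        # Algebraic identity: adding zero is just a copy.
        elif op == "add" and b == 0:
            out.append(("copy", dst, a, None))
        else:
            out.append((op, dst, a, b))
    return out

print(peephole([("mul", "t0", "x", 8), ("add", "t1", "t0", 0)]))
# [('shl', 't0', 'x', 3), ('copy', 't1', 't0', None)]
```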
-
Rather than verifying existing transformations, how feasible would it be to brute-force new peephole optimizations? I saw that the paper referenced the program Optgen, which generates all possible optimizations but doesn't support poison values or undefined behavior. Alive also claims to take several hours to verify multiplication/division-heavy transformations. If we restrict our transformations to a subset of the language (no instructions that multiply/divide or can cause undefined behavior), can we still find optimizations that aren't immediately obvious to us?
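As a rough illustration of the enumerate-and-verify idea (my own sketch, not Optgen's or Alive's algorithm), one can enumerate small candidate rewrites over a restricted operator set and ask an SMT solver whether each candidate agrees with the original pattern on all inputs; the operator set and the single-step search depth below are arbitrary assumptions:

```python
# Sketch: enumerate tiny candidate rewrites for "x + x" and keep the ones an
# SMT solver proves equivalent for all 8-bit inputs. Requires the z3-solver package.
from z3 import BitVec, BitVecVal, Solver, unsat

x = BitVec("x", 8)
target = x + x  # the pattern we want cheaper replacements for

candidates = {
    "x << 1": x << BitVecVal(1, 8),
    "x ^ x":  x ^ x,
    "x | x":  x | x,
}

def equivalent(expr):
    """True iff expr == target for every 8-bit x (no counterexample exists)."""
    s = Solver()
    s.add(expr != target)
    return s.check() == unsat

for name, expr in candidates.items():
    if equivalent(expr):
        print(f"x + x  can be rewritten as  {name}")
# Expected: only "x << 1" survives ("x ^ x" is 0, "x | x" is just x).
```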
-
Couple questions:
-
I'm pretty excited by the impact of this paper (and Alive2) on real production code. It's neat that this was done on such a widely used tool, yet still on a problem small enough to apply formal verification. I like that they tried both using SMT array theory to encode memory operations and replacing stores with if-then-else expressions to be more efficient. I don't think I quite understood the latter approach fully though, or what exactly an "eager Ackermannization encoding" is.
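To make the two encodings a bit more concrete, here is a toy contrast for a single store followed by a load; this is my own illustration, not the paper's actual encoding, and it only hints at where Ackermannization would enter:

```python
# Toy contrast of the two memory encodings in Z3 (illustrative only; this is
# not Alive's actual code). Requires the z3-solver package.
from z3 import BitVecSort, BitVec, Array, Store, Select, If, prove

Addr, Byte = BitVecSort(32), BitVecSort(8)
mem = Array("mem", Addr, Byte)            # initial memory, fully symbolic
p, q = BitVec("p", 32), BitVec("q", 32)   # store address / load address
v = BitVec("v", 8)                        # value being stored

# Encoding 1: theory of arrays -- a store followed by a load.
load_with_arrays = Select(Store(mem, p, v), q)

# Encoding 2: express the load as an if-then-else over the store's address.
# (A fully array-free encoding would also replace Select(mem, q) with a fresh
# variable plus consistency constraints, which is where Ackermannization comes
# in; that part is omitted here for brevity.)
load_with_ite = If(q == p, v, Select(mem, q))

prove(load_with_arrays == load_with_ite)  # prints "proved"
```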
-
I think it's kind of interesting how the paper notes that many LLVM developers working on peephole optimizations frequently just leave out
-
I thought this paper was a pretty interesting read, although it was a lot to digest. Compiler testing is definitely a very important part of implementing compilers, so I'm curious what impact this paper has had. It seems to me that the integration of tools like Alive into compiler work could have led to more efficient development in the field.
-
Echoing the sentiments of some of the previous comments, I think this is a really cool paper that has the potential to improve a lot of compiler engineers' lives; writing all these compiler optimizations so far this semester has made me realize just how many edge cases can exist, and I'm sure that being able to formally verify any optimization, even if it only affects a small subset of instructions, could improve the correctness of many existing compilers out there. I also find it interesting how the authors encoded undefined behavior and logic on arrays to work well with the SMT solvers they use; these encodings certainly do not seem trivial, and I'd like to learn more about how exactly they're done. Also, since this paper came out nearly a decade ago, I'm curious how the use of this tool has spread over that time in compiler development. It certainly seems well-suited for its use case, so I'd appreciate learning about how this tool or others like it are used in the real world currently.
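On the undefined-behavior encoding: here is a toy example of the general idea (my own sketch, not Alive's actual encoding), showing how an nsw-style assumption changes a verification query:

```python
# Sketch of how an undefined-behavior flag changes a verification query
# (a toy example of the idea, not Alive's actual encoding).
from z3 import BitVec, BitVecVal, SignExt, Not, Solver

x = BitVec("x", 32)
one = BitVecVal(1, 32)

# Folding "x + 1 > x" (signed) to "true" is wrong for a plain wrapping add:
s = Solver()
s.add(Not(x + one > x))          # search for a counterexample
print(s.check(), s.model())      # sat: x = INT_MAX wraps around

# ...but if the add carries nsw, signed overflow would be poison, so the
# verifier may assume it does not happen:
no_signed_overflow = SignExt(1, x) + SignExt(1, one) == SignExt(1, x + one)
s2 = Solver()
s2.add(no_signed_overflow, Not(x + one > x))
print(s2.check())                # unsat: no counterexample under the assumption
```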
-
I agree with others that one of the most attractive points of this system is that it can be used to prevent new bugs before they arise, in tandem with regular compiler development (section 6.2). This reminds me of the counterexample generation in the parser generator I used in CS 4120 (the parser generator outputs a sequence of tokens and a description of how the parser gets "confused," i.e. a shift-reduce or reduce-reduce conflict). If we regard the process of formally proving something and the process of finding a counterexample as duals, then both these tools take a process that can be boring or unintuitive for humans and replace it with a checker that is conservative (i.e. it may not finish its task in time, but it never outputs a wrong answer). It seems to me that in the long run, such a tool would be well worth the initial investment to get developers familiarized with it.
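As a small illustration of counterexample generation (a toy rewrite of my own, not one of the bugs reported in the paper), an SMT solver can hand back a concrete disagreeing input as a model:

```python
# Sketch of counterexample generation for a candidate rewrite (a toy example).
# Requires the z3-solver package.
from z3 import BitVec, Solver, sat

x = BitVec("x", 32)

# Candidate peephole rewrite: signed "x / 2"  ->  arithmetic shift "x >> 1".
source = x / 2     # '/' on Z3 bitvectors is signed division (truncates toward zero)
target = x >> 1    # '>>' on Z3 bitvectors is an arithmetic (sign-preserving) shift

s = Solver()
s.add(source != target)          # is there any input where the two disagree?
if s.check() == sat:
    print("counterexample: x =", s.model()[x].as_signed_long())
    # e.g. x = -1:  -1 / 2 == 0, but -1 >> 1 == -1, so the rewrite is invalid
```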
-
I'm curious about some of the problems that arise when making a tool like this. As others have mentioned, Alive is very useful. Being able to prove your peephole optimizations removes the fear of deploying an optimization that inevitably wrecks someone's program out in the world. But many of the ways the problems were framed suggested that proving optimizations requires solving SMT instances, some of which are even quantified, making the problem NP-hard or even PSPACE-hard. The SMT solvers we have are great, but I'm curious about their applicability at scale. Another thing I was curious about was the notion of translating Alive into C++. It seems like Alive syntax is meant to be reminiscent of LLVM syntax, but if it gets transpiled to C++, wouldn't it be better to make the syntax a bit more lightweight to begin with? Anyway, definitely an interesting read. I also found it pretty funny that LLVM optimizations were proved incorrect with this tool; it's definitely tough to get compiler optimizations right, and that's especially important given what they're used for.
-
The formal methods tools introduced in the paper are really exciting and seem to bring a lot of promise and reassurance to compiler developers: a structured and mathematically sound approach to verifying the correctness of compiler transformations. Also, apologies if this seems like a silly question, as I have zero background with formal verification, but I was wondering how we can be certain that the verifier itself is correct. I understand that in the context of this paper, when Alive fails to prove the correctness of a transformation, it prints a counterexample showing values for the inputs and constants, as well as for each of the preceding intermediate operations. However, if these counterexamples become too complex or exceed the scale at which we can manually inspect them, that raises concerns about the reliability of the verifier. In such cases, it might seem impractical to manually validate the correctness of the verifier, potentially leading to a paradoxical situation where we rely on an automated tool whose accuracy we cannot independently verify.
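One mitigating observation (my own aside, not from the paper): a counterexample is much cheaper to check than to find, so even if we do not trust the verifier, we can replay the reported input through both the source and target patterns concretely and see the mismatch ourselves. A minimal sketch, where both the rewrite and the reported counterexample value are hypothetical:

```python
# Sketch: independently replaying a reported counterexample for a toy rewrite
# "signed x % 8  ->  x & 7". Both the rewrite and the counterexample value are
# hypothetical; this is not output from the real Alive tool.

BITS, MASK = 32, (1 << 32) - 1

def to_signed(v):
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v

def source(x):
    # C-style signed remainder: the result takes the sign of the dividend.
    s = to_signed(x)
    return (abs(s) % 8 * (1 if s >= 0 else -1)) & MASK

def target(x):
    return x & 7

reported_x = (-1) & MASK  # the counterexample the verifier claims to have found
print(hex(source(reported_x)), hex(target(reported_x)))
# 0xffffffff (i.e. -1) vs 0x7: the two sides disagree, so the rewrite is wrong.
```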
-
The authors seem to be hiding a lot of implementation detail, and those details seem nontrivial to implement. For instance, Alive relies on a collection of built-in predicates such as isPowerOf2() and MaskedValueIsZero() for dataflow analysis. I wonder if there exists a tradeoff here between simplicity and program runtime, like RISC vs. CISC.
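For a sense of what such a predicate might look like on the solver side (my own guess at the shape, not Alive's actual implementation), here is one way isPowerOf2 could be expressed as an SMT constraint and used as a precondition:

```python
# Sketch: one way a built-in predicate like isPowerOf2(C) could be expressed
# as an SMT constraint (a guess at the shape, not Alive's implementation).
from z3 import BitVec, And, Implies, URem, prove

c = BitVec("c", 32)
x = BitVec("x", 32)

# Standard bit trick: c is a power of two iff c != 0 and c & (c - 1) == 0.
is_power_of_2 = And(c != 0, c & (c - 1) == 0)

# Used as a precondition: when C is a power of two, unsigned "x % C" may be
# rewritten to "x & (C - 1)", so the transform is valid only under the predicate.
prove(Implies(is_power_of_2, URem(x, c) == x & (c - 1)))  # prints "proved"
```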
-
One interesting side detail brought up by the paper was compilation time. At the time of the paper, the LLVM+Alive configuration was slightly faster (7%) than LLVM 3.6 at -O3. The paper noted that this was because Alive had only implemented a fraction of the InstCombine optimizations. This, coupled with the imbalanced applicability of different optimizations, makes me wonder about ways to navigate the balance between trying out all the optimizations in the world and compilation speed. I'm not the most familiar with this area, but I assume there's some level of overhead on compilation speed that grows roughly linearly with the number of optimizations. The way I know of to navigate this is simply compiler options like -O2 and -O3 that let you (coarsely) turn the knobs on how far you want to optimize. However, I feel like it'd be interesting to look at more intelligent/efficient methods for applying optimizations that reduce the penalty of having a larger suite of optimizations, especially when a majority have a good chance of not being applied to any particular program. When I worked with Haskell at an internship years ago, I remember the production build times were excruciatingly long due to GHC being absolutely massive. I'm not sure if it's the same situation mentioned in the paper where there are many optimizations that don't get applied (I'm sure many were). Nevertheless, I think it'd be super cool to look into. Does anyone know if this is an active area of research?
-
It goes without saying that the correctness of compilers is critical and almost always expected. As the authors remark in the paper,
Most people wouldn't even think about the possibility of a compiler bug when debugging, so this paper undoubtedly addresses some important problems. I was shocked and impressed to find that Alive managed to catch eight bugs in such a widely used compiler infrastructure. Some of them were subtle (e.g. different ranges of inputs), and others produced incorrect expressions. While this work is promising and already impactful, its scope is somewhat limited. As others have noted,
Moreover, (the non-experimental version of) Alive does not support branches, which are integral to many programs. I wonder if they have included control flow in Alive2. I don't know much about SMT solvers so I didn't fully understand some parts of the paper, but hopefully I will! It was also interesting to see how the authors handled different features of the language (e.g. keeping track of allocated memory with
-
This was a needed reading! We have seen compiler optimizations as different islands, each doing its own task. Of course, we have also seen generalizations, but it still seemed more like a "big bag of algorithms", just like the field of networking - according to Scott Shenker - was more like a "big bag of protocols" before the software-defined networking paradigm. It was heartening to see that there actually are systematic ways to abstract correctness for this bag and to talk about it formally.

Specifically, I love the abstraction of "undefined" stuff. I have been constantly struggling with this in my optimizations this semester, as there is always this tension between "what to cover" and "what to leave for the developer". It turns out that the solution for a correctness analysis was just to abstract all that away into quantification over the possible values. Of course, this cannot capture all the ambiguity, since there might be infinitely many cases to consider, but even the limited version manages to capture subtle bugs. I am wondering whether this approach can be augmented with model checking, through (perhaps runtime) division of the state space into a finite set.

A cute part of the paper was the difference between undefined and poisoned values. It reminded me of the difference between immediate domination and the dominance frontier; it seems like there is a fundamental role played by this difference between the immediate and post-immediate reach of an operation in compilers.

Finally, to go back to the big bag and abstraction issue, you might have noticed that some of the optimizations we have been doing have a pretty solid algebraic structure. This is actually accurate, and there is a paper by Dexter Kozen and Maria-Cristina Patron that delves into it through the lens of the famous KAT framework: Certification of Compiler Optimizations
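On the quantification-over-undef point above: here is a toy framing of how an undef operand turns the correctness check into a quantified query (my own sketch of the idea, not Alive's exact encoding):

```python
# Sketch: with an undef operand, refinement becomes a quantified query
# (a toy framing of the idea, not Alive's exact encoding). Requires z3-solver.
from z3 import BitVec, Exists, prove

u = BitVec("u", 8)   # the undef operand: the source may pick any value for it

# Transform: "u | 1" (u undef)  ->  the constant 1.
# This is valid iff every value the target can take (here, just 1) is producible
# by the source for SOME choice of u -- an existential obligation.
prove(Exists([u], (u | 1) == 1))   # prints "proved": pick u = 0

# The reverse replacement, 1 -> "u | 1" with u undef, would need every value of
# "u | 1" to be producible by the constant 1, which fails (e.g. u = 2 gives 3).
```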
-
In section 3.3.3, the authors mention that "Many SMT solvers do not support the theory of arrays efficiently", so they "explore encoding memory operations without arrays". This reminds me of my experience generating zero-knowledge proofs. Producing a zero-knowledge proof usually also requires compiling a high-level language down to a set of constraints, and there too memory access has to be encoded with other techniques, which can likewise produce a quadratic increase in the size of the constraints. It's exciting to find a connection here. [1] Efficient RAM and control flow in verifiable outsourced computation
-
I found the verification efforts in this paper interesting to compare and contrast with hardware emulation, which also strives to perform a search over function-compatible building blocks, but at the hardware gate level instead of the software instruction layer, for instance when developing AMD chips capable of reasonable x86 application performance [1]. In particular, the use of a fast gate-level FPGA hardware emulator makes me wonder about exploring the available chip(let) design space for runtime optimization.
-
Hi all, this is the discussion thread for next Thursday (September 28). The selected paper is Provably Correct Peephole Optimizations with Alive. Please post your comments/questions here!