-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add initial support LLVM stable (v19) #23311
base: main
Are you sure you want to change the base?
Conversation
208a081
to
d26814e
Compare
Run the majority of the `other` under llvm stable. There are still some features that don't seem to work. Generate and experimental warning for now, at least until we can fix the known issues. Fixes: emscripten-core#11362
This is quite a lot of additional testing. Do we know we actually want this? From our quick discussion before, I wasn't certain of the benefit here. If the goal is to support linux distros, being able to tell them "use this emscripten with your stable LLVM version N", then that is possible without this. This PR allows us to say that a particular emscripten commit works with two LLVM verisons - but why do we want that specifically? |
I feel like this way I would feel a lot more confident recommending using a stable LLVM release. The proposal of "use this emscripten N with your stable LLVM version M" doesn't actually test that M and N are compatiable, its just somewhat likely due to branch point closeness. This new approach has the advantage that we actually continuously the test that all emscripten releases are compatible with at least one stable LLVM release. This is a much higher bar IMHO and it means that we know it true, at least for the subset of features we test. It also means that we have an entire revision range that we can recomment. i.e. "emscripten version A-B is known to work with LLVM M". |
I think it can, as a slightly modified version of previous ideas. In the past, yeah, we said "get the closest version", and you're right that without explicit testing that seems risky. But if we pull from LLVM say 20 times a year (assuming we are moving to a model with less frequent rolls), then if the LLVM stable releases from the year are among those 20, then we get an "emscripten with stable LLVM release" for free. |
One problem with that approach is that LLVM releases are never made from the trunk. They are always on a branch with a whole bunch of backports. I agree however testing against the branch point (where trunk and the release branch share common ancestor) would offer a pretty good guarantee. I guess there is nothing stopping us from choosing actual release branches? However I'm not sure was what to link our continuous updates to LLVM releases like that. I like the idea of the LLVM roller being flexible. Are you suggesting that we have emscripten follow each new release branch up until the point where we actaully need something from tot, and which point we switch it back to tot? |
Also, we them end up deviding emscripten releases into those that we llvm-stable compatible and those that are not. With this approach all emscripten releases are packagable with at least one llvm stable release. I don't really want to be a in the position of saying its simple not feasible to package certain emscripten releases since they didn't happen to conincide with an llvm release. I think it could be good if all emscripten versions should be packagable. |
Good questions, I'm not sure of the details. But I hope we could use the release branches, so we would have an emscripten release that literally uses the same LLVM as the LLVM release that came out that week. And the next update after would be from the main branch once more, etc. |
@kripken I think what you are suggesting is that in the emscripten code we essentially do exactly what we are doing now (where every revision of emscripten supports exactly one revision of LLVM), but then just ensure that some revision of emscripten is released/tested with a release version of LLVM? If we do that, then anyone who wants to use a release version of LLVM will only ever be able to use exactly one version of emscripten. This would be only a slight improvement over what we have now (i.e. an emscripten version that works with each LLVM release probably does exist today if you find the right one, but this would guarantee and test it). But there would be no way to update Emscripten for such users. This PR as written here would at least allow anyone with the most recent release of LLVM to get emscripten updates until the next stable release. The required change in this PR is quite minimaI. I would also be curious to see what happens if you try LLVM 18. In terms of implementation, a more flexible option (compared to having just
Then whenever we wanted to drop support for LLVM 19, we could just find all the places we check for that version specifically, and delete them. We could also use a mechanism like this for newer revisions that aren't part of a release yet, if we wanted to use e.g. Google's tested revisions. It would need to be slightly more sophisticated than just the major version number to handle unreleased revisions but I'm sure we could figure something out. |
Technically with our DEPS system nothing is stopping us from shipping an LLVM revision on a release branch. |
Also I do think it would be OK to run these tests on the Chromium post-commit bots that don't block the roll, especially if we keep the autorollers turned on. Breakages probably won't be too frequent, and we'll find out about them pretty quickly. And we can run as many tests as we want. |
While this is extra testing burden, I hope it won't be too much extra maintenance burden. Since the tests run in parallel it doesn't actually cause PR to have to wait longer (since this is not the slowest job). The the concern about testing time is really a concern about using more circleci budget I guess? In that can we could consider (as Derek says) running these tests on the chromium CI.. or we could use github actions to run them instead of circleci? Alternatively we could look into self hosted test bots as a way to reduce the financial impact of adding new bots. |
To be clear, I'm not opposed to this PR or this feature. I'm just still not sure what it gains for us. In particular, why is it important for people to be able to use more than one Emscripten version with a particular LLVM stable? For Linux distros I imagine one is enough, similarly to how a single Binaryen version is enough for a particular LLVM as well. However, note that you can use multiple Emscriptens even with what I proposed: if Emscripten 4.18 uses LLVM stable, and the next time we update LLVM is 4.33, then 4.18-4.32 are all using LLVM stable. (This assumes we are going to a slower cadence of updates from LLVM, as discussed earlier.) Btw, another way to look at it is this: Imagine we are starting from scratch when thinking about LLVM updates. Then a natural thing to do is to update LLVM on each stable release (as e.g. wasi-sdk does). Emscripten can do just that, but also roll in incremental checkpoints in the middle, at manually chosen times, whenever we have some update that we want to pull in. |
It sounds like you are proposing a world were with "pin to stable when possible" and "jump to head when needed", in the hope that that later will be possible for a lot of the time? I'm not necessarily opposed to that idea, but my idea is more like the best of both worlds. It allows us to be both always be pinned to stable and always be forward looking at the same time. There is of course a cost to supporting both which is twofold:
I think if we can show that both (1) and (2) are small enough then it could be worth it. The questions remains as to is how to define "small enough" and how to measure these costs. |
I just don't see yet the benefit from supporting both. Which users specifically do you see as being helped by this? But I agree the complexity cost is not huge, so the bar is not very high. |
Any user wanting to package the current version of emscripten today would benefit. Package maintainers try to package the current versions of software and there is an expectation that current version of X to be compatible with the current versions of its dependencies X, Y and Z. Imagine you are packaging emscripten today and you want it to be compatible with the llvm package which your colleagues are working on (v19). You would not expect to be forced to go back in time to some older version of emscripten from last yeat, especially if there are features in the latest emscripten version that your users will want. What is more, I'm not just convinced that emscripten will want to track LLVM stable for any significant length of time. This meaning that most of the time emscripten will be tracking tot, which means most emscripten versions will be "unpackagable with llvm stable" regardless of which version of llvm stable you choose. |
In my experience, and this is where we might be seeing different things, package maintainers want the latest stable version, and they are ok if the stable version is a bit older. It's common for them to package a version that is several months old - e.g. LLVM is often 6 months old in Linux distros, if not more. In my proposal, the Emscripten version they use would be just as old as the LLVM version. So I think it would be acceptable. But, again, we may be thinking of different users here. I have in mind Linux distro packagers. What kind of user do you see that needs the very latest Emscripten and also wants an older LLVM? (I do agree that for such a user, what you propose would be ideal.) |
I would argue that at any given time the "stable" version of LLVM is up to 6 months old but the "stable" version of emscripten is only a couple of weeks old. This is because emscripten does much more frequent releases than llvm. Perhaps you are correct and packagers would ok going back 6 months when packaging emscripten. My other point still stands: We should not have some version of emscripten that are unpackagable. Something certain emscripten versions a shown to have issues and package maintainers should be allowed to pick whichever emscripten release suites them. You are correct, we could "bless" certain emscripten releases as being packaging-compatible, but that doesn't seem like a great outcome. Also we have already missed the window so I guess that first time we could execute the plan your are suggesting is when v21 branch is created? |
I see, thanks for explaining. You are right that this might help sometimes. But with that said, if we update somewhat rarely, I am not that worried here. First, if say LLVM 20 is compatible with Emscripten versions 4.20-4.24, but not 4.25-4.40, then any bug in the first release 4.20 would have had a few releases to get fixed, and packagers could use those. Second, AFAIK it is standard practice for packagers to do such backporting themselves in Linux distros, when there is a justified reason (like a security patch). |
I'm still feel like the solution you are proposing means its hard to make strong guarantees about compatibility though. Even if emscripten tracks a release branch for a while while its first created, how can we be sure we will track it long enough that we will actually get to the released version. IIRC release branches exist for many weeks (months?) before the release is created. Can we commit to tracking each new release branch at least that long? It sounds like this alternative means forces a certain llvm rolling schedule on us, which is quite a big change from what we have today. |
I think I'd say that my intuition is about like Sam, in that it seems not great that some (probably most?) versions of Emscripten wouldn't be packageable, and that it would be better if Emscripten could always support some stable version. But also I don't have a clear picture yet of exactly who that would serve (e.g. are there people trying to maintain their own updated emscripten package on top of an LLVM that they can't update, such as from their distro? Or do distro packagers want to update Emscripten more often than however often they update LLVM?) Regardless of whether we go to updating LLVM less often, I do think we'll want to be able to get changes into EMSDK more often than every 6 months, and we'll want to maintain the ability to update LLVM to the tip of trunk on-demand to bring in important improvements or fixes. So I think that would mean that most Emscripten versions wouldn't just accidentally correspond primarily to a release. |
Good point, if those can take months it seems impractical. Out of curiosity, why is it that long, do you know? Overall you and @dschuff make good points, so I'm ok with your approach. |
For LLVM 19 it looks like it was branched on July 23, and marked final on September 17, with 4 RCs in between, so there was quite a bit of stabilization work in that time. That of course is the value of the release process, with all those patches on the branch reducing the number of bugs in the release. I've been thinking of "6 months" as the upper bound on the time for a change to reach stable, but maybe 2-8 is a more accurate range. |
I'm not sure if this PR would help with my situation, but for another use case: I'm linking Rust and Emscripten. They each bundle their own LLVM, but so far I haven't run into any problems even though they have different major versions of LLVM. I guess that's still a possibility in the future (if they diverge on what a WASM is meant to look like?) However the one thing that would need the same LLVM version would be cross-language LTO. My impression is that Emscripten gets new LLVM versions quickly, so that when Emscripten first switches to "19" that it's not necessarily going to be the same as the final version 19 - does it start essentially the same as 18 and then over time move to what will be stabilised as 19? So I think that simply watching for LLVM PRs here doesn't reliably indicate what the actual LLVM version a release of Emscripten is using, and I'm not sure how else to check. So greater clarity on the LLVM status with each release would be helpful. The possibility of switching to a system version of stable LLVM could potentially make it easier to get cross-language LTO working. But because Emscripten seems to switch to new LLVM versions much quicker than Rust I'm not sure it's ever going to be feasible, unless I stick with quite old Emscripten releases, which I often can't do because I run into bugs. |
Yes that is basically correct. We track tip-of-tree LLVM today, which means the LLVM version reported is always + higher than the current release branch. When LLVM create a new branch for vN then update the main branch to be vN+1 .. this is why emscripten currently requires v20 even though v20 has not be released and doesn't even have a release branch yet. |
I think it makes sense to have a functionality like this, maybe as an undocumented experiment for now. I think we'd want to try to see how many places we end up accumulating in the code where we have a conditional on the LLVM version, or to disable a test, and we can see what kind of testing matrix we end up with. I'd also be interesting in finding out from users who use this, what they need in terms of how far back we'd need support. Do we just immediately drop support for LLVM 19 as soon as LLVM 20.1.0 (which is the first non-RC release, despite the 1 in the minor version number) releases? Or does there need to be some overlap? wrt what I said above though: what do you think of making it more granular than "tot" vs "the most recent stable"? That would allow for a period of overlap if we want it, and also allow for having "stable" vs "current release" vs "future tot" if we want that. |
Run the majority of the
other
under llvm stable. There are stillsome features that don't seem to work.
Generate and experimental warning for now, at least until we can fix
the known issues.
Fixes: #11362