-
Notifications
You must be signed in to change notification settings - Fork 126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
west update fails on submodules with relative paths #545
Comments
Good catch. As a coincidence I did notice recently (outside west) that relative submodules paths indeed assume there is an @mbolivar-nordic can you remind us why |
It doesn't, actually. The issue description explains what it does: " west [...] adds remote with name specified in west.yml". The value is determined by zephyr/west.yml in this case. https://github.com/zephyrproject-rtos/zephyr/blob/045c4bbf72d358db320c23be1caa3b5564fb3e2c/west.yml#L22
West leaves this up to the user. However, So I am concerned that adding any remote to "fix" this issue is going to break that guarantee.
I don't know what this means. @Mierunski can you please provide a manifest and steps to reproduce? |
It's simply a .gitmodules file without a hostname. Useful for mirroring. Here's one example: thesofproject/sof@250ab3875c23
@Mierunski can you please change your west.yml and replace |
Changing in west.yml to If it is still required I can create and example repository with tree that reproduces this issue. |
This is quite crucial for us, and I believe this is a bug in west that it can't handle relative submodules properly and it should be treated as a bug |
And what manifest?
Yes, I need steps to reproduce, please. |
rm -rf zeptest/
mkdir zeptest/
cd zeptest/
west init
sed -i -e '/remotes:/ a\
- name: not_origin\
url-base: https://github.com/thesofproject/' zephyr/west.yml
sed -i -e '/name: sof/ a\
remote: not_origin\
submodules: true' zephyr/west.yml
git -C zephyr diff
west update sof fails like this:
Works fine when replacing |
What is this starting from? Is this a zephyr repository? |
Please give me something I can copy paste into my terminal and reproduce. |
Empty directory.
mkdir added. |
Thanks! Sorry to require hand-holding, but I didn't write west's submodule support, and I don't know much about submodules. I really needed some help to get this failure on my desk. Here is what I've figured out so far. By replacing
So Trying to figure out why, I found that the
Higher up in the page, the
The sof module's
And the
Since we detach HEAD after checking out On a hunch, I tried these commands from
And that worked fine. So it looks like west can't run I'm guessing we will have to do something like this in the general case:
@marc-hb and @Mierunski, you both seem to know a lot about submodules. Does the above look reasonable to you? |
I am a bit concerned about how |
The analysis seems 100% correct and the suggested fix sounds good and like it should work. I haven't yet thought and tried to find other possible way(s) to approach this. Note tentative and "unconditional" fix #546 already submitted.
I think it will be fine because git, not west initializes the first level of submodules. So they will have cc: @kkasperczyk-no |
Sounds like it should work, however manually setting urls of all submodules and detecting them seems prone to errors instead of letting git handle it. I'm fine with it, but I won't be able to provide patch for it. Proposed PR passes tox tests and is currently used in our internal CI with success. |
The problem with #546 is that it breaks a guarantee we have had since the beginning, namely that your working tree's remotes do not matter to west, and you can change them at will. I tried to explain that in #545 (comment). I should have been clearer; this is a blocker for me. |
Curious what @kkasperczyk-no thinks and if he has time to address this issue. If not, I'll try to find some time for it. |
I was worried about this, but there seems to be a simple solution. from urllib.parse import urlparse, urljoin
def get_submodule_url(superproject_url, submodule_url):
submodule_parsed = urlparse(submodule_url)
if submodule_parsed.scheme:
# The submodule URL has a scheme, so it's not relative.
return submodule_url
if not superproject_url.endswith('/'):
# If we don't terminate the url with a /, urljoin will replace
# the final path component. This is inconsistent with how git
# behaves.
superproject_url = f'{superproject_url}/'
return urljoin(superproject_url, submodule_url)
sof_url = 'https://github.com/zephyrproject-rtos/sof'
rimage_url = '../rimage.git'
fake_submodule_url = 'https://github.com/foo/bar'
print(f'{get_submodule_url(sof_url, rimage_url)=}')
print(f'{get_submodule_url(sof_url + "/", rimage_url)=}')
print(f'{get_submodule_url(sof_url, fake_submodule_url)=}')
print(f'{get_submodule_url(sof_url + "/", fake_submodule_url)=}') This prints:
|
The above works with superproject URLs that are paths, too. For example |
There are some pathological examples (see https://datatracker.ietf.org/doc/html/rfc3986.html#section-5.4 and https://www.rfc-editor.org/errata/eid4547) where git does different things, but in normal cases this seems to work fine as far as I can tell, and even the RFC editors don't seem to have a good handle on the edge cases. |
So I tested --recursive. I didn't use
I also found the first "bug":
This is of course missing I also noticed the "normal" submodule process (as opposed to the tentative west solution) does this by default:
This variable is missing at the first level when setting An important warning: don't forget the Finally, I learned something about submodules $names. I used to believe that they were purely informative but they have an important role actually: they connect the url in Hope this helps. |
I agree, I would like that Note, for some projects, like |
But how will you determine the The But there is no fork of So even setting the And from the manifest we cannot guess that this URL: My take on this would be that the Zephyr fork in use should update here:
to not use relative paths (as that is not expected to work when the submodule is not forked) and only use URLs of the form And then describe the limitation that if you're going to use submodules in |
While I'm not a fan of #546 either, note it does not interfere with any
This is because Zephyr does not need I used this example because @mbolivar-nordic was asking for something easy to copy/paste and because I knew about it but it's not the perfect example. To perfect this example just replace https://github.com/zephyrproject-rtos/sof with https://github.com/thesofproject/sof in the manifest and move on. |
@tejlmand I'm afraid you also missed this:
|
and it depends on it, that's the main problem.
Ok, let me elaborate on the issue I see with Today, This means I can use a single west workspace when working directly in Zephyr, but also in nRF Connect SDK, and I even use that very same workspace to test out NCS downstream manifest if users are experiencing problems. I can do that the following way:
all of this inside the same workspace. Now, let's imagine for a second that Zephyr also forked But what if nRF Connect SDK needs to patch then when switching manifest repo to It could also be the other way around. Therefore, The ability to use a single workspace and just switch manifest repo has been possible since the dawn of Although I see value in also having limited submodule support in I would like to repeat an earlier statement from another PR:
I think that if |
I'm not sure you noticed that there is an alternative to #546 described above where So we have:
You explained that these two features conflict each other, and I bet you're right in the typical case. Long story short, the mismatch is because git submodules depends on git remotes whereas There are corner cases where these two features can probably work together: for instance github lets you fetch commits located in a fork even from the wrong remote but this is admittedly github-specific. There may be other fetching tricks to make it work (one below) but I agree that these two features seem to conflict with each other in the most common case. So your answer to this conflict is to make I strongly disagree simply because there are west users who don't care about feature 1 and do care about feature 2 (or don't even have a choice in the matter). So this would really be throwing the baby with the bathwater There may even be users who want to use feature 1 only in one software project and feature 2 only in another software project depending on what they're working on.
I can name a gazillions of git horrors but I really fail to see how relative submodules qualify as a "bad git workflow". A conflict with west feature 1 is the only problem I see and most owners of relative submodules probably don't care about west at all. As @Mierunski mentioned before, relative submodules are even a gitlab requirement Supporting git submodules in west always felt a bit of hack anyway considering west and submodules are effectively two competing solutions for the exact same problem. So conflicts and mismatches like this one are bound to happen. I hope there are better ways to manage these conflicts than blocking entire git features. Here for instance, users who want to use both feature 1. and 2. may prefer to manually In fact In terms of "conflict management" error messages matter a lot. In this case it will be:
which I find extremely clear and helpful (didn't think I would write this about submodules one day...)
It's great that |
I did notice the discussion on this but not the details so I was under the impression that the config approach would behave similar to But I see the config approach would work to some degree, and it avoids creation of a But I share the same concern regarding conflicting solutions.
There might be other fully documented git features not working fully with So I don't see that as an argument in itself. Supporting
Even outside of So let's clone locally, that could be my local or my fork.
and if I had cloned my tejlmand fork first and afterwards added If I make commits to the forked submodule and push, then on a new I have heard many complaints from others when working with submodules, and guess this could be one more candidate, but I have never heard complaints on how After starting to use
Or maybe we should go one step back. A solution that actually satisfies both 1 and 2 above, that is being able to checkout the submodule inside the project, albeit with its SHA placed in manifest. So we could take one step back and consider if the
and yet I live happily without such remote in my Zephyr repo:
|
Yes of course, but unlike worktrees, relative submodules can work perfectly fine for many west users and they already are as @Mierunski proved. They just cause some inconvenience when switching to a different manifest, that's really not the end of the world. Baby, bathwater.
Understood but that ship has sailed, submodules support is here in
git submodules don't support branches, only SHA1s so they absolutely don't care where they're downloaded from. If the mirror/fork is "good enough" then everything works and is fine. With github it's even easier because commits are shared across all forks.
That's what
There's no "correct" remote, with absolute URLs you can have the converse problem: what when I want CI to clone parent+submodule changes of mine for testing? And of course mirroring, Gitlab,...
I'm not pretending relative submodules are perfect for all use cases, far from it. They have drawbacks too. But the reality is: a very significant number of users think they're overall a better tradeoff than absolute submodules, including Gitlab!! There's really no reason for
Then maybe you should not advise submodule users which "terrible workflows" they should avoid? There's plenty of submodules advice available already. Number 1 is of course: use
Yes of course, but we're discussing submodules here. |
actually it is not only when switching manifest. Imagine I have the following manifest, and we'll be using the The manifest file at revision
which in the current proposal would set At a later stage, I decide a local fork of
and because of the fork, the remote has been changed. Now, the problem is that A user starting at revision
Only forks created as a Not a manual fork that has been pushed as a completely new repo, like this, which is also a fork, but not a
No,
so it's taking me back to start (or where the
yes, that is exactly what would be needed if not setting the remote specifically, but not so efficient and behavior like this we try to avoid in
I think you misunderstand what I mean when saying taking one step back. Today submodules can be either:
or:
now the latter is already very close to being a project. That way Should be fairly straight forward for the submodule list, as all projects are listed. A second benefit could be that we could allow users to overrule, for example:
but of course that should be a new enhancement, just showing what possibilities a step back could offer.
👍 And must admit, when I have worked with submodules in the past it have been without the intensive forking we do with Zephyr and in downstream projects. |
Not sure what your example proves but that's not how you use
Interesting, indeed that's not what I had understood. |
The funny thing is the exact same problem happens with absolute submodules: the origin is not updated either. What makes you think the old one is always valid? Only the new one may have the commits you need (hence |
Absolute So the difference is that with an absolute path, then everyone will fail meaning the maintainer of the fork will be notified / a patch updating the absolute url in the fork will be sent which then updates Whereas, with a relative path everything depends on which project that was cloned locally first / what is in If it was the fork, for example: The main difference being, if something is messed up when the url is absolute then I can push a fix that all users will receive on next fetch, and things will start working. My past experiences with |
Now, if you read my message and your own reply:
I exactly say |
So maybe we can get back on track discussing a solution. Of course, understanding how But generally it seems we both have a good understanding of this, but could probably be better and how we describe this knowledge 😉 (goes also for me). @marc-hb So besides being interesting, do you have any comments ? And do you agree the use-case I describe here: #545 (comment)
must work out-of-the-box for a And that |
I get that you may not like submodules, its not like I like them either.
Is fine for me. Just allow for using Gitlab CI without some dirty hacks. |
And that is what we are discussing. If For example:
not an ideal situation but it allows you to continue your work without having to wait for a fix in
Not sure if you noticed this use-case: #545 (comment) That will not work with your proposal. and this comment by @mbolivar-nordic : #545 (comment)
All your proposals makes
We are not making features for gitlab, but features for
Exactly why we are discussing the right solution. I consider your current proposals |
Its not a feature for gitlab, its an existing feature in Git that is not working when used under
I got my hacked solution already deployed internally, don't need anything more personally, just wanted to share something so that other ppl wont waste time on the same thing. |
No they don't. This is why you fail to see that the problem you described happens almost the same with absolute URLs.
No, not "immediately". As I wrote a number of times already, even this requires a You can reproduce this locally and easily thanks to Top (remote) -> middle (local) -> bottom (local) Because of the absolute URLs, the bottom submodule fetches from the top submodule. Now add one brand new commit in the middle submodule. Even if you change the url in the
As documented, There is no "correct" URL for submodules, it depends. In some cases relative URLs are better suited and in other cases absolute URLs are better. Both have valid use cases. Exactly like in a
No I do not, this is the core disagreement. Having a small and well documented impedance mismatch is perfectly acceptable and it keeps things simple: no need to copy/paste/diverge The core problem is you're bent on trying to achieve perfect integration with west. But submodules use remotes differently so that's not possible, at least not without some copy/paste/diverge of remote information which would make things even more complicated. |
This a blocker for what #546 does but you're still missing the alternative solution at the top that is getting completely drowned in this quest for perfect west integration. The alternative solution is letting
This is slipping into bad faith territory. Relative submodules != gitlab.
Yes, and the once-off |
Seems I was mislead by
Please note this was a comment to this sentence: Making After all, take a look at the first sentence here:
Until someone comes along with this use-case: #545 (comment) If user clones the manifest at some point between revision A-M + If we already know this flawed behavior, then why not address it from the start ? You might repeat |
not really, cause if there is no |
The
Because I don't see any really, truly simple solution and because... (as a coincidence I was in the middle of writing this as a mere EDIT when you answered) This is really a corner case so I don't get why you're so passionate about it. It will affect:
When it happens it will:
Morever:
If the (IMHO) very minor issue and corner case you described is the only one then I think it will be an extremely acceptable level of frustration. I don't understand why you're blowing this rare use case and minor inconvenience out of proportion. Part of the rationale for creating
Sorry I don't understand the issue here. As long as the revisions are the expected ones this looks good to me. How's that different from switching between different branches that have a different list of |
Because:
But most importantly I'm being careful that we do not introduce a behavior we could end up having to change later. Right now we have a bug that prevents users from using relative paths, which means we don't change west behavior, cause there isn't one (unless you consider something not working a behavior) And already today we know that west is capable of handle submodules internally as projects without conflicting with
Because the user coming from manifest revision |
@marc-hb please read the comment I was replying to first, before commenting my comment.
where |
Add a regression test for zephyrproject-rtos#545. Mark it xfail for now since we don't have a fix. We'll remove that once we have the fix in place. Signed-off-by: Martí Bolívar <marti.bolivar@nordicsemi.no>
I would like to unblock this discussion. I'm starting with a regression test: #557 Please take a look if you have time, but I will start work on the fix in parallel either way, taking into account the discussion above. |
Add a regression test for zephyrproject-rtos#545. Mark it xfail for now since we don't have a fix. We'll remove that once we have the fix in place. Signed-off-by: Martí Bolívar <marti.bolivar@nordicsemi.no>
Add a regression test for zephyrproject-rtos#545. Mark it xfail for now since we don't have a fix. We'll remove that once we have the fix in place. Signed-off-by: Martí Bolívar <marti.bolivar@nordicsemi.no>
Add a regression test for #545. Mark it xfail for now since we don't have a fix. We'll remove that once we have the fix in place. Signed-off-by: Martí Bolívar <marti.bolivar@nordicsemi.no>
When west initializes project repository it only adds remote with name specified in
west.yml
while git assumes that default remote isorigin
. Relative submodule paths are resolved to default remote.Without
origin
remote,west update
fails while trying to update repository that contains submodules with relative paths, with following warninggit logic for resolving relative paths can be found here https://github.com/git/git/blob/v2.20.1/builtin/submodule--helper.c#L135
Manually adding
origin
remote solves the issue.i'd propose west should also add
origin
remote to projects by default.The text was updated successfully, but these errors were encountered: