-
Notifications
You must be signed in to change notification settings - Fork 126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow modifying remote/project urls of imported manifests #615
Comments
Did you take a look at |
Thanks for pointing that out. I've overlooked this issue. But I'm not quite sure what to make of it. The solution proposed is to use the url.<base>.insteadOf config option of git, which I had considered above. But if I see it correctly this mandates that every developer has to configure their git installation accordingly. And this changes the servers used to checkout without that fact being reflected in some |
I believe the salient point is "anything 'dynamic' in imports instantly becomes insanely complicated". Quoting @mbolivar-nordic in #565
Quoting him again in #608
To be fair, mirroring does not change any git version so it's not actually "dynamic". I'm afraid mirroring is unfortunate "collateral damage" of the more general complexity of changing imports on the fly. My 2 cents, I'm merely trying (and probably failing) to follow @mbolivar-nordic's expert thoughts on this. EDIT: https://github.com/zephyrproject-rtos/west/issues?q=label%3A%22Partial+imports%22 |
I'm instinctively not a fan of regular expressions and "rewrites" for this. It seems vastly overkill to just point at a different git clone. git It could/should be something as simple git's
I don't understand why. Mirroring should make absolutely no functional difference; it should only affect build performance. If mirroring makes any functional difference then it's a bug. |
Wait, this requirement is actually much worse than I thought: what if I want to use my own, personal/internal mirrors for maximum performance and fault tolerance? Or because my network is isolated for security reasons? I do NOT want to touch any manifest files in that case because comparing manifests is the very first step to make sure I'm building exactly the same thing as someone else - including for security reasons but not just[*] |
It's meant to ensure that we have everything necessary to build a project in-house so that in a couple of years from now no matter what happens we are able to reproduce the binaries. So I want to make it obvious that our internal mirror is used and I want to make it the default by putting the rewrite rule (or whatever method) on the company_library_repository level (because every application project is importing this repository). |
The biggest innovation brought by decentralized version control systems like git and mercurial is that the server does not matter. All clones are equivalent, they all have the same content (except of course for brand new work in progress not synchronized yet). Hosting and content are now unrelated to each other. Mirroring and backups come "for free". In fact this "content-addressable distributed storage" idea is so good that it's being extended beyond git to pretty much everything: https://en.wikipedia.org/wiki/InterPlanetary_File_System To "ensure you have everything necessary" specific server names do not and must not matter.
If that mattered then there would be a flaw in your design. |
That's true as long as one can find the repository somewhere. One could argue that this will be always the case for any valuable piece of source code. But I don't want to rely on somebody else to do the backup for me. Maybe we have different standpoints regarding this subject which explains our discussion. So how do we proceed? Should we wait for @mbolivar-nordic to have a look at this issue? |
Yes, and? I don't understand the point you're trying to make here sorry.
To perform backups I can (and I do) clone my own repos as many times as I like, I don't have to wait for someone else to do it. |
I try to summarize it. I hope this makes my thoughts clear.
|
Thanks for making sure there is no misunderstanding. My point is: 3. and 4. must not be in a manifest. The final mirroring decision must belong to a local configuration file. Having default values and sample configuration files in git is fine but the final mirroring decision must be in an (optional) local file that is not tracked in git. Even when you and I build from different mirrors we must keep the ability to compare what we build and make sure it's the same thing all the way. This is becoming a legal requirement for obvious security reasons: https://www.cisa.gov/sbom The very first step (not the only one) to compare our builds is to make sure we build from the same manifests and that our git (and mercurial, etc.) were a bit of a revolution because they totally decoupled location and version control thanks to the magic of checksumming everything. "Download from anywhere; I don't care". Git submodules kept that critical decoupling: you can mirror submodules and download them from anywhere and they still work. Git submodules come with a default location but it can be redirected in West manifests should certainly not regress this wonderful design decision.
This does not prove anything. You could still have a build script downloading something without west and git knowing anything. For instance CMake's "ExternalProject" does that routinely. I'm not 100% sure but I think it has happened in Zephyr: https://github.com/zephyrproject-rtos/zephyr/issues?q=externalproject. There are many other ways to download from the Internet. The only way to make really sure is to test your mirror while having no internet access. |
Just remembered the Zephyr project has a EDIT: here's another example of code being downloaded at build time. It's wrong but not uncommon |
Happy new year! @marc-hb thank you for collecting all the backlogs!
Yep, the way import works internally is more complicated than people The internal representations west uses to keep everything consistent
I thought about it a bit and we may be able to salvage some sort of
This does not seem to address the core problem, which as I understand
I think that something like this is probably doable, though I share Let me start by saying that while I don't have time to implement this The function that needs to be hooked is Line 2073 in fc278d4
This is what's responsible for taking the parsed YAML data and We need to track any context-specific information that we may have The correct place to put this information is Line 266 in fc278d4
The current import context is always available as
@jobasto is this something you would be willing to take on? |
Happy new year!
The goal is to have a "catch-all" rewrite rule that applies to all imports if no other rewrite rule is specified. One could also imagine to put a rule in the remotes section: manifest:
defaults:
remote: upstream
remotes:
- name: upstream
url-base: https://github.com/zephyrproject-rtos
url-rewrite: [...] However, I don't need it for my use case as I've only one import where I want to rewrite urls. So we can keep things simple and leave that out. But I'm not sure how far we should think ahead. For example, I only need one rewrite rule. So rewrite-url:
from: "from regexp"
to: "to regexp" would be enough for me. But I can imagine that maybe somebody would need multiple rules. So I would prefer to have a list of rules even if I put only one rule in that list.
Regarding regular expressions the rationale was a bit like before: Keep it flexible. But I don't need it at the moment. So we could replace the beginning of the url as git does. If the need to have regular expressions should ever arise we could always extend the syntax, e.g. rewrite-url:
from: "from regexp"
to: "to regexp"
regexp: true
Yes, I would. And thanks a lot for offering guidance. |
@marc-hb :
I think this is a separate feature request from what @jobasto is asking for. West has no features for populating configuration variables to certain values by default. Putting this in the manifest is the only way to guarantee that other users are a You could certainly additionally add configuration file support for overriding URLs, the same way we have configuration file support for group filters in addition to manifest file syntax for specifying group filters. However that does not seem to address the concrete use case here. Furthermore upon reflection I am a bit leery of adding configuration file support for this. Anything that allows users to silently get something different when they run |
This feels like overkill to me unless you have some examples where it really saves a lot of typing. If the idea of restricting these rules to live inside an Remember west is:
Agreed; anywhere that one rewrite rule can appear, we should ideally allow several. Though then the semantics of what happens when multiple rules apply needs to be clearly specified. I suggest "first match wins, subsequent matches are ignored". Thoughts?
Agreed -- please feel free to design a syntax that is future-proof enough to make regular expressions as a future extension, but let's limit this to what you need, and what git can do, at least for now / until we have a clear use case.
Great, thanks! I'm watching the repository and will review. Feel free to ask questions or chat further on the |
Note that this is a little subtle since we can inherit rules from multiple levels of parents. |
I'm afraid we're talking across each other. The default value for mirroring is obviously "no mirroring", so there's no need for west to provide default mirroring configuration variables.
@jobasto and I have clearly different mirroring objectives but it's still mirroring in both cases. @jobasto wants to make sure different people using the same top-level manifest hit "exactly the mirrors specified". I believe there is "strong and compelling" (and very different) use case for "invisible mirroring" where mirroring is a local decision; making sure only THE CONTENT is the same no matter where it was downloaded from. In fact it's not me believing that but basically anyone working on content-addressable storage, including the people who designed git and friends. See all the references above. I think these two different mirroring objectives are not mutually exclusive and I suspect they could share most of their "remote rewriting" implementation besides the user interface.
OK so maybe we're not talking across each other? |
I don't understand this. If the If on the other hand the So mirroring will never make any content difference one way or the other - as expected. |
These two cases are strikingly different when you are on a support team and you are positive what's in your company's fork of a repository, but have no idea what's in a random customer's mirror. |
So there are two manifest types: the deterministic ones that provide true content-addressable storage, reproducible builds, safe supply chains etc. and the other ones with a non-zero number of "moving parts". I understand there are some situations where you want to use the latter instead of the former but these should be considered the exception and not the rule. In such cases you probably want to ask users to (re)move ALL their custom configuration anyway for good measure - including git's [*] ... but is not workspace-specific. |
That's a fair counterpoint. OK, I'm fine with "someone" (again, not me) adding configuration options for this as well, but I don't think it should block an initial implementation which doesn't have them, especially if that's all the author has time to submit. We can always add more features later as concrete use cases arise. I am at least a little sympathetic to the "I want to hack something together with my own mirrors and my local config files" -- but if that's the case, |
Thanks, glad we could find some common ground.
Agreed; I'm certainly not expecting anyone to implement something they don't need themselves. On the other hand, I expect the implementation of one feature not to get in the way of an other, future and very similar feature with which it's likely to share most of its code. That's what I care about. |
It's simply global = in the wrong place for some use cases. It's not because you want to use a mirror for some git clones that you want to use a mirror for all git clones / projects / workspaces. |
OK, yep, good point. Thanks. |
I just want to let everybody know that I've written a little code and will open a pull request as soon as I can (which might still take some time). |
Please share code as soon as you can but don't forget @mbolivar-nordic will request some new tests and coverage eventually :-) |
A use case we are trying to figure out would be to have "remotes-override" section as part of your initial manifest repository. This would replace project URLs with a different URL. The use case is when you want to try to retain a 1-to-1 alignment with some internal repositories with external ones that you will periodically push to. But you have a corporate security restriction to not expose internal server names which would normally be in the shared project's west.yml. So your internal repository west.yml will point to external URLs. For internal developers, you would have them start with a different internal manifest repository that would import the primary project but would replace the key URLs with internal server names. @jobasto did you ever put together code to do this? |
I also interested in this. We use nordic's nrf connect sdk for some projects. So this meas in the applications
I think 2. is easier to maintain 1. as it can directly be used for nested projects. |
This objective is completely different from the one in the description at the top: "long-term reproducibility" This major difference has already been discussed above. The purpose of a manifest is to list content. If the content is different then the manifest should be different.
That should not be a problem.
This looks much more complex than just dynamically changing remote URLs. |
yes the objective is different, still it can be done in the same way. in the end, it does not really matter why you may want to replace a remote url. one can do it to point to a fork with a important bugfix, another may just want to use some internal mirror server.
really? so for me it does basically the same... and Golang's usage has already been proven to be useful. |
It matters at the very least from a test coverage and maintenance perspective, and possibly from an implementation complexity perspective too. This matters even more now considering the current maintenance shortage. https://docs.zephyrproject.org/latest/develop/west/why.html#design-constraints
|
Introduction
We use west to checkout all source files necessary for building a firmware and use the import feature as pictured below. This way we don't need to specify the exact version of the modules we want to use because that information is taken from the imported manifests.
Requirement
To ensure long term reproducibility we want to mirror all externally hosted repositories on our internal git server and use this mirror server for cloning.
Possible solutions
User/system specific git config
One way to achieve this would be to use the git url.<base>.insteadOf config option (see also #28 (comment)). However, this option is user and/or system specific but we want that the actually used servers are deducible by examining the project_repository and its imports.
Specifying git config options in manifest
Another way would be to introduce a new manifest file key that allows setting arbitrary git config options. However, west probably shouldn't set user or system wide config options so it would have to set the config option in all imported repositories.
Adding a
rewrite-url
key to the manifestProbably the cleanest way would be to introduce a new key to the manifest file, e.g.
rewrite-url
. An example could look likerewrite-url are a list of rewrite rules that will be applied to a url in order. If manifests are imported multiple levels deep the rewrite rules on the lowest level are applied first.
Open points:
Suggestions?
The text was updated successfully, but these errors were encountered: