-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[run-task] Add the ability to configure submodule checkouts. #512
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for writing this!
I think it largely looks good. There's an important open question further down about caches/existing checkouts that I'd like to clarify before approving this, and one other less important thing.
true, auto or yes: All submodules are checked out.
TIL that auto
is boolean True in python.
Note! This changes the default behavior of tasks! Submodules are NOT checked out by default as a result of this change. But I figured I would write it in the most sensible way first and then grapple with the challenges of how to handle compatibility.
I could go either way on what the default behaviour should be, but I think what you have here is fine. I'm not too stressed about compat - it's a simple enough change for consumers to make when they upgrade, and few enough are using submodules anyways.
@@ -346,6 +462,7 @@ def test_git_checkout( | |||
commit=None, | |||
ssh_key_file=None, | |||
ssh_known_hosts_file=None, | |||
submodules=submodules, | |||
) | |||
|
|||
# Check desired files exist |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an assert further down to check whether the outer repo is at the right revision. There's an assert checking for expected files makes sure something was checked out for the submodules, which I guess is good enough here, since the submodules themselves control which revs are used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct, a submodule in the superproject specifies the exact hash of child project to be checked out, and the git submodule update --force
flag will ensure that the revision specified by the superproject is checked out.
"git", | ||
"submodule", | ||
"init", | ||
p |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checkouts are often cached on individual workers. Will this work correctly if a checkout already contains an initialized submodule at the same (or different) revision?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be technically possible to have different submodules checked out at different revisions, but technically a task should only rely on the submodules that it declares, and those would all get the git submodule update
treatment below.
That being said, I'm starting to think we should default to updating all submodules if SUBMODULES
is not set. I don't think it's going to be easy to audit tasks to tell whether / which submodules they may or may not be using. So preserving backwards compatibility by defaulting to all of them will probably save us some headache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will take a little bit of rethinking how what the env variables should mean to properly convey that (default disabled is easy because an unset environment variable is read as an empty string), but I will see what I can do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think conceptually I do prefer this default behaviour of not checking out any submodules unless SUBMODULES
is specified..
But thinking about it, this might cause us some headaches as it may be difficult to determine which tasks depend on what submodules as we upgrade Taskgraph places.
Can we refactor things so that not specifying the env preserves backwards compatibility?
Sorry for the delay in getting back to you, I got distracted by PTO :) |
This Implements #504 and adds a bunch of tests to verify the behaviour of
run-task
when it comes to handling checkouts and their git submodules.The first major change is to
run-task
which now checks for an environment variable of the form<reponame>_SUBMODULES
to control how to handle git submodules. This variable can take the following values:true
,auto
oryes
: All submodules are checked out.The second set of changes addresses the run-task transforms so that this behaviour can be controlled by a YAML task description. In this stream of work we extend the
RepoConfig
class to include an optionalsubmodules
value, which can take one of the following values:None
orFalse
: No submodules are checked out.True
: All submodules are checked out.Note! This changes the default behavior of tasks! Submodules are NOT checked out by default as a result of this change. But I figured I would write it in the most sensible way first and then grapple with the challenges of how to handle compatibility.