Make EphemeralRunnerReconciler create runner pods earlier #3831

Merged
merged 3 commits into from
Dec 11, 2024

mumoshu
Collaborator

@mumoshu mumoshu commented Nov 30, 2024

Tweaks the reconciler so that there are fewer requeues in the happy path, improving
runner startup time when dozens or hundreds of jobs are queued.

This is a pull request for the first commit mentioned in my comment at #3276 (comment).
It should fix, or at least alleviate, #3276.

How the improvement works

For the sake of explanation, let's simplify the ephemeral-runner reconciliation logic into two phases: (1) get token and (2) create runner pod.

This change reduces runner pod creation latency by making (2) happen earlier.

How it should work ideally

With three concurrent job startups, the reconciliations would ideally be ordered like this:

reconciliation #1 at t1: job 1 (1)
r #2 at t2: job 1 (2): 1 pod created by t2
r #3 at t3: job 2 (1)
r #4 at t4: job 2 (2): 2 pods created by t4
r #5 at t5: job 3 (1)
r #6 at t6: job 3 (2): 3 pods created by t6

How it works today

With the unnecessary requeues in the happy path, it currently looks like this:

reconciliation #1 at t1: job 1 (1)
r #2 at t2: job 2 (1)
r #3 at t3: job 3 (1)
r #4 at t4: job 1 (2): 1 pod created by t4
r #5 at t5: job 2 (2): 2 pods created by t5
r #6 at t6: job 3 (2): 3 pods created by t6

You can see that the first two jobs take longer to start than in the ideal scenario.
With N concurrent jobs, the more requeues there are in the happy path, the longer the first N-1 jobs take to start.
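
The two orderings above can be reproduced with a toy model. The sketch below is a simplification for illustration, not the controller's actual code: it assumes a single FIFO work queue, one reconciliation per time slot, and exactly two phases per job. The function name `simulate` and its `requeue_between_phases` flag are hypothetical.

```python
from collections import deque


def simulate(num_jobs, phases=2, requeue_between_phases=True):
    """Return {job: time slot at which its pod is created}.

    Toy model (an assumption, not the real reconciler): one FIFO queue,
    one phase of work per time slot, and the pod is created in the last phase.
    """
    queue = deque((job, 1) for job in range(1, num_jobs + 1))
    pod_created_at = {}
    slot = 0
    while queue:
        job, phase = queue.popleft()
        while True:
            slot += 1  # this phase occupies one time slot (t1, t2, ...)
            if phase == phases:
                pod_created_at[job] = slot
                break
            phase += 1
            if requeue_between_phases:
                # Go to the back of the queue behind every other waiting job.
                queue.append((job, phase))
                break
            # Otherwise carry on with the next phase for the same job.
    return pod_created_at


if __name__ == "__main__":
    print(simulate(3, requeue_between_phases=True))   # {1: 4, 2: 5, 3: 6}
    print(simulate(3, requeue_between_phases=False))  # {1: 2, 2: 4, 3: 6}
```

In this model, job i of N gets its pod at slot N + i with the requeue and at slot 2i without it, so the requeue costs job i an extra N - i slots. Only the last job is unaffected, which matches the observation about the first N-1 jobs.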

@mumoshu mumoshu requested review from toast-gear, rentziass and a team as code owners November 30, 2024 05:43
@Link- Link- self-assigned this Dec 4, 2024
@Link- Link- added the attention Requires attention label Dec 4, 2024
@Link- Link- added this to the gha-runner-scale-set-0.9.4 milestone Dec 4, 2024
@Link- Link- merged commit 32ae917 into master Dec 11, 2024
16 checks passed
@Link- Link- deleted the fast-runner-startup branch December 11, 2024 20:28
@Link- Link- mentioned this pull request Dec 13, 2024