Make EphemeralRunnerController MaxConcurrentReconciles configurable #3832

Merged: 4 commits, Dec 11, 2024

mumoshu
Collaborator

@mumoshu mumoshu commented Nov 30, 2024

Improve runner pod startup times by using more goroutines for EphemeralRunner reconciliation. This is roughly equivalent to "more CPUs", assuming you are NOT bound by the K8s API or the network.

Today, controller-runtime, and hence ARC, reconciles EphemeralRunnerPod one at a time, which should limit runner pod creation throughput to roughly one pod per round trip of the GHA JIT config API.

If you are not yet bound by CPU, K8s, network, or the GHA API, setting MaxConcurrentReconciles to N should improve throughput by up to a factor of N.
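
To illustrate the throughput argument, here is a minimal, stdlib-only sketch of a bounded worker pool. It is not ARC or controller-runtime code: `reconcileAll` and `slowCall` are hypothetical names, and the 20 ms sleep merely stands in for one latency-bound API round trip. With one worker the items are handled strictly one after another; with N workers up to N calls are in flight at once.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// reconcileAll drains a queue of n items using the given number of workers,
// loosely mirroring what a max-concurrent-reconciles setting controls.
// It returns how many items were processed and the peak number of
// reconcile calls that were in flight at the same time.
func reconcileAll(n, workers int, reconcile func(item int)) (processed, peak int64) {
	queue := make(chan int)
	var wg sync.WaitGroup
	var inFlight, done, maxSeen int64

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for item := range queue {
				cur := atomic.AddInt64(&inFlight, 1)
				// Record the highest concurrency observed so far.
				for {
					old := atomic.LoadInt64(&maxSeen)
					if cur <= old || atomic.CompareAndSwapInt64(&maxSeen, old, cur) {
						break
					}
				}
				reconcile(item)
				atomic.AddInt64(&inFlight, -1)
				atomic.AddInt64(&done, 1)
			}
		}()
	}

	for i := 0; i < n; i++ {
		queue <- i
	}
	close(queue)
	wg.Wait()
	return atomic.LoadInt64(&done), atomic.LoadInt64(&maxSeen)
}

func main() {
	// Hypothetical stand-in for one latency-bound API round trip.
	slowCall := func(int) { time.Sleep(20 * time.Millisecond) }

	for _, workers := range []int{1, 2, 4} {
		start := time.Now()
		processed, peak := reconcileAll(8, workers, slowCall)
		fmt.Printf("workers=%d processed=%d peak=%d elapsed=%s\n",
			workers, processed, peak, time.Since(start).Round(10*time.Millisecond))
	}
}
```

Elapsed time should shrink roughly in proportion to the worker count only while the simulated call remains the bottleneck, which matches the caveat above about CPU, K8s, network, and GHA API limits.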

This is the PR for the second commit mentioned in #3276 (comment).

@mumoshu mumoshu requested review from toast-gear, rentziass and a team as code owners November 30, 2024 06:02
// This updates the option value only if the environment variable is set.
// If the option is already set (via a command-line flag), the value from the environment variable takes precedence.
func (o *Options) LoadEnv() error {
	v, err := o.getEnvInt("RUNNER_MAX_CONCURRENT_RECONCILES")
Collaborator Author

I added this env var just to make it easy to try out without modifying the chart.

Alternatively, we can modify the manager args template in the chart.

Member

@mumoshu - I recommend we go with modifying the chart and adding this as an extra element under flags:

Collaborator Author

@Link- Thanks! Addressed in 0bebb66

// rather than having to correlate those in multiple places.
func OptionsWithDefault() Options {
	return Options{
		RunnerMaxConcuncurrentReconciles: 2,
Collaborator Author

Making 2 the new default so that users can benefit from the update without any config changes.
Formerly, an unspecified value was treated as 1. A default of 2 shouldn't be harmful, since you can still limit CPU requests/limits at the K8s level anyway.
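
As a sketch of how a single default and a command-line override might fit together, the example below keeps the default in one place and lets a flag replace it. The `Options` type, the `parseOptions` helper, the flag name `runner-max-concurrent-reconciles`, and the "at least 1" check are all assumptions for illustration; the PR's actual wiring may differ.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// Options is a hypothetical stand-in for the controller options.
type Options struct {
	RunnerMaxConcurrentReconciles int
}

// OptionsWithDefault keeps the default value in one place.
func OptionsWithDefault() Options {
	return Options{RunnerMaxConcurrentReconciles: 2}
}

// parseOptions starts from the defaults and lets a flag override them.
// The flag name is assumed for this sketch.
func parseOptions(args []string) (Options, error) {
	opts := OptionsWithDefault()
	fs := flag.NewFlagSet("manager", flag.ContinueOnError)
	fs.IntVar(&opts.RunnerMaxConcurrentReconciles,
		"runner-max-concurrent-reconciles",
		opts.RunnerMaxConcurrentReconciles,
		"maximum number of concurrent reconciles for ephemeral runners")
	if err := fs.Parse(args); err != nil {
		return Options{}, err
	}
	// Sanity check added for this sketch: reject values below 1.
	if opts.RunnerMaxConcurrentReconciles < 1 {
		return Options{}, fmt.Errorf("runner-max-concurrent-reconciles must be >= 1, got %d",
			opts.RunnerMaxConcurrentReconciles)
	}
	return opts, nil
}

func main() {
	opts, err := parseOptions(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Println("max concurrent reconciles:", opts.RunnerMaxConcurrentReconciles)
}
```

Because the flag's default is read from `OptionsWithDefault`, users who pass no flag get 2, and the value does not have to be repeated in the flag definition.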

@Link- Link- self-assigned this Dec 4, 2024
@Link- Link- added the attention Requires attention label Dec 4, 2024
@Link- Link- added this to the gha-runner-scale-set-0.9.4 milestone Dec 4, 2024
Member

@Link- Link- left a comment

Thank you @mumoshu 🙏

@Link- Link- merged commit 3998f6d into master Dec 11, 2024
16 checks passed
@Link- Link- deleted the runner-concurrent-reconcile branch December 11, 2024 20:19
@Link- Link- mentioned this pull request Dec 13, 2024