Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: join node creates new cluster when initial etcd sync config fails #5151
fix: join node creates new cluster when initial etcd sync config fails #5151
Changes from all commits
4edfec6
808d811
9ccb6f5
917d17b
8f267f4
1037204
e6d4e22
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why the delay was increased in the commit message, or in a comment? If I'm not mistaken this will now block for ~ 17 minutes, whereas it was blocking only around 100 secs before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did the calculation again, as I forgot to take into account the max delay. I think it's 5 minutes overall, not 15 ... 👼
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I increased the number of attempts to 20. So I think it's 127 seconds plus like 12 minutes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. Will it be likely that joining will succeed after 15 minutes, when it didn't after 5 minutes? I'm just thinking if it makes sense to wait for that long, as surrounding tooling may have its own, shorter timeouts. How long does k0sctl wait until it aborts the join process? /cc @kke
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is some discussion here related to why the timeout was increased.
Looks like k0sctl waits 2 minutes but does not face the issue we are facing because join is sequential.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping @kke @twz123 . I'm waiting on a response to the above ^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what is the actual timeout now with this implementation? I don't think it makes sense to have it more than say 5mins, if the joining won't work in the first 5 mins, what would make it work after that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jnummelin ive decreased it to ~ 4 minutes