Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: prevent false positives in already running k0s instance detection #5400

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

peterhoneder
Copy link

Description

When running k0s in one of our deployments, we noticed that during startup with an existing runtime config file, it thinks it's running, but it isn't since the system just started up from scratch. The root cause is, that the detection in the linux runtime works by checking if the pid from the runtime config is a running process on the system, but not if the pid is actually the same executable image.

So what happens is, that if any other process during startup takes up that pid value from the runtime config, k0s thinks it is already running although it isn't.

I guess an even better method to solve this would be to hold a lock file instead of writing a pid value in the yaml, but the fix right now addresses the topic by checking if the pid value is from the same executable image, and if not, it will allow k0s to be started although the pid value is taken by another process.

Fixes #5399

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

How Has This Been Tested?

  • Manual test
  • Auto test added

The tests extend the current runtime tests by 3 more cases:

  • check if NewRuntimeConfig detects if a pid from the runtime config is running with the same executable image as its own
  • check if another process is running with the pid value from the runtime config, but is from a different executable image, that it will not result in ErrK0sAlreadyRunning
  • check if no process is running under the pid from the runtime config, it will also work correctly (no error)

Checklist:

  • My code follows the style guidelines of this project
  • My commit messages are signed-off
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

@peterhoneder peterhoneder requested review from a team as code owners January 3, 2025 11:46
@peterhoneder peterhoneder marked this pull request as draft January 3, 2025 12:35
@peterhoneder peterhoneder force-pushed the fix-runtime-pid-checks branch from e19014e to de62d7c Compare January 3, 2025 17:22
The main bugfix is to check if a possibly running pid from the runtime
config not only exists as a process, but also is the same excutable
image that is checking.

All platforms use a common library for ps-style checks now in this
implementation.

This changes the behavior for all platforms to use a common library
to perform the "is k0s running" checks on startup. The advantage is,
that the code is much more simple, the disadvantage that an additional
library is now required with additional sub-dependencies.

This commit also refactors the function checkPid into checkInstanceRunning,
since the previous implementation did not allow for more complex error
handling and would have hidden errors. More complex error handling is
now required because the implementation is more complex.

Signed-off-by: Peter Honeder <peter.honeder@unwired.at>
@peterhoneder peterhoneder force-pushed the fix-runtime-pid-checks branch from de62d7c to 0f4f23e Compare January 3, 2025 17:35
@peterhoneder peterhoneder marked this pull request as ready for review January 3, 2025 19:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

k0s error "an instance of k0s is already running" but it is not running
1 participant