-
Notifications
You must be signed in to change notification settings - Fork 898
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Ensure workers and miq_worker rows match #22771
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,10 +12,6 @@ def sync_from_system | |
# we only have to sync the list of pods and deployments once | ||
ensure_kube_monitors_started if my_server_is_primary? | ||
|
||
# Before syncing the workers check for any orphaned worker rows that don't have | ||
# a current pod and delete them | ||
cleanup_orphaned_worker_rows | ||
|
||
# Update worker deployments with updated settings such as cpu/memory limits | ||
sync_deployment_settings | ||
end | ||
|
@@ -54,6 +50,14 @@ def cleanup_orphaned_worker_rows | |
end | ||
end | ||
|
||
def cleanup_orphaned_workers | ||
orphaned_pods = current_pods.keys - miq_workers.pluck(:system_uid) | ||
return if orphaned_pods.empty? | ||
|
||
# TODO destroy orphaned pods | ||
orphaned_pods.each { |_pod| } | ||
end | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you add a comment in here just to describe why this is empty? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same comment for the other "empty" methods. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It just isn't implemented yet, I'm testing with Process first |
||
|
||
def cleanup_failed_workers | ||
super | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,7 @@ | ||
class MiqServer::WorkerManagement::Process < MiqServer::WorkerManagement | ||
def sync_from_system | ||
require "sys/proctable" | ||
self.miq_processes = Sys::ProcTable.ps.select { |proc| proc.ppid == my_server.pid } | ||
@miq_processes_by_pid = Sys::ProcTable.ps.select { |proc| proc.ppid == my_server.pid }.index_by(&:pid) | ||
end | ||
|
||
def sync_starting_workers | ||
|
@@ -12,6 +12,22 @@ def sync_stopping_workers | |
MiqWorker.find_all_stopping.to_a | ||
end | ||
|
||
def cleanup_orphaned_worker_rows | ||
orphaned_rows = miq_workers.where.not(:pid => miq_pids) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We use system_uid column in kubernetes - is that column populated with pid in process mode? If so, may be worth using the same column for consistency. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It isn't set for process or systemd but we should do that for consistency. That's going to be another PR but we can probably get that in first and make use of it here. |
||
return if orphaned_rows.empty? | ||
|
||
_log.warn("Removing orphaned worker rows without corresponding processes: #{orphaned_rows.collect(&:pid).inspect}") | ||
orphaned_rows.destroy_all | ||
end | ||
|
||
def cleanup_orphaned_workers | ||
orphaned_workers = miq_pids - miq_workers.pluck(:pid) | ||
return if orphaned_workers.empty? | ||
|
||
_log.warn("Removing orphaned processes without corresponding worker rows: #{orphaned_workers.inspect}") | ||
orphaned_workers.each { |pid| ::Process.kill(9, pid) } | ||
end | ||
|
||
def monitor_workers | ||
super | ||
|
||
|
@@ -74,5 +90,13 @@ def validate_worker(worker) | |
|
||
private | ||
|
||
attr_accessor :miq_processes | ||
attr_reader :miq_processes_by_pid | ||
|
||
def miq_processes | ||
miq_processes_by_pid.values | ||
end | ||
|
||
def miq_pids | ||
miq_processes_by_pid.keys | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO I think this has to be after
cleanup_orphaned_worker_rows
, and doesn't make sense IMO to havesync_deployment_settings
insync_from_system
since this is supposed to just be "find what is out there"