From 875eb3f039dc09a098fd8da1d1330876f579ad8b Mon Sep 17 00:00:00 2001
From: Leon Schuermann
Date: Sun, 5 Jan 2025 13:11:36 -0500
Subject: [PATCH] kernel/scheduler/mlfq: fix lockup by immediately servicing kernel interrupts

Previously the MLFQ scheduler overrode the `continue_process` method,
which controls whether the core kernel loop immediately continues
executing a process or is allowed to do other work first (such as
servicing interrupts and deferred calls).

While this may be a fine decision on some chips, it leads to a
scheduler lockup with an uncooperative process on LiteX (and likely
other platforms). This is because on some RISC-V platforms the
scheduler timer is not a dedicated system attached to its own
interrupt source, but a virtual scheduler timer based on an `Alarm`
implementation that shares the "machine external interrupt" CPU input.

When a process' timeslice expires, this alarm will raise an MEXT
interrupt. To allow the kernel to do actual work and not get stuck in
the trap handler, we disable MEXT interrupts by clearing the MIE::MEIE
bit. Unfortunately, because of the overridden `continue_process`
method, we then never handle this interrupt, which keeps the alarm
interrupt asserted and the MEXT CPU interrupt source disabled. This
can cause the kernel to never interrupt and deschedule an
uncooperative process.
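
For illustration only, the sequence described above can be sketched as
a toy model. This is not Tock code: `Machine`, `trap_if_pending`,
`service_interrupts` and `simulate` are hypothetical names, and the
model assumes a single alarm that fires once per timeslice and a trap
handler that does nothing except mask MEXT.

```rust
/// Hypothetical, heavily simplified machine state (not a Tock type).
struct Machine {
    /// The alarm's interrupt line is asserted.
    alarm_pending: bool,
    /// Models MIE::MEIE: machine external interrupts are enabled.
    meie: bool,
}

impl Machine {
    /// Models the trap handler: if an enabled MEXT interrupt is pending,
    /// mask MEXT so the kernel can leave the trap, and report that the
    /// running process was interrupted.
    fn trap_if_pending(&mut self) -> bool {
        if self.alarm_pending && self.meie {
            self.meie = false;
            true
        } else {
            false
        }
    }

    /// Models the kernel servicing the interrupt: the alarm is
    /// acknowledged and MEXT is re-enabled.
    fn service_interrupts(&mut self) {
        self.alarm_pending = false;
        self.meie = true;
    }
}

/// Lets the alarm fire `timeslices` times while an uncooperative process
/// runs, and returns how often the process was actually interrupted.
/// `always_continue` models a `continue_process` that always returns true.
fn simulate(always_continue: bool, timeslices: u32) -> u32 {
    let mut m = Machine { alarm_pending: false, meie: true };
    let mut traps_taken = 0;
    for _ in 0..timeslices {
        m.alarm_pending = true; // timeslice expires, alarm asserts MEXT
        if m.trap_if_pending() {
            traps_taken += 1;
            if !always_continue {
                // Kernel is allowed to do other work before resuming.
                m.service_interrupts();
            }
        }
    }
    traps_taken
}
```

In this model, `simulate(true, n)` interrupts the process only once
because MEXT stays masked afterwards, whereas `simulate(false, n)`
interrupts it on every expiration.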
---
 kernel/src/scheduler/mlfq.rs | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/kernel/src/scheduler/mlfq.rs b/kernel/src/scheduler/mlfq.rs
index 13c6fd6951..f6ec8f508e 100644
--- a/kernel/src/scheduler/mlfq.rs
+++ b/kernel/src/scheduler/mlfq.rs
@@ -28,7 +28,6 @@ use crate::collections::list::{List, ListLink, ListNode};
 use crate::hil::time::{self, ConvertTicks, Ticks};
 use crate::platform::chip::Chip;
 use crate::process::Process;
-use crate::process::ProcessId;
 use crate::process::StoppedExecutingReason;
 use crate::scheduler::{Scheduler, SchedulingDecision};
 
@@ -183,9 +182,4 @@ impl, C: Chip> Scheduler for MLFQSched<'_,
             self.processes[queue_idx].push_tail(self.processes[queue_idx].pop_head().unwrap());
         }
     }
-
-    unsafe fn continue_process(&self, _: ProcessId, _: &C) -> bool {
-        // This MLFQ scheduler only preempts processes if there is a timeslice expiration
-        true
-    }
 }