Why is kernel preemption safe only when preempt_count == 0?

Linux kernel 2.6 introduced a new per-thread field, preempt_count, which is incremented/decremented whenever a lock is acquired/released. This field is used to allow kernel preemption: "If need_resched is set and preempt_count is zero, then a more important task is runnable and it is safe to preempt."
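The rule in that quote can be sketched as a small userspace model. This is only an illustration, not kernel code: `struct task`, `model_lock`, `model_unlock` and `can_preempt` are hypothetical names made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a task; not the kernel's task_struct. */
struct task {
    int preempt_count;   /* > 0 means at least one lock is held */
    bool need_resched;   /* set when a more important task is runnable */
};

/* Acquiring a lock increments the counter, as described above. */
static void model_lock(struct task *t)
{
    t->preempt_count++;
}

/* Releasing a lock decrements the counter; it must never go negative. */
static void model_unlock(struct task *t)
{
    assert(t->preempt_count > 0);
    t->preempt_count--;
}

/* Preempt only if a reschedule is pending and no lock is held. */
static bool can_preempt(const struct task *t)
{
    return t->need_resched && t->preempt_count == 0;
}
```

Because the field is a counter rather than a flag, nested locks work naturally: preemption becomes possible again only after the last lock is released.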

According to Robert Love's book "Linux Kernel Development": "So when is it safe to reschedule? The kernel is capable of preempting a task running in the kernel so long as it does not hold a lock."

My question is: why is it not safe to preempt a task running in the kernel while that task holds a lock?

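If another task is scheduled and tries to grab the lock, it will block (or spin until its time slice ends), so we will not get two threads in the same critical section at the same time. Can anyone outline a problematic scenario in which we preempt a task that holds a lock in kernel mode?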

Thanks!

With the help of @Tsyvarev, I think I can now answer my own question and describe a problematic scenario in which we do preempt a task that holds a lock in kernel mode.

Thread #1 holds a spinlock and gets preempted.
Thread #2 is then scheduled and spins to grab the spinlock.

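Now, if thread #2 is an ordinary process, it will eventually use up its time slice. In that case thread #1 will be scheduled again, release the lock, and all is well. But if thread #2 is a higher-priority real-time process, thread #1 will never get to run again, and we have a deadlock.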
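The real-time case can be sketched as a tick-by-tick model of a single CPU. It is a simplification under stated assumptions: a strict-priority scheduler that always runs thread #2 once it is runnable, and a hypothetical helper `lock_released_within` with made-up parameters; nothing here is kernel API.

```c
#include <stdbool.h>

/* Hypothetical model: thread #1 (low priority) holds the spinlock and needs
 * `work` ticks to finish its critical section. Thread #2 (higher-priority
 * real-time) becomes runnable at tick `arrive` and spins on the same lock.
 * Returns true if the lock is released within `ticks` ticks. */
static bool lock_released_within(int ticks, int work, int arrive,
                                 bool preempt_disabled_while_locked)
{
    bool locked = true;   /* thread #1 already holds the spinlock */
    int done = 0;         /* ticks of critical-section work completed */

    for (int now = 0; now < ticks; now++) {
        bool t2_runnable = now >= arrive;
        /* Thread #2 wins the CPU unless preemption is blocked by the lock. */
        bool run_t2 = t2_runnable &&
                      !(preempt_disabled_while_locked && locked);

        if (run_t2) {
            if (locked)
                continue; /* thread #2 spins: the tick is burned */
            return true;  /* lock is free, thread #2 acquires it */
        }
        /* Otherwise thread #1 runs and makes progress. */
        if (++done == work)
            locked = false;
    }
    return !locked;
}
```

With preemption allowed while the lock is held, thread #2 burns every tick after it arrives and the lock is never released, however many ticks are simulated. With preemption disabled while the lock is held, thread #1 finishes first and thread #2 then gets the lock.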

This answer is corroborated by another Stack Overflow thread, which cites the FreeBSD documentation:

While locks can protect most data in the case of a preemption, not all of the kernel is preemption safe. For example, if a thread holding a spin mutex preempted and the new thread attempts to grab the same spin mutex, the new thread may spin forever as the interrupted thread may never get a chance to execute.

However, the quote above does not explicitly explain why the "interrupted thread may never get a chance to execute" again.

Although this is an old question, the accepted answer is not correct.

First of all, the title asks:

Why kernel preemption is safe only when preempt_count > 0?

This is not correct; it is exactly the opposite. Kernel preemption is disabled when preempt_count > 0, and enabled when preempt_count == 0.

Furthermore, the statement:

If another task is scheduled and tries to grab the lock, it will block (or spin until its time slice ends),

is not always true.

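Say you acquire a spinlock while preemption is enabled. A process switch happens, and some softirq runs in the context of the new process. Preemption is disabled while softirqs run. If one of those softirqs tries to acquire your lock, it will never stop spinning, because with preemption disabled your process never gets the CPU back to release the lock. So you have a deadlock.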

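You have no control over whether the process that preempts yours will run softirqs. The preempt_count field in which softirqs are disabled is process-specific. Softirqs must run with preemption disabled to preserve their per-CPU serialization.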
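To illustrate why a running softirq implies "preemption disabled", here is a minimal sketch of how the softirq state shares the same counter. The mask values are an assumption modelled on the layout documented in include/linux/preempt.h in mainline kernels, so check them against your kernel version; the `MODEL_`/`model_` names are hypothetical and not kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout, modelled on include/linux/preempt.h; the exact field
 * widths may differ between kernel versions. */
#define MODEL_PREEMPT_MASK   0x000000ffu  /* preempt-disable nesting depth */
#define MODEL_SOFTIRQ_MASK   0x0000ff00u  /* softirq running or disabled */
#define MODEL_SOFTIRQ_OFFSET 0x00000100u  /* one unit in the softirq field */

/* Entering softirq processing bumps the softirq field of the counter. */
static unsigned int model_softirq_enter(unsigned int count)
{
    return count + MODEL_SOFTIRQ_OFFSET;
}

/* Leaving softirq processing drops it again. */
static unsigned int model_softirq_exit(unsigned int count)
{
    assert(count & MODEL_SOFTIRQ_MASK);
    return count - MODEL_SOFTIRQ_OFFSET;
}

/* Preemption is allowed only when the whole counter is zero. */
static bool model_preemptible(unsigned int count)
{
    return count == 0;
}
```

In this model the counter is nonzero for the entire time a softirq runs, even though no lock is held and the preempt-disable field itself is zero. That is why a softirq spinning on your lock can never be preempted in your favour.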

multithreading

linux-kernel

preemption