Why does schedule() not deadlock when using the default prepare_arch_switch()?
In Linux 2.6.11.12, before the schedule() function selects the "next" task to run, it locks the runqueue:
spin_lock_irq(&rq->lock);
Then, before calling context_switch() to perform the context switch, it calls prepare_arch_switch(), which is a no-op by default:
/*
* Default context-switch locking:
*/
#ifndef prepare_arch_switch
# define prepare_arch_switch(rq, next) do { } while (0)
# define finish_arch_switch(rq, next) spin_unlock_irq(&(rq)->lock)
# define task_running(rq, p) ((rq)->curr == (p))
#endif
In other words, it holds rq->lock until switch_to() returns, and only then does the finish_arch_switch() macro actually release the lock.
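Suppose there are tasks A, B, and C, and A calls schedule() and switches to B (at this point rq->lock is locked). Sooner or later, B will call schedule() itself. At that point the lock was taken by A, so how can B acquire rq->lock?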
There are also some arch-dependent implementations, for example:
/*
* On IA-64, we don't want to hold the runqueue's lock during the low-level context-switch,
* because that could cause a deadlock. Here is an example by Erich Focht:
*
* Example:
* CPU#0:
* schedule()
* -> spin_lock_irq(&rq->lock)
* -> context_switch()
* -> wrap_mmu_context()
* -> read_lock(&tasklist_lock)
*
* CPU#1:
* sys_wait4() or release_task() or forget_original_parent()
* -> write_lock(&tasklist_lock)
* -> do_notify_parent()
* -> wake_up_parent()
* -> try_to_wake_up()
* -> spin_lock_irq(&parent_rq->lock)
*
* If the parent's rq happens to be on CPU#0, we'll wait for the rq->lock
* of that CPU which will not be released, because there we wait for the
* tasklist_lock to become available.
*/
#define prepare_arch_switch(rq, next) \
do { \
spin_lock(&(next)->switch_lock); \
spin_unlock(&(rq)->lock); \
} while (0)
#define finish_arch_switch(rq, prev) spin_unlock_irq(&(prev)->switch_lock)
In this case, I'm quite sure this version does the right thing, because it unlocks rq->lock before calling context_switch().
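To make the difference between the two schemes concrete, here is a minimal user-space sketch in C. It is not kernel code: demo_rq, demo_task, and the demo_* lock helpers are hypothetical stand-ins built on C11 atomics, and the "second CPU" is simulated by a single trylock. It only checks whether rq->lock could be taken by someone else during the low-level switch window.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for task_struct and the runqueue. */
struct demo_task { atomic_bool switch_lock; };
struct demo_rq   { atomic_bool lock; };

/* Ownerless spinlock helpers (assumed names, not kernel APIs). */
static bool demo_trylock(atomic_bool *l) { return !atomic_exchange(l, true); }
static void demo_lock(atomic_bool *l)    { while (!demo_trylock(l)) { } }
static void demo_unlock(atomic_bool *l)  { atomic_store(l, false); }

/* Default scheme: prepare is a no-op, so rq->lock stays held. */
static void prepare_default(struct demo_rq *rq, struct demo_task *next)
{
    (void)rq;
    (void)next;
}

/* IA-64 style: trade rq->lock for the next task's switch_lock. */
static void prepare_ia64_style(struct demo_rq *rq, struct demo_task *next)
{
    demo_lock(&next->switch_lock);
    demo_unlock(&rq->lock);
}

/* Returns true if another CPU could take rq->lock during the switch window. */
static bool rq_lock_free_during_switch(
    void (*prepare)(struct demo_rq *, struct demo_task *))
{
    struct demo_rq rq;
    struct demo_task next;

    atomic_init(&rq.lock, false);
    atomic_init(&next.switch_lock, false);

    demo_lock(&rq.lock);   /* schedule(): spin_lock_irq(&rq->lock) */
    prepare(&rq, &next);   /* prepare_arch_switch(rq, next) */

    /* Low-level switch window: what would a waker on another CPU see? */
    return demo_trylock(&rq.lock);
}
```

Under these assumptions, rq_lock_free_during_switch(prepare_default) returns false and rq_lock_free_during_switch(prepare_ia64_style) returns true. That is the property the IA-64 comment relies on: try_to_wake_up() on another CPU no longer has to wait for the CPU that is in the middle of a switch.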
But what about the default implementation? How does it manage to do the right thing?
I found a comment in context_switch() in Linux 2.6.32.68 that tells the story behind the code:
/*
* Since the runqueue lock will be released by the next
* task (which is an invalid locking op but in the case
* of the scheduler it's an obvious special-case), so we
* do an early lockdep release here:
*/
So the switch does not leave the lock held: the next task unlocks it. If the next task is newly created, ret_from_fork() also eventually calls finish_task_switch(), which unlocks rq->lock.
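This handoff can be reproduced in user space. The sketch below assumes Linux with glibc's <ucontext.h>; rq_lock, schedule_sketch(), finish_switch(), and the task functions are hypothetical names, and swapcontext() merely plays the role of switch_to(). The point it tries to show is that the lock has no owner: one context takes it, and whichever context runs next releases it, either right after its own swapcontext() returns or, for a brand-new context, as the first thing its entry function does (the ret_from_fork() path).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <ucontext.h>

/* Hypothetical user-space stand-ins; none of this is kernel code. */
static atomic_bool rq_lock;                 /* plays the role of rq->lock */
static ucontext_t ctx_main, ctx_a, ctx_b;   /* three "tasks" on one CPU */
static char stack_a[64 * 1024], stack_b[64 * 1024];
static int switches_done;

/* Runs in the task that was just switched in,
 * like the default finish_arch_switch(). */
static void finish_switch(void)
{
    assert(atomic_load(&rq_lock));          /* the lock arrives locked */
    atomic_store(&rq_lock, false);          /* released by the next task */
    switches_done++;
}

/* Sketch of schedule(): take the lock, switch, let whoever resumes unlock. */
static void schedule_sketch(ucontext_t *prev, ucontext_t *next)
{
    /* Would spin forever here if the previous switch never released. */
    while (atomic_exchange(&rq_lock, true)) { }
    swapcontext(prev, next);                /* returns only when prev resumes */
    finish_switch();                        /* unlock what our predecessor took */
}

static void task_a(void)
{
    finish_switch();                        /* first run: ret_from_fork() path */
    for (int i = 0; i < 2; i++)
        schedule_sketch(&ctx_a, &ctx_b);
    schedule_sketch(&ctx_a, &ctx_main);     /* hand control back; never resumed */
}

static void task_b(void)
{
    finish_switch();                        /* first run: ret_from_fork() path */
    for (;;)
        schedule_sketch(&ctx_b, &ctx_a);
}

static void make_task(ucontext_t *ctx, char *stack, size_t size,
                      void (*fn)(void))
{
    getcontext(ctx);
    ctx->uc_stack.ss_sp = stack;
    ctx->uc_stack.ss_size = size;
    ctx->uc_link = &ctx_main;
    makecontext(ctx, fn, 0);
}

/* Returns the number of completed switches; the lock is free on return. */
static int run_handoff_demo(void)
{
    atomic_store(&rq_lock, false);
    switches_done = 0;
    make_task(&ctx_a, stack_a, sizeof stack_a, task_a);
    make_task(&ctx_b, stack_b, sizeof stack_b, task_b);
    schedule_sketch(&ctx_main, &ctx_a);
    return switches_done;
}
```

With two round trips between A and B, run_handoff_demo() returns 6. Each of the six schedule_sketch() calls finds the lock free, because the task that ran after the previous switch had already released it.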