Preemptive Scheduling and Sleeping
LDD3 has an example; the code is as follows:
 1: /* Wait for space for writing; caller must hold device semaphore.  On
 2:  * error the semaphore will be released before returning. */
 3: static int scull_getwritespace(struct scull_pipe *dev, struct file *filp)
 4: {
 5:
 6:     while (spacefree(dev) == 0)
 7:     { /* full */
 8:         DEFINE_WAIT(wait);
 9:
10:         up(&dev->sem);
11:         if (filp->f_flags & O_NONBLOCK)
12:             return -EAGAIN;
13:
14:         PDEBUG("\"%s\" writing: going to sleep\n",current->comm);
15:         prepare_to_wait(&dev->outq, &wait, TASK_INTERRUPTIBLE);
16:         if (spacefree(dev) == 0)
17:             schedule();
18:         finish_wait(&dev->outq, &wait);
19:         if (signal_pending(current))
20:             /* signal: tell the fs layer to handle it */
21:             return -ERESTARTSYS;
22:         if (down_interruptible(&dev->sem))
23:             return -ERESTARTSYS;
24:     }
25:     return 0;
26:
27: }
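The code above is from LDD3, page 159.
The book explains why the check on line 16 is needed. After prepare_to_wait the process has not actually given up the CPU; only its state has changed. By that point the buffer may already have space again (the read function freed it somewhere between lines 10 and 15), in which case there is no need to call schedule and give up the processor to sleep. Think about it: if I called schedule without checking, and the read function had already issued its wake-up, who would be left to wake me?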
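To make that ordering concrete, here is a minimal userspace sketch in C that replays the sequence single-threaded. It is not kernel code: every demo_* name is made up for illustration, and the wait queue is reduced to a single flag. It only shows that a wake-up fired before prepare_to_wait is lost unless the condition is checked again.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy task states; names echo the kernel's, values are arbitrary. */
enum demo_state { DEMO_RUNNING, DEMO_INTERRUPTIBLE };

struct demo_writer {
    enum demo_state state;
    bool on_waitqueue;   /* set by demo_prepare_to_wait() */
    bool blocked;        /* true once the writer has really stopped running */
};

struct demo_pipe {
    int spacefree;
};

/* Reader side: free space, then wake the writer only if it is queued. */
static void demo_reader_frees_space(struct demo_pipe *p, struct demo_writer *w)
{
    p->spacefree = 1;
    if (w->on_waitqueue) {
        w->state = DEMO_RUNNING;
        w->blocked = false;
    }
}

static void demo_prepare_to_wait(struct demo_writer *w)
{
    w->on_waitqueue = true;
    w->state = DEMO_INTERRUPTIBLE;
}

/* Voluntary schedule(): a task that is not DEMO_RUNNING stops running. */
static void demo_schedule(struct demo_writer *w)
{
    if (w->state != DEMO_RUNNING)
        w->blocked = true;
}

static void demo_finish_wait(struct demo_writer *w)
{
    w->on_waitqueue = false;
    w->state = DEMO_RUNNING;
}

/*
 * Replays: buffer full, reader frees space and wakes (nobody is queued yet),
 * writer calls prepare_to_wait, optionally re-checks, then schedules.
 * Returns true if the writer ends up asleep with no wake-up coming.
 */
static bool demo_writer_gets_stuck(bool recheck)
{
    struct demo_pipe p = { .spacefree = 0 };
    struct demo_writer w = { DEMO_RUNNING, false, false };

    demo_reader_frees_space(&p, &w);  /* wake-up fires before we are queued */
    demo_prepare_to_wait(&w);
    if (!recheck || p.spacefree == 0)
        demo_schedule(&w);
    if (w.blocked)
        return true;                  /* asleep; finish_wait is never reached */
    demo_finish_wait(&w);
    assert(w.state == DEMO_RUNNING);
    return false;
}
```

Calling demo_writer_gets_stuck(false) models the code without line 16 and leaves the writer blocked; demo_writer_gets_stuck(true) models the code as written and the writer carries on.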
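One question remains, though, because the 2.6 kernel is preemptible. Suppose kernel preemption happens between the prepare_to_wait call and the if check, that is, between lines 15 and 16, and another process is scheduled to run. Suppose also that the read function had already freed the buffer between lines 10 and 15. The process state is already TASK_INTERRUPTIBLE at that point, and no process is going to wake me, so I would sleep forever!
This is exactly where the details of kernel preemption matter. During kernel preemption, schedule is not called directly; it is called indirectly through preempt_schedule, preempt_schedule_irq and __cond_resched. All three routines add PREEMPT_ACTIVE to the current preempt count before calling schedule: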
add_preempt_count(PREEMPT_ACTIVE);
...
schedule();
...
sub_preempt_count(PREEMPT_ACTIVE);
Inside schedule, the PREEMPT_ACTIVE bit is checked, and a process whose state is not TASK_RUNNING is removed from the run queue only when that bit is not set:
if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
    switch_count = &prev->nvcsw;
    if (unlikely((prev->state & TASK_INTERRUPTIBLE) &&
            unlikely(signal_pending(prev))))
        prev->state = TASK_RUNNING;
    else {
        if (prev->state == TASK_UNINTERRUPTIBLE)
            rq->nr_uninterruptible++;
        deactivate_task(prev, rq);
    }
}
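So a preempted process always returns to the point where it was preempted and continues from there, whatever its state was at that moment. Even if preemption happens between lines 15 and 16, the process is still on the run queue and will get another chance to run (being woken up is not the only way to be scheduled; it is enough to still be on the run queue). It therefore goes on to execute the if check that follows. The following email makes the same point: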
See the PREEMPT_ACTIVE logic.
If a task is preempted it is marked PREEMPT_ACTIVE and it skips the
runqueue removal logic in schedule(). So even if it is !TASK_RUNNING it will run again.
You can see this in schedule() and preempt_schedule(), both in
kernel/sched.c.
Robert Love
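As a rough illustration of the PREEMPT_ACTIVE logic described above, the following userspace C sketch models only the dequeue decision. It is a toy under stated assumptions: the toy_* names, the struct layout and the bit value are invented for the example, and nothing here is the kernel's real data structure.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PREEMPT_ACTIVE 0x10000000u  /* arbitrary bit chosen for this sketch */

enum toy_state { TOY_RUNNING = 0, TOY_INTERRUPTIBLE = 1 };

struct toy_task {
    enum toy_state state;
    unsigned int preempt_count;
    bool on_runqueue;
};

/* Models only the "remove from run queue?" test quoted from schedule(). */
static void toy_schedule(struct toy_task *prev)
{
    if (prev->state != TOY_RUNNING &&
        !(prev->preempt_count & TOY_PREEMPT_ACTIVE))
        prev->on_runqueue = false;   /* stands in for deactivate_task() */
}

/* Involuntary path: mark the task as being preempted around schedule(). */
static void toy_preempt_schedule(struct toy_task *t)
{
    t->preempt_count += TOY_PREEMPT_ACTIVE;
    toy_schedule(t);
    t->preempt_count -= TOY_PREEMPT_ACTIVE;
}

/*
 * Starts from a task that has just set TOY_INTERRUPTIBLE (as after
 * prepare_to_wait) and reports whether it is still on the run queue.
 */
static bool toy_stays_runnable(bool preempted)
{
    struct toy_task t = { TOY_INTERRUPTIBLE, 0, true };

    if (preempted)
        toy_preempt_schedule(&t);    /* preempted between lines 15 and 16 */
    else
        toy_schedule(&t);            /* the voluntary call on line 17 */

    assert(t.preempt_count == 0);    /* the bit is always cleared afterwards */
    return t.on_runqueue;
}
```

In this model toy_stays_runnable(true) keeps the task on the run queue, so it would later resume at the check on line 16, while toy_stays_runnable(false) removes it, which is the ordinary sleep.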
[References]
1. http://blog.chinaunix.net/u/5251/showart_450988.html
2. http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-10/2391.html