Waiting & locking

Spinlocks

#include <linux/spinlock.h>

 spinlock_t
 rwlock_t

Spinlocks are a way to synchronize access to data structures.

spin_lock_init(lock)

Initialize a lock

spin_lock(lock)

take a lock.

spin_unlock(lock)

release a lock.

If the lock is already taken by another processor, spin_lock simply retries until it succeeds. The processor executes nothing else during this time. Therefore, critical sections guarded by a spinlock must be as short as possible.
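
The retry loop can be modelled outside the kernel. The following is a minimal userspace sketch, not kernel code: model_spinlock, model_spin_lock and model_spin_unlock are hypothetical stand-ins built on C11 atomics and POSIX threads, and the thread and iteration counts are arbitrary. It only illustrates that a waiter burns CPU time until the holder releases the lock, which is why the critical section is kept to a single increment.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_THREADS    4
#define MODEL_ITERATIONS 20000

/* Hypothetical userspace stand-in for spinlock_t; not the kernel type. */
struct model_spinlock {
	atomic_flag locked;
};

static struct model_spinlock counter_lock = { ATOMIC_FLAG_INIT };
static long shared_counter;

static void model_spin_lock(struct model_spinlock *lock)
{
	/* Busy-wait: retry until the flag was clear before we set it. */
	while (atomic_flag_test_and_set_explicit(&lock->locked,
						 memory_order_acquire))
		;
}

static void model_spin_unlock(struct model_spinlock *lock)
{
	atomic_flag_clear_explicit(&lock->locked, memory_order_release);
}

static void *model_worker(void *unused)
{
	(void)unused;
	for (int i = 0; i < MODEL_ITERATIONS; i++) {
		model_spin_lock(&counter_lock);
		shared_counter++;	/* short critical section */
		model_spin_unlock(&counter_lock);
	}
	return NULL;
}

/* Runs all workers; returns the final counter, or -1 if a thread failed to start. */
long model_spin_demo(void)
{
	pthread_t threads[MODEL_THREADS];
	int started = 0;

	shared_counter = 0;
	while (started < MODEL_THREADS &&
	       pthread_create(&threads[started], NULL, model_worker, NULL) == 0)
		started++;
	for (int i = 0; i < started; i++)
		pthread_join(threads[i], NULL);

	return started == MODEL_THREADS ? shared_counter : -1;
}
```

Without the lock, the increments from different threads could overlap and the final count would usually come out too low.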

When a data structure is read often but updated only rarely, a plain spinlock is a poor choice, because only one reader can access the data at a time. A special lock, the rwlock, takes this asymmetric use into account.

rwlock_init(rwlock)

read_{lock,unlock}(rwlock)

Lock/unlock for reading; no writer can take the lock at the same time but many readers can hold the lock at the same time.

write_{lock,unlock}(rwlock)

Lock/unlock for writing; no other reader or writer is allowed to take the lock at the same time.

Spinlocks (and rwlocks) can be used almost everywhere, even in interrupt handlers. In order to prevent deadlocks, interrupts must be disabled during the critical section if the guarding lock may be used in an interrupt handler.

There is a special version of spinlocks that automatically disables interrupts on the local processor:

 unsigned long flags;
 spin_lock_irqsave(&lock, flags);
 /* ... critical section, IRQs are disabled here */
 spin_unlock_irqrestore(&lock, flags);

Semaphores

#include <asm/semaphore.h>

 struct semaphore;

Semaphores can be used to synchronize processes. A semaphore is an integer variable which can be increased and decreased. When a process tries to decrease the value of the semaphore below zero, it is blocked until it is possible to decrease the semaphore without making it negative (i.e. until some other process increases the semaphore).

Unlike spinlocks, the blocked process is put to sleep instead of trying over and over again. As interrupt handlers are not allowed to sleep, down() cannot be used there; up() never sleeps and may be called from an interrupt handler.

sema_init(sem, value)

initialize semaphore.

up(sem)

increase semaphore (V(sem))

down(sem)

decrease semaphore, wait if it would become negative (P(sem))

down_interruptible(sem)

same as down, but returns -EINTR when it is interrupted by a signal.

A critical section can be implemented by initializing the semaphore to 1 and surrounding the critical section with down() and up() calls. It is advisable to use the interruptible version if the process can block for a long time.

	 ret = down_interruptible(&sem);
	 if (ret) goto out;
	 /* critical section */
	 up(&sem);
	 ret = 0;
 out:
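
The counting and blocking behaviour of down() and up() can be modelled in userspace. This is a sketch under stated assumptions, not the kernel implementation: model_sem, model_down and model_up are hypothetical names, built here on a POSIX mutex and condition variable. The semaphore is initialized to 1 and used to guard a critical section, as described above.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace model of a counting semaphore; not struct semaphore. */
struct model_sem {
	pthread_mutex_t mutex;
	pthread_cond_t nonzero;
	int count;
};

static void model_sema_init(struct model_sem *sem, int value)
{
	pthread_mutex_init(&sem->mutex, NULL);
	pthread_cond_init(&sem->nonzero, NULL);
	sem->count = value;
}

/* P(sem): sleep until the count can be decreased without going negative. */
static void model_down(struct model_sem *sem)
{
	pthread_mutex_lock(&sem->mutex);
	while (sem->count == 0)
		pthread_cond_wait(&sem->nonzero, &sem->mutex);
	sem->count--;
	pthread_mutex_unlock(&sem->mutex);
}

/* V(sem): increase the count and wake one sleeper, if there is one. */
static void model_up(struct model_sem *sem)
{
	pthread_mutex_lock(&sem->mutex);
	sem->count++;
	pthread_cond_signal(&sem->nonzero);
	pthread_mutex_unlock(&sem->mutex);
}

static struct model_sem demo_sem;
static int protected_total;

static void *model_sem_worker(void *unused)
{
	(void)unused;
	for (int i = 0; i < 10000; i++) {
		model_down(&demo_sem);
		protected_total++;	/* critical section */
		model_up(&demo_sem);
	}
	return NULL;
}

/* Two workers share the semaphore; returns the total, or -1 on thread failure. */
int model_sem_demo(void)
{
	pthread_t a, b;

	model_sema_init(&demo_sem, 1);
	protected_total = 0;
	if (pthread_create(&a, NULL, model_sem_worker, NULL) != 0)
		return -1;
	if (pthread_create(&b, NULL, model_sem_worker, NULL) != 0) {
		pthread_join(a, NULL);
		return -1;
	}
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return protected_total;
}
```

A waiter in model_down sleeps inside pthread_cond_wait and uses no CPU time, which is the main contrast with a spinlock.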

Wait queues

#include <linux/wait.h>

Wait queues are used to wake up sleeping processes. A wait queue is simply a linked list of processes that need to be woken up when some event occurs.

wait_queue_t

an entry that represents one process that is to be woken up

wait_queue_head_t

linked list of wait_queue_t entries

init_waitqueue_head(q)

create an empty wait queue list

wait_event(q, event)

registers with wait queue q, waits until event becomes true.

wait_event_interruptible(q, event)

same as wait_event, but returns with -ERESTARTSYS when a signal is received.

wake_up(q)

wakes up all processes that are registered with q. A process that called wait_event re-evaluates the event after each wake-up and returns from wait_event once the event is true.
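
The pattern of re-checking the event after every wake-up can be sketched in userspace. The following is an illustration only, assuming a POSIX condition variable as a stand-in for the wait queue; model_waitqueue and the other names are hypothetical, and the kernel macros do not take a mutex the way this model does. The event here is "items_ready >= 2", so the first wake-up is not enough to let the waiter continue.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for wait_queue_head_t. */
struct model_waitqueue {
	pthread_mutex_t mutex;
	pthread_cond_t wakeup;
};

static struct model_waitqueue queue = {
	PTHREAD_MUTEX_INITIALIZER,
	PTHREAD_COND_INITIALIZER
};
static int items_ready;		/* the event is: items_ready >= 2 */
static int items_seen;		/* what the waiter observed on return */

/* Roughly wait_event(q, items_ready >= 2): re-check after every wake-up. */
static void *model_waiter(void *unused)
{
	(void)unused;
	pthread_mutex_lock(&queue.mutex);
	while (items_ready < 2)
		pthread_cond_wait(&queue.wakeup, &queue.mutex);
	items_seen = items_ready;
	pthread_mutex_unlock(&queue.mutex);
	return NULL;
}

/* Roughly: change the state first, then wake_up(q). */
static void model_produce_and_wake(void)
{
	pthread_mutex_lock(&queue.mutex);
	items_ready++;
	pthread_cond_broadcast(&queue.wakeup);
	pthread_mutex_unlock(&queue.mutex);
}

/* Returns the value the waiter saw, or -1 if the thread could not start. */
int model_waitqueue_demo(void)
{
	pthread_t waiter;

	items_ready = 0;
	items_seen = 0;
	if (pthread_create(&waiter, NULL, model_waiter, NULL) != 0)
		return -1;

	model_produce_and_wake();	/* event still false: a woken waiter sleeps again */
	model_produce_and_wake();	/* event now true: the waiter can return */

	pthread_join(waiter, NULL);
	return items_seen;
}
```

The state is always updated before the wake-up call, so a waiter that re-evaluates the event never misses the change.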

poll

#include <linux/poll.h>

The select/poll system call allows userspace applications to wait for data to arrive on one or more file descriptors.

As it is not known which file/driver will be the first to have data available, the call has to watch all of them at once.

The poll/select system call will call the f_op->poll method of all file descriptors. Each ->poll method should return whether data is available or not.

If no file descriptor has any data available, then the poll/select call has to wait for data on those file descriptors. It has to know about all wait queues that could be used to signal new data.

poll_wait(file, q, pt)

register wait queue q for a poll/select system call. The driver should wake up that wait queue when new data is available.

This is best illustrated with an example. The following example_poll function returns the status of the file descriptor (is it possible to read or write) and registers two wait queues that can be used to wake the poll/select call.

 unsigned int example_poll(struct file * file, poll_table * pt)
 {
 	unsigned int mask = 0;
 	if (data_avail_to_read) mask |= POLLIN | POLLRDNORM;
 	if (data_avail_to_write) mask |= POLLOUT | POLLWRNORM;
 	poll_wait(file, &read_queue, pt);
 	poll_wait(file, &write_queue, pt);
 	return mask;
 }

Then, when data is available again the driver should call:

 data_avail_to_read = 1;
 wake_up(&read_queue);

This will cause the select/poll system call to wake up and check all file descriptors again (by calling the f_op->poll function). The select/poll call will return as soon as any file descriptor is available for read or write.

select/poll is often used by userspace applications to check if the next read or write of a file descriptor would block or not. However, bad things can happen between the select and the next read or write call. For example, another process could consume that data in between. If a process really expects a read or write call to not block, it should set O_NONBLOCK on open(2) or via fcntl(2). Every driver should check file->f_flags for O_NONBLOCK and return -EAGAIN if no data is available immediately.
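
From the application side, the combination of poll(2) and O_NONBLOCK looks roughly as follows. This is a minimal userspace sketch that uses a pipe in place of a driver's device file; nonblock_result and nonblock_demo are hypothetical names chosen for the illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

struct nonblock_result {
	int empty_read_errno;	/* errno from read() on the empty pipe */
	int ready_before;	/* poll() result before data was written */
	int ready_after;	/* poll() result after data was written */
	int pollin_set;		/* 1 if POLLIN was reported */
	long bytes_read;	/* result of the final read() */
};

/* Returns 0 on success, -1 if a system call failed unexpectedly. */
int nonblock_demo(struct nonblock_result *res)
{
	int fds[2];
	char buf[8];
	struct pollfd pfd;
	int flags;
	int ok = -1;

	if (pipe(fds) != 0)
		return -1;

	/* Switch the read end to non-blocking mode via fcntl(2). */
	flags = fcntl(fds[0], F_GETFL);
	if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1)
		goto out;

	/* Nothing written yet: read() fails with EAGAIN instead of blocking. */
	errno = 0;
	res->empty_read_errno =
		read(fds[0], buf, sizeof(buf)) == -1 ? errno : 0;

	pfd.fd = fds[0];
	pfd.events = POLLIN;
	pfd.revents = 0;
	res->ready_before = poll(&pfd, 1, 0);	/* timeout 0: only check */

	if (write(fds[1], "x", 1) != 1)
		goto out;

	/* Data is pending now, so poll() reports the descriptor as readable. */
	res->ready_after = poll(&pfd, 1, 1000);
	res->pollin_set = (pfd.revents & POLLIN) != 0;
	res->bytes_read = (long)read(fds[0], buf, sizeof(buf));
	ok = 0;
out:
	close(fds[0]);
	close(fds[1]);
	return ok;
}
```

Because the descriptor is non-blocking, the final read() would return -1 with EAGAIN rather than hang if another process had consumed the byte between poll() and read().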

Staying awake

There are some situations when sleeping is not allowed:

  • in interrupt context -- without a process context, it is not possible to wake up later.

  • while holding a spinlock.

This is important to keep in mind as many kernel functions may sleep for short periods of time internally. You have to make sure that you only call atomic functions while in interrupt context or while holding a spinlock.

Errors caused by calling the wrong function may be hard to find, as many internal functions only have a small chance of actually sleeping (e.g. kmalloc(..., GFP_KERNEL) only sleeps if it thinks that it can get more memory later). To make the user aware of a potential problem, might_sleep() can be used to print a warning if the function is executed in a context that does not allow sleeping.