The Linux Kernel API
This documentation is free software; you can redistribute
it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation; either
version 2 of the License, or (at your option) any later
version.
This program is distributed in the hope that it will be
useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public
License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
MA 02111-1307 USA
For more details see the file COPYING in the source
distribution of Linux.
Table of Contents
1. Driver Basics
    Driver Entry and Exit points
    Atomic and pointer manipulation
    Delaying, scheduling, and timer routines
    High-resolution timers
    Internal Functions
    Kernel objects manipulation
    Kernel utility functions
2. Data Types
    Doubly Linked Lists
3. Basic C Library Functions
    String Conversions
    String Manipulation
    Bit Operations
4. Memory Management in Linux
    The Slab Cache
    User Space Memory Access
    More Memory Management Functions
5. Kernel IPC facilities
    IPC utilities
6. FIFO Buffer
    kfifo interface
7. The proc filesystem
    sysctl interface
    proc filesystem interface
8. The debugfs filesystem
    debugfs interface
9. The Linux VFS
    The Filesystem types
    The Directory Cache
    Inode Handling
    Registration and Superblocks
    File Locks
    Other Functions
10. Linux Networking
    Networking Base Types
    Socket Buffer Functions
    Socket Filter
    Generic Network Statistics
    SUN RPC subsystem
11. Network device support
    Driver Support
    Synchronous PPP
12. Module Support
    Module Loading
    Inter Module support
13. Hardware Interfaces
    Interrupt Handling
    Resources Management
    MTRR Handling
    PCI Support Library
    PCI Hotplug Support Library
    MCA Architecture
    MCA Device Functions
    MCA Bus DMA
14. The Device File System
    devfs_mk_dir
15. The Filesystem for Exporting Kernel Objects
    sysfs_create_file
    sysfs_update_file
    sysfs_chmod_file
    sysfs_remove_file
    sysfs_create_dir
    sysfs_remove_dir
    sysfs_create_link
    sysfs_remove_link
    sysfs_create_bin_file
    sysfs_remove_bin_file
16. Security Framework
    register_security
    unregister_security
    mod_reg_security
    mod_unreg_security
    capable
17. Power Management
    pm_register
    pm_unregister
    pm_unregister_all
    pm_send_all
18. Device drivers infrastructure
    Device Drivers Base
    Device Drivers Power Management
    Device Drivers ACPI Support
    Device drivers PnP support
19. Block Devices
    blk_get_backing_dev_info
    blk_queue_prep_rq
    blk_queue_merge_bvec
    blk_queue_make_request
    blk_queue_ordered
    blk_queue_issue_flush_fn
    blk_queue_bounce_limit
    blk_queue_max_sectors
    blk_queue_max_phys_segments
    blk_queue_max_hw_segments
    blk_queue_max_segment_size
    blk_queue_hardsect_size
    blk_queue_stack_limits
    blk_queue_segment_boundary
    blk_queue_dma_alignment
    blk_queue_find_tag
    blk_queue_free_tags
    blk_queue_init_tags
    blk_queue_resize_tags
    blk_queue_end_tag
    blk_queue_start_tag
    blk_queue_invalidate_tags
    generic_unplug_device
    blk_start_queue
    blk_stop_queue
    blk_sync_queue
    blk_run_queue
    blk_cleanup_queue
    blk_init_queue
    blk_requeue_request
    blk_insert_request
    blk_rq_map_user
    blk_rq_map_user_iov
    blk_rq_unmap_user
    blk_rq_map_kern
    blk_execute_rq_nowait
    blk_execute_rq
    blkdev_issue_flush
    blk_end_sync_rq
    blk_congestion_wait
    generic_make_request
    submit_bio
    end_that_request_first
    end_that_request_chunk
    blk_complete_request
20. Miscellaneous Devices
    misc_register
    misc_deregister
21. Video4Linux
    video_register_device
    video_unregister_device
22. Sound Devices
    snd_printk
    snd_printd
    snd_assert
    snd_printdd
    register_sound_special_device
    register_sound_mixer
    register_sound_midi
    register_sound_dsp
    register_sound_synth
    unregister_sound_special
    unregister_sound_mixer
    unregister_sound_midi
    unregister_sound_dsp
    unregister_sound_synth
    snd_pcm_playback_ready
    snd_pcm_capture_ready
    snd_pcm_playback_data
    snd_pcm_playback_empty
    snd_pcm_capture_empty
    snd_pcm_format_cpu_endian
    snd_pcm_new_stream
    snd_pcm_new
    snd_device_new
    snd_device_free
    snd_device_register
    snd_iprintf
    snd_info_get_line
    snd_info_get_str
    snd_info_create_module_entry
    snd_info_create_card_entry
    snd_card_proc_new
    snd_info_free_entry
    snd_info_register
    snd_info_unregister
    snd_rawmidi_receive
    snd_rawmidi_transmit_empty
    snd_rawmidi_transmit_peek
    snd_rawmidi_transmit_ack
    snd_rawmidi_transmit
    snd_rawmidi_new
    snd_rawmidi_set_ops
    snd_request_card
    snd_lookup_minor_data
    snd_register_device
    snd_unregister_device
    copy_to_user_fromio
    copy_from_user_toio
    snd_pcm_lib_preallocate_free_for_all
    snd_pcm_lib_preallocate_pages
    snd_pcm_lib_preallocate_pages_for_all
    snd_pcm_sgbuf_ops_page
    snd_pcm_lib_malloc_pages
    snd_pcm_lib_free_pages
    snd_card_new
    snd_card_disconnect
    snd_card_free
    snd_card_free_in_thread
    snd_card_register
    snd_component_add
    snd_card_file_add
    snd_card_file_remove
    snd_power_wait
    snd_dma_program
    snd_dma_disable
    snd_dma_pointer
    snd_ctl_new
    snd_ctl_new1
    snd_ctl_free_one
    snd_ctl_add
    snd_ctl_remove
    snd_ctl_remove_id
    snd_ctl_rename_id
    snd_ctl_find_numid
    snd_ctl_find_id
    snd_pcm_set_ops
    snd_pcm_set_sync
    snd_interval_refine
    snd_interval_ratnum
    snd_interval_list
    snd_pcm_hw_rule_add
    snd_pcm_hw_constraint_integer
    snd_pcm_hw_constraint_minmax
    snd_pcm_hw_constraint_list
    snd_pcm_hw_constraint_ratnums
    snd_pcm_hw_constraint_ratdens
    snd_pcm_hw_constraint_msbits
    snd_pcm_hw_constraint_step
    snd_pcm_hw_constraint_pow2
    snd_pcm_hw_param_value_min
    snd_pcm_hw_param_value_max
    snd_pcm_hw_param_first
    snd_pcm_hw_param_last
    snd_pcm_hw_param_set
    snd_pcm_hw_param_mask
    snd_pcm_hw_param_near
    snd_pcm_lib_ioctl
    snd_pcm_period_elapsed
    snd_hwdep_new
    snd_pcm_stop
    snd_pcm_suspend
    snd_pcm_suspend_all
    snd_malloc_pages
    snd_free_pages
    snd_dma_alloc_pages
    snd_dma_alloc_pages_fallback
    snd_dma_free_pages
    snd_dma_get_reserved_buf
    snd_dma_reserve_buf
23. 16x50 UART Driver
    uart_handle_dcd_change
    uart_handle_cts_change
    uart_update_timeout
    uart_get_baud_rate
    uart_get_divisor
    uart_register_driver
    uart_unregister_driver
    uart_add_one_port
    uart_remove_one_port
    serial8250_suspend_port
    serial8250_resume_port
    serial8250_register_port
    serial8250_unregister_port
24. Z85230 Support Library
    z8530_interrupt
    z8530_sync_open
    z8530_sync_close
    z8530_sync_dma_open
    z8530_sync_dma_close
    z8530_sync_txdma_open
    z8530_sync_txdma_close
    z8530_describe
    z8530_init
    z8530_shutdown
    z8530_channel_load
    z8530_null_rx
    z8530_queue_xmit
    z8530_get_stats
25. Frame Buffer Library
    Frame Buffer Memory
    Frame Buffer Colormap
    Frame Buffer Video Mode Database
    Frame Buffer Macintosh Video Mode Database
    Frame Buffer Fonts
Chapter 1. Driver Basics
Driver Entry and Exit points
Name
module_init -- driver initialization entry point
Synopsis
module_init (x);
Arguments
x
function to be run at kernel boot time or module insertion
Description
module_init will either be called during do_initcalls (if
builtin) or at module insertion time (if a module). There can only
be one per module.
Name
module_exit -- driver exit entry point
Synopsis
module_exit (x);
Arguments
x
function to be run when driver is removed
Description
module_exit will wrap the driver clean-up code
with cleanup_module when used with rmmod when
the driver is a module. If the driver is statically
compiled into the kernel, module_exit has no effect.
There can only be one per module.
Atomic and pointer manipulation
Name
atomic_read -- read atomic variable
Synopsis
atomic_read (v);
Arguments
v
pointer of type atomic_t
Description
Atomically reads the value of v.
Name
atomic_set -- set atomic variable
Synopsis
atomic_set (v, i);
Arguments
v
pointer of type atomic_t
i
required value
Description
Atomically sets the value of v to i.
Name
atomic_add -- add integer to atomic variable
Synopsis
void atomic_add (int i, atomic_t * v);
Arguments
i
integer value to add
v
pointer of type atomic_t
Description
Atomically adds i to v.
Name
atomic_sub -- subtract integer from atomic variable
Synopsis
void atomic_sub (int i, atomic_t * v);
Arguments
i
integer value to subtract
v
pointer of type atomic_t
Description
Atomically subtracts i from v.
Name
atomic_sub_and_test -- subtract value from variable and test result
Synopsis
int atomic_sub_and_test (int i, atomic_t * v);
Arguments
i
integer value to subtract
v
pointer of type atomic_t
Description
Atomically subtracts i from v and returns
true if the result is zero, or false for all
other cases.
Name
atomic_inc -- increment atomic variable
Synopsis
void atomic_inc (atomic_t * v);
Arguments
v
pointer of type atomic_t
Description
Atomically increments v by 1.
Name
atomic_dec -- decrement atomic variable
Synopsis
void atomic_dec (atomic_t * v);
Arguments
v
pointer of type atomic_t
Description
Atomically decrements v by 1.
Name
atomic_dec_and_test -- decrement and test
Synopsis
int atomic_dec_and_test (atomic_t * v);
Arguments
v
pointer of type atomic_t
Description
Atomically decrements v by 1 and
returns true if the result is 0, or false for all other
cases.
Name
atomic_inc_and_test -- increment and test
Synopsis
int atomic_inc_and_test (atomic_t * v);
Arguments
v
pointer of type atomic_t
Description
Atomically increments v by 1
and returns true if the result is zero, or false for all
other cases.
Name
atomic_add_negative -- add and test if negative
Synopsis
int atomic_add_negative (int i, atomic_t * v);
Arguments
i
integer value to add
v
pointer of type atomic_t
Description
Atomically adds i to v and returns true
if the result is negative, or false when
result is greater than or equal to zero.
Name
atomic_add_return -- add and return
Synopsis
int atomic_add_return (int i, atomic_t * v);
Arguments
i
integer value to add
v
pointer of type atomic_t
Description
Atomically adds i to v and returns i + v
Name
atomic_add_unless -- add unless the number is a given value
Synopsis
atomic_add_unless (v, a, u);
Arguments
v
pointer of type atomic_t
a
the amount to add to v...
u
...unless v is equal to u.
Description
Atomically adds a to v, so long as it was not u.
Returns non-zero if v was not u, and zero otherwise.
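The "add unless" semantics can be modelled in user space with a compare-and-exchange loop. The following is only a sketch under stated assumptions: it uses C11 `<stdatomic.h>` rather than the kernel's atomic_t, and the names `sketch_add_unless` and `sketch_get_unless_zero` are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * User-space model of "add a to *v unless *v equals u".
 * Returns non-zero if the addition happened, zero otherwise.
 * This is NOT the kernel macro, only an illustration of its contract.
 */
int sketch_add_unless(atomic_int *v, int a, int u)
{
    int old = atomic_load(v);

    while (old != u) {
        /* On failure, old is refreshed with the current value of *v. */
        if (atomic_compare_exchange_weak(v, &old, old + a))
            return 1;
    }
    return 0;
}

/* Typical use: take a reference only if the count is not already zero. */
int sketch_get_unless_zero(atomic_int *refcount)
{
    return sketch_add_unless(refcount, 1, 0);
}
```

A common reason to want this operation is the "get unless zero" pattern shown in the second function: an object whose count has already reached zero is being torn down and must not be revived.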
Name
get_unaligned -- get value from possibly mis-aligned location
Synopsis
get_unaligned (ptr);
Arguments
ptr
pointer to value
Description
This macro should be used for accessing values larger in size than
single bytes at locations that are expected to be improperly aligned,
e.g. retrieving a u16 value from a location not u16-aligned.
Note that unaligned accesses can be very expensive on some architectures.
Name
put_unaligned -- put value to a possibly mis-aligned location
Synopsis
put_unaligned (val, ptr);
Arguments
val
value to place
ptr
pointer to location
Description
This macro should be used for placing values larger in size than
single bytes at locations that are expected to be improperly aligned,
e.g. writing a u16 value to a location not u16-aligned.
Note that unaligned accesses can be very expensive on some architectures.
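One portable way to get the same effect outside the kernel is to copy the bytes with memcpy, which makes no alignment assumption. The sketch below is a user-space illustration only; the `sketch_` function names are hypothetical and the kernel macros themselves are implemented per architecture.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 16-bit value from an address that may not be 2-byte aligned. */
uint16_t sketch_get_unaligned_u16(const void *ptr)
{
    uint16_t val;

    memcpy(&val, ptr, sizeof(val)); /* byte-wise copy, no alignment assumed */
    return val;
}

/* Store a 16-bit value to an address that may not be 2-byte aligned. */
void sketch_put_unaligned_u16(uint16_t val, void *ptr)
{
    memcpy(ptr, &val, sizeof(val));
}

/* Round-trip a value through an offset that is not guaranteed to be aligned. */
uint16_t sketch_unaligned_roundtrip(uint16_t val)
{
    unsigned char buf[4] = { 0 };

    sketch_put_unaligned_u16(val, buf + 1);
    return sketch_get_unaligned_u16(buf + 1);
}
```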
Delaying, scheduling, and timer routines
Name
pid_alive -- check that a task structure is not stale
Synopsis
int pid_alive (struct task_struct * p);
Arguments
p
Task structure to be checked.
Description
Test if a process is not yet dead (at most zombie state).
If pid_alive fails, then pointers within the task structure
can be stale and must not be dereferenced.
Name
__wake_up -- wake up threads blocked on a waitqueue.
Synopsis
void fastcall __wake_up (wait_queue_head_t * q, unsigned int mode, int nr_exclusive, void * key);
Arguments
q
the waitqueue
mode
which threads
nr_exclusive
how many wake-one or wake-many threads to wake up
key
is directly passed to the wakeup function
Name
__wake_up_sync -- wake up threads blocked on a waitqueue.
Synopsis
void fastcall __wake_up_sync (wait_queue_head_t * q, unsigned int mode, int nr_exclusive);
Arguments
q
the waitqueue
mode
which threads
nr_exclusive
how many wake-one or wake-many threads to wake up
Description
The sync wakeup differs in that the waker knows that it will schedule
away soon, so while the target thread will be woken up, it will not
be migrated to another CPU, i.e. the two threads are 'synchronized'
with each other. This can prevent needless bouncing between CPUs.
On UP it can prevent extra preemption.
Name
task_nice -- return the nice value of a given task.
Synopsis
int task_nice (const task_t * p);
Arguments
p
the task in question.
Name
sched_setscheduler -- change the scheduling policy and/or RT priority of a thread.
Synopsis
int sched_setscheduler (struct task_struct * p, int policy, struct sched_param * param);
Arguments
p
the task in question.
policy
new policy.
param
structure containing the new RT priority.
Name
yield -- yield the current processor to other threads.
Synopsis
void __sched yield (void);
Arguments
None.
Description
This is a shortcut for kernel-space yielding: it marks the
thread runnable and calls sys_sched_yield.
Name
schedule_timeout -- sleep until timeout
Synopsis
signed long __sched schedule_timeout (signed long timeout);
Arguments
timeout
timeout value in jiffies
Description
Make the current task sleep until timeout jiffies have
elapsed. The routine will return immediately unless
the current task state has been set (see set_current_state).
You can set the task state as follows:
TASK_UNINTERRUPTIBLE - at least timeout jiffies are guaranteed to
pass before the routine returns. The routine will return 0.
TASK_INTERRUPTIBLE - the routine may return early if a signal is
delivered to the current task. In this case the remaining time
in jiffies will be returned, or 0 if the timer expired in time.
The current task state is guaranteed to be TASK_RUNNING when this
routine returns.
Specifying a timeout value of MAX_SCHEDULE_TIMEOUT will schedule
the CPU away without a bound on the timeout. In this case the return
value will be MAX_SCHEDULE_TIMEOUT.
In all cases the return value is guaranteed to be non-negative.
Name
msleep -- sleep safely even with waitqueue interruptions
Synopsis
void msleep (unsigned int msecs);
Arguments
msecs
Time in milliseconds to sleep for
Name
msleep_interruptible -- sleep waiting for signals
Synopsis
unsigned long msleep_interruptible (unsigned int msecs);
Arguments
msecs
Time in milliseconds to sleep for
High-resolution timers
Name
ktime_set -- Set a ktime_t variable from a seconds/nanoseconds value
Synopsis
ktime_t ktime_set (const long secs, const unsigned long nsecs);
Arguments
secs
seconds to set
nsecs
nanoseconds to set
Description
Return the ktime_t representation of the value
Name
ktime_sub -- subtract two ktime_t variables
Synopsis
ktime_t ktime_sub (const ktime_t lhs, const ktime_t rhs);
Arguments
lhs
minuend
rhs
subtrahend
Description
Returns the result of the subtraction (lhs - rhs)
Name
ktime_add -- add two ktime_t variables
Synopsis
ktime_t ktime_add (const ktime_t add1, const ktime_t add2);
Arguments
add1
addend1
add2
addend2
Description
Returns the sum of addend1 and addend2
Name
ktime_add_ns -- Add a scalar nanoseconds value to a ktime_t variable
Synopsis
ktime_t ktime_add_ns (const ktime_t kt, u64 nsec);
Arguments
kt
addend
nsec
the scalar nsec value to add
Description
Returns the sum of kt and nsec in ktime_t format
Name
timespec_to_ktime -- convert a timespec to ktime_t format
Synopsis
ktime_t timespec_to_ktime (const struct timespec ts);
Arguments
ts
the timespec variable to convert
Description
Returns a ktime_t variable with the converted timespec value
Name
timeval_to_ktime -- convert a timeval to ktime_t format
Synopsis
ktime_t timeval_to_ktime (const struct timeval tv);
Arguments
tv
the timeval variable to convert
Description
Returns a ktime_t variable with the converted timeval value
Name
ktime_to_timespec -- convert a ktime_t variable to timespec format
Synopsis
struct timespec ktime_to_timespec (const ktime_t kt);
Arguments
kt
the ktime_t variable to convert
Description
Returns the timespec representation of the ktime value
Name
ktime_to_timeval -- convert a ktime_t variable to timeval format
Synopsis
struct timeval ktime_to_timeval (const ktime_t kt);
Arguments
kt
the ktime_t variable to convert
Description
Returns the timeval representation of the ktime value
Name
ktime_to_clock_t -- convert a ktime_t variable to clock_t format
Synopsis
clock_t ktime_to_clock_t (const ktime_t kt);
Arguments
kt
the ktime_t variable to convert
Description
Returns a clock_t variable with the converted value
Name
ktime_to_ns -- convert a ktime_t variable to scalar nanoseconds
Synopsis
u64 ktime_to_ns (const ktime_t kt);
Arguments
kt
the ktime_t variable to convert
Description
Returns the scalar nanoseconds representation of kt
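To make the arithmetic of the ktime helpers concrete, here is a small user-space model. It assumes that a ktime value can be represented as a single signed 64-bit nanosecond count; that is an assumption of this sketch, not a statement about the kernel's ktime_t layout on every architecture. All `sk_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SK_NSEC_PER_SEC 1000000000LL

/* Hypothetical stand-in for ktime_t: a signed 64-bit nanosecond count. */
typedef struct { int64_t ns; } sk_ktime_t;

/* Model of ktime_set: combine seconds and nanoseconds into one value. */
sk_ktime_t sk_ktime_set(long secs, unsigned long nsecs)
{
    sk_ktime_t kt = { (int64_t)secs * SK_NSEC_PER_SEC + (int64_t)nsecs };
    return kt;
}

/* Model of ktime_add: sum of the two addends. */
sk_ktime_t sk_ktime_add(sk_ktime_t add1, sk_ktime_t add2)
{
    sk_ktime_t kt = { add1.ns + add2.ns };
    return kt;
}

/* Model of ktime_sub: minuend minus subtrahend. */
sk_ktime_t sk_ktime_sub(sk_ktime_t lhs, sk_ktime_t rhs)
{
    sk_ktime_t kt = { lhs.ns - rhs.ns };
    return kt;
}

/* Model of ktime_add_ns: add a scalar nanosecond count. */
sk_ktime_t sk_ktime_add_ns(sk_ktime_t kt, uint64_t nsec)
{
    sk_ktime_t out = { kt.ns + (int64_t)nsec };
    return out;
}

/*
 * Model of ktime_to_ns. The documented prototype returns u64; the sketch
 * returns a signed value so that negative differences stay readable.
 */
int64_t sk_ktime_to_ns(sk_ktime_t kt)
{
    return kt.ns;
}
```

For example, under this model 2.5 s minus 1.75 s yields 750000000 ns, and adding 250000000 ns to 1.75 s yields exactly 2 s.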
Name
struct hrtimer -- the basic hrtimer structure
Synopsis
struct hrtimer {
    struct rb_node node;
    ktime_t expires;
    enum hrtimer_state state;
    int (* function) (void *);
    void * data;
    struct hrtimer_base * base;
};
Members
node
red black tree node for time ordered insertion
expires
the absolute expiry time in the hrtimers internal
representation. The time is related to the clock on
which the timer is based.
state
state of the timer
function
timer expiry callback function
data
argument for the callback function
base
pointer to the timer base (per cpu and per clock)
Description
The hrtimer structure must be initialized by init_hrtimer_#CLOCKTYPE
Name
struct hrtimer_base -- the timer base for a specific clock
Synopsis
struct hrtimer_base {
    clockid_t index;
    spinlock_t lock;
    struct rb_root active;
    struct rb_node * first;
    ktime_t resolution;
    ktime_t (* get_time) (void);
    struct hrtimer * curr_timer;
};
Members
index
clock type index for per_cpu support when moving a timer
to a base on another cpu.
lock
lock protecting the base and associated timers
active
red black tree root node for the active timers
first
pointer to the timer node which expires first
resolution
the resolution of the clock, in nanoseconds
get_time
function to retrieve the current time of the clock
curr_timer
the timer which is executing a callback right now
Name
ktime_get_real -- get the real (wall-) time in ktime_t format
Synopsis
ktime_t ktime_get_real (void);
Arguments
None.
Description
Returns the time in ktime_t format.
Name
ktime_get_ts -- get the monotonic clock in timespec format
Synopsis
void ktime_get_ts (struct timespec * ts);
Arguments
ts
pointer to timespec variable
Description
The function calculates the monotonic clock from the realtime
clock and the wall_to_monotonic offset and stores the result
in normalized timespec format in the variable pointed to by ts.
Internal Functions
Name
reparent_to_init -- Reparent the calling kernel thread to the init task.
Synopsis
void reparent_to_init (void);
Arguments
None.
Description
If a kernel thread is launched as a result of a system call, or if
it ever exits, it should generally reparent itself to init so that
it is correctly cleaned up on exit.
Various task state, such as scheduling policy and priority, may have
been inherited from a user process, so we reset it to sane values here.
NOTE that reparent_to_init gives the caller full capabilities.
Name
sys_tgkill -- send signal to one specific thread
Synopsis
long sys_tgkill (int tgid, int pid, int sig);
Arguments
tgid
the thread group ID of the thread
pid
the PID of the thread
sig
signal to be sent
Description
This syscall also checks the tgid and returns -ESRCH even if the PID
exists but no longer belongs to the target process. This
method solves the problem of threads exiting and PIDs getting reused.
Kernel objects manipulation
Name
kobject_init -- initialize object.
Synopsis
void kobject_init (struct kobject * kobj);
Arguments
kobj
object in question.
Name
kobject_add -- add an object to the hierarchy.
Synopsis
int kobject_add (struct kobject * kobj);
Arguments
kobj
object.
Name
kobject_register -- initialize and add an object.
Synopsis
int kobject_register (struct kobject * kobj);
Arguments
kobj
object in question.
Name
kobject_set_name -- Set the name of an object
Synopsis
int kobject_set_name (struct kobject * kobj, const char * fmt, ...);
Arguments
kobj
object.
fmt
format string used to build the name
...
variable arguments
Description
If strlen(name) >= KOBJ_NAME_LEN, then use a dynamically allocated
string that kobj->k_name points to. Otherwise, use the static
kobj->name array.
Name
kobject_del -- unlink kobject from hierarchy.
Synopsis
void kobject_del (struct kobject * kobj);
Arguments
kobj
object.
Name
kobject_unregister -- remove object from hierarchy and decrement refcount.
Synopsis
void kobject_unregister (struct kobject * kobj);
Arguments
kobj
object going away.
Name
kobject_get -- increment refcount for object.
Synopsis
struct kobject * kobject_get (struct kobject * kobj);
Arguments
kobj
object.
Name
kobject_put -- decrement refcount for object.
Synopsis
void kobject_put (struct kobject * kobj);
Arguments
kobj
object.
Description
Decrement the refcount, and if 0, call kobject_cleanup.
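The get/put pairing can be illustrated with a generic reference-counted object in user space. This is a sketch only: `struct sk_object` and its helpers are hypothetical and use C11 atomics; they are not struct kobject, and the release hook merely stands in for the cleanup step.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical reference-counted object; not the kernel's struct kobject. */
struct sk_object {
    atomic_int refcount;
    void (*release)(struct sk_object *obj); /* runs when the count hits 0 */
    int released;                           /* set by the demo release hook */
};

void sk_object_init(struct sk_object *obj,
                    void (*release)(struct sk_object *obj))
{
    atomic_init(&obj->refcount, 1); /* the creator holds the first reference */
    obj->release = release;
    obj->released = 0;
}

/* Take an extra reference and hand the same pointer back to the caller. */
struct sk_object *sk_object_get(struct sk_object *obj)
{
    if (obj)
        atomic_fetch_add(&obj->refcount, 1);
    return obj;
}

/* Drop a reference; the last put triggers the release hook. */
void sk_object_put(struct sk_object *obj)
{
    /* atomic_fetch_sub returns the old value: 1 means this was the last ref. */
    if (obj && atomic_fetch_sub(&obj->refcount, 1) == 1 && obj->release)
        obj->release(obj);
}

/* Demo release hook that only records that it ran. */
void sk_mark_released(struct sk_object *obj)
{
    obj->released = 1;
}
```

Every successful get must be balanced by exactly one put; the object is released only when the final holder drops its reference.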
Name
kset_register -- initialize and add a kset.
Synopsis
int kset_register (struct kset * k);
Arguments
k
kset.
Name
kset_unregister -- remove a kset.
Synopsis
void kset_unregister (struct kset * k);
Arguments
k
kset.
Name
kset_find_obj -- search for object in kset.
Synopsis
struct kobject * kset_find_obj (struct kset * kset, const char * name);
Arguments
kset
kset we're looking in.
name
object's name.
Description
Lock kset via kset->subsys, and iterate over kset->list,
looking for a matching kobject. If matching object is found
take a reference and return the object.
Name
subsystem_register -- register a subsystem.
Synopsis
int subsystem_register (struct subsystem * s);
Arguments
s
the subsystem we're registering.
Description
Once we register the subsystem, we want to make sure that
the kset points back to this subsystem for correct usage of
the rwsem.
Name
subsys_create_file -- export sysfs attribute file.
Synopsis
int subsys_create_file (struct subsystem * s, struct subsys_attribute * a);
Arguments
s
subsystem.
a
subsystem attribute descriptor.
Name
subsys_remove_file -- remove sysfs attribute file.
Synopsis
void subsys_remove_file (struct subsystem * s, struct subsys_attribute * a);
Arguments
s
subsystem.
a
attribute descriptor.
Kernel utility functions
Name
container_of -- cast a member of a structure out to the containing structure
Synopsis
container_of (ptr, type, member);
Arguments
ptr
the pointer to the member.
type
the type of the container struct this is embedded in.
member
the name of the member within the struct.
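The idea behind container_of is pointer arithmetic with offsetof: subtract the member's offset from the member's address to recover the enclosing structure. The following user-space sketch assumes a simplified macro, `sk_container_of`, without the type checking the kernel version may perform; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for container_of, without any type checking. */
#define sk_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* A small embedded member, similar in spirit to a list link. */
struct sk_link {
    struct sk_link *next;
};

/* The containing structure that embeds the link. */
struct sk_device {
    int id;
    struct sk_link link;
};

/* Recover the device from a pointer to its embedded link. */
struct sk_device *sk_device_from_link(struct sk_link *l)
{
    return sk_container_of(l, struct sk_device, link);
}
```

This is why generic code can hand around pointers to the embedded member only: any consumer that knows the containing type can get back to the full object.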
Name
printk -- print a kernel message
Synopsis
int printk (const char * fmt, ...);
Arguments
fmt
format string
...
variable arguments
Description
This is printk. It can be called from any context. We want it to work.
We try to grab the console_sem. If we succeed, it's easy - we log the output and
call the console drivers. If we fail to get the semaphore we place the output
into the log buffer and return. The current holder of the console_sem will
notice the new output in release_console_sem and will send it to the
consoles before releasing the semaphore.
One effect of this deferred printing is that code which calls printk and
then changes console_loglevel may break. This is because console_loglevel
is inspected when the actual printing occurs.
See also
printf(3)
Name
acquire_console_sem -- lock the console system for exclusive use.
Synopsis
void acquire_console_sem (void);
Arguments
None.
Description
Acquires a semaphore which guarantees that the caller has
exclusive access to the console system and the console_drivers list.
Can sleep, returns nothing.
Name
release_console_sem -- unlock the console system
Synopsis
void release_console_sem (void);
Arguments
None.
Description
Releases the semaphore which the caller holds on the console system
and the console driver list.
While the semaphore was held, console output may have been buffered
by printk. If this is the case, release_console_sem emits
the output prior to releasing the semaphore.
If there is output waiting for klogd, we wake it up.
release_console_sem may be called from any context.
Name
console_conditional_schedule -- yield the CPU if required
Synopsis
void __sched console_conditional_schedule (void);
Arguments
None.
Description
If the console code is currently allowed to sleep, and
if this CPU should yield the CPU to another task, do
so here.
Must be called within acquire_console_sem.
Name
panic -- halt the system
Synopsis
NORET_TYPE void panic (const char * fmt, ...);
Arguments
fmt
The text string to print
...
variable arguments
Description
Display a message, then perform cleanups.
This function never returns.
Name
notifier_chain_register -- Add notifier to a notifier chain
Synopsis
int notifier_chain_register (struct notifier_block ** list, struct notifier_block * n);
Arguments
list
Pointer to root list pointer
n
New entry in notifier chain
Description
Adds a notifier to a notifier chain.
Currently always returns zero.
Name
notifier_chain_unregister -- Remove notifier from a notifier chain
Synopsis
int notifier_chain_unregister (struct notifier_block ** nl, struct notifier_block * n);
Arguments
nl
Pointer to root list pointer
n
Entry to remove from notifier chain
Description
Removes a notifier from a notifier chain.
Returns zero on success, or -ENOENT on failure.
Name
notifier_call_chain -- Call functions in a notifier chain
Synopsis
int __kprobes notifier_call_chain (struct notifier_block ** n, unsigned long val, void * v);
Arguments
n
Pointer to root pointer of notifier chain
val
Value passed unmodified to notifier function
v
Pointer passed unmodified to notifier function
Description
Calls each function in a notifier chain in turn.
If the return value of the notifier can be and'd
with NOTIFY_STOP_MASK, then notifier_call_chain
will return immediately, with the return value of
the notifier function which halted execution.
Otherwise, the return value is the return value
of the last notifier function called.
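The early-exit rule described above can be modelled in a few lines of user-space C. This is a sketch, not the kernel implementation: the `sk_` types and functions are hypothetical, the constant values are assumptions made for the example, and registration here simply appends to the chain without any priority ordering.

```c
#include <assert.h>
#include <stddef.h>

/* Return codes for the sketch; the numeric values are assumptions. */
#define SK_NOTIFY_DONE      0x0000
#define SK_NOTIFY_OK        0x0001
#define SK_NOTIFY_STOP_MASK 0x8000
#define SK_NOTIFY_STOP      (SK_NOTIFY_OK | SK_NOTIFY_STOP_MASK)

struct sk_notifier_block {
    int (*notifier_call)(struct sk_notifier_block *self,
                         unsigned long val, void *v);
    struct sk_notifier_block *next;
};

/* Append n to the end of the chain; this sketch ignores priorities. */
int sk_chain_register(struct sk_notifier_block **list,
                      struct sk_notifier_block *n)
{
    while (*list)
        list = &(*list)->next;
    n->next = NULL;
    *list = n;
    return 0;
}

/* Call each notifier in turn, stopping when one sets the stop bit. */
int sk_call_chain(struct sk_notifier_block **n, unsigned long val, void *v)
{
    int ret = SK_NOTIFY_DONE;
    struct sk_notifier_block *nb = *n;

    while (nb) {
        ret = nb->notifier_call(nb, val, v);
        if (ret & SK_NOTIFY_STOP_MASK)
            return ret; /* a callback asked to halt the walk */
        nb = nb->next;
    }
    return ret; /* value from the last callback, or DONE if the chain is empty */
}

/* Demo callback: counts its invocation through v and lets the walk go on. */
int sk_cb_ok(struct sk_notifier_block *self, unsigned long val, void *v)
{
    (void)self;
    (void)val;
    ++*(int *)v;
    return SK_NOTIFY_OK;
}

/* Demo callback: counts its invocation through v and halts the walk. */
int sk_cb_stop(struct sk_notifier_block *self, unsigned long val, void *v)
{
    (void)self;
    (void)val;
    ++*(int *)v;
    return SK_NOTIFY_STOP;
}
```

With three blocks registered in the order ok, stop, ok, only the first two callbacks run and the caller sees the value returned by the stopping callback.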
Name
register_reboot_notifier -- Register function to be called at reboot time
Synopsis
int register_reboot_notifier (struct notifier_block * nb);
Arguments
nb
Info about notifier function to be called
Description
Registers a function with the list of functions
to be called at reboot time.
Currently always returns zero, as notifier_chain_register
always returns zero.
Name
unregister_reboot_notifier -- Unregister previously registered reboot notifier
Synopsis
int unregister_reboot_notifier (struct notifier_block * nb);
Arguments
nb
Hook to be unregistered
Description
Unregisters a previously registered reboot
notifier function.
Returns zero on success, or -ENOENT on failure.
Name
emergency_restart -- reboot the system
Synopsis
void emergency_restart (void);
Arguments
None.
Description
Without shutting down any hardware or taking any locks
reboot the system. This is called when we know we are in
trouble so this is our best effort to reboot. This is
safe to call in interrupt context.
Name
kernel_restart -- reboot the system
Synopsis
void kernel_restart (char * cmd);
Arguments
cmd
pointer to buffer containing command to execute for restart
or NULL
Description
Shutdown everything and perform a clean reboot.
This is not safe to call in interrupt context.
Name
kernel_kexec -- reboot the system
Synopsis
void kernel_kexec (void);
Arguments
None.
Description
Move into place and start executing a preloaded standalone
executable. If nothing was preloaded return an error.
Name
kernel_halt -- halt the system
Synopsis
void kernel_halt (void);
Arguments
None.
Description
Shutdown everything and perform a clean system halt.
Name
kernel_power_off -- power_off the system
Synopsis
void kernel_power_off (void);
Arguments
None.
Description
Shutdown everything and perform a clean system power_off.
Name
call_rcu -- Queue an RCU callback for invocation after a grace period.
Synopsis
void fastcall call_rcu (struct rcu_head * head, void (*func)(struct rcu_head *rcu));
Arguments
head
structure to be used for queueing the RCU updates.
func
actual update function to be invoked after the grace period
Description
The update function will be invoked some time after a full grace
period elapses, in other words after all currently executing RCU
read-side critical sections have completed. RCU read-side critical
sections are delimited by rcu_read_lock and rcu_read_unlock,
and may be nested.
Name
call_rcu_bh -- Queue an RCU callback for invocation after a quicker grace period.
Synopsis
void fastcall call_rcu_bh (struct rcu_head * head, void (*func)(struct rcu_head *rcu));
Arguments
head
structure to be used for queueing the RCU updates.
func
actual update function to be invoked after the grace period
Description
The update function will be invoked some time after a full grace
period elapses, in other words after all currently executing RCU
read-side critical sections have completed. call_rcu_bh assumes
that the read-side critical sections end on completion of a softirq
handler. This means that read-side critical sections in process
context must not be interrupted by softirqs. This interface is to be
used when most of the read-side critical sections are in softirq context.
RCU read-side critical sections are delimited by rcu_read_lock and
rcu_read_unlock if in interrupt context, or by rcu_read_lock_bh
and rcu_read_unlock_bh if in process context. These may be nested.
Name
rcu_barrier -- Wait until all the in-flight RCU callbacks are complete.
Synopsis
void rcu_barrier (void);
Arguments
void
no arguments
Namesynchronize_rcu --
wait until a grace period has elapsed.
SynopsisSynopsisvoid synchronize_rcu (void); void;ArgumentsArgumentsvoid
no arguments
DescriptionDescription
Control will return to the caller some time after a full grace
period has elapsed, in other words after all currently executing RCU
read-side critical sections have completed. RCU read-side critical
sections are delimited by rcu_read_lock and rcu_read_unlock,
and may be nested.
If your read-side code is not protected by rcu_read_lock, do not
use synchronize_rcu.
Chapter 2. Data Types
Doubly Linked Lists
Name
list_add -- add a new entry
Synopsis
void list_add (new, head);
struct list_head * new;
struct list_head * head;
Arguments
new
new entry to be added
head
list head to add it after
DescriptionDescription
Insert a new entry after the specified head.
This is good for implementing stacks.
Namelist_add_tail --
add a new entry
SynopsisSynopsisvoid list_add_tail (new, head);struct list_head * new;struct list_head * head;ArgumentsArgumentsnew
new entry to be added
head
list head to add it before
DescriptionDescription
Insert a new entry before the specified head.
This is useful for implementing queues.
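The stack and queue behaviour described for list_add and list_add_tail can be illustrated outside the kernel. The following is a minimal user-space sketch, not the kernel implementation: struct list_head is re-declared locally, and the names sketch_link, sketch_list_add, sketch_list_add_tail, struct item and first_value_after_inserts are hypothetical helpers introduced only for this example.

```c
#include <stddef.h>

/* User-space stand-in for the kernel's struct list_head (assumption:
 * declared here only so the sketch builds outside the kernel tree). */
struct list_head {
    struct list_head *next, *prev;
};

/* Link new_entry between two known consecutive entries. */
static void sketch_link(struct list_head *new_entry,
                        struct list_head *prev, struct list_head *next)
{
    next->prev = new_entry;
    new_entry->next = next;
    new_entry->prev = prev;
    prev->next = new_entry;
}

/* Insert right after head: the newest entry is found first (stack). */
void sketch_list_add(struct list_head *new_entry, struct list_head *head)
{
    sketch_link(new_entry, head, head->next);
}

/* Insert right before head: the oldest entry is found first (queue). */
void sketch_list_add_tail(struct list_head *new_entry, struct list_head *head)
{
    sketch_link(new_entry, head->prev, head);
}

/* Hypothetical payload with an embedded list_head. */
struct item {
    int value;
    struct list_head link;
};

/* Insert the values 1, 2, 3 and report which value sits first. */
int first_value_after_inserts(int use_tail)
{
    struct list_head head = { &head, &head };
    struct item items[3];
    int i;

    for (i = 0; i < 3; i++) {
        items[i].value = i + 1;
        if (use_tail)
            sketch_list_add_tail(&items[i].link, &head);
        else
            sketch_list_add(&items[i].link, &head);
    }
    /* Recover the containing struct item from its embedded link. */
    return ((struct item *)((char *)head.next -
                            offsetof(struct item, link)))->value;
}
```

With use_tail set to 0 the last value inserted is first in the list; with use_tail set to 1 the first value inserted stays first.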
Namelist_add_rcu --
add a new entry to rcu-protected list
SynopsisSynopsisvoid list_add_rcu (new, head);struct list_head * new;struct list_head * head;ArgumentsArgumentsnew
new entry to be added
head
list head to add it after
DescriptionDescription
Insert a new entry after the specified head.
This is good for implementing stacks.
The caller must take whatever precautions are necessary
(such as holding appropriate locks) to avoid racing
with another list-mutation primitive, such as list_add_rcu
or list_del_rcu, running on this same list.
However, it is perfectly legal to run concurrently with
the _rcu list-traversal primitives, such as
list_for_each_entry_rcu.
Namelist_add_tail_rcu --
add a new entry to rcu-protected list
SynopsisSynopsisvoid list_add_tail_rcu (new, head);struct list_head * new;struct list_head * head;ArgumentsArgumentsnew
new entry to be added
head
list head to add it before
DescriptionDescription
Insert a new entry before the specified head.
This is useful for implementing queues.
The caller must take whatever precautions are necessary
(such as holding appropriate locks) to avoid racing
with another list-mutation primitive, such as list_add_tail_rcu
or list_del_rcu, running on this same list.
However, it is perfectly legal to run concurrently with
the _rcu list-traversal primitives, such as
list_for_each_entry_rcu.
Namelist_del --
deletes entry from list.
SynopsisSynopsisvoid list_del (entry);struct list_head * entry;ArgumentsArgumentsentry
the element to delete from the list.
Note
list_empty on entry does not return true after this; the entry is
in an undefined state.
Namelist_del_rcu --
deletes entry from list without re-initialization
SynopsisSynopsisvoid list_del_rcu (entry);struct list_head * entry;ArgumentsArgumentsentry
the element to delete from the list.
Note
list_empty on entry does not return true after this;
the entry is in an undefined state. It is useful for RCU-based
lock-free traversal.
In particular, it means that we cannot poison the forward
pointers that may still be used for walking the list.
The caller must take whatever precautions are necessary
(such as holding appropriate locks) to avoid racing
with another list-mutation primitive, such as list_del_rcu
or list_add_rcu, running on this same list.
However, it is perfectly legal to run concurrently with
the _rcu list-traversal primitives, such as
list_for_each_entry_rcu.
Note that the caller is not permitted to immediately free
the newly deleted entry. Instead, either synchronize_rcu
or call_rcu must be used to defer freeing until an RCU
grace period has elapsed.
Namelist_del_init --
deletes entry from list and reinitialize it.
SynopsisSynopsisvoid list_del_init (entry);struct list_head * entry;ArgumentsArgumentsentry
the element to delete from the list.
Namelist_move --
delete from one list and add as another's head
SynopsisSynopsisvoid list_move (list, head);struct list_head * list;struct list_head * head;ArgumentsArgumentslist
the entry to move
head
the head that will precede our entry
Namelist_move_tail --
delete from one list and add as another's tail
SynopsisSynopsisvoid list_move_tail (list, head);struct list_head * list;struct list_head * head;ArgumentsArgumentslist
the entry to move
head
the head that will follow our entry
Namelist_empty --
tests whether a list is empty
SynopsisSynopsisint list_empty (head);const struct list_head * head;ArgumentsArgumentshead
the list to test.
Name
list_empty_careful -- tests whether a list is empty _and_ checks that
no other CPU might be in the process of still modifying either member
(next or prev)
Synopsis
int list_empty_careful (head);
const struct list_head * head;
Arguments
head
the list to test.
NOTE
Using list_empty_careful without synchronization
can only be safe if the only activity that can happen
to the list entry is list_del_init. E.g., it cannot be used
if another CPU could re-list_add it.
Namelist_splice --
join two lists
SynopsisSynopsisvoid list_splice (list, head);struct list_head * list;struct list_head * head;ArgumentsArgumentslist
the new list to add.
head
the place to add it in the first list.
Namelist_splice_init --
join two lists and reinitialise the emptied list.
SynopsisSynopsisvoid list_splice_init (list, head);struct list_head * list;struct list_head * head;ArgumentsArgumentslist
the new list to add.
head
the place to add it in the first list.
Description
The list at list is reinitialised.
Namelist_entry --
get the struct for this entry
SynopsisSynopsis list_entry (ptr, type, member); ptr; type; member;ArgumentsArgumentsptr
the &struct list_head pointer.
type
the type of the struct this is embedded in.
member
the name of the list_struct within the struct.
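The three arguments of list_entry map onto a pointer adjustment: subtract the offset of the embedded member from the member's address to reach the enclosing structure. The sketch below is a user-space approximation under that assumption; sketch_list_entry, struct task and id_via_list_entry are hypothetical names, and the kernel's own macro may be written differently.

```c
#include <stddef.h>

/* Local stand-in for the kernel type, for illustration only. */
struct list_head {
    struct list_head *next, *prev;
};

/* Illustrative equivalent of list_entry: step back from the embedded
 * member to the start of the enclosing structure. */
#define sketch_list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical structure with a list_head embedded after another field. */
struct task {
    int id;
    struct list_head node;
};

/* Build a task, keep only a pointer to its node, and read the id back. */
int id_via_list_entry(int id)
{
    struct task t;
    struct list_head *ptr;

    t.id = id;
    t.node.next = &t.node;
    t.node.prev = &t.node;
    ptr = &t.node;      /* all that list code would normally hold */
    return sketch_list_entry(ptr, struct task, node)->id;
}
```

Here ptr plays the role of the &struct list_head pointer, struct task is the type, and node is the member name.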
Namelist_for_each --
iterate over a list
SynopsisSynopsis list_for_each (pos, head); pos; head;ArgumentsArgumentspos
the &struct list_head to use as a loop counter.
head
the head for your list.
Name__list_for_each --
iterate over a list
SynopsisSynopsis __list_for_each (pos, head); pos; head;ArgumentsArgumentspos
the &struct list_head to use as a loop counter.
head
the head for your list.
Description
This variant differs from list_for_each in that it is the
simplest possible list iteration code; no prefetching is done.
Use this for code that knows the list to be very short (empty
or 1 entry) most of the time.
Namelist_for_each_prev --
iterate over a list backwards
SynopsisSynopsis list_for_each_prev (pos, head); pos; head;ArgumentsArgumentspos
the &struct list_head to use as a loop counter.
head
the head for your list.
Namelist_for_each_safe --
iterate over a list safe against removal of list entry
SynopsisSynopsis list_for_each_safe (pos, n, head); pos; n; head;ArgumentsArgumentspos
the &struct list_head to use as a loop counter.
n
another &struct list_head to use as temporary storage
head
the head for your list.
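The reason list_for_each_safe needs the temporary n is that the successor of pos is saved before the loop body runs, so the body may unlink pos. The following user-space sketch shows that pattern under stated assumptions: the list type and the helpers sketch_add_tail, sketch_del and count_after_removing_odd are hypothetical stand-ins, not the kernel code.

```c
#include <stddef.h>

/* Local stand-in for the kernel type, for illustration only. */
struct list_head {
    struct list_head *next, *prev;
};

/* n always holds the successor of pos, so pos may be unlinked
 * inside the loop body without breaking the walk. */
#define sketch_list_for_each_safe(pos, n, head) \
    for (pos = (head)->next, n = pos->next; pos != (head); \
         pos = n, n = pos->next)

static void sketch_add_tail(struct list_head *new_entry,
                            struct list_head *head)
{
    new_entry->prev = head->prev;
    new_entry->next = head;
    head->prev->next = new_entry;
    head->prev = new_entry;
}

static void sketch_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = entry->prev = NULL;   /* entry is now unusable */
}

struct item {
    struct list_head link;  /* first member, so a cast recovers the item */
    int value;
};

/* Queue the values 0..4, unlink the odd ones while walking the list,
 * then count the entries that remain. */
int count_after_removing_odd(void)
{
    struct list_head head = { &head, &head };
    struct list_head *pos, *n;
    struct item items[5];
    int i, remaining = 0;

    for (i = 0; i < 5; i++) {
        items[i].value = i;
        sketch_add_tail(&items[i].link, &head);
    }
    sketch_list_for_each_safe(pos, n, &head) {
        if (((struct item *)pos)->value % 2)
            sketch_del(pos);
    }
    sketch_list_for_each_safe(pos, n, &head)
        remaining++;
    return remaining;
}
```

A plain traversal that read pos->next after sketch_del would follow a cleared pointer; saving n beforehand avoids that.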
Namelist_for_each_entry --
iterate over list of given type
SynopsisSynopsis list_for_each_entry (pos, head, member); pos; head; member;ArgumentsArgumentspos
the type * to use as a loop counter.
head
the head for your list.
member
the name of the list_struct within the struct.
Namelist_for_each_entry_reverse --
iterate backwards over list of given type.
SynopsisSynopsis list_for_each_entry_reverse (pos, head, member); pos; head; member;ArgumentsArgumentspos
the type * to use as a loop counter.
head
the head for your list.
member
the name of the list_struct within the struct.
Name
list_prepare_entry -- prepare a pos entry for use as a start point in
list_for_each_entry_continue
Synopsis
list_prepare_entry (pos, head, member);
Arguments
pos
the type * to use as a start point
head
the head of the list
member
the name of the list_struct within the struct.
Name
list_for_each_entry_continue -- iterate over list of given type,
continuing after existing point
Synopsis
list_for_each_entry_continue (pos, head, member);
Arguments
pos
the type * to use as a loop counter.
head
the head for your list.
member
the name of the list_struct within the struct.
Namelist_for_each_entry_safe --
iterate over list of given type safe against removal of list entry
SynopsisSynopsis list_for_each_entry_safe (pos, n, head, member); pos; n; head; member;ArgumentsArgumentspos
the type * to use as a loop counter.
n
another type * to use as temporary storage
head
the head for your list.
member
the name of the list_struct within the struct.
Name
list_for_each_entry_safe_continue -- iterate over list of given type,
continuing after existing point, safe against removal of list entry
Synopsis
list_for_each_entry_safe_continue (pos, n, head, member);
Arguments
pos
the type * to use as a loop counter.
n
another type * to use as temporary storage
head
the head for your list.
member
the name of the list_struct within the struct.
Name
list_for_each_entry_safe_reverse -- iterate backwards over list of given
type, safe against removal of list entry
Synopsis
list_for_each_entry_safe_reverse (pos, n, head, member);
Arguments
pos
the type * to use as a loop counter.
n
another type * to use as temporary storage
head
the head for your list.
member
the name of the list_struct within the struct.
Namelist_for_each_rcu --
iterate over an rcu-protected list
SynopsisSynopsis list_for_each_rcu (pos, head); pos; head;ArgumentsArgumentspos
the &struct list_head to use as a loop counter.
head
the head for your list.
DescriptionDescription
This list-traversal primitive may safely run concurrently with
the _rcu list-mutation primitives such as list_add_rcu
as long as the traversal is guarded by rcu_read_lock.
Name
list_for_each_safe_rcu -- iterate over an rcu-protected list safe
against removal of list entry
SynopsisSynopsis list_for_each_safe_rcu (pos, n, head); pos; n; head;ArgumentsArgumentspos
the &struct list_head to use as a loop counter.
n
another &struct list_head to use as temporary storage
head
the head for your list.
DescriptionDescription
This list-traversal primitive may safely run concurrently with
the _rcu list-mutation primitives such as list_add_rcu
as long as the traversal is guarded by rcu_read_lock.
Namelist_for_each_entry_rcu --
iterate over rcu list of given type
SynopsisSynopsis list_for_each_entry_rcu (pos, head, member); pos; head; member;ArgumentsArgumentspos
the type * to use as a loop counter.
head
the head for your list.
member
the name of the list_struct within the struct.
DescriptionDescription
This list-traversal primitive may safely run concurrently with
the _rcu list-mutation primitives such as list_add_rcu
as long as the traversal is guarded by rcu_read_lock.
Namelist_for_each_continue_rcu --
iterate over an rcu-protected list
SynopsisSynopsis list_for_each_continue_rcu (pos, head); pos; head;ArgumentsArgumentspos
the &struct list_head to use as a loop counter.
head
the head for your list.
DescriptionDescription
This list-traversal primitive may safely run concurrently with
the _rcu list-mutation primitives such as list_add_rcu
as long as the traversal is guarded by rcu_read_lock.
Namehlist_del_rcu --
deletes entry from hash list without re-initialization
SynopsisSynopsisvoid hlist_del_rcu (n);struct hlist_node * n;ArgumentsArgumentsn
the element to delete from the hash list.
Note
hlist_unhashed on entry does not return true after this;
the entry is in an undefined state. It is useful for RCU-based
lock-free traversal.
In particular, it means that we cannot poison the forward
pointers that may still be used for walking the hash list.
The caller must take whatever precautions are necessary
(such as holding appropriate locks) to avoid racing
with another list-mutation primitive, such as hlist_add_head_rcu
or hlist_del_rcu, running on this same list.
However, it is perfectly legal to run concurrently with
the _rcu list-traversal primitives, such as
hlist_for_each_entry_rcu.
Name
hlist_add_head_rcu -- adds the specified element to the specified hlist,
while permitting racing traversals
SynopsisSynopsisvoid hlist_add_head_rcu (n, h);struct hlist_node * n;struct hlist_head * h;ArgumentsArgumentsn
the element to add to the hash list.
h
the list to add to.
DescriptionDescription
The caller must take whatever precautions are necessary
(such as holding appropriate locks) to avoid racing
with another list-mutation primitive, such as hlist_add_head_rcu
or hlist_del_rcu, running on this same list.
However, it is perfectly legal to run concurrently with
the _rcu list-traversal primitives, such as
hlist_for_each_entry_rcu, used to prevent memory-consistency
problems on Alpha CPUs. Regardless of the type of CPU, the
list-traversal primitive must be guarded by rcu_read_lock.
Namehlist_add_before_rcu --
adds the specified element to the specified hlist
SynopsisSynopsisvoid hlist_add_before_rcu (n, next);struct hlist_node * n;struct hlist_node * next;ArgumentsArgumentsn
the new element to add to the hash list.
next
the existing element to add the new element before.
DescriptionDescription
The caller must take whatever precautions are necessary
(such as holding appropriate locks) to avoid racing
with another list-mutation primitive, such as hlist_add_head_rcu
or hlist_del_rcu, running on this same list.
However, it is perfectly legal to run concurrently with
the _rcu list-traversal primitives, such as
hlist_for_each_entry_rcu, used to prevent memory-consistency
problems on Alpha CPUs.
Namehlist_add_after_rcu --
adds the specified element to the specified hlist
SynopsisSynopsisvoid hlist_add_after_rcu (prev, n);struct hlist_node * prev;struct hlist_node * n;ArgumentsArgumentsprev
the existing element to add the new element after.
n
the new element to add to the hash list.
DescriptionDescription
The caller must take whatever precautions are necessary
(such as holding appropriate locks) to avoid racing
with another list-mutation primitive, such as hlist_add_head_rcu
or hlist_del_rcu, running on this same list.
However, it is perfectly legal to run concurrently with
the _rcu list-traversal primitives, such as
hlist_for_each_entry_rcu, used to prevent memory-consistency
problems on Alpha CPUs.
Namehlist_for_each_entry --
iterate over list of given type
SynopsisSynopsis hlist_for_each_entry (tpos, pos, head, member); tpos; pos; head; member;ArgumentsArgumentstpos
the type * to use as a loop counter.
pos
the &struct hlist_node to use as a loop counter.
head
the head for your list.
member
the name of the hlist_node within the struct.
Namehlist_for_each_entry_continue --
iterate over a hlist continuing after existing point
SynopsisSynopsis hlist_for_each_entry_continue (tpos, pos, member); tpos; pos; member;ArgumentsArgumentstpos
the type * to use as a loop counter.
pos
the &struct hlist_node to use as a loop counter.
member
the name of the hlist_node within the struct.
Namehlist_for_each_entry_from --
iterate over a hlist continuing from existing point
SynopsisSynopsis hlist_for_each_entry_from (tpos, pos, member); tpos; pos; member;ArgumentsArgumentstpos
the type * to use as a loop counter.
pos
the &struct hlist_node to use as a loop counter.
member
the name of the hlist_node within the struct.
Namehlist_for_each_entry_safe --
iterate over list of given type safe against removal of list entry
SynopsisSynopsis hlist_for_each_entry_safe (tpos, pos, n, head, member); tpos; pos; n; head; member;ArgumentsArgumentstpos
the type * to use as a loop counter.
pos
the &struct hlist_node to use as a loop counter.
n
another &struct hlist_node to use as temporary storage
head
the head for your list.
member
the name of the hlist_node within the struct.
Namehlist_for_each_entry_rcu --
iterate over rcu list of given type
SynopsisSynopsis hlist_for_each_entry_rcu (tpos, pos, head, member); tpos; pos; head; member;ArgumentsArgumentstpos
the type * to use as a loop counter.
pos
the &struct hlist_node to use as a loop counter.
head
the head for your list.
member
the name of the hlist_node within the struct.
DescriptionDescription
This list-traversal primitive may safely run concurrently with
the _rcu list-mutation primitives such as hlist_add_head_rcu
as long as the traversal is guarded by rcu_read_lock.
Chapter 3. Basic C Library Functions
When writing drivers, you cannot in general use routines which are
from the C Library. Some of the functions have been found generally
useful and they are listed below. The behaviour of these functions
may vary slightly from those defined by ANSI, and these deviations
are noted in the text.
String Conversions
Name
simple_strtoll -- convert a string to a signed long long
Synopsis
long long simple_strtoll (cp, endp, base);
const char * cp;
char ** endp;
unsigned int base;
Arguments
cp
The start of the string
endp
A pointer to the end of the parsed string will be placed here
base
The number base to use
Namesimple_strtoul --
convert a string to an unsigned long
SynopsisSynopsisunsigned long simple_strtoul (cp, endp, base);const char * cp;char ** endp;unsigned int base;ArgumentsArgumentscp
The start of the string
endp
A pointer to the end of the parsed string will be placed here
base
The number base to use
Namesimple_strtol --
convert a string to a signed long
SynopsisSynopsislong simple_strtol (cp, endp, base);const char * cp;char ** endp;unsigned int base;ArgumentsArgumentscp
The start of the string
endp
A pointer to the end of the parsed string will be placed here
base
The number base to use
Namesimple_strtoull --
convert a string to an unsigned long long
SynopsisSynopsisunsigned long long simple_strtoull (cp, endp, base);const char * cp;char ** endp;unsigned int base;ArgumentsArgumentscp
The start of the string
endp
A pointer to the end of the parsed string will be placed here
base
The number base to use
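The endp argument shared by the simple_strto* routines reports where parsing stopped. Those routines are kernel-only, so the sketch below uses the standard user-space strtoul as a stand-in; it takes a string, an end pointer and a base in the same order, but its handling of whitespace and overflow should not be assumed to match the kernel helpers. The function names leading_number and number_length are hypothetical.

```c
#include <stdlib.h>
#include <stddef.h>

/* Value of the decimal number at the start of s. */
unsigned long leading_number(const char *s)
{
    return strtoul(s, NULL, 10);
}

/* Offset of the first character that is not part of that number,
 * taken from the end pointer that strtoul fills in. */
size_t number_length(const char *s)
{
    char *end;

    strtoul(s, &end, 10);
    return (size_t)(end - s);
}
```

For an input such as "42px", the end pointer is left at the 'p', which lets the caller continue parsing the unit suffix.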
Namevsnprintf --
Format a string and place it in a buffer
SynopsisSynopsisint vsnprintf (buf, size, fmt, args);char * buf;size_t size;const char * fmt;va_list args;ArgumentsArgumentsbuf
The buffer to place the result into
size
The size of the buffer, including the trailing null space
fmt
The format string to use
args
Arguments for the format string
DescriptionDescription
The return value is the number of characters which would
be generated for the given input, excluding the trailing
'\0', as per ISO C99. If you want to have the exact
number of characters written into buf as return value
(not including the trailing '\0'), use vscnprintf. If the
return is greater than or equal to size, the resulting
string is truncated.
Call this function if you are already dealing with a va_list.
You probably want snprintf instead.
Namevscnprintf --
Format a string and place it in a buffer
SynopsisSynopsisint vscnprintf (buf, size, fmt, args);char * buf;size_t size;const char * fmt;va_list args;ArgumentsArgumentsbuf
The buffer to place the result into
size
The size of the buffer, including the trailing null space
fmt
The format string to use
args
Arguments for the format string
DescriptionDescription
The return value is the number of characters which have been written into
the buf not including the trailing '\0'. If size is <= 0 the function
returns 0.
Call this function if you are already dealing with a va_list.
You probably want scnprintf instead.
Namesnprintf --
Format a string and place it in a buffer
SynopsisSynopsisint snprintf (buf, size, fmt, ...);char * buf;size_t size;const char * fmt; ...;ArgumentsArgumentsbuf
The buffer to place the result into
size
The size of the buffer, including the trailing null space
fmt
The format string to use
...
Arguments for the format string
DescriptionDescription
The return value is the number of characters which would be
generated for the given input, excluding the trailing null,
as per ISO C99. If the return is greater than or equal to
size, the resulting string is truncated.
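Because the return value counts the characters that would have been generated, comparing it with size is enough to detect truncation. The sketch below uses the user-space snprintf, which follows the same ISO C99 rule; the "dev-%d" format and the function name name_truncated are hypothetical.

```c
#include <stdio.h>
#include <stddef.h>

/* Format a hypothetical device name into a buffer limited to size
 * bytes and return 1 when the text had to be truncated. */
int name_truncated(size_t size, int id)
{
    char buf[32];
    int n;

    if (size > sizeof(buf))
        size = sizeof(buf);
    n = snprintf(buf, size, "dev-%d", id);

    /* n is the length the full text needs, so n >= size means the
     * stored string was cut short. */
    return n < 0 || (size_t)n >= size;
}
```

The comparison uses >= rather than > because size includes the space for the trailing null.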
Namescnprintf --
Format a string and place it in a buffer
SynopsisSynopsisint scnprintf (buf, size, fmt, ...);char * buf;size_t size;const char * fmt; ...;ArgumentsArgumentsbuf
The buffer to place the result into
size
The size of the buffer, including the trailing null space
fmt
The format string to use
...
Arguments for the format string
Description
The return value is the number of characters written into buf not including
the trailing '\0'. If size is <= 0 the function returns 0. Because the count
reflects what was actually stored, the return value is always less than size
for a non-zero size, so it cannot be used to detect truncation.
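A return value that counts only the characters actually stored is convenient when several pieces are appended to one buffer, since the running offset can never move past the end. The following is a user-space approximation built on vsnprintf, not the kernel source; sketch_scnprintf and report_length are hypothetical names, and the clamping rule is an assumption derived from the description above.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stddef.h>

/* User-space approximation of scnprintf built on vsnprintf. */
int sketch_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    if (size == 0)
        return 0;
    va_start(args, fmt);
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);
    if (n < 0)
        return 0;
    /* Clamp to what was really stored, excluding the trailing '\0'. */
    return (size_t)n >= size ? (int)(size - 1) : n;
}

/* Append two fields into a buffer limited to size bytes and return
 * the accumulated length; the offset stays inside the buffer even
 * when the text is cut short. */
int report_length(size_t size)
{
    char buf[32];
    int len = 0;

    if (size > sizeof(buf))
        size = sizeof(buf);
    len += sketch_scnprintf(buf + len, size - (size_t)len, "%s", "status=");
    len += sketch_scnprintf(buf + len, size - (size_t)len, "%s", "running");
    return len;
}
```

With the snprintf convention the same loop would need an explicit bounds check after every call, because the returned count can exceed the space left.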
Namevsprintf --
Format a string and place it in a buffer
SynopsisSynopsisint vsprintf (buf, fmt, args);char * buf;const char * fmt;va_list args;ArgumentsArgumentsbuf
The buffer to place the result into
fmt
The format string to use
args
Arguments for the format string
DescriptionDescription
The function returns the number of characters written
into buf. Use vsnprintf or vscnprintf in order to avoid
buffer overflows.
Call this function if you are already dealing with a va_list.
You probably want sprintf instead.
Namesprintf --
Format a string and place it in a buffer
SynopsisSynopsisint sprintf (buf, fmt, ...);char * buf;const char * fmt; ...;ArgumentsArgumentsbuf
The buffer to place the result into
fmt
The format string to use
...
Arguments for the format string
DescriptionDescription
The function returns the number of characters written
into buf. Use snprintf or scnprintf in order to avoid
buffer overflows.
Namevsscanf --
Unformat a buffer into a list of arguments
SynopsisSynopsisint vsscanf (buf, fmt, args);const char * buf;const char * fmt;va_list args;ArgumentsArgumentsbuf
input buffer
fmt
format of buffer
args
arguments
Namesscanf --
Unformat a buffer into a list of arguments
SynopsisSynopsisint sscanf (buf, fmt, ...);const char * buf;const char * fmt; ...;ArgumentsArgumentsbuf
input buffer
fmt
formatting of buffer
...
resulting arguments
String Manipulation
Name
strnicmp -- Case insensitive, length-limited string comparison
Synopsis
int strnicmp (s1, s2, len);
const char * s1;
const char * s2;
size_t len;
Arguments
s1
One string
s2
The other string
len
the maximum number of characters to compare
Namestrcpy --
Copy a NUL terminated string
SynopsisSynopsischar * strcpy (dest, src);char * dest;const char * src;ArgumentsArgumentsdest
Where to copy the string to
src
Where to copy the string from
Namestrncpy --
Copy a length-limited, NUL-terminated string
SynopsisSynopsischar * strncpy (dest, src, count);char * dest;const char * src;size_t count;ArgumentsArgumentsdest
Where to copy the string to
src
Where to copy the string from
count
The maximum number of bytes to copy
Description
The result is not NUL-terminated if the length of src is
count bytes or more.
If the length of src is less than count, the remainder of
dest will be padded with NUL.
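Since strncpy may leave the destination unterminated, a common defensive pattern is to copy at most one byte less than the buffer size and write the terminator explicitly. The sketch below uses the user-space strncpy, which has the same termination and padding rules; copy_terminated and copied_length are hypothetical helper names.

```c
#include <string.h>

/* Copy src into a dest_size-byte buffer and always terminate it. */
void copy_terminated(char *dest, size_t dest_size, const char *src)
{
    if (dest_size == 0)
        return;
    strncpy(dest, src, dest_size - 1);  /* may stop without a NUL */
    dest[dest_size - 1] = '\0';         /* so terminate explicitly */
}

/* Copy src into a 4-byte buffer and report the stored length. */
size_t copied_length(const char *src)
{
    char name[4];

    copy_terminated(name, sizeof(name), src);
    return strlen(name);
}
```

A 4-byte buffer therefore never holds more than three characters of src, whatever its length.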
Namestrlcpy --
Copy a NUL terminated string into a sized buffer
SynopsisSynopsissize_t strlcpy (dest, src, size);char * dest;const char * src;size_t size;ArgumentsArgumentsdest
Where to copy the string to
src
Where to copy the string from
size
size of destination buffer
Description
Compatible with *BSD: the result is always a valid
NUL-terminated string that fits in the buffer (unless,
of course, the buffer size is zero). It does not pad
out the result like strncpy does.
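The difference from strncpy can be seen in a small user-space sketch. This is an approximation, not the kernel source: sketch_strlcpy and truncated_in_4 are hypothetical names, and returning the length of src follows the BSD convention, which is assumed here because the text above does not state the return value.

```c
#include <string.h>

/* Copy at most size - 1 bytes and always terminate (when size > 0).
 * No padding is written after the terminator. */
size_t sketch_strlcpy(char *dest, const char *src, size_t size)
{
    size_t src_len = strlen(src);

    if (size) {
        size_t len = (src_len >= size) ? size - 1 : src_len;

        memcpy(dest, src, len);
        dest[len] = '\0';
    }
    return src_len;     /* assumed BSD-style return value */
}

/* Report whether src would be cut short in a 4-byte buffer. */
int truncated_in_4(const char *src)
{
    char buf[4];

    return sketch_strlcpy(buf, src, sizeof(buf)) >= sizeof(buf);
}
```

Under that assumption, a return value greater than or equal to size tells the caller that the copy was truncated.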
Namestrcat --
Append one NUL-terminated string to another
SynopsisSynopsischar * strcat (dest, src);char * dest;const char * src;ArgumentsArgumentsdest
The string to be appended to
src
The string to append to it
Namestrncat --
Append a length-limited, NUL-terminated string to another
SynopsisSynopsischar * strncat (dest, src, count);char * dest;const char * src;size_t count;ArgumentsArgumentsdest
The string to be appended to
src
The string to append to it
count
The maximum number of bytes to copy
DescriptionDescription
Note that in contrast to strncpy, strncat ensures the result is
terminated.
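Because count limits the bytes taken from src rather than the size of dest, the caller has to work out the remaining room and reserve one byte for the terminator. The sketch below shows that calculation with the user-space strncat; append_bounded and appended_length are hypothetical helper names.

```c
#include <string.h>

/* Append src to the string in dest without writing past dest_size. */
void append_bounded(char *dest, size_t dest_size, const char *src)
{
    size_t used = strlen(dest);

    /* count bounds the bytes copied from src; strncat then adds the
     * terminating NUL itself, so one byte is kept in reserve. */
    if (used + 1 < dest_size)
        strncat(dest, src, dest_size - used - 1);
}

/* Start from "abc" in an 8-byte buffer, append src, return the length. */
size_t appended_length(const char *src)
{
    char buf[8] = "abc";

    append_bounded(buf, sizeof(buf), src);
    return strlen(buf);
}
```

An 8-byte buffer holding "abc" has room for four more characters plus the terminator.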
Namestrlcat --
Append a length-limited, NUL-terminated string to another
SynopsisSynopsissize_t strlcat (dest, src, count);char * dest;const char * src;size_t count;ArgumentsArgumentsdest
The string to be appended to
src
The string to append to it
count
The size of the destination buffer.
Namestrcmp --
Compare two strings
SynopsisSynopsisint strcmp (cs, ct);const char * cs;const char * ct;ArgumentsArgumentscs
One string
ct
Another string
Namestrncmp --
Compare two length-limited strings
SynopsisSynopsisint strncmp (cs, ct, count);const char * cs;const char * ct;size_t count;ArgumentsArgumentscs
One string
ct
Another string
count
The maximum number of bytes to compare
Namestrchr --
Find the first occurrence of a character in a string
SynopsisSynopsischar * strchr (s, c);const char * s;int c;ArgumentsArgumentss
The string to be searched
c
The character to search for
Namestrrchr --
Find the last occurrence of a character in a string
SynopsisSynopsischar * strrchr (s, c);const char * s;int c;ArgumentsArgumentss
The string to be searched
c
The character to search for
Namestrnchr --
Find a character in a length limited string
SynopsisSynopsischar * strnchr (s, count, c);const char * s;size_t count;int c;ArgumentsArgumentss
The string to be searched
count
The number of characters to be searched
c
The character to search for
Namestrlen --
Find the length of a string
SynopsisSynopsissize_t strlen (s);const char * s;ArgumentsArgumentss
The string to be sized
Namestrnlen --
Find the length of a length-limited string
SynopsisSynopsissize_t strnlen (s, count);const char * s;size_t count;ArgumentsArgumentss
The string to be sized
count
The maximum number of bytes to search
Name
strspn -- Calculate the length of the initial substring of s which
contains only characters in accept
Synopsis
size_t strspn (s, accept);
const char * s;
const char * accept;
Arguments
s
The string to be searched
accept
The characters to accept
Name
strcspn -- Calculate the length of the initial substring of s which does
not contain characters in reject
Synopsis
size_t strcspn (s, reject);
const char * s;
const char * reject;
Arguments
s
The string to be searched
reject
The string to avoid
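strspn and strcspn are complementary: one measures a leading run of accepted characters, the other a leading run free of rejected characters. The sketch below uses the user-space versions; the helper names leading_digits and key_length and the "key=value" format are hypothetical.

```c
#include <string.h>

/* Number of leading decimal digits in s. */
size_t leading_digits(const char *s)
{
    return strspn(s, "0123456789");
}

/* Length of the key in a "key=value" string; the whole string
 * length is returned when no '=' is present. */
size_t key_length(const char *s)
{
    return strcspn(s, "=");
}
```

Neither function returns a pointer, so the result can be used directly as an index into s.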
Namestrpbrk --
Find the first occurrence of a set of characters
SynopsisSynopsischar * strpbrk (cs, ct);const char * cs;const char * ct;ArgumentsArgumentscs
The string to be searched
ct
The characters to search for
Namestrsep --
Split a string into tokens
SynopsisSynopsischar * strsep (s, ct);char ** s;const char * ct;ArgumentsArgumentss
The string to be searched
ct
The characters to search for
DescriptionDescription
strsep updates s to point after the token, ready for the next call.
It returns empty tokens, too, behaving exactly like the libc function
of that name. In fact, it was stolen from glibc2 and de-fancy-fied.
Same semantics, slimmer shape. ;)
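The token-splitting behaviour, including empty tokens, can be sketched in user space with the standard strpbrk. This is an illustration of the semantics described above rather than the kernel source; sketch_strsep and count_fields are hypothetical names, and setting *s to NULL after the last token follows the libc behaviour the text refers to.

```c
#include <string.h>

/* Return the next token and advance *s past the delimiter. */
char *sketch_strsep(char **s, const char *ct)
{
    char *begin = *s;
    char *end;

    if (begin == NULL)
        return NULL;
    end = strpbrk(begin, ct);
    if (end)
        *end++ = '\0';  /* cut the token and step over the delimiter */
    *s = end;           /* NULL once the last token has been returned */
    return begin;
}

/* Count the fields in a comma-separated line, empty fields included. */
int count_fields(const char *line)
{
    char copy[64];
    char *cursor = copy;
    int fields = 0;

    /* Work on a copy because the tokenizer writes NULs into it. */
    strncpy(copy, line, sizeof(copy) - 1);
    copy[sizeof(copy) - 1] = '\0';
    while (sketch_strsep(&cursor, ",") != NULL)
        fields++;
    return fields;
}
```

Two adjacent delimiters yield an empty token between them, which is why "a,,b" counts as three fields.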
Namememset --
Fill a region of memory with the given value
SynopsisSynopsisvoid * memset (s, c, count);void * s;int c;size_t count;ArgumentsArgumentss
Pointer to the start of the area.
c
The byte to fill the area with
count
The size of the area.
Description
Do not use memset to access IO space; use memset_io instead.
Namememcpy --
Copy one area of memory to another
SynopsisSynopsisvoid * memcpy (dest, src, count);void * dest;const void * src;size_t count;ArgumentsArgumentsdest
Where to copy to
src
Where to copy from
count
The size of the area.
Description
You should not use this function to access IO space; use memcpy_toio
or memcpy_fromio instead.
Namememmove --
Copy one area of memory to another
SynopsisSynopsisvoid * memmove (dest, src, count);void * dest;const void * src;size_t count;ArgumentsArgumentsdest
Where to copy to
src
Where to copy from
count
The size of the area.
DescriptionDescription
Unlike memcpy, memmove copes with overlapping areas.
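A typical overlapping copy is shifting the tail of a buffer towards its start. The sketch below uses the user-space memmove; drop_prefix and first_after_drop are hypothetical helper names.

```c
#include <string.h>

/* Drop the first n bytes of a NUL-terminated string in place. Source
 * and destination overlap, which is the case memmove handles. */
void drop_prefix(char *s, size_t n)
{
    size_t len = strlen(s);

    if (n > len)
        n = len;
    memmove(s, s + n, len - n + 1);     /* +1 moves the trailing NUL */
}

/* Copy text into a scratch buffer, drop n bytes, return the new
 * first character. */
char first_after_drop(const char *text, size_t n)
{
    char buf[32];

    strncpy(buf, text, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    drop_prefix(buf, n);
    return buf[0];
}
```

Using memcpy for the same shift would not be safe, because its source and destination ranges must not overlap.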
Namememcmp --
Compare two areas of memory
SynopsisSynopsisint memcmp (cs, ct, count);const void * cs;const void * ct;size_t count;ArgumentsArgumentscs
One area of memory
ct
Another area of memory
count
The size of the area.
Namememscan --
Find a character in an area of memory.
SynopsisSynopsisvoid * memscan (addr, c, size);void * addr;int c;size_t size;ArgumentsArgumentsaddr
The memory area
c
The byte to search for
size
The size of the area.
Description
Returns the address of the first occurrence of c, or 1 byte past
the area if c is not found.
Namestrstr --
Find the first substring in a NUL terminated string
SynopsisSynopsischar * strstr (s1, s2);const char * s1;const char * s2;ArgumentsArgumentss1
The string to be searched
s2
The string to search for
Namememchr --
Find a character in an area of memory.
SynopsisSynopsisvoid * memchr (s, c, n);const void * s;int c;size_t n;ArgumentsArgumentss
The memory area
c
The byte to search for
n
The size of the area.
Description
Returns the address of the first occurrence of c, or NULL
if c is not found.
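memscan and memchr differ only in what they return when the byte is absent: memscan points one byte past the area, memchr returns NULL. memscan is not part of the user-space C library, so the sketch below re-implements it from the description above; sketch_memscan and scan_offset are hypothetical names.

```c
#include <string.h>
#include <stddef.h>

/* Search size bytes at addr for c; point past the area on failure. */
void *sketch_memscan(void *addr, int c, size_t size)
{
    unsigned char *p = addr;

    while (size) {
        if (*p == (unsigned char)c)
            return p;
        p++;
        size--;
    }
    return p;   /* one byte past the area: never NULL */
}

/* Offset of byte c within area, or size when it is absent. */
size_t scan_offset(const char *area, size_t size, int c)
{
    return (size_t)((char *)sketch_memscan((void *)area, c, size) - area);
}
```

A caller of memscan therefore compares the result with the end of the area, while a caller of memchr tests for NULL.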
Bit Operations
Name
set_bit -- Atomically set a bit in memory
Synopsis
void set_bit (nr, addr);
int nr;
volatile unsigned long * addr;
Arguments
nr
the bit to set
addr
the address to start counting from
Description
This function is atomic and, on x86, may not be reordered. See __set_bit
if you do not require the atomic guarantees.
Note
There are no guarantees that this function will not be reordered
on non-x86 architectures, so if you are writing portable code,
make sure not to rely on its reordering guarantees.
Note that nr may be almost arbitrarily large; this function is not
restricted to acting on a single-word quantity.
Name__set_bit --
Set a bit in memory
SynopsisSynopsisvoid __set_bit (nr, addr);int nr;volatile unsigned long * addr;ArgumentsArgumentsnr
the bit to set
addr
the address to start counting from
DescriptionDescription
Unlike set_bit, this function is non-atomic and may be reordered.
If it's called on the same region of memory simultaneously, the effect
may be that only one operation succeeds.
Nameclear_bit --
Clears a bit in memory
SynopsisSynopsisvoid clear_bit (nr, addr);int nr;volatile unsigned long * addr;ArgumentsArgumentsnr
Bit to clear
addr
Address to start counting from
DescriptionDescription
clear_bit is atomic and may not be reordered. However, it does
not contain a memory barrier, so if it is used for locking purposes,
you should call smp_mb__before_clear_bit and/or smp_mb__after_clear_bit
in order to ensure changes are visible on other processors.
Name__change_bit --
Toggle a bit in memory
SynopsisSynopsisvoid __change_bit (nr, addr);int nr;volatile unsigned long * addr;ArgumentsArgumentsnr
the bit to change
addr
the address to start counting from
DescriptionDescription
Unlike change_bit, this function is non-atomic and may be reordered.
If it's called on the same region of memory simultaneously, the effect
may be that only one operation succeeds.
Namechange_bit --
Toggle a bit in memory
SynopsisSynopsisvoid change_bit (nr, addr);int nr;volatile unsigned long * addr;ArgumentsArgumentsnr
Bit to change
addr
Address to start counting from
DescriptionDescription
change_bit is atomic. On x86 it may not be reordered; on other
architectures it may be reordered.
Note that nr may be almost arbitrarily large; this function is not
restricted to acting on a single-word quantity.
Nametest_and_set_bit --
Set a bit and return its old value
SynopsisSynopsisint test_and_set_bit (nr, addr);int nr;volatile unsigned long * addr;ArgumentsArgumentsnr
Bit to set
addr
Address to count from
DescriptionDescription
This operation is atomic. On x86 it cannot be reordered;
on other architectures it may be reordered.
It also implies a memory barrier.
Name__test_and_set_bit --
Set a bit and return its old value
SynopsisSynopsisint __test_and_set_bit (nr, addr);int nr;volatile unsigned long * addr;ArgumentsArgumentsnr
Bit to set
addr
Address to count from
DescriptionDescription
This operation is non-atomic and can be reordered.
If two instances of this operation race, one can appear to succeed
but actually fail. You must protect multiple accesses with a lock.
Nametest_and_clear_bit --
Clear a bit and return its old value
SynopsisSynopsisint test_and_clear_bit (nr, addr);int nr;volatile unsigned long * addr;ArgumentsArgumentsnr
Bit to clear
addr
Address to count from
DescriptionDescription
This operation is atomic. On x86 it cannot be reordered;
on other architectures it may be reordered.
It also implies a memory barrier.
Name__test_and_clear_bit --
Clear a bit and return its old value
SynopsisSynopsisint __test_and_clear_bit (nr, addr);int nr;volatile unsigned long * addr;ArgumentsArgumentsnr
Bit to clear
addr
Address to count from
DescriptionDescription
This operation is non-atomic and can be reordered.
If two instances of this operation race, one can appear to succeed
but actually fail. You must protect multiple accesses with a lock.
Nametest_and_change_bit --
Change a bit and return its old value
SynopsisSynopsisint test_and_change_bit (nr, addr);int nr;volatile unsigned long * addr;ArgumentsArgumentsnr
Bit to change
addr
Address to count from
DescriptionDescription
This operation is atomic and cannot be reordered.
It also implies a memory barrier.
Nametest_bit --
Determine whether a bit is set
SynopsisSynopsisint test_bit (nr, addr);int nr;const volatile void * addr;ArgumentsArgumentsnr
bit number to test
addr
Address to start counting from
Namefind_first_zero_bit --
find the first zero bit in a memory region
SynopsisSynopsisint find_first_zero_bit (addr, size);const unsigned long * addr;unsigned size;ArgumentsArgumentsaddr
The address to start the search at
size
The maximum size to search
DescriptionDescription
Returns the bit-number of the first zero bit, not the number of the byte
containing a bit.
Namefind_next_zero_bit --
find the next zero bit in a memory region
SynopsisSynopsisint find_next_zero_bit (addr, size, offset);const unsigned long * addr;int size;int offset;ArgumentsArgumentsaddr
The address to base the search on
size
The maximum size to search
offset
The bitnumber to start searching at
Name__ffs --
find first set bit in word.
SynopsisSynopsisunsigned long __ffs (word);unsigned long word;ArgumentsArgumentsword
The word to search
DescriptionDescription
Undefined if no bit is set, so code should check against 0 first.
Namefind_first_bit --
find the first set bit in a memory region
SynopsisSynopsisunsigned find_first_bit (addr, size);const unsigned long * addr;unsigned size;ArgumentsArgumentsaddr
The address to start the search at
size
The maximum size to search
DescriptionDescription
Returns the bit-number of the first set bit, not the number of the byte
containing a bit.
Namefind_next_bit --
find the next set bit in a memory region
SynopsisSynopsisint find_next_bit (addr, size, offset);const unsigned long * addr;int size;int offset;ArgumentsArgumentsaddr
The address to base the search on
size
The maximum size to search
offset
The bitnumber to start searching at
Nameffz --
find first zero in word.
SynopsisSynopsisunsigned long ffz (word);unsigned long word;ArgumentsArgumentsword
The word to search
DescriptionDescription
Undefined if no zero exists, so code should check against ~0UL first.
Nameffs --
find first bit set
SynopsisSynopsisint ffs (x);int x;ArgumentsArgumentsx
the word to search
DescriptionDescription
This is defined the same way as the libc and compiler builtin ffs
routines, and therefore differs in spirit from the above ffz
(see man ffs).
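The difference in spirit is the indexing: the libc-style ffs is 1-based and returns 0 for a zero argument, while __ffs and ffz are 0-based and undefined when no matching bit exists. A userspace sketch under those assumptions follows; all three function names are hypothetical.

```c
#include <assert.h>

/* 0-based index of the lowest set bit; word must be non-zero. */
static unsigned long lowest_set_model(unsigned long word)
{
	unsigned long n = 0;

	assert(word != 0);
	while (!(word & 1UL)) {
		word >>= 1;
		n++;
	}
	return n;
}

/* 0-based index of the lowest clear bit; word must not be all ones. */
static unsigned long lowest_zero_model(unsigned long word)
{
	return lowest_set_model(~word);
}

/* libc-style ffs: 1-based position of the lowest set bit, 0 if x is 0. */
static int ffs_model(int x)
{
	return x ? (int)lowest_set_model((unsigned int)x) + 1 : 0;
}
```

For the value 8 the 0-based helper yields 3 and the libc-style one yields 4, so the two conventions must not be mixed when computing shifts.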
Namefls --
find last bit set
SynopsisSynopsisint fls (x);int x;ArgumentsArgumentsx
the word to search
DescriptionDescription
This is defined the same way as ffs.
Namehweight32 --
returns the Hamming weight of an N-bit word
SynopsisSynopsis hweight32 (x); x;ArgumentsArgumentsx
the word to weigh
DescriptionDescription
The Hamming Weight of a number is the total number of bits set in it.
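As a worked illustration of the definition, the sketch below counts set bits by repeatedly clearing the lowest one. It is one simple way to compute the value and is not claimed to be how the kernel does it; hweight32_model is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: each iteration of w &= w - 1 clears exactly the
 * lowest set bit, so the loop runs once per set bit.
 */
static unsigned int hweight32_model(uint32_t w)
{
	unsigned int weight = 0;

	while (w) {
		w &= w - 1;
		weight++;
	}
	assert(weight <= 32);
	return weight;
}
```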
Memory Management in LinuxMemory Management in LinuxChapter 4. Memory Management in LinuxThe Slab CacheThe Slab CacheNamekmem_cache_create --
Create a cache.
SynopsisSynopsisstruct kmem_cache * kmem_cache_create (name, size, align, flags, ctor, dtor);const char * name;size_t size;size_t align;unsigned long flags;void (*ctor)
(void*, struct kmem_cache *, unsigned long);void (*dtor)
(void*, struct kmem_cache *, unsigned long);ArgumentsArgumentsname
A string which is used in /proc/slabinfo to identify this cache.
size
The size of objects to be created in this cache.
align
The required alignment for the objects.
flags
SLAB flags
ctor
A constructor for the objects.
dtor
A destructor for the objects.
DescriptionDescription
Returns a ptr to the cache on success, NULL on failure.
Cannot be called within an interrupt, but can be interrupted.
The ctor is run when new pages are allocated by the cache
and the dtor is run before the pages are handed back.
name must be valid until the cache is destroyed. This implies that
the module calling this has to destroy the cache before getting
unloaded.
The flags are
SLAB_POISON - Poison the slab with a known test pattern (a5a5a5a5)
to catch references to uninitialised memory.
SLAB_RED_ZONE - Insert `Red' zones around the allocated memory to check
for buffer overruns.
SLAB_NO_REAP - Don't automatically reap this cache when we're under
memory pressure.
SLAB_HWCACHE_ALIGN - Align the objects in this cache to a hardware
cacheline. This can be beneficial if you're counting cycles as closely
as davem.
Namekmem_cache_shrink --
Shrink a cache.
SynopsisSynopsisint kmem_cache_shrink (cachep);struct kmem_cache * cachep;ArgumentsArgumentscachep
The cache to shrink.
DescriptionDescription
Releases as many slabs as possible for a cache.
To help debugging, a zero exit status indicates all slabs were released.
Namekmem_cache_destroy --
delete a cache
SynopsisSynopsisint kmem_cache_destroy (cachep);struct kmem_cache * cachep;ArgumentsArgumentscachep
the cache to destroy
DescriptionDescription
Remove a struct kmem_cache object from the slab cache.
Returns 0 on success.
It is expected this function will be called by a module when it is
unloaded. This will remove the cache completely, and avoid a duplicate
cache being allocated each time a module is loaded and unloaded, if the
module doesn't have persistent in-kernel storage across loads and unloads.
The cache must be empty before calling this function.
The caller must guarantee that no one will allocate memory from the cache
during the kmem_cache_destroy.
Namekmem_cache_alloc --
Allocate an object
SynopsisSynopsisvoid * kmem_cache_alloc (cachep, flags);struct kmem_cache * cachep;gfp_t flags;ArgumentsArgumentscachep
The cache to allocate from.
flags
See kmalloc.
DescriptionDescription
Allocate an object from this cache. The flags are only relevant
if the cache has no available objects.
Namekmem_cache_alloc_node --
Allocate an object on the specified node
SynopsisSynopsisvoid * kmem_cache_alloc_node (cachep, flags, nodeid);struct kmem_cache * cachep;gfp_t flags;int nodeid;ArgumentsArgumentscachep
The cache to allocate from.
flags
See kmalloc.
nodeid
node number of the target node.
DescriptionDescription
Identical to kmem_cache_alloc, except that this function is slow
and can sleep. It allocates memory on the given node, which
can improve performance for cpu-bound structures.
New and improvedNew and improved
it will now make sure that the object gets
put on the correct node list so that there is no false sharing.
Name__alloc_percpu --
allocate one copy of the object for every present cpu in the system, zeroing them
SynopsisSynopsisvoid * __alloc_percpu (size);size_t size;ArgumentsArgumentssize
how many bytes of memory are required.
DescriptionDescription
Objects should be dereferenced using the per_cpu_ptr macro only.
Namekmem_cache_free --
Deallocate an object
SynopsisSynopsisvoid kmem_cache_free (cachep, objp);struct kmem_cache * cachep;void * objp;ArgumentsArgumentscachep
The cache the allocation was from.
objp
The previously allocated object.
DescriptionDescription
Free an object which was previously allocated from this
cache.
Namekfree --
free previously allocated memory
SynopsisSynopsisvoid kfree (objp);const void * objp;ArgumentsArgumentsobjp
pointer returned by kmalloc.
DescriptionDescription
If objp is NULL, no operation is performed.
Don't free memory not originally allocated by kmalloc
or you will run into trouble.
Namefree_percpu --
free previously allocated percpu memory
SynopsisSynopsisvoid free_percpu (objp);const void * objp;ArgumentsArgumentsobjp
pointer returned by alloc_percpu.
DescriptionDescription
Don't free memory not originally allocated by alloc_percpu.
The complemented objp is used to check for that.
User Space Memory AccessUser Space Memory AccessNameaccess_ok --
Checks if a user space pointer is valid
SynopsisSynopsis access_ok (type, addr, size); type; addr; size;ArgumentsArgumentstype
Type of access: VERIFY_READ or VERIFY_WRITE. Note that
VERIFY_WRITE is a superset of VERIFY_READ - if it is safe
to write to a block, it is always safe to read from it.
addr
User space pointer to start of block to check
size
Size of block to check
ContextContext
User context only. This function may sleep.
DescriptionDescription
Checks if a pointer to a block of memory in user space is valid.
Returns true (nonzero) if the memory block may be valid, false (zero)
if it is definitely invalid.
Note that, depending on architecture, this function probably just
checks that the pointer is in the user space range - after calling
this function, memory access functions may still return -EFAULT.
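A pure range check of the kind described above can be sketched as follows. This is a hypothetical model, not the real macro: range_ok_model and user_limit are assumed names, and the actual boundary is architecture-specific.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical range check in the spirit of access_ok: is the block
 * [addr, addr + size) entirely below user_limit? The comparison is
 * arranged so that addr + size is never computed and cannot wrap.
 */
static int range_ok_model(uintptr_t addr, uintptr_t size, uintptr_t user_limit)
{
	assert(user_limit > 0);
	return size <= user_limit && addr <= user_limit - size;
}
```

Passing such a check says nothing about whether the pages are actually mapped, which is why a later access can still fail with -EFAULT.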
Nameget_user --
Get a simple variable from user space.
SynopsisSynopsis get_user (x, ptr); x; ptr;ArgumentsArgumentsx
Variable to store result.
ptr
Source address, in user space.
ContextContext
User context only. This function may sleep.
DescriptionDescription
This macro copies a single simple variable from user space to kernel
space. It supports simple types like char and int, but not larger
data types like structures or arrays.
ptr must have pointer-to-simple-variable type, and the result of
dereferencing ptr must be assignable to x without a cast.
Returns zero on success, or -EFAULT on error.
On error, the variable x is set to zero.
Nameput_user --
Write a simple value into user space.
SynopsisSynopsis put_user (x, ptr); x; ptr;ArgumentsArgumentsx
Value to copy to user space.
ptr
Destination address, in user space.
ContextContext
User context only. This function may sleep.
DescriptionDescription
This macro copies a single simple value from kernel space to user
space. It supports simple types like char and int, but not larger
data types like structures or arrays.
ptr must have pointer-to-simple-variable type, and x must be assignable
to the result of dereferencing ptr.
Returns zero on success, or -EFAULT on error.
Name__get_user --
Get a simple variable from user space, with less checking.
SynopsisSynopsis __get_user (x, ptr); x; ptr;ArgumentsArgumentsx
Variable to store result.
ptr
Source address, in user space.
ContextContext
User context only. This function may sleep.
DescriptionDescription
This macro copies a single simple variable from user space to kernel
space. It supports simple types like char and int, but not larger
data types like structures or arrays.
ptr must have pointer-to-simple-variable type, and the result of
dereferencing ptr must be assignable to x without a cast.
Caller must check the pointer with access_ok before calling this
function.
Returns zero on success, or -EFAULT on error.
On error, the variable x is set to zero.
Name__put_user --
Write a simple value into user space, with less checking.
SynopsisSynopsis __put_user (x, ptr); x; ptr;ArgumentsArgumentsx
Value to copy to user space.
ptr
Destination address, in user space.
ContextContext
User context only. This function may sleep.
DescriptionDescription
This macro copies a single simple value from kernel space to user
space. It supports simple types like char and int, but not larger
data types like structures or arrays.
ptr must have pointer-to-simple-variable type, and x must be assignable
to the result of dereferencing ptr.
Caller must check the pointer with access_ok before calling this
function.
Returns zero on success, or -EFAULT on error.
Name__copy_to_user_inatomic --
Copy a block of data into user space, with less checking.
SynopsisSynopsis__always_inline unsigned long __must_check __copy_to_user_inatomic (to, from, n);void __user * to;const void * from;unsigned long n;ArgumentsArgumentsto
Destination address, in user space.
from
Source address, in kernel space.
n
Number of bytes to copy.
ContextContext
User context only. This function may sleep.
DescriptionDescription
Copy data from kernel space to user space. Caller must check
the specified block with access_ok before calling this function.
Returns number of bytes that could not be copied.
On success, this will be zero.
Name__copy_from_user_inatomic --
Copy a block of data from user space, with less checking.
SynopsisSynopsis__always_inline unsigned long __copy_from_user_inatomic (to, from, n);void * to;const void __user * from;unsigned long n;ArgumentsArgumentsto
Destination address, in kernel space.
from
Source address, in user space.
n
Number of bytes to copy.
ContextContext
User context only. This function may sleep.
DescriptionDescription
Copy data from user space to kernel space. Caller must check
the specified block with access_ok before calling this function.
Returns number of bytes that could not be copied.
On success, this will be zero.
If some data could not be copied, this function will pad the copied
data to the requested size using zero bytes.
Namestrlen_user --
Get the size of a string in user space.
SynopsisSynopsis strlen_user (str); str;ArgumentsArgumentsstr
The string to measure.
ContextContext
User context only. This function may sleep.
DescriptionDescription
Get the size of a NUL-terminated string in user space.
Returns the size of the string INCLUDING the terminating NUL.
On exception, returns 0.
If there is a limit on the length of a valid string, you may wish to
consider using strnlen_user instead.
Name__strncpy_from_user --
Copy a NUL terminated string from userspace, with less checking.
SynopsisSynopsislong __strncpy_from_user (dst, src, count);char * dst;const char __user * src;long count;ArgumentsArgumentsdst
Destination address, in kernel space. This buffer must be at
least count bytes long.
src
Source address, in user space.
count
Maximum number of bytes to copy, including the trailing NUL.
DescriptionDescription
Copies a NUL-terminated string from userspace to kernel space.
Caller must check the specified block with access_ok before calling
this function.
On success, returns the length of the string (not including the trailing
NUL).
If access to userspace fails, returns -EFAULT (some data may have been
copied).
If count is smaller than the length of the string, copies count bytes
and returns count.
Namestrncpy_from_user --
Copy a NUL terminated string from userspace.
SynopsisSynopsislong strncpy_from_user (dst, src, count);char * dst;const char __user * src;long count;ArgumentsArgumentsdst
Destination address, in kernel space. This buffer must be at
least count bytes long.
src
Source address, in user space.
count
Maximum number of bytes to copy, including the trailing NUL.
DescriptionDescription
Copies a NUL-terminated string from userspace to kernel space.
On success, returns the length of the string (not including the trailing
NUL).
If access to userspace fails, returns -EFAULT (some data may have been
copied).
If count is smaller than the length of the string, copies count bytes
and returns count.
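The two return cases are easy to confuse, so here is a userspace sketch of them on ordinary memory. strncpy_model is a hypothetical name, and the -EFAULT path is left out because plain memory cannot fault in this model.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of the return convention described above:
 * - string fits: copy it with its NUL, return its length without the NUL
 * - string too long: copy exactly count bytes and return count
 */
static long strncpy_model(char *dst, const char *src, long count)
{
	long i;

	assert(dst != NULL && src != NULL);
	for (i = 0; i < count; i++) {
		dst[i] = src[i];
		if (src[i] == '\0')
			return i;	/* length, excluding the NUL */
	}
	return count;	/* truncated: dst is NOT NUL-terminated */
}
```

A return value equal to count therefore signals truncation, and the destination must not be treated as a C string in that case.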
Nameclear_user --
Zero a block of memory in user space.
SynopsisSynopsisunsigned long clear_user (to, n);void __user * to;unsigned long n;ArgumentsArgumentsto
Destination address, in user space.
n
Number of bytes to zero.
DescriptionDescription
Zero a block of memory in user space.
Returns number of bytes that could not be cleared.
On success, this will be zero.
Name__clear_user --
Zero a block of memory in user space, with less checking.
SynopsisSynopsisunsigned long __clear_user (to, n);void __user * to;unsigned long n;ArgumentsArgumentsto
Destination address, in user space.
n
Number of bytes to zero.
DescriptionDescription
Zero a block of memory in user space. Caller must check
the specified block with access_ok before calling this function.
Returns number of bytes that could not be cleared.
On success, this will be zero.
Namestrnlen_user --
Get the size of a string in user space.
SynopsisSynopsislong strnlen_user (s, n);const char __user * s;long n;ArgumentsArgumentss
The string to measure.
n
The maximum valid length
DescriptionDescription
Get the size of a NUL-terminated string in user space.
Returns the size of the string INCLUDING the terminating NUL.
On exception, returns 0.
If the string is too long, returns a value greater than n.
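A small userspace model of the size convention follows. strnlen_model is a hypothetical name; returning n + 1 for an over-long string is an assumption of the sketch, since the text above only promises a value greater than n, and the exception case (return 0) cannot occur on ordinary memory.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: size of the string INCLUDING its NUL, or a
 * value greater than n if no NUL is found within the first n bytes.
 */
static long strnlen_model(const char *s, long n)
{
	long i;

	assert(s != NULL);
	for (i = 0; i < n; i++) {
		if (s[i] == '\0')
			return i + 1;	/* size including the NUL */
	}
	return n + 1;	/* too long */
}
```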
Namecopy_to_user --
Copy a block of data into user space.
SynopsisSynopsisunsigned long copy_to_user (to, from, n);void __user * to;const void * from;unsigned long n;ArgumentsArgumentsto
Destination address, in user space.
from
Source address, in kernel space.
n
Number of bytes to copy.
ContextContext
User context only. This function may sleep.
DescriptionDescription
Copy data from kernel space to user space.
Returns number of bytes that could not be copied.
On success, this will be zero.
Namecopy_from_user --
Copy a block of data from user space.
SynopsisSynopsisunsigned long copy_from_user (to, from, n);void * to;const void __user * from;unsigned long n;ArgumentsArgumentsto
Destination address, in kernel space.
from
Source address, in user space.
n
Number of bytes to copy.
ContextContext
User context only. This function may sleep.
DescriptionDescription
Copy data from user space to kernel space.
Returns number of bytes that could not be copied.
On success, this will be zero.
If some data could not be copied, this function will pad the copied
data to the requested size using zero bytes.
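The return value counts failure, not success, which is the opposite of many copy routines. The sketch below models a partial copy in userspace; copy_from_model and its readable parameter are hypothetical, with readable standing in for the point at which the user page would fault.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model: only the first `readable` bytes of `from` can
 * be read. Those bytes are copied, the rest of the destination is
 * zero-padded, and the number of bytes NOT copied is returned.
 */
static unsigned long copy_from_model(void *to, const void *from,
				     unsigned long n, unsigned long readable)
{
	unsigned long ok = n < readable ? n : readable;

	assert(to != NULL && from != NULL);
	memcpy(to, from, ok);
	memset((char *)to + ok, 0, n - ok);	/* zero-pad the tail */
	return n - ok;				/* 0 means full success */
}
```

Callers of the real function commonly treat any non-zero return as an error and return -EFAULT.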
More Memory Management FunctionsMore Memory Management FunctionsNamepage_dup_rmap --
duplicate pte mapping to a page
SynopsisSynopsisvoid page_dup_rmap (page);struct page * page;ArgumentsArgumentspage
the page to add the mapping to
For copy_page_range onlyFor copy_page_range only
minimal extract from page_add_rmap,
avoiding unnecessary tests (already checked) so it's quicker.
Nameread_cache_pages --
populate an address space with some pages, and
SynopsisSynopsisint read_cache_pages (mapping, pages, filler, data);struct address_space * mapping;struct list_head * pages;int (*filler)
(void *, struct page *);void * data;ArgumentsArgumentsmapping
the address_space
pages
The address of a list_head which contains the target pages. These
pages have their ->index populated and are otherwise uninitialised.
filler
callback routine for filling a single page.
data
private data for the callback routine.
DescriptionDescription
Hides the details of the LRU cache etc from the filesystems.
Namefilemap_fdatawait --
walk the list of under-writeback pages of the given address space and wait for all of them
SynopsisSynopsisint filemap_fdatawait (mapping);struct address_space * mapping;ArgumentsArgumentsmapping
address space structure to wait for
Nameunlock_page --
unlock a locked page
SynopsisSynopsisvoid fastcall unlock_page (page);struct page * page;ArgumentsArgumentspage
the page
DescriptionDescription
Unlocks the page and wakes up sleepers in ___wait_on_page_locked.
Also wakes sleepers in wait_on_page_writeback because the wakeup
mechanism between PageLocked pages and PageWriteback pages is shared.
But that's OK - sleepers in wait_on_page_writeback just go back to sleep.
The first mb is necessary to safely close the critical section opened by the
TestSetPageLocked, the second mb is necessary to enforce ordering between
the clear_bit and the read of the waitqueue (to avoid SMP races with a
parallel wait_on_page_locked).
Namefind_lock_page --
locate, pin and lock a pagecache page
SynopsisSynopsisstruct page * find_lock_page (mapping, offset);struct address_space * mapping;unsigned long offset;ArgumentsArgumentsmapping
the address_space to search
offset
the page index
DescriptionDescription
Locates the desired pagecache page, locks it, increments its reference
count and returns its address.
Returns zero if the page was not present. find_lock_page may sleep.
Namefind_or_create_page --
locate or add a pagecache page
SynopsisSynopsisstruct page * find_or_create_page (mapping, index, gfp_mask);struct address_space * mapping;unsigned long index;gfp_t gfp_mask;ArgumentsArgumentsmapping
the page's address_space
index
the page's index into the mapping
gfp_mask
page allocation mode
DescriptionDescription
Locates a page in the pagecache. If the page is not present, a new page
is allocated using gfp_mask and is added to the pagecache and to the VM's
LRU list. The returned page is locked and has its reference count
incremented.
find_or_create_page may sleep, even if gfp_mask specifies an atomic
allocation!
find_or_create_page returns the desired page's address, or zero on
memory exhaustion.
Nameunmap_mapping_range --
unmap the portion of all mmaps
SynopsisSynopsisvoid unmap_mapping_range (mapping, holebegin, holelen, even_cows);struct address_space * mapping;loff_t const holebegin;loff_t const holelen;int even_cows;ArgumentsArgumentsmapping
the address space containing mmaps to be unmapped.
holebegin
byte in first page to unmap, relative to the start of
the underlying file. This will be rounded down to a PAGE_SIZE
boundary. Note that this is different from vmtruncate, which
must keep the partial page. In contrast, we must get rid of
partial pages.
holelen
size of prospective hole in bytes. This will be rounded
up to a PAGE_SIZE boundary. A holelen of zero truncates to the
end of the file.
even_cows
1 when truncating a file, unmap even private COWed pages;
but 0 when invalidating pagecache, don't throw away private data.
DescriptionDescription
Unmaps the portion of all mmaps in the specified address_space
corresponding to the specified page range in the underlying file.
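The rounding rules for holebegin and holelen can be checked with a little arithmetic. The sketch below assumes 4096-byte pages; the macro and function names are hypothetical and only model the rounding, not the unmapping itself.

```c
#include <assert.h>

#define PAGE_SHIFT_MODEL 12	/* assumption: 4096-byte pages */
#define PAGE_SIZE_MODEL  (1ULL << PAGE_SHIFT_MODEL)

/* Index of the first page to unmap: holebegin rounded DOWN to a page. */
static unsigned long long hole_first_page(unsigned long long holebegin)
{
	return holebegin >> PAGE_SHIFT_MODEL;
}

/* Number of pages to unmap: holelen rounded UP to whole pages. */
static unsigned long long hole_page_count(unsigned long long holelen)
{
	return (holelen + PAGE_SIZE_MODEL - 1) >> PAGE_SHIFT_MODEL;
}
```

With these assumptions a hole starting at byte 5000 begins in page 1, and a length of 4097 bytes covers two pages. A length of zero yields a count of zero here; per the argument description it is the special case meaning "to the end of the file".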
Namevfree --
release memory allocated by vmalloc
SynopsisSynopsisvoid vfree (addr);void * addr;ArgumentsArgumentsaddr
memory base address
DescriptionDescription
Free the virtually contiguous memory area starting at addr, as
obtained from vmalloc, vmalloc_32 or __vmalloc. If addr is
NULL, no operation is performed.
Must not be called in interrupt context.
Namevunmap --
release virtual mapping obtained by vmap
SynopsisSynopsisvoid vunmap (addr);void * addr;ArgumentsArgumentsaddr
memory base address
DescriptionDescription
Free the virtually contiguous memory area starting at addr,
which was created from the page array passed to vmap.
Must not be called in interrupt context.
Namevmap --
map an array of pages into virtually contiguous space
SynopsisSynopsisvoid * vmap (pages, count, flags, prot);struct page ** pages;unsigned int count;unsigned long flags;pgprot_t prot;ArgumentsArgumentspages
array of page pointers
count
number of pages to map
flags
vm_area->flags
prot
page protection for the mapping
DescriptionDescription
Maps count pages from pages into contiguous kernel virtual
space.
Name__vmalloc_node --
allocate virtually contiguous memory
SynopsisSynopsisvoid * __vmalloc_node (size, gfp_mask, prot, node);unsigned long size;gfp_t gfp_mask;pgprot_t prot;int node;ArgumentsArgumentssize
allocation size
gfp_mask
flags for the page level allocator
prot
protection mask for the allocated pages
node
node to use for allocation or -1
DescriptionDescription
Allocate enough pages to cover size from the page level
allocator with gfp_mask flags. Map them into contiguous
kernel virtual space, using a pagetable protection of prot.
Namevmalloc --
allocate virtually contiguous memory
SynopsisSynopsisvoid * vmalloc (size);unsigned long size;ArgumentsArgumentssize
allocation size
DescriptionDescription
Allocate enough pages to cover size from the page level
allocator and map them into contiguous kernel virtual space.
For tight control over the page level allocator and protection flags
use __vmalloc instead.
Namevmalloc_node --
allocate memory on a specific node
SynopsisSynopsisvoid * vmalloc_node (size, node);unsigned long size;int node;ArgumentsArgumentssize
allocation size
node
numa node
DescriptionDescription
Allocate enough pages to cover size from the page level
allocator and map them into contiguous kernel virtual space.
For tight control over the page level allocator and protection flags
use __vmalloc instead.
Namevmalloc_32 --
allocate virtually contiguous memory (32bit addressable)
SynopsisSynopsisvoid * vmalloc_32 (size);unsigned long size;ArgumentsArgumentssize
allocation size
DescriptionDescription
Allocate enough 32bit PA addressable pages to cover size from the
page level allocator and map them into contiguous kernel virtual space.
Namemempool_create --
create a memory pool
SynopsisSynopsismempool_t * mempool_create (min_nr, alloc_fn, free_fn, pool_data);int min_nr;mempool_alloc_t * alloc_fn;mempool_free_t * free_fn;void * pool_data;ArgumentsArgumentsmin_nr
the minimum number of elements guaranteed to be
allocated for this pool.
alloc_fn
user-defined element-allocation function.
free_fn
user-defined element-freeing function.
pool_data
optional private data available to the user-defined functions.
DescriptionDescription
This function creates and allocates a guaranteed-size, preallocated
memory pool. The pool can be used from the mempool_alloc and mempool_free
functions. This function might sleep. Both the alloc_fn and the free_fn
functions might sleep, as long as the mempool_alloc function is not called
from IRQ contexts.
Namemempool_resize --
resize an existing memory pool
SynopsisSynopsisint mempool_resize (pool, new_min_nr, gfp_mask);mempool_t * pool;int new_min_nr;gfp_t gfp_mask;ArgumentsArgumentspool
pointer to the memory pool which was allocated via
mempool_create.
new_min_nr
the new minimum number of elements guaranteed to be
allocated for this pool.
gfp_mask
the usual allocation bitmask.
DescriptionDescription
This function shrinks/grows the pool. In the case of growing,
it cannot be guaranteed that the pool will be grown to the new
size immediately, but new mempool_free calls will refill it.
Note that the caller must guarantee that no mempool_destroy is called
while this function is running. mempool_alloc and mempool_free
might be called (e.g. from IRQ contexts) while this function executes.
Namemempool_destroy --
deallocate a memory pool
SynopsisSynopsisvoid mempool_destroy (pool);mempool_t * pool;ArgumentsArgumentspool
pointer to the memory pool which was allocated via
mempool_create.
DescriptionDescription
This function only sleeps if the free_fn function sleeps. The caller
has to guarantee that all elements have been returned to the pool (i.e.
freed) prior to calling mempool_destroy.
Namemempool_alloc --
allocate an element from a specific memory pool
SynopsisSynopsisvoid * mempool_alloc (pool, gfp_mask);mempool_t * pool;gfp_t gfp_mask;ArgumentsArgumentspool
pointer to the memory pool which was allocated via
mempool_create.
gfp_mask
the usual allocation bitmask.
DescriptionDescription
This function only sleeps if the alloc_fn function sleeps or
returns NULL. Note that due to preallocation, this function
*never* fails when called from process contexts. (It might
fail if called from an IRQ context.)
Namemempool_free --
return an element to the pool.
SynopsisSynopsisvoid mempool_free (element, pool);void * element;mempool_t * pool;ArgumentsArgumentselement
pool element pointer.
pool
pointer to the memory pool which was allocated via
mempool_create.
DescriptionDescription
This function only sleeps if the free_fn function sleeps.
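The reserve behaviour described for mempool_create, mempool_alloc and mempool_free can be pictured with a small userspace model. This is only a sketch under stated assumptions: every name below (toy_pool, toy_pool_alloc, backend_alloc and so on) is hypothetical, the model is single-threaded, and where the kernel would sleep and retry it simply returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the alloc_fn/free_fn callbacks. */
typedef void *(*pool_alloc_fn)(void *pool_data);
typedef void (*pool_free_fn)(void *element, void *pool_data);

struct toy_pool {
    int min_nr;          /* guaranteed number of reserved elements */
    int curr_nr;         /* elements currently held in the reserve */
    void **elements;     /* the reserve itself */
    pool_alloc_fn alloc_fn;
    pool_free_fn free_fn;
    void *pool_data;
};

/* Example backend: allocation can be forced to fail through pool_data. */
struct toy_backend { int fail; size_t size; };

void *backend_alloc(void *pool_data)
{
    struct toy_backend *b = pool_data;
    return b->fail ? NULL : malloc(b->size);
}

void backend_free(void *element, void *pool_data)
{
    (void)pool_data;
    free(element);
}

/* All elements must have been returned before the pool is destroyed. */
void toy_pool_destroy(struct toy_pool *pool)
{
    while (pool->curr_nr > 0)
        pool->free_fn(pool->elements[--pool->curr_nr], pool->pool_data);
    free(pool->elements);
    free(pool);
}

/* Preallocate min_nr elements so that later allocations can fall back. */
struct toy_pool *toy_pool_create(int min_nr, pool_alloc_fn alloc_fn,
                                 pool_free_fn free_fn, void *pool_data)
{
    struct toy_pool *pool;

    if (min_nr <= 0)
        return NULL;
    pool = calloc(1, sizeof(*pool));
    if (!pool)
        return NULL;
    pool->elements = calloc((size_t)min_nr, sizeof(void *));
    if (!pool->elements) {
        free(pool);
        return NULL;
    }
    pool->min_nr = min_nr;
    pool->alloc_fn = alloc_fn;
    pool->free_fn = free_fn;
    pool->pool_data = pool_data;
    while (pool->curr_nr < min_nr) {
        void *element = alloc_fn(pool_data);
        if (!element) {
            toy_pool_destroy(pool);
            return NULL;
        }
        pool->elements[pool->curr_nr++] = element;
    }
    return pool;
}

/* Try the backend first; dip into the reserve only when it fails. */
void *toy_pool_alloc(struct toy_pool *pool)
{
    void *element = pool->alloc_fn(pool->pool_data);

    if (element)
        return element;
    if (pool->curr_nr > 0)
        return pool->elements[--pool->curr_nr];
    return NULL; /* the kernel would sleep and retry in process context */
}

/* Refill the reserve before handing memory back to the backend. */
void toy_pool_free(void *element, struct toy_pool *pool)
{
    if (pool->curr_nr < pool->min_nr)
        pool->elements[pool->curr_nr++] = element;
    else
        pool->free_fn(element, pool->pool_data);
}
```

In this model a freed element goes back into the reserve whenever the reserve is below min_nr, which mirrors the statement that new mempool_free calls refill a pool that could not be grown immediately.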
Namebalance_dirty_pages_ratelimited --
balance dirty memory state
SynopsisSynopsisvoid balance_dirty_pages_ratelimited (mapping);struct address_space * mapping;ArgumentsArgumentsmapping
address_space which was dirtied
DescriptionDescription
Processes which are dirtying memory should call in here once for each page
which was newly dirtied. The function will periodically check the system's
dirty state and will initiate writeback if needed.
On really big machines, get_writeback_state is expensive, so try to avoid
calling it too often (ratelimiting). But once we're over the dirty memory
limit we decrease the ratelimiting by a lot, to prevent individual processes
from overshooting the limit by (ratelimit_pages) each.
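The ratelimiting idea can be sketched in userspace as a per-caller counter that triggers the expensive check only every so many dirtied pages, with a much shorter interval once the limit has been exceeded. The structure, the function names and both interval values below are assumptions made for illustration; they are not taken from the kernel source.

```c
#include <assert.h>

#define TOY_RATELIMIT_PAGES 32UL  /* assumed interval below the limit */
#define TOY_EXCEEDED_PAGES   8UL  /* assumed interval above the limit */

struct toy_throttle {
    unsigned long since_check;  /* pages dirtied since the last check */
    unsigned long checks;       /* how many expensive checks have run */
    int dirty_exceeded;         /* cheap flag, updated by the check */
    unsigned long dirty_total;  /* stand-in for the global dirty state */
    unsigned long dirty_limit;
};

/* Stand-in for the expensive inspection of the system's dirty state. */
void toy_expensive_check(struct toy_throttle *t)
{
    t->checks++;
    t->dirty_exceeded = t->dirty_total > t->dirty_limit;
    /* the kernel would initiate writeback here if needed */
}

/* Call once for each newly dirtied page. */
void toy_page_dirtied(struct toy_throttle *t)
{
    unsigned long interval = t->dirty_exceeded ? TOY_EXCEEDED_PAGES
                                               : TOY_RATELIMIT_PAGES;

    t->dirty_total++;
    if (++t->since_check >= interval) {
        t->since_check = 0;
        toy_expensive_check(t);
    }
}
```

The cheap flag is what lets the caller shorten the interval without consulting the expensive state on every call.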
Namewrite_one_page --
write out a single page and optionally wait on I/O
SynopsisSynopsisint write_one_page (page, wait);struct page * page;int wait;ArgumentsArgumentspage
the page to write
wait
if true, wait on writeout
DescriptionDescription
The page must be locked by the caller and will be unlocked upon return.
write_one_page returns a negative error code if I/O failed.
Nametruncate_inode_pages_range --
truncate range of pages specified by start and end offsets
SynopsisSynopsisvoid truncate_inode_pages_range (mapping, lstart, lend);struct address_space * mapping;loff_t lstart;loff_t lend;ArgumentsArgumentsmapping
mapping to truncate
lstart
offset from which to truncate
lend
offset to which to truncate
DescriptionDescription
Truncate the page cache, removing the pages that are between the
specified offsets (and zeroing out the partial page if lstart is not
page aligned).
Truncate takes two passes - the first pass is nonblocking. It will not
block on page locks and it will not block on writeback. The second pass
will wait. This is to prevent as much IO as possible in the affected region.
The first pass will remove most pages, so the search cost of the second pass
is low.
When looking at page->index outside the page lock we need to be careful to
copy it into a local to avoid races (it could change at any time).
We pass down the cache-hot hint to the page freeing code. Even if the
mapping is large, it is probably the case that the final pages are the most
recently touched, and freeing happens in ascending file offset order.
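The handling of an lstart that is not page aligned comes down to two small computations: the index of the first page that can be removed whole, and the byte offset within the boundary page from which zeroing starts. The following userspace sketch assumes a 4096-byte page; the TOY_ constants, struct trunc_plan and plan_truncate are hypothetical names.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12                     /* assumed 4096-byte pages */
#define TOY_PAGE_SIZE  (1UL << TOY_PAGE_SHIFT)

struct trunc_plan {
    unsigned long first_full_page; /* first page index removed entirely */
    unsigned long partial_offset;  /* bytes kept in the boundary page,
                                      0 when lstart is page aligned */
};

struct trunc_plan plan_truncate(unsigned long long lstart)
{
    struct trunc_plan plan;

    /* Round up: a partially kept page must not be removed. */
    plan.first_full_page =
        (unsigned long)((lstart + TOY_PAGE_SIZE - 1) >> TOY_PAGE_SHIFT);
    /* Offset of lstart inside its page. */
    plan.partial_offset = (unsigned long)(lstart & (TOY_PAGE_SIZE - 1));
    return plan;
}
```

When partial_offset is nonzero, the page at index first_full_page - 1 stays in the cache and is zeroed from partial_offset to its end.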
Nametruncate_inode_pages --
truncate *all* the pages from an offset
SynopsisSynopsisvoid truncate_inode_pages (mapping, lstart);struct address_space * mapping;loff_t lstart;ArgumentsArgumentsmapping
mapping to truncate
lstart
offset from which to truncate
DescriptionDescription
Called under (and serialised by) inode->i_mutex.
Nameinvalidate_inode_pages2_range --
remove range of pages from an address_space
SynopsisSynopsisint invalidate_inode_pages2_range (mapping, start, end);struct address_space * mapping;pgoff_t start;pgoff_t end;ArgumentsArgumentsmapping
the address_space
start
the page offset 'from' which to invalidate
end
the page offset 'to' which to invalidate (inclusive)
DescriptionDescription
Any pages which are found to be mapped into pagetables are unmapped prior to
invalidation.
Returns -EIO if any pages could not be invalidated.
Nameinvalidate_inode_pages2 --
remove all pages from an address_space
SynopsisSynopsisint invalidate_inode_pages2 (mapping);struct address_space * mapping;ArgumentsArgumentsmapping
the address_space
DescriptionDescription
Any pages which are found to be mapped into pagetables are unmapped prior to
invalidation.
Returns -EIO if any pages could not be invalidated.
Kernel IPC facilitiesKernel IPC facilitiesChapter 5. Kernel IPC facilitiesIPC utilitiesIPC utilitiesNameipc_init --
initialise IPC subsystem
SynopsisSynopsisint __init ipc_init (void); void;ArgumentsArgumentsvoid
no arguments
DescriptionDescription
The various system5 IPC resources (semaphores, messages and shared
memory) are initialised.
Nameipc_init_ids --
initialise IPC identifiers
SynopsisSynopsisvoid __init ipc_init_ids (ids, size);struct ipc_ids * ids;int size;ArgumentsArgumentsids
Identifier set
size
Number of identifiers
DescriptionDescription
Given a size for the ipc identifier range (limited below IPCMNI)
set up the sequence range to use then allocate and initialise the
array itself.
Nameipc_init_proc_interface --
Create a proc interface for sysipc types
SynopsisSynopsisvoid __init ipc_init_proc_interface (path, header, ids, show);const char * path;const char * header;struct ipc_ids * ids;int (*show)
(struct seq_file *, void *);ArgumentsArgumentspath
Path in procfs
header
Banner to be printed at the beginning of the file.
ids
ipc id table to iterate.
show
show routine.
DescriptionDescription
The proc interface is implemented using a seq_file interface.
Nameipc_findkey --
find a key in an ipc identifier set
SynopsisSynopsisint ipc_findkey (ids, key);struct ipc_ids * ids;key_t key;ArgumentsArgumentsids
Identifier set
key
The key to find
DescriptionDescription
Requires ipc_ids.sem locked.
Returns the identifier if found or -1 if not.
Nameipc_addid --
add an IPC identifier
SynopsisSynopsisint ipc_addid (ids, new, size);struct ipc_ids * ids;struct kern_ipc_perm * new;int size;ArgumentsArgumentsids
IPC identifier set
new
new IPC permission set
size
new size limit for the id array
DescriptionDescription
Add an entry 'new' to the IPC arrays. The permissions object is
initialised and the first free entry is set up and the id assigned
is returned. The list is returned in a locked state on success.
On failure the list is not locked and -1 is returned.
Called with ipc_ids.sem held.
Nameipc_rmid --
remove an IPC identifier
SynopsisSynopsisstruct kern_ipc_perm* ipc_rmid (ids, id);struct ipc_ids * ids;int id;ArgumentsArgumentsids
identifier set
id
Identifier to remove
DescriptionDescription
The identifier must be valid, and in use. The kernel will panic if
fed an invalid identifier. The entry is removed and internal
variables recomputed. The object associated with the identifier
is returned.
ipc_ids.sem and the spinlock for this ID are held before this function
is called, and remain locked on exit.
Nameipc_alloc --
allocate ipc space
SynopsisSynopsisvoid* ipc_alloc (size);int size;ArgumentsArgumentssize
size desired
DescriptionDescription
Allocate memory from the appropriate pools and return a pointer to it.
NULL is returned if the allocation fails.
Nameipc_free --
free ipc space
SynopsisSynopsisvoid ipc_free (ptr, size);void * ptr;int size;ArgumentsArgumentsptr
pointer returned by ipc_alloc
size
size of block
DescriptionDescription
Free a block created with ipc_alloc. The caller must know the size
used in the allocation call.
Nameipc_rcu_alloc --
allocate ipc and rcu space
SynopsisSynopsisvoid* ipc_rcu_alloc (size);int size;ArgumentsArgumentssize
size desired
DescriptionDescription
Allocate memory for the rcu header structure + the object.
Returns the pointer to the object.
NULL is returned if the allocation fails.
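The "header + object" layout can be illustrated in userspace: one allocation holds a bookkeeping header followed by the object, and the caller only ever sees the pointer to the object. The header fields and all toy_ names below are assumptions for this sketch and do not reproduce the kernel's structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping header placed in front of every object. */
struct toy_hdr {
    int refcount;
    size_t size;
};

/* Pad the header so the object after it is suitably aligned. */
union toy_hdr_slot {
    struct toy_hdr hdr;
    max_align_t align;
};

/* Allocate header + object, return a pointer to the object. */
void *toy_rcu_alloc(size_t size)
{
    union toy_hdr_slot *slot;

    if (size > SIZE_MAX - sizeof(*slot))
        return NULL;
    slot = malloc(sizeof(*slot) + size);
    if (!slot)
        return NULL;
    slot->hdr.refcount = 1;
    slot->hdr.size = size;
    return slot + 1;
}

/* Step back from the object pointer to its header. */
struct toy_hdr *toy_hdr_of(void *obj)
{
    return &((union toy_hdr_slot *)obj - 1)->hdr;
}

/* Free the whole allocation, starting at the header. */
void toy_rcu_free(void *obj)
{
    free((union toy_hdr_slot *)obj - 1);
}
```

Because the header sits at a fixed distance before the object, the free path can recover it from the object pointer alone.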
Nameipc_schedule_free --
free ipc + rcu space
SynopsisSynopsisvoid ipc_schedule_free (head);struct rcu_head * head;ArgumentsArgumentshead
RCU callback structure for queued work
DescriptionDescription
Since the RCU callback function is called in bh context,
we need to defer the vfree to schedule_work.
Nameipc_immediate_free --
free ipc + rcu space
SynopsisSynopsisvoid ipc_immediate_free (head);struct rcu_head * head;ArgumentsArgumentshead
RCU callback structure that contains pointer to be freed
DescriptionDescription
Free from the RCU callback context.
Nameipcperms --
check IPC permissions
SynopsisSynopsisint ipcperms (ipcp, flag);struct kern_ipc_perm * ipcp;short flag;ArgumentsArgumentsipcp
IPC permission set
flag
desired permission set.
DescriptionDescription
Check user, group, other permissions for access
to ipc resources. Returns 0 if allowed.
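A user/group/other check of this kind can be sketched as follows. This is a simplified userspace model, not the kernel routine: struct toy_perm, struct toy_cred and toy_ipcperms are hypothetical, and the model ignores supplementary groups and any capability-based override.

```c
#include <assert.h>

struct toy_perm {
    unsigned int uid, gid;    /* current owner */
    unsigned int cuid, cgid;  /* creator */
    unsigned short mode;      /* owner<<6 | group<<3 | other, rwx bits */
};

struct toy_cred {
    unsigned int euid, egid;  /* identity of the caller */
};

/* Returns 0 if the access requested in flag is allowed, -1 otherwise. */
int toy_ipcperms(const struct toy_cred *cred, const struct toy_perm *perm,
                 unsigned short flag)
{
    /* Fold the request into the low three bits, whichever class it named. */
    unsigned int requested = (flag >> 6) | (flag >> 3) | flag;
    unsigned int granted = perm->mode;

    /* Pick the class of mode bits that applies to this caller. */
    if (cred->euid == perm->cuid || cred->euid == perm->uid)
        granted >>= 6;
    else if (cred->egid == perm->cgid || cred->egid == perm->gid)
        granted >>= 3;

    /* Any requested bit that is not granted denies the access. */
    return (requested & ~granted & 0007) ? -1 : 0;
}
```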
Namekernel_to_ipc64_perm --
convert kernel ipc permissions to user
SynopsisSynopsisvoid kernel_to_ipc64_perm (in, out);struct kern_ipc_perm * in;struct ipc64_perm * out;ArgumentsArgumentsin
kernel permissions
out
new style IPC permissions
DescriptionDescription
Turn the kernel object 'in' into a set of permissions descriptions
for returning to userspace (out).
Nameipc64_perm_to_ipc_perm --
convert new ipc permissions to old
SynopsisSynopsisvoid ipc64_perm_to_ipc_perm (in, out);struct ipc64_perm * in;struct ipc_perm * out;ArgumentsArgumentsin
new style IPC permissions
out
old style IPC permissions
DescriptionDescription
Turn the new style permissions object 'in' into a compatibility
object and store it into the 'out' pointer.
Nameipc_parse_version --
IPC call version
SynopsisSynopsisint ipc_parse_version (cmd);int * cmd;ArgumentsArgumentscmd
pointer to command
DescriptionDescription
Return IPC_64 for new style IPC and IPC_OLD for old style IPC.
The cmd value is turned from an encoding command and version into
just the command code.
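Splitting a combined value into a version and a plain command code can be sketched with a single flag bit. The TOY_ constants and toy_parse_version below are stand-ins chosen for this example; the sketch assumes the version is carried as one bit ORed into the command.

```c
#include <assert.h>

#define TOY_IPC_OLD 0       /* assumed marker for old style calls */
#define TOY_IPC_64  0x0100  /* assumed flag bit for new style calls */

/* Strip the version flag from *cmd and report which style was used. */
int toy_parse_version(int *cmd)
{
    if (*cmd & TOY_IPC_64) {
        *cmd ^= TOY_IPC_64;  /* leave just the command code */
        return TOY_IPC_64;
    }
    return TOY_IPC_OLD;
}
```

After the call the caller can switch on the command code without caring which style the request arrived in.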
FIFO BufferFIFO BufferChapter 6. FIFO Bufferkfifo interfacekfifo interfaceName__kfifo_reset --
removes the entire FIFO contents, no locking version
SynopsisSynopsisvoid __kfifo_reset (fifo);struct kfifo * fifo;ArgumentsArgumentsfifo
the fifo to be emptied.
Namekfifo_reset --
removes the entire FIFO contents
SynopsisSynopsisvoid kfifo_reset (fifo);struct kfifo * fifo;ArgumentsArgumentsfifo
the fifo to be emptied.
Namekfifo_put --
puts some data into the FIFO
SynopsisSynopsisunsigned int kfifo_put (fifo, buffer, len);struct kfifo * fifo;unsigned char * buffer;unsigned int len;ArgumentsArgumentsfifo
the fifo to be used.
buffer
the data to be added.
len
the length of the data to be added.
DescriptionDescription
This function copies at most 'len' bytes from the 'buffer' into
the FIFO depending on the free space, and returns the number of
bytes copied.
Namekfifo_get --
gets some data from the FIFO
SynopsisSynopsisunsigned int kfifo_get (fifo, buffer, len);struct kfifo * fifo;unsigned char * buffer;unsigned int len;ArgumentsArgumentsfifo
the fifo to be used.
buffer
where the data must be copied.
len
the size of the destination buffer.
DescriptionDescription
This function copies at most 'len' bytes from the FIFO into the
'buffer' and returns the number of copied bytes.
Name__kfifo_len --
returns the number of bytes available in the FIFO, no locking version
SynopsisSynopsisunsigned int __kfifo_len (fifo);struct kfifo * fifo;ArgumentsArgumentsfifo
the fifo to be used.
Namekfifo_len --
returns the number of bytes available in the FIFO
SynopsisSynopsisunsigned int kfifo_len (fifo);struct kfifo * fifo;ArgumentsArgumentsfifo
the fifo to be used.
Namekfifo_init --
allocates a new FIFO using a preallocated buffer
SynopsisSynopsisstruct kfifo * kfifo_init (buffer, size, gfp_mask, lock);unsigned char * buffer;unsigned int size;gfp_t gfp_mask;spinlock_t * lock;ArgumentsArgumentsbuffer
the preallocated buffer to be used.
size
the size of the internal buffer, this has to be a power of 2.
gfp_mask
get_free_pages mask, passed to kmalloc
lock
the lock to be used to protect the fifo buffer
DescriptionDescription
Do NOT pass the kfifo to kfifo_free after use! Simply free the
struct kfifo with kfree.
Namekfifo_alloc --
allocates a new FIFO and its internal buffer
SynopsisSynopsisstruct kfifo * kfifo_alloc (size, gfp_mask, lock);unsigned int size;gfp_t gfp_mask;spinlock_t * lock;ArgumentsArgumentssize
the size of the internal buffer to be allocated.
gfp_mask
get_free_pages mask, passed to kmalloc
lock
the lock to be used to protect the fifo buffer
DescriptionDescription
The size will be rounded-up to a power of 2.
Namekfifo_free --
frees the FIFO
SynopsisSynopsisvoid kfifo_free (fifo);struct kfifo * fifo;ArgumentsArgumentsfifo
the fifo to be freed.
Name__kfifo_put --
puts some data into the FIFO, no locking version
SynopsisSynopsisunsigned int __kfifo_put (fifo, buffer, len);struct kfifo * fifo;unsigned char * buffer;unsigned int len;ArgumentsArgumentsfifo
the fifo to be used.
buffer
the data to be added.
len
the length of the data to be added.
DescriptionDescription
This function copies at most 'len' bytes from the 'buffer' into
the FIFO depending on the free space, and returns the number of
bytes copied.
Note that with only one concurrent reader and one concurrent
writer, you don't need extra locking to use these functions.
Name__kfifo_get --
gets some data from the FIFO, no locking version
SynopsisSynopsisunsigned int __kfifo_get (fifo, buffer, len);struct kfifo * fifo;unsigned char * buffer;unsigned int len;ArgumentsArgumentsfifo
the fifo to be used.
buffer
where the data must be copied.
len
the size of the destination buffer.
DescriptionDescription
This function copies at most 'len' bytes from the FIFO into the
'buffer' and returns the number of copied bytes.
Note that with only one concurrent reader and one concurrent
writer, you don't need extra locking to use these functions.
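The put and get behaviour, and the reason the buffer size has to be a power of 2, can be sketched with a small userspace ring buffer. The toy_fifo names are hypothetical and the sketch is single-threaded: it omits the memory barriers that a real lockless reader/writer pair would need.

```c
#include <assert.h>
#include <string.h>

struct toy_fifo {
    unsigned char *buffer;
    unsigned int size;  /* must be a power of 2 */
    unsigned int in;    /* free-running count of bytes written */
    unsigned int out;   /* free-running count of bytes read */
};

unsigned int toy_min(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/* Number of bytes currently stored; correct even after wrap-around. */
unsigned int toy_fifo_len(const struct toy_fifo *f)
{
    return f->in - f->out;
}

/* Copy at most len bytes in, limited by the free space. */
unsigned int toy_fifo_put(struct toy_fifo *f, const unsigned char *src,
                          unsigned int len)
{
    unsigned int off, first;

    len = toy_min(len, f->size - toy_fifo_len(f));
    off = f->in & (f->size - 1);        /* cheap modulo, needs 2^n size */
    first = toy_min(len, f->size - off);
    memcpy(f->buffer + off, src, first);        /* up to the buffer end */
    memcpy(f->buffer, src + first, len - first); /* wrapped remainder */
    f->in += len;
    return len;
}

/* Copy at most len bytes out, limited by the stored data. */
unsigned int toy_fifo_get(struct toy_fifo *f, unsigned char *dst,
                          unsigned int len)
{
    unsigned int off, first;

    len = toy_min(len, toy_fifo_len(f));
    off = f->out & (f->size - 1);
    first = toy_min(len, f->size - off);
    memcpy(dst, f->buffer + off, first);
    memcpy(dst + first, f->buffer, len - first);
    f->out += len;
    return len;
}
```

In this model only the writer changes in and only the reader changes out, which is the property the single reader, single writer note relies on.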
The proc filesystemThe proc filesystemChapter 7. The proc filesystemsysctl interfacesysctl interfaceNameregister_sysctl_table --
register a sysctl hierarchy
SynopsisSynopsisstruct ctl_table_header * register_sysctl_table (table, insert_at_head);ctl_table * table;int insert_at_head;ArgumentsArgumentstable
the top-level table structure
insert_at_head
whether the entry should be inserted in front or at the end
DescriptionDescription
Register a sysctl table hierarchy. table should be a filled in ctl_table
array. An entry with a ctl_name of 0 terminates the table.
The members of the ctl_table structure are used as follows:
ctl_name - This is the numeric sysctl value used by sysctl(2). The number
must be unique within that level of sysctl
procname - the name of the sysctl file under /proc/sys. Set to NULL to not
enter a sysctl file
data - a pointer to data for use by proc_handler
maxlen - the maximum size in bytes of the data
mode - the file permissions for the /proc/sys file, and for sysctl(2)
child - a pointer to the child sysctl table if this entry is a directory, or
NULL.
proc_handler - the text handler routine (described below)
strategy - the strategy routine (described below)
de - for internal use by the sysctl routines
extra1, extra2 - extra pointers usable by the proc handler routines
Leaf nodes in the sysctl tree will be represented by a single file
under /proc; non-leaf nodes will be represented by directories.
sysctl(2) can automatically manage read and write requests through
the sysctl table. The data and maxlen fields of the ctl_table
struct enable minimal validation of the values being written to be
performed, and the mode field allows minimal authentication.
More sophisticated management can be enabled by the provision of a
strategy routine with the table entry. This will be called before
any automatic read or write of the data is performed.
The strategy routine may return
< 0 - Error occurred (error is passed to user process)
0 - OK - proceed with automatic read or write.
> 0 - OK - read or write has been done by the strategy routine, so
return immediately.
There must be a proc_handler routine for any terminal nodes
mirrored under /proc/sys (non-terminals are handled by a built-in
directory handler). Several default handlers are available to
cover common cases -
proc_dostring, proc_dointvec, proc_dointvec_jiffies,
proc_dointvec_userhz_jiffies, proc_dointvec_minmax,
proc_doulongvec_ms_jiffies_minmax, proc_doulongvec_minmax
It is the handler's job to read the input buffer from user memory
and process it. The handler should return 0 on success.
This routine returns NULL on a failure to register, and a pointer
to the table header on success.
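The table terminator and the three strategy return conventions can be modelled in userspace. The struct below is a heavily reduced, hypothetical stand-in for a table entry, and toy_sysctl_write, reject_negative and store_doubled are invented for this sketch; none of it is the kernel's own code.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced, hypothetical table entry: only what the sketch needs. */
struct toy_ctl_entry {
    int ctl_name;          /* 0 terminates the table */
    const char *procname;
    int *data;
    int (*strategy)(struct toy_ctl_entry *entry, int *newval);
};

/* Example strategy: refuse negative values, otherwise proceed. */
int reject_negative(struct toy_ctl_entry *entry, int *newval)
{
    (void)entry;
    return *newval < 0 ? -22 : 0;
}

/* Example strategy: performs the write itself and says so. */
int store_doubled(struct toy_ctl_entry *entry, int *newval)
{
    *entry->data = *newval * 2;
    return 1;
}

/* Write newval to the entry called name; returns 0 or a negative error. */
int toy_sysctl_write(struct toy_ctl_entry *table, int name, int newval)
{
    struct toy_ctl_entry *entry;

    for (entry = table; entry->ctl_name != 0; entry++) {
        if (entry->ctl_name != name)
            continue;
        if (entry->strategy) {
            int rc = entry->strategy(entry, &newval);
            if (rc < 0)
                return rc;  /* error is passed to the caller */
            if (rc > 0)
                return 0;   /* strategy already did the write */
        }
        *entry->data = newval;  /* automatic write */
        return 0;
    }
    return -1;  /* no entry with that name at this level */
}
```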
Nameunregister_sysctl_table --
unregister a sysctl table hierarchy
SynopsisSynopsisvoid unregister_sysctl_table (header);struct ctl_table_header * header;ArgumentsArgumentsheader
the header returned from register_sysctl_table
DescriptionDescription
Unregisters the sysctl table and all children. proc entries may not
actually be removed until they are no longer used by anyone.
Nameproc_dostring --
read a string sysctl
SynopsisSynopsisint proc_dostring (table, write, filp, buffer, lenp, ppos);ctl_table * table;int write;struct file * filp;void __user * buffer;size_t * lenp;loff_t * ppos;ArgumentsArgumentstable
the sysctl table
write
TRUE if this is a write to the sysctl file
filp
the file structure
buffer
the user buffer
lenp
the size of the user buffer
ppos
file position
DescriptionDescription
Reads/writes a string from/to the user buffer. If the kernel
buffer provided is not large enough to hold the string, the
string is truncated. The copied string is NULL-terminated.
If the string is being read by the user process, it is copied
and a newline '\n' is added. It is truncated if the buffer is
not large enough.
Returns 0 on success.
Nameproc_dointvec --
read a vector of integers
SynopsisSynopsisint proc_dointvec (table, write, filp, buffer, lenp, ppos);ctl_table * table;int write;struct file * filp;void __user * buffer;size_t * lenp;loff_t * ppos;ArgumentsArgumentstable
the sysctl table
write
TRUE if this is a write to the sysctl file
filp
the file structure
buffer
the user buffer
lenp
the size of the user buffer
ppos
file position
DescriptionDescription
Reads/writes up to table->maxlen/sizeof(unsigned int) integer
values from/to the user buffer, treated as an ASCII string.
Returns 0 on success.
Nameproc_dointvec_minmax --
read a vector of integers with min/max values
SynopsisSynopsisint proc_dointvec_minmax (table, write, filp, buffer, lenp, ppos);ctl_table * table;int write;struct file * filp;void __user * buffer;size_t * lenp;loff_t * ppos;ArgumentsArgumentstable
the sysctl table
write
TRUE if this is a write to the sysctl file
filp
the file structure
buffer
the user buffer
lenp
the size of the user buffer
ppos
file position
DescriptionDescription
Reads/writes up to table->maxlen/sizeof(unsigned int) integer
values from/to the user buffer, treated as an ASCII string.
This routine will ensure the values are within the range specified by
table->extra1 (min) and table->extra2 (max).
Returns 0 on success.
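Parsing an ASCII vector of integers with a min/max check can be sketched in userspace as below. The function toy_parse_intvec is hypothetical, and rejecting the whole input on the first out-of-range value is a choice made for this sketch; the kernel handler may treat such values differently.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse up to max_vals whitespace-separated integers from text into vals.
 * Returns the number of values stored, or -1 on malformed or out-of-range
 * input (values parsed before the error may already have been stored).
 */
int toy_parse_intvec(const char *text, int *vals, int max_vals,
                     int min, int max)
{
    int n = 0;

    while (n < max_vals) {
        char *end;
        long v;

        while (isspace((unsigned char)*text))
            text++;
        if (*text == '\0')
            break;
        errno = 0;
        v = strtol(text, &end, 10);
        if (end == text || errno == ERANGE)
            return -1;          /* not a number */
        if (v < min || v > max)
            return -1;          /* outside [min, max] */
        vals[n++] = (int)v;
        text = end;
    }
    return n;
}
```

A caller modelling table->maxlen would pass max_vals as maxlen / sizeof(int), so that input beyond the size of the backing array is never stored.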
Nameproc_doulongvec_minmax --
read a vector of long integers with min/max values
SynopsisSynopsisint proc_doulongvec_minmax (table, write, filp, buffer, lenp, ppos);ctl_table * table;int write;struct file * filp;void __user * buffer;size_t * lenp;loff_t * ppos;ArgumentsArgumentstable
the sysctl table
write
TRUE if this is a write to the sysctl file
filp
the file structure
buffer
the user buffer
lenp
the size of the user buffer
ppos
file position
DescriptionDescription
Reads/writes up to table->maxlen/sizeof(unsigned long) unsigned long
values from/to the user buffer, treated as an ASCII string.
This routine will ensure the values are within the range specified by
table->extra1 (min) and table->extra2 (max).
Returns 0 on success.
Nameproc_doulongvec_ms_jiffies_minmax --
read a vector of millisecond values with min/max values
SynopsisSynopsisint proc_doulongvec_ms_jiffies_minmax (table, write, filp, buffer, lenp, ppos);ctl_table * table;int write;struct file * filp;void __user * buffer;size_t * lenp;loff_t * ppos;ArgumentsArgumentstable
the sysctl table
write
TRUE if this is a write to the sysctl file
filp
the file structure
buffer
the user buffer
lenp
the size of the user buffer
ppos
file position
DescriptionDescription
Reads/writes up to table->maxlen/sizeof(unsigned long) unsigned long
values from/to the user buffer, treated as an ASCII string. The values
are treated as milliseconds, and converted to jiffies when they are stored.
This routine will ensure the values are within the range specified by
table->extra1 (min) and table->extra2 (max).
Returns 0 on success.
Nameproc_dointvec_jiffies --
read a vector of integers as seconds
SynopsisSynopsisint proc_dointvec_jiffies (table, write, filp, buffer, lenp, ppos);ctl_table * table;int write;struct file * filp;void __user * buffer;size_t * lenp;loff_t * ppos;ArgumentsArgumentstable
the sysctl table
write
TRUE if this is a write to the sysctl file
filp
the file structure
buffer
the user buffer
lenp
the size of the user buffer
ppos
file position
DescriptionDescription
Reads/writes up to table->maxlen/sizeof(unsigned int) integer
values from/to the user buffer, treated as an ASCII string.
The values read are assumed to be in seconds, and are converted into
jiffies.
Returns 0 on success.
Nameproc_dointvec_userhz_jiffies --
read a vector of integers as 1/USER_HZ seconds
SynopsisSynopsisint proc_dointvec_userhz_jiffies (table, write, filp, buffer, lenp, ppos);ctl_table * table;int write;struct file * filp;void __user * buffer;size_t * lenp;loff_t * ppos;ArgumentsArgumentstable
the sysctl table
write
TRUE if this is a write to the sysctl file
filp
the file structure
buffer
the user buffer
lenp
the size of the user buffer
ppos
pointer to the file position
DescriptionDescription
Reads/writes up to table->maxlen/sizeof(unsigned int) integer
values from/to the user buffer, treated as an ASCII string.
The values read are assumed to be in 1/USER_HZ seconds, and
are converted into jiffies.
Returns 0 on success.
Nameproc_dointvec_ms_jiffies --
read a vector of integers as milliseconds
SynopsisSynopsisint proc_dointvec_ms_jiffies (table, write, filp, buffer, lenp, ppos);ctl_table * table;int write;struct file * filp;void __user * buffer;size_t * lenp;loff_t * ppos;ArgumentsArgumentstable
the sysctl table
write
TRUE if this is a write to the sysctl file
filp
the file structure
buffer
the user buffer
lenp
the size of the user buffer
ppos
the current position in the file
DescriptionDescription
Reads/writes up to table->maxlen/sizeof(unsigned int) integer
values from/to the user buffer, treated as an ASCII string.
The values read are assumed to be in 1/1000 seconds, and
are converted into jiffies.
Returns 0 on success.
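The three jiffies handlers differ only in the unit they assume for the user-visible value: seconds, 1/USER_HZ seconds, or milliseconds. A rough userspace sketch of the conversions follows. The tick rates are passed as parameters because they are configuration dependent, rounding up is a choice of this sketch rather than a statement about the kernel's helpers, and overflow is ignored.

```c
#include <assert.h>

/* Seconds to jiffies: one second is hz ticks. */
unsigned long toy_secs_to_jiffies(unsigned long secs, unsigned long hz)
{
    return secs * hz;
}

/* Milliseconds to jiffies, rounded up so short delays do not become 0. */
unsigned long toy_msecs_to_jiffies(unsigned long ms, unsigned long hz)
{
    return (ms * hz + 999) / 1000;
}

/* 1/user_hz second units to jiffies, rounded up. */
unsigned long toy_userhz_to_jiffies(unsigned long ticks, unsigned long hz,
                                    unsigned long user_hz)
{
    return (ticks * hz + user_hz - 1) / user_hz;
}
```

With an assumed rate of 250 ticks per second, 3 seconds become 750 jiffies and 10 milliseconds round up to 3 jiffies.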
proc filesystem interfaceproc filesystem interfaceNameproc_pid_unhash --
Unhash /proc/pid entry from the dcache.
SynopsisSynopsisstruct dentry * proc_pid_unhash (p);struct task_struct * p;ArgumentsArgumentsp
task that should be flushed.
DescriptionDescription
Drops the /proc/pid dcache entry from the hash chains.
Dropping /proc/pid entries and detach_pid must be synchronous,
otherwise e.g. /proc/pid/exe might point to the wrong executable,
if the pid value is immediately reused. This is enforced by
- caller must acquire spin_lock(p->proc_lock)
- must be called before detach_pid
- proc_pid_lookup acquires proc_lock, and checks that
the target is not dead by looking at the attach count
of PIDTYPE_PID.
Nameproc_pid_flush --
recover memory used by stale /proc/pid/x entries
SynopsisSynopsisvoid proc_pid_flush (proc_dentry);struct dentry * proc_dentry;ArgumentsArgumentsproc_dentry
directory to prune.
DescriptionDescription
Shrink the /proc directory that was used by the just killed thread.
The debugfs filesystemThe debugfs filesystemChapter 8. The debugfs filesystemdebugfs interfacedebugfs interfaceNamedebugfs_create_file --
create a file in the debugfs filesystem
SynopsisSynopsisstruct dentry * debugfs_create_file (name, mode, parent, data, fops);const char * name;mode_t mode;struct dentry * parent;void * data;struct file_operations * fops;ArgumentsArgumentsname
a pointer to a string containing the name of the file to create.
mode
the permission that the file should have
parent
a pointer to the parent dentry for this file. This should be a
directory dentry if set. If this parameter is NULL, then the
file will be created in the root of the debugfs filesystem.
data
a pointer to something that the caller will want to get to later
on. The inode.u.generic_ip pointer will point to this value on
the open call.
fops
a pointer to a struct file_operations that should be used for
this file.
DescriptionDescription
This is the basic “create a file” function for debugfs. It allows for a
wide range of flexibility in creating a file, or a directory (if you
want to create a directory, the debugfs_create_dir function is
recommended to be used instead.)
This function will return a pointer to a dentry if it succeeds. This
pointer must be passed to the debugfs_remove function when the file is
to be removed (no automatic cleanup happens if your module is unloaded,
you are responsible here.) If an error occurs, NULL will be returned.
If debugfs is not enabled in the kernel, the value -ENODEV will be
returned. It is not wise to check for this value, but rather, check for
NULL or !NULL instead as to eliminate the need for #ifdef in the calling
code.
Namedebugfs_create_dir --
create a directory in the debugfs filesystem
SynopsisSynopsisstruct dentry * debugfs_create_dir (name, parent);const char * name;struct dentry * parent;ArgumentsArgumentsname
a pointer to a string containing the name of the directory to
create.
parent
a pointer to the parent dentry for this file. This should be a
directory dentry if set. If this parameter is NULL, then the
directory will be created in the root of the debugfs filesystem.
DescriptionDescription
This function creates a directory in debugfs with the given name.
This function will return a pointer to a dentry if it succeeds. This
pointer must be passed to the debugfs_remove function when the directory is
to be removed (no automatic cleanup happens if your module is unloaded,
you are responsible here.) If an error occurs, NULL will be returned.
If debugfs is not enabled in the kernel, the value -ENODEV will be
returned. It is not wise to check for this value, but rather, check for
NULL or !NULL instead as to eliminate the need for #ifdef in the calling
code.
Namedebugfs_remove --
removes a file or directory from the debugfs filesystem
SynopsisSynopsisvoid debugfs_remove (dentry);struct dentry * dentry;ArgumentsArgumentsdentry
a pointer to the dentry of the file or directory to be
removed.
DescriptionDescription
This function removes a file or directory in debugfs that was previously
created with a call to another debugfs function (like
debugfs_create_file or variants thereof.)
This function is required to be called in order for the file to be
removed, no automatic cleanup of files will happen when a module is
removed, you are responsible here.
Namedebugfs_create_u8 --
create a file in the debugfs filesystem that is used to read and write an unsigned 8-bit value.
SynopsisSynopsisstruct dentry * debugfs_create_u8 (name, mode, parent, value);const char * name;mode_t mode;struct dentry * parent;u8 * value;ArgumentsArgumentsname
a pointer to a string containing the name of the file to create.
mode
the permission that the file should have
parent
a pointer to the parent dentry for this file. This should be a
directory dentry if set. If this parameter is NULL, then the
file will be created in the root of the debugfs filesystem.
value
a pointer to the variable that the file should read from and write
to.
DescriptionDescription
This function creates a file in debugfs with the given name that
contains the value of the variable value. If the mode variable is so
set, it can be read from, and written to.
This function will return a pointer to a dentry if it succeeds. This
pointer must be passed to the debugfs_remove function when the file is
to be removed (no automatic cleanup happens if your module is unloaded,
you are responsible here.) If an error occurs, NULL will be returned.
If debugfs is not enabled in the kernel, the value -ENODEV will be
returned. It is not wise to check for this value, but rather, check for
NULL or !NULL instead as to eliminate the need for #ifdef in the calling
code.
Namedebugfs_create_u16 --
create a file in the debugfs filesystem that is used to read and write an unsigned 16-bit value.
SynopsisSynopsisstruct dentry * debugfs_create_u16 (name, mode, parent, value);const char * name;mode_t mode;struct dentry * parent;u16 * value;ArgumentsArgumentsname
a pointer to a string containing the name of the file to create.
mode
the permission that the file should have
parent
a pointer to the parent dentry for this file. This should be a
directory dentry if set. If this parameter is NULL, then the
file will be created in the root of the debugfs filesystem.
value
a pointer to the variable that the file should read from and write
to.
DescriptionDescription
This function creates a file in debugfs with the given name that
contains the value of the variable value. If the mode variable is so
set, it can be read from, and written to.
This function will return a pointer to a dentry if it succeeds. This
pointer must be passed to the debugfs_remove function when the file is
to be removed (no automatic cleanup happens if your module is unloaded,
you are responsible here.) If an error occurs, NULL will be returned.
If debugfs is not enabled in the kernel, the value -ENODEV will be
returned. It is not wise to check for this value, but rather, check for
NULL or !NULL instead as to eliminate the need for #ifdef in the calling
code.
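The create, check and remove pattern described above can be sketched as follows. This is only a model: so that it compiles outside the kernel, struct dentry, debugfs_create_u16 and debugfs_remove are userspace stand-ins that copy the signatures from the synopsis, and the mydrv_* names are hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

/* --- userspace stand-ins (assumption: not the real kernel code) --- */
typedef uint16_t u16;
struct dentry { const char *name; u16 *value; };

static struct dentry *debugfs_create_u16(const char *name, mode_t mode,
					 struct dentry *parent, u16 *value)
{
	struct dentry *d = malloc(sizeof(*d));

	(void)mode;
	(void)parent;
	if (d) {
		d->name = name;
		d->value = value;
	}
	return d;	/* NULL on error, as documented */
}

static void debugfs_remove(struct dentry *dentry)
{
	free(dentry);
}
/* --- end of stand-ins --- */

static u16 mydrv_threshold = 42;	/* hypothetical driver variable */
static struct dentry *mydrv_dentry;

/* Create the file in the debugfs root; test the result for NULL only. */
static int mydrv_debug_init(void)
{
	mydrv_dentry = debugfs_create_u16("threshold", 0644, NULL,
					  &mydrv_threshold);
	if (!mydrv_dentry)
		return -ENOMEM;
	return 0;
}

/* No automatic cleanup happens, so remove the file explicitly. */
static void mydrv_debug_exit(void)
{
	debugfs_remove(mydrv_dentry);
	mydrv_dentry = NULL;
}
```

In a real module, mydrv_debug_init would be called from the module init function and mydrv_debug_exit from the module exit function.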
Name
debugfs_create_u32 -- create a file in the debugfs filesystem that is used to read and write an unsigned 32-bit value.
Synopsis
struct dentry * debugfs_create_u32 (name, mode, parent, value);
const char * name;
mode_t mode;
struct dentry * parent;
u32 * value;
Arguments
name
a pointer to a string containing the name of the file to create.
mode
the permission that the file should have
parent
a pointer to the parent dentry for this file. This should be a
directory dentry if set. If this parameter is NULL, then the
file will be created in the root of the debugfs filesystem.
value
a pointer to the variable that the file should read from and write
to.
Description
This function creates a file in debugfs with the given name that
contains the value of the variable value. If the mode variable is so
set, it can be read from, and written to.
This function will return a pointer to a dentry if it succeeds. This
pointer must be passed to the debugfs_remove function when the file is
to be removed (no automatic cleanup happens if your module is unloaded,
you are responsible here.) If an error occurs, NULL will be returned.
If debugfs is not enabled in the kernel, the value -ENODEV will be
returned. It is not wise to check for this value, but rather, check for
NULL or !NULL instead as to eliminate the need for #ifdef in the calling
code.
Name
debugfs_create_bool -- create a file in the debugfs filesystem that is used to read and write a boolean value.
Synopsis
struct dentry * debugfs_create_bool (name, mode, parent, value);
const char * name;
mode_t mode;
struct dentry * parent;
u32 * value;
Arguments
name
a pointer to a string containing the name of the file to create.
mode
the permission that the file should have
parent
a pointer to the parent dentry for this file. This should be a
directory dentry if set. If this parameter is NULL, then the
file will be created in the root of the debugfs filesystem.
value
a pointer to the variable that the file should read from and write
to.
Description
This function creates a file in debugfs with the given name that
contains the value of the variable value. If the mode variable is so
set, it can be read from, and written to.
This function will return a pointer to a dentry if it succeeds. This
pointer must be passed to the debugfs_remove function when the file is
to be removed (no automatic cleanup happens if your module is unloaded,
you are responsible here.) If an error occurs, NULL will be returned.
If debugfs is not enabled in the kernel, the value -ENODEV will be
returned. It is not wise to check for this value, but rather, check for
NULL or !NULL instead as to eliminate the need for #ifdef in the calling
code.
Chapter 9. The Linux VFS
The Filesystem types
Name
enum positive_aop_returns -- aop return codes with specific semantics
Synopsis
enum positive_aop_returns {
  AOP_WRITEPAGE_ACTIVATE,
  AOP_TRUNCATED_PAGE
};
Constants
AOP_WRITEPAGE_ACTIVATE
Informs the caller that page writeback has
completed, that the page is still locked, and
should be considered active. The VM uses this hint
to return the page to the active list -- it won't
be a candidate for writeback again in the near
future. Other callers must be careful to unlock
the page if they get this return. Returned by
writepage.
AOP_TRUNCATED_PAGE
The AOP method that was handed a locked page has
unlocked it and the page might have been truncated.
The caller should back up to acquiring a new page and
trying again. The aop will be taking reasonable
precautions not to livelock. If the caller held a page
reference, it should drop it before retrying. Returned
by readpage, prepare_write, and commit_write.
Description
address_space_operation functions return these large constants to indicate
special semantics to the caller. These are much larger than the bytes in a
page to allow for functions that return the number of bytes operated on in a
given page.
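As an illustration of why the constants sit far above any page size, the sketch below sorts a return value into error, byte count, or one of the two special codes. The numeric values are the ones I believe 2.6-era kernels use in include/linux/fs.h (an assumption, check your tree); classify_aop_return and enum aop_result are hypothetical names.

```c
/* Assumed values; both are far larger than the bytes in a page. */
enum positive_aop_returns {
	AOP_WRITEPAGE_ACTIVATE = 0x80000,
	AOP_TRUNCATED_PAGE     = 0x80001,
};

/* Hypothetical classification used only by this sketch. */
enum aop_result {
	AOP_RES_ERROR,		/* negative errno */
	AOP_RES_BYTES,		/* number of bytes operated on */
	AOP_RES_ACTIVATE,	/* page still locked, treat as active */
	AOP_RES_RETRY,		/* page was unlocked, get a new one */
};

static enum aop_result classify_aop_return(int ret)
{
	if (ret < 0)
		return AOP_RES_ERROR;
	if (ret == AOP_WRITEPAGE_ACTIVATE)
		return AOP_RES_ACTIVATE;
	if (ret == AOP_TRUNCATED_PAGE)
		return AOP_RES_RETRY;
	return AOP_RES_BYTES;
}
```

Because a byte count can never reach 0x80000 for a single page, the special codes cannot be confused with a legitimate count.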
Name
struct export_operations -- for nfsd to communicate with file systems
Synopsis
struct export_operations {
  struct dentry *(* decode_fh) (struct super_block *sb, __u32 *fh, int fh_len, int fh_type, int (*acceptable)(void *context, struct dentry *de), void *context);
  int (* encode_fh) (struct dentry *de, __u32 *fh, int *max_len, int connectable);
  int (* get_name) (struct dentry *parent, char *name, struct dentry *child);
  struct dentry * (* get_parent) (struct dentry *child);
  struct dentry * (* get_dentry) (struct super_block *sb, void *inump);
  struct dentry * (* find_exported_dentry) (struct super_block *sb, void *obj, void *parent, int (*acceptable)(void *context, struct dentry *de), void *context);
};
Members
decode_fh
decode a file handle fragment and return a struct dentry
encode_fh
encode a file handle fragment from a dentry
get_name
find the name for a given inode in a given directory
get_parent
find the parent of a given directory
get_dentry
find a dentry for the inode given a file handle sub-fragment
find_exported_dentry
set by the exporting module to a standard helper function.
Description
The export_operations structure provides a means for nfsd to communicate
with a particular exported file system, in particular enabling nfsd and
the filesystem to co-operate when dealing with file handles.
export_operations contains two basic operations for dealing with file
handles, decode_fh and encode_fh, and allows for some other
operations to be defined which standard helper routines use to get
specific information from the filesystem.
nfsd encodes the information used to determine which filesystem a filehandle
applies to in the initial part of the file handle. The remainder, termed
a file handle fragment, is controlled completely by the filesystem. The
standard helper routines assume that this fragment will contain one or
two sub-fragments, one which identifies the file, and one which may be
used to identify the (a) directory containing the file.
In some situations, nfsd needs to get a dentry which is connected into a
specific part of the file tree. To allow for this, it passes the
function acceptable together with a context which can be used to see
if the dentry is acceptable. As there can be multiple dentries for a
given file, the filesystem should check each one for acceptability before
looking for the next. As soon as an acceptable one is found, it should
be returned.
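The acceptability check can be pictured with a small userspace model. The struct dentry below is a stand-in with made-up fields, first_acceptable and only_connected are hypothetical helpers, and the callback signature is assumed to be int (*acceptable)(void *context, struct dentry *de).

```c
#include <stddef.h>

/* Userspace stand-in; the real struct dentry lives in <linux/dcache.h>. */
struct dentry {
	int id;
	int connected;
};

/* Return the first alias that passes the test, or NULL if none does. */
static struct dentry *first_acceptable(struct dentry *aliases, size_t n,
		int (*acceptable)(void *context, struct dentry *de),
		void *context)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (acceptable(context, &aliases[i]))
			return &aliases[i];
	return NULL;
}

/* Example test: accept only dentries marked as connected. */
static int only_connected(void *context, struct dentry *de)
{
	(void)context;
	return de->connected;
}
```

Each candidate is tested before the next one is looked at, and the search stops at the first one that is acceptable.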
decode_fh
decode_fh is given a struct super_block (sb), a file handle fragment
(fh, fh_len) and an acceptability testing function (acceptable,
context). It should return a struct dentry which refers to the same
file that the file handle fragment refers to, and which passes the
acceptability test. If it cannot, it should return a NULL pointer if
the file was found but no acceptable dentries were available, or an
ERR_PTR error code indicating why it couldn't be found (e.g. ENOENT or
ENOMEM).
encode_fh
encode_fh should store in the file handle fragment fh (using at most
max_len bytes) information that can be used by decode_fh to recover the
file referred to by the struct dentry de. If the connectable flag is
set, encode_fh should store sufficient information so that a good
attempt can be made to find not only the file but also its place in the
filesystem. This typically means storing a reference to de->d_parent in
the filehandle fragment. encode_fh should return the number of bytes
stored or a negative error code such as -ENOSPC.
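A toy encoder following the contract as worded above might look like this. It is a simplified model, not a filesystem implementation: struct toy_dentry and toy_encode_fh are hypothetical, max_len is passed by value rather than through a pointer, and the choice of inode number plus generation as the stored identity is an assumption.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the parts of a dentry this sketch needs. */
struct toy_dentry {
	uint32_t ino;
	uint32_t generation;
	struct toy_dentry *d_parent;
};

/*
 * Store the identity of de in fh. When connectable is set, also store
 * the identity of the parent directory. Returns the number of bytes
 * stored, or -ENOSPC if max_len bytes are not enough.
 */
static int toy_encode_fh(const struct toy_dentry *de, uint32_t *fh,
			 int max_len, int connectable)
{
	int words = connectable ? 4 : 2;
	int bytes = words * (int)sizeof(uint32_t);

	if (max_len < bytes)
		return -ENOSPC;
	fh[0] = de->ino;
	fh[1] = de->generation;
	if (connectable) {
		fh[2] = de->d_parent->ino;
		fh[3] = de->d_parent->generation;
	}
	return bytes;
}
```

A matching decoder would read the first pair to find the file and, if present, the second pair to locate a directory that contains it.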
get_name
get_name should find a name for the given child in the given parent
directory. The name should be stored in name (with the
understanding that it already points to a NAME_MAX+1 sized
buffer). get_name should return 0 on success or a negative error code
on error. get_name will be called without parent->i_mutex held.
get_parent
get_parent should find the parent directory for the given child which
is also a directory. In the event that it cannot be found, or storage
space cannot be allocated, an ERR_PTR should be returned.
get_dentry
Given a super_block (sb) and a pointer to a file-system specific inode
identifier, possibly an inode number, (inump) get_dentry should find
the identified inode and return a dentry for that inode. Any suitable
dentry can be returned including, if necessary, a new dentry created with
d_alloc_root. The caller can then find any other extant dentries by
following the d_alias links. If a new dentry was created using
d_alloc_root, DCACHE_NFSD_DISCONNECTED should be set, and the dentry
should be d_rehashed.
If the inode cannot be found, either a NULL pointer or an ERR_PTR code
can be returned. The inump will be whatever was passed to
nfsd_find_fh_dentry in either the obj or parent parameters.
Locking rules
get_parent is called with child->d_inode->i_mutex down.
get_name is not (which is possibly inconsistent).
The Directory Cache
Name
d_invalidate -- invalidate a dentry
Synopsis
int d_invalidate (dentry);
struct dentry * dentry;
Arguments
dentry
dentry to invalidate
Description
Try to invalidate the dentry if it turns out to be
possible. If there are other dentries that can be
reached through this one we can't delete it and we
return -EBUSY. On success we return 0.
Called without the dcache lock held.
Name
shrink_dcache_sb -- shrink dcache for a superblock
Synopsis
void shrink_dcache_sb (sb);
struct super_block * sb;
Arguments
sb
superblock
Description
Shrink the dcache for the specified super block. This
is used to free the dcache before unmounting a file
system.
Name
have_submounts -- check for mounts over a dentry
Synopsis
int have_submounts (parent);
struct dentry * parent;
Arguments
parent
dentry to check.
Description
Return true if the parent or its subdirectories contain
a mount point.
Name
shrink_dcache_parent -- prune dcache
Synopsis
void shrink_dcache_parent (parent);
struct dentry * parent;
Arguments
parent
parent of entries to prune
Description
Prune the dcache to remove unused children of the parent dentry.
Name
d_alloc -- allocate a dcache entry
Synopsis
struct dentry * d_alloc (parent, name);
struct dentry * parent;
const struct qstr * name;
Arguments
parent
parent of entry to allocate
name
qstr of the name
Description
Allocates a dentry. It returns NULL if there is insufficient memory
available. On a success the dentry is returned. The name passed in is
copied and the copy passed in may be reused after this call.
Name
d_instantiate -- fill in inode information for a dentry
Synopsis
void d_instantiate (entry, inode);
struct dentry * entry;
struct inode * inode;
Arguments
entry
dentry to complete
inode
inode to attach to this dentry
Description
Fill in inode information in the entry.
This turns negative dentries into productive full members
of society.
NOTE! This assumes that the inode count has been incremented
(or otherwise set) by the caller to indicate that it is now
in use by the dcache.
Name
d_instantiate_unique -- instantiate a non-aliased dentry
Synopsis
struct dentry * d_instantiate_unique (entry, inode);
struct dentry * entry;
struct inode * inode;
Arguments
entry
dentry to instantiate
inode
inode to attach to this dentry
Description
Fill in inode information in the entry. On success, it returns NULL.
If an unhashed alias of “entry” already exists, then we return the
aliased dentry instead and drop one reference to inode.
Note that in order to avoid conflicts with rename etc, the caller
had better be holding the parent directory semaphore.
This also assumes that the inode count has been incremented
(or otherwise set) by the caller to indicate that it is now
in use by the dcache.
Name
d_alloc_root -- allocate root dentry
Synopsis
struct dentry * d_alloc_root (root_inode);
struct inode * root_inode;
Arguments
root_inode
inode to allocate the root for
Description
Allocate a root (“/”) dentry for the inode given. The inode is
instantiated and returned. NULL is returned if there is insufficient
memory or the inode passed is NULL.
Name
d_alloc_anon -- allocate an anonymous dentry
Synopsis
struct dentry * d_alloc_anon (inode);
struct inode * inode;
Arguments
inode
inode to allocate the dentry for
Description
This is similar to d_alloc_root. It is used by filesystems when
creating a dentry for a given inode, often in the process of
mapping a filehandle to a dentry. The returned dentry may be
anonymous, or may have a full name (if the inode was already
in the cache). The file system may need to make further
efforts to connect this dentry into the dcache properly.
When called on a directory inode, we must ensure that
the inode only ever has one dentry. If a dentry is
found, that is returned instead of allocating a new one.
On successful return, the reference to the inode has been transferred
to the dentry. If NULL is returned (indicating kmalloc failure),
the reference on the inode has not been released.
Name
d_splice_alias -- splice a disconnected dentry into the tree if one exists
Synopsis
struct dentry * d_splice_alias (inode, dentry);
struct inode * inode;
struct dentry * dentry;
Arguments
inode
the inode which may have a disconnected dentry
dentry
a negative dentry which we want to point to the inode.
Description
If inode is a directory and has a 'disconnected' dentry (i.e. IS_ROOT and
DCACHE_DISCONNECTED), then d_move that in place of the given dentry
and return it, else simply d_add the inode to the dentry and return NULL.
This is needed in the lookup routine of any filesystem that is exportable
(via knfsd) so that we can build dcache paths to directories effectively.
If a dentry was found and moved, then it is returned. Otherwise NULL
is returned. This matches the expected return value of ->lookup.
Name
d_lookup -- search for a dentry
Synopsis
struct dentry * d_lookup (parent, name);
struct dentry * parent;
struct qstr * name;
Arguments
parent
parent dentry
name
qstr of name we wish to find
Description
Searches the children of the parent dentry for the name in question. If
the dentry is found its reference count is incremented and the dentry
is returned. The caller must use dput to free the entry when it has
finished using it. NULL is returned on failure.
__d_lookup is dcache_lock free. The hash list is protected using RCU.
Memory barriers are used while updating and doing lockless traversal.
To avoid races with d_move while rename is happening, d_lock is used.
Overflows in memcmp during a concurrent d_move are avoided by keeping the
length and name pointer in one structure pointed to by d_qstr.
rcu_read_lock and rcu_read_unlock are used to disable preemption while
lookup is going on.
dentry_unused list is not updated even if lookup finds the required dentry
in there. It is updated in places such as prune_dcache, shrink_dcache_sb,
select_parent and __dget_locked. This laziness saves lookup from dcache_lock
acquisition.
d_lookup is protected against concurrent renames in some unrelated
directory using the seqlock_t rename_lock.
Name
d_validate -- verify dentry provided from insecure source
Synopsis
int d_validate (dentry, dparent);
struct dentry * dentry;
struct dentry * dparent;
Arguments
dentry
The dentry alleged to be valid child of dparent
dparent
The parent dentry (known to be valid)
Description
An insecure source has sent us a dentry, here we verify it and dget it.
This is used by ncpfs in its readdir implementation.
Zero is returned if the dentry is invalid.
Name
d_delete -- delete a dentry
Synopsis
void d_delete (dentry);
struct dentry * dentry;
Arguments
dentry
The dentry to delete
Description
Turn the dentry into a negative dentry if possible, otherwise
remove it from the hash queues so it can be deleted later.
Name
d_rehash -- add an entry back to the hash
Synopsis
void d_rehash (entry);
struct dentry * entry;
Arguments
entry
dentry to add to the hash
Description
Adds a dentry to the hash according to its name.
Name
d_move -- move a dentry
Synopsis
void d_move (dentry, target);
struct dentry * dentry;
struct dentry * target;
Arguments
dentry
entry to move
target
new dentry
Description
Update the dcache to reflect the move of a file name. Negative
dcache entries should not be moved in this way.
Name
find_inode_number -- check for dentry with name
Synopsis
ino_t find_inode_number (dir, name);
struct dentry * dir;
struct qstr * name;
Arguments
dir
directory to check
name
Name to find.
Description
Check whether a dentry already exists for the given name,
and return the inode number if it has an inode. Otherwise
0 is returned.
This routine is used to post-process directory listings for
filesystems using synthetic inode numbers, and is necessary
to keep getcwd working.
Name
__d_drop -- drop a dentry
Synopsis
void __d_drop (dentry);
struct dentry * dentry;
Arguments
dentry
dentry to drop
Description
d_drop unhashes the entry from the parent dentry hashes, so that it won't
be found through a VFS lookup any more. Note that this is different from
deleting the dentry - d_delete will try to mark the dentry negative if
possible, giving a successful _negative_ lookup, while d_drop will
just make the cache lookup fail.
d_drop is used mainly for stuff that wants to invalidate a dentry for some
reason (NFS timeouts or autofs deletes).
__d_drop requires dentry->d_lock.
Name
d_add -- add dentry to hash queues
Synopsis
void d_add (entry, inode);
struct dentry * entry;
struct inode * inode;
Arguments
entry
dentry to add
inode
The inode to attach to this dentry
Description
This adds the entry to the hash queues and initializes inode.
The entry was actually filled in earlier during d_alloc.
Name
d_add_unique -- add dentry to hash queues without aliasing
Synopsis
struct dentry * d_add_unique (entry, inode);
struct dentry * entry;
struct inode * inode;
Arguments
entry
dentry to add
inode
The inode to attach to this dentry
Description
This adds the entry to the hash queues and initializes inode.
The entry was actually filled in earlier during d_alloc.
Name
dget -- get a reference to a dentry
Synopsis
struct dentry * dget (dentry);
struct dentry * dentry;
Arguments
dentry
dentry to get a reference to
Description
Given a dentry or NULL pointer increment the reference count
if appropriate and return the dentry. A dentry will not be
destroyed when it has references. dget should never be
called for dentries with zero reference counter. For these cases
(preferably none, functions in dcache.c are sufficient for normal
needs and they take necessary precautions) you should hold dcache_lock
and call dget_locked instead of dget.
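The reference counting rule can be modelled in userspace as follows. This is a sketch only: struct toy_dentry, toy_dget and toy_dput are hypothetical names, the counter is a plain int rather than an atomic, and no locking is shown.

```c
#include <assert.h>
#include <stddef.h>

struct toy_dentry {
	int d_count;	/* number of references held */
	int freed;	/* set once the last reference is dropped */
};

/* Take a reference. NULL is passed through unchanged. */
static struct toy_dentry *toy_dget(struct toy_dentry *d)
{
	if (d) {
		/* Never called on an entry whose count is already zero. */
		assert(d->d_count > 0);
		d->d_count++;
	}
	return d;
}

/* Drop a reference. The last put marks the entry as destroyed. */
static void toy_dput(struct toy_dentry *d)
{
	if (d && --d->d_count == 0)
		d->freed = 1;
}
```

The assertion in toy_dget mirrors the rule that a reference may only be taken by someone who can already guarantee the entry is alive.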
Name
d_unhashed -- is dentry hashed
Synopsis
int d_unhashed (dentry);
struct dentry * dentry;
Arguments
dentry
entry to check
Description
Returns true if the dentry passed is not currently hashed.
Inode Handling
Name
clear_inode -- clear an inode
Synopsis
void clear_inode (inode);
struct inode * inode;
Arguments
inode
inode to clear
Description
This is called by the filesystem to tell us
that the inode is no longer useful. We just
terminate it with extreme prejudice.
Name
invalidate_inodes -- discard the inodes on a device
Synopsis
int invalidate_inodes (sb);
struct super_block * sb;
Arguments
sb
superblock
Description
Discard all of the inodes for a given superblock. If the discard
fails because there are busy inodes then a non zero value is returned.
If the discard is successful all the inodes have been discarded.
Name
new_inode -- obtain an inode
Synopsis
struct inode * new_inode (sb);
struct super_block * sb;
Arguments
sb
superblock
Description
Allocates a new inode for given superblock.
Name
iunique -- get a unique inode number
Synopsis
ino_t iunique (sb, max_reserved);
struct super_block * sb;
ino_t max_reserved;
Arguments
sb
superblock
max_reserved
highest reserved inode number
Description
Obtain an inode number that is unique on the system for a given
superblock. This is used by file systems that have no natural
permanent inode numbering system. An inode number is returned that
is higher than the reserved limit but unique.
BUGS
With a large number of inodes live on the file system this function
currently becomes quite slow.
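A rough userspace model shows both the behaviour and the reason for the slowdown noted above: every candidate number has to be checked against the inodes already in use. The names toy_iunique and toy_in_use are hypothetical, and the in_use callback stands in for the inode cache lookup.

```c
/*
 * Return a number above max_reserved that in_use() reports as free.
 * counter persists between calls so that successive results differ.
 */
static unsigned long toy_iunique(unsigned long *counter,
				 unsigned long max_reserved,
				 int (*in_use)(unsigned long ino))
{
	unsigned long res;

	for (;;) {
		/* Skip the reserved range, also after a wrap-around. */
		if (*counter <= max_reserved)
			*counter = max_reserved + 1;
		res = (*counter)++;
		if (!in_use(res))
			return res;
	}
}

/* Example predicate: pretend inodes 11 and 12 are already live. */
static int toy_in_use(unsigned long ino)
{
	return ino == 11 || ino == 12;
}
```

With many live inodes the loop may have to reject a long run of candidates before it finds a free number.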
Name
ilookup5_nowait -- search for an inode in the inode cache
Synopsis
struct inode * ilookup5_nowait (sb, hashval, test, data);
struct super_block * sb;
unsigned long hashval;
int (*test) (struct inode *, void *);
void * data;
Arguments
sb
super block of file system to search
hashval
hash value (usually inode number) to search for
test
callback used for comparisons between inodes
data
opaque data pointer to pass to test
Description
ilookup5_nowait uses ifind to search for the inode specified by hashval and
data in the inode cache. This is a generalized version of ilookup for
file systems where the inode number is not sufficient for unique
identification of an inode.
If the inode is in the cache, the inode is returned with an incremented
reference count. Note, the inode lock is not waited upon so you have to be
very careful what you do with the returned inode. You probably should be
using ilookup5 instead.
Otherwise NULL is returned.
Note, test is called with the inode_lock held, so can't sleep.
Name
ilookup5 -- search for an inode in the inode cache
Synopsis
struct inode * ilookup5 (sb, hashval, test, data);
struct super_block * sb;
unsigned long hashval;
int (*test) (struct inode *, void *);
void * data;
Arguments
sb
super block of file system to search
hashval
hash value (usually inode number) to search for
test
callback used for comparisons between inodes
data
opaque data pointer to pass to test
Description
ilookup5 uses ifind to search for the inode specified by hashval and
data in the inode cache. This is a generalized version of ilookup for
file systems where the inode number is not sufficient for unique
identification of an inode.
If the inode is in the cache, the inode lock is waited upon and the inode is
returned with an incremented reference count.
Otherwise NULL is returned.
Note, test is called with the inode_lock held, so can't sleep.
Name
ilookup -- search for an inode in the inode cache
Synopsis
struct inode * ilookup (sb, ino);
struct super_block * sb;
unsigned long ino;
Arguments
sb
super block of file system to search
ino
inode number to search for
Description
ilookup uses ifind_fast to search for the inode ino in the inode cache.
This is for file systems where the inode number is sufficient for unique
identification of an inode.
If the inode is in the cache, the inode is returned with an incremented
reference count.
Otherwise NULL is returned.
Name
iget5_locked -- obtain an inode from a mounted file system
Synopsis
struct inode * iget5_locked (sb, hashval, test, set, data);
struct super_block * sb;
unsigned long hashval;
int (*test) (struct inode *, void *);
int (*set) (struct inode *, void *);
void * data;
Arguments
sb
super block of file system
hashval
hash value (usually inode number) to get
test
callback used for comparisons between inodes
set
callback used to initialize a new struct inode
data
opaque data pointer to pass to test and set
Description
This is iget without the read_inode portion of get_new_inode.
iget5_locked uses ifind to search for the inode specified by hashval
and data in the inode cache and if present it is returned with an increased
reference count. This is a generalized version of iget_locked for file
systems where the inode number is not sufficient for unique identification
of an inode.
If the inode is not in cache, get_new_inode is called to allocate a new
inode and this is returned locked, hashed, and with the I_NEW flag set. The
file system gets to fill it in before unlocking it via unlock_new_inode.
Note both test and set are called with the inode_lock held, so can't sleep.
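The interplay of hashval, test and set can be illustrated with a small userspace cache. Everything here is a model under stated assumptions: struct toy_inode, toy_iget5, toy_test, toy_set and TOY_I_NEW are hypothetical, there is no superblock and no locking, and the key field stands for whatever extra identity a filesystem needs beyond the hash value.

```c
#include <stdlib.h>

#define TOY_I_NEW	1
#define TOY_NBUCKETS	8

struct toy_inode {
	unsigned long hashval;
	unsigned long key;	/* fs-specific identity beyond hashval */
	int i_count;
	int i_state;
	struct toy_inode *next;
};

static struct toy_inode *toy_hash[TOY_NBUCKETS];

/* Find a cached inode matching hashval and test, or create a new one. */
static struct toy_inode *toy_iget5(unsigned long hashval,
		int (*test)(struct toy_inode *, void *),
		int (*set)(struct toy_inode *, void *), void *data)
{
	struct toy_inode **head = &toy_hash[hashval % TOY_NBUCKETS];
	struct toy_inode *inode;

	for (inode = *head; inode; inode = inode->next)
		if (inode->hashval == hashval && test(inode, data)) {
			inode->i_count++;	/* cache hit */
			return inode;
		}

	inode = calloc(1, sizeof(*inode));
	if (!inode)
		return NULL;
	if (set(inode, data)) {		/* let the fs initialize it */
		free(inode);
		return NULL;
	}
	inode->hashval = hashval;
	inode->i_count = 1;
	inode->i_state = TOY_I_NEW;	/* caller fills it in, then clears */
	inode->next = *head;
	*head = inode;
	return inode;
}

static int toy_test(struct toy_inode *inode, void *data)
{
	return inode->key == *(unsigned long *)data;
}

static int toy_set(struct toy_inode *inode, void *data)
{
	inode->key = *(unsigned long *)data;
	return 0;
}
```

A caller that gets back an inode with TOY_I_NEW set would fill it in and then clear the flag, which plays the role of unlock_new_inode in this model.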
Name
iget_locked -- obtain an inode from a mounted file system
Synopsis
struct inode * iget_locked (sb, ino);
struct super_block * sb;
unsigned long ino;
Arguments
sb
super block of file system
ino
inode number to get
Description
This is iget without the read_inode portion of get_new_inode_fast.
iget_locked uses ifind_fast to search for the inode specified by ino in
the inode cache and if present it is returned with an increased reference
count. This is for file systems where the inode number is sufficient for
unique identification of an inode.
If the inode is not in cache, get_new_inode_fast is called to allocate a
new inode and this is returned locked, hashed, and with the I_NEW flag set.
The file system gets to fill it in before unlocking it via
unlock_new_inode.
Name
__insert_inode_hash -- hash an inode
Synopsis
void __insert_inode_hash (inode, hashval);
struct inode * inode;
unsigned long hashval;
Arguments
inode
unhashed inode
hashval
unsigned long value used to locate this object in the
inode_hashtable.
Description
Add an inode to the inode hash for this superblock.
Name
remove_inode_hash -- remove an inode from the hash
Synopsis
void remove_inode_hash (inode);
struct inode * inode;
Arguments
inode
inode to unhash
Description
Remove an inode from the superblock.
Name
iput -- put an inode
Synopsis
void iput (inode);
struct inode * inode;
Arguments
inode
inode to put
Description
Puts an inode, dropping its usage count. If the inode use count hits
zero, the inode is then freed and may also be destroyed.
Consequently, iput can sleep.
Name
bmap -- find a block number in a file
Synopsis
sector_t bmap (inode, block);
struct inode * inode;
sector_t block;
Arguments
inode
inode of file
block
block to find
Description
Returns the block number on the device holding the inode that
is the disk block number for the block of the file requested.
That is, asked for block 4 of inode 1 the function will return the
disk block relative to the disk start that holds that block of the
file.
Name
touch_atime -- update the access time
Synopsis
void touch_atime (mnt, dentry);
struct vfsmount * mnt;
struct dentry * dentry;
Arguments
mnt
mount the inode is accessed on
dentry
dentry accessed
Description
Update the accessed time on an inode and mark it for writeback.
This function automatically handles read only file systems and media,
as well as the “noatime” flag and inode specific “noatime” markers.
Name
file_update_time -- update mtime and ctime
Synopsis
void file_update_time (file);
struct file * file;
Arguments
file
file accessed
Description
Update the mtime and ctime members of an inode and mark the inode
for writeback. Note that this function is meant exclusively for
usage in the file write path of filesystems, and filesystems may
choose to explicitly ignore update via this function with the
S_NOCTIME inode flag, e.g. for network filesystems where these
timestamps are handled by the server.
Name
make_bad_inode -- mark an inode bad due to an I/O error
Synopsis
void make_bad_inode (inode);
struct inode * inode;
Arguments
inode
Inode to mark bad
Description
When an inode cannot be read due to a media or remote network
failure this function makes the inode “bad” and causes I/O operations
on it to fail from this point on.
Name
is_bad_inode -- is an inode errored
Synopsis
int is_bad_inode (inode);
struct inode * inode;
Arguments
inode
inode to test
Description
Returns true if the inode in question has been marked as bad.
Registration and Superblocks
Name
deactivate_super -- drop an active reference to superblock
Synopsis
void deactivate_super (s);
struct super_block * s;
Arguments
s
superblock to deactivate
Description
Drops an active reference to superblock, acquiring a temporary one if
there are no active references left. In that case we lock superblock,
tell fs driver to shut it down and drop the temporary reference we
had just acquired.
Name
generic_shutdown_super -- common helper for ->kill_sb
Synopsis
void generic_shutdown_super (sb);
struct super_block * sb;
Arguments
sb
superblock to kill
Description
generic_shutdown_super does all fs-independent work on superblock
shutdown. Typical ->kill_sb should pick all fs-specific objects
that need destruction out of superblock, call generic_shutdown_super
and release aforementioned objects. Note: dentries and inodes _are_
taken care of and do not need specific handling.
Name
sget -- find or create a superblock
Synopsis
struct super_block * sget (type, test, set, data);
struct file_system_type * type;
int (*test) (struct super_block *,void *);
int (*set) (struct super_block *,void *);
void * data;
Arguments
type
filesystem type superblock should belong to
test
comparison callback
set
setup callback
data
argument to each of them
Nameget_super --
get the superblock of a device
SynopsisSynopsisstruct super_block * get_super (bdev);struct block_device * bdev;ArgumentsArgumentsbdev
device to get the superblock for
DescriptionDescription
Scans the superblock list and finds the superblock of the file system
mounted on the device given. NULL is returned if no match is found.
File LocksFile LocksNameposix_lock_file --
Apply a POSIX-style lock to a file
SynopsisSynopsisint posix_lock_file (filp, fl);struct file * filp;struct file_lock * fl;ArgumentsArgumentsfilp
The file to apply the lock to
fl
The lock to be applied
DescriptionDescription
Add a POSIX style lock to a file.
We merge adjacent & overlapping locks whenever possible.
POSIX locks are sorted by owner task, then by starting address
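The merging of adjacent and overlapping locks can be pictured with a small user-space sketch. It is not the kernel's implementation: struct range, merge_ranges and the sorted-array representation are names and choices assumed here for illustration, and the sketch only models the locks of a single owner and a single lock type, with non-negative offsets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one owner's byte-range lock: [start, end], inclusive. */
struct range {
    long start;
    long end;
};

/*
 * Merge ranges that overlap or touch. The input must be sorted by start
 * and use non-negative offsets. The merged ranges are written back into
 * r[] and their count is returned.
 */
size_t merge_ranges(struct range *r, size_t n)
{
    size_t out = 0;

    if (n == 0)
        return 0;

    for (size_t i = 1; i < n; i++) {
        if (r[i].start - 1 <= r[out].end) {
            /* Overlapping or adjacent: extend the current range. */
            if (r[i].end > r[out].end)
                r[out].end = r[i].end;
        } else {
            /* A gap separates the two: start a new merged range. */
            r[++out] = r[i];
        }
    }
    return out + 1;
}
```

Given the ranges [0,9], [10,19], [15,30] and [40,49], the sketch leaves two ranges, [0,30] and [40,49].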
Nameposix_lock_file_wait --
Apply a POSIX-style lock to a file
SynopsisSynopsisint posix_lock_file_wait (filp, fl);struct file * filp;struct file_lock * fl;ArgumentsArgumentsfilp
The file to apply the lock to
fl
The lock to be applied
DescriptionDescription
Add a POSIX style lock to a file.
We merge adjacent & overlapping locks whenever possible.
POSIX locks are sorted by owner task, then by starting address
Namelocks_mandatory_area --
Check for a conflicting lock
SynopsisSynopsisint locks_mandatory_area (read_write, inode, filp, offset, count);int read_write;struct inode * inode;struct file * filp;loff_t offset;size_t count;ArgumentsArgumentsread_write
FLOCK_VERIFY_WRITE for exclusive access, FLOCK_VERIFY_READ
for shared
inode
the file to check
filp
how the file was opened (if it was)
offset
start of area to check
count
length of area to check
DescriptionDescription
Searches the inode's list of locks to find any POSIX locks which conflict.
This function is called from rw_verify_area and
locks_verify_truncate.
Name__break_lease --
revoke all outstanding leases on file
SynopsisSynopsisint __break_lease (inode, mode);struct inode * inode;unsigned int mode;ArgumentsArgumentsinode
the inode of the file to return
mode
the open mode (read or write)
DescriptionDescription
break_lease (inlined for speed) has already checked that there
is a lease on this file. Leases are broken on a call to open
or truncate. This function can sleep unless you
specified O_NONBLOCK to your open.
Namelease_get_mtime --
SynopsisSynopsisvoid lease_get_mtime (inode, time);struct inode * inode;struct timespec * time;ArgumentsArgumentsinode
the inode
time
pointer to a timespec which will contain the last modified time
DescriptionDescription
This is to force NFS clients to flush their caches for files with
exclusive leases. The justification is that if someone has an
exclusive lease, then they could be modifying it.
Nameflock_lock_file_wait --
Apply a FLOCK-style lock to a file
SynopsisSynopsisint flock_lock_file_wait (filp, fl);struct file * filp;struct file_lock * fl;ArgumentsArgumentsfilp
The file to apply the lock to
fl
The lock to be applied
DescriptionDescription
Add a FLOCK style lock to a file.
Nameposix_block_lock --
blocks waiting for a file lock
SynopsisSynopsisvoid posix_block_lock (blocker, waiter);struct file_lock * blocker;struct file_lock * waiter;ArgumentsArgumentsblocker
the lock which is blocking
waiter
the lock which conflicts and has to wait
DescriptionDescription
lockd needs to block waiting for locks.
Nameposix_unblock_lock --
stop waiting for a file lock
SynopsisSynopsisint posix_unblock_lock (filp, waiter);struct file * filp;struct file_lock * waiter;ArgumentsArgumentsfilp
how the file was opened
waiter
the lock which was waiting
DescriptionDescription
lockd needs to block waiting for locks.
Namelock_may_read --
checks that the region is free of locks
SynopsisSynopsisint lock_may_read (inode, start, len);struct inode * inode;loff_t start;unsigned long len;ArgumentsArgumentsinode
the inode that is being read
start
the first byte to read
len
the number of bytes to read
DescriptionDescription
Emulates Windows locking requirements. Whole-file
mandatory locks (share modes) can prohibit a read and
byte-range POSIX locks can prohibit a read if they overlap.
N.B. this function is only ever called
from knfsd and ownership of locks is never checked.
Namelock_may_write --
checks that the region is free of locks
SynopsisSynopsisint lock_may_write (inode, start, len);struct inode * inode;loff_t start;unsigned long len;ArgumentsArgumentsinode
the inode that is being written
start
the first byte to write
len
the number of bytes to write
DescriptionDescription
Emulates Windows locking requirements. Whole-file
mandatory locks (share modes) can prohibit a write and
byte-range POSIX locks can prohibit a write if they overlap.
N.B. this function is only ever called
from knfsd and ownership of locks is never checked.
Namelocks_mandatory_locked --
Check for an active lock
SynopsisSynopsisint locks_mandatory_locked (inode);struct inode * inode;ArgumentsArgumentsinode
the file to check
DescriptionDescription
Searches the inode's list of locks to find any POSIX locks which conflict.
This function is called from locks_verify_locked only.
Namefcntl_getlease --
Enquire what lease is currently active
SynopsisSynopsisint fcntl_getlease (filp);struct file * filp;ArgumentsArgumentsfilp
the file
DescriptionDescription
The value returned by this function will be one of
(if no lease break is pending):
F_RDLCK to indicate a shared lease is held.
F_WRLCK to indicate an exclusive lease is held.
F_UNLCK to indicate no lease is held.
(if a lease break is pending):
F_RDLCK to indicate an exclusive lease needs to be
changed to a shared lease (or removed).
F_UNLCK to indicate the lease needs to be removed.
XXXXXX
sfr & willy disagree over whether F_INPROGRESS
should be returned to userspace.
Name__setlease --
sets a lease on an open file
SynopsisSynopsisint __setlease (filp, arg, flp);struct file * filp;long arg;struct file_lock ** flp;ArgumentsArgumentsfilp
file pointer
arg
type of lease to obtain
flp
input - file_lock to use, output - file_lock inserted
DescriptionDescription
The (input) flp->fl_lmops->fl_break function is required
by break_lease.
Called with kernel lock held.
Namefcntl_setlease --
sets a lease on an open file
SynopsisSynopsisint fcntl_setlease (fd, filp, arg);unsigned int fd;struct file * filp;long arg;ArgumentsArgumentsfd
open file descriptor
filp
file pointer
arg
type of lease to obtain
DescriptionDescription
Call this fcntl to establish a lease on the file.
Note that you also need to call F_SETSIG to
receive a signal when the lease is broken.
Namesys_flock --
flock system call.
SynopsisSynopsislong sys_flock (fd, cmd);unsigned int fd;unsigned int cmd;ArgumentsArgumentsfd
the file descriptor to lock.
cmd
the type of lock to apply.
DescriptionDescription
Apply a FL_FLOCK style lock to an open file descriptor.
The cmd can be one of
LOCK_SH -- a shared lock.
LOCK_EX -- an exclusive lock.
LOCK_UN -- remove an existing lock.
LOCK_MAND -- a `mandatory' flock. This exists to emulate Windows Share Modes.
LOCK_MAND can be combined with LOCK_READ or LOCK_WRITE to allow other
processes read and write access respectively.
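From user space this call is reached through flock(2). The following is a minimal sketch, assuming a writable /tmp and a local filesystem. It shows that an exclusive lock held through one open file description makes a second, non-blocking LOCK_EX request on the same file fail. The helper name flock_conflict_demo and the temporary file name are made up for the example.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Returns 1 if a second LOCK_EX request is refused while the first is
 * held, 0 if it is (unexpectedly) granted, and -1 on a setup error.
 */
int flock_conflict_demo(void)
{
    char path[] = "/tmp/flock-demo-XXXXXX";
    int fd1, fd2, result = -1;

    fd1 = mkstemp(path);
    if (fd1 < 0)
        return -1;

    /* A separate open() creates a second open file description. */
    fd2 = open(path, O_RDWR);
    if (fd2 < 0)
        goto out_fd1;

    if (flock(fd1, LOCK_EX) != 0)
        goto out_fd2;

    /* LOCK_NB makes the conflicting request fail instead of sleeping. */
    if (flock(fd2, LOCK_EX | LOCK_NB) == 0)
        result = 0;
    else if (errno == EWOULDBLOCK)
        result = 1;

    flock(fd1, LOCK_UN);
out_fd2:
    close(fd2);
out_fd1:
    close(fd1);
    unlink(path);
    return result;
}
```

The lock belongs to the open file description, not to the descriptor number, which is why the sketch opens the file twice instead of calling dup().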
Nameget_locks_status --
reports lock usage in /proc/locks
SynopsisSynopsisint get_locks_status (buffer, start, offset, length);char * buffer;char ** start;off_t offset;int length;ArgumentsArgumentsbuffer
address in userspace to write into
start
?
offset
how far we are through the buffer
length
how much to read
Other FunctionsOther FunctionsNamempage_readpages --
populate an address space with some pages, and
SynopsisSynopsisint mpage_readpages (mapping, pages, nr_pages, get_block);struct address_space * mapping;struct list_head * pages;unsigned nr_pages;get_block_t get_block;ArgumentsArgumentsmapping
the address_space
pages
The address of a list_head which contains the target pages. These
pages have their ->index populated and are otherwise uninitialised.
nr_pages
The number of pages at *pages
get_block
The filesystem's block mapper function.
DescriptionDescription
This function walks the pages and the blocks within each page, building and
emitting large BIOs.
If anything unusual happens, such as:
- encountering a page which has buffers
- encountering a page which has a non-hole after a hole
- encountering a page with non-contiguous blocks
then this code just gives up and calls the buffer_head-based read function.
It does handle a page which has holes at the end - that is a common case:
the end-of-file on blocksize < PAGE_CACHE_SIZE setups.
BH_Boundary explanationBH_Boundary explanation
There is a problem. The mpage read code assembles several pages, gets all
their disk mappings, and then submits them all. That's fine, but obtaining
the disk mappings may require I/O. Reads of indirect blocks, for example.
So an mpage read of the first 16 blocks of an ext2 file will cause I/O to be
submitted in the following order:
12 0 1 2 3 4 5 6 7 8 9 10 11 13 14 15 16
because the indirect block has to be read to get the mappings of blocks
13,14,15,16. Obviously, this impacts performance.
So what we do is to allow the filesystem's get_block function to set
BH_Boundary when it maps block 11. BH_Boundary says: mapping of the block
after this one will require I/O against a block which is probably close to
this one. So you should push what I/O you have currently accumulated.
This all causes the disk requests to be issued in the correct order.
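The effect of the boundary hint on submission order can be reproduced with a toy model. The sketch below is user-space code, not the mpage implementation. It assumes the layout from the example above (12 directly mapped blocks at disk blocks 0 to 11, the indirect block at disk block 12, and four more data blocks at 13 to 16), and all names in it (toy_get_block, toy_mpage_read, struct io_log) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NR_DATA   16   /* data blocks covered by the read */
#define NR_DIRECT 12   /* blocks mapped without the indirect block */
#define INDIRECT  12   /* disk block that holds the indirect block */

struct io_log {
    int order[NR_DATA + 1];  /* disk blocks in submission order */
    size_t n;
};

static void submit(struct io_log *log, int disk_block)
{
    log->order[log->n++] = disk_block;
}

/*
 * Toy block mapper: translates a logical block into a disk block.
 * Mapping the first indirectly addressed block reads the indirect block
 * at once. *boundary is set for the last direct block, in the way a
 * filesystem would set BH_Boundary.
 */
static int toy_get_block(struct io_log *log, int logical, int *boundary)
{
    *boundary = (logical == NR_DIRECT - 1);
    if (logical < NR_DIRECT)
        return logical;
    if (logical == NR_DIRECT)
        submit(log, INDIRECT);
    return logical + 1;
}

/* Map every block, batching the data reads; optionally honour the hint. */
void toy_mpage_read(struct io_log *log, int honour_boundary)
{
    int pending[NR_DATA];
    size_t npending = 0;

    log->n = 0;
    for (int logical = 0; logical < NR_DATA; logical++) {
        int boundary;

        pending[npending++] = toy_get_block(log, logical, &boundary);
        if (honour_boundary && boundary) {
            /* Push what has accumulated before the mapper needs I/O. */
            for (size_t i = 0; i < npending; i++)
                submit(log, pending[i]);
            npending = 0;
        }
    }
    for (size_t i = 0; i < npending; i++)
        submit(log, pending[i]);
}
```

With honour_boundary set to 0 the log starts with block 12, as in the order shown above. With it set to 1 the blocks are submitted in ascending order, 0 through 16.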
Namempage_writepages --
walk the list of dirty pages of the given
SynopsisSynopsisint mpage_writepages (mapping, wbc, get_block);struct address_space * mapping;struct writeback_control * wbc;get_block_t get_block;ArgumentsArgumentsmapping
address space structure to write
wbc
subtract the number of written pages from *wbc->nr_to_write
get_block
the filesystem's block mapper function.
If this is NULL then use a_ops->writepage. Otherwise, go
direct-to-BIO.
DescriptionDescription
This is a library function, which implements the writepages
address_space_operation.
If a page is already under I/O, generic_writepages skips it, even
if it's dirty. This is desirable behaviour for memory-cleaning writeback,
but it is INCORRECT for data-integrity system calls such as fsync. fsync
and msync need to guarantee that all the data which was dirty at the time
the call was made get new I/O started against them. If wbc->sync_mode is
WB_SYNC_ALL then we were called for data integrity and we must wait for
existing IO to complete.
Namegeneric_permission --
check for access rights on a Posix-like filesystem
SynopsisSynopsisint generic_permission (inode, mask, check_acl);struct inode * inode;int mask;int (*check_acl)
(struct inode *inode, int mask);ArgumentsArgumentsinode
inode to check access rights for
mask
right to check for (MAY_READ, MAY_WRITE, MAY_EXEC)
check_acl
optional callback to check for Posix ACLs
DescriptionDescription
Used to check for read/write/execute permissions on a file.
We use “fsuid” for this, letting us set arbitrary permissions
for filesystem access without changing the “normal” uids which
are used for other things.
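The core of such a check, selecting the owner, group or other permission bits according to the fsuid and comparing them with the requested mask, can be sketched in user space as follows. This is a simplified illustration and not the kernel function: struct toy_inode and toy_permission are hypothetical, the MAY_* values are redefined locally, and the sketch leaves out the check_acl callback, supplementary groups and capability overrides.

```c
#include <assert.h>
#include <errno.h>

/* Local copies of the mask values so the sketch stands alone. */
#define MAY_EXEC  1
#define MAY_WRITE 2
#define MAY_READ  4

/* Hypothetical, reduced inode: only the fields this check needs. */
struct toy_inode {
    unsigned int mode;  /* permission bits, e.g. 0640 */
    unsigned int uid;
    unsigned int gid;
};

/*
 * Returns 0 if every right in mask is granted and -EACCES otherwise.
 * The owner, group or other bits are chosen from fsuid and fsgid.
 */
int toy_permission(const struct toy_inode *inode, int mask,
                   unsigned int fsuid, unsigned int fsgid)
{
    unsigned int mode = inode->mode;
    unsigned int want = (unsigned int)mask & (MAY_READ | MAY_WRITE | MAY_EXEC);

    if (fsuid == inode->uid)
        mode >>= 6;             /* owner bits */
    else if (fsgid == inode->gid)
        mode >>= 3;             /* group bits */
    /* otherwise the low three bits already are the "other" bits */

    return ((mode & want) == want) ? 0 : -EACCES;
}
```

For an inode with mode 0640, the owner may read and write, a member of the group may only read, and everyone else is refused.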
Namevfs_permission --
check for access rights to a given path
SynopsisSynopsisint vfs_permission (nd, mask);struct nameidata * nd;int mask;ArgumentsArgumentsnd
lookup result that describes the path
mask
right to check for (MAY_READ, MAY_WRITE, MAY_EXEC)
DescriptionDescription
Used to check for read/write/execute permissions on a path.
We use “fsuid” for this, letting us set arbitrary permissions
for filesystem access without changing the “normal” uids which
are used for other things.
Namefile_permission --
check for additional access rights to a given file
SynopsisSynopsisint file_permission (file, mask);struct file * file;int mask;ArgumentsArgumentsfile
file to check access rights for
mask
right to check for (MAY_READ, MAY_WRITE, MAY_EXEC)
DescriptionDescription
Used to check for read/write/execute permissions on an already opened
file.
NoteNote
Do not use this function in new code. All access checks should
be done using vfs_permission.
Namelookup_create --
lookup a dentry, creating it if it doesn't exist
SynopsisSynopsisstruct dentry * lookup_create (nd, is_dir);struct nameidata * nd;int is_dir;ArgumentsArgumentsnd
nameidata info
is_dir
directory flag
DescriptionDescription
Simple function to look up and return a dentry, creating it
if it doesn't exist. Is SMP-safe.
Returns with nd->dentry->d_inode->i_mutex locked.
Namefreeze_bdev --
lock a filesystem and force it into a consistent state
SynopsisSynopsisstruct super_block * freeze_bdev (bdev);struct block_device * bdev;ArgumentsArgumentsbdev
blockdevice to lock
DescriptionDescription
This takes the block device bd_mount_sem to make sure no new mounts
happen on bdev until thaw_bdev is called.
If a superblock is found on this device, we take the s_umount semaphore
on it to make sure nobody unmounts until the snapshot creation is done.
Namethaw_bdev --
unlock filesystem
SynopsisSynopsisvoid thaw_bdev (bdev, sb);struct block_device * bdev;struct super_block * sb;ArgumentsArgumentsbdev
blockdevice to unlock
sb
associated superblock
DescriptionDescription
Unlocks the filesystem and marks it writeable again after freeze_bdev.
Namesync_mapping_buffers --
write out and wait upon a mapping's “associated”
SynopsisSynopsisint sync_mapping_buffers (mapping);struct address_space * mapping;ArgumentsArgumentsmapping
the mapping which wants those buffers written
DescriptionDescription
Starts I/O against the buffers at mapping->private_list, and waits upon
that I/O.
Basically, this is a convenience function for fsync.
mapping is a file or directory which needs those buffers to be written for
a successful fsync.
Namemark_buffer_dirty --
mark a buffer_head as needing writeout
SynopsisSynopsisvoid fastcall mark_buffer_dirty (bh);struct buffer_head * bh;ArgumentsArgumentsbh
the buffer_head to mark dirty
DescriptionDescription
mark_buffer_dirty will set the dirty bit against the buffer, then set its
backing page dirty, then tag the page as dirty in its address_space's radix
tree and then attach the address_space's inode to its superblock's dirty
inode list.
mark_buffer_dirty is atomic. It takes bh->b_page->mapping->private_lock,
mapping->tree_lock and the global inode_lock.
Name__bread --
reads a specified block and returns the bh
SynopsisSynopsisstruct buffer_head * __bread (bdev, block, size);struct block_device * bdev;sector_t block;int size;ArgumentsArgumentsbdev
the block_device to read from
block
number of block
size
size (in bytes) to read
DescriptionDescription
Reads a specified block, and returns buffer head that contains it.
It returns NULL if the block was unreadable.
Nametry_to_release_page --
release old fs-specific metadata on a page
SynopsisSynopsisint try_to_release_page (page, gfp_mask);struct page * page;gfp_t gfp_mask;ArgumentsArgumentspage
the page which the kernel is trying to free
gfp_mask
memory allocation flags (and I/O mode)
DescriptionDescription
The address_space is to try to release any data against the page
(presumably at page->private). If the release was successful, return `1'.
Otherwise return zero.
The gfp_mask argument specifies whether I/O may be performed to release
this page (__GFP_IO), and whether the call may block (__GFP_WAIT).
NOTENOTE
gfp_mask may go away, and this function may become non-blocking.
Nameblock_invalidatepage --
invalidate part or all of a buffer-backed page
SynopsisSynopsisint block_invalidatepage (page, offset);struct page * page;unsigned long offset;ArgumentsArgumentspage
the page which is affected
offset
the index of the truncation point
DescriptionDescription
block_invalidatepage is called when all or part of the page has become
invalidated by a truncate operation.
block_invalidatepage does not have to release all buffers, but it must
ensure that no dirty buffer is left outside offset and that no I/O
is underway against any of the blocks which are outside the truncation
point, because the caller is about to free (and possibly reuse) those
blocks on-disk.
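Which buffers of a page are affected by a given truncation point can be worked out with simple arithmetic. The sketch below is an illustration under assumptions, not the kernel code: it assumes a 4096-byte page split into equally sized buffers, and it assumes that a buffer straddling the truncation point is kept while every buffer starting at or beyond it is discarded. The name toy_buffers_to_discard is made up.

```c
#include <assert.h>

#define TOY_PAGE_SIZE 4096u

/*
 * Returns a bitmask in which bit i is set if buffer i (bsize bytes each)
 * starts at or beyond the truncation offset and would be discarded.
 * bsize is assumed to divide the page size; 0 is rejected.
 */
unsigned int toy_buffers_to_discard(unsigned int offset, unsigned int bsize)
{
    unsigned int mask = 0;
    unsigned int i = 0;

    if (bsize == 0)
        return 0;

    for (unsigned int start = 0; start < TOY_PAGE_SIZE; start += bsize, i++) {
        if (offset <= start)
            mask |= 1u << i;
    }
    return mask;
}
```

With 1024-byte buffers, an offset of 0 selects all four buffers, while an offset of 1500 selects only the last two, since the second buffer still holds data below the truncation point.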
Namell_rw_block --
low-level access to block devices (DEPRECATED)
SynopsisSynopsisvoid ll_rw_block (rw, nr, bhs[]);int rw;int nr;struct buffer_head * bhs[];ArgumentsArgumentsrw
whether to READ or WRITE or SWRITE or maybe READA (readahead)
nr
number of &struct buffer_heads in the array
bhs[]
array of pointers to &struct buffer_head
DescriptionDescription
ll_rw_block takes an array of pointers to &struct buffer_heads, and
requests an I/O operation on them, either a READ or a WRITE. A third mode,
SWRITE, is like WRITE except that we make sure that the *current* data in
the buffers is sent to disk. The fourth option, READA, is described in the
documentation for generic_make_request, which ll_rw_block calls.
This function drops any buffer that it cannot get a lock on (with the
BH_Lock state bit) unless SWRITE is required, any buffer that appears to be
clean when doing a write request, and any buffer that appears to be
up-to-date when doing a read request. Further, it marks as clean the buffers
that are processed for writing (the buffer cache won't assume that they are
actually clean until the buffer gets unlocked).
ll_rw_block sets b_end_io to a simple completion handler that marks
the buffer up-to-date (if appropriate), unlocks the buffer and wakes
any waiters.
All of the buffers must be for the same device, and must also be a
multiple of the current approved size for the device.
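The rules for dropping buffers can be summarised as a small decision function. This is a user-space sketch of the behaviour described above, not the kernel code: struct toy_bh reduces the buffer state to three flags, the enum and function names are hypothetical, READA is left out, and where the kernel would wait for a locked buffer under SWRITE the sketch simply proceeds.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_rw { TOY_READ, TOY_WRITE, TOY_SWRITE };

/* Hypothetical reduced buffer state; the kernel keeps these as state bits. */
struct toy_bh {
    bool locked;
    bool dirty;
    bool uptodate;
};

/*
 * Returns true if the buffer would be submitted for I/O and false if it
 * would be dropped. A submitted buffer is left locked; for writes its
 * dirty flag is cleared as it is handed over.
 */
bool toy_should_submit(struct toy_bh *bh, enum toy_rw rw)
{
    if (bh->locked && rw != TOY_SWRITE)
        return false;               /* lock not available: drop */
    bh->locked = true;              /* SWRITE would wait here instead */

    if (rw == TOY_READ) {
        if (bh->uptodate) {
            bh->locked = false;     /* nothing to read */
            return false;
        }
        return true;
    }

    if (!bh->dirty) {
        bh->locked = false;         /* nothing to write */
        return false;
    }
    bh->dirty = false;              /* marked clean while under I/O */
    return true;
}
```

In the sketch a submitted buffer stays locked until I/O completes, which matches the statement that the buffer cache does not treat the buffer as clean until it is unlocked.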
Namebio_alloc_bioset --
allocate a bio for I/O
SynopsisSynopsisstruct bio * bio_alloc_bioset (gfp_mask, nr_iovecs, bs);gfp_t gfp_mask;int nr_iovecs;struct bio_set * bs;ArgumentsArgumentsgfp_mask
the GFP_ mask given to the slab allocator
nr_iovecs
number of iovecs to pre-allocate
bs
the bio_set to allocate from
DescriptionDescription
bio_alloc_bioset will first try its own mempool to satisfy the allocation.
If __GFP_WAIT is set then we will block on the internal pool waiting
for a &struct bio to become free.
Allocates the bio and iovecs from the memory pools specified by the
bio_set structure.
Namebio_put --
release a reference to a bio
SynopsisSynopsisvoid bio_put (bio);struct bio * bio;ArgumentsArgumentsbio
bio to release reference to
DescriptionDescription
Put a reference to a &struct bio, either one you have gotten with
bio_alloc or bio_get. The last put of a bio will free it.
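The get/put discipline is ordinary reference counting. The following user-space sketch illustrates the idea only. struct toy_bio and its helpers are hypothetical, the counter is a plain int where real kernel code would use an atomic counter, and the freed flag exists purely so the example can observe the release.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object standing in for a bio. */
struct toy_bio {
    int refcnt;
    int *freed;   /* set to 1 when the object is released */
};

/* Allocation hands back an object holding one reference. */
struct toy_bio *toy_bio_alloc(int *freed)
{
    struct toy_bio *bio = malloc(sizeof(*bio));

    if (!bio)
        return NULL;
    bio->refcnt = 1;
    bio->freed = freed;
    *freed = 0;
    return bio;
}

/* Take an additional reference. */
void toy_bio_get(struct toy_bio *bio)
{
    bio->refcnt++;
}

/* Drop a reference; the last put frees the object. */
void toy_bio_put(struct toy_bio *bio)
{
    if (--bio->refcnt == 0) {
        *bio->freed = 1;
        free(bio);
    }
}
```

Every toy_bio_get must be balanced by one toy_bio_put, plus one final put for the reference obtained at allocation.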
Name__bio_clone --
clone a bio
SynopsisSynopsisvoid __bio_clone (bio, bio_src);struct bio * bio;struct bio * bio_src;ArgumentsArgumentsbio
destination bio
bio_src
bio to clone
DescriptionDescription
Clone a &bio. Caller will own the returned bio, but not
the actual data it points to. Reference count of returned
bio will be one.
Namebio_clone --
clone a bio
SynopsisSynopsisstruct bio * bio_clone (bio, gfp_mask);struct bio * bio;gfp_t gfp_mask;ArgumentsArgumentsbio
bio to clone
gfp_mask
allocation priority
DescriptionDescription
Like __bio_clone, only also allocates the returned bio.
Namebio_get_nr_vecs --
return approx number of vecs
SynopsisSynopsisint bio_get_nr_vecs (bdev);struct block_device * bdev;ArgumentsArgumentsbdev
I/O target
DescriptionDescription
Return the approximate number of pages we can send to this target.
There's no guarantee that you will be able to fit this number of pages
into a bio; it does not account for dynamic restrictions that vary
with offset.
Namebio_add_pc_page --
attempt to add page to bio
SynopsisSynopsisint bio_add_pc_page (q, bio, page, len, offset);request_queue_t * q;struct bio * bio;struct page * page;unsigned int len;unsigned int offset;ArgumentsArgumentsq
the target queue
bio
destination bio
page
page to add
len
vec entry length
offset
vec entry offset
DescriptionDescription
Attempt to add a page to the bio_vec maplist. This can fail for a
number of reasons, such as the bio being full or target block
device limitations. The target block device must allow bios
smaller than PAGE_SIZE, so it is always possible to add a single
page to an empty bio. This should only be used by REQ_PC bios.
Namebio_add_page --
attempt to add page to bio
SynopsisSynopsisint bio_add_page (bio, page, len, offset);struct bio * bio;struct page * page;unsigned int len;unsigned int offset;ArgumentsArgumentsbio
destination bio
page
page to add
len
vec entry length
offset
vec entry offset
DescriptionDescription
Attempt to add a page to the bio_vec maplist. This can fail for a
number of reasons, such as the bio being full or target block
device limitations. The target block device must allow bios
smaller than PAGE_SIZE, so it is always possible to add a single
page to an empty bio.
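The two failure causes named above, a full bio and a device limit, can be modelled in a few lines. This is a toy model and not the block layer: the structures, the fixed capacity of four entries, the max_bytes limit and the return convention (len on success, 0 on failure) are all assumptions of the sketch.

```c
#include <assert.h>

#define TOY_PAGE_SIZE 4096u
#define TOY_MAX_VECS  4u

/* Hypothetical, much-reduced stand-ins for a bio_vec entry and a bio. */
struct toy_bvec {
    unsigned int len;
    unsigned int offset;
};

struct toy_vec_bio {
    struct toy_bvec vecs[TOY_MAX_VECS];
    unsigned int nr_vecs;    /* entries of vecs[] in use */
    unsigned int size;       /* bytes queued so far */
    unsigned int max_bytes;  /* assumed per-request limit of the device */
};

/* Returns len if the segment was added, 0 if the bio cannot take it. */
unsigned int toy_add_page(struct toy_vec_bio *bio,
                          unsigned int len, unsigned int offset)
{
    if (len == 0 || offset >= TOY_PAGE_SIZE || len > TOY_PAGE_SIZE - offset)
        return 0;                   /* segment must fit inside one page */
    if (bio->nr_vecs >= TOY_MAX_VECS)
        return 0;                   /* bio is full */
    if (len > bio->max_bytes - bio->size)
        return 0;                   /* device limit would be exceeded */

    bio->vecs[bio->nr_vecs].len = len;
    bio->vecs[bio->nr_vecs].offset = offset;
    bio->nr_vecs++;
    bio->size += len;
    return len;
}
```

A caller that gets 0 back would typically submit the bio built so far and start a new one for the remaining pages.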
Namebio_uncopy_user --
finish previously mapped bio
SynopsisSynopsisint bio_uncopy_user (bio);struct bio * bio;ArgumentsArgumentsbio
bio being terminated
DescriptionDescription
Free pages allocated from bio_copy_user and write back data
to user space in case of a read.
Namebio_copy_user --
copy user data to bio
SynopsisSynopsisstruct bio * bio_copy_user (q, uaddr, len, write_to_vm);request_queue_t * q;unsigned long uaddr;unsigned int len;int write_to_vm;ArgumentsArgumentsq
destination block queue
uaddr
start of user address
len
length in bytes
write_to_vm
bool indicating writing to pages or not
DescriptionDescription
Prepares and returns a bio for indirect user io, bouncing data
to/from kernel pages as necessary. Must be paired with
a call to bio_uncopy_user on I/O completion.
Namebio_map_user --
map user address into bio
SynopsisSynopsisstruct bio * bio_map_user (q, bdev, uaddr, len, write_to_vm);request_queue_t * q;struct block_device * bdev;unsigned long uaddr;unsigned int len;int write_to_vm;ArgumentsArgumentsq
the request_queue_t for the bio
bdev
destination block device
uaddr
start of user address
len
length in bytes
write_to_vm
bool indicating writing to pages or not
DescriptionDescription
Map the user space address into a bio suitable for io to a block
device. Returns an error pointer in case of error.
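An error pointer encodes a negative errno value in the pointer itself, so a caller must test the result before dereferencing it. The sketch below imitates that convention in user space. The toy_ helpers and the 4095 limit are assumptions modelled on the kernel's error-pointer macros, and toy_map is a made-up function that only exists to produce both outcomes.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *toy_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long toy_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if ptr lies in the top TOY_MAX_ERRNO addresses, i.e. is an error. */
static inline int toy_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Hypothetical mapper: fails with -EINVAL for a zero length. */
void *toy_map(void *buf, unsigned int len)
{
    if (len == 0)
        return toy_err_ptr(-EINVAL);
    return buf;
}
```

A caller checks the result with toy_is_err first and reads the error code with toy_ptr_err only in the failure case. A NULL check alone would miss the error.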
Namebio_unmap_user --
unmap a bio
SynopsisSynopsisvoid bio_unmap_user (bio);struct bio * bio;ArgumentsArgumentsbio
the bio being unmapped
DescriptionDescription
Unmap a bio previously mapped by bio_map_user. Must be called from
process context.
bio_unmap_user may sleep.
Namebio_map_kern --
map kernel address into bio
SynopsisSynopsisstruct bio * bio_map_kern (q, data, len, gfp_mask);request_queue_t * q;void * data;unsigned int len;gfp_t gfp_mask;ArgumentsArgumentsq
the request_queue_t for the bio
data
pointer to buffer to map
len
length in bytes
gfp_mask
allocation flags for bio allocation
DescriptionDescription
Map the kernel address into a bio suitable for io to a block
device. Returns an error pointer in case of error.
Namebio_endio --
end I/O on a bio
SynopsisSynopsisvoid bio_endio (bio, bytes_done, error);struct bio * bio;unsigned int bytes_done;int error;ArgumentsArgumentsbio
bio
bytes_done
number of bytes completed