numa_set_bind_policy(3) -- Linux man page
NAME
numa - NUMA policy library
SYNOPSIS
#include <numa.h>
cc ... -lnuma
int numa_available(void)
int numa_max_node(void)
int numa_preferred(void)
long numa_node_size(int node, long *freep)
nodemask_t numa_all_nodes
nodemask_t numa_no_nodes
int numa_node_to_cpus(int node, unsigned long *buffer, int bufferlen)
void numa_set_interleave_mask(nodemask_t *nodemask)
nodemask_t numa_get_interleave_mask(void)
void numa_bind(nodemask_t *nodemask)
void numa_set_preferred(int node)
void numa_set_localalloc(int flag)
void numa_set_membind(nodemask_t *nodemask)
nodemask_t numa_get_membind(void)
void *numa_alloc_interleaved_subset(size_t size, nodemask_t *nodemask)
void *numa_alloc_interleaved(size_t size)
void *numa_alloc_onnode(size_t size, int node)
void *numa_alloc_local(size_t size)
void *numa_alloc(size_t size)
void numa_free(void *start, size_t size)
int numa_run_on_node_mask(nodemask_t *nodemask)
int numa_run_on_node(int node)
nodemask_t numa_get_run_node_mask(void)
void numa_interleave_memory(void *start, size_t size, nodemask_t *nodemask)
void numa_tonode_memory(void *start, size_t size, int node)
void numa_tonodemask_memory(void *start, size_t size, nodemask_t *nodemask)
void numa_setlocal_memory(void *start, size_t size)
void numa_police_memory(void *start, size_t size)
void numa_set_bind_policy(int strict)
void numa_set_strict(int strict)
void numa_error(char *where)
extern int numa_exit_on_error
DESCRIPTION
libnuma
offers a simple programming interface to the
NUMA policy supported by the
Linux kernel. On a NUMA (Non Uniform Memory Access) architecture some
memory areas have different latency or bandwidth than others.
Available policies are page interleaving, preferred node allocation, local allocation,
and allocation only on specific nodes.
It also allows binding threads to specific nodes. Every policy exists per thread, but is
inherited by children. For setting global policy per process it is easiest
to run the program using the
numactl(8)
utility. For more fine-grained policy inside an application this library
can be used.
NUMA memory allocation policy only takes effect when a page is actually
faulted into the address space of a process by accessing it. The
numa_alloc_*
functions take care of this automatically.
A node is defined as an area where all memory has the same speed as seen from
a particular CPU. Caches are ignored for this definition.
The mapping of nodes to CPUs depends on the architecture. On the
AMD64
architecture each CPU is its own node. This library is concerned only with nodes.
Before any other calls in this library can be used,
numa_available
must be called. When it returns a negative value, the behavior of all other functions in this
library is undefined.
numa_max_node
returns the highest node number available on the current system. When a node
number or a node mask with a bit set above the value returned by this function
is passed to a
libnuma
function, the result is undefined. The
numa_node_size
function returns the memory size of a node. When the argument
freep
is not NULL the free memory of the node is written to it.
On error it returns -1.
Some of these functions accept or return a
nodemask.
A nodemask has type
nodemask_t
which is an abstract bitmap type containing a bit set of nodes.
The maximum node number depends
on the architecture, but is not bigger than
NUMA_MAX_NODE.
What happens in
libnuma
calls when bits above
numa_max_node
are passed is undefined.
A
nodemask_t
should only be manipulated with the
nodemask_zero,
nodemask_clr,
nodemask_isset,
nodemask_set
functions.
nodemask_zero
clears a
nodemask_t,
nodemask_isset
returns true when
node
is set in the passed
nodemask,
nodemask_clr
clears
node
in
nodemask,
nodemask_set
sets
node
in
nodemask.
The predefined variable
numa_all_nodes
has all available nodes set,
numa_no_nodes
is the empty set.
nodemask_equal
returns nonzero when the two nodemasks are equal.
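The bit-set behaviour of these helpers can be illustrated with a small stand-alone sketch. It does not use libnuma: sketch_mask_t, SKETCH_MAX_NODE and the sketch_* functions are hypothetical names invented for this example, and the real layout of nodemask_t may differ.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for nodemask_t: one bit per node number. */
#define SKETCH_MAX_NODE 63 /* assumed limit, in the role of NUMA_MAX_NODE */
#define SKETCH_WORD_BITS (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_WORDS ((SKETCH_MAX_NODE + SKETCH_WORD_BITS) / SKETCH_WORD_BITS)

typedef struct {
    unsigned long word[SKETCH_WORDS];
} sketch_mask_t;

/* Counterpart of nodemask_zero: clear every bit. */
static void sketch_zero(sketch_mask_t *mask)
{
    memset(mask, 0, sizeof *mask);
}

/* Counterpart of nodemask_set: add node to the set. */
static void sketch_set(sketch_mask_t *mask, int node)
{
    assert(node >= 0 && node <= SKETCH_MAX_NODE);
    mask->word[(size_t)node / SKETCH_WORD_BITS] |=
        1UL << ((size_t)node % SKETCH_WORD_BITS);
}

/* Counterpart of nodemask_clr: remove node from the set. */
static void sketch_clr(sketch_mask_t *mask, int node)
{
    assert(node >= 0 && node <= SKETCH_MAX_NODE);
    mask->word[(size_t)node / SKETCH_WORD_BITS] &=
        ~(1UL << ((size_t)node % SKETCH_WORD_BITS));
}

/* Counterpart of nodemask_isset: 1 when node is in the set, else 0. */
static int sketch_isset(const sketch_mask_t *mask, int node)
{
    assert(node >= 0 && node <= SKETCH_MAX_NODE);
    return (int)((mask->word[(size_t)node / SKETCH_WORD_BITS] >>
                  ((size_t)node % SKETCH_WORD_BITS)) & 1UL);
}

/* Nonzero when both masks contain exactly the same nodes. */
static int sketch_equal(const sketch_mask_t *a, const sketch_mask_t *b)
{
    return memcmp(a->word, b->word, sizeof a->word) == 0;
}
```

A program using the library itself would call the nodemask_* helpers on a nodemask_t instead. The sketch only shows why passing a node number above the supported maximum cannot be given a meaning.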
numa_preferred
returns the preferred node of the current thread. It is the node on which the kernel preferably
allocates memory, unless some other policy overrides this.
numa_set_interleave_mask
sets the memory interleave mask for the current thread to
nodemask.
All new memory allocations
are page interleaved over all nodes in the interleave mask. Interleaving
can be turned off again by passing a zero mask.
The page interleaving only occurs on the actual page fault that puts a new
page into the current address space. It is also only a hint: the kernel
will fall back to other nodes if no memory is available on the interleave
target. This is a low-level
function; it may be more convenient to use the higher-level functions like
numa_alloc_interleaved
or
numa_alloc_interleaved_subset.
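As a rough mental model of page interleaving, successive page faults can be pictured as cycling through the nodes set in the mask. The sketch below is a toy model only: sketch_interleave_node is a hypothetical helper, the mask is simplified to a single unsigned long, and the kernel's actual placement may differ, for instance when it falls back because the target node is full.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Toy model: return the node that the page_index-th faulted page would be
 * placed on when cycling over the set bits of mask, or -1 for an empty
 * mask (interleaving turned off).
 */
static int sketch_interleave_node(unsigned long mask, size_t page_index)
{
    int nbits = (int)(sizeof mask * CHAR_BIT);
    size_t count = 0;

    for (int i = 0; i < nbits; i++)
        if (mask & (1UL << i))
            count++;
    if (count == 0)
        return -1;

    /* Pick the k-th set bit, where k cycles with the page index. */
    size_t k = page_index % count;
    for (int i = 0; i < nbits; i++) {
        if (mask & (1UL << i)) {
            if (k == 0)
                return i;
            k--;
        }
    }
    return -1; /* not reached: count > 0 guarantees a match above */
}

/* Usage: four pages interleaved over nodes 0, 2 and 3. */
static void sketch_interleave_demo(void)
{
    unsigned long mask = (1UL << 0) | (1UL << 2) | (1UL << 3);

    assert(sketch_interleave_node(mask, 0) == 0);
    assert(sketch_interleave_node(mask, 1) == 2);
    assert(sketch_interleave_node(mask, 2) == 3);
    assert(sketch_interleave_node(mask, 3) == 0); /* wraps around */
}
```

The model also shows why interleaving only pays off for areas spanning many pages: a one-page area lands entirely on a single node.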
numa_get_interleave_mask
returns the current interleave mask.
numa_bind
binds the current thread and its children to the nodes
specified in
nodemask.
They will only run on the CPUs of the specified nodes and will only be able to allocate
memory from them.
This function is equivalent to calling
numa_run_on_node_mask
and
numa_set_membind
with the same argument.
numa_set_preferred
sets the preferred node for the current thread to
node.
The preferred node is the node from which memory is
preferably allocated before falling back to other nodes.
The default is to use the node the process is currently running on
(local policy). Passing -1 as the argument is equivalent to
numa_set_localalloc.
numa_set_localalloc
sets a local memory allocation policy for the current thread.
Memory is preferably allocated from the current node.
numa_set_membind
sets the memory allocation mask.
The thread will only allocate memory from the nodes set in
nodemask.
Passing an argument of
numa_no_nodes
or
numa_all_nodes
turns off memory binding to specific nodes.
numa_get_membind
returns the current node mask from which memory can be allocated.
numa_no_nodes
or
numa_all_nodes
means all nodes are available for memory allocation.
numa_alloc_interleaved
allocates
size
bytes of memory page interleaved on all nodes. This function is relatively slow
and should only be used for large areas consisting of multiple pages. The
interleaving works on page level and will only show an effect when the
area is large. It must be freed with
numa_free.
On errors NULL is returned.
numa_alloc_interleaved_subset
is like
numa_alloc_interleaved
except that it also accepts a mask of the nodes to interleave on.
On errors NULL is returned.
numa_alloc_onnode
allocates memory on a specific node. This function is relatively slow
and allocations are rounded to pagesize. The memory must be freed
with
numa_free.
On errors NULL is returned.
numa_alloc_local
allocates
size
bytes of memory on the local node. This function is relatively slow
and allocations are rounded to pagesize. The memory must be freed
with
numa_free.
On errors NULL is returned.
numa_alloc
allocates
size
bytes of memory with the current NUMA policy. This function is relatively slow
and allocations are rounded to pagesize. The memory must be freed
with
numa_free.
On errors NULL is returned.
numa_free
frees
size
bytes of memory starting at
start,
allocated by the
numa_alloc_*
functions above.
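Because these allocations are rounded to the page size, even a tiny request occupies at least one full page. The following sketch shows that arithmetic. The helper names are hypothetical, and it assumes the rounding is a plain round-up to a multiple of the system page size as reported by sysconf(3).

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Round size up to a whole number of pages of page_size bytes.
 * The sum can overflow for sizes close to SIZE_MAX; callers are
 * assumed to pass realistic sizes.
 */
static size_t sketch_round_to_pages(size_t size, size_t page_size)
{
    assert(page_size > 0);
    return (size + page_size - 1) / page_size * page_size;
}

/* Bytes that would back a request of size bytes on this system. */
static size_t sketch_backing_size(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);

    assert(page > 0);
    return sketch_round_to_pages(size, (size_t)page);
}
```

With 4096-byte pages, for example, a 100-byte request is backed by 4096 bytes, which is one reason these functions are better suited to large areas than to small objects.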
numa_run_on_node
runs the current thread and its children
on a specific node. They will not migrate to CPUs of
other nodes until the node affinity is reset with a new call to
numa_run_on_node_mask.
Passing
-1
permits scheduling on all nodes again.
Returns a negative value with the error in errno on failure, or 0 on success.
numa_run_on_node_mask
runs the current thread and its children only on nodes specified in
nodemask.
They will not migrate to CPUs of
other nodes until the node affinity is reset with a new call to
numa_run_on_node_mask.
Passing
numa_all_nodes
permits scheduling on all nodes again.
Returns a negative value with the error in errno on failure, or 0 on success.
numa_get_run_node_mask
returns the mask of nodes that the current thread is allowed to run on.
numa_interleave_memory
page-interleaves
size
bytes of memory starting at start across the nodes in
nodemask.
This is a lower-level function to interleave memory that is allocated but not yet
faulted in. Not yet faulted in means the memory was allocated using
mmap(2)
or
shmat(2),
but has not been accessed by the current process yet. The memory is page
interleaved to all nodes specified in
nodemask.
Normally
numa_alloc_interleaved
should be used for private memory instead, but this function is useful to
handle shared memory areas. To be useful the memory area should be
significantly larger than a page.
When the
numa_set_strict
flag is true, the operation will cause a numa_error if there were already
pages in the mapping that do not follow the policy.
numa_tonode_memory
puts memory on a specific node. The constraints described for
numa_interleave_memory
apply here too.
numa_tonodemask_memory
puts memory on a specific set of nodes. The constraints described for
numa_interleave_memory
apply here too.
numa_setlocal_memory
locates memory on the current node. The constraints described for
numa_interleave_memory
apply here too.
numa_police_memory
locates memory with the current NUMA policy. The constraints described for
numa_interleave_memory
apply here too.
numa_node_to_cpus
converts a node number to a bitmask of CPUs. The user must pass a long enough
buffer. When the buffer is not long enough,
errno
will be set to
ERANGE
and -1 returned. On success 0 is returned.
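Sizing and reading such a CPU bitmask can be sketched without calling the library. The helpers below are hypothetical and rest on two assumptions the text above does not state: that CPU number c is stored as bit c modulo the word size in word c divided by the word size, and that the caller knows an upper bound on the number of CPUs.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_LONG_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned long words needed to hold one bit per CPU. */
static size_t sketch_cpu_mask_words(int ncpus)
{
    assert(ncpus > 0);
    return ((size_t)ncpus + SKETCH_LONG_BITS - 1) / SKETCH_LONG_BITS;
}

/* 1 when the bit for cpu is set in a mask of the given number of words. */
static int sketch_cpu_isset(const unsigned long *mask, size_t words, int cpu)
{
    size_t bit = (size_t)cpu;

    if (cpu < 0 || bit / SKETCH_LONG_BITS >= words)
        return 0; /* outside the buffer: treat as not set */
    return (int)((mask[bit / SKETCH_LONG_BITS] >>
                  (bit % SKETCH_LONG_BITS)) & 1UL);
}

/* Count the CPUs numbered below ncpus whose bit is set. */
static int sketch_cpu_count(const unsigned long *mask, int ncpus)
{
    size_t words = sketch_cpu_mask_words(ncpus);
    int count = 0;

    for (int cpu = 0; cpu < ncpus; cpu++)
        count += sketch_cpu_isset(mask, words, cpu);
    return count;
}
```

A caller would allocate sketch_cpu_mask_words(ncpus) words, pass the buffer and its length to numa_node_to_cpus, and retry with a larger buffer if the call fails with ERANGE.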
numa_set_bind_policy
specifies whether calls that bind memory to a specific node should
use the preferred policy or a strict policy. The preferred policy allows
memory to be allocated on other nodes when there isn't enough free memory
on the target node. The strict policy fails the allocation in that case.
Setting the argument to 1 specifies strict, 0 preferred.
Note that specifying more than one node with the non-strict policy may only use
the first node in some kernel versions.
numa_set_strict
sets a flag that says whether the functions allocating on specific
nodes should use a strict policy. Strict means the allocation
will fail if the memory cannot be allocated on the target node.
Default operation is to fall back to other nodes.
This doesn't apply to interleave and default.
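The difference between strict and fall-back behaviour can be pictured with a toy model of per-node free memory. This is not how the kernel chooses a fallback node (it follows its own ordering). sketch_pick_node and the free_pages array are hypothetical and exist only to contrast the two outcomes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model: free_pages[i] is the number of free pages on node i.
 * Returns the node a one-page allocation aimed at target would land on,
 * or -1 when the allocation fails.
 */
static int sketch_pick_node(const long *free_pages, int nnodes,
                            int target, int strict)
{
    assert(free_pages != NULL);
    assert(nnodes > 0 && target >= 0 && target < nnodes);

    if (free_pages[target] > 0)
        return target; /* the target node has room */
    if (strict)
        return -1;     /* strict: fail rather than go elsewhere */

    /* Non-strict: fall back to any node with free memory (toy ordering). */
    for (int i = 0; i < nnodes; i++)
        if (free_pages[i] > 0)
            return i;
    return -1;         /* no memory anywhere */
}
```

With free_pages = {4, 0, 2} and target node 1, the strict variant fails while the non-strict variant succeeds on another node.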
numa_error
is a weak internal libnuma function that can be overridden by the
user program. It allows specifying a different error handling strategy
when a NUMA system call fails. It does not affect
numa_available.
The default action is to print an error to stderr, and to exit
the program when
numa_exit_on_error
is set to a nonzero value. The default is zero.
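An application-defined replacement might look like the sketch below. It uses the prototype from the SYNOPSIS and declares it locally so that the fragment compiles on its own; a real program would include numa.h and link with -lnuma. The failure counter and the demo function are hypothetical additions for this example, and the sketch assumes errno still describes the failed call when the handler runs.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical bookkeeping: number of NUMA failures reported so far. */
static int sketch_numa_failures;

/* Prototype as listed in the SYNOPSIS. */
void numa_error(char *where);

/*
 * Replacement handler: log the failure and keep running, leaving it to
 * the caller to notice the error return of the failed function.
 */
void numa_error(char *where)
{
    int saved_errno = errno; /* fprintf and strerror may change errno */

    sketch_numa_failures++;
    fprintf(stderr, "NUMA call failed in %s: %s\n",
            where ? where : "(unknown)", strerror(saved_errno));
    errno = saved_errno;
}

/* Usage: simulate a failed call and check that the handler returns. */
static void sketch_error_demo(void)
{
    int before = sketch_numa_failures;
    char where[] = "numa_alloc_onnode";

    errno = ENOMEM;
    numa_error(where);
    assert(sketch_numa_failures == before + 1);
    assert(errno == ENOMEM);
}
```

Since the library's own definition is weak, a definition like this in the program is the one that gets called. Whether the process then continues or exits is up to the handler.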
THREAD SAFETY
numa_set_bind_policy
and
numa_exit_on_error
are process global. The other calls are thread safe. Changing the memory policy
of a specific memory area affects the whole process and possibly other processes mapping
the same memory.
COPYRIGHT
Copyright 2002,2004 Andi Kleen, SuSE Labs.
libnuma is under the GNU Lesser General Public License, v2.1.
SEE ALSO
getpagesize(2)
mmap(2)
shmat(2)
numactl(8)