Annotation of src/lib/libc/stdlib/jemalloc.3, Revision 1.8

1.1       jruoho      1: .\" $NetBSD $
                      2: .\"
                      3: .\" Copyright (c) 1980, 1991, 1993
                      4: .\"    The Regents of the University of California.  All rights reserved.
                      5: .\"
                      6: .\" This code is derived from software contributed to Berkeley by
                      7: .\" the American National Standards Committee X3, on Information
                      8: .\" Processing Systems.
                      9: .\"
                     10: .\" Redistribution and use in source and binary forms, with or without
                     11: .\" modification, are permitted provided that the following conditions
                     12: .\" are met:
                     13: .\" 1. Redistributions of source code must retain the above copyright
                     14: .\"    notice, this list of conditions and the following disclaimer.
                     15: .\" 2. Redistributions in binary form must reproduce the above copyright
                     16: .\"    notice, this list of conditions and the following disclaimer in the
                     17: .\"    documentation and/or other materials provided with the distribution.
                     18: .\" 3. Neither the name of the University nor the names of its contributors
                     19: .\"    may be used to endorse or promote products derived from this software
                     20: .\"    without specific prior written permission.
                     21: .\"
                     22: .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
                     23: .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
                     24: .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
                     25: .\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
                     26: .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
                     27: .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
                     28: .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
                     29: .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
                     30: .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
                     31: .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
                     32: .\" SUCH DAMAGE.
                     33: .\"
                     34: .\"     @(#)malloc.3   8.1 (Berkeley) 6/4/93
                     35: .\" $FreeBSD: src/lib/libc/stdlib/malloc.3,v 1.73 2007/06/15 22:32:33 jasone Exp $
                     36: .\"
1.7       jruoho     37: .Dd June 21, 2011
1.6       njoly      38: .Dt JEMALLOC 3
1.1       jruoho     39: .Os
                     40: .Sh NAME
                     41: .Nm jemalloc
                     42: .Nd the default system allocator
1.2       jruoho     43: .Sh LIBRARY
                     44: .Lb libc
                     45: .Sh SYNOPSIS
1.3       jruoho     46: .Ft const char *
1.2       jruoho     47: .Va _malloc_options ;
1.1       jruoho     48: .Sh DESCRIPTION
                     49: The
                     50: .Nm
                     51: allocator is a general-purpose concurrent
                     52: .Xr malloc 3
                     53: implementation specifically designed to be scalable
                     54: on modern multi-processor systems.
                     55: It is the default user space system allocator in
                     56: .Nx .
1.5       jruoho     57: .Pp
1.1       jruoho     58: When the first call is made to one of the memory allocation
                     59: routines such as
                     60: .Fn malloc
                     61: or
                     62: .Fn realloc ,
                     63: various flags that affect the workings of the allocator are set or reset.
                     64: These are described below.
                     65: .Pp
                     66: The
                     67: .Dq name
                     68: of the file referenced by the symbolic link named
                     69: .Pa /etc/malloc.conf ,
                     70: the value of the environment variable
                     71: .Ev MALLOC_OPTIONS ,
                     72: and the string pointed to by the global variable
                     73: .Va _malloc_options
                     74: will be interpreted, in that order, character by character as flags.
                     75: .Pp
                     76: Most flags are single letters.
                     77: Uppercase letters indicate that the behavior is set, or on,
                     78: and lowercase letters mean that the behavior is not set, or off.
                     79: The following options are available.
                     80: .Bl -tag -width "A   " -offset 3n
                     81: .It Em A
                     82: All warnings (except for the warning about unknown
                     83: flags being set) become fatal.
                     84: The process will call
                     85: .Xr abort 3
                     86: in these cases.
                     87: .It Em H
                     88: Use
                     89: .Xr madvise 2
                     90: when pages within a chunk are no longer in use, but the chunk as a whole cannot
                     91: yet be deallocated.
                     92: This is primarily of use when swapping is a real possibility, due to the high
                     93: overhead of the
                     94: .Fn madvise
                     95: system call.
                     96: .It Em J
                     97: Each byte of new memory allocated by
                     98: .Fn malloc No or
                     99: .Fn realloc
                    100: will be initialized to 0xa5.
                    101: All memory returned by
                    102: .Fn free No or
                    103: .Fn realloc
                    104: will be initialized to 0x5a.
                    105: This is intended for debugging and will impact performance negatively.
                    106: .It Em K
                    107: Double (K) or halve (k) the virtual memory chunk size.
                    108: The default chunk size is 1 MB.
                    109: This option can be specified multiple times.
                    110: .It Em N
                    111: Double (N) or halve (n) the number of arenas.
                    112: The default number of arenas is four times the number of CPUs, or one if there
                    113: is a single CPU.
                    114: This option can be specified multiple times.
                    115: .It Em P
                    116: Various statistics are printed at program exit via an
                    117: .Xr atexit 3
                    118: function.
                    119: This has the potential to cause deadlock for a multi-threaded process that exits
                    120: while one or more threads are executing in the memory allocation functions.
                    121: Therefore, this option should only be used with care; it is primarily intended
                    122: as a performance tuning aid during application development.
                    123: .It Em Q
                    124: Double (Q) or halve (q) the size of the allocation quantum.
                    125: The default quantum is the minimum allowed by the architecture (typically 8 or
                    126: 16 bytes).
                    127: This option can be specified multiple times.
                    128: .It Em S
                    129: Double (S) or halve (s) the size of the maximum size class that is a multiple
                    130: of the quantum.
                    131: Above this size, power-of-two spacing is used for size classes.
                    132: The default value is 512 bytes.
                    133: This option can be specified multiple times.
                    134: .It Em U
                    135: Generate
                    136: .Dq utrace
                    137: entries for
                    138: .Xr ktrace 1 ,
                    139: for all operations.
                    140: Consult the source for details on this option.
                    141: .It Em V
                    142: Attempting to allocate zero bytes will return a
                    143: .Dv NULL
                    144: pointer instead of a valid pointer.
                    145: (The default behavior is to make a minimal allocation and return a
                    146: pointer to it.)
                    147: This option is provided for System V compatibility.
                    148: This option is incompatible with the
                    149: .Em X
                    150: option.
                    151: .It Em X
                    152: Rather than return failure for any allocation function,
                    153: display a diagnostic message on
                    154: .Dv stderr
                    155: and cause the program to drop
                    156: core (using
                    157: .Xr abort 3 ) .
                    158: This option should be set at compile time by including the following in
                    159: the source code:
                    160: .Bd -literal -offset indent
                    161: _malloc_options = "X";
                    162: .Ed
                    163: .Pp
                    164: .It Em Z
                    165: Each byte of new memory allocated by
                    166: .Fn malloc No or
                    167: .Fn realloc
                    168: will be initialized to 0.
                    169: Note that this initialization only happens once for each byte, so
                    170: .Fn realloc
                    171: does not zero memory that was previously allocated.
                    172: This is intended for debugging and will impact performance negatively.
                    173: .El
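.Pp
As a rough illustration of the character-by-character interpretation
described above, the following C sketch applies the
.Em K
flags found in an option string to a chunk size.
The helper name apply_chunk_flags and the assumed 4096-byte page size are
hypothetical and are not part of libc.
The sketch assumes that an uppercase letter doubles the value, that a
lowercase letter halves it, and that the chunk size must stay larger than
one page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of libc: walk an option string one
 * character at a time and apply every K/k flag to chunk_size.
 * All other characters are ignored by this sketch.
 */
size_t
apply_chunk_flags(size_t chunk_size, const char *opts)
{
	const size_t page_size = 4096;	/* assumed page size */

	assert(opts != NULL);
	for (; *opts != '\0'; opts++) {
		if (*opts == 'K') {
			/* Double, unless that would overflow size_t. */
			if (chunk_size <= SIZE_MAX / 2)
				chunk_size *= 2;
		} else if (*opts == 'k') {
			/* Halve, but keep the chunk larger than a page. */
			if (chunk_size / 2 > page_size)
				chunk_size /= 2;
		}
	}
	return chunk_size;
}
```

Calling the helper once per option source, in the order the sources are
read, models how a later source adjusts the result of an earlier one.
Starting from the 1 MB default, the string KK followed by the string k
yields 2 MB.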
                    174: .Pp
1.7       jruoho    175: Extra care should be taken when enabling
                    176: any of the options in production environments.
1.1       jruoho    177: The
1.7       jruoho    178: .Em A ,
                    179: .Em J ,
1.1       jruoho    180: and
                    181: .Em Z
                    182: options are intended for testing and debugging.
                    183: An application which changes its behavior when these options are used
                    184: is flawed.
                    185: .Sh IMPLEMENTATION NOTES
                    186: The
                    187: .Nm
                    188: allocator uses multiple arenas in order to reduce lock
                    189: contention for threaded programs on multi-processor systems.
                    190: This works well with regard to threading scalability, but incurs some costs.
                    191: There is a small fixed per-arena overhead, and additionally, arenas manage
                    192: memory completely independently of each other, which means a small fixed
                    193: increase in overall memory fragmentation.
                    194: These overheads are not generally an issue,
                    195: given the number of arenas normally used.
                    196: Note that using substantially more arenas than the default is not likely to
                    197: improve performance, mainly due to reduced cache performance.
                    198: However, it may make sense to reduce the number of arenas if an application
                    199: does not make much use of the allocation functions.
                    200: .Pp
                    201: Memory is conceptually broken into equal-sized chunks,
                    202: where the chunk size is a power of two that is greater than the page size.
                    203: Chunks are always aligned to multiples of the chunk size.
                    204: This alignment makes it possible to find
                    205: metadata for user objects very quickly.
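.Pp
The reason alignment helps can be sketched in C: because the chunk size is a
power of two and every chunk starts at a multiple of that size, masking off
the low bits of any address inside a chunk gives the start of the chunk.
The function names chunk_base and chunk_offset are hypothetical and only
illustrate the arithmetic; they are not the names used by the allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: start address of the chunk that contains addr.
 * chunk_size must be a power of two.
 */
uintptr_t
chunk_base(uintptr_t addr, size_t chunk_size)
{
	assert(chunk_size != 0 && (chunk_size & (chunk_size - 1)) == 0);
	return addr & ~((uintptr_t)chunk_size - 1);
}

/* Hypothetical helper: byte offset of addr within its chunk. */
size_t
chunk_offset(uintptr_t addr, size_t chunk_size)
{
	assert(chunk_size != 0 && (chunk_size & (chunk_size - 1)) == 0);
	return (size_t)(addr & ((uintptr_t)chunk_size - 1));
}
```

With the default 1 MB chunk size, the address 0x12345678 lies in the chunk
that starts at 0x12300000, at offset 0x45678.
Neither computation needs a search or a lock.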
                    206: .Pp
                    207: User objects are broken into three categories according to size:
                    208: .Bl -enum -offset 3n
                    209: .It
                    210: Small objects are smaller than one page.
                    211: .It
                    212: Large objects are smaller than the chunk size.
                    213: .It
                    214: Huge objects are a multiple of the chunk size.
                    215: .El
                    216: .Pp
                    217: Small and large objects are managed by arenas; huge objects are managed
                    218: separately in a single data structure that is shared by all threads.
                    219: Huge objects are used by applications infrequently enough that this single
                    220: data structure is not a scalability issue.
                    221: .Pp
                    222: Each chunk that is managed by an arena tracks its contents in a page map as
                    223: runs of contiguous pages (unused, backing a set of small objects, or backing
                    224: one large object).
                    225: The combination of chunk alignment and chunk page maps makes it possible to
                    226: determine all metadata regarding small and large allocations in constant time.
                    227: .Pp
                    228: Small objects are managed in groups by page runs.
                    229: Each run maintains a bitmap that tracks which regions are in use.
                    230: Allocation requests can be grouped as follows.
                    231: .Pp
                    232: .Bl -bullet -offset 3n
                    233: .It
                    234: Allocation requests that are no more than half the quantum (see the
                    235: .Em Q
                    236: option) are rounded up to the nearest power of two (typically 2, 4, or 8).
                    237: .It
                    238: Allocation requests that are more than half the quantum, but no more than the
                    239: maximum quantum-multiple size class (see the
                    240: .Em S
                    241: option) are rounded up to the nearest multiple of the quantum.
                    242: .It
                    243: Allocation requests that are larger than the maximum quantum-multiple size
                    244: class, but no larger than one half of a page, are rounded up to the nearest
                    245: power of two.
                    246: .It
                    247: Allocation requests that are larger than half of a page, but small enough to
                    248: fit in an arena-managed chunk (see the
                    249: .Em K
                    250: option), are rounded up to the nearest run size.
                    251: .It
                    252: Allocation requests that are too large to fit in an arena-managed chunk are
                    253: rounded up to the nearest multiple of the chunk size.
                    254: .El
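.Pp
The rounding rules listed above can be sketched as a single C function.
This is an illustration under stated assumptions and not the code of the
allocator: it uses a 16-byte quantum, the default 512-byte maximum
quantum-multiple size class, the default 1 MB chunk size, an assumed
4096-byte page, an assumed smallest class of 2 bytes, and it treats the run
size as a multiple of the page size while ignoring any per-chunk header.
The names size_class, pow2_ceil and round_up are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest power of two that is greater than or equal to x (x >= 1). */
static size_t
pow2_ceil(size_t x)
{
	size_t p = 1;

	while (p < x)
		p <<= 1;
	return p;
}

/* Round x up to the next multiple of m. */
static size_t
round_up(size_t x, size_t m)
{
	return (x + m - 1) / m * m;
}

/* Hypothetical helper: size actually reserved for a request of size bytes. */
size_t
size_class(size_t size)
{
	const size_t quantum = 16;		/* assumed quantum */
	const size_t small_max = 512;		/* default for the S option */
	const size_t page = 4096;		/* assumed page size */
	const size_t chunk = (size_t)1 << 20;	/* default chunk size */

	assert(size > 0);
	if (size <= quantum / 2) {		/* tiny: power of two, at least 2 */
		size_t c = pow2_ceil(size);
		return c < 2 ? 2 : c;
	}
	if (size <= small_max)			/* multiple of the quantum */
		return round_up(size, quantum);
	if (size <= page / 2)			/* power of two up to half a page */
		return pow2_ceil(size);
	if (round_up(size, page) < chunk)	/* assumed run size: whole pages */
		return round_up(size, page);
	return round_up(size, chunk);		/* multiple of the chunk size */
}
```

Under these assumptions a 100-byte request occupies 112 bytes, a 513-byte
request occupies 1024 bytes, and a 5000-byte request occupies 8192 bytes.
Very large requests near the maximum value of size_t would overflow in
round_up; the sketch does not guard against that.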
                    255: .Pp
                    256: Allocations are packed tightly together, which can be an issue for
                    257: multi-threaded applications.
                    258: If you need to ensure that allocations do not suffer from cache line sharing,
                    259: round your allocation requests up to the nearest multiple of the cache line
                    260: size.
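.Pp
A minimal C sketch of that rounding follows.
The 64-byte cache line size and the names CACHE_LINE_SIZE,
round_to_cache_line and malloc_padded are assumptions made for the example;
the real line size depends on the processor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_LINE_SIZE 64	/* assumed; must be a power of two */

/* Hypothetical helper: round size up to a multiple of the cache line size. */
size_t
round_to_cache_line(size_t size)
{
	assert((CACHE_LINE_SIZE & (CACHE_LINE_SIZE - 1)) == 0);
	return (size + CACHE_LINE_SIZE - 1) & ~((size_t)CACHE_LINE_SIZE - 1);
}

/* Hypothetical wrapper: allocate size bytes padded to whole cache lines. */
void *
malloc_padded(size_t size)
{
	return malloc(round_to_cache_line(size));
}
```

A 10-byte request made through malloc_padded asks the allocator for 64
bytes, and a 65-byte request asks for 128.
The result is released with
.Fn free
as usual.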
                    261: .Sh DEBUGGING
                    262: The first thing to do is to set the
                    263: .Em A
                    264: option.
                    265: This option forces a coredump (if possible) at the first sign of trouble,
                    266: rather than the normal policy of trying to continue if at all possible.
                    267: .Pp
                    268: It is probably also a good idea to recompile the program with suitable
                    269: options and symbols for debugger support.
                    270: .Pp
                    271: If the program starts to give unusual results, dump core, or generally behave
                    272: differently without emitting any of the messages mentioned in the next
                    273: section, it is likely because it depends on the storage being filled with
                    274: zero bytes.
                    275: Try running it with the
                    276: .Em Z
                    277: option set;
                    278: if that improves the situation, this diagnosis has been confirmed.
                    279: If the program still misbehaves,
                    280: the likely problem is accessing memory outside the allocated area.
                    281: .Pp
                    282: Alternatively, if the symptoms are not easy to reproduce, setting the
                    283: .Em J
                    284: option may help provoke the problem.
                    285: In truly difficult cases, the
                    286: .Em U
                    287: option, if supported by the kernel, can provide a detailed trace of
                    288: all calls made to these functions.
                    289: .Pp
                    290: Unfortunately,
                    291: .Nm
                    292: does not provide much detail about the problems it detects;
                    293: the performance impact for storing such information would be prohibitive.
                    294: There are a number of allocator implementations available on the Internet
                    295: which focus on detecting and pinpointing problems by trading performance for
                    296: extra sanity checks and detailed diagnostics.
1.8     ! wiz       297: .Sh ENVIRONMENT
        !           298: The following environment variables affect the execution of the allocation
        !           299: functions:
        !           300: .Bl -tag -width ".Ev MALLOC_OPTIONS"
        !           301: .It Ev MALLOC_OPTIONS
        !           302: If the environment variable
        !           303: .Ev MALLOC_OPTIONS
        !           304: is set, the characters it contains will be interpreted as flags to the
        !           305: allocation functions.
        !           306: .El
        !           307: .Sh EXAMPLES
        !           308: To dump core whenever a problem occurs:
        !           309: .Pp
        !           310: .Bd -literal -offset indent
        !           311: ln -s 'A' /etc/malloc.conf
        !           312: .Ed
        !           313: .Pp
        !           314: To specify in the source that a program does no return value checking
        !           315: on calls to these functions:
        !           316: .Bd -literal -offset indent
        !           317: _malloc_options = "X";
        !           318: .Ed
1.5       jruoho    319: .Sh DIAGNOSTICS
1.1       jruoho    320: If any of the memory allocation/deallocation functions detect an error or
                    321: warning condition, a message will be printed to file descriptor
                    322: .Dv STDERR_FILENO .
                    323: Errors will result in the process dumping core.
                    324: If the
                    325: .Em A
                    326: option is set, all warnings are treated as errors.
                    327: .Pp
1.3       jruoho    328: .\"
                    329: .\" XXX: The _malloc_message should be documented
                    330: .\"     better in order to be worth mentioning.
                    331: .\"
1.1       jruoho    332: The
                    333: .Va _malloc_message
                    334: variable allows the programmer to override the function which emits
                    335: the text strings forming the errors and warnings if for some reason
                    336: the
                    337: .Dv STDERR_FILENO
                    338: file descriptor is not suitable for this.
                    339: Please note that doing anything which tries to allocate memory in
                    340: this function is likely to result in a crash or deadlock.
                    341: .Pp
                    342: All messages are prefixed by
                    343: .Dq Ao Ar progname Ac Ns Li \&: Pq malloc .
                    344: .Sh SEE ALSO
                    345: .Xr emalloc 3 ,
                    346: .Xr malloc 3 ,
                    347: .Xr memory 3 ,
                    348: .Xr memoryallocators 9
                    349: .\"
                    350: .\" XXX: Add more references that could be worth reading.
                    351: .\"
                    352: .Rs
                    353: .%A Jason Evans
                    354: .%T "A Scalable Concurrent malloc(3) Implementation for FreeBSD"
                    355: .%D April 16, 2006
                    356: .%O BSDCan 2006
                    357: .%U http://people.freebsd.org/~jasone/jemalloc/bsdcan2006/jemalloc.pdf
                    358: .Re
                    359: .Rs
                    360: .%A Poul-Henning Kamp
                    361: .%T "Malloc(3) revisited"
                    362: .%I USENIX Association
                    363: .%B Proceedings of the FREENIX Track: 1998 USENIX Annual Technical Conference
                    364: .%D June 15-19, 1998
1.4       wiz       365: .%U http://www.usenix.org/publications/library/proceedings/usenix98/freenix/kamp.pdf
1.1       jruoho    366: .Re
                    367: .Rs
                    368: .%A Paul R. Wilson
                    369: .%A Mark S. Johnstone
                    370: .%A Michael Neely
                    371: .%A David Boles
                    372: .%T "Dynamic Storage Allocation: A Survey and Critical Review"
                    373: .%D 1995
                    374: .%I University of Texas at Austin
                    375: .%U ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps
                    376: .Re
                    377: .Sh HISTORY
                    378: The
                    379: .Nm
                    380: allocator became the default system allocator first in
                    381: .Fx 7.0
                    382: and then in
                    383: .Nx 5.0 .
                    384: In both systems it replaced the older so-called
                    385: .Dq phkmalloc
                    386: implementation.
                    387: .Sh AUTHORS
                    388: .An Jason Evans Aq jasone@canonware.com
