This file describes some special Python build types enabled via compile-time
preprocessor defines.

IMPORTANT: if you want to build a debug-enabled Python, it is recommended that
you use ``./configure --with-pydebug``, rather than the options listed here.

However, if you wish to define some of these options individually, it is best
to define them in the EXTRA_CFLAGS make variable, for example:
``make EXTRA_CFLAGS="-DPy_REF_DEBUG"``.


Py_REF_DEBUG
------------

Turn on aggregate reference counting.  This arranges for the extern variable
_Py_RefTotal to hold a count of all references, the sum of ob_refcnt across all
objects.  In a debug-mode build, this is where the "8288" comes from in:

    >>> 23
    23
    [8288 refs]
    >>>

Note that if this count increases when you're not storing away new objects,
there's probably a leak.  Remember, though, that in interactive mode the special
name "_" holds a reference to the last result displayed!

Py_REF_DEBUG also checks after every decref to verify that the refcount hasn't
gone negative, and causes an immediate fatal error if it has.

Special gimmicks:

sys.gettotalrefcount()
    Return current total of all refcounts.
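
The leak check described above can be sketched as follows.  This is only an
illustration: the helper name refs_leaked_per_call and its warmup/repeats
parameters are made up for this example, and the function returns None on an
interpreter that was not built with Py_REF_DEBUG, because sys.gettotalrefcount()
does not exist there.

```python
import sys


def refs_leaked_per_call(func, warmup=3, repeats=5):
    """Estimate how much the total refcount grows per call to func.

    Hypothetical helper: returns None when sys.gettotalrefcount() is
    missing, i.e. when the interpreter was not built with Py_REF_DEBUG.
    """
    gettotal = getattr(sys, "gettotalrefcount", None)
    if gettotal is None:
        return None
    # Let caches and lazily created objects settle before measuring.
    for _ in range(warmup):
        func()
    before = gettotal()
    for _ in range(repeats):
        func()
    after = gettotal()
    return (after - before) / repeats


delta = refs_leaked_per_call(lambda: [1, 2, 3])
print("not a Py_REF_DEBUG build" if delta is None else delta)
```

A result that stays clearly above zero as repeats grows suggests a leak; small
nonzero values can be noise from the measurement itself.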


Py_TRACE_REFS
-------------

Turn on heavy reference debugging.  This is major surgery.  Every PyObject grows
two more pointers, to maintain a doubly-linked list of all live heap-allocated
objects.  Most built-in type objects are not in this list, as they're statically
allocated.  Starting in Python 2.3, if COUNT_ALLOCS (see below) is also defined,
a static type object T does appear in this list if at least one object of type T
has been created.

Note that because the fundamental PyObject layout changes, Python modules
compiled with Py_TRACE_REFS are incompatible with modules compiled without it.

Py_TRACE_REFS implies Py_REF_DEBUG.

Special gimmicks:

sys.getobjects(max[, type])
    Return list of the (no more than) max most-recently allocated objects, most
    recently allocated first in the list, least-recently allocated last in the
    list.  max=0 means no limit on list length.  If an optional type object is
    passed, the list is also restricted to objects of that type.  The return
    list itself, and some temp objects created just to call sys.getobjects(),
    are excluded from the return list.  Note that the list returned is just
    another object, though, so may appear in the return list the next time you
    call getobjects(); note that every object in the list is kept alive too,
    simply by virtue of being in the list.

envvar PYTHONDUMPREFS
    If this envvar exists, Py_Finalize() arranges to print a list of all
    still-live heap objects.  This is printed twice, in different formats,
    before and after Py_Finalize has cleaned up everything it can clean up.  The
    first output block produces the repr() of each object so is more
    informative; however, a lot of stuff destined to die is still alive then.
    The second output block is much harder to work with (repr() can't be invoked
    anymore -- the interpreter has been torn down too far), but doesn't list any
    objects that will die.  The tool script combinerefs.py can be run over this
    to combine the info from both output blocks.  The second output block, and
    combinerefs.py, were new in Python 2.3b1.
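
As a rough illustration of sys.getobjects(), the sketch below tallies live heap
objects by type name.  The helper name live_object_census is an assumption made
for this example, and the function returns None on interpreters built without
Py_TRACE_REFS, where sys.getobjects() is absent.

```python
import sys
from collections import Counter


def live_object_census(limit=0, top=10):
    """Count live objects per type name using sys.getobjects().

    Hypothetical helper: limit=0 asks for all objects, as described above.
    Returns None when the build lacks Py_TRACE_REFS.
    """
    getobjects = getattr(sys, "getobjects", None)
    if getobjects is None:
        return None
    # The returned list keeps every listed object alive until it is dropped.
    objects = getobjects(limit)
    census = Counter(type(obj).__name__ for obj in objects)
    del objects
    return census.most_common(top)


print(live_object_census())
```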


PYMALLOC_DEBUG
--------------

When pymalloc is enabled (WITH_PYMALLOC is defined), calls to the PyObject_
memory routines are handled by Python's own small-object allocator, while calls
to the PyMem_ memory routines are directed to the system malloc/realloc/free.
If PYMALLOC_DEBUG is also defined, calls to both PyObject_ and PyMem_ memory
routines are directed to a special debugging mode of Python's small-object
allocator.

This mode fills dynamically allocated memory blocks with special, recognizable
bit patterns, and adds debugging info on each end of dynamically allocated
memory blocks.  The special bit patterns are:

#define CLEANBYTE     0xCB   /* clean (newly allocated) memory */
#define DEADBYTE      0xDB   /* dead (newly freed) memory */
#define FORBIDDENBYTE 0xFB   /* forbidden -- untouchable bytes */

Strings of these bytes are unlikely to be valid addresses, floats, or 7-bit
ASCII strings.

Let S = sizeof(size_t). 2*S bytes are added at each end of each block of N bytes
requested.  The memory layout is like so, where p represents the address
returned by a malloc-like or realloc-like function (p[i:j] means the slice of
bytes from *(p+i) inclusive up to *(p+j) exclusive; note that the treatment of
negative indices differs from a Python slice):

p[-2*S:-S]
    Number of bytes originally asked for.  This is a size_t, big-endian (easier
    to read in a memory dump).
p[-S:0]
    Copies of FORBIDDENBYTE.  Used to catch under-writes and under-reads.
p[0:N]
    The requested memory, filled with copies of CLEANBYTE, used to catch
    reference to uninitialized memory.  When a realloc-like function is called
    requesting a larger memory block, the new excess bytes are also filled with
    CLEANBYTE.  When a free-like function is called, these are overwritten with
    DEADBYTE, to catch reference to freed memory.  When a realloc-like function
    is called requesting a smaller memory block, the excess old bytes are also
    filled with DEADBYTE.
p[N:N+S]
    Copies of FORBIDDENBYTE.  Used to catch over-writes and over-reads.
p[N+S:N+2*S]
    A serial number, incremented by 1 on each call to a malloc-like or
    realloc-like function.  Big-endian size_t.  If "bad memory" is detected
    later, the serial number gives an excellent way to set a breakpoint on the
    next run, to capture the instant at which this block was passed out.  The
    static function bumpserialno() in obmalloc.c is the only place the serial
    number is incremented, and exists so you can set such a breakpoint easily.
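
The layout above can be modelled in a few lines of Python.  This is a
simulation for illustration only, not the code in obmalloc.c: the function
names make_debug_block and pads_intact are invented here, and S is taken from
ctypes on the machine running the sketch.

```python
import ctypes

CLEANBYTE = 0xCB
FORBIDDENBYTE = 0xFB
S = ctypes.sizeof(ctypes.c_size_t)  # plays the role of sizeof(size_t)


def make_debug_block(nbytes, serial):
    """Build a simulated debug block; return (raw, p).

    raw is the whole padded block and p is the offset that a malloc-like
    function would hand back to the caller.
    """
    raw = bytearray()
    raw += nbytes.to_bytes(S, "big")       # p[-2*S:-S]  requested size
    raw += bytes([FORBIDDENBYTE]) * S      # p[-S:0]     leading guard
    raw += bytes([CLEANBYTE]) * nbytes     # p[0:N]      user memory
    raw += bytes([FORBIDDENBYTE]) * S      # p[N:N+S]    trailing guard
    raw += serial.to_bytes(S, "big")       # p[N+S:N+2*S] serial number
    return raw, 2 * S


def pads_intact(raw, p):
    """Check both guard regions, as a realloc-like or free-like call would."""
    nbytes = int.from_bytes(raw[p - 2 * S:p - S], "big")
    guard = bytes([FORBIDDENBYTE]) * S
    return raw[p - S:p] == guard and raw[p + nbytes:p + nbytes + S] == guard


raw, p = make_debug_block(5, serial=7)
print(len(raw) == 5 + 4 * S, pads_intact(raw, p))
raw[p + 5] = 0x00  # simulate a one-byte overwrite past the end
print(pads_intact(raw, p))
```

Writing one byte past the requested N bytes lands in the trailing guard, which
is exactly the kind of damage the FORBIDDENBYTE check is meant to catch.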

A realloc-like or free-like function first checks that the FORBIDDENBYTEs at
each end are intact.  If they've been altered, diagnostic output is written to
stderr, and the program is aborted via Py_FatalError().  The other main failure
mode is provoking a memory error when a program reads one of the special bit
patterns and tries to use it as an address.  If you then get into a debugger and
look at the object, you're likely to see that it's entirely filled with 0xDB
(meaning freed memory is getting used) or 0xCB (meaning uninitialized memory is
getting used).
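
When inspecting a memory dump by hand, a small helper can make the dominant
fill pattern obvious.  The sketch below is an assumption-laden convenience, not
part of Python: the name classify_fill and the 90% threshold are arbitrary
choices for this example.

```python
CLEANBYTE = 0xCB
DEADBYTE = 0xDB
FORBIDDENBYTE = 0xFB

_LABELS = {
    CLEANBYTE: "uninitialized memory (0xCB)",
    DEADBYTE: "freed memory (0xDB)",
    FORBIDDENBYTE: "guard bytes (0xFB)",
}


def classify_fill(data, threshold=0.9):
    """Name the debug pattern that dominates data, if any.

    data is a bytes-like dump of the suspicious object; threshold is the
    (arbitrary) fraction of bytes that must match one pattern.
    """
    data = bytes(data)
    if not data:
        return "empty"
    for byte, label in _LABELS.items():
        if data.count(byte) / len(data) >= threshold:
            return label
    return "no dominant debug pattern"


print(classify_fill(bytes([DEADBYTE]) * 16))
print(classify_fill(b"hello world"))
```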

Note that PYMALLOC_DEBUG requires WITH_PYMALLOC.

Special gimmicks:

envvar PYTHONMALLOCSTATS
    If this envvar exists, a report of pymalloc summary statistics is printed to
    stderr whenever a new arena is allocated, and also by Py_Finalize().

Changed in 2.5:  The number of extra bytes allocated is 4*sizeof(size_t).
Before it was 16 on all boxes, reflecting that Python couldn't make use of
allocations >= 2**32 bytes even on 64-bit boxes before 2.5.


Py_DEBUG
--------

This is what is generally meant by "a debug build" of Python.

Py_DEBUG implies LLTRACE, Py_REF_DEBUG, Py_TRACE_REFS, and PYMALLOC_DEBUG (if
WITH_PYMALLOC is enabled).  In addition, C assert()s are enabled (via the C way:
by not defining NDEBUG), and some routines do additional sanity checks inside
"#ifdef Py_DEBUG" blocks.


COUNT_ALLOCS
------------

Each type object grows three new members:

    /* Number of times an object of this type was allocated. */
    int tp_allocs;

    /* Number of times an object of this type was deallocated. */
    int tp_frees;

    /* Highwater mark:  the maximum value of tp_allocs - tp_frees so
     * far; or, IOW, the largest number of objects of this type alive at
     * the same time.
     */
    int tp_maxalloc;

Allocation and deallocation code keeps these counts up to date.  Py_Finalize()
displays a summary of the info returned by sys.getcounts() (see below), along
with assorted other special allocation counts (like the number of tuple
allocations satisfied by a tuple free-list, the number of 1-character strings
allocated, etc.).

Before Python 2.2, type objects were immortal, and the COUNT_ALLOCS
implementation relies on that.  As of Python 2.2, heap-allocated type/class
objects can go away.  COUNT_ALLOCS can blow up in 2.2 and 2.2.1 because of this;
this was fixed in 2.2.2.  Use of COUNT_ALLOCS makes all heap-allocated type
objects immortal, except for those for which no object of that type is ever
allocated.

Starting with Python 2.3, if Py_TRACE_REFS is also defined, COUNT_ALLOCS
ensures that the type object for each allocated object appears in the
doubly-linked list of all objects maintained by Py_TRACE_REFS.

Special gimmicks:

sys.getcounts()
    Return a list of 4-tuples, one entry for each type object for which at least
    one object of that type was allocated.  Each tuple is of the form:

        (tp_name, tp_allocs, tp_frees, tp_maxalloc)

    Each distinct type object gets a distinct entry in this list, even if two or
    more type objects have the same tp_name (in which case there's no way to
    distinguish them by looking at this list).  The list is ordered by time of
    first object allocation: the type object for which the first allocation of
    an object of that type occurred most recently is at the front of the list.
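
The 4-tuples are easy to post-process.  The sketch below derives the number of
currently live objects per type (tp_allocs - tp_frees) from a list in the
format described above.  The helper name live_counts and the sample numbers are
invented for illustration; on a COUNT_ALLOCS build the list would come from
sys.getcounts() instead.

```python
def live_counts(counts):
    """Turn (tp_name, tp_allocs, tp_frees, tp_maxalloc) tuples into
    (tp_name, live, tp_maxalloc) rows, most live objects first.
    """
    rows = [
        (name, allocs - frees, maxalloc)
        for name, allocs, frees, maxalloc in counts
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Hypothetical data in the shape returned by sys.getcounts().
sample = [
    ("tuple", 120, 100, 45),
    ("dict", 30, 5, 25),
    ("str", 500, 498, 60),
]

for name, live, highwater in live_counts(sample):
    print(f"{name:10} live={live:5} highwater={highwater:5}")
```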


LLTRACE
-------

Compile in support for Low Level TRACE-ing of the main interpreter loop.

When this preprocessor symbol is defined, before PyEval_EvalFrame (eval_frame in
2.3 and 2.2, eval_code2 before that) executes a frame's code it checks the
frame's global namespace for a variable "__lltrace__".  If such a variable is
found, mounds of information about what the interpreter is doing are sprayed to
stdout, such as every opcode and opcode argument and values pushed onto and
popped off the value stack.

Not useful very often, but very useful when needed.
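
A minimal way to switch tracing on around one call might look like the sketch
below.  It assumes an interpreter compiled with LLTRACE; on any other build
the "__lltrace__" global is simply ignored and the code runs silently.  The
function traced_add is a placeholder for whatever code needs inspecting.

```python
def traced_add(a, b):
    # Placeholder workload; any function defined in this module will do.
    return a + b


# The interpreter looks for "__lltrace__" in the frame's global namespace,
# so the variable has to live in the globals of the code being traced.
globals()["__lltrace__"] = 1
try:
    result = traced_add(2, 3)
finally:
    # Remove the marker again so later frames are not traced.
    del globals()["__lltrace__"]

print(result)
```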


CALL_PROFILE
------------

Count the number of function calls executed.

When this symbol is defined, the ceval mainloop and helper functions count the
number of function calls made.  It keeps detailed statistics about what kind of
object was called and whether the call hit any of the special fast paths in the
code.


WITH_TSC
--------

Super-lowlevel profiling of the interpreter.  When enabled, the sys module grows
a new function:

settscdump(bool)
    If true, tell the Python interpreter to dump VM measurements to stderr.  If
    false, turn off dump.  The measurements are based on the processor's
    time-stamp counter.

This build option requires a small amount of platform specific code.  Currently
this code is present for linux/x86 and any PowerPC platform that uses GCC
(i.e. OS X and linux/ppc).

On the PowerPC the rate at which the time base register is incremented is not
defined by the architecture specification, so you'll need to find the manual for
your specific processor.  For the 750CX, 750CXe and 750FX (all sold as the G3)
we find:

    The time base counter is clocked at a frequency that is one-fourth that of
    the bus clock.
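
Given that ratio, converting a time base delta into seconds is simple
arithmetic.  The sketch below assumes the one-fourth ratio quoted above; the
function name and the 100 MHz bus clock in the usage line are illustrative
values only, so check the manual for the processor actually in use.

```python
def timebase_ticks_to_seconds(ticks, bus_clock_hz, divider=4):
    """Convert time base ticks to seconds.

    divider=4 reflects the "one-fourth of the bus clock" rule quoted for
    the 750CX/750CXe/750FX; other processors may use a different ratio.
    """
    timebase_hz = bus_clock_hz / divider
    return ticks / timebase_hz


# Hypothetical 100 MHz bus: the time base then ticks at 25 MHz.
print(timebase_ticks_to_seconds(25_000_000, 100_000_000))
```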

This build is enabled by the --with-tsc flag to configure.