Kaydet (Commit) f6b8045c authored tarafından Tim Peters's avatar Tim Peters

Reworked has_finalizer() to use the new _PyObject_Lookup() instead

of PyObject_HasAttr(); the former promises never to execute
arbitrary Python code.  Undid many of the changes recently made to
worm around the worst consequences of that PyObject_HasAttr() could
execute arbitrary Python code.

Compatibility is hard to discuss, because the dangerous cases are
so perverse, and much of this appears to rely on implementation
accidents.

To start with, using hasattr() to check for __del__ wasn't only
dangerous, in some cases it was wrong:  if an instance of an old-
style class didn't have "__del__" in its instance dict or in any
base class dict, but a getattr hook said __del__ existed, then
hasattr() said "yes, this object has a __del__".  But
instance_dealloc() ignores the possibility of getattr hooks when
looking for a __del__, so while object.__del__ succeeds, no
__del__ method is called when the object is deleted.  gc was
therefore incorrect in believing that the object had a finalizer.

The new method doesn't suffer that problem (like instance_dealloc(),
_PyObject_Lookup() doesn't believe __del__ exists in that case), but
does suffer a somewhat opposite-- and even more obscure --oddity:
if an instance of an old-style class doesn't have "__del__" in its
instance dict, and a base class does have "__del__" in its dict,
and the first base class with a "__del__" associates it with a
descriptor (an object with a __get__ method), *and* if that
descriptor raises an exception when __get__ is called, then
(a) the current method believes the instance does have a __del__,
but (b) hasattr() does not believe the instance has a __del__.

While these disagree, I believe the new method is "more correct":
because the descriptor *will* be called when the object is
destructed, it can execute arbitrary Python code at the time the
object is destructed, and that's really what gc means by "has a
finalizer":  not specifically a __del__ method, but more generally
the possibility of executing arbitrary Python code at object
destruction time.  Code in a descriptor's __get__() executed at
destruction time can be just as problematic as code in a
__del__() executed then.

So I believe the new method is better on all counts.

Bugfix candidate, but it's unclear to me how all this differs in
the 2.2 branch (e.g., new-style and old-style classes already
took different gc paths in 2.3 before this last round of patches,
but don't in the 2.2 branch).
üst df875b99
...@@ -272,8 +272,9 @@ def test_boom(): ...@@ -272,8 +272,9 @@ def test_boom():
# the internal "attr" attributes as a side effect. That causes the # the internal "attr" attributes as a side effect. That causes the
# trash cycle to get reclaimed via refcounts falling to 0, thus mutating # trash cycle to get reclaimed via refcounts falling to 0, thus mutating
# the trash graph as a side effect of merely asking whether __del__ # the trash graph as a side effect of merely asking whether __del__
# exists. This used to (before 2.3b1) crash Python. # exists. This used to (before 2.3b1) crash Python. Now __getattr__
expect(gc.collect(), 0, "boom") # isn't called.
expect(gc.collect(), 4, "boom")
expect(len(gc.garbage), garbagelen, "boom") expect(len(gc.garbage), garbagelen, "boom")
class Boom2: class Boom2:
......
...@@ -12,6 +12,14 @@ What's New in Python 2.3 beta 1? ...@@ -12,6 +12,14 @@ What's New in Python 2.3 beta 1?
Core and builtins Core and builtins
----------------- -----------------
- Some horridly obscure problems were fixed involving interaction
between garbage collection and old-style classes with "ambitious"
getattr hooks. If an old-style instance didn't have a __del__ method,
but did have a __getattr__ hook, and the instance became reachable
only from an unreachable cycle, and the hook resurrected or deleted
unreachable objects when asked to resolve "__del__", anything up to
a segfault could happen. That's been repaired.
- dict.pop now takes an optional argument specifying a default - dict.pop now takes an optional argument specifying a default
value to return if the key is not in the dict. If a default is not value to return if the key is not in the dict. If a default is not
given and the key is not found, a KeyError will still be raised. given and the key is not found, a KeyError will still be raised.
...@@ -77,7 +85,7 @@ Library ...@@ -77,7 +85,7 @@ Library
return 'not a == b' rather than 'a != b'. This gives the desired return 'not a == b' rather than 'a != b'. This gives the desired
result for classes that define __eq__ without defining __ne__. result for classes that define __eq__ without defining __ne__.
- sgmllib now supports SGML marked sections, in particular the - sgmllib now supports SGML marked sections, in particular the
MS Office extensions. MS Office extensions.
- The urllib module now offers support for the iterator protocol. - The urllib module now offers support for the iterator protocol.
...@@ -154,10 +162,10 @@ Mac ...@@ -154,10 +162,10 @@ Mac
- EasyDialogs dialogs are now movable-modal, and if the application is - EasyDialogs dialogs are now movable-modal, and if the application is
currently in the background they will ask to be moved to the foreground currently in the background they will ask to be moved to the foreground
before displaying. before displaying.
- OSA Scripting support has improved a lot, and gensuitemodule.py can now - OSA Scripting support has improved a lot, and gensuitemodule.py can now
be used by mere mortals. be used by mere mortals.
- The IDE (in a framework build) now includes introductory documentation - The IDE (in a framework build) now includes introductory documentation
in Apple Help Viewer format. in Apple Help Viewer format.
......
...@@ -359,19 +359,17 @@ move_unreachable(PyGC_Head *young, PyGC_Head *unreachable) ...@@ -359,19 +359,17 @@ move_unreachable(PyGC_Head *young, PyGC_Head *unreachable)
/* Return true if object has a finalization method. /* Return true if object has a finalization method.
* CAUTION: An instance of an old-style class has to be checked for a * CAUTION: An instance of an old-style class has to be checked for a
*__del__ method, and that can cause arbitrary Python code to get executed *__del__ method, and earlier versions of this used to call PyObject_HasAttr,
* via the class's __getattr__ hook (if any). This function can therefore * which in turn could call the class's __getattr__ hook (if any). That
* mutate the object graph, and that's been the source of subtle bugs. * could invoke arbitrary Python code, mutating the object graph in arbitrary
* ways, and that was the source of some excruciatingly subtle bugs.
*/ */
static int static int
has_finalizer(PyObject *op) has_finalizer(PyObject *op)
{ {
if (PyInstance_Check(op)) { if (PyInstance_Check(op)) {
/* This is the dangerous path: hasattr can invoke
* the class __getattr__(), and that can do anything.
*/
assert(delstr != NULL); assert(delstr != NULL);
return PyObject_HasAttr(op, delstr); return _PyInstance_Lookup(op, delstr) != NULL;
} }
else if (PyType_HasFeature(op->ob_type, Py_TPFLAGS_HEAPTYPE)) else if (PyType_HasFeature(op->ob_type, Py_TPFLAGS_HEAPTYPE))
return op->ob_type->tp_del != NULL; return op->ob_type->tp_del != NULL;
...@@ -379,38 +377,28 @@ has_finalizer(PyObject *op) ...@@ -379,38 +377,28 @@ has_finalizer(PyObject *op)
return 0; return 0;
} }
/* Move all objects out of unreachable, into collectable or finalizers. /* Move the objects in unreachable with __del__ methods into finalizers.
* It's possible that some objects will get collected (via refcount falling * The objects remaining in unreachable do not have __del__ methods, and
* to 0), or resurrected, as a side effect of checking for __del__ methods. * gc_refs remains GC_TENTATIVELY_UNREACHABLE for them. The objects
* After, finalizers contains all the objects from unreachable that haven't * moved into finalizers have gc_refs changed to GC_REACHABLE.
* been collected by magic, and that have a finalizer. gc_refs is
* GC_REACHABLE for all of those. collectable contains all the remaining
* objects from unreachable, and gc_refs remains GC_TENTATIVELY_UNREACHABLE
* for those (we're still not sure they're reclaimable after this! Some
* may yet by reachable *from* the objects in finalizers).
*/ */
static void static void
move_finalizers(PyGC_Head *unreachable, PyGC_Head *collectable, move_finalizers(PyGC_Head *unreachable, PyGC_Head *finalizers)
PyGC_Head *finalizers)
{ {
while (!gc_list_is_empty(unreachable)) { PyGC_Head *gc = unreachable->gc.gc_next;
PyGC_Head *gc = unreachable->gc.gc_next;
while (gc != unreachable) {
PyObject *op = FROM_GC(gc); PyObject *op = FROM_GC(gc);
int finalizer; PyGC_Head *next = gc->gc.gc_next;
assert(IS_TENTATIVELY_UNREACHABLE(op)); assert(IS_TENTATIVELY_UNREACHABLE(op));
finalizer = has_finalizer(op); if (has_finalizer(op)) {
if (unreachable->gc.gc_next == gc) {
gc_list_remove(gc); gc_list_remove(gc);
if (finalizer) { gc_list_append(gc, finalizers);
gc_list_append(gc, finalizers); gc->gc.gc_refs = GC_REACHABLE;
gc->gc.gc_refs = GC_REACHABLE;
}
else
gc_list_append(gc, collectable);
} }
/* else has_finalizer() deleted op via side effect */ gc = next;
} }
} }
...@@ -430,11 +418,10 @@ visit_move(PyObject *op, PyGC_Head *tolist) ...@@ -430,11 +418,10 @@ visit_move(PyObject *op, PyGC_Head *tolist)
} }
/* Move objects that are reachable from finalizers, from the unreachable set /* Move objects that are reachable from finalizers, from the unreachable set
* into the reachable_from_finalizers set. * into finalizers set.
*/ */
static void static void
move_finalizer_reachable(PyGC_Head *finalizers, move_finalizer_reachable(PyGC_Head *finalizers)
PyGC_Head *reachable_from_finalizers)
{ {
traverseproc traverse; traverseproc traverse;
PyGC_Head *gc = finalizers->gc.gc_next; PyGC_Head *gc = finalizers->gc.gc_next;
...@@ -443,7 +430,7 @@ move_finalizer_reachable(PyGC_Head *finalizers, ...@@ -443,7 +430,7 @@ move_finalizer_reachable(PyGC_Head *finalizers,
traverse = FROM_GC(gc)->ob_type->tp_traverse; traverse = FROM_GC(gc)->ob_type->tp_traverse;
(void) traverse(FROM_GC(gc), (void) traverse(FROM_GC(gc),
(visitproc)visit_move, (visitproc)visit_move,
(void *)reachable_from_finalizers); (void *)finalizers);
} }
} }
...@@ -475,24 +462,32 @@ debug_cycle(char *msg, PyObject *op) ...@@ -475,24 +462,32 @@ debug_cycle(char *msg, PyObject *op)
/* Handle uncollectable garbage (cycles with finalizers, and stuff reachable /* Handle uncollectable garbage (cycles with finalizers, and stuff reachable
* only from such cycles). * only from such cycles).
* If DEBUG_SAVEALL or hasfinalizer, the objects in finalizers are appended * If DEBUG_SAVEALL, all objects in finalizers are appended to the module
* to the module garbage list (a Python list). The objects in finalizers * garbage list (a Python list), else only the objects in finalizers with
* are merged into the old list regardless. * __del__ methods are appended to garbage. All objects in finalizers are
* merged into the old list regardless.
* Returns 0 if all OK, <0 on error (out of memory to grow the garbage list). * Returns 0 if all OK, <0 on error (out of memory to grow the garbage list).
* The finalizers list is made empty on a successful return. * The finalizers list is made empty on a successful return.
*/ */
static int static int
handle_finalizers(PyGC_Head *finalizers, PyGC_Head *old, int hasfinalizer) handle_finalizers(PyGC_Head *finalizers, PyGC_Head *old)
{ {
PyGC_Head *gc = finalizers->gc.gc_next;
if (garbage == NULL) { if (garbage == NULL) {
garbage = PyList_New(0); garbage = PyList_New(0);
if (garbage == NULL) if (garbage == NULL)
Py_FatalError("gc couldn't create gc.garbage list"); Py_FatalError("gc couldn't create gc.garbage list");
} }
if ((debug & DEBUG_SAVEALL) || hasfinalizer) { for (; gc != finalizers; gc = gc->gc.gc_next) {
if (append_objects(garbage, finalizers) < 0) PyObject *op = FROM_GC(gc);
return -1;
if ((debug & DEBUG_SAVEALL) || has_finalizer(op)) {
if (PyList_Append(garbage, op) < 0)
return -1;
}
} }
gc_list_merge(finalizers, old); gc_list_merge(finalizers, old);
return 0; return 0;
} }
...@@ -541,9 +536,7 @@ collect(int generation) ...@@ -541,9 +536,7 @@ collect(int generation)
PyGC_Head *young; /* the generation we are examining */ PyGC_Head *young; /* the generation we are examining */
PyGC_Head *old; /* next older generation */ PyGC_Head *old; /* next older generation */
PyGC_Head unreachable; PyGC_Head unreachable;
PyGC_Head collectable;
PyGC_Head finalizers; PyGC_Head finalizers;
PyGC_Head reachable_from_finalizers;
PyGC_Head *gc; PyGC_Head *gc;
if (delstr == NULL) { if (delstr == NULL) {
...@@ -606,31 +599,19 @@ collect(int generation) ...@@ -606,31 +599,19 @@ collect(int generation)
* care not to create such things. For Python, finalizers means * care not to create such things. For Python, finalizers means
* instance objects with __del__ methods. * instance objects with __del__ methods.
* *
* Move each unreachable object into the collectable set or the * Move unreachable objects with finalizers into a different list.
* finalizers set. Because we need to check for __del__ methods on */
* instances of classic classes, arbitrary Python code may get
* executed by getattr hooks: that may resurrect or deallocate (via
* refcount falling to 0) unreachable objects, so this is very
* delicate.
*/
gc_list_init(&collectable);
gc_list_init(&finalizers); gc_list_init(&finalizers);
move_finalizers(&unreachable, &collectable, &finalizers); move_finalizers(&unreachable, &finalizers);
/* finalizers contains the unreachable objects with a finalizer; /* finalizers contains the unreachable objects with a finalizer;
* unreachable objects reachable only *from* those are also * unreachable objects reachable only *from* those are also
* uncollectable; we move those into a separate list * uncollectable, and we move those into the finalizers list too.
* (reachable_from_finalizers) so we don't have to do the dangerous
* has_finalizer() test again later.
*/ */
gc_list_init(&reachable_from_finalizers); move_finalizer_reachable(&finalizers);
move_finalizer_reachable(&finalizers, &reachable_from_finalizers);
/* And move everything only reachable from the reachable stuff. */
move_finalizer_reachable(&reachable_from_finalizers,
&reachable_from_finalizers);
/* Collect statistics on collectable objects found and print /* Collect statistics on collectable objects found and print
* debugging information. */ * debugging information. */
for (gc = collectable.gc.gc_next; gc != &collectable; for (gc = unreachable.gc.gc_next; gc != &unreachable;
gc = gc->gc.gc_next) { gc = gc->gc.gc_next) {
m++; m++;
if (debug & DEBUG_COLLECTABLE) { if (debug & DEBUG_COLLECTABLE) {
...@@ -640,7 +621,7 @@ collect(int generation) ...@@ -640,7 +621,7 @@ collect(int generation)
/* Call tp_clear on objects in the collectable set. This will cause /* Call tp_clear on objects in the collectable set. This will cause
* the reference cycles to be broken. It may also cause some objects * the reference cycles to be broken. It may also cause some objects
* in finalizers and/or reachable_from_finalizers to be freed */ * in finalizers and/or reachable_from_finalizers to be freed */
delete_garbage(&collectable, old); delete_garbage(&unreachable, old);
/* Collect statistics on uncollectable objects found and print /* Collect statistics on uncollectable objects found and print
* debugging information. */ * debugging information. */
...@@ -651,13 +632,6 @@ collect(int generation) ...@@ -651,13 +632,6 @@ collect(int generation)
if (debug & DEBUG_UNCOLLECTABLE) if (debug & DEBUG_UNCOLLECTABLE)
debug_cycle("uncollectable", FROM_GC(gc)); debug_cycle("uncollectable", FROM_GC(gc));
} }
for (gc = reachable_from_finalizers.gc.gc_next;
gc != &reachable_from_finalizers;
gc = gc->gc.gc_next) {
n++;
if (debug & DEBUG_UNCOLLECTABLE)
debug_cycle("uncollectable", FROM_GC(gc));
}
if (debug & DEBUG_STATS) { if (debug & DEBUG_STATS) {
if (m == 0 && n == 0) { if (m == 0 && n == 0) {
PySys_WriteStderr("gc: done.\n"); PySys_WriteStderr("gc: done.\n");
...@@ -673,13 +647,11 @@ collect(int generation) ...@@ -673,13 +647,11 @@ collect(int generation)
* reachable list of garbage. The programmer has to deal with * reachable list of garbage. The programmer has to deal with
* this if they insist on creating this type of structure. * this if they insist on creating this type of structure.
*/ */
if (handle_finalizers(&finalizers, old, 1) == 0) (void)handle_finalizers(&finalizers, old);
(void)handle_finalizers(&reachable_from_finalizers, old, 0);
if (PyErr_Occurred()) { if (PyErr_Occurred()) {
if (gc_str == NULL) { if (gc_str == NULL)
gc_str = PyString_FromString("garbage collection"); gc_str = PyString_FromString("garbage collection");
}
PyErr_WriteUnraisable(gc_str); PyErr_WriteUnraisable(gc_str);
Py_FatalError("unexpected exception during garbage collection"); Py_FatalError("unexpected exception during garbage collection");
} }
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment