Kaydet (Commit) f6b8045c authored tarafından Tim Peters's avatar Tim Peters

Reworked has_finalizer() to use the new _PyObject_Lookup() instead

of PyObject_HasAttr(); the former promises never to execute
arbitrary Python code.  Undid many of the changes recently made to
worm around the worst consequences of that PyObject_HasAttr() could
execute arbitrary Python code.

Compatibility is hard to discuss, because the dangerous cases are
so perverse, and much of this appears to rely on implementation
accidents.

To start with, using hasattr() to check for __del__ wasn't only
dangerous, in some cases it was wrong:  if an instance of an old-
style class didn't have "__del__" in its instance dict or in any
base class dict, but a getattr hook said __del__ existed, then
hasattr() said "yes, this object has a __del__".  But
instance_dealloc() ignores the possibility of getattr hooks when
looking for a __del__, so while object.__del__ succeeds, no
__del__ method is called when the object is deleted.  gc was
therefore incorrect in believing that the object had a finalizer.

The new method doesn't suffer that problem (like instance_dealloc(),
_PyObject_Lookup() doesn't believe __del__ exists in that case), but
does suffer a somewhat opposite-- and even more obscure --oddity:
if an instance of an old-style class doesn't have "__del__" in its
instance dict, and a base class does have "__del__" in its dict,
and the first base class with a "__del__" associates it with a
descriptor (an object with a __get__ method), *and* if that
descriptor raises an exception when __get__ is called, then
(a) the current method believes the instance does have a __del__,
but (b) hasattr() does not believe the instance has a __del__.

While these disagree, I believe the new method is "more correct":
because the descriptor *will* be called when the object is
destructed, it can execute arbitrary Python code at the time the
object is destructed, and that's really what gc means by "has a
finalizer":  not specifically a __del__ method, but more generally
the possibility of executing arbitrary Python code at object
destruction time.  Code in a descriptor's __get__() executed at
destruction time can be just as problematic as code in a
__del__() executed then.

So I believe the new method is better on all counts.

Bugfix candidate, but it's unclear to me how all this differs in
the 2.2 branch (e.g., new-style and old-style classes already
took different gc paths in 2.3 before this last round of patches,
but don't in the 2.2 branch).
üst df875b99
......@@ -272,8 +272,9 @@ def test_boom():
# the internal "attr" attributes as a side effect. That causes the
# trash cycle to get reclaimed via refcounts falling to 0, thus mutating
# the trash graph as a side effect of merely asking whether __del__
# exists. This used to (before 2.3b1) crash Python.
expect(gc.collect(), 0, "boom")
# exists. This used to (before 2.3b1) crash Python. Now __getattr__
# isn't called.
expect(gc.collect(), 4, "boom")
expect(len(gc.garbage), garbagelen, "boom")
class Boom2:
......
......@@ -12,6 +12,14 @@ What's New in Python 2.3 beta 1?
Core and builtins
-----------------
- Some horridly obscure problems were fixed involving interaction
between garbage collection and old-style classes with "ambitious"
getattr hooks. If an old-style instance didn't have a __del__ method,
but did have a __getattr__ hook, and the instance became reachable
only from an unreachable cycle, and the hook resurrected or deleted
unreachable objects when asked to resolve "__del__", anything up to
a segfault could happen. That's been repaired.
- dict.pop now takes an optional argument specifying a default
value to return if the key is not in the dict. If a default is not
given and the key is not found, a KeyError will still be raised.
......
......@@ -359,19 +359,17 @@ move_unreachable(PyGC_Head *young, PyGC_Head *unreachable)
/* Return true if object has a finalization method.
* CAUTION: An instance of an old-style class has to be checked for a
*__del__ method, and that can cause arbitrary Python code to get executed
* via the class's __getattr__ hook (if any). This function can therefore
* mutate the object graph, and that's been the source of subtle bugs.
*__del__ method, and earlier versions of this used to call PyObject_HasAttr,
* which in turn could call the class's __getattr__ hook (if any). That
* could invoke arbitrary Python code, mutating the object graph in arbitrary
* ways, and that was the source of some excruciatingly subtle bugs.
*/
static int
has_finalizer(PyObject *op)
{
if (PyInstance_Check(op)) {
/* This is the dangerous path: hasattr can invoke
* the class __getattr__(), and that can do anything.
*/
assert(delstr != NULL);
return PyObject_HasAttr(op, delstr);
return _PyInstance_Lookup(op, delstr) != NULL;
}
else if (PyType_HasFeature(op->ob_type, Py_TPFLAGS_HEAPTYPE))
return op->ob_type->tp_del != NULL;
......@@ -379,38 +377,28 @@ has_finalizer(PyObject *op)
return 0;
}
/* Move all objects out of unreachable, into collectable or finalizers.
* It's possible that some objects will get collected (via refcount falling
* to 0), or resurrected, as a side effect of checking for __del__ methods.
* After, finalizers contains all the objects from unreachable that haven't
* been collected by magic, and that have a finalizer. gc_refs is
* GC_REACHABLE for all of those. collectable contains all the remaining
* objects from unreachable, and gc_refs remains GC_TENTATIVELY_UNREACHABLE
* for those (we're still not sure they're reclaimable after this! Some
* may yet by reachable *from* the objects in finalizers).
/* Move the objects in unreachable with __del__ methods into finalizers.
* The objects remaining in unreachable do not have __del__ methods, and
* gc_refs remains GC_TENTATIVELY_UNREACHABLE for them. The objects
* moved into finalizers have gc_refs changed to GC_REACHABLE.
*/
static void
move_finalizers(PyGC_Head *unreachable, PyGC_Head *collectable,
PyGC_Head *finalizers)
move_finalizers(PyGC_Head *unreachable, PyGC_Head *finalizers)
{
while (!gc_list_is_empty(unreachable)) {
PyGC_Head *gc = unreachable->gc.gc_next;
while (gc != unreachable) {
PyObject *op = FROM_GC(gc);
int finalizer;
PyGC_Head *next = gc->gc.gc_next;
assert(IS_TENTATIVELY_UNREACHABLE(op));
finalizer = has_finalizer(op);
if (unreachable->gc.gc_next == gc) {
if (has_finalizer(op)) {
gc_list_remove(gc);
if (finalizer) {
gc_list_append(gc, finalizers);
gc->gc.gc_refs = GC_REACHABLE;
}
else
gc_list_append(gc, collectable);
}
/* else has_finalizer() deleted op via side effect */
gc = next;
}
}
......@@ -430,11 +418,10 @@ visit_move(PyObject *op, PyGC_Head *tolist)
}
/* Move objects that are reachable from finalizers, from the unreachable set
* into the reachable_from_finalizers set.
* into finalizers set.
*/
static void
move_finalizer_reachable(PyGC_Head *finalizers,
PyGC_Head *reachable_from_finalizers)
move_finalizer_reachable(PyGC_Head *finalizers)
{
traverseproc traverse;
PyGC_Head *gc = finalizers->gc.gc_next;
......@@ -443,7 +430,7 @@ move_finalizer_reachable(PyGC_Head *finalizers,
traverse = FROM_GC(gc)->ob_type->tp_traverse;
(void) traverse(FROM_GC(gc),
(visitproc)visit_move,
(void *)reachable_from_finalizers);
(void *)finalizers);
}
}
......@@ -475,24 +462,32 @@ debug_cycle(char *msg, PyObject *op)
/* Handle uncollectable garbage (cycles with finalizers, and stuff reachable
* only from such cycles).
* If DEBUG_SAVEALL or hasfinalizer, the objects in finalizers are appended
* to the module garbage list (a Python list). The objects in finalizers
* are merged into the old list regardless.
* If DEBUG_SAVEALL, all objects in finalizers are appended to the module
* garbage list (a Python list), else only the objects in finalizers with
* __del__ methods are appended to garbage. All objects in finalizers are
* merged into the old list regardless.
* Returns 0 if all OK, <0 on error (out of memory to grow the garbage list).
* The finalizers list is made empty on a successful return.
*/
static int
handle_finalizers(PyGC_Head *finalizers, PyGC_Head *old, int hasfinalizer)
handle_finalizers(PyGC_Head *finalizers, PyGC_Head *old)
{
PyGC_Head *gc = finalizers->gc.gc_next;
if (garbage == NULL) {
garbage = PyList_New(0);
if (garbage == NULL)
Py_FatalError("gc couldn't create gc.garbage list");
}
if ((debug & DEBUG_SAVEALL) || hasfinalizer) {
if (append_objects(garbage, finalizers) < 0)
for (; gc != finalizers; gc = gc->gc.gc_next) {
PyObject *op = FROM_GC(gc);
if ((debug & DEBUG_SAVEALL) || has_finalizer(op)) {
if (PyList_Append(garbage, op) < 0)
return -1;
}
}
gc_list_merge(finalizers, old);
return 0;
}
......@@ -541,9 +536,7 @@ collect(int generation)
PyGC_Head *young; /* the generation we are examining */
PyGC_Head *old; /* next older generation */
PyGC_Head unreachable;
PyGC_Head collectable;
PyGC_Head finalizers;
PyGC_Head reachable_from_finalizers;
PyGC_Head *gc;
if (delstr == NULL) {
......@@ -606,31 +599,19 @@ collect(int generation)
* care not to create such things. For Python, finalizers means
* instance objects with __del__ methods.
*
* Move each unreachable object into the collectable set or the
* finalizers set. Because we need to check for __del__ methods on
* instances of classic classes, arbitrary Python code may get
* executed by getattr hooks: that may resurrect or deallocate (via
* refcount falling to 0) unreachable objects, so this is very
* delicate.
* Move unreachable objects with finalizers into a different list.
*/
gc_list_init(&collectable);
gc_list_init(&finalizers);
move_finalizers(&unreachable, &collectable, &finalizers);
move_finalizers(&unreachable, &finalizers);
/* finalizers contains the unreachable objects with a finalizer;
* unreachable objects reachable only *from* those are also
* uncollectable; we move those into a separate list
* (reachable_from_finalizers) so we don't have to do the dangerous
* has_finalizer() test again later.
* uncollectable, and we move those into the finalizers list too.
*/
gc_list_init(&reachable_from_finalizers);
move_finalizer_reachable(&finalizers, &reachable_from_finalizers);
/* And move everything only reachable from the reachable stuff. */
move_finalizer_reachable(&reachable_from_finalizers,
&reachable_from_finalizers);
move_finalizer_reachable(&finalizers);
/* Collect statistics on collectable objects found and print
* debugging information. */
for (gc = collectable.gc.gc_next; gc != &collectable;
for (gc = unreachable.gc.gc_next; gc != &unreachable;
gc = gc->gc.gc_next) {
m++;
if (debug & DEBUG_COLLECTABLE) {
......@@ -640,7 +621,7 @@ collect(int generation)
/* Call tp_clear on objects in the collectable set. This will cause
* the reference cycles to be broken. It may also cause some objects
* in finalizers and/or reachable_from_finalizers to be freed */
delete_garbage(&collectable, old);
delete_garbage(&unreachable, old);
/* Collect statistics on uncollectable objects found and print
* debugging information. */
......@@ -651,13 +632,6 @@ collect(int generation)
if (debug & DEBUG_UNCOLLECTABLE)
debug_cycle("uncollectable", FROM_GC(gc));
}
for (gc = reachable_from_finalizers.gc.gc_next;
gc != &reachable_from_finalizers;
gc = gc->gc.gc_next) {
n++;
if (debug & DEBUG_UNCOLLECTABLE)
debug_cycle("uncollectable", FROM_GC(gc));
}
if (debug & DEBUG_STATS) {
if (m == 0 && n == 0) {
PySys_WriteStderr("gc: done.\n");
......@@ -673,13 +647,11 @@ collect(int generation)
* reachable list of garbage. The programmer has to deal with
* this if they insist on creating this type of structure.
*/
if (handle_finalizers(&finalizers, old, 1) == 0)
(void)handle_finalizers(&reachable_from_finalizers, old, 0);
(void)handle_finalizers(&finalizers, old);
if (PyErr_Occurred()) {
if (gc_str == NULL) {
if (gc_str == NULL)
gc_str = PyString_FromString("garbage collection");
}
PyErr_WriteUnraisable(gc_str);
Py_FatalError("unexpected exception during garbage collection");
}
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment