
Up until now, I have been making an egregious error in my Python development: I have been assuming that streams get closed once their corresponding objects go out of scope. Specifically, I assumed that when __del__ was invoked on a file (or any instance of a class inheriting from io.IOBase), it would run the object's close method. However, upon executing the following code I noticed that this was certainly not the case.

def wrap_close(old_close):
    def new_close(*args, **kwargs):
        print("closing")
        return old_close(*args, **kwargs)
    return new_close

f = open("tmp.txt")
f.close = wrap_close(f.close)
f.close() # prints "closing"

f = open("tmp.txt")
f.close = wrap_close(f.close)
del f # nothing printed

My question is, what is the benefit of not closing a file or stream automatically when its __del__ method is invoked? It seems like the implementation would be trivial, but I imagine there has to be a reason for allowing streams to stay open after their corresponding object goes out of scope.

  • The problem is that closing a file descriptor is an OS-level task that only properly gets done when you call .close(). del doesn't have any special behavior built in to deal with streams - it just removes the object from the namespace, and then the garbage collector composts it. Neither of them cares that there's an open file descriptor somewhere, because how would they know? Commented Feb 14, 2020 at 12:58
  • Printing the actual file descriptors (via f.fileno()) shows that they are re-used, i.e. the file is closed (see the sketch after these comments). Note that .close is a Python-level function, whereas _io is implemented in C and its classes will directly call its C functions, not the Python wrappers. Commented Feb 14, 2020 at 13:04
  • James: If you read the documentation closely, you'll see that there's no guarantee that deleted objects will ever be garbage collected. Commented Feb 14, 2020 at 13:12
  • @JamesMchugh Are you confusing __del__ with del perhaps? Python does use the equivalent of _io.TextIOWrapper.__del__, but it is a C function which doesn't call the Python level f.close. Commented Feb 14, 2020 at 13:17
  • @JamesMchugh del only unlinks a name. As a result, this might push the reference count down to 0 (in CPython) or eventually trigger garbage collection (any implementation). It does not directly invoke __del__. Commented Feb 14, 2020 at 13:23
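
The file-descriptor check suggested in the second comment can be sketched like this (a minimal illustration; tmp.txt is a hypothetical writable path, CPython's reference counting is assumed, and descriptor re-use is up to the OS, so the final comparison is typical rather than guaranteed):

f = open("tmp.txt", "w")
first_fd = f.fileno()
del f                          # refcount drops to 0, the C-level finaliser runs

g = open("tmp.txt", "w")
print(first_fd == g.fileno())  # typically True: the old descriptor was released
g.close()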

1 Answer


self.close keeps a reference to self, so you're effectively writing self.close = lambda: self.close(), creating a circular reference. As a result:

  1. del does nothing on its own: CPython will not reclaim the object until an actual GC collection happens, either implicitly at some point or through an explicit gc.collect() (see the sketch after this list)
  2. CPython has to break the cycle, so by the time it gets around to collecting the object, it may already have removed the attribute from it
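
As a concrete sketch of point 1 (tmp.txt is a hypothetical writable path; the weakref probe is the same trick mentioned in the comments below, and the immediate-vs-deferred behaviour is specific to CPython):

import gc
import weakref

def wrap_close(old_close):         # same shape as the code in the question:
    def new_close():               # new_close's closure keeps old_close alive,
        old_close()                # and the bound method old_close keeps f alive,
    return new_close               # so f -> close attribute -> new_close -> old_close -> f

f = open("tmp.txt", "w")
f.close = wrap_close(f.close)
probe = weakref.ref(f)             # observe whether the file object is still alive

del f
print(probe() is None)             # False: the cycle keeps the object around

gc.collect()                       # an explicit collection breaks the cycle
print(probe() is None)             # True: only now is the object reclaimed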

You can see this very clearly if you replace your strong references to old_close by weak ones:

import weakref

def wrap_close(old_close):
    old_close = weakref.ref(old_close)
    def new_close(*args, **kwargs):
        print("closing")
        c = old_close()
        if c:
            c(*args, **kwargs)
    return new_close

f = open('/dev/zero')
f.close = wrap_close(f.close)
f.close() # prints "closing"

f = open('/dev/zero')
f.close = wrap_close(f.close)
del f # nothing printed

prints "closing" not just twice but thrice:

  • when close() is called explicitly
  • when the first f is garbage-collected (as the second one is created)
  • when the second one is del'd

My question is, what is the benefit of not closing a file or stream automatically when its __del__ method is invoked?

Python absolutely closes the file object when it's finalised.
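
A quick way to see the finaliser at work (a sketch; tmp.txt is a hypothetical writable path, CPython's refcount-driven finalisation is assumed, and in a larger program the descriptor number could be re-used before the check):

import os

f = open("tmp.txt", "w")
fd = f.fileno()
os.fstat(fd)        # fine: the descriptor is currently open

del f               # no cycle here, so CPython finalises the object right away
try:
    os.fstat(fd)
except OSError:
    print("the descriptor was closed during finalisation")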


5 Comments

A circular reference will not prevent an object from being collected. The CPython gc explicitly exists for circular references only.
@MisterMiyagi it will not prevent an object from being collected in the long term, but del alone will not do it (you can see that by creating a weakref to the file and checking the weakref's liveness after the del: on a normal file the weakref is empty, whereas here it is not). And here, even after forcing a GC run, the "replacement method" never gets called, possibly because CPython decides to break the cycle by first unsetting the callable.
So in this case, the wrapper for close could potentially be deleted before the actual file stream. I did not take that into account. On top of that, just because del may reduce the reference count of an object to 0, that does not mean it will be GC'd at that point. I think I have a better understanding now. Thank you.
@JamesMchugh CPython more or less guarantees that a refcount of 0 will cause collection (note that this is specific to CPython, it's not generally valid); however, if you have a cycle the refcount is never 0, and the object lives on in an unreachable bubble kept alive by its own references. Also, if you use weakrefs you can clearly see that there's no issue with the collection itself.
So using a weak reference removes that cycle since it doesn't protect the underlying object from GC, thus allowing the refcount to reach 0. That is super interesting. I did not even know that you could use weak references in Python. Thank you for making me aware of that.
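
Putting the two ideas from this thread together (the answer's weakref-based wrap_close and the liveness probe from the earlier comment) gives a small sketch showing that without the cycle a plain del is enough to finalise the file; tmp.txt is a hypothetical writable path and the timing is specific to CPython:

import weakref

def wrap_close(old_close):
    old_close = weakref.ref(old_close)   # weak: does not keep the bound method (or f) alive
    def new_close(*args, **kwargs):
        print("closing")
        c = old_close()
        if c:
            c(*args, **kwargs)
    return new_close

f = open("tmp.txt", "w")
f.close = wrap_close(f.close)
probe = weakref.ref(f)

del f                        # prints "closing": the finaliser runs immediately
print(probe() is None)       # True: no cycle, so nothing kept the object alive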
