First of all, this post does NOT answer my question or give me any guide to answer my question at all.

My question is about the mechanism by which a function resolves non-local variables.

Code

# code block 1
def func():
    vals = [0, 0, 0]
    other_vals = [7, 8, 9]
    other = 12

    def func1():
        vals[1] += 1
        print(vals)

    def func2():
        vals[2] += 2
        print(vals)

    return (func1, func2)

f1, f2 = func()

Try to run f1, f2:

>>> f1()
[0, 1, 0]
>>> f2()
[0, 1, 2]

This shows that the object referred to by vals is shared by f1 and f2, and is not garbage collected after func finishes executing.

Will the objects referred to by other_vals and other be garbage collected? I think so. But how does Python decide not to garbage collect vals?

Assumption 1

The Python interpreter resolves the variable names within func1 and func2 to figure out which objects they reference, and increases the reference count of [0, 0, 0] by 1, preventing it from being garbage collected after the call to func.

But if I do

# code block 2
def outerfunc():
    def innerfunc():
        print(non_existent_variable)
f = outerfunc()

No error is reported. Furthermore,

# code block 3
def my_func():
    print(yet_to_define)
yet_to_define = "hello"

works.

Assumption 2

Variable names are resolved dynamically at run time. This makes the observations in code blocks 2 and 3 easy to explain, but how did the interpreter know it needed to increase the reference count of [0, 0, 0] in code block 1?

Which assumption is correct?

  • en.wikipedia.org/wiki/Garbage_collection_(computer_science) Commented Dec 19, 2015 at 6:17
  • (also worth reading, though I don't know if it carries well to newer versions: docs.python.org/release/2.5.2/ext/refcounts.html ) Commented Dec 19, 2015 at 6:18
  • I agree that the dupe target doesn't adequately answer your question. Hopefully, it'll get re-opened. Commented Dec 19, 2015 at 7:39
  • If I understand the question, the linked dupe doesn't really answer it. I think what the OP is looking for is actually answered in Ned Batchelder's blog post Facts and myths about Python names and values. I'm pretty sure this question is still a duplicate, though. Perhaps this question would be a better dupe? Commented Dec 19, 2015 at 7:41
  • In the meantime, I'll give a quick summary here. Your 1st example creates a closure, so the interpreter stores a reference to vals in the returned function objects func1 and func2, and that ref prevents vals from being garbage collected. See What exactly is contained within a obj.__closure__?. In your 2nd example non_existent_variable is presumably a global, and you will get an error if you don't define it before calling f. Commented Dec 19, 2015 at 7:41

1 Answer

Your first example creates a closure; also see Why aren't python nested functions called closures?, Can you explain closures (as they relate to Python)?, and What exactly is contained within a obj.__closure__?.

The closure mechanism ensures that the interpreter stores a reference to vals in the returned function objects func1 and func2. Your Assumption 1 is correct: that reference prevents vals from being garbage collected when func returns.
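As a rough illustration of what is stored, the sketch below inspects the returned functions on CPython. It is a variant of code block 1 in which the inner functions return the list instead of printing it. The names captured_names, same_cell, and captured_list are introduced here for illustration only.

```python
def func():
    vals = [0, 0, 0]
    other_vals = [7, 8, 9]  # never used by the inner functions

    def func1():
        vals[1] += 1
        return vals

    def func2():
        vals[2] += 2
        return vals

    return func1, func2


f1, f2 = func()

# Only names the inner function actually uses are captured.
captured_names = f1.__code__.co_freevars

# Each captured name is backed by a cell object; both closures share it.
same_cell = f1.__closure__[0] is f2.__closure__[0]

# The cell holds the reference that keeps the list alive.
captured_list = f1.__closure__[0].cell_contents

f1()
f2()
print(captured_names)  # ('vals',)
print(same_cell)       # True
print(captured_list)   # [0, 1, 2]
```

Since other_vals does not appear in co_freevars, nothing in f1 or f2 refers to that list, so it is free to be reclaimed once func returns.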

In your second example, the interpreter finds no binding for non_existent_variable in the enclosing scope(s), so it treats the name as a global. That's fine, because your Assumption 2 is also correct: global names are looked up when the function is called. You're free to use names that haven't yet been bound to objects when the function is defined, so long as the name is bound by the time you actually call the function.
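A small sketch of that call-time lookup, based on code block 3 (the function returns the value instead of printing it, and failed_before and result_after are illustrative names):

```python
def my_func():
    # yet_to_define is looked up each time my_func runs,
    # not when the def statement is executed.
    return yet_to_define


try:
    my_func()
    failed_before = False
except NameError:
    # The name is not bound yet, so the call fails.
    failed_before = True

yet_to_define = "hello"
result_after = my_func()

print(failed_before)  # True
print(result_after)   # hello
```

Defining my_func never raises; only calling it before the name is bound does.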

As for how the interpreter knows to keep [0, 0, 0] alive in code block 1: the closure mechanism is an explicit thing the interpreter does when it executes a function definition, i.e., when it's creating a function object from the function definition in your script. Names that an inner function uses and that are bound in an enclosing function are identified when the code is compiled, and each new function object stores references to them.
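One way to see how the two kinds of names are classified is to look at the code object of an inner function on CPython. The sketch below combines the ideas of code blocks 1 and 2; captured, free_names, and global_names are names made up for this example.

```python
def outerfunc():
    captured = [0, 0, 0]

    def innerfunc():
        # 'captured' is bound in the enclosing function: a free variable.
        # 'non_existent_variable' is bound nowhere visible: a global lookup.
        return captured, non_existent_variable

    return innerfunc


inner = outerfunc()

# Names resolved through closure cells.
free_names = inner.__code__.co_freevars

# Names looked up in globals/builtins when the function is called.
global_names = inner.__code__.co_names

print(free_names)    # ('captured',)
print(global_names)  # ('non_existent_variable',)
```

No error occurs here because inner is never called; calling it would raise NameError unless non_existent_variable had been defined as a global first.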

Every Python function object (both normal def-style functions and lambdas) has an attribute to store this closure information, with a minor difference between Python 2 and Python 3. See the links at the start of this answer for details. I will mention here that Python 3 provides the nonlocal keyword, which works a bit like the global keyword: nonlocal allows you to assign to closed-over simple variables. J.F. Sebastian's answer has a simple example illustrating the use of nonlocal.
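For reference, here is a minimal Python 3 sketch of nonlocal. It is not the example from the linked answer, and make_counter and increment are hypothetical names.

```python
def make_counter():
    count = 0

    def increment():
        # Without this declaration, "count += 1" would make count a
        # local variable of increment and raise UnboundLocalError.
        nonlocal count
        count += 1
        return count

    return increment


counter = make_counter()
first = counter()
second = counter()
print(first, second)  # 1 2
```

The list-based examples in the question don't need nonlocal because vals[1] += 1 mutates the existing list rather than rebinding the name vals.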

Note that with nested functions the inner function definitions are processed each time you call the outer function, which allows you to do things like:

def func(vals):
    def func1():
        vals[1] += 1
        print(vals)

    def func2():
        vals[2] += 2
        print(vals)

    return func1, func2

f1, f2 = func([0, 0, 0])
f1()
f2()

f1, f2 = func([10, 20, 30])
f1()
f2()

output

[0, 1, 0]
[0, 1, 2]
[10, 21, 30]
[10, 21, 32]

1 Comment

Learned a lot. Thank you!