class NN(object):

    def __init__(...):
        [...]  # some initialization of the class

    # define a recursive function to return a vector which has at least one non-zero element
    @staticmethod
    def generate_random_nodes(dropout_prob, size):
        temp = np.random.binomial(1, dropout_prob, size)
        return temp if not sum(temp) else generate_random_nodes(dropout_prob, size)

    def compute_dropout(self, activations, dropout_prob = 0.5):
        [...]
        mult = np.copy(activations)          
        temp = generate_random_nodes(dropout_prob, size = activations.shape[0])
        mult[:,i] = temp            
        activations*=mult
        return activations

    def fit(self, ...):
        compute_dropout(...)

I want to create a function within my class that is called by one of the class's methods. The function is recursive and is meant to return a vector of 0s and 1s only if the vector has at least one non-zero element.

The error I'm getting is "NameError: name 'generate_random_nodes' is not defined".

1 Answer

Anything defined inside a class must be referenced by qualified name, either looked up on the class directly, or on an instance of it. So the simplest fix here is to explicitly call NN.generate_random_nodes for the recursive call, and self.generate_random_nodes in the initial calls to it (only showing methods with changes):

@staticmethod
def generate_random_nodes(dropout_prob, size):
    temp = np.random.binomial(1, dropout_prob, size)
    # Must explicitly qualify recursive call
    return temp if not sum(temp) else NN.generate_random_nodes(dropout_prob, size)

def compute_dropout(self, activations, dropout_prob = 0.5):
    [...]
    mult = np.copy(activations)          
    # Can call static on self just fine, and avoids hard-coding class name
    temp = self.generate_random_nodes(dropout_prob, size=activations.shape[0])
    mult[:,i] = temp            
    activations*=mult
    return activations
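
To make the lookup rule concrete, here is a minimal, dependency-free sketch. The method names `draw_mask`, `broken`, `via_instance` and `via_class` are hypothetical, and `random.random()` stands in for `np.random.binomial` purely so the snippet runs without numpy. It shows the bare name failing and both qualified forms working:

```python
import random


class NN:
    @staticmethod
    def draw_mask(keep_prob, size):
        # Stand-in for np.random.binomial(1, keep_prob, size).
        return [1 if random.random() < keep_prob else 0 for _ in range(size)]

    def broken(self, size):
        # A bare name inside a method is resolved through the local,
        # enclosing-function, global and builtin scopes. The class body
        # is not one of them, so this raises NameError when called.
        return draw_mask(0.5, size)

    def via_instance(self, size):
        # Attribute lookup on the instance finds the static method.
        return self.draw_mask(0.5, size)

    def via_class(self, size):
        # Attribute lookup on the class works too.
        return NN.draw_mask(0.5, size)


net = NN()
error_message = None
try:
    net.broken(4)
except NameError as exc:
    error_message = str(exc)

mask_a = net.via_instance(4)
mask_b = net.via_class(4)
print(error_message)
```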

Note that on Python 3.x, referencing __class__ inside a function defined in a class body makes the compiler create an implicit closure cell holding the class the function was defined in. This is the same mechanism that powers zero-argument super(), and it is commonly regarded as a CPython implementation detail. It lets you avoid repeating the class name, so generate_random_nodes could be:

@staticmethod
def generate_random_nodes(dropout_prob, size):
    temp = np.random.binomial(1, dropout_prob, size)
    # Must qualify recursive call
    return temp if not sum(temp) else __class__.generate_random_nodes(dropout_prob, size)

which has a couple of advantages:

  1. Nested scope lookup of __class__ is slightly faster than global scope lookup of NN, and
  2. If the name of your NN class changes during development, you don't need to change generate_random_nodes at all (because it's implicitly getting a reference to the class it was defined in).
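
The second point can be checked with a small sketch. The class and method names here (`Sampler`, `owner`, `countdown`) are hypothetical and chosen only for illustration; the class is rebound to a new name and the original name is deleted, yet the recursive call through `__class__` still resolves:

```python
class Sampler:
    @staticmethod
    def owner():
        # Mentioning __class__ gives this function a closure reference
        # to the class whose body it was defined in.
        return __class__

    @staticmethod
    def countdown(n):
        # Recursive call that never mentions the class by name.
        return 0 if n == 0 else 1 + __class__.countdown(n - 1)


Renamed = Sampler
del Sampler  # the global name "Sampler" no longer exists

same_object = Renamed.owner() is Renamed
steps = Renamed.countdown(3)
print(same_object, steps)
```

Had `countdown` called `Sampler.countdown(...)` instead, the `del Sampler` line would have turned the recursive call into a NameError.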

You could also (without relying on CPython implementation details) change it to a classmethod to get the same basic benefit:

@classmethod
def generate_random_nodes(cls, dropout_prob, size):
    temp = np.random.binomial(1, dropout_prob, size)
    # Must qualify recursive call
    return temp if not sum(temp) else cls.generate_random_nodes(dropout_prob, size)

since classmethods receive a reference to the class they were called on (or the class of the instance, if called on an instance). This is a slight abuse of classmethod, whose main intended use is alternate constructors in class hierarchies, where subclasses need to be constructible through the alternate constructor without overriding it in the subclass; it's perfectly legal, just slightly unorthodox.
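
One practical difference between `cls` and a hard-coded class name shows up with subclasses. The following sketch uses hypothetical classes `Base` and `Child` with a `label` attribute invented for the demonstration; it is not part of the original NN code:

```python
class Base:
    label = "base"

    @classmethod
    def describe(cls, depth):
        # cls is whichever class the call started on, and it is
        # passed along unchanged through every recursive call.
        if depth == 0:
            return cls.label
        return cls.describe(depth - 1)

    @staticmethod
    def describe_static(depth):
        # A hard-coded name always refers to Base, even when the
        # call started on a subclass.
        if depth == 0:
            return Base.label
        return Base.describe_static(depth - 1)


class Child(Base):
    label = "child"


via_classmethod = Child.describe(2)
via_instance = Child().describe(2)
via_static = Child.describe_static(2)
print(via_classmethod, via_instance, via_static)
```

For a helper like generate_random_nodes, which uses no class state, the two behave the same; the distinction only matters once subclasses override something the method reads.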

As discussed below in the comments:

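  1. Python handles deep recursion poorly: CPython performs no tail call optimization and, by default, raises RecursionError once the stack is roughly 1000 frames deep.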
  2. Your recursion condition is backwards (you return temp only if the sum of it is 0, meaning temp is an array of all zeroes), which dramatically increases the chance of recursion, and makes a recursion error nearly certain for sufficiently high dropout_prob/size arguments.
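
The size of the effect in point 2 can be estimated with a little arithmetic, assuming each element is drawn independently and equals 1 with probability dropout_prob. The helper names below are hypothetical. An all-zero vector then occurs with probability (1 - dropout_prob) ** size, and the number of draws until a stopping condition with probability q is met follows a geometric distribution with mean 1 / q:

```python
def prob_all_zero(p, size):
    # Every one of the `size` independent elements must come up 0.
    return (1.0 - p) ** size


def expected_draws_backwards(p, size):
    # Backwards condition: stop only when the vector is all zeros.
    return 1.0 / prob_all_zero(p, size)


def expected_draws_corrected(p, size):
    # Corrected condition: stop when at least one element is non-zero.
    return 1.0 / (1.0 - prob_all_zero(p, size))


p, size = 0.5, 10
backwards = expected_draws_backwards(p, size)
corrected = expected_draws_corrected(p, size)
print(backwards, corrected)
```

With dropout_prob = 0.5 and only 10 elements, the backwards version needs 1024 draws on average, which is already more than the default recursion limit of 1000, while the corrected version needs barely more than one.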

So you want to change temp if not sum(temp) else <recursive call> to temp if sum(temp) else <recursive call>, or for better performance/obviousness given it's a numpy array, temp if temp.any() else <recursive call>. And while that's likely to make the chance of recursion errors pretty small to start with, if you want to be extra careful, just change to a while loop based approach that can't risk indefinite recursion:

@staticmethod
def generate_random_nodes(dropout_prob, size):
    while True:
        temp = np.random.binomial(1, dropout_prob, size)
        if temp.any():
            return temp
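
One caveat with the loop: if dropout_prob is 0 (or size is 0), no draw can ever contain a non-zero element and the loop never ends. The variant below is a sketch of one way to guard against that, not a required change. The `rng` and `max_tries` parameters and the specific error messages are assumptions added for illustration, and `np.random.default_rng` requires numpy 1.17 or later:

```python
import numpy as np


def generate_random_nodes(dropout_prob, size, rng=None, max_tries=1000):
    # Reject inputs for which a non-zero vector is impossible.
    if not 0.0 < dropout_prob <= 1.0:
        raise ValueError("dropout_prob must be in (0, 1]")
    if size < 1:
        raise ValueError("size must be at least 1")

    # Hypothetical convenience: accept a Generator for reproducibility.
    rng = np.random.default_rng() if rng is None else rng

    # Bounded retry loop instead of `while True`.
    for _ in range(max_tries):
        temp = rng.binomial(1, dropout_prob, size)
        if temp.any():
            return temp
    raise RuntimeError("no non-zero vector drawn in max_tries attempts")


mask = generate_random_nodes(0.5, 8, rng=np.random.default_rng(0))
print(mask)
```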

6 Comments

Thank you for such a detailed response. If I may ask another question: I'm seeing the following message when I run it: RecursionError: maximum recursion depth exceeded while calling a Python object. This is making me wonder if solving a problem recursively in Python is not optimal.
@Josh: You are correct there. Python (at least the CPython reference interpreter) is incapable of applying the tail call optimization, and (to prevent an actual C stack overflow) by default raises an exception if you descend more than 1000 frames (checkable with sys.getrecursionlimit(), changeable with sys.setrecursionlimit(), but raising it just risks actual stack overflow crashes, not a "nice" exception). Recursive solutions are usually a bad idea if they can recurse indefinitely. In this case, it seems like a simple while loop would be the way to go.
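
The limit described in the comment above is easy to observe. This is a small sketch with hypothetical function names: the recursive version fails once the requested depth exceeds sys.getrecursionlimit(), while the equivalent loop uses a single frame and finishes:

```python
import sys


def count_down_recursive(n):
    # Each call adds a frame; Python does not eliminate the tail call.
    return 0 if n == 0 else count_down_recursive(n - 1)


def count_down_loop(n):
    # Same result using one frame, regardless of n.
    while n > 0:
        n -= 1
    return n


limit = sys.getrecursionlimit()

try:
    count_down_recursive(limit + 100)
    hit_limit = False
except RecursionError:
    hit_limit = True

loop_result = count_down_loop(limit + 100)
print(limit, hit_limit, loop_result)
```
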
That said, I think your function is backwards. It keeps making arrays until it gets an array of all zeroes, then returns the zeroes. I think you wanted it the other way around.
Thanks, I was trying to avoid using a while loop. But if you think about it statistically, I'm trying to create a vector of 0s and 1s with a random distribution, with a 50% probability of selecting a 1, so we really shouldn't run into an indefinite recursive state.
@Josh: I updated the answer to include this info. I'll note that using sum for the test is quite inefficient (it can't short-circuit, and it has to convert each numpy value to a Python level int one by one as it goes), when numpy already provides a .any() method that short-circuits and operates directly on the C level values without conversion.