
I've put together some computation on whose result I want to compute a loss, and then compute the gradients of all the model's parameters w.r.t. that loss. The problem is that nestled in the computation is a tunable model that I want to be able to tune (eventually). Right now I am just trying to confirm that the model parameters receive gradients when I call backward(), which they do not. That is the problem. Below I post the code, the output, and the desired output.

import torch

class ExpModelTunable(torch.nn.Module):
    def __init__(self):
        super(ExpModelTunable, self).__init__()
        self.alpha = torch.nn.Parameter( torch.tensor(1.0, requires_grad=True) )
        self.beta = torch.nn.Parameter( torch.tensor(1.0, requires_grad=True) )
    
    def forward(self, t):
        return self.alpha * torch.exp(  - self.beta * t ) 

def func_f(t, t_list):
  mu = torch.tensor(0.13191110355, requires_grad=True)
  running_sum = torch.sum( torch.tensor( [ f(t-ti) for ti in t_list ], requires_grad=True ) )
  return mu + running_sum

def pytorch_objective_tunable(u, t_list):
  global U
  steps = torch.linspace(t_list[-1].item(),u.item(),100, requires_grad=True)
  func_values = torch.tensor( [ func_f(steps[i], t_list) for i in range(len(steps)) ], requires_grad=True )
  return torch.log(U) + torch.trapz(func_values, steps)

def newton_method(function, func, initial, t_list, iteration=200, convergence=0.0001):
    for i in range(iteration): 
        previous_data = initial.clone()
        value = function(initial, t_list)
        initial.data -= (value / func(initial.item(), t_list)).data

        if torch.abs(initial - previous_data) < torch.tensor(convergence):
            return initial
    return initial # return our final after iteration

# call starts
f = ExpModelTunable()
U = torch.rand(1, requires_grad=True)
initial_x = torch.tensor([.1], requires_grad=True) 
t_list = torch.tensor([0.0], requires_grad=True)
result = newton_method(pytorch_objective_tunable, func_f, initial_x, t_list)
print("Next Arrival at ", result.item())

This prints the correct output, all good here: Next Arrival at 4.500311374664307. My problem occurs here:

loss = result - torch.tensor(1)
loss.backward()
print( result.grad )
for param in f.parameters():
    print(param.grad)

output:

tensor([1.])
None #this should not be None
None #this should not be None

So we can see the result variable's gradient is populated, but the model f's parameters' gradients aren't getting updated. I went back through all the computation (all the code is here) and made sure anything and everything has requires_grad=True, but I still can't get it to work. This should work, right? Anyone have any tips? Thanks.


1 Answer


There are a few issues with your code. Straight off, you can tell whether the model can even initiate backpropagation by looking at your output tensor:

>>> result
tensor([...], requires_grad=True)

It doesn't have a grad_fn, so you already know it's not connected to the computation graph.
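
For illustration, here is a minimal, self-contained example (not from the question code) showing how grad_fn distinguishes a tensor produced by a tracked operation from a plain leaf tensor:

import torch

a = torch.tensor([1.0], requires_grad=True)  # leaf tensor: no grad_fn
b = a * 2                                    # result of an op: carries grad_fn

print(a.grad_fn)  # None
print(b.grad_fn)  # <MulBackward0 object at ...>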

Now for debugging the issues, here are some tips:

  • First, you should never mutate .data or use .item() if you're planning on backpropagating. This essentially kills the graph, since any operation performed afterwards won't be attached to it (see the short sketch after this list).

  • You actually don't need to set requires_grad yourself most of the time. Do note that nn.Parameter assigns requires_grad=True to its tensor by default.

  • When working with list comprehensions inside your PyTorch pipeline, wrap the list with torch.stack instead of torch.tensor: torch.stack keeps each element's graph history, whereas torch.tensor copies raw values and detaches them.

  • I wouldn't use a global if I were you...

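Here is a small sketch of the first and third points above (hypothetical variable names, assuming a single scalar parameter w):

import torch

w = torch.tensor(2.0, requires_grad=True)
parts = [w * i for i in range(1, 4)]

# torch.tensor(...) built from .item() values copies the numbers and drops the graph
broken = torch.tensor([p.item() for p in parts]).sum()
print(broken.grad_fn)   # None -> backward() could never reach w

# torch.stack keeps each element's history, so gradients flow back to w
kept = torch.stack(parts).sum()
print(kept.grad_fn)     # <SumBackward0 ...>
kept.backward()
print(w.grad)           # tensor(6.)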

Here is the corrected version:

import torch
import torch.nn as nn

class ExpModelTunable(nn.Module):
    def __init__(self):
        super(ExpModelTunable, self).__init__()
        self.alpha = nn.Parameter(torch.ones(1))
        self.beta = nn.Parameter(torch.ones(1))
    
    def forward(self, t):
        return self.alpha * torch.exp(-self.beta*t) 

f = ExpModelTunable()
def func_f(t, t_list):
    mu = torch.tensor(0.13191110355)
    running_sum = torch.stack([f(t-ti) for ti in t_list]).sum()
    return mu + running_sum

def pytorch_objective_tunable(u, t_list):
    global U
    steps = torch.linspace(t_list[-1].item(), u.item(), 100)
    func_values = torch.stack([func_f(steps[i], t_list) for i in range(len(steps))])
    return torch.log(U) + torch.trapz(func_values, steps)
    # return torch.trapz(func_values, steps)

def newton_method(function, func, initial, t_list, iteration=1, convergence=0.0001):
    for i in range(iteration): 
        previous_data = initial.clone()
        value = function(initial, t_list)
        initial -= (value / func(initial, t_list))

        if torch.abs(initial - previous_data) < torch.tensor(convergence):
            return initial
    return initial # return our final after iteration

U = torch.rand(1, requires_grad=True)
initial_x = torch.tensor([.1]) 
t_list = torch.tensor([0.0], requires_grad=True)
result = newton_method(pytorch_objective_tunable, func_f, initial_x, t_list)

Notice now the grad_fn attached to result:

>>> result
tensor([...], grad_fn=<SubBackward0>)
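
Assuming the corrected code above has just been run, the original check from your question should now report non-None gradients for the model parameters, for example:

loss = result - torch.tensor(1.0)
loss.backward()

for name, param in f.named_parameters():
    print(name, param.grad)  # alpha and beta should now show gradient tensors, not None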