I've put together some computation on whose result I want to compute a loss, and then get the gradients of all the model's parameters w.r.t. that loss. The problem is that nestled inside the computation is a tunable model that I want to be able to tune (eventually). Right now I am just trying to confirm that the model's parameter gradients get populated when I call backward(), and they do not. That is the problem. Below I post the code, the output, and the desired output.
import torch

class ExpModelTunable(torch.nn.Module):
    def __init__(self):
        super(ExpModelTunable, self).__init__()
        # tunable kernel parameters
        self.alpha = torch.nn.Parameter(torch.tensor(1.0, requires_grad=True))
        self.beta = torch.nn.Parameter(torch.tensor(1.0, requires_grad=True))

    def forward(self, t):
        return self.alpha * torch.exp(-self.beta * t)
def func_f(t, t_list):
    mu = torch.tensor(0.13191110355, requires_grad=True)
    # evaluate the global model f at (t - ti) for every past time ti and sum the results
    running_sum = torch.sum(torch.tensor([f(t - ti) for ti in t_list], requires_grad=True))
    return mu + running_sum
def pytorch_objective_tunable(u, t_list):
    global U
    # integrate func_f from the last time in t_list up to u with the trapezoidal rule
    steps = torch.linspace(t_list[-1].item(), u.item(), 100, requires_grad=True)
    func_values = torch.tensor([func_f(steps[i], t_list) for i in range(len(steps))], requires_grad=True)
    return torch.log(U) + torch.trapz(func_values, steps)
def newton_method(function, func, initial, t_list, iteration=200, convergence=0.0001):
    # Newton's method: 'function' is the objective, 'func' is its derivative w.r.t. u
    for i in range(iteration):
        previous_data = initial.clone()
        value = function(initial, t_list)
        initial.data -= (value / func(initial.item(), t_list)).data
        if torch.abs(initial - previous_data) < torch.tensor(convergence):
            return initial
    return initial  # return the final value after all iterations
# call starts
f = ExpModelTunable()
U = torch.rand(1, requires_grad=True)
initial_x = torch.tensor([.1], requires_grad=True)
t_list = torch.tensor([0.0], requires_grad=True)
result = newton_method(pytorch_objective_tunable, func_f, initial_x, t_list)
print("Next Arrival at ", result.item())
This prints the correct output, so everything is fine up to this point:

Next Arrival at  4.500311374664307

My problem occurs here:
loss = result - torch.tensor(1)
loss.backward()
print(result.grad)
for param in f.parameters():
    print(param.grad)
output:

tensor([1.])
None  # this should not be None
None  # this should not be None
So we can see that the result variable's gradient is populated, but the gradients of the model f's parameters are not getting updated. I went back through the whole computation (all the code is above) and made sure everything has requires_grad=True, but I still can't get it to work. This should work, right? Does anyone have any tips? Thanks.
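For reference, here is the kind of minimal sanity check I have in mind (the names f_check, out, and check_loss are only for this snippet, not part of the pipeline above): when the model output feeds the loss directly, I expect the parameter gradients to be populated.

f_check = ExpModelTunable()
out = f_check(torch.tensor(0.5))   # alpha * exp(-beta * 0.5)
check_loss = out - 1.0
check_loss.backward()
for param in f_check.parameters():
    print(param.grad)              # expected: non-None gradient tensors for alpha and beta

This is what I expect to see for f's parameters after the full computation as well.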