I'm relatively new to PyTorch and am trying to reproduce an algorithm from an academic paper that approximates a term using the Hessian matrix. I've set up a toy problem so that I can compare the results of the full Hessian with the approximation. I found this gist and have been playing with it to compute the full Hessian part of the algorithm.
I am getting the error: "RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation."
I've scoured the simple example code, the documentation, and many forum posts about this issue and cannot find any in-place operations. Any help would be greatly appreciated!
Here is my code:
import torch
import torch.autograd as autograd
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import numpy as np
torch.set_printoptions(precision=20, linewidth=180)
def jacobian(y, x, create_graph=False):
    jac = []
    flat_y = y.reshape(-1)
    grad_y = torch.zeros_like(flat_y)
    for i in range(len(flat_y)):
        grad_y[i] = 1.
        grad_x, = torch.autograd.grad(flat_y, x, grad_y, retain_graph=True, create_graph=create_graph)
        jac.append(grad_x.reshape(x.shape))
        grad_y[i] = 0.
    return torch.stack(jac).reshape(y.shape + x.shape)

def hessian(y, x):
    return jacobian(jacobian(y, x, create_graph=True), x)

def f(x):
    return x * x
np.random.seed(435537698)
num_dims = 2
num_samples = 3
X = [np.random.uniform(size=num_dims) for i in range(num_samples)]
print('X: \n{}\n\n'.format(X))
mean = torch.Tensor(np.mean(X, axis=0))
mean.requires_grad = True
print('mean: \n{}\n\n'.format(mean))
cov = torch.Tensor(np.cov(X, rowvar=False))
print('cov: \n{}\n\n'.format(cov))
with autograd.detect_anomaly():
    hessian_matrices = hessian(f(mean), mean)
print('hessian: \n{}\n\n'.format(hessian_matrices))
And here is the output with the stack trace:
X:
[array([0.81700949, 0.17141617]), array([0.53579366, 0.31141496]), array([0.49756485, 0.97495776])]
mean:
tensor([0.61678934097290039062, 0.48592963814735412598], requires_grad=True)
cov:
tensor([[ 0.03043144382536411285, -0.05357056483626365662],
[-0.05357056483626365662, 0.18426130712032318115]])
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-3-5a1c492d2873> in <module>()
42
43 with autograd.detect_anomaly():
---> 44 hessian_matrices = hessian(f(mean), mean)
45 print('hessian: \n{}\n\n'.format(hessian_matrices))
2 frames
<ipython-input-3-5a1c492d2873> in hessian(y, x)
21
22 def hessian(y, x):
---> 23 return jacobian(jacobian(y, x, create_graph=True), x)
24
25 def f(x):
<ipython-input-3-5a1c492d2873> in jacobian(y, x, create_graph)
15 for i in range(len(flat_y)):
16 grad_y[i] = 1.
---> 17 grad_x, = torch.autograd.grad(flat_y, x, grad_y, retain_graph=True, create_graph=create_graph)
18 jac.append(grad_x.reshape(x.shape))
19 grad_y[i] = 0.
/usr/local/lib/python3.6/dist-packages/torch/autograd/__init__.py in grad(outputs, inputs, grad_outputs, retain_graph, create_graph, only_inputs, allow_unused)
155 return Variable._execution_engine.run_backward(
156 outputs, grad_outputs, retain_graph, create_graph,
--> 157 inputs, allow_unused)
158
159
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [2]] is at version 4; expected version 3 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
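One possible reading of the "is at version 4; expected version 3" part of the message, offered as an assumption rather than a confirmed diagnosis: the writes `grad_y[i] = 1.` and `grad_y[i] = 0.` are themselves in-place operations, and with `create_graph=True` autograd may keep `grad_y` around for the second differentiation. The sketch below tries to isolate that on a single tensor. The input values are hypothetical, and `_version` is an internal attribute used here only for inspection.

```python
import torch

# Hypothetical inputs, not the values from the post.
x = torch.tensor([0.5, 0.25], requires_grad=True)

grad_y = torch.zeros(2)
grad_y[0] = 1.0                       # indexed assignment is an in-place write
version_when_saved = grad_y._version  # internal counter, read for inspection only

# With create_graph=True the backward pass of x * x is itself recorded,
# so the product grad_y * x becomes part of a new graph.
g, = torch.autograd.grad(x * x, x, grad_y, create_graph=True)

grad_y[0] = 0.0                       # another in-place write, after the first grad call
version_after_write = grad_y._version

# Differentiating g again needs the tensors recorded during the first grad call.
try:
    torch.autograd.grad(g.sum(), x)
    error_message = None
except RuntimeError as err:
    error_message = str(err)

print("version when saved: ", version_when_saved)
print("version after write:", version_after_write)
print("second grad error:  ", error_message)
```

If the second `grad` call fails here in the same way as in the traceback, that would point at the reuse of `grad_y` across loop iterations rather than at `f` itself.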
torch.autograd.grad
... Changing the definition of f(x) from x*x to x*x*torch.ones_like(x) solves the problem. I have no idea why... Seems like a bug in PyTorch to me...
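A hedged alternative to changing `f`: if the error comes from the in-place writes to a shared `grad_y` (an assumption, not something confirmed in this thread), then allocating a fresh one-hot vector on every iteration should avoid it while leaving `f(x) = x * x` untouched. Under the same assumption, the extra `torch.ones_like(x)` factor may only help because the tensor recorded for the second differentiation becomes the fresh product of `grad_y` and the ones tensor instead of `grad_y` itself. The function names below mirror the code in the question, and the input values are hypothetical.

```python
import torch


def jacobian(y, x, create_graph=False):
    """Jacobian of y w.r.t. x, built one output element at a time."""
    flat_y = y.reshape(-1)
    jac = []
    for i in range(flat_y.numel()):
        # Fresh one-hot vector per iteration: it is written once, before
        # autograd sees it, and never modified afterwards.
        grad_y = torch.zeros_like(flat_y)
        grad_y[i] = 1.0
        grad_x, = torch.autograd.grad(
            flat_y, x, grad_y, retain_graph=True, create_graph=create_graph
        )
        jac.append(grad_x.reshape(x.shape))
    return torch.stack(jac).reshape(y.shape + x.shape)


def hessian(y, x):
    # Jacobian of the Jacobian; the inner call must keep its graph.
    return jacobian(jacobian(y, x, create_graph=True), x)


def f(x):
    return x * x


# Hypothetical input, not the `mean` tensor from the question.
x = torch.tensor([0.6, 0.4], requires_grad=True)
H = hessian(f(x), x)
print(H.shape)
print(H)
```

For the elementwise square, the analytic second derivative is 2 at positions `[i, i, i]` and 0 elsewhere, which gives a quick check on the result before comparing against the approximation from the paper.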