import torch

# device, n_epochs, x_train_tensor and y_train_tensor are defined earlier
# (the usual linear-regression training setup)
a = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float, device=device))
b = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float, device=device))
c = a + 1
d = torch.nn.Parameter(c, requires_grad=True)
for epoch in range(n_epochs):
    yhat = d + b * x_train_tensor
    error = y_train_tensor - yhat
    loss = (error ** 2).mean()
    loss.backward()
    print(a.grad)
    print(b.grad)
    print(c.grad)
    print(d.grad)

Prints out

None
tensor([-0.8707])
None
tensor([-1.1125])

How do I get the gradients for a and c? Variable d needs to stay a parameter.

  • Here, c is a regular tensor, not a parameter, so PyTorch does not compute a gradient for it. This tutorial (youtube.com/watch?v=MswxJw-8PvE) may help you. Commented Oct 16, 2019 at 22:57
  • @WasiAhmad what about a? Commented Oct 16, 2019 at 23:27

1 Answer


Basically, when you create a tensor directly, e.g. with torch.nn.Parameter() or torch.tensor(), you are creating a leaf node tensor.

And when you do something like c = a + 1, c is an intermediate node. You can print(c.is_leaf) to check whether a tensor is a leaf node or not. By default, PyTorch does not keep the gradient of intermediate nodes.
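
For example, here is a minimal sketch (the tensor names are just illustrative, not taken from the question):

import torch

a = torch.nn.Parameter(torch.randn(1))       # leaf node: created directly by the user
t = torch.tensor([2.0], requires_grad=True)  # also a leaf node
c = a + 1                                     # intermediate node: result of an operation

print(a.is_leaf)  # True
print(t.is_leaf)  # True
print(c.is_leaf)  # False

loss = (c * t).sum()
loss.backward()
print(a.grad)  # tensor([2.]) -- gradients are accumulated on leaf nodes
print(c.grad)  # None -- gradients of intermediate nodes are not retained by default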

In your code snippet, a, b and d are all leaf tensors, while c is an intermediate node. c.grad will be None because PyTorch does not store gradients for intermediate nodes. And a is cut off from the graph: wrapping c in torch.nn.Parameter(...) copies its value into a brand-new leaf tensor d with no history, so when you call loss.backward() the backward pass stops at d and never reaches a. That's why a.grad is also None.
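
A minimal sketch of that cut, assuming the same pattern as in the question (the behaviour matches the output you posted):

import torch

a = torch.nn.Parameter(torch.randn(1))
c = a + 1                     # intermediate node, still connected to a
d = torch.nn.Parameter(c)     # copies c's value into a brand-new leaf with no history

print(c.grad_fn)  # <AddBackward0 ...> -- c remembers it was computed from a
print(d.grad_fn)  # None -- d is a fresh leaf, detached from a
print(d.is_leaf)  # True

d.sum().backward()
print(d.grad)     # tensor([1.])
print(a.grad)     # None -- the backward pass stops at d and never reaches a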

If you change the code to this

a = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float, device=device))
b = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float, device=device))
c = a + 1
d = c
for epoch in range(n_epochs):
    yhat = d + b * x_train_tensor
    error = y_train_tensor - yhat
    loss = (error ** 2).mean()
    loss.backward()
    print(a.grad) # Not None
    print(b.grad) # Not None
    print(c.grad) # None
    print(d.grad) # None

You will find that a and b have gradients, but c.grad and d.grad are None, because c and d are intermediate nodes.


2 Comments

Is there a way for d to stay a parameter but as an intermediate node?
@algogator I think older versions of PyTorch let you set .requires_grad to True on an intermediate node, but the latest version does not seem to allow that.
