import torch

# device, n_epochs, x_train_tensor and y_train_tensor are defined earlier
# (the usual linear-regression training setup)
a = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float, device=device))
b = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float, device=device))
c = a + 1
d = torch.nn.Parameter(c, requires_grad=True)
for epoch in range(n_epochs):
    yhat = d + b * x_train_tensor
    error = y_train_tensor - yhat
    loss = (error ** 2).mean()
    loss.backward()
    print(a.grad)
    print(b.grad)
    print(c.grad)
    print(d.grad)

Prints out

None
tensor([-0.8707])
None
tensor([-1.1125])

How do I get the gradients for a and c? Variable d needs to stay a parameter.

  • Here, c is a regular tensor, not a parameter, so PyTorch does not compute a gradient for it. This tutorial (youtube.com/watch?v=MswxJw-8PvE) may help you. Commented Oct 16, 2019 at 22:57
  • @WasiAhmad what about a? Commented Oct 16, 2019 at 23:27

1 Answer


Basically, when you create a tensor directly, e.g. with torch.nn.Parameter() or torch.tensor(), you are creating a leaf node tensor.

And when you do something like c = a + 1, c is an intermediate node. You can print(c.is_leaf) to check whether a tensor is a leaf node or not. By default, PyTorch does not keep the gradient of intermediate nodes.
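
For example, here is a minimal sketch (the tensor names are just illustrative, not taken from the question):

import torch

a = torch.nn.Parameter(torch.randn(1))       # leaf node: created directly by the user
t = torch.tensor([2.0], requires_grad=True)  # also a leaf node
c = a + 1                                     # intermediate node: result of an operation

print(a.is_leaf)  # True
print(t.is_leaf)  # True
print(c.is_leaf)  # False

loss = (c * t).sum()
loss.backward()
print(a.grad)  # tensor([2.]) -- gradients are accumulated on leaf nodes
print(c.grad)  # None -- gradients of intermediate nodes are not retained by default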

In your code snippet, a, b and d are all leaf tensors, while c is an intermediate node. c.grad will be None because PyTorch does not store gradients for intermediate nodes. And a is cut off from the graph: wrapping c in torch.nn.Parameter(...) copies its value into a brand-new leaf tensor d with no history, so when you call loss.backward() the backward pass stops at d and never reaches a. That's why a.grad is also None.
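
A minimal sketch of that cut, assuming the same pattern as in the question (the behaviour matches the output you posted):

import torch

a = torch.nn.Parameter(torch.randn(1))
c = a + 1                     # intermediate node, still connected to a
d = torch.nn.Parameter(c)     # copies c's value into a brand-new leaf with no history

print(c.grad_fn)  # <AddBackward0 ...> -- c remembers it was computed from a
print(d.grad_fn)  # None -- d is a fresh leaf, detached from a
print(d.is_leaf)  # True

d.sum().backward()
print(d.grad)     # tensor([1.])
print(a.grad)     # None -- the backward pass stops at d and never reaches a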

If you change the code to this

a = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float, device=device))
b = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float, device=device))
c = a + 1
d = c
for epoch in range(n_epochs):
    yhat = d + b * x_train_tensor
    error = y_train_tensor - yhat
    loss = (error ** 2).mean()
    loss.backward()
    print(a.grad) # Not None
    print(b.grad) # Not None
    print(c.grad) # None
    print(d.grad) # None

You will find that a and b have gradients, but c.grad and d.grad are None, because c and d are intermediate nodes.


2 Comments

Is there a way for d to stay a parameter but as an intermediate node?
@algogator I think older versions of PyTorch let you set .requires_grad to True on an intermediate node, but the latest version does not seem to allow that.
