
I've run into a problem and I don't know why. I created a tensor with torch.tensor(), and my goal is to compute the gradient of y = 2*x. Setting requires_grad=True at the very beginning worked: I called backward() and got the gradient.

I took the steps above as the pattern and wanted to see whether it also works for each element of the vector x, so I wrote the for-loop below. But the new steps print None instead of tensor(2.).

I tried handling a single element separately outside the loop (the scalar example at the end of the code below) and that worked.

I'm confused. Please tell me why. Thank you very much!

import torch
x = torch.tensor([1.0,2.0,3.0,7.0],requires_grad=True) #vector
y = 2*x #vector
# backward() can only be called implicitly on a scalar, so summing first would work:
#y.sum().backward()
#print(x.grad)
#x.requires_grad_(True)
for i in x:
    i.requires_grad_(True)
    print(i)
    z = 2 * i
    z.backward()
    print(i.grad)



a = torch.tensor(1.0,requires_grad=True)
b = 2 * a
b.backward()
print(a)
print(a.grad)

The output is:

tensor(1., grad_fn=<UnbindBackward0>)
None
tensor(2., grad_fn=<UnbindBackward0>)
None
tensor(3., grad_fn=<UnbindBackward0>)
None
tensor(7., grad_fn=<UnbindBackward0>)
None
tensor(1., requires_grad=True)
tensor(2.)

1 Answer


I think you should consider specifying a loss function. Quoting from the official PyTorch getting-started tutorial:

To optimize weights of parameters in the neural network, we need to compute the derivatives of our loss function with respect to parameters...

You can refer to the tutorial part about gradients by visiting the following link: AUTOMATIC DIFFERENTIATION WITH TORCH.AUTOGRAD
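To make that concrete, here is a minimal sketch of the pattern the tutorial describes (the parameter name w and the target value 10.0 are just illustrative choices of mine, not from the tutorial or the question): build a scalar loss from a leaf parameter, call backward(), and read the gradient from that leaf.

import torch
w = torch.tensor(3.0, requires_grad=True)  # leaf "parameter" (illustrative value)
y = 2 * w                                  # forward computation
loss = (y - 10.0) ** 2                     # scalar loss (illustrative target)
loss.backward()                            # populates w.grad because w is a leaf
print(w.grad)                              # d(loss)/dw = 4*(2*w - 10) = tensor(-16.)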

I hope my answer is helpful to you.

New edit: Firstly, iterating over a tensor with for i in x gives you non-leaf views of x rather than the leaf tensor itself, so you should not access the elements of x that way. In particular, this line

for i in x:

should be changed to

for d in range(x.size()[0]):
    i = x[d]

Secondly, running the code gives a UserWarning, and I quote:

UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations.

So changing the code into the following will allow you to calculate the gradient in the loop:

import torch
x = torch.tensor([1.0,2.0,3.0,7.0], requires_grad=True) #vector
y = 2 * x
for d in range(x.size()[0]):
    i = x[d]          # a non-leaf view of x (it has a grad_fn)
    i.retain_grad()   # ask autograd to populate .grad for this non-leaf tensor
    z = 2 * i
    z.backward()
    print(i)
    print(i.grad)

and the output will be:

tensor(1., grad_fn=<SelectBackward0>)
tensor(2.)
tensor(2., grad_fn=<SelectBackward0>)
tensor(2.)
tensor(3., grad_fn=<SelectBackward0>)
tensor(2.)
tensor(7., grad_fn=<SelectBackward0>)
tensor(2.)
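As a side note, the difference between the two cases in the question is the leaf status of the tensors: iterating over x yields views with a grad_fn (the UnbindBackward0 in the question's output), which are not leaf tensors, so their .grad is not populated by default, while a tensor created directly with requires_grad=True is a leaf. The following quick check is mine (not part of the original question) and also shows the vector-level alternative hinted at by the question's commented-out lines:

import torch
x = torch.tensor([1.0, 2.0, 3.0, 7.0], requires_grad=True)
a = torch.tensor(1.0, requires_grad=True)
print(next(iter(x)).is_leaf)  # False: produced by iteration, has a grad_fn
print(a.is_leaf)              # True: created directly with requires_grad=True

# Vector-level alternative from the question's commented-out lines:
y = 2 * x
y.sum().backward()            # reduce to a scalar, then one backward pass
print(x.grad)                 # tensor([2., 2., 2., 2.])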

3 Comments

I think you misunderstood my question. My question is: why do I get None as the output inside the loop, but the correct gradient of the tensor outside the loop?
Because of "The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward()." Inside the for loop you are accessing the tensor in a different way than when you access it outside a for loop.
Thank you for your help!
