
In this PyTorch code:

import torch
a = torch.tensor([2.], requires_grad=True)
y = torch.zeros((10))
gt = torch.zeros((10))

y[0] = a
y[1] = y[0] * 2
y.retain_grad()

loss = torch.sum((y-gt) ** 2)
loss.backward()
print(y.grad)

I want y[0]'s gradient to consist of 2 parts:

  1. the loss backpropagated to y[0] itself;
  2. y[0] is used to calculate y[1], so it should also receive its share of y[1]'s gradient.

But when I run this code, y[0]'s gradient only contains part 1.

So how can I make y[0]'s gradient include both parts?

Edit: the output is:

tensor([4., 8., 0., 0., 0., 0., 0., 0., 0., 0.])

but I expect:

tensor([20., 8., 0., 0., 0., 0., 0., 0., 0., 0.])
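(To spell out where I expect the 20 to come from: part 1 is dloss/dy[0] = 2*y[0] = 4, and part 2 is dloss/dy[1] * dy[1]/dy[0] = 8 * 2 = 16, so 4 + 16 = 20.)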
  • I am a beginner - Hi @Shai @Decarbonized formaldehyde - how is a.grad = (y[0] - 0)^2 + (y[1] - 0)^2? Commented Oct 2, 2023 at 6:36

1 Answer


y[0] and y[1] are two different elements of y, so each gets its own grad. The only thing that "binds" them is their underlying relation to a. If you inspect the grad of a, you'll see:

print(a.grad)
tensor([20.])

That is, the two parts of the gradients are combined in a.grad.
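As a minimal sketch of that chain rule (reusing the exact setup from the question, with no values beyond the ones already printed there), you can check that a.grad is the sum of both paths:

import torch

a = torch.tensor([2.], requires_grad=True)
y = torch.zeros((10))
gt = torch.zeros((10))

y[0] = a
y[1] = y[0] * 2
y.retain_grad()

loss = torch.sum((y - gt) ** 2)
loss.backward()

# Per-element gradients of the loss w.r.t. y, as printed in the question.
print(y.grad[0], y.grad[1])           # tensor(4.) tensor(8.)

# Chain rule: both paths from the loss back to `a` are accumulated into a.grad:
# dloss/da = dloss/dy[0] * dy[0]/da + dloss/dy[1] * dy[1]/da = 4 * 1 + 8 * 2 = 20
print(y.grad[0] * 1 + y.grad[1] * 2)  # tensor(20.)
print(a.grad)                         # tensor([20.])

In other words, the per-element values in y.grad are not missing anything; the part that flows from y[1] back through y[0] only shows up once you follow the graph all the way back to a.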
