
In this PyTorch code:

import torch
a = torch.tensor([2.], requires_grad=True)
y = torch.zeros((10))
gt = torch.zeros((10))

y[0] = a
y[1] = y[0] * 2
y.retain_grad()

loss = torch.sum((y-gt) ** 2)
loss.backward()
print(y.grad)

I want y[0]'s gradient to consist of 2 parts:

  1. the loss backpropagated to y[0] itself;
  2. y[0] is used to calculate y[1], so it should also receive its share of y[1]'s gradient.

But when I run this code, y[0]'s gradient only contains part 1.

So how can I make y[0]'s gradient include both parts?

Edit: the output is:

tensor([4., 8., 0., 0., 0., 0., 0., 0., 0., 0.])

but I expect:

tensor([20., 8., 0., 0., 0., 0., 0., 0., 0., 0.])
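(To spell out where I expect the 20 to come from: part 1 is dloss/dy[0] = 2*y[0] = 4, and part 2 is dloss/dy[1] * dy[1]/dy[0] = 8 * 2 = 16, so 4 + 16 = 20.)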
  • I am a beginner - Hi @Shai @Decarbonized formaldehyde - how is a.grad = (y[0] - 0)^2 + (y[1] - 0)^2? Commented Oct 2, 2023 at 6:36

1 Answer


y[0] and y[1] are two different elements of y, so each gets its own grad. The only thing that "binds" them is their underlying relation to a. If you inspect the grad of a, you'll see:

print(a.grad)
tensor([20.])

That is, the two parts of the gradients are combined in a.grad.
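As a minimal sketch of that chain rule (reusing the exact setup from the question, with no values beyond the ones already printed there), you can check that a.grad is the sum of both paths:

import torch

a = torch.tensor([2.], requires_grad=True)
y = torch.zeros((10))
gt = torch.zeros((10))

y[0] = a
y[1] = y[0] * 2
y.retain_grad()

loss = torch.sum((y - gt) ** 2)
loss.backward()

# Per-element gradients of the loss w.r.t. y, as printed in the question.
print(y.grad[0], y.grad[1])           # tensor(4.) tensor(8.)

# Chain rule: both paths from the loss back to `a` are accumulated into a.grad:
# dloss/da = dloss/dy[0] * dy[0]/da + dloss/dy[1] * dy[1]/da = 4 * 1 + 8 * 2 = 20
print(y.grad[0] * 1 + y.grad[1] * 2)  # tensor(20.)
print(a.grad)                         # tensor([20.])

In other words, the per-element values in y.grad are not missing anything; the part that flows from y[1] back through y[0] only shows up once you follow the graph all the way back to a.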
