Since it is not entirely clear to me what you actually want to achieve, besides computing gradients for parameter_current, I will just focus on describing why it doesn't work and what you can do to actually compute gradients.
I've added some comments in the code to make it more clear what the problem is.
But in short, the problem is that your parameter_current is not part of the computation of your loss, i.e. of the tensor you call backward() on, which is outputmySecondFunction.
So currently you are only computing gradients for i, since you have set requires_grad=True for it.
Please check the comments for details:
import torch

def myFirstFunction(parameter_current_here):
    # I removed some stuff to reduce it to the core features
    # removed torch.enable_grad(), since it is enabled by default
    # removed Optimal=100000000000000 and Optimal=i, they are not used
    optimalValue = 100000000000000
    for j in range(2, 10):
        # Are you sure you want to compute gradients for this tensor i?
        # Because this is actually what requires_grad=True does.
        # Just as a side note, this isn't your problem, but it affects the performance of the model.
        i = torch.ones(1, requires_grad=True) * j
        optimalValueNow = i * parameter_current_here.sum()
        if optimalValueNow < optimalValue:
            optimalValue = optimalValueNow
    # Part 1 of the problem:
    # optimalValueNow is multiplied with your parameter_current
    # i is just your parameter i, nothing else
    # let's now jump to outputMyFirstFunction below the loop
    return optimalValueNow, i

def mySecondFunction(Current):
    y = (20 * Current) / 2 + (Current ** 2) / 10
    return y

counter = 0
while counter < 5:
    parameter_current = torch.randn(2, 2, requires_grad=True)
    # Part 2 of the problem:
    # this is a tuple (optimalValueNow, i) as described above
    outputMyFirstFunction = myFirstFunction(parameter_current)
    # now you are taking i as an input
    # and i is just torch.ones(1, requires_grad=True)*j
    # it has no connection to parameter_current,
    # thus nothing is optimized
    outputmySecondFunction = mySecondFunction(outputMyFirstFunction[1])
    # when calculating gradients: since parameter_current is not part of the computation,
    # no gradients will be computed for it; you only get gradients for i
    # Btw. if you had not set requires_grad=True for i, you would actually get an error message
    # when calling backward() on this
    outputmySecondFunction.backward()
    print("outputMyFirstFunction after backward:", outputMyFirstFunction)
    print("outputmySecondFunction after backward:", outputmySecondFunction)
    print("parameter_current Gradient after backward:", parameter_current.grad)
    counter = counter + 1
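To see the core issue in isolation, here is a minimal sketch of the same situation (the names p and i here are my own, not from your code): backward() only produces gradients for tensors that the output was actually computed from.

```python
import torch

# p is never used to compute out, so backward() cannot reach it
p = torch.randn(2, 2, requires_grad=True)
i = torch.tensor([3.0], requires_grad=True)

out = (i ** 2).sum()  # depends only on i, not on p
out.backward()

print(p.grad)  # None - p is not part of the graph of out
print(i.grad)  # tensor([6.]) - the gradient flows to i only
```

This is exactly what happens in your code: the tensor you call backward() on was built from i alone, so parameter_current.grad stays empty.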
So if you want to compute gradients for parameter_current, you simply have to make sure it is part of the computation of the tensor you call backward() on. You can do so, for example, by changing:
outputmySecondFunction=mySecondFunction(outputMyFirstFunction[1])
to:
outputmySecondFunction=mySecondFunction(outputMyFirstFunction[0])
As soon as you change it, you will get gradients for parameter_current!
I hope it helps!
Full working code:
import torch

def myFirstFunction(parameter_current_here):
    optimalValue = 100000000000000
    for j in range(2, 10):
        i = torch.ones(1, requires_grad=True) * j
        optimalValueNow = i * parameter_current_here.sum()
        if optimalValueNow < optimalValue:
            optimalValue = optimalValueNow
    return optimalValueNow, i

def mySecondFunction(Current):
    y = (20 * Current) / 2 + (Current ** 2) / 10
    return y

counter = 0
while counter < 5:
    parameter_current = torch.randn(2, 2, requires_grad=True)
    outputMyFirstFunction = myFirstFunction(parameter_current)
    outputmySecondFunction = mySecondFunction(outputMyFirstFunction[0])  # changed line
    outputmySecondFunction.backward()
    print("outputMyFirstFunction after backward:", outputMyFirstFunction)
    print("outputmySecondFunction after backward:", outputmySecondFunction)
    print("parameter_current Gradient after backward:", parameter_current.grad)
    counter = counter + 1
Output:
outputMyFirstFunction after backward: (tensor([ 1.0394]), tensor([ 9.]))
outputmySecondFunction after backward: tensor([ 10.5021])
parameter_current Gradient after backward: tensor([[ 91.8709, 91.8709],
[ 91.8709, 91.8709]])
outputMyFirstFunction after backward: (tensor([ 13.1481]), tensor([ 9.]))
outputmySecondFunction after backward: tensor([ 148.7688])
parameter_current Gradient after backward: tensor([[ 113.6667, 113.6667],
[ 113.6667, 113.6667]])
outputMyFirstFunction after backward: (tensor([ 5.7205]), tensor([ 9.]))
outputmySecondFunction after backward: tensor([ 60.4772])
parameter_current Gradient after backward: tensor([[ 100.2969, 100.2969],
[ 100.2969, 100.2969]])
outputMyFirstFunction after backward: (tensor([-13.9846]), tensor([ 9.]))
outputmySecondFunction after backward: tensor([-120.2888])
parameter_current Gradient after backward: tensor([[ 64.8278, 64.8278],
[ 64.8278, 64.8278]])
outputMyFirstFunction after backward: (tensor([-10.5533]), tensor([ 9.]))
outputmySecondFunction after backward: tensor([-94.3959])
parameter_current Gradient after backward: tensor([[ 71.0040, 71.0040],
[ 71.0040, 71.0040]])
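As a quick sanity check on these numbers: in the first iteration, outputmySecondFunction is y = 10*C + C**2/10, where C = i * parameter_current.sum() and i = 9 (the last loop value). By the chain rule, every entry of the gradient is i * (10 + 2*C/10). Plugging in the printed C of roughly 1.0394:

```python
# Hand-checking the first printed gradient, using the values from the output above
i = 9.0
C = 1.0394  # outputMyFirstFunction[0] from the first iteration

# dy/dC = 10 + 2*C/10, and dC/dp_kl = i for every entry of parameter_current
grad = i * (10 + 2 * C / 10)
print(round(grad, 4))  # 91.8709 - matches parameter_current.grad
```

This confirms that with outputMyFirstFunction[0] the gradient really does flow back through parameter_current.sum() to every entry of parameter_current.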