How can I differentiate a neural network with multi-outputs?

Issue

Since we know the automatic differentiation is achieved by tf.GradientTape in Python, like:

with tf.GradientTape(persistent=True) as tape1:
    func_1 = u(x, y)
d_fun1_dx, d_fun1_dy = tape1.gradient(func_1, [x, y])
del tape1

it could get the derivative of a single output neural network.

And i have an neural network with two inputs x, y and two outputs f1, f2. I want to get df1/dx, df1/dy, df2/dx, df2/dy, how can i achieve this?

Solution

What you are looking for is a Jacobian, not a gradient. It is implemented in tf under tape1.jacobian and will return a jacobian matrix of partial derivatives.

Example from the documentation:

with tf.GradientTape() as g:
  x  = tf.constant([1.0, 2.0])
  g.watch(x)
  y = x * x
jacobian = g.jacobian(y, x)
# jacobian value is [[2., 0.], [0., 4.]]

That being said, use of Jacobians usually requires more advanced methods, what are you planning to do with it really will guide you if you really need a Jacobian. For example if you were to simply use "gradient descent" you need to now make a decision what to do with 2 gradients per parameter. Are you going to analyse them? Are you just going to add them? If you were to add them than note that

(dy/dx) f(x) + (dz/dx) f(x) = (d/dx) [ f.z(x) + f.y(x) ]

so it is equivalent to just adding outputs and computing normal gradient. There are of course uses of Jacobians but they go much beyond typical gradient descent algorithms.

Answered By – lejlot

This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply

(*) Required, Your email will not be published