Correct way to apply gradients in TF2 custom training loop with multiple Keras models

Issue

I am implementing a custom training loop with tf.GradientTape that involves multiple Keras models.
I have 3 networks, model_a, model_b, and model_c, and I have created a list holding their trainable_weights:

trainables = list()
trainables.append(model_a.trainable_weights) # ConvNet
trainables.append(model_b.trainable_weights) # ConvNet
trainables.append(model_c.trainable_weights) # Fully Connected Network

I then calculate the loss and try to apply gradients as:

loss = 0.
optimizer = tf.keras.optimizers.Adam()
for x, y in train_dataset:
    with tf.GradientTape() as tape:
        y_pred = ...
        loss = ... # custom loss function!
    gradients = tape.gradient(loss, trainables)
    optimizer.apply_gradients(zip(gradients, trainables))

But I get the following error and I am not sure where the mistake is:

AttributeError: 'list' object has no attribute '_in_graph_mode'

If I iterate over gradients and trainables and apply the gradients per model, it works, but I am not sure whether this is the right way to do it:

for i in range(len(gradients)):
    optimizer.apply_gradients(zip(gradients[i], trainables[i]))

Solution

The problem is that optimizer.apply_gradients expects (gradient, variable) pairs of individual variables, but trainables is a list of lists, so zip(gradients, trainables) pairs up whole sublists instead (hence the AttributeError on 'list'). You can solve this by concatenating all the trainable weights into a single flat list:

trainables = model_a.trainable_weights + model_b.trainable_weights + model_c.trainable_weights
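Putting it together, here is a minimal runnable sketch of the training step. The three models and the loss are hypothetical stand-ins (the question's actual architectures and custom loss are not shown); note that the models must be built (called once) before trainable_weights is populated:

```python
import tensorflow as tf

# Hypothetical stand-ins for the question's models
model_a = tf.keras.Sequential([tf.keras.layers.Conv2D(4, 3)])  # ConvNet
model_b = tf.keras.Sequential([tf.keras.layers.Conv2D(4, 3)])  # ConvNet
model_c = tf.keras.Sequential([tf.keras.layers.Dense(1)])      # FC network

# Call each model once so its weights are created
xa = tf.random.normal((2, 8, 8, 1))
xc = tf.random.normal((2, 10))
model_a(xa); model_b(xa); model_c(xc)

# Concatenate into ONE flat list of variables (not a list of lists)
trainables = (model_a.trainable_weights
              + model_b.trainable_weights
              + model_c.trainable_weights)

optimizer = tf.keras.optimizers.Adam()

with tf.GradientTape() as tape:
    # Placeholder loss touching all three models; substitute your own
    loss = (tf.reduce_mean(model_a(xa) ** 2)
            + tf.reduce_mean(model_b(xa) ** 2)
            + tf.reduce_mean(model_c(xc) ** 2))

# One flat gradient list, paired one-to-one with the variables
gradients = tape.gradient(loss, trainables)
optimizer.apply_gradients(zip(gradients, trainables))
```

With the flat list, zip yields one (gradient, variable) pair per variable, which is exactly what apply_gradients requires, and a single optimizer step updates all three models at once.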

Answered By – rvinas

This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
