I am using custom Recall and Precision metrics in my model. I know they have them built into Keras but I only care about one of the classes.
As an epoch begins, the metrics print values, but after many steps one metric returns NaN, and a few hundred steps later the second custom metric shows NaN as well.
The recall metric is written in the same way.
def precision(y_true, y_pred):
    '''
    Calculates precision metric over gun label
    Precision = TP/(TP+FP)
    '''
    # I only care about the last label
    y_true = y_true[:, -1]
    y_pred = y_pred[:, -1]
    y_pred = tf.where(y_pred > .5, 1, 0)
    y_pred = tf.cast(y_pred, tf.float32)
    y_true = tf.cast(y_true, tf.float32)
    true_positives = K.sum(y_true * y_pred)
    false_positive = tf.math.reduce_sum(
        tf.where(tf.logical_and(tf.not_equal(y_true, y_pred), y_pred == 1), 1, 0))
    false_positive = tf.cast(false_positive, tf.float32)
    precision = true_positives / (true_positives + false_positive)
    return precision
I am training a multi-label model, so my last dense layer is:

preds = Dense(num_classes, activation='sigmoid', name='Classifier')(x)
model.compile(loss='binary_crossentropy', optimizer=optimizer,
              metrics=['accuracy', precision, recall])
model.fit(train_ds, steps_per_epoch=10000, validation_data=valid_ds,
          validation_steps=1181, epochs=200)
   18/10000 [............] - ETA: 6:43 - loss: 0.6919 - accuracy: 0.0046 - precision: 0.2597 - recall: 0.4691
  315/10000 [...........] - ETA: 7:56 - loss: 0.4174 - accuracy: 0.1145 - precision: nan - recall: 0.6115
10000/10000 [=========>] - ETA: 0s - loss: 0.0797 - accuracy: 0.5432 - precision: nan - recall: nan
10000/10000 [=========>] - 576s 56ms/step - loss: 0.0797 - accuracy: 0.5432 - precision: nan - recall: nan - val_loss: 0.0557 - val_accuracy: 0.5807 - val_precision: 0.9698 - val_recall: 0.9529
At the beginning of each epoch, the metrics show numbers again, but after many steps they go back to NaN. From observation, I can confirm they do not go to 0 or 1 right before turning NaN.
The issue was a divide by zero. Adding a small value to each denominator solves the problem. The division by zero occurs whenever the network makes no positive predictions in a batch, which is why it happened intermittently.
import tensorflow.keras.backend as K

precision = true_positives / (true_positives + false_positive + K.epsilon())
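To see why the epsilon guard works, here is a framework-free sketch of the same metric in plain NumPy (the name precision_safe and the eps default are illustrative, standing in for K.epsilon() in the Keras version). When a batch contains no positive predictions, both TP and FP are 0, so TP / (TP + FP) is 0/0 = NaN; with eps in the denominator it becomes a clean 0.0 instead.

```python
import numpy as np

def precision_safe(y_true, y_pred, eps=1e-7):
    """Precision over the last label only, guarded against 0/0.

    eps plays the role of K.epsilon() in the Keras metric.
    """
    y_true = y_true[:, -1].astype(np.float32)
    # Threshold sigmoid outputs at 0.5, as in the original metric
    y_pred = (y_pred[:, -1] > 0.5).astype(np.float32)
    tp = np.sum(y_true * y_pred)
    fp = np.sum((y_true != y_pred) & (y_pred == 1))
    return tp / (tp + fp + eps)

# A batch where the network predicts no positives for the last label:
# without eps this would be 0/0 = NaN
y_true = np.array([[0, 1], [0, 0]])
y_pred = np.array([[0.1, 0.2], [0.3, 0.4]])
print(precision_safe(y_true, y_pred))  # -> 0.0, not NaN
```

The trade-off is that an all-negative batch now reports a precision of 0.0 rather than NaN, which is reasonable for a progress-bar metric but slightly understates precision when averaged across batches.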
Answered By – theastronomist