While training my neural network, there was a sudden drop in validation accuracy during the 8th epoch. What does this mean?
Train for 281 steps, validate for 24 steps
281/281 [==============================] - 106s 378ms/step - loss: 1.5758 - accuracy: 0.8089 - val_loss: 1.8909 - val_accuracy: 0.4766
281/281 [==============================] - 99s 353ms/step - loss: 1.5057 - accuracy: 0.8715 - val_loss: 1.7364 - val_accuracy: 0.6276
281/281 [==============================] - 99s 353ms/step - loss: 1.4829 - accuracy: 0.8929 - val_loss: 1.5347 - val_accuracy: 0.8398
281/281 [==============================] - 99s 353ms/step - loss: 1.4445 - accuracy: 0.9301 - val_loss: 1.5551 - val_accuracy: 0.8047
281/281 [==============================] - 99s 353ms/step - loss: 1.4331 - accuracy: 0.9412 - val_loss: 1.5043 - val_accuracy: 0.8659
281/281 [==============================] - 97s 344ms/step - loss: 1.4100 - accuracy: 0.9639 - val_loss: 1.5562 - val_accuracy: 0.8151
281/281 [==============================] - 96s 342ms/step - loss: 1.4140 - accuracy: 0.9585 - val_loss: 1.4935 - val_accuracy: 0.8737
281/281 [==============================] - 96s 341ms/step - loss: 1.4173 - accuracy: 0.9567 - val_loss: 1.7569 - val_accuracy: 0.6055
281/281 [==============================] - 96s 340ms/step - loss: 1.4241 - accuracy: 0.9490 - val_loss: 1.4756 - val_accuracy: 0.9023
281/281 [==============================] - 96s 340ms/step - loss: 1.4067 - accuracy: 0.9662 - val_loss: 1.4167 - val_accuracy: 0.9648
Sudden drops in training and validation metrics occur because of mini-batch training; convergence would be smooth only if we trained on the entire dataset at once, not on batches. Therefore, it is normal to see such drops (both for training and for validation).
The epoch with the drop is the 8th, with a val_accuracy of 0.6055. If you look at the validation loss, it increased by only 0.26; however, this came with a 27-percentage-point decrease in your validation accuracy (from 0.8737 to 0.6055). In this case, it is because your model is not certain when it makes a prediction (at least at this stage of training).
Imagine that you have a binary classification model (apples vs. oranges). For an image whose ground truth is an apple, suppose the network is 51% confident that it is an apple. Since, as Keras does behind the curtains, the default confidence threshold is 50%, every such prediction counts as correct and you get good accuracy.
However, now comes the 'problematic' epoch. Because the weights of your neural network have changed after another epoch of training, when you predict on your validation dataset you now get a confidence of 48-49% for each ground-truth apple. Again, since the threshold is 50%, you get much poorer accuracy than in the previous epoch.
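The threshold effect can be sketched numerically. This is a hypothetical toy example (the 8 images, labels, and confidences are invented for illustration), showing how Keras-style binary accuracy with a 0.5 threshold flips from perfect to zero when confidences slip from 51% to 49%:

```python
import numpy as np

# Hypothetical validation set: 8 images, all ground-truth "apple" (label 1).
y_true = np.ones(8)

# Epoch N: the model is barely confident, but on the right side of 0.5.
p_before = np.full(8, 0.51)
# Epoch N+1: weights shifted slightly; confidence dips just below 0.5.
p_after = np.full(8, 0.49)

# Binary accuracy as Keras computes it: threshold predictions at 0.5,
# then compare with the labels.
acc_before = np.mean((p_before > 0.5) == y_true)
acc_after = np.mean((p_after > 0.5) == y_true)
print(acc_before, acc_after)  # 1.0 0.0
```

A 2-percentage-point shift in confidence moved every prediction across the threshold, so accuracy collapsed from 100% to 0% even though the model's beliefs barely changed.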
As you can now infer from the explanation above, this particular case affects the accuracy much more than the loss. It does not affect the loss that much, because a shift in predicted confidence from 51% to 48-49% is not a very significant difference in the overall loss (as you see in your case, an increase of only 0.26). In the end, even in the 'previous' epoch, when the model correctly predicted an apple, it was not extremely confident, yielding only 51% confidence for an apple, not 95% for instance.
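The claim that the loss barely moves can be checked with the same hypothetical numbers (again assuming 8 all-apple images, confidences of 51% vs. 49%), by computing binary cross-entropy by hand:

```python
import numpy as np

def bce(y_true, p):
    # Binary cross-entropy, averaged over samples.
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y_true = np.ones(8)
loss_51 = bce(y_true, np.full(8, 0.51))  # -ln(0.51) ~= 0.673
loss_49 = bce(y_true, np.full(8, 0.49))  # -ln(0.49) ~= 0.713
print(round(loss_51, 3), round(loss_49, 3))
```

The loss rises by only about 0.04 while accuracy (at the 0.5 threshold) goes from 100% to 0%: accuracy is a step function of the confidence, whereas cross-entropy is smooth in it.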
Answered By – Timbus Calin