The clear_session() method of keras.backend does not clean up the fitting data

Issue

I am comparing the fitting accuracy obtained with data of different quality. "Good" data contain no NA in the feature values; "bad" data contain NA in the feature values. The "bad" data have to be fixed by some value correction, for example replacing each NA with zero or with the feature mean.
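
For illustration, here is a minimal sketch of such a correction (this assumes the features sit in a NumPy array; fix_na is a hypothetical helper name, not part of my actual code):

import numpy as np

def fix_na(x, strategy="zero"):
    # Replace NaN feature values either with zero or with the per-column mean
    x = x.copy()
    nan_mask = np.isnan(x)
    if strategy == "zero":
        x[nan_mask] = 0.0
    else:  # "mean"
        col_means = np.nanmean(x, axis=0)   # column means, ignoring NaNs
        rows, cols = np.where(nan_mask)
        x[rows, cols] = col_means[cols]
    return x

# e.g. xTrainFixed = fix_na(xTrainBad, strategy="mean")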

In my code, I perform several model fittings in a row.

Here is the simplified code:

from keras import backend as K
...

xTrainGood = ...  # the good version of the xTrain data
xTrainBad = ...   # the bad version of the xTrain data
...

model = Sequential()
model.add(...)
...

historyGood = model.fit(..., xTrainGood, ...)  # fitting the model with the original
                                               # data, without NA, zeroes, or the
                                               # feature mean values

Here is the fitting accuracy plot, based on the historyGood data:

[accuracy plot for historyGood]

After that, the code resets the stored model and re-trains it with the "bad" data:

K.clear_session()

historyBad = model.fit(..., xTrainBad, ...)

Here are the fitting results, based on the historyBad data:

[accuracy plot for historyBad]

As one can see, the initial accuracy is already above 0.7, which means the model "remembers" the previous fitting.

For comparison, here are the results of a standalone fit on the "bad" data:

[accuracy plot for a standalone fit on the "bad" data]

How can I reset the model to its initial state?

Solution

K.clear_session() isn’t enough to reset states and ensure reproducibility. You’ll also need to:

  • Set (& reset) random seeds
  • Reset TensorFlow default graph
  • Delete previous model

Code accomplishing each below.

reset_seeds()
model = make_model() # example function to instantiate model
model.fit(x_good, y_good)

del model                          # delete the previous model object
K.clear_session()                  # clear the Keras/TF session state
tf.compat.v1.reset_default_graph() # reset the TF default graph

reset_seeds()
model = make_model()
model.fit(x_bad, y_bad)

Note that if other variables reference the model, you should del them as well, e.g. if you have model = make_model() and then model2 = model, you need del model, model2, otherwise the model may persist. Lastly, TF random seeds aren't as easily reset as random's or numpy's, and require the graph to be cleared beforehand.
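
For instance, a minimal sketch of that reference pitfall (reusing the make_model example from above; the second variable name is just illustrative):

model = make_model()
model2 = model                        # second reference to the same model object

del model, model2                     # both references must be deleted...
K.clear_session()                     # ...before clearing the session
tf.compat.v1.reset_default_graph()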


Function/modules used:

import tensorflow as tf
import numpy as np
import random
import keras.backend as K

def reset_seeds():
    np.random.seed(1)
    random.seed(2)
    if tf.__version__[0] == '2':
        tf.random.set_seed(3)
    else:
        tf.set_random_seed(3)
    print("RANDOM SEEDS RESET")

Answered By – OverLordGoldDragon

This answer was collected from Stack Overflow and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0
