Saving the history = model.fit() variable so I can reuse it in a new VS Code Jupyter session. Dill and %store do not work

Issue

I am doing transfer learning with TensorFlow and I want to be able to run

history = model.fit(...) # Run initial training with base_model.trainable = False

After the first training run is done, I fine-tune the model by unfreezing some layers. If the first session ran for 20 epochs, my next block of code is:

# Train the model again for a few epochs
fine_tune_epochs = 10
total_epochs = len(history.epoch) + fine_tune_epochs
history_tuned = model.fit(train_set, validation_data = dev_set, initial_epoch=history.epoch[-1], epochs=total_epochs,verbose=2, callbacks=callbacks)

Basically, this takes the epochs from history, continues training from the last epoch, and saves the results in history_tuned.

But I might want to train again with more layers unfrozen, so I would run another fit into history_tuned02 and keep chaining the epochs from each history, so that my graphs look like a single run, as in the image below.
This graph contains the original training session + the fine-tuned session

As you can see from the graph, it’s all connected together, but in reality it is two different training sessions: the first one, where the base model is frozen, and then the fine-tuning session. You can even tell where fine-tuning starts from the bump in performance.

The problem is, for me to do this I have to leave Jupyter open for days, because if I close it, all the variables are gone and I would need to train everything again, which would take insane amounts of time.

I tried using the dill package, but it would not work on history. I also tried %store history, but that did not work either, as you can see from the image below, taken from a dummy notebook where I test things.
%store does not work on history variable

So is there a way to save the history variable to disk, close Jupyter, open it again, restore history and continue my work? Even if I leave Jupyter and VS Code open until I finish with the model, crashes do happen.

I also use the checkpoint callback in TensorFlow, so my weights are saved and restoring them is not a problem, but I need history as well if possible.

UPDATE:

When I use the CSVLogger callback as suggested and read the log with

history = pd.read_csv('demo/logs/hist.log')

then

history.head()

The output is

history.head() output

Solution

You can save your history in two ways:

The manual method:

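Once model.fit() has returned, save history.history with pickle. It is a plain dictionary of lists, so it pickles without trouble:

import pickle

# adjust the path to a location you can write to
with open('/history_dict', 'wb') as file:
    pickle.dump(history.history, file)

You can then reload it with:

with open('/history_dict', 'rb') as file:
    history = pickle.load(file)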
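To get the single connected graph described in the question, the dictionaries from each session can be saved separately and joined after reloading. The following is only a sketch under stated assumptions: the helper names (save_history, load_history, merge_histories), the file names and the metric values are all hypothetical, and the plain dicts stand in for history.history and history_tuned.history.

```python
import os
import pickle
import tempfile


def save_history(history_dict, path):
    """Pickle a plain dict of metric lists (e.g. history.history)."""
    with open(path, "wb") as f:
        pickle.dump(history_dict, f)


def load_history(path):
    """Load a dict previously written by save_history."""
    with open(path, "rb") as f:
        return pickle.load(f)


def merge_histories(*history_dicts):
    """Concatenate per-epoch metric lists from several sessions, in order."""
    merged = {}
    for h in history_dicts:
        for metric, values in h.items():
            merged.setdefault(metric, []).extend(values)
    return merged


# Hypothetical numbers standing in for history.history and history_tuned.history
frozen = {"loss": [0.9, 0.7, 0.6], "val_loss": [1.0, 0.8, 0.7]}
tuned = {"loss": [0.5, 0.4], "val_loss": [0.65, 0.6]}

with tempfile.TemporaryDirectory() as tmp:
    frozen_path = os.path.join(tmp, "history_frozen.pkl")
    tuned_path = os.path.join(tmp, "history_tuned.pkl")
    save_history(frozen, frozen_path)
    save_history(tuned, tuned_path)
    # In a fresh session only the two load calls would be needed
    restored = merge_histories(load_history(frozen_path), load_history(tuned_path))

fine_tune_start = len(frozen["loss"])  # epoch index where fine-tuning begins
print(restored["loss"], fine_tune_start)
```

The merged lists can be plotted directly, and fine_tune_start marks where a vertical line for the start of fine-tuning would go.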

The automated method:

You can create a simple callback that stores your history at the end of every epoch. That way, even if training crashes, everything up to the last completed epoch is already on disk and can be restored.

The callback can be something like this:

from tensorflow import keras
import tensorflow.keras.backend as K
import os
import csv

my_dir = './model_dir' # where to save history
history_path = os.path.join(my_dir, 'history.csv')

class SaveHistory(keras.callbacks.Callback):

  def on_epoch_end(self, epoch, logs=None):
    logs = logs if logs is not None else {}
    if 'lr' not in logs:
      # on recent Keras versions you may need optimizer.learning_rate instead
      logs['lr'] = K.get_value(self.model.optimizer.lr)

    # write the header only once, when the file does not exist yet
    os.makedirs(my_dir, exist_ok=True)
    write_header = not os.path.exists(history_path)

    with open(history_path, 'a', newline='') as f:
      content = csv.DictWriter(f, logs.keys())
      if write_header:
        content.writeheader()
      content.writerow(logs)

model.fit(..., callbacks=[SaveHistory()])

To reload the history saved as a .csv, simply do:

import pandas as pd
history = pd.read_csv('./model_dir/history.csv') # same path the callback wrote to

Besides the custom callback, you can also save the history alongside your model checkpoints with the built-in CSVLogger callback, like this:

history = model.fit(..., callbacks=[keras.callbacks.CSVLogger('history.csv')])

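This can be loaded back with pandas as shown above. By default CSVLogger overwrites an existing file. Pass append=True if later training sessions should add their epochs to the same file.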
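Note that what comes back from the CSV is a table, not a Keras History object, so attributes such as history.epoch are not available and initial_epoch has to be derived from the file. CSVLogger writes a zero-based epoch column, which makes this straightforward. The sketch below uses the standard csv module rather than pandas only to stay dependency-free. The helper names, the demo file and its metric columns and values are hypothetical, and resuming at "last logged epoch + 1" is an assumption about how you want to number epochs.

```python
import csv
import os
import tempfile


def read_history_csv(path):
    """Return {column: [float, ...]} from a CSVLogger-style file."""
    columns = {}
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            for name, value in row.items():
                columns.setdefault(name, []).append(float(value))
    return columns


def next_initial_epoch(columns):
    """Epoch index to pass as initial_epoch: one past the last logged epoch."""
    if not columns.get("epoch"):
        return 0  # nothing logged yet, start from the beginning
    return int(columns["epoch"][-1]) + 1


# Hypothetical log contents standing in for a real CSVLogger file
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "hist.log")
    with open(demo_path, "w", newline="") as f:
        f.write("epoch,accuracy,loss,val_accuracy,val_loss\n")
        f.write("0,0.61,0.92,0.58,0.97\n")
        f.write("1,0.70,0.74,0.66,0.81\n")
        f.write("2,0.75,0.63,0.69,0.74\n")
    history = read_history_csv(demo_path)

initial_epoch = next_initial_epoch(history)
fine_tune_epochs = 10
total_epochs = initial_epoch + fine_tune_epochs
print(initial_epoch, total_epochs)
```

These two values can then be passed to model.fit(..., initial_epoch=initial_epoch, epochs=total_epochs). With pandas, the equivalent of next_initial_epoch would be int(df['epoch'].iloc[-1]) + 1.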

Answered By – claudia

This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0
