I have a lot of files, which are a result of machine learning with YOLO model, generated by Tensorflow.
Each filename is named:
With the only differences, being:
013 – epoch/generationNumber
0016 – loss (kind of accuracy, but from training PoV)
228 – additional precision of loss (I think, but I’m not sure yet, so I consider it as separate for now)
It would be pretty simple, but it’s a combination of dashes, underscores and dots, given that I’m pretty new to python and what it’s capable of, I couldn’t find any solution that would fit this filename and I’m having difficulties writing a regexp for this.
For now my python "logic" selects the model solely based on which file was the last one saved, which is a start, but it’s far from ideal.
def findBestModel(): # "best" model is currently just the last model in directory last_model = max(glob.glob('data/models/*.h5'), key=os.path.getmtime) print('Selected model: ' + last_model) return last_model
What would be the right way to actually make findBestModel() return those 3 variables that I need, from each filename, without making it overcomplicated?
You could use re.match to return a
match object from each of the files you already obtained with
glob.glob. The use of
re.match with named groups will result in a dictionary with 3 separated values from your file name: epoch, loss and precision loss. The files can then be sorted using the
key parameter from
sorted using the loss value (
int(x.group('loss')) returned by the regex. The
match object has the attribute
string that can be used to return the string passed to
match(), which will correspond to the path to your best model.
data └── models ├── detection_model-ex-002--loss-3416.127.h5 ├── detection_model-ex-013--loss-0016.228.h5 ├── detection_model-ex-2462--loss-0173.093.h5 ├── detection_model-ex-486--loss-1933.981.h5 └── detection_model-ex-83713--loss-0001.048.h5
import glob import re def findBestModel(): models =  for model in glob.glob('./data/models/*.h5'): m = re.match( r'.*?-(?P<epoch>\d+)--' + r'loss-(?P<loss>\d+)\.' + r'(?P<precision>\d+)\.h5' , model) models.append(m) sorted_models = sorted(models, key=lambda x:int(x.group('loss'))) return sorted_models.string best_model = findBestModel() print(best_model) # './data/models/detection_model-ex-83713--loss-0001.048.h5'
Answered By – n1colas.m