Issue
I just downloaded the Caltech-256 dataset to my Google Drive and want to load it in Colab so I can use it to train my model.
How do I do that, or is there a better way?
This is what I have done so far:
from google.colab import drive
drive.mount('/content/gdrive')
then
!tar -xvf "/content/gdrive/MyDrive/Data_Clatech256/256_ObjectCategories.tar" -C "/content/gdrive/MyDrive/Data_Clatech256/"
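For reference, the same extraction can also be done from Python with the standard-library tarfile module instead of the shell command. A minimal sketch, using a tiny throwaway archive so it is self-contained (the paths here are illustrative, not the real Drive paths):

```python
import os
import tarfile
import tempfile

# Build a small throwaway archive so the sketch runs anywhere;
# for the real dataset you would open the .tar already on Drive instead.
work = tempfile.mkdtemp()
src = os.path.join(work, "img.txt")
with open(src, "w") as f:
    f.write("fake image bytes")

archive = os.path.join(work, "data.tar")
with tarfile.open(archive, "w") as tar:
    tar.add(src, arcname="256_ObjectCategories/img.txt")

# Extract into a target directory: same effect as `tar -xvf ... -C out`.
out = os.path.join(work, "extracted")
with tarfile.open(archive) as tar:
    tar.extractall(out)

print(sorted(os.listdir(out)))
```
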
I can read and write files on Drive, but how do I load the data so it is ready for training?
Solution
Have a look at ImageDataGenerator with .flow_from_directory(directory). ImageDataGenerator lets you do a lot of preprocessing and data augmentation on the fly, and calling .flow_from_directory(directory_of_your_ds) builds a pipeline that reads straight from your Drive. Here is an example from the Keras documentation:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    'data/train',
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    'data/validation',
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary')

model.fit(
    train_generator,
    steps_per_epoch=2000,
    epochs=50,
    validation_data=validation_generator,
    validation_steps=800)
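Adapting this to the question, you would point flow_from_directory at the extracted 256_ObjectCategories folder and use class_mode='categorical', since Caltech-256 has 257 classes rather than 2. A sketch of that pipeline, demonstrated here on a tiny synthetic two-class folder tree so it runs without the dataset (for the real data, set root to the extracted directory on your Drive; the class folder names below are made up):

```python
import os
import tempfile

import numpy as np
from PIL import Image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Synthetic stand-in for the extracted dataset: one subfolder per class,
# a few random JPEGs in each. flow_from_directory infers labels from folders.
root = tempfile.mkdtemp()
for cls in ("001.class_a", "002.class_b"):  # hypothetical class folders
    os.makedirs(os.path.join(root, cls))
    for i in range(4):
        Image.fromarray(
            np.random.randint(0, 255, (64, 64, 3), dtype=np.uint8)
        ).save(os.path.join(root, cls, f"{i}.jpg"))

datagen = ImageDataGenerator(rescale=1./255)
gen = datagen.flow_from_directory(
    root,
    target_size=(150, 150),
    batch_size=4,
    class_mode='categorical')  # one-hot labels; use this for Caltech-256

x, y = next(gen)
print(x.shape, y.shape)
```

The generator can then be passed straight to model.fit as in the documentation example above.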
I hope this is what you were looking for.
Answered By – Nicolai
This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.