How to load an existing local dataset into TensorFlow and scale images from [0,255] to [-1,1]

Issue

I have my own dataset, split into Train and Test directories, like this:

LFW-A:
 |
 |
 |___ Train
        |
        |
        |___images...
 |
 |
 |___ Test
        |
        |
        |___images...

Currently, I am loading the Fashion-MNIST dataset like this:

(trainX, trainy), (testX, testY) = keras.datasets.fashion_mnist.load_data()

My own dataset is in the same directory. How can I load it instead of the built-in Fashion-MNIST?

This is my function:

from numpy import expand_dims
from tensorflow.keras.datasets.fashion_mnist import load_data

# load fashion mnist images
def load_real_samples():
    # load dataset
    (trainX, trainy), (testX, testY) = load_data()
    # expand to 3d, e.g. add channels
    X = expand_dims(trainX, axis=-1)
    # convert from ints to floats
    X = X.astype('float32')
    # scale from [0,255] to [-1,1]
    X = (X - 127.5) / 127.5
    return [X, trainy]

Solution

You can use ImageDataGenerator with a preprocessing_function to scale the images from [0,255] to [-1,1], and flow_from_directory to load them from a local path. flow_from_directory expects one subdirectory per class, so suppose the local dataset is laid out as data/train/class_1, data/train/class_2 and data/train/class_3, with the same three class folders under data/test.
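
Because flow_from_directory infers the labels from the subfolder names, it can help to confirm the layout before building the generators. The following is a minimal sketch using only the standard library; the helper name count_images_per_class and the list of file extensions are assumptions made for illustration and are not part of the original answer.

```python
from pathlib import Path


def count_images_per_class(root, exts=(".png", ".jpg", ".jpeg")):
    """Return {class_name: image_count} for each subdirectory of `root`.

    Hypothetical helper: files directly under `root` are ignored, which
    mirrors the fact that flow_from_directory only looks inside class folders.
    """
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            # Count files whose extension (case-insensitive) looks like an image.
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir() if f.suffix.lower() in exts
            )
    return counts
```

With the random images generated for this answer, count_images_per_class('data/train') should report 10 images for each of the three classes, which matches the 30 images that flow_from_directory finds.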

Reading from local path

import numpy as np
import tensorflow as tf

def preprc_func(img):
    # scale from [0,255] to [0,1]
    img = img.astype(np.float32) / 255.0
    # shift and stretch from [0,1] to [-1,1]
    img = (img - 0.5) * 2
    return img

datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    preprocessing_function=preprc_func
)
train_generator = datagen.flow_from_directory(
        'data/train',
        target_size=(100, 100),
        batch_size=32,
        shuffle=True,
        class_mode='categorical')
test_generator = datagen.flow_from_directory(
        'data/test',
        target_size=(100, 100),
        batch_size=32,
        class_mode='categorical')

# Found 30 images belonging to 3 classes.
# Found 30 images belonging to 3 classes.

Check one image:

>>> next(iter(train_generator))[0][0].shape
(100, 100, 3)


>>> next(iter(train_generator))[0][0]
array([[[[ 0.082353  ,  0.082353  ,  0.082353  ],
         [ 0.62352943,  0.62352943,  0.62352943],
         [-0.05098039, -0.05098039, -0.05098039],
         ...,
         [-0.42745095, -0.42745095, -0.42745095],
         [-0.7411765 , -0.7411765 , -0.7411765 ],
         [-0.8980392 , -0.8980392 , -0.8980392 ]]]], dtype=float32)
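
All values in the output above lie between -1 and 1. As a sanity check on the arithmetic, the sketch below compares the formula from the question with the two-step formula from the answer on a few synthetic pixel values (not real image data). It also shows the inverse mapping, which is an addition here and is not part of the original answer.

```python
import numpy as np

# Synthetic 8-bit pixel values covering the full range (illustrative only).
pixels = np.array([0, 51, 102, 153, 204, 255], dtype=np.float32)

# Formula from the question: shift by 127.5, then divide by 127.5.
scaled_a = (pixels - 127.5) / 127.5

# Formula from the answer: scale to [0, 1], then map to [-1, 1].
scaled_b = (pixels / 255.0 - 0.5) * 2

# Inverse mapping back to [0, 255], e.g. before saving or displaying an image.
restored = scaled_a * 127.5 + 127.5

print(scaled_a.min(), scaled_a.max())   # endpoints of the scaled range
print(np.allclose(scaled_a, scaled_b))  # whether both formulas agree
```

Both expressions are the same linear map, since (x / 255 - 0.5) * 2 = (x - 127.5) / 127.5, so either can be used as the preprocessing step.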

Show one image after loading from path:

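import matplotlib.pyplot as plt
# imshow expects float RGB values in [0,1], so map [-1,1] back before plotting
plt.imshow((next(iter(train_generator))[0][0] + 1) / 2)
plt.show()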

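
Note that the layout in the question has the images directly under Train and Test, with no class subfolders, so flow_from_directory would find no images there. For that case, a minimal sketch that loads a flat folder into a single NumPy array, in the same spirit as load_real_samples from the question, might look like the following. The function name load_image_folder, the extension list and the default size are assumptions for illustration; it relies on Pillow and NumPy only.

```python
from pathlib import Path

import numpy as np
from PIL import Image


def load_image_folder(folder, size=(100, 100)):
    """Load every image in `folder` into one float32 array scaled to [-1, 1].

    Hypothetical helper. `size` is (width, height), as Pillow expects.
    """
    exts = {".png", ".jpg", ".jpeg", ".bmp"}
    paths = sorted(p for p in Path(folder).iterdir() if p.suffix.lower() in exts)
    images = []
    for path in paths:
        with Image.open(path) as im:
            # Force 3 channels and a common size so the arrays can be stacked.
            im = im.convert("RGB").resize(size)
            images.append(np.asarray(im, dtype=np.float32))
    if not images:
        raise ValueError(f"no images found in {folder}")
    X = np.stack(images)          # shape: (N, height, width, 3)
    return (X - 127.5) / 127.5    # scale from [0,255] to [-1,1]
```

Assuming the directory tree from the question, trainX = load_image_folder('LFW-A/Train') would then play the role of X in load_real_samples. This reads the whole folder into memory at once, so it only suits datasets that fit in RAM.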

Code used to generate the random images for this answer:

import os
import numpy as np
from PIL import Image

for idx, loc in enumerate(['data/train', 'data/test']*10):
    for category in ['class_1', 'class_2', 'class_3']:
        # create the class folder first; im.save does not create directories
        os.makedirs(f'{loc}/{category}', exist_ok=True)
        imarray = np.random.rand(100,100) * 255
        im = Image.fromarray(imarray.astype('uint8'))
        im.save(f'{loc}/{category}/img_{idx}.png')

Answered By – I'mahdi

This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
