Creating a tf.data.Dataset from a NumPy array with shape (890,2048,3)

Issue

I am working on a PointNet implementation for point cloud registration. For that I created 890 source and 890 target point clouds, each stored in a NumPy array with shape=(2048,3). I then stacked the source arrays and the target arrays into two big arrays, each with shape=(890,2048,3). Now I want to create an input pipeline for a TensorFlow model. How do I create a TensorFlow dataset from these two NumPy arrays, and how do I check whether it worked?
I tried:

data1 = tf.data.Dataset.from_tensor_slices((source,targ))
data1

But I only get:

<TensorSliceDataset element_spec=(TensorSpec(shape=(2048, 3), dtype=tf.float64, name=None), TensorSpec(shape=(2048, 3), dtype=tf.float64, name=None))>

as output.

I would really appreciate any help or guidance on where to look :)

Solution

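Your dataset was created correctly. What you see is just the printed representation of the dataset, which shows the spec of a single element. from_tensor_slices slices both arrays along the first axis, so the dataset holds 890 elements, each a (source, target) pair with shape (2048, 3). There is no batch dimension because you have not batched the data yet, so iterating over the dataset yields one pair at a time. Call batch() to get the leading batch dimension a model expects.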

Contrast

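import numpy as np
import tensorflow as tf

# Random stand-ins for the real point clouds
source = np.random.normal(size=(890,2048,3))
targ = np.random.normal(size=(890,2048,3))

data1 = tf.data.Dataset.from_tensor_slices((source,targ))

# Without batching, each element is a single (source, target) pair
for x,y in data1.take(1):
  print(x.shape)
  print(y.shape)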

>>>(2048, 3)
(2048, 3)

with

source = np.random.normal(size=(890,2048,3))
targ = np.random.normal(size=(890,2048,3))

data1 = tf.data.Dataset.from_tensor_slices((source,targ))
data1 = data1.batch(8)  # or any batch size that suits your model and memory

for x,y in data1.take(1):
  print(x.shape)
  print(y.shape)

>>>(8, 2048, 3)
(8, 2048, 3)
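
To check that the dataset really holds what you expect, a few quick sanity checks can help. The snippet below is only a sketch under some assumptions: it uses a smaller array (20 clouds instead of 890) so it runs quickly, it casts to float32 because many models expect that dtype (your arrays are float64, so treat the cast as optional), and it assumes a reasonably recent TensorFlow 2.x where Dataset.cardinality() and tf.data.AUTOTUNE are available. The variable names are made up for illustration.

```python
import numpy as np
import tensorflow as tf

# Small stand-ins for the real arrays (the question uses 890 clouds).
num_clouds, num_points = 20, 2048
# Casting to float32 is an assumption about what the model expects.
source = np.random.normal(size=(num_clouds, num_points, 3)).astype(np.float32)
targ = np.random.normal(size=(num_clouds, num_points, 3)).astype(np.float32)

dataset = tf.data.Dataset.from_tensor_slices((source, targ))

# Check 1: one dataset element per point cloud pair.
num_elements = int(dataset.cardinality().numpy())
print("elements:", num_elements)

# Check 2: the first element equals the first slice of each input array.
first_source, first_targ = next(iter(dataset))
first_matches = bool(
    np.array_equal(first_source.numpy(), source[0])
    and np.array_equal(first_targ.numpy(), targ[0])
)
print("first element matches input:", first_matches)

# Check 3: build a shuffled, batched pipeline and inspect every batch shape.
batched = (
    dataset.shuffle(buffer_size=num_elements)
    .batch(8)
    .prefetch(tf.data.AUTOTUNE)
)
batch_shapes = [tuple(x.shape) for x, _ in batched]
print("batch shapes:", batch_shapes)
```

Note that the last batch is smaller whenever the number of elements is not a multiple of the batch size. With 890 clouds and a batch size of 8 you would get 111 full batches and one final batch of 2. If your model needs a fixed batch size, batch(8, drop_remainder=True) discards that partial batch.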

Answered By – Vishal Balaji

This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0
