# Slice 3d-tensor-based dataset into smaller tensor lengths

## Issue

I have a dataset for training networks, formed out of two tensors: my features and my labels. The shape of my demonstration set is [351, 4, 34] for the features and [351] for the labels.

Now, I would like to reshape the dataset into chunks of size k (ideally while loading the data with a DataLoader), to obtain a new feature set of shape [351 * n, 4, k] and a corresponding label set of shape [351 * n], with n = floor(34 / k). The main aim is to reduce the length of each feature, in order to shrink my network afterwards.

As written example: Starting from

``````t = [[1, 2, 3, 4],
     [5, 6, 7, 8]]
``````

i.e. a `[2, 4]`-tensor, with

``````l = [1, 0]
``````

as labels, I would like to be able to go to (with k = 2)

``````t = [[1, 2],
     [3, 4],
     [5, 6],
     [7, 8]]
l = [1, 1, 0, 0]
``````

or to (with k = 3)

``````t = [[1, 2, 3],
     [5, 6, 7]]
l = [1, 0]
``````

I found some solutions for reshaping one of the tensors (using variations of `split()`), but I would then have to apply the same transformation to the other tensor as well, so I would prefer a solution inside my DataLoader instead.

Is that possible?

## Solution

You can reshape the features into chunks of length `k` (making the first dimension `n` times longer), while each label is repeated once per chunk with `torch.repeat_interleave` so the labels stay aligned.

``````from math import floor

def split(x, y, k=2):
    n = floor(x.size(1) / k)           # number of k-sized chunks per row
    x_ = x[:, :n * k].reshape(-1, k)   # drop the remainder, then cut each row into n chunks of k
    y_ = y.repeat_interleave(n)        # repeat each label once per chunk
    return x_, y_
``````

You can test it like so:

``````>>> t = torch.tensor([[1, 2, 3, 4],
...                   [5, 6, 7, 8]])
>>> l = torch.tensor([1, 0])
>>> split(t, l, k=2)
(tensor([[1, 2],
         [3, 4],
         [5, 6],
         [7, 8]]),
 tensor([1, 1, 0, 0]))

>>> split(t, l, k=3)
(tensor([[1, 2, 3],
         [5, 6, 7]]),
 tensor([1, 0]))
``````

I recommend doing this kind of processing in your dataset class.

Answered By – Ivan

This answer was collected from Stack Overflow and is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.