How to update/assign new values to elements of a DatasetDict so that they persist outside the for loop in Python?

Issue

I have a dataset of images. I resize every image in the dataset and then reassign the i-th element of the dataset to the resized image. I’m doing this with the following code:

for i in range(len(dataset['train'])):
  ex = dataset['train'][i]
  image = ex['image']
  image = image.convert("RGB")  # <class 'PIL.Image.Image'> <PIL.Image.Image image mode=RGB size=500x333 at 0x7F84F1948150>
  image_resized = image.resize(size_to_resize)  # <PIL.Image.Image image mode=RGB size=224x224 at 0x7F84F17885D0>

  dataset['train'][i]['image'] = image_resized

The problem is that, outside the for loop,

dataset['train'][i]['image']  # for i = 0, 1, 2, 3, 4, ...

still returns the i-th image at its original size, not the resized one!

Solution

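Alas, you can’t change it in place.

DatasetDict is backed by Arrow tables, which are immutable. Indexing with dataset['train'][i] returns a freshly built Python dict, so the assignment only modifies that temporary copy, which is then discarded.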
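Since the tables cannot be edited, one common workaround is to build a new dataset with map and rebind the name to the result. The following is only a minimal sketch, assuming dataset is the DatasetDict from the question and that its 'image' column holds PIL images. The helper names resize_example and resize_images are hypothetical, and SIZE_TO_RESIZE is an assumed value inferred from the 224x224 output shown in the question:

```python
# Assumed target size (the question's size_to_resize is not shown; 224x224
# is inferred from the printed output).
SIZE_TO_RESIZE = (224, 224)


def resize_example(example):
    """Return the example with its image converted to RGB and resized."""
    image = example["image"].convert("RGB")
    example["image"] = image.resize(SIZE_TO_RESIZE)
    return example


def resize_images(dataset):
    """Build a new dataset with resized images; the input is left unchanged."""
    # map() applies the function to every example and returns a new object
    # instead of modifying the existing one.
    return dataset.map(resize_example)


# Usage (hypothetical): rebind the name, since nothing is modified in place.
# dataset = resize_images(dataset)
```

After the reassignment, dataset['train'][i]['image'] should return the resized image, because the name now points at the newly built dataset rather than the original one.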

Answered By – Marat

This answer was collected from Stack Overflow and is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
