I’m using an object detection module for classifying images. My specs are as follows:

  • OS: Ubuntu 18.04 LTS
  • Python: 3.6.7
  • VirtualEnv: Version: 16.4.3
  • Pip3 version inside virtualenv: 19.0.3
  • TensorFlow Version: 1.13.1
  • Protoc Version: 3.0.0-9

I’m working on Windows virtualenv and google-colab. This is the error message I get:

python3 legacy/train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_pets.config

INFO:tensorflow:global step 1: loss = 18.5013 (48.934 sec/step)
INFO:tensorflow:Finished training! Saving model to disk.
/home/priyank/venv/lib/python3.6/site-packages/tensorflow/python/summary/writer/writer.py:386: UserWarning: Attempting to use a closed FileWriter. The operation will be a noop unless the FileWriter is explicitly reopened.
  warnings.warn("Attempting to use a closed FileWriter. "
Traceback (most recent call last):
  File "legacy/train.py", line 184, in <module>
  File "/home/priyank/venv/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
  File "/home/priyank/venv/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "legacy/train.py", line 180, in main
  File "/home/priyank/venv/models-master/research/object_detection/legacy/trainer.py", line 416, in train
  File "/home/priyank/venv/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 785, in train
  File "/home/priyank/venv/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 832, in stop
  File "/home/priyank/venv/lib/python3.6/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
  File "/home/priyank/venv/lib/python3.6/site-packages/six.py", line 693, in reraise
    raise value
  File "/home/priyank/venv/lib/python3.6/site-packages/tensorflow/python/training/queue_runner_impl.py", line 257, in _run
  File "/home/priyank/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1257, in _single_operation_run
    self._call_tf_sessionrun(None, {}, [], target_list, None)
  File "/home/priyank/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
<b>tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[15,1,1755,2777,3] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator cpu
     [[{{node batch}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.</b>


You can try the following fixes:
1. Reducing the image dimension in case you are using very high image resolution
2. Try reducing the batch size
3. Check if any other process is using up your memory

Could you also please share your config file

Answered By – Jitesh Malipeddi

This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply

(*) Required, Your email will not be published