Issue
I am using Keras with tensorflow-gpu in backend, I don’t have tensorflow (CPU – version) installed, all the outputs show GPU selected but tf is using CPU and system memory
when i run my code the output is: output_code
I even ran device_lib.list_local_device() and the output is: list_local_devices_output
After running the code I tried nvidia-smi to see the usage of gpu and the output is:
nvidia-smi output
Tensorflow-gpu = "1.12.0"
CUDA toolkit = "9.0"
cuDNN = "7.4.1.5"
Environment Variables contain:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\bin;
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\libnvvp;
C:\WINDOWS\system32;
C:\WINDOWS;
C:\WINDOWS\System32\Wbem;
C:\WINDOWS\System32\WindowsPowerShell\v1.0\;
C:\WINDOWS\System32\OpenSSH\;
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;
D:\Anaconda3;D:\Anaconda3\Library\mingw-w64\bin
D:\Anaconda3\Library\usr\bin;
D:\Anaconda3\Library\bin;
D:\Anaconda3\Scripts;D:\ffmpeg\bin\;
But still when i check for memory usage in task manager the output is
CPU utilization 51%, RAM utilization 86%
GPU utilization 1%, GPU-RAM utilization 0%
Task_manager_Output
So, I think it is still using CPU instead of GPU.
System Configuration:
Windows-10 64 bit; IDE: Liclipse; Python: 3.6.5
Solution
It is using the GPU, as you can see in logs.
The problem is, that a lot of things can not be done on the GPU and as long your data is small and your complexity is low, you will end up with low GPU usage.
- Maybe the batch_size is to low -> Increase until you run into OOM Errors
- Your data loading is consuming a lot of time and your gpu has to wait (IO Reads)
- Your RAM is to low and the application uses Disk as a fallback
- Preprocsssing is to slow. If you are dealing with image try to compute everything as a generator or on the gpu if possible
- You are using some operations, which are not GPU accelerated
Here is some more detailed explanation.
Answered By – ixeption
This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0