System information
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04 LTS
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): 1.3.0
Python version: 2.7.12
CUDA/cuDNN version: 8.0/6.0.21
GPU model and memory: Nvidia Tegra X2
Describe the problem
I'm trying to run inference with a network that uses ResNet-50 as a feature encoder (semantic segmentation with 2 classes). Depending on the memory load, sooner or later I get the following error log:
2017-11-10 05:10:43.484563: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Invalid reduction dimension (-1146944963 for input with 4 dimension(s)
2017-11-10 05:10:44.646881: E tensorflow/stream_executor/cuda/cuda_driver.cc:1068] failed to synchronize the stop event: CUDA_ERROR_LAUNCH_FAILED
2017-11-10 05:10:44.646946: E tensorflow/stream_executor/cuda/cuda_timer.cc:54] Internal: error destroying CUDA event in context 0x30eb3d0: CUDA_ERROR_LAUNCH_FAILED
2017-11-10 05:10:44.646975: E tensorflow/stream_executor/cuda/cuda_timer.cc:59] Internal: error destroying CUDA event in context 0x30eb3d0: CUDA_ERROR_LAUNCH_FAILED
2017-11-10 05:10:44.647369: E tensorflow/stream_executor/cuda/cuda_blas.cc:551] failed to run cuBLAS routine cublasSgemm_v2: CUBLAS_STATUS_EXECUTION_FAILED
2017-11-10 05:10:44.647478: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: slice index 1000163558 of dimension 0 out of bounds.
2017-11-10 05:10:44.647529: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: slice index 1021428837 of dimension 0 out of bounds.
2017-11-10 05:10:44.647573: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: slice index 1004492442 of dimension 0 out of bounds.
This happens whether or not a swapfile is in use. Once it has happened, no further inference run is possible, even with a small-footprint network. Is this a memory issue, and if so, how can I deal with it?
For reference, I get a similar error log on a TX1 (I tried both a source-compiled and a binary TensorFlow build, with the same OS / TF configuration as above).
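To make "memory load" measurable when reproducing this, here is a minimal sketch that reads the available system memory from /proc/meminfo. It assumes a Linux system; the helper names `parse_meminfo`, `available_mib` and `read_available_mib` are hypothetical and not part of TensorFlow. On Tegra boards the CPU and GPU share the same physical RAM, so this figure should be relevant to GPU allocations as well.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value}.

    Values are the leading integer on each line (kB for most fields).
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def available_mib(info):
    """Return available memory in MiB.

    Falls back to MemFree when the kernel does not report MemAvailable.
    """
    kib = info.get("MemAvailable", info.get("MemFree", 0))
    return kib / 1024.0


def read_available_mib(path="/proc/meminfo"):
    """Read and parse the meminfo file at `path` (Linux only)."""
    with open(path) as f:
        return available_mib(parse_meminfo(f.read()))
```

Logging `read_available_mib()` just before each inference call would show whether the failures line up with low available memory.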