Debug CUDA error for PyTorch

 3 years ago
source link: http://www.donghao.org/2021/04/23/debug-cuda-error-for-pytorch/
Debug CUDA error for PyTorch – Robin on Linux

After I switched my code to a different dataset, training failed with the following error:

/tmp/pip-req-build-_tx3iysr/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:310: operator(): block: [0,0,0], thread: [59,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/tmp/pip-req-build-_tx3iysr/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:310: operator(): block: [0,0,0], thread: [60,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/tmp/pip-req-build-_tx3iysr/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:310: operator(): block: [0,0,0], thread: [61,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/tmp/pip-req-build-_tx3iysr/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:310: operator(): block: [0,0,0], thread: [62,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/tmp/pip-req-build-_tx3iysr/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:310: operator(): block: [0,0,0], thread: [63,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
Traceback (most recent call last):                                                                                                                                                                              
  File "train.py", line 337, in <module>                                                                                                                                                                        
    train(args, train_loader, eval_loader)                                                                                                                                                                      
  File "train.py", line 189, in train                                                                                                                                                                           
    sounds = aug(sounds)                                                                                                                                                                                        
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 881, in _call_impl                                                                                                             
    result = self.forward(*input, **kwargs)                                                                                                                                                                     
  File "/home/sanbai/birds_sound_classification/utils/augment.py", line 13, in forward                                                                                                                          
    image = (image - image.mean()) / image.std()                                                                                                                                                                
RuntimeError: CUDA error: device-side assert triggered 
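
The assertion in the log, `idx_dim >= 0 && idx_dim < index_size`, comes from a gather/scatter kernel, so some index handed to the GPU fell outside the valid range. CUDA kernels run asynchronously, so the Python line named in the traceback (`image.std()`) is not necessarily where the bad index was produced; rerunning with the environment variable `CUDA_LAUNCH_BLOCKING=1` usually gives a more accurate traceback. Since the failure appeared right after a dataset change, one plausible cause (an assumption, not something the log confirms) is that the new data contains index or label values the code was not sized for. A minimal host-side sketch of the same bounds check follows; `find_out_of_bounds`, `labels`, and `num_classes` are hypothetical names used only for illustration.

```python
def find_out_of_bounds(indices, index_size):
    """Return (position, value) pairs that would fail the kernel's check
    `idx_dim >= 0 && idx_dim < index_size`."""
    return [
        (pos, idx)
        for pos, idx in enumerate(indices)
        if not (0 <= idx < index_size)
    ]


# Hypothetical example data: labels from a new dataset, checked against
# the number of classes the model was built for.
# With a PyTorch tensor, the values could be obtained via
# `index_tensor.flatten().tolist()` before calling the helper.
labels = [0, 3, 9, 10, -1]
num_classes = 10

bad = find_out_of_bounds(labels, num_classes)
print(bad)  # [(3, 10), (4, -1)]
```

Running a check like this on each batch before it is moved to the GPU reports the offending values directly, instead of the generic `device-side assert triggered` message.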