Description
Environment:
- GPU cards: Tesla K80
- CUDA: 8.0
- cuDNN: 5.1
- OpenMPI: 1.10.2
Problems:
After running make, there are five files in .../nvidia/bin:
conv_bench gemm_bench nccl_mpi_all_reduce nccl_single_all_reduce rnn_bench
I can successfully run 'rnn_bench' and 'nccl_single_all_reduce', but the other three fail:
- Running 'gemm_bench' fails with the error "terminate called after throwing an instance of 'std::runtime_error'".
- Running 'conv_bench' stops at the 11th test with the error "terminate called after throwing an instance of 'std::runtime_error' what(): Illegal algorithm passed to get_fwd_algo_string. Algo: 7".
- Running 'nccl_mpi_all_reduce' fails with the error "terminate called after throwing an instance of 'std::runtime_error' what(): NCCL failure: invalid device pointer in nccl_mpi_all_reduce.cu at line: 86 rank: 0".
How can I fix it?
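
A possible lead on the 'conv_bench' error: as far as I can tell from the cuDNN headers, value 7 of the cudnnConvolutionFwdAlgo_t enum is CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED, which I believe was introduced in cuDNN 5.1, so the benchmark's name lookup may not know about it. Below is a minimal sketch of a lookup that covers value 7 and falls back to a placeholder instead of throwing. The function name fwd_algo_name is hypothetical and this is not the actual get_fwd_algo_string code from the repository; the integer values are my reading of the cuDNN 5.1 enum.

```cpp
#include <string>

// Hypothetical helper, NOT the repository's get_fwd_algo_string.
// Integer values are assumed to match cudnnConvolutionFwdAlgo_t in cuDNN 5.1.
std::string fwd_algo_name(int algo) {
    switch (algo) {
        case 0: return "IMPLICIT_GEMM";
        case 1: return "IMPLICIT_PRECOMP_GEMM";
        case 2: return "GEMM";
        case 3: return "DIRECT";
        case 4: return "FFT";
        case 5: return "FFT_TILING";
        case 6: return "WINOGRAD";
        case 7: return "WINOGRAD_NONFUSED";
        // Unknown values get a placeholder name instead of an exception,
        // so one unrecognized algorithm does not abort the whole run.
        default: return "UNKNOWN(" + std::to_string(algo) + ")";
    }
}
```

With a lookup like this, an algorithm id the benchmark does not recognize would be reported by number and the remaining tests could still run.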