Compile and run `src/cudnn_conv_float32.cc` or `src/cudnn_conv_int8.cc` with CUDA 8.0 and cuDNN 6.0.
The code is self-contained, and all parameters are hardcoded to make the problem easier to debug.
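For reference, a build invocation along these lines should work (the exact flags are an assumption; they are not specified in the repo):

```sh
# Assumed build commands; adjust include/library paths for your CUDA 8.0 /
# cuDNN 6.0 install. INT8 requires compute capability 6.1+, hence -arch=sm_61.
nvcc -arch=sm_61 -o cudnn_conv_float32 src/cudnn_conv_float32.cc -lcudnn
nvcc -arch=sm_61 -o cudnn_conv_int8 src/cudnn_conv_int8.cc -lcudnn
```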
`src/cudnn_conv_float32.cc` is a simple implementation of FLOAT32 convolution using cuDNN 6. This code appears to work correctly.
`src/cudnn_conv_int8.cc` is a variant of the FLOAT32 version above for INT8-based convolution. As explained in the user manual, this requires compute capability 6.1 or higher. This code fails with a `CUDNN_STATUS_NOT_SUPPORTED` error. The following parameters are changed from the FLOAT32 version, per the user manual (a code sketch of this setup follows the list):
- Descriptor data types are set to `CUDNN_DATA_INT8` for the input/output tensors and the filter (see page 59).
- The convolution descriptor's data type is `CUDNN_DATA_INT32`, as instructed in the manual (see page 59).
- The data format of the input/output tensor descriptors and the filter is changed to `CUDNN_TENSOR_NHWC` (see page 62).
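A minimal sketch of that descriptor setup, assuming illustrative dimensions (the real values are hardcoded in `src/cudnn_conv_int8.cc`; the function and variable names here are mine, not the repo's):

```cpp
#include <cudnn.h>

void setup_int8_descriptors() {
    const int n = 1, c = 32, h = 8, w = 8;  // input, NHWC (illustrative)
    const int k = 32, r = 3, s = 3;         // filter (illustrative)

    cudnnTensorDescriptor_t in_desc, out_desc;
    cudnnFilterDescriptor_t filt_desc;
    cudnnConvolutionDescriptor_t conv_desc;
    cudnnCreateTensorDescriptor(&in_desc);
    cudnnCreateTensorDescriptor(&out_desc);
    cudnnCreateFilterDescriptor(&filt_desc);
    cudnnCreateConvolutionDescriptor(&conv_desc);

    // Input/output tensors: INT8 elements in NHWC layout (pages 59/62).
    cudnnSetTensor4dDescriptor(in_desc,  CUDNN_TENSOR_NHWC, CUDNN_DATA_INT8, n, c, h, w);
    cudnnSetTensor4dDescriptor(out_desc, CUDNN_TENSOR_NHWC, CUDNN_DATA_INT8, n, k, h, w);

    // Filter: INT8 elements, NHWC layout.
    cudnnSetFilter4dDescriptor(filt_desc, CUDNN_DATA_INT8, CUDNN_TENSOR_NHWC, k, c, r, s);

    // Convolution descriptor: compute type CUDNN_DATA_INT32 (page 59).
    // cuDNN 6 added the trailing computeType argument to this call.
    cudnnSetConvolution2dDescriptor(conv_desc,
                                    /*pad_h=*/1, /*pad_w=*/1,
                                    /*stride_h=*/1, /*stride_w=*/1,
                                    /*dilation_h=*/1, /*dilation_w=*/1,
                                    CUDNN_CROSS_CORRELATION, CUDNN_DATA_INT32);
}
```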
We can reproduce the problem on GTX 1070, Titan X (Pascal), and Quadro 6000.
- `CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMPUTED_GEMM` (see page 62) is not present in `cudnnConvolutionFwdAlgo_t`. The closest alternative seems to be `CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM` ("PRECOMP" instead of "PRECOMPUTED").
- The job fails at `cudnnConvolutionForward()` with a `CUDNN_STATUS_NOT_SUPPORTED` error (see the sketch after this list). This happens regardless of which algorithm I choose; I tested all the algorithm types listed on page 16 of the manual.
- The FLOAT32 implementation (`CUDNN_DATA_FLOAT`) does not have this issue.
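The failing call path looks roughly like this (an assumed reconstruction continuing the descriptor sketch above, not the repo's exact code):

```cpp
#include <cstdio>
#include <cuda_runtime.h>
#include <cudnn.h>

void run_int8_forward(cudnnTensorDescriptor_t in_desc,
                      cudnnFilterDescriptor_t filt_desc,
                      cudnnConvolutionDescriptor_t conv_desc,
                      cudnnTensorDescriptor_t out_desc) {
    cudnnHandle_t handle;
    cudnnCreate(&handle);

    // Device buffers sized for the illustrative 1x32x8x8 tensors and a
    // 32x32x3x3 filter, one byte per INT8 element.
    void *d_in = nullptr, *d_filt = nullptr, *d_out = nullptr, *d_ws = nullptr;
    cudaMalloc(&d_in,   1 * 32 * 8 * 8);
    cudaMalloc(&d_filt, 32 * 32 * 3 * 3);
    cudaMalloc(&d_out,  1 * 32 * 8 * 8);

    // The closest match to the algorithm named in the manual.
    cudnnConvolutionFwdAlgo_t algo = CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM;

    size_t ws_size = 0;
    cudnnGetConvolutionForwardWorkspaceSize(handle, in_desc, filt_desc,
                                            conv_desc, out_desc, algo, &ws_size);
    if (ws_size > 0) cudaMalloc(&d_ws, ws_size);

    // Scaling factors are passed as floats, as in the FLOAT32 version.
    const float alpha = 1.0f, beta = 0.0f;
    cudnnStatus_t st = cudnnConvolutionForward(
        handle, &alpha, in_desc, d_in, filt_desc, d_filt, conv_desc, algo,
        d_ws, ws_size, &beta, out_desc, d_out);
    if (st != CUDNN_STATUS_SUCCESS) {
        // In our runs this prints CUDNN_STATUS_NOT_SUPPORTED for every
        // algorithm we tried.
        std::fprintf(stderr, "cudnnConvolutionForward: %s\n",
                     cudnnGetErrorString(st));
    }
}
```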
Please see my post on the developer forum.