My understanding is that if the original model is already quantized to int8, we just need to pass do_quantization=False to rknn.build() to bypass the rknn quantization step. The model will be translated to rknn format (with all its supported OP implementations). Inference will be accurate (since no precision is lost) and fast (hardware accelerated).
protossw512 posted on 2019-3-27 06:42
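A minimal sketch of the conversion described above, assuming the rknn-toolkit 1.x Python API and assuming the pre-quantized model is a TFLite file. The function name and both paths are hypothetical, and the thread does not confirm that rknn keeps the model's own int8 parameters when do_quantization=False is passed.

```python
def convert_without_quantization(tflite_path, rknn_path):
    """Convert a model to rknn format while skipping rknn's own quantization.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    # Imported here so this file still loads where rknn-toolkit is not installed.
    from rknn.api import RKNN

    rknn = RKNN()
    try:
        # Input preprocessing (rknn.config) is omitted; set it as your model requires.
        if rknn.load_tflite(model=tflite_path) != 0:
            raise RuntimeError("load_tflite failed")
        # Skip the rknn quantization step, as discussed in the post above.
        if rknn.build(do_quantization=False) != 0:
            raise RuntimeError("build failed")
        if rknn.export_rknn(rknn_path) != 0:
            raise RuntimeError("export_rknn failed")
    finally:
        rknn.release()
```

Usage would look like convert_without_quantization('./model.tflite', './model.rknn'), with both file names being placeholders.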
That sounds reasonable, but without inspect their source code(which they do not provide), I am not ...
BTW, have you by any chance tried do_quantization=False for the deeplabv3 model? I get all-zero output from the ArgMax layer. The output is valid with do_quantization=True and everything else unchanged. Could it be caused by float32 overflow? Does rknn support float64?
chuyee posted on 2019-3-27 07:46
BTW, have you by any chance tried do_quantization=False for deeplabv3 model? I got all 0 output fr ...
I have done that before to check whether the output is correct compared to the TensorFlow model, and the results were valid. I didn't see any issue with the floating-point model.
protossw512 posted on 2019-3-28 02:07
I have done that before in order to test if the output is correct compare to tensorflow model, and ...
I still get the same behavior for both dynamic_fixed_point-16 and do_quantization=False. However, both asymmetric_quantized-u8 and dynamic_fixed_point-8 work correctly. I also found that the problem starts after the MobilenetV2/expanded_conv_3/expand/BatchNorm/FusedBatchNorm layer. Before that layer, some inputs need to be loaded directly from the weights file by a 'read' operation (how does rknn handle that?). I'm not sure it's related, though.
chuyee posted on 2019-3-28 14:00
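One way to narrow down where the outputs go wrong is to compare per-layer dumps from a known-good run against the failing run and report the first layer that differs. The sketch below assumes you have already obtained such dumps by some means (for example, by cutting the graph at intermediate outputs) and flattened each one to a list of floats. The function name, the dict layout, and the tolerance are all hypothetical and not part of rknn.

```python
def first_bad_layer(reference, candidate, layer_order, tol=1e-2):
    """Return (layer_name, reason) for the first suspicious layer, else None.

    reference / candidate: dicts mapping layer name -> flat list of floats.
    layer_order: layer names in execution order.
    """
    for name in layer_order:
        ref = reference[name]
        out = candidate[name]
        # A layer that is non-zero in the reference but all zero in the
        # candidate matches the symptom reported in this thread.
        if any(v != 0 for v in ref) and all(v == 0 for v in out):
            return name, "all zeros"
        # Otherwise flag the layer if any element differs by more than tol.
        max_err = max(abs(r - o) for r, o in zip(ref, out))
        if max_err > tol:
            return name, "mismatch"
    return None


# Toy data standing in for real layer dumps.
reference = {"a": [1.0, 2.0], "b": [0.5, 0.25], "c": [3.0, 4.0]}
candidate = {"a": [1.0, 2.001], "b": [0.0, 0.0], "c": [0.0, 0.0]}
result = first_bad_layer(reference, candidate, ["a", "b", "c"])
print(result)
```

Checking layers in execution order matters here: once one layer outputs zeros, every layer after it is likely to look wrong too, so only the first hit is informative.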
I still get the same behavior, for both dynamic_fixed_point-16 and do_quantization=False. However ...
I also encountered the issue you mentioned above.
I tried to get the output from BatchNorm with do_quantization=False, and it turned out to be all-zero tensors. I think it is a bug.