How do I convert an already-quantized model?

Posted on 2019-3-26 02:15:21

My understanding is that if the original model is already quantized to int8, we only need to pass do_quantization=False to rknn.build() to bypass rknn's own quantization step. The model is then translated to rknn format (using the implementations of the OPs it supports). Inference should be accurate (since no precision is lost) and fast (hardware accelerated).
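A minimal sketch of that idea, not a confirmed recipe. It assumes `rknn` is an RKNN object (from `rknn.api`) on which one of the toolkit's load calls has already succeeded, and that the toolkit methods return 0 on success. The helper name `build_without_quantization` and the export path are made up for illustration.

```python
def build_without_quantization(rknn, export_path):
    """Build and export an already-loaded model, skipping rknn quantization.

    Assumption: `rknn` already holds a loaded model, and each toolkit call
    returns 0 on success.
    """
    # do_quantization=False asks the toolkit to skip its own quantization pass.
    ret = rknn.build(do_quantization=False)
    if ret != 0:
        raise RuntimeError("rknn.build failed with code %r" % (ret,))

    # Write the converted model to disk in rknn format.
    ret = rknn.export_rknn(export_path)
    if ret != 0:
        raise RuntimeError("rknn.export_rknn failed with code %r" % (ret,))
    return export_path
```

Taking the RKNN object as a parameter keeps the loading step (which depends on the source framework) out of the helper, so the only thing it pins down is the do_quantization=False build.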

Posted on 2019-3-27 07:46:58

protossw512 wrote on 2019-3-27 06:42
That sounds reasonable, but without inspect their source code(which they do not provide), I am not ...

BTW, have you by any chance tried do_quantization=False with the deeplabv3 model? I get all-zero output from the ArgMax layer. With do_quantization=True and everything else unchanged, the output is valid. Could it be caused by float32 overflow? Does rknn support float64?
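One way to test the overflow guess is to dump the tensor that feeds ArgMax and look for non-finite or out-of-range values. The sketch below is plain Python and only illustrative. The float16 limit is an assumption (this thread does not confirm which float type the non-quantized path computes in), and `out_of_range` and `argmax` are hypothetical helper names, not rknn APIs.

```python
import math

# Largest finite IEEE 754 half-precision value. Using float16 as the limit is
# an assumption; pass a different `limit` for another target type.
FP16_MAX = 65504.0

def out_of_range(values, limit=FP16_MAX):
    """Return indices of values that are non-finite or exceed +/- limit."""
    return [i for i, v in enumerate(values)
            if not math.isfinite(v) or abs(v) > limit]

def argmax(values):
    """Index of the first maximum along one axis (ties keep the first index)."""
    best = 0
    for i in range(1, len(values)):
        if values[i] > values[best]:
            best = i
    return best
```

If every class score at a pixel has saturated to the same value (for example all inf), or the first score is NaN, this `argmax` returns 0, which would look like an all-zero ArgMax output.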

Posted on 2019-3-28 14:00:21

protossw512 wrote on 2019-3-28 02:07
I have done that before in order to test if the output is correct compare to tensorflow model, and ...

I still get the same behavior with both dynamic_fixed_point-16 and do_quantization=False. However, both asymmetric_quantized-u8 and dynamic_fixed_point-8 work correctly. I also found that the problem first appears after the MobilenetV2/expanded_conv_3/expand/BatchNorm/FusedBatchNorm layer. Before that layer, some inputs need to be loaded directly from the weights file by a 'read' operation (how does rknn handle that?). I'm not sure it's related, though.
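For anyone who wants to repeat this kind of layer-by-layer localization, here is a rough sketch. It assumes you have already dumped each layer's output as a flat list of floats from both the reference (TensorFlow) run and the converted model, keyed by layer name. The function names and the 0.99 threshold are arbitrary choices for illustration, not part of any toolkit.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Two all-zero tensors match; a zero tensor against a non-zero one does not.
        return 1.0 if norm_a == norm_b else 0.0
    return dot / (norm_a * norm_b)

def first_divergent_layer(layer_order, reference, candidate, threshold=0.99):
    """Return (layer_name, similarity) for the first layer below threshold.

    Layers are checked in execution order. Returns None if all layers agree.
    """
    for name in layer_order:
        sim = cosine_similarity(reference[name], candidate[name])
        # `not (sim >= threshold)` also flags NaN, which inf or NaN values in
        # either dump produce.
        if not (sim >= threshold):
            return name, sim
    return None
```

Checking layers in execution order matters: once one layer goes wrong, everything downstream (including ArgMax) will also differ, so only the first mismatch is informative.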