How do I convert an already-quantized model?

Posted on 2019-3-26 01:18:03

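Some TensorFlow or TFLite models are already quantized. How should such models be converted? If I set quantization=True during conversion, will the model be quantized a second time, reducing precision even further?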
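To make the concern concrete, here is a minimal sketch of affine uint8 quantization in plain Python. It is not RKNN's implementation: the helper names and the scale and zero-point values are assumptions chosen for illustration. It only shows the general effect: re-quantizing with the same parameters changes nothing, while re-quantizing with different parameters adds rounding error.

```python
def quantize(x, scale, zero_point):
    """Map a real value to a uint8 code using affine quantization."""
    q = round(x / scale) + zero_point
    return max(0, min(255, q))


def dequantize(q, scale, zero_point):
    """Map a uint8 code back to the real value it represents."""
    return (q - zero_point) * scale


def requantize_error(q, old, new):
    """Error added by re-quantizing code `q` from `old` to `new` parameters.

    `old` and `new` are hypothetical (scale, zero_point) pairs.
    """
    x_old = dequantize(q, *old)
    x_new = dequantize(quantize(x_old, *new), *new)
    return abs(x_new - x_old)


# Same parameters: the value survives the second quantization unchanged.
same = requantize_error(129, (0.1, 128), (0.1, 128))
# Different scale: the second rounding step moves the value.
different = requantize_error(129, (0.1, 128), (0.15, 128))
print(same, different)
```

Whether a converter actually picks different parameters on a second pass depends on the tool, so this only explains why the question is worth asking.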

Posted on 2019-3-26 02:15:21

My understanding is that if the original model is already quantized to int8, we only need to pass do_quantization=False to rknn.build() to bypass the RKNN quantization step. The model is then translated to the RKNN format (using its supported op implementations). Inference should be accurate (since no precision is lost) and fast (hardware accelerated).
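A minimal sketch of that conversion path, assuming the RKNN-Toolkit Python methods `load_tflite`, `build`, and `export_rknn` and assuming each returns 0 on success. The function name and file names are hypothetical, and the `rknn` object is passed in so the helper can be exercised without the toolkit installed. A call to `rknn.config()` may also be needed before loading, depending on the toolkit version.

```python
def convert_without_requantizing(rknn, tflite_path, rknn_path):
    """Convert an already-quantized TFLite model, skipping RKNN quantization.

    `rknn` is expected to behave like an rknn.api.RKNN instance
    (assumption: its methods return 0 on success).
    """
    steps = [
        ("load_tflite", lambda: rknn.load_tflite(model=tflite_path)),
        # do_quantization=False asks build() not to quantize the model again.
        ("build", lambda: rknn.build(do_quantization=False)),
        ("export_rknn", lambda: rknn.export_rknn(rknn_path)),
    ]
    for name, call in steps:
        ret = call()
        if ret != 0:
            raise RuntimeError(f"{name} failed with code {ret}")
    return rknn_path


# Usage with the real toolkit (not executed here):
#   from rknn.api import RKNN
#   convert_without_requantizing(RKNN(), "model_quant.tflite", "model.rknn")
```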

Posted on 2019-3-27 06:42:28

chuyee posted on 2019-3-26 02:15
My understanding is, if the original model is already quantized to int8, we just need to pass do_qua ...

That sounds reasonable, but without inspecting their source code (which they do not provide), I am not sure whether you are correct.

Could anyone from the official team answer my question?

Posted on 2019-3-27 07:46:58

protossw512 posted on 2019-3-27 06:42
That sounds reasonable, but without inspect their source code(which they do not provide), I am not ...

BTW, have you by any chance tried do_quantization=False with the deeplabv3 model? I get all-zero output from the ArgMax layer. The output is valid with do_quantization=True and everything else unchanged. Could it be caused by float32 overflow? Does rknn support float64?

Posted on 2019-3-28 02:07:19

chuyee posted on 2019-3-27 07:46
BTW, have you by any chance tried do_quantization=False for deeplabv3 model? I got all 0 output fr ...

I have done that before, to check whether the output matches the TensorFlow model. The results were valid, and I didn't see any issue with the floating-point model.
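For anyone repeating that check, here is one way such a comparison could be sketched in plain Python. It assumes both outputs have already been flattened to equal-length lists of floats; the function name and the choice of metrics (maximum absolute difference and cosine similarity) are my own, not part of the toolkit.

```python
import math


def compare_outputs(reference, candidate):
    """Return (max absolute difference, cosine similarity) of two outputs."""
    if not reference or len(reference) != len(candidate):
        raise ValueError("outputs must be non-empty and of equal length")
    max_abs = max(abs(r - c) for r, c in zip(reference, candidate))
    dot = sum(r * c for r, c in zip(reference, candidate))
    norm_r = math.sqrt(sum(r * r for r in reference))
    norm_c = math.sqrt(sum(c * c for c in candidate))
    # An all-zero output has no direction, so report 0.0 similarity for it.
    cosine = dot / (norm_r * norm_c) if norm_r and norm_c else 0.0
    return max_abs, cosine
```

A cosine similarity near 1.0 with a small maximum difference suggests the converted model tracks the reference, while a similarity of 0.0 flags an all-zero output.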

Posted on 2019-3-28 14:00:21

protossw512 posted on 2019-3-28 02:07
I have done that before in order to test if the output is correct compare to tensorflow model, and ...

I still get the same behavior with both dynamic_fixed_point-16 and do_quantization=False. However, both asymmetric_quantized-u8 and dynamic_fixed_point-8 work correctly. I also found that the problem appears after the MobilenetV2/expanded_conv_3/expand/BatchNorm/FusedBatchNorm layer. Before that layer, some inputs need to be loaded directly from the weights file by a 'read' operation (how does rknn handle that?). I'm not sure whether it's related, though.
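A small sketch of how the first failing layer could be located, assuming the per-layer outputs have already been dumped by some means and flattened into an ordered list of (name, values) pairs. The function name and the input format are hypothetical and not part of the toolkit.

```python
def first_all_zero_layer(layer_outputs, tol=0.0):
    """Return the name of the first layer whose output is entirely zero.

    `layer_outputs` is an ordered iterable of (layer_name, values) pairs,
    with `values` a flat sequence of numbers. Returns None if no layer
    output is all zeros (within `tol`).
    """
    for name, values in layer_outputs:
        values = list(values)
        if values and all(abs(v) <= tol for v in values):
            return name
    return None
```

Walking the layers in execution order means the returned name is where the zeros first appear, rather than a later layer that merely inherits them.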

Posted on 2019-3-29 02:01:07

chuyee posted on 2019-3-28 14:00
I still get the same behavior, for both dynamic_fixed_point-16 and do_quantization=False. However ...

I also ran into the issue you mentioned above.
I tried reading the output from BatchNorm with quantization=False, and it turned out to be all-zero tensors. I think it is a bug.