Toybrick

如何转化quantize好的模型?

chuyee

中级会员

积分
352
楼主
发表于 2019-3-26 02:15:21 | 显示全部楼层
My understanding is, if the original model is already quantized to int8, we just need to pass do_quantization=False in rknn.build() to bypass the rknn quantization step. The model will be translated to rknn format (with all its supported OP implementations). The inference will be accurate (since no precision lost) and fast (hardware accelerated).
回复

使用道具 举报

chuyee

中级会员

积分
352
沙发
发表于 2019-3-27 07:46:58 | 显示全部楼层
protossw512 发表于 2019-3-27 06:42
That sounds reasonable, but without inspect their source code(which they do not provide), I am not ...

BTW, have you by any chance tried do_quantization=False for deeplabv3 model? I got all 0 output from ArgMax layer. The output is valid for do_quantization=True case with everything else remains the same. Could it be caused by float32 overflow? Does rknn support float64?
回复

使用道具 举报

chuyee

中级会员

积分
352
板凳
发表于 2019-3-28 14:00:21 | 显示全部楼层
protossw512 发表于 2019-3-28 02:07
I have done that before in order to test if the output is correct compare to tensorflow model, and ...

I still get the same behavior, for both dynamic_fixed_point-16 and do_quantization=False. However both
asymmetric_quantized-u8 and dynamic_fixed_point-8 work correctly. I also identify the problem happens after the MobilenetV2/expanded_conv_3/expand/BatchNorm/FusedBatchNorm layer. Before that, there are some input needs to be load from weights file directly by 'read' operation (how is it handled by rknn?). I'm not sure it's related though.
回复

使用道具 举报

您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

产品中心 购买渠道 开源社区 Wiki教程 资料下载 关于Toybrick


快速回复 返回顶部 返回列表