I am deploying a super-resolution model on the RV1126. The model ends with a pixelshuffle operator. When I convert the PyTorch-trained .pth model directly to RKNN, quantization goes through without problems, and in Netron I can see that pixelshuffle has been turned into depth2space; however, RKNN's performance profiling shows that depth2space is extremely slow. I then found that if I first convert the .pth model to a .pt model with torch.jit.trace and then convert that to RKNN and quantize it, depth2space gets split into three ops: reshape, permute, reshape. Inference becomes noticeably faster, but the inference results are wrong.
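For context, the reshape -> permute -> reshape sequence that appears in the jit-traced graph is mathematically equivalent to pixel_shuffle itself, so the decomposition alone should not change the numbers. A minimal PyTorch check (the tensor shape and upscale factor are taken from the layer dumps below; this is only an illustration, not my actual model code):

import torch
import torch.nn.functional as F

# Shapes from the log below: the last conv output is (1, 48, 120, 90) in NCHW,
# the upscale factor is r = 4, so the final image is (1, 3, 480, 360).
x = torch.randn(1, 48, 120, 90)
r = 4
n, crr, h, w = x.shape
c = crr // (r * r)                      # 48 / 16 = 3 output channels

ref = F.pixel_shuffle(x, r)             # (1, 3, 480, 360)

# The same operation expressed as reshape -> permute -> reshape,
# matching the three ops that torch.jit.trace produces in the RKNN graph.
y = x.reshape(n, c, r, r, h, w)         # (1, 3, 4, 4, 120, 90)
y = y.permute(0, 1, 4, 2, 5, 3)         # (1, 3, 120, 4, 90, 4)
y = y.reshape(n, c, h * r, w * r)       # (1, 3, 480, 360)

print(torch.allclose(ref, y))           # True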
I then ran RKNN's accuracy analysis on the two RKNN models (converted with and without jit) and found that the two models are completely identical, both in the per-layer outputs and in the per-layer cosine similarities. However, when I run inference on the board with rknn.inference, the model converted without jit produces correct results, while the jit-converted model produces wrong results. What could cause the jit-converted RKNN model to fail to produce correct inference results? The contents of the "entire_qnt_error_analysis.txt" file produced by the accuracy analysis are shown below (a rough sketch of my conversion and inference calls follows after the log):
Cosine similarity results for the model converted directly, without jit:
inputs_15_out0_nhwc_1_120_90_3.npy u8 eculidean_norm=0.002681 cosine_norm=0.999996
Resize__Resize_3_out0_nhwc_1_480_360_3.npy None eculidean_norm=0.002681 cosine_norm=0.999996
Conv__body.0_Conv_14_out0_nhwc_1_120_90_64.npy None eculidean_norm=0.027703 cosine_norm=0.999616
PRelu__body.1_PRelu_13_out0_nhwc_1_120_90_64.npy None eculidean_norm=0.108149 cosine_norm=0.994152
Conv__body.2_Conv_12_out0_nhwc_1_120_90_64.npy None eculidean_norm=0.107276 cosine_norm=0.994246
PRelu__body.3_PRelu_11_out0_nhwc_1_120_90_64.npy None eculidean_norm=0.151476 cosine_norm=0.988528
Conv__body.4_Conv_10_out0_nhwc_1_120_90_64.npy None eculidean_norm=0.127811 cosine_norm=0.991832
PRelu__body.5_PRelu_9_out0_nhwc_1_120_90_64.npy None eculidean_norm=0.217819 cosine_norm=0.976277
Conv__body.6_Conv_8_out0_nhwc_1_120_90_64.npy None eculidean_norm=0.104153 cosine_norm=0.994576
PRelu__body.7_PRelu_7_out0_nhwc_1_120_90_64.npy None eculidean_norm=0.311085 cosine_norm=0.951613
Conv__body.8_Conv_6_out0_nhwc_1_120_90_64.npy None eculidean_norm=0.168560 cosine_norm=0.985794
PRelu__body.9_PRelu_5_out0_nhwc_1_120_90_64.npy None eculidean_norm=0.335073 cosine_norm=0.943863
Conv__body.10_Conv_4_out0_nhwc_1_120_90_48.npy None eculidean_norm=0.241604 cosine_norm=0.970814
DepthToSpace__upsampler_DepthToSpace_2_out0_nhwc_1_480_360_3.npy None eculidean_norm=0.241604 cosine_norm=0.970814
Add__Add_1_out0_nhwc_1_480_360_3.npy None eculidean_norm=0.033127 cosine_norm=0.999451
Cosine similarity results for the model converted after jit:
x.1_0_out0_nhwc_1_120_90_3.npy u8 eculidean_norm=0.002681 cosine_norm=0.999996
upsample_nearest2d_at_base.1_13_15_out0_nhwc_1_480_360_3.npy u8 eculidean_norm=0.002681 cosine_norm=0.999996
convolution_at_input.2_1_1_out0_nhwc_1_120_90_64.npy u8 eculidean_norm=0.027703 cosine_norm=0.999616
prelu_at_55_2_2_out0_nhwc_1_120_90_64.npy u8 eculidean_norm=0.108149 cosine_norm=0.994152
convolution_at_input.4_3_3_out0_nhwc_1_120_90_64.npy u8 eculidean_norm=0.107276 cosine_norm=0.994246
prelu_at_68_4_4_out0_nhwc_1_120_90_64.npy u8 eculidean_norm=0.151476 cosine_norm=0.988528
convolution_at_input.6_5_5_out0_nhwc_1_120_90_64.npy u8 eculidean_norm=0.127811 cosine_norm=0.991832
prelu_at_81_6_6_out0_nhwc_1_120_90_64.npy u8 eculidean_norm=0.217819 cosine_norm=0.976277
convolution_at_input.8_7_7_out0_nhwc_1_120_90_64.npy u8 eculidean_norm=0.104153 cosine_norm=0.994576
prelu_at_94_8_8_out0_nhwc_1_120_90_64.npy u8 eculidean_norm=0.311085 cosine_norm=0.951613
convolution_at_input.10_9_9_out0_nhwc_1_120_90_64.npy u8 eculidean_norm=0.168560 cosine_norm=0.985794
prelu_at_107_10_10_out0_nhwc_1_120_90_64.npy u8 eculidean_norm=0.335073 cosine_norm=0.943863
convolution_at_input.1_11_11_out0_nhwc_1_120_90_48.npy u8 eculidean_norm=0.241604 cosine_norm=0.970814
pixel_shuffle_to_reshape_0_at_120_12_12_acuity_mark_perm_18_out0_nhwc_1_48_120_90.npy u8 eculidean_norm=0.241604 cosine_norm=0.970814
pixel_shuffle_to_reshape_0_at_120_12_12_out0_nhwc_1_3_4_4_120_90.npy u8 eculidean_norm=0.241604 cosine_norm=0.970814
pixel_shuffle_to_permute_1_at_120_13_13_out0_nhwc_1_3_120_4_90_4.npy u8 eculidean_norm=0.241604 cosine_norm=0.970814
pixel_shuffle_to_reshape_1_at_120_14_14_out0_nhwc_1_3_480_360.npy u8 eculidean_norm=0.241604 cosine_norm=0.970814
add_at_42_14_16_acuity_mark_perm_19_out0_nhwc_1_480_360_3.npy u8 eculidean_norm=0.241604 cosine_norm=0.970814
add_at_42_14_16_out0_nhwc_1_480_360_3.npy u8 eculidean_norm=0.033127 cosine_norm=0.999451
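For reference, my conversion and on-board test flow is roughly the sketch below. File names, input size, preprocessing and the dataset path are placeholders, and the exact config argument names may differ slightly depending on the rknn-toolkit version:

import cv2
from rknn.api import RKNN

rknn = RKNN()
# Placeholder config; my real preprocessing (mean/std, channel order) is set here.
rknn.config(target_platform=['rv1126'])

# The jit-traced TorchScript model; the non-jit path loads the original model instead.
rknn.load_pytorch(model='./sr_jit.pt', input_size_list=[[3, 120, 90]])
rknn.build(do_quantization=True, dataset='./dataset.txt')
rknn.export_rknn('./sr_jit.rknn')

# Accuracy analysis step that produced entire_qnt_error_analysis.txt.
rknn.accuracy_analysis(inputs=['./test_input.jpg'])

# On-board inference; this is where the jit-converted model gives wrong results.
img = cv2.imread('./test_input.jpg')               # placeholder low-resolution input
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
rknn.init_runtime(target='rv1126')
outputs = rknn.inference(inputs=[img])
rknn.release()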
If anyone has run into this problem, or if someone from Rockchip sees this, I would really appreciate some help. Thanks.