|
Official pretrained-weights model, no quantization. Log output:
toybrick@debian10:~/yolov3_demo$ python3 rknn_picture_yolov4_tiny.py
-->loading model
loading model done
--> Init runtime environment
W Flag perf_debug has been set, it will affect the performance of inference!
I NPUTransfer: Starting NPU Transfer Client, Transfer version 2.0.0 (8f9ebbc@2020-04-03T09:12:43)
D RKNNAPI: ==============================================
D RKNNAPI: RKNN VERSION:
D RKNNAPI: API: 1.4.0 (b4a8096 build: 2020-08-12 10:16:10)
D RKNNAPI: DRV: 1.3.0 (c4f8c23 build: 2019-11-25 10:39:29)
D RKNNAPI: ==============================================
done
W When performing performance evaluation, inputs can be set to None to use fake inputs.
========================================================================
Performance
#### The performance result is just for debugging, ####
#### may worse than actual performance! ####
========================================================================
Layer ID Name Operator Uid Time(us)
27 convolution_1_0 RESHUFFLE 1 1408
CONVOLUTION
0 leakyrelu_3_0 ACTIVATION 3 690
28 convolution_4_0 RESHUFFLE 4 2854
CONVOLUTION
1 leakyrelu_6_0 ACTIVATION 6 399
29 convolution_7_2 CONVOLUTION 7 476
2 leakyrelu_9_0 ACTIVATION 9 362
3 slice_10_1 SPECIAL_OP 10 484
30 convolution_11_2 CONVOLUTION 11 279
4 leakyrelu_13_0 ACTIVATION 13 241
31 convolution_14_2 CONVOLUTION 14 277
5 leakyrelu_16_0 ACTIVATION 16 239
32 convolution_18_2 CONVOLUTION 18 239
6 leakyrelu_20_0 ACTIVATION 20 360
7 pooling_22_0 POOLING 22 494
33 convolution_23_2 CONVOLUTION 23 461
8 leakyrelu_25_0 ACTIVATION 25 231
9 slice_26_1 SPECIAL_OP 26 347
34 convolution_27_2 CONVOLUTION 27 196
10 leakyrelu_29_0 ACTIVATION 29 165
35 convolution_30_2 CONVOLUTION 30 191
11 leakyrelu_32_0 ACTIVATION 32 168
36 convolution_34_2 CONVOLUTION 34 330
12 leakyrelu_36_0 ACTIVATION 36 229
13 pooling_38_0 POOLING 38 246
37 convolution_39_2 CONVOLUTION 39 446
14 leakyrelu_41_0 ACTIVATION 41 161
15 slice_42_1 SPECIAL_OP 42 347
38 convolution_43_2 CONVOLUTION 43 191
16 leakyrelu_45_0 ACTIVATION 45 130
39 convolution_46_2 CONVOLUTION 46 184
17 leakyrelu_48_0 ACTIVATION 48 128
40 convolution_50_2 CONVOLUTION 50 225
18 leakyrelu_52_0 ACTIVATION 52 161
19 TensorCopy_0 SPECIAL_OP 196
20 pooling_54_0 POOLING 54 178
41 convolution_55_2 CONVOLUTION 55 517
21 leakyrelu_57_0 ACTIVATION 57 127
42 convolution_58_2 CONVOLUTION 58 167
22 leakyrelu_60_0 ACTIVATION 60 113
43 convolution_61_2 CONVOLUTION 61 313
23 leakyrelu_63_0 ACTIVATION 63 130
45 convolution_64_2 CONVOLUTION 64 165
44 convolution_67_2 CONVOLUTION 67 117
24 leakyrelu_69_0 ACTIVATION 69 102
25 upsampling_70_1 SPECIAL_OP 70 445
46 convolution_72_2 CONVOLUTION 72 650
26 leakyrelu_74_0 ACTIVATION 74 163
47 convolution_75_2 CONVOLUTION 75 221
Total Time(us): 16943
FPS: 59.02
My own trained model, no quantization. Log output:
toybrick@debian10:~/yolov3_demo$ python3 rknn_picture_yolov4_tiny_3l_416x416_2.py
-->loading model
loading model done
--> Init runtime environment
W Flag perf_debug has been set, it will affect the performance of inference!
I NPUTransfer: Starting NPU Transfer Client, Transfer version 2.0.0 (8f9ebbc@2020-04-03T09:12:43)
D RKNNAPI: ==============================================
D RKNNAPI: RKNN VERSION:
D RKNNAPI: API: 1.4.0 (b4a8096 build: 2020-08-12 10:16:10)
D RKNNAPI: DRV: 1.3.0 (c4f8c23 build: 2019-11-25 10:39:29)
D RKNNAPI: ==============================================
done
W When performing performance evaluation, inputs can be set to None to use fake inputs.
========================================================================
Performance
#### The performance result is just for debugging, ####
#### may worse than actual performance! ####
========================================================================
Layer ID Name Operator Uid Time(us)
31 convolution_1_0 RESHUFFLE 1 1423
CONVOLUTION
0 leakyrelu_3_0 ACTIVATION 3 689
32 convolution_4_0 RESHUFFLE 4 2846
CONVOLUTION
1 leakyrelu_6_0 ACTIVATION 6 401
33 convolution_7_2 CONVOLUTION 7 471
2 leakyrelu_9_0 ACTIVATION 9 363
3 slice_10_1 SPECIAL_OP 10 478
34 convolution_11_2 CONVOLUTION 11 277
4 leakyrelu_13_0 ACTIVATION 13 245
35 convolution_14_2 CONVOLUTION 14 271
5 leakyrelu_16_0 ACTIVATION 16 238
36 convolution_18_2 CONVOLUTION 18 238
6 leakyrelu_20_0 ACTIVATION 20 369
7 pooling_22_0 POOLING 22 463
37 convolution_23_2 CONVOLUTION 23 462
8 leakyrelu_25_0 ACTIVATION 25 231
9 slice_26_1 SPECIAL_OP 26 315
38 convolution_27_2 CONVOLUTION 27 192
10 leakyrelu_29_0 ACTIVATION 29 165
39 convolution_30_2 CONVOLUTION 30 198
11 leakyrelu_32_0 ACTIVATION 32 162
40 convolution_34_2 CONVOLUTION 34 330
12 leakyrelu_36_0 ACTIVATION 36 237
13 TensorCopy_0 SPECIAL_OP 308
14 pooling_38_0 POOLING 38 275
41 convolution_39_2 CONVOLUTION 39 449
15 leakyrelu_41_0 ACTIVATION 41 159
16 slice_42_1 SPECIAL_OP 42 348
42 convolution_43_2 CONVOLUTION 43 190
17 leakyrelu_45_0 ACTIVATION 45 131
43 convolution_46_2 CONVOLUTION 46 194
18 leakyrelu_48_0 ACTIVATION 48 129
44 convolution_50_2 CONVOLUTION 50 220
19 leakyrelu_52_0 ACTIVATION 52 161
20 TensorCopy_0 SPECIAL_OP 195
21 pooling_54_0 POOLING 54 206
45 convolution_55_2 CONVOLUTION 55 515
22 leakyrelu_57_0 ACTIVATION 57 130
46 convolution_58_2 CONVOLUTION 58 173
23 leakyrelu_60_0 ACTIVATION 60 110
47 convolution_61_2 CONVOLUTION 61 316
24 leakyrelu_63_0 ACTIVATION 63 127
49 convolution_64_2 CONVOLUTION 64 164
48 convolution_67_2 CONVOLUTION 67 114
25 leakyrelu_69_0 ACTIVATION 69 107
26 upsampling_70_1 SPECIAL_OP 70 450
50 convolution_72_2 CONVOLUTION 72 655
27 leakyrelu_74_0 ACTIVATION 74 161
51 convolution_75_2 CONVOLUTION 75 240
52 convolution_78_2 CONVOLUTION 78 153
28 leakyrelu_80_0 ACTIVATION 80 121
29 upsampling_81_1 SPECIAL_OP 81 484
53 convolution_83_2 CONVOLUTION 83 672
30 leakyrelu_85_0 ACTIVATION 85 233
54 convolution_86_2 CONVOLUTION 86 423
Total Time(us): 19377
FPS: 51.61
My own trained model, quantized. Log output:
toybrick@debian10:~/yolov3_demo$ python3 rknn_picture_yolov4_tiny_3l_416x416_2.py
-->loading model
loading model done
--> Init runtime environment
W Flag perf_debug has been set, it will affect the performance of inference!
I NPUTransfer: Starting NPU Transfer Client, Transfer version 2.0.0 (8f9ebbc@2020-04-03T09:12:43)
D RKNNAPI: ==============================================
D RKNNAPI: RKNN VERSION:
D RKNNAPI: API: 1.4.0 (b4a8096 build: 2020-08-12 10:16:10)
D RKNNAPI: DRV: 1.3.0 (c4f8c23 build: 2019-11-25 10:39:29)
D RKNNAPI: ==============================================
done
W When performing performance evaluation, inputs can be set to None to use fake inputs.
========================================================================
Performance
#### The performance result is just for debugging, ####
#### may worse than actual performance! ####
========================================================================
Layer ID Name Operator Uid Time(us)
0 convolution_1_pre_asymmetric_affineu8_to SPECIAL_OP 2 381
_float32_2_0
35 convolution_1_0 RESHUFFLE 1 2822
CONVOLUTION
RESHUFFLE
CONVOLUTION
1 convolution_1_next_float32_to_asymmetric SPECIAL_OP 5 834
_affineu8_5_0
2 leakyrelu_3_0 ACTIVATION 3 691
3 convolution_4_pre_asymmetric_affineu8_to SPECIAL_OP 8 1069
_float32_8_0
36 convolution_4_0 RESHUFFLE 4 15912
CONVOLUTION
RESHUFFLE
CONVOLUTION
RESHUFFLE
CONVOLUTION
RESHUFFLE
CONVOLUTION
RESHUFFLE
CONVOLUTION
4 convolution_4_next_float32_to_asymmetric SPECIAL_OP 12 380
_affineu8_12_0
5 leakyrelu_6_0 ACTIVATION 6 404
37 convolution_7_2 CONVOLUTION 7 473
6 leakyrelu_9_0 ACTIVATION 9 367
7 slice_10_1 SPECIAL_OP 10 487
38 convolution_11_2 CONVOLUTION 11 288
8 leakyrelu_13_0 ACTIVATION 13 242
39 convolution_14_2 CONVOLUTION 14 270
9 leakyrelu_16_0 ACTIVATION 16 238
40 convolution_18_2 CONVOLUTION 18 247
10 leakyrelu_20_0 ACTIVATION 20 363
11 pooling_22_0 POOLING 22 458
41 convolution_23_2 CONVOLUTION 23 505
12 leakyrelu_25_0 ACTIVATION 25 235
13 slice_26_1 SPECIAL_OP 26 339
42 convolution_27_2 CONVOLUTION 27 196
14 leakyrelu_29_0 ACTIVATION 29 159
43 convolution_30_2 CONVOLUTION 30 191
15 leakyrelu_32_0 ACTIVATION 32 161
44 convolution_34_2 CONVOLUTION 34 337
16 leakyrelu_36_0 ACTIVATION 36 231
17 TensorCopy_0 SPECIAL_OP 300
18 pooling_38_0 POOLING 38 277
45 convolution_39_2 CONVOLUTION 39 444
19 leakyrelu_41_0 ACTIVATION 41 162
20 slice_42_1 SPECIAL_OP 42 356
46 convolution_43_2 CONVOLUTION 43 199
21 leakyrelu_45_0 ACTIVATION 45 133
47 convolution_46_2 CONVOLUTION 46 185
22 leakyrelu_48_0 ACTIVATION 48 130
48 convolution_50_2 CONVOLUTION 50 219
23 leakyrelu_52_0 ACTIVATION 52 163
24 TensorCopy_0 SPECIAL_OP 201
25 pooling_54_0 POOLING 54 190
49 convolution_55_2 CONVOLUTION 55 522
26 leakyrelu_57_0 ACTIVATION 57 129
50 convolution_58_2 CONVOLUTION 58 169
27 leakyrelu_60_0 ACTIVATION 60 112
51 convolution_61_2 CONVOLUTION 61 313
28 leakyrelu_63_0 ACTIVATION 63 127
53 convolution_64_2 CONVOLUTION 64 197
52 convolution_67_2 CONVOLUTION 67 114
29 leakyrelu_69_0 ACTIVATION 69 102
30 upsampling_70_1 SPECIAL_OP 70 450
54 convolution_72_2 CONVOLUTION 72 653
31 leakyrelu_74_0 ACTIVATION 74 166
55 convolution_75_2 CONVOLUTION 75 246
56 convolution_78_2 CONVOLUTION 78 174
32 leakyrelu_80_0 ACTIVATION 80 119
33 upsampling_81_1 SPECIAL_OP 81 454
57 convolution_83_2 CONVOLUTION 83 675
34 leakyrelu_85_0 ACTIVATION 85 235
58 convolution_86_2 CONVOLUTION 86 433
Total Time(us): 36629
FPS: 27.30
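Looking at the logs above: with quantization, inference on my own model actually got slower (36629 us total, 27.30 FPS) than without it (19377 us total, 51.61 FPS). Why?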
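To see which layers account for the slowdown, the per-layer times of two perf tables can be diffed. Below is a minimal sketch, not part of the RKNN API: `parse_perf` and `slowest_regressions` are hypothetical helper names, and the row format (layer ID, name, operator, optional Uid, time in us) is assumed from the logs pasted above. The sample rows are copied from the first few lines of the unquantized and quantized tables for my own model.

```python
import re

# One timed row: "<layer id> <name> <OPERATOR> [<uid>] <time_us>".
# Continuation lines ("CONVOLUTION", "_float32_2_0") carry no time and are skipped.
ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+([A-Z_]+)\s+(?:(\d+)\s+)?(\d+)\s*$")


def parse_perf(log_text):
    """Return {layer name: time in us} for every timed row in a perf table."""
    times = {}
    for line in log_text.splitlines():
        m = ROW.match(line)
        if m:
            name, time_us = m.group(2), int(m.group(5))
            # Names such as TensorCopy_0 can repeat, so sum their times.
            times[name] = times.get(name, 0) + time_us
    return times


def slowest_regressions(before, after, top=3):
    """Layers in `after` sorted by added time; layers new in `after` count in full."""
    deltas = {name: after[name] - before.get(name, 0) for name in after}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:top]


FLOAT_LOG = """\
31 convolution_1_0 RESHUFFLE 1 1423
CONVOLUTION
0 leakyrelu_3_0 ACTIVATION 3 689
32 convolution_4_0 RESHUFFLE 4 2846
CONVOLUTION
1 leakyrelu_6_0 ACTIVATION 6 401
"""

QUANT_LOG = """\
0 convolution_1_pre_asymmetric_affineu8_to SPECIAL_OP 2 381
_float32_2_0
35 convolution_1_0 RESHUFFLE 1 2822
CONVOLUTION
1 convolution_1_next_float32_to_asymmetric SPECIAL_OP 5 834
_affineu8_5_0
2 leakyrelu_3_0 ACTIVATION 3 691
3 convolution_4_pre_asymmetric_affineu8_to SPECIAL_OP 8 1069
_float32_8_0
36 convolution_4_0 RESHUFFLE 4 15912
CONVOLUTION
4 convolution_4_next_float32_to_asymmetric SPECIAL_OP 12 380
_affineu8_12_0
5 leakyrelu_6_0 ACTIVATION 6 404
"""

for name, delta in slowest_regressions(parse_perf(FLOAT_LOG), parse_perf(QUANT_LOG)):
    print(f"{name}: +{delta} us")
```

On these rows, `convolution_4_0` is by far the largest contributor (15912 us quantized vs. 2846 us unquantized), followed by `convolution_1_0` and the u8/float32 conversion ops inserted around both layers. That suggests the first two convolutions may be running in float32 instead of u8 in the quantized model, although the log alone does not confirm the cause.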
|
|