Hi,
I found an interesting problem with rknn.eval_perf() while evaluating RKNN with DeepLabv3 (see [1] and [2] for background on that topic).
After rknn.inference(), I use the code below to print the performance evaluation result:
perf_results = rknn.eval_perf(inputs=[img])
However, the reported FPS differs by more than 3x (3.80 vs 13.08) depending on whether perf_debug is turned on or off in rknn.init_runtime(). The results are below; nothing else changed except the init_runtime() call.
ret = rknn.init_runtime(perf_debug=False, eval_mem=False)
========================================================================
Performance
========================================================================
Total Time(us): 263127
FPS: 3.80
========================================================================
ret = rknn.init_runtime(perf_debug=True, eval_mem=False)
========================================================================
Performance
#### The performance result is just for debugging, ####
#### may worse than actual performance! ####
========================================================================
Layer ID Name Operator Time(us)
0 ConvolutionReluPoolingLayer2_0 RESHUFFLE 4153
1 ConvolutionReluPoolingLayer2_2 CONVOLUTION 780
2 ConvolutionReluPoolingLayer2_2 CONVOLUTION 568
3 ConvolutionReluPoolingLayer2_2 CONVOLUTION 1683
4 ConvolutionReluPoolingLayer2_0 RESHUFFLE 3944
5 ConvolutionReluPoolingLayer2_2 CONVOLUTION 269
6 ConvolutionReluPoolingLayer2_2 CONVOLUTION 314
7 ConvolutionReluPoolingLayer2_2 CONVOLUTION 1310
8 ConvolutionReluPoolingLayer2_2 CONVOLUTION 1070
9 ConvolutionReluPoolingLayer2_2 CONVOLUTION 232
10 ConvolutionReluPoolingLayer2_2 CONVOLUTION 320
11 ConvolutionReluPoolingLayer2_0 RESHUFFLE 1463
12 ConvolutionReluPoolingLayer2_2 CONVOLUTION 164
13 ConvolutionReluPoolingLayer2_2 CONVOLUTION 202
14 ConvolutionReluPoolingLayer2_2 CONVOLUTION 341
15 ConvolutionReluPoolingLayer2_2 CONVOLUTION 181
16 ConvolutionReluPoolingLayer2_2 CONVOLUTION 119
17 ConvolutionReluPoolingLayer2_2 CONVOLUTION 232
18 ConvolutionReluPoolingLayer2_2 CONVOLUTION 323
19 ConvolutionReluPoolingLayer2_2 CONVOLUTION 179
20 ConvolutionReluPoolingLayer2_2 CONVOLUTION 126
21 ConvolutionReluPoolingLayer2_2 CONVOLUTION 216
22 ConvolutionReluPoolingLayer2_2 CONVOLUTION 322
23 ConvolutionReluPoolingLayer2_2 CONVOLUTION 188
24 ConvolutionReluPoolingLayer2_2 CONVOLUTION 351
25 ConvolutionReluPoolingLayer2_2 CONVOLUTION 1144
26 ConvolutionReluPoolingLayer2_2 CONVOLUTION 278
27 ConvolutionReluPoolingLayer2_2 CONVOLUTION 158
28 ConvolutionReluPoolingLayer2_2 CONVOLUTION 351
29 ConvolutionReluPoolingLayer2_2 CONVOLUTION 1138
30 ConvolutionReluPoolingLayer2_2 CONVOLUTION 277
31 ConvolutionReluPoolingLayer2_2 CONVOLUTION 158
32 ConvolutionReluPoolingLayer2_2 CONVOLUTION 350
33 ConvolutionReluPoolingLayer2_2 CONVOLUTION 1118
34 ConvolutionReluPoolingLayer2_2 CONVOLUTION 272
35 ConvolutionReluPoolingLayer2_2 CONVOLUTION 158
36 ConvolutionReluPoolingLayer2_2 CONVOLUTION 347
37 ConvolutionReluPoolingLayer2_2 CONVOLUTION 1112
38 ConvolutionReluPoolingLayer2_2 CONVOLUTION 294
39 ConvolutionReluPoolingLayer2_2 CONVOLUTION 513
40 ConvolutionReluPoolingLayer2_2 CONVOLUTION 2144
41 ConvolutionReluPoolingLayer2_2 CONVOLUTION 773
42 ConvolutionReluPoolingLayer2_2 CONVOLUTION 278
43 ConvolutionReluPoolingLayer2_2 CONVOLUTION 511
44 ConvolutionReluPoolingLayer2_2 CONVOLUTION 2153
45 ConvolutionReluPoolingLayer2_2 CONVOLUTION 782
46 ConvolutionReluPoolingLayer2_2 CONVOLUTION 210
47 ConvolutionReluPoolingLayer2_2 CONVOLUTION 512
48 ConvolutionReluPoolingLayer2_2 CONVOLUTION 2154
49 ConvolutionReluPoolingLayer2_2 CONVOLUTION 775
50 ConvolutionReluPoolingLayer2_2 CONVOLUTION 804
51 ConvolutionReluPoolingLayer2_2 CONVOLUTION 9144
52 ConvolutionReluPoolingLayer2_2 CONVOLUTION 1231
53 ConvolutionReluPoolingLayer2_2 CONVOLUTION 372
54 ConvolutionReluPoolingLayer2_2 CONVOLUTION 804
55 ConvolutionReluPoolingLayer2_2 CONVOLUTION 9148
56 ConvolutionReluPoolingLayer2_2 CONVOLUTION 1204
57 ConvolutionReluPoolingLayer2_2 CONVOLUTION 324
58 ConvolutionReluPoolingLayer2_2 CONVOLUTION 1673
59 ConvolutionReluPoolingLayer2_2 CONVOLUTION 9531
60 ConvolutionReluPoolingLayer2_2 CONVOLUTION 834
61 PoolingLayer2_1 POOLING 413
62 ConvolutionReluPoolingLayer2_2 CONVOLUTION 1195
63 FullyConnectedReluLayer_0 FULLYCONNECTED 106
65 ConvolutionReluPoolingLayer2_2 CONVOLUTION 2314
66 ConvolutionReluPoolingLayer2_2 CONVOLUTION 328
Total Time(us): 76435
FPS: 13.08
========================================================================
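A quick sanity check on the two reports, as a minimal sketch based only on my own arithmetic over the numbers printed above (nothing here comes from the RKNN documentation). The per-layer times in the perf_debug=True table add up exactly to its Total Time, and in both reports FPS is simply 1,000,000 divided by Total Time(us). That suggests, though it does not prove, that the debug total only counts time attributed to the listed layers. The helper name fps_from_total_us is made up for illustration.

```python
# Per-layer times (us) copied from the perf_debug=True table above, in row order
# (layer IDs 0-63, 65 and 66; there is no row with ID 64 in the output).
LAYER_TIMES_US = [
    4153, 780, 568, 1683, 3944, 269, 314, 1310, 1070, 232, 320,
    1463, 164, 202, 341, 181, 119, 232, 323, 179, 126, 216, 322, 188,
    351, 1144, 278, 158, 351, 1138, 277, 158, 350, 1118, 272, 158, 347, 1112, 294,
    513, 2144, 773, 278, 511, 2153, 782, 210, 512, 2154, 775, 804,
    9144, 1231, 372, 804, 9148, 1204, 324, 1673, 9531, 834, 413, 1195, 106, 2314, 328,
]


def fps_from_total_us(total_us):
    """Frames per second implied by a total time given in microseconds."""
    return 1_000_000 / total_us


layer_sum_us = sum(LAYER_TIMES_US)

print("sum of per-layer times (us):", layer_sum_us)                  # 76435
print("FPS, perf_debug=True :", round(fps_from_total_us(76435), 2))   # 13.08
print("FPS, perf_debug=False:", round(fps_from_total_us(263127), 2))  # 3.8
```

So the gap between the two modes is roughly 263127 - 76435 = 186692 us that shows up only in the perf_debug=False total.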
Time measurement with Python time.time()
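Measured with time.time(), rknn.inference() actually takes about 0.3 s per call. So the perf_debug=False result (263127 us, about 0.26 s) seems much closer to reality than the perf_debug=True result (76435 us, about 0.08 s). But eval_perf() is a black box to me. RK people, please take a look at the internal implementation to see what might cause this discrepancy. Thanks!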
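For anyone who wants to repeat the wall-clock measurement, here is a minimal sketch of how it can be done. The helper name time_call and its warmup/runs parameters are my own choices for illustration, and the rknn usage is shown only as a comment because it needs an initialised runtime and a real input image.

```python
import time


def time_call(fn, warmup=1, runs=10):
    """Return the mean wall-clock seconds per call of fn() over `runs` calls."""
    # Warm-up calls are excluded so one-off setup cost does not skew the mean.
    for _ in range(warmup):
        fn()
    start = time.time()
    for _ in range(runs):
        fn()
    return (time.time() - start) / runs


# Hypothetical usage, assuming `rknn` is already initialised and `img` is loaded:
#   mean_s = time_call(lambda: rknn.inference(inputs=[img]))
#   print("inference: %.3f s per call, %.2f FPS" % (mean_s, 1.0 / mean_s))

if __name__ == "__main__":
    # Stand-in workload so the sketch runs without any hardware attached.
    mean_s = time_call(lambda: time.sleep(0.01), warmup=1, runs=5)
    print("stand-in workload: %.4f s per call" % mean_s)
```

Averaging over several runs and skipping the first call should make the number less sensitive to one-time initialisation, which matters when comparing it against the eval_perf() totals.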
References:
1. http://t.rock-chips.com/forum.ph ... &extra=#pid1329
2. http://t.rock-chips.com/forum.ph ... =deeplab&page=2