Toybrick

Title: YOLOv4 inference time is 2.5 times that of v3

Author: brunolin Time: 2020-9-10 09:58
The BFLOPS that darknet lists for YOLOv4 is lower than for YOLOv3. For the 416x416 models:
YOLOv3: 65.879 BFLOPS
YOLOv4: 60.137 BFLOPS

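After converting the models with rknn-toolkit v1.4.0 and running inference on the RK1808 compute stick,
v4 nevertheless takes more than 2.5 times as long as v3: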
YOLOv3: 82.4 ms
YOLOv4: 210.0 ms

What is the reason for this? Is there any way to speed it up?
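
For reference, the figures quoted above can be put side by side with a few lines of arithmetic. This is only a sketch using the numbers from the post; the "effective throughput" line is an illustrative derived quantity (BFLOPS divided by latency), not something reported by darknet or rknn-toolkit.

```python
# Figures quoted in the post (416x416 models, RK1808 compute stick).
bflops = {"YOLOv3": 65.879, "YOLOv4": 60.137}
latency_ms = {"YOLOv3": 82.4, "YOLOv4": 210.0}

# Ratio of v4 to v3 for theoretical work and for measured time.
flops_ratio = bflops["YOLOv4"] / bflops["YOLOv3"]
latency_ratio = latency_ms["YOLOv4"] / latency_ms["YOLOv3"]

# Illustrative only: billions of operations per second actually achieved.
throughput = {name: bflops[name] / (latency_ms[name] / 1000.0) for name in bflops}

print(f"BFLOPS ratio v4/v3:  {flops_ratio:.3f}")
print(f"Latency ratio v4/v3: {latency_ratio:.2f}")
for name, value in throughput.items():
    print(f"{name}: {value:.1f} BFLOPS per second (effective)")
```

YOLOv4 does roughly 9% less theoretical work yet takes about 2.5 times as long, so the slowdown has to come from how individual layers map onto the NPU rather than from the total operation count.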

Author: rakidis Time: 2020-10-9 16:34
I believe the reason YOLOv4 is so much slower is the "mish" activation layer. Since it uses exponential calculations, it could be what is behind this bottleneck.
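
To make the "exponential calculations" concrete, here is a minimal pure-Python sketch of mish, defined as x * tanh(softplus(x)), next to the leaky ReLU (slope 0.1) that darknet uses for YOLOv3. The function names are mine, and this says nothing about how the RKNN NPU actually implements either activation.

```python
import math


def softplus(x: float) -> float:
    # ln(1 + e^x), rearranged so it does not overflow for large positive x.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    # mish(x) = x * tanh(softplus(x)): one exp, one log and one tanh per element.
    return x * math.tanh(softplus(x))


def leaky_relu(x: float, slope: float = 0.1) -> float:
    # A single comparison and at most one multiply per element.
    return x if x > 0 else slope * x


for value in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"x={value:+.1f}  mish={mish(value):+.4f}  leaky={leaky_relu(value):+.4f}")
```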

I am trying to find a reasonable solution to this issue as well. If you have any ideas, please share them with me, mate.

cheers

Author: brunolin Time: 2020-10-15 17:28

rakidis posted on 2020-10-9 16:34
I believe the reason for yolov4 being so much more slower is the "mish" activation layer. since it u ...

Actually, the mish op is not the key factor.
You can print per-layer timings for analysis with eval_perf().
There are 3 pooling operators that consume a lot of time.
They come from the SPP module in YOLOv4.
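
A minimal sketch of that per-layer profiling step, assuming rknn-toolkit 1.x is installed and an RK1808 device is attached. The helper name `profile_layers` and the file name in the usage note are hypothetical; the exact output format depends on the toolkit version.

```python
def profile_layers(rknn_path, image, target="rk1808"):
    """Print per-layer timings for an already converted .rknn model."""
    # Imported lazily so this file can be loaded without rknn-toolkit installed.
    from rknn.api import RKNN

    rknn = RKNN()
    try:
        if rknn.load_rknn(rknn_path) != 0:
            raise RuntimeError("load_rknn failed")
        # perf_debug=True asks the runtime to collect per-layer timings.
        if rknn.init_runtime(target=target, perf_debug=True) != 0:
            raise RuntimeError("init_runtime failed")
        # is_print=True prints the per-layer table to the console.
        return rknn.eval_perf(inputs=[image], is_print=True)
    finally:
        rknn.release()
```

Usage would look like `profile_layers("yolov4.rknn", img)`, where `img` is a preprocessed input image as a NumPy array and the model file name is a placeholder. Layers with unusually large times in the printed table are the ones to investigate.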

Author: rakidis Time: 2020-10-20 20:31
Oh, I see. So the solution would be to remove those pooling layers from the base network and then retrain on ImageNet?

Author: rakidis Time: 2020-11-13 15:54
Did you manage to find a solution to this issue? I asked in the darknet repository for a way to remove SPP, but haven't gotten an answer.

I wonder why the SPP structure causes a bottleneck. Normally maxpool layers work fine, but inside SPP they take too much time.
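
One way to see how the SPP pooling layers differ from an ordinary downsampling maxpool is to count how many input elements each one reads. The sketch below assumes the standard YOLOv4 SPP configuration (kernel sizes 5, 9 and 13 with stride 1) applied to a 13x13x512 feature map for a 416x416 input. It is a rough operation count only, not a model of how the NPU schedules these layers.

```python
def pool_window_reads(height, width, channels, size, stride):
    """Input elements read by a 'same'-padded max pool over one feature map."""
    out_h = -(-height // stride)  # ceiling division
    out_w = -(-width // stride)
    return out_h * out_w * channels * size * size


# Assumed shapes: 13x13x512 SPP input for a 416x416 YOLOv4 model.
spp = {k: pool_window_reads(13, 13, 512, k, 1) for k in (5, 9, 13)}
# An ordinary 2x2, stride-2 maxpool on the same map, for comparison.
plain = pool_window_reads(13, 13, 512, 2, 2)

for k, reads in spp.items():
    print(f"SPP maxpool {k}x{k}, stride 1: {reads:,} reads ({reads / plain:.1f}x plain)")
print(f"plain maxpool 2x2, stride 2: {plain:,} reads")
```

With stride 1 the output is not downsampled, and each output element covers a window of 25, 81 or 169 inputs instead of 4, so the three SPP pools together do far more work than a typical maxpool.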

Author: brunolin Time: 2020-11-20 16:02

rakidis posted on 2020-11-13 15:54
Did you manage to find a solution to this issue? I asked a way to remove spp in darknet repository, ...

You can work out how to remove SPP from yolov4.cfg by comparing yolov3.cfg with yolov3-spp.cfg.
Then retrain the model.
The kernel sizes used in these maxpool layers are so large that they are inefficient to execute on the RKNN NPU.
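
As a starting point for that comparison, the following sketch parses darknet-style cfg text and lists the stride-1 maxpool sections with large kernels. `SAMPLE_CFG` is an illustrative fragment that follows the SPP pattern (maxpool sizes 5, 9 and 13 interleaved with route layers); the helper names are mine. It only locates the layers. Actually removing them also means checking the route indices that follow and retraining, which this does not attempt.

```python
SAMPLE_CFG = """
[convolutional]
filters=512
size=1
stride=1

[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6

[convolutional]
filters=512
size=1
stride=1
"""


def parse_sections(cfg_text):
    """Split darknet cfg text into a list of (section_name, options) pairs."""
    sections = []
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


def find_spp_pools(sections, min_size=3):
    """Return (section_index, kernel_size) for stride-1 maxpools with large kernels."""
    return [
        (index, int(opts["size"]))
        for index, (name, opts) in enumerate(sections)
        if name == "maxpool"
        and int(opts.get("stride", 1)) == 1
        and int(opts.get("size", 1)) >= min_size
    ]


print(find_spp_pools(parse_sections(SAMPLE_CFG)))
```

The indices are section positions within the text that was parsed. In a full cfg file the leading [net] section occupies position 0, so darknet's layer number is the section index minus one.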

Welcome to Toybrick (https://t.rock-chips.com/)