Toybrick

Title: YOLOv4 inference time is 2.5x that of v3

Author: brunolin    Time: 2020-9-10 09:58
Title: YOLOv4 inference time is 2.5x that of v3
The BFLOPS that darknet lists for YOLOv4 are lower than for YOLOv3. For the 416x416 models:
YOLOv3: 65.879 BFLOPS
YOLOv4: 60.137 BFLOPS

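After converting both models with rknn-toolkit v1.4.0 and running inference on the RK1808 compute stick,
v4 takes more than 2.5 times as long as v3:
YOLOv3:  82.4 ms
YOLOv4: 210.0 ms

What is the reason for this? Is there any way to speed it up?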
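
One rough way to read these figures is to turn them into effective throughput. The sketch below only uses the numbers quoted in this post. It assumes darknet's BFLOPS is the total work for one forward pass, and `effective_gflops` is a hypothetical helper name.

```python
# Figures quoted in the post above (416x416 models, RK1808 compute stick).
models = {
    "YOLOv3": {"bflops": 65.879, "latency_ms": 82.4},
    "YOLOv4": {"bflops": 60.137, "latency_ms": 210.0},
}


def effective_gflops(bflops, latency_ms):
    """Billions of operations per forward pass divided by seconds per pass."""
    return bflops / (latency_ms / 1000.0)


# How much slower v4 is, and how much work it does relative to v3.
latency_ratio = models["YOLOv4"]["latency_ms"] / models["YOLOv3"]["latency_ms"]
work_ratio = models["YOLOv4"]["bflops"] / models["YOLOv3"]["bflops"]

for name, m in models.items():
    print(f"{name}: {effective_gflops(**m):.1f} effective GFLOP/s")
print(f"latency ratio v4/v3: {latency_ratio:.2f}")
print(f"BFLOPS ratio  v4/v3: {work_ratio:.2f}")
```

v4 does slightly less arithmetic than v3 but reaches only about a third of v3's effective throughput, which suggests that some operators map poorly onto the NPU rather than the model simply being heavier.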

Author: rakidis    Time: 2020-10-9 16:34
I believe the reason YOLOv4 is so much slower is the "mish" activation layer. Since it uses exponential calculations, it could be behind this bottleneck.

I am trying to find a reasonable solution to this issue as well, if you have any idea please share with me mate.

cheers
Author: brunolin    Time: 2020-10-15 17:28
rakidis posted on 2020-10-9 16:34
I believe the reason for yolov4 being so much more slower is the "mish" activation layer. since it u ...

Actually, the mish op is not the key factor.
You can print per-layer timings for analysis with eval_perf().
There are 3 pooling operators that consume most of the time.
They come from the SPP module in YOLOv4.
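
For reference, a per-layer profile could be collected roughly as follows. This is a sketch that assumes the rknn-toolkit 1.x Python API (`RKNN.load_rknn`, `init_runtime` with `perf_debug=True`, `eval_perf`). `profile_layers`, the model path and the `target` string are placeholders, so check them against the user guide for your toolkit version.

```python
def profile_layers(rknn_path, image, target="rk1808"):
    """Run eval_perf() on a converted model and return its result.

    Assumes rknn-toolkit 1.x is installed. `image` is the preprocessed
    input array that the model expects (for example a 416x416x3 uint8 array).
    """
    # Imported lazily so this file can still be loaded without the SDK.
    from rknn.api import RKNN

    rknn = RKNN()
    try:
        if rknn.load_rknn(rknn_path) != 0:
            raise RuntimeError("load_rknn failed for %s" % rknn_path)
        # perf_debug=True asks the runtime for a per-layer time breakdown.
        if rknn.init_runtime(target=target, perf_debug=True) != 0:
            raise RuntimeError("init_runtime failed for target %s" % target)
        # is_print=True prints the timing table to the console.
        return rknn.eval_perf(inputs=[image], is_print=True)
    finally:
        rknn.release()
```

Calling `profile_layers("yolov4.rknn", img)` (hypothetical file name) with the compute stick attached should print one timing row per layer, where the slow pooling layers can be read off directly.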

Author: rakidis    Time: 2020-10-20 20:31
Oh, I see. So the solution would be to remove those pooling layers from the base network and then retrain on ImageNet?
Author: rakidis    Time: 2020-11-13 15:54
Did you manage to find a solution to this issue? I asked in the darknet repository for a way to remove SPP, but haven't got an answer.

I wonder why the SPP structure causes a bottleneck. Normally maxpool layers work fine, but inside SPP they take too much time.
Author: brunolin    Time: 2020-11-20 16:02
rakidis posted on 2020-11-13 15:54
Did you manage to find a solution to this issue? I asked a way to remove spp in darknet repository,  ...

You can work out how to remove SPP from yolov4.cfg by comparing yolov3.cfg with yolov3-spp.cfg.
Then retrain the model.
The kernel sizes used in these maxpool layers are so large that they run inefficiently on the RKNN NPU.
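
To locate the layers in question before editing the cfg, a small scan like the one below may help. It is only a sketch and not part of darknet or rknn-toolkit: `find_large_maxpools` and its `min_size` threshold are hypothetical, and the embedded excerpt is the SPP block as it commonly appears in yolov4.cfg (stride 1, sizes 5, 9 and 13), so verify it against your own file.

```python
def find_large_maxpools(cfg_text, min_size=5):
    """Return (layer_index, size, stride) for each [maxpool] with size >= min_size.

    Layer indices follow the darknet convention: [net] is not a layer,
    and the first section after it is layer 0.
    """
    sections = []
    current = None
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = {"type": line[1:-1], "opts": {}}
            sections.append(current)
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current["opts"][key.strip()] = value.strip()

    hits = []
    layer_index = -1
    for section in sections:
        if section["type"] == "net":
            continue
        layer_index += 1
        if section["type"] == "maxpool":
            size = int(section["opts"].get("size", 1))
            stride = int(section["opts"].get("stride", 1))
            if size >= min_size:
                hits.append((layer_index, size, stride))
    return hits


# Excerpt for illustration only; a real cfg has many more layers.
SAMPLE_CFG = """
[net]
width=416
height=416

[convolutional]
filters=512
size=1

### SPP ###
[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6
### End SPP ###
"""

for index, size, stride in find_large_maxpools(SAMPLE_CFG):
    print(f"layer {index}: maxpool size={size} stride={stride}")
```

The scan only reports candidates. Removing the block still means fixing the `[route]` indices and the input width of the following convolution by hand, then retraining as described above.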
