Toybrick

YOLOv4 inference time is 2.5 times that of v3

brunolin

Intermediate member

Points
325
Posted on 2020-9-10 09:58:21 | Views: 802 | Replies: 5
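On darknet, the BFLOPS listed for YOLOv4 is lower than for YOLOv3. For the 416x416 models:
YOLOv3: 65.879 BFLOPS
YOLOv4: 60.137 BFLOPS

After converting the models with rknn-toolkit v1.4.0 and running inference on an RK1808 compute stick,
v4 nevertheless takes more than 2.5 times as long as v3:
YOLOv3:  82.4 ms
YOLOv4: 210.0 ms

What is the reason for this? Is there any way to speed it up?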
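
To put the two sets of figures side by side, here is a small sketch that only redoes the arithmetic on the numbers quoted above. The "effective rate" is a derived, rough figure (operations per inference divided by measured latency), not something reported by darknet or rknn-toolkit, and the names in the code are made up for illustration.

```python
# Figures quoted in the post above (416x416 models, RK1808 compute stick).
MODELS = {
    "YOLOv3": {"bflops": 65.879, "latency_ms": 82.4},
    "YOLOv4": {"bflops": 60.137, "latency_ms": 210.0},
}


def effective_gflops_per_s(bflops, latency_ms):
    """Billions of operations per inference divided by seconds per inference."""
    return bflops / (latency_ms / 1000.0)


# How much longer v4 takes, and how much less work darknet says it does.
latency_ratio = MODELS["YOLOv4"]["latency_ms"] / MODELS["YOLOv3"]["latency_ms"]
ops_ratio = MODELS["YOLOv4"]["bflops"] / MODELS["YOLOv3"]["bflops"]
rates = {name: effective_gflops_per_s(**m) for name, m in MODELS.items()}

print(f"latency ratio v4/v3: {latency_ratio:.2f}")
print(f"BFLOPS ratio  v4/v3: {ops_ratio:.2f}")
for name, rate in rates.items():
    print(f"{name}: roughly {rate:.0f} effective GFLOP/s")
```

v4 does slightly less arithmetic yet runs far slower, so the operation count alone cannot explain the gap. Some layers must map poorly onto the NPU.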

rakidis

Intermediate member

Points
274
Posted on 2020-10-9 16:34:33
I believe YOLOv4 is so much slower because of the "mish" activation layer. Since it uses exponential calculations, it could be behind this bottleneck.
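
For reference, Mish is usually defined as x * tanh(softplus(x)), so each activation needs an exponential, a logarithm and a tanh, while the leaky ReLU used in YOLOv3 is a compare and a multiply. Below is a minimal plain-Python sketch of the two functions. It illustrates the math only and says nothing about how the RKNN runtime actually implements either op.

```python
import math


def softplus(x):
    # Numerically stable form of ln(1 + e^x).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    # Mish activation: x * tanh(softplus(x)).
    return x * math.tanh(softplus(x))


def leaky_relu(x, slope=0.1):
    # Leaky ReLU with the 0.1 negative slope that darknet uses.
    return x if x > 0 else slope * x


for value in (-2.0, 0.0, 1.0, 3.0):
    print(value, mish(value), leaky_relu(value))
```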

I am trying to find a reasonable solution to this issue as well. If you have any ideas, please share them with me, mate.

cheers

brunolin

Intermediate member

Points
325
Original poster | Posted on 2020-10-15 17:28:49
rakidis posted on 2020-10-9 16:34
I believe the reason for yolov4 being so much more slower is the "mish" activation layer. since it u ...

Actually, the mish op is not the key factor.
You can output per-layer results for analysis with eval_perf().
There are 3 pooling operators that consume a lot of time.
They come from the SPP module in YOLOv4.
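
A rough sketch of how that per-layer report might be obtained with rknn-toolkit 1.x. It assumes the toolkit is installed, an RK1808 device is attached, and the model has already been converted. The function name, file names and input shape are hypothetical, and the exact arguments of `init_runtime` and `eval_perf` should be checked against the toolkit version in use.

```python
def profile_layers(rknn_path, image, target="rk1808"):
    """Print per-layer timings for an already converted .rknn model."""
    # Imported here so this file still loads on machines without the toolkit.
    from rknn.api import RKNN

    rknn = RKNN()
    try:
        if rknn.load_rknn(rknn_path) != 0:
            raise RuntimeError("load_rknn failed")
        # perf_debug=True is assumed to request per-layer timing data.
        if rknn.init_runtime(target=target, perf_debug=True) != 0:
            raise RuntimeError("init_runtime failed")
        # With is_print=True the toolkit prints the timing report itself.
        return rknn.eval_perf(inputs=[image], is_print=True)
    finally:
        rknn.release()


# Hypothetical usage, with a 416x416 RGB image as a uint8 NumPy array:
#   import numpy as np
#   profile_layers("yolov4.rknn", np.zeros((416, 416, 3), dtype=np.uint8))
```

In the printed report, the layers to look for are the pooling operators that stand out from the rest.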

rakidis

Intermediate member

Points
274
Posted on 2020-10-20 20:31:08
Ohh I see, so the solution would be to remove those pooling layers from the base network and then retrain on ImageNet?

rakidis

Intermediate member

Points
274
Posted on 2020-11-13 15:54:36
Did you manage to find a solution to this issue? I asked in the darknet repository how to remove SPP, but haven't got an answer.

I wonder why the SPP structure causes a bottleneck. Normally maxpool layers work fine, but inside SPP they take too much time.

brunolin

Intermediate member

Points
325
Original poster | Posted 4 days ago
rakidis posted on 2020-11-13 15:54
Did you manage to find a solution to this issue? I asked a way to remove spp in darknet repository,  ...

You can remove SPP from yolov4.cfg by comparing yolov3.cfg and yolov3-spp.cfg.
Then re-train it.
The kernel sizes used in these maxpool layers are so large that they are inefficient to execute on the RKNN NPU.
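
A minimal sketch of that cfg edit, under stated assumptions. `SAMPLE` is a hypothetical fragment modelled on an SPP block, not a copy of yolov4.cfg, and the kernel sizes 5, 9 and 13 and the `### SPP ###` / `### End SPP ###` marker comments are assumptions to verify against your own file. The helper names are made up for illustration.

```python
SAMPLE = """
[convolutional]
filters=512
size=1

### SPP ###
[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6
### End SPP ###

[convolutional]
filters=512
size=1
"""


def parse_sections(cfg_text):
    """Split darknet cfg text into (section_name, {key: value}) pairs."""
    sections = []
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


def maxpool_sizes(cfg_text):
    """Kernel sizes of every [maxpool] section, in file order."""
    return [int(opts["size"]) for name, opts in parse_sections(cfg_text)
            if name == "maxpool" and "size" in opts]


def strip_marked_spp(cfg_text, start="### SPP ###", end="### End SPP ###"):
    """Drop every line between the two marker comments, markers included."""
    kept, skipping = [], False
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if stripped == start:
            skipping = True
        elif stripped == end:
            skipping = False
        elif not skipping:
            kept.append(line)
    return "\n".join(kept)


print("maxpool sizes before:", maxpool_sizes(SAMPLE))
print("maxpool sizes after: ", maxpool_sizes(strip_marked_spp(SAMPLE)))
```

After removing the block, any later [route] whose indices reach across the deleted layers needs to be rechecked by hand. The convolution that follows sees a different number of input channels, which is why the network has to be re-trained.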