[Solved!] After converting TensorFlow's official Deeplabv3 with RKNN, the inference results are not ...

Posted on 2019-3-5 05:11:15

Last edited by protossw512 on 2019-3-6 07:32

The problem is solved. The cause was that the intermediate layer's output is channel-first, so you have to be careful when reshaping it.

Posted on 2019-3-5 16:17:58

Thanks for the feedback. I'll try to reproduce it on my side.

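Posted on 2019-3-5 16:28:55

raul posted on 2019-3-5 16:17
Thanks for the feedback. I'll try to reproduce it on my side.

By the way, I dumped the output of the first conv layer (MobilenetV2/Conv/Conv2D) and found that it is already wrong.
I then tried to bypass rknn's built-in image normalization by setting it to (0. 0. 0. 1.0), converting the image to np.float32 myself, and doing the normalization myself. Inference then crashed with a segmentation fault and I don't know why; np.float16 does not work either.
I have never enabled quantization.
Is there any other way to dump the data after rknn's normalization, just before it enters the network?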
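For reference, the "normalize it myself" step mentioned above might look like the following minimal NumPy sketch. It assumes the model expects inputs scaled to [-1, 1] (the usual convention for MobileNetV2-based DeepLab preprocessing); the scaling used in the original post is not stated, and `normalize_image` is a hypothetical helper name. It only covers the host-side array preparation and does not touch the rknn API or explain the segmentation fault.

```python
import numpy as np

def normalize_image(img_uint8):
    """Map uint8 pixels in [0, 255] to float32 values in [-1, 1].

    Assumption: the network expects zero-mean, unit-range input
    (x / 127.5 - 1). Adjust if your model uses a different scaling.
    """
    img = img_uint8.astype(np.float32)
    return img / 127.5 - 1.0

# Stand-in for a decoded 513x513 RGB image (HWC layout, uint8).
image = np.random.randint(0, 256, size=(513, 513, 3), dtype=np.uint8)
net_input = normalize_image(image)
print(net_input.dtype, net_input.shape)
```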

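Posted on 2019-3-6 03:26:07

raul posted on 2019-3-5 16:17
Thanks for the feedback. I'll try to reproduce it on my side.

Hi, the problem is solved. The cause was that the intermediate layer's output has the channel dimension first, so the final output should not be reshaped to 1, 65, 65, 21 but to 1, 21, 65, 65, and then np.transpose'd to 1, 65, 65, 21 for further processing.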
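The reshape-then-transpose fix can be sketched with plain NumPy as below. The shapes (21 classes, 65x65 output) come from the post; `to_nhwc` is a hypothetical helper name, and the flat buffer is a synthetic stand-in for whatever the rknn inference call returns, so treat this as an illustration rather than the exact code used in the thread.

```python
import numpy as np

NUM_CLASSES, H, W = 21, 65, 65

def to_nhwc(flat_output):
    """Interpret a flat channel-first buffer as NCHW, then move channels last."""
    nchw = np.asarray(flat_output).reshape(1, NUM_CLASSES, H, W)
    return np.transpose(nchw, (0, 2, 3, 1))  # -> (1, 65, 65, 21)

# Synthetic stand-in: build a known NHWC tensor, then flatten it in
# channel-first order, the way the converted model's output is laid out.
reference = np.random.rand(1, H, W, NUM_CLASSES).astype(np.float32)
flat = np.transpose(reference, (0, 3, 1, 2)).ravel()

logits = to_nhwc(flat)                      # correct: matches reference
wrong = flat.reshape(1, H, W, NUM_CLASSES)  # the original mistake: scrambled

label_map = np.argmax(logits, axis=-1)[0]   # per-pixel class ids, shape (65, 65)
print(logits.shape, label_map.shape)
```

Reshaping directly to 1, 65, 65, 21 gives an array of the right shape but with values assigned to the wrong pixels and classes, which is why the result looked plausible in size yet incorrect in content.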

Many thanks to everyone for the help!

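Posted on 2019-3-6 09:31:53

protossw512 posted on 2019-3-6 03:26
Hi, the problem is solved. The cause was that the intermediate layer's output has the channel dimension first, so the final output should not be reshaped to 1, 65, 65, ...

OK. I verified it on my side as well, and the results are correct.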

Posted on 2019-3-8 07:56:36

What inference time do you get for DeepLabv3? Is it possible to get below 1s?

Posted on 2019-3-8 14:25:30

chuyee posted on 2019-3-8 07:56
What's the inference time do you get for DeepLabv3? Is it possible for less than 1s?

Deeplabv3+ is not actually as computation-intensive as you might think. Depending on your network architecture, you can run mobilnetv2_dm0.5 at up to 15 fps with an input size of 513x513.

Posted on 2019-3-11 10:07:53

protossw512 posted on 2019-3-8 14:25
Deeplabv3+ is not actually as computation intensive as you think of. Depending on your network arc ...

That's amazing!

Posted on 2019-3-11 11:50:26

protossw512 posted on 2019-3-8 14:25
Deeplabv3+ is not actually as computation intensive as you think of. Depending on your network arc ...

What does "mobilnetv2_dm0.5" stand for? I only got ~1.2s on a GTX 1080 Ti with the demo code https://github.com/tensorflow/mo ... /deeplab_demo.ipynb, which uses the model deeplabv3_mnv2_pascal_train_aug_2018_01_29.tar.gz (513x513, mobilenet_v2 coco dataset). I haven't managed to port it to rknn yet. Do you think I can achieve 15 FPS after porting?

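Posted on 2019-3-12 02:28:13

chuyee posted on 2019-3-11 10:07
That's amazing!

You set the input layer to MobilenetV2/Conv/Conv2D, right? Are all the layers before it then processed on the CPU? mobilenet_v2 on rknn reaches 40 fps without any problem. The question is: once pre-processing and post-processing are included, how many seconds does it take to process one 513x513 image on the 3399pro?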
