chuyee posted on 2019-3-8 07:56
What inference time do you get for DeepLabv3? Is it possible to get under 1 s?
Deeplabv3+ is not actually as computationally intensive as you might think. Depending on your network architecture, you can run mobilenetv2_dm0.5 at up to 15 FPS with an input size of 513x513.
chuyee posted on 2019-3-12 16:55
I'll answer it myself: dm stands for depth multiplier; 0.5 means halving the number of channels used in ...
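To make the effect of the depth multiplier concrete, here is a minimal sketch of the channel-scaling rule MobileNet-style implementations typically use: multiply each layer's channel count by dm, then round to a multiple of 8 (the exact rounding details vary by implementation, so treat the constants here as assumptions):

```python
def scaled_channels(channels, depth_multiplier, divisor=8, min_depth=8):
    """Scale a layer's channel count by the depth multiplier, rounding to a
    multiple of `divisor`, as MobileNet implementations commonly do."""
    scaled = max(min_depth,
                 int(channels * depth_multiplier + divisor / 2) // divisor * divisor)
    # Don't let rounding shrink the layer by more than ~10%.
    if scaled < 0.9 * channels * depth_multiplier:
        scaled += divisor
    return scaled

# With dm=0.5, a 32-channel layer becomes 16 channels, 320 becomes 160, etc.
for c in (32, 64, 96, 320):
    print(c, "->", scaled_channels(c, 0.5))
```

Halving every layer's width roughly quarters the multiply-accumulate count of the pointwise convolutions, which is where most of the speedup comes from.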
Yep, the first frame is not representative of the general runtime on the TensorFlow framework. Even if you use dm=1.0 and add the ASPP and decoder modules, you can still run it at 10 FPS with an input size of 513x513, which is pretty amazing.
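Since the first frame includes one-time graph optimization and memory allocation, a fair FPS measurement should discard a few warm-up runs. A minimal benchmarking sketch (the lambda below is just a placeholder workload standing in for your actual `session.run()` call):

```python
import time

def benchmark(run_inference, warmup=5, iters=50):
    """Measure steady-state FPS, excluding warm-up frames where the
    framework is still optimizing the graph and allocating buffers."""
    for _ in range(warmup):
        run_inference()
    start = time.perf_counter()
    for _ in range(iters):
        run_inference()
    elapsed = time.perf_counter() - start
    return iters / elapsed  # frames per second

# Placeholder workload; replace with your model's inference call.
fps = benchmark(lambda: sum(i * i for i in range(10000)))
print(f"{fps:.1f} FPS")
```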
I am pretty sure, since after testing the official MobileNet DeepLabv3 in Python, I switched to C++ and evaluated the performance of my own DeepLabv3 with native C++ code.
On top of the official MobileNet version, I added the decoder and ASPP modules, which bring additional operations, and used an input size of 400x400. I am able to run it at 9.x FPS.
I also found that the ArgMax node is pretty slow, so I used the BiasAdd logits as the output and wrote my own argmax implementation in C++ to get the segmentation result.
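The idea of taking the raw BiasAdd logits and doing the argmax yourself is simple: for each pixel, pick the channel with the largest logit. The post's implementation was native C++; here is a Python sketch of the same per-pixel argmax over a flat HWC logits buffer, just to show the logic:

```python
def argmax_segmentation(logits, height, width, num_classes):
    """Turn flat HWC logits (e.g. the BiasAdd output tensor) into a
    per-pixel class map by taking the argmax over the channel axis."""
    labels = []
    for p in range(height * width):
        base = p * num_classes
        best_c, best_v = 0, logits[base]
        for c in range(1, num_classes):
            v = logits[base + c]
            if v > best_v:
                best_c, best_v = c, v
        labels.append(best_c)
    return labels

# 1x2 image, 3 classes: pixel 0 -> class 2, pixel 1 -> class 0.
print(argmax_segmentation([0.1, 0.2, 0.9, 1.5, 0.3, 0.2], 1, 2, 3))  # [2, 0]
```

In C++ the same loop over a contiguous float buffer is trivially vectorizable, which is why it can easily beat running the ArgMax op inside the graph.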