Last edited by 771488533 on 2019-3-14 14:58
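I am running the MTCNN P-Net model for face detection on the development board. With the RKNN_FLAG_COLLECT_PERF_MASK flag enabled, the reported NPU compute time is 6320 us, while a C++ timer around the inference calls reports about 14.00 ms. The measured inference time is therefore more than twice the NPU compute time.
Is this a bug in the C++ inference API, or does calling the inference interface itself cost roughly 8 ms? MTCNN face detection produces a large number of candidate boxes and calls the inference interface repeatedly, so the total time is currently too long: with a 640*480 input image, real-time detection is not achievable.
Log printed after running: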
- p_net detetion cost time: 14.000000 ms.
- perf_run.run_duration = 6320 us
- perf_run.perf_data =
- Layer id: Name: Operation id: Operator: Target: Time(us):
- 0 ConvolutionReluPoolingLayer2 0 CONVOLUTION NN 552
- 1 com vivantecorp extension vxcParametricRelu 0 unkown operation type SH 725
- 2 PoolingLayer2 0 POOLING TP 232
- 3 ConvolutionReluPoolingLayer2 0 CONVOLUTION NN 619
- 4 com vivantecorp extension vxcParametricRelu 0 unkown operation type SH 455
- 5 ConvolutionReluPoolingLayer2 0 CONVOLUTION NN 1477
- 6 com vivantecorp extension vxcParametricRelu 0 unkown operation type SH 702
- 7 ConvolutionReluPoolingLayer2 0 CONVOLUTION NN 216
- 8 ConvolutionReluPoolingLayer2 0 CONVOLUTION NN 214
- 9 SoftmaxLayer 0 SOFTMAX SH 244
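To make the gap concrete, the numbers in the log can be split into three parts: the sum of the per-layer times, the remainder of `run_duration`, and the time seen only by the host-side timer. The sketch below does just that arithmetic; the struct and function names are made up for illustration, and it does not say where the host-side time actually goes.

```cpp
#include <numeric>
#include <vector>

// Per-layer times in microseconds, copied from the perf_data log above.
inline std::vector<int> layer_times_us() {
    return {552, 725, 232, 619, 455, 1477, 702, 216, 214, 244};
}

// Hypothetical helper type, not part of the RKNN API.
struct TimeBreakdown {
    int layers_us;         // sum of the per-layer times
    int npu_other_us;      // run_duration minus the layer sum
    int host_overhead_us;  // wall-clock time minus run_duration
};

inline TimeBreakdown breakdown(const std::vector<int>& layers,
                               int run_duration_us,
                               int wall_clock_us) {
    const int sum = std::accumulate(layers.begin(), layers.end(), 0);
    return {sum, run_duration_us - sum, wall_clock_us - run_duration_us};
}
```

Calling `breakdown(layer_times_us(), 6320, 14000)` with the values from this log gives 5436 us for the layers, 884 us of `run_duration` not attributed to any layer, and 7680 us that only the C++ timer sees.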
Inference timing code:
- double start = ti.elapsedMilliSeconds();
- ret = rknn_inputs_set(ctx, 1, inputs);
- if(ret < 0) {
- printf("rknn_input_set fail! ret=%d\n", ret);
- return -1;
- }
- ret = rknn_run(ctx, nullptr);
- if(ret < 0) {
- printf("rknn_run fail! ret=%d\n", ret);
- return -1;
- }
- outputs[0].want_float = true;
- outputs[0].is_prealloc = false;
- outputs[1].want_float = true;
- outputs[1].is_prealloc = false;
- ret = rknn_outputs_get(ctx, 2, outputs, nullptr);
- if(ret < 0) {
- printf("rknn_outputs_get fail! ret=%d\n", ret);
- return -1;
- }
- double stop = ti.elapsedMilliSeconds();
- printf("p_net detetion cost time: %f ms.\n",stop - start);
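Note that the timed region above covers `rknn_inputs_set`, `rknn_run`, and `rknn_outputs_get` together, so the 14 ms is not the cost of `rknn_run` alone. One way to see which call dominates is to time each stage separately. The helper below is a minimal sketch using `std::chrono`; `time_ms` is a hypothetical name, and the usage shown in the comments assumes the same `ctx`, `inputs`, `outputs`, and `ret` variables as the code above.

```cpp
#include <chrono>
#include <utility>

// Runs fn once and returns the elapsed wall-clock time in milliseconds.
// Hypothetical helper, not part of the RKNN API.
template <typename F>
double time_ms(F&& fn) {
    const auto t0 = std::chrono::steady_clock::now();
    std::forward<F>(fn)();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Intended usage with the calls from the post (not compiled here):
//   double t_set = time_ms([&] { ret = rknn_inputs_set(ctx, 1, inputs); });
//   double t_run = time_ms([&] { ret = rknn_run(ctx, nullptr); });
//   double t_get = time_ms([&] { ret = rknn_outputs_get(ctx, 2, outputs, nullptr); });
//   printf("set %.3f ms, run %.3f ms, get %.3f ms\n", t_set, t_run, t_get);
```

Printing the three values side by side would show whether the extra time sits in setting the input, in `rknn_run` itself, or in fetching the outputs with `want_float = true`.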
The test code is in the attachment.
Build steps:
- mkdir build
- cd build
- cmake ../
- make