Toybrick

Title: asymmetric_quantized-u8 quantization problem

Author: LSC    Time: 2019-12-28 10:36
Hi everyone, I am quantizing with asymmetric_quantized-u8 and get the warnings below. What causes them?
(Doesn't ReLU have no parameters?)
  --> Building model
  W Adjust min from [-5.106467] to [0.] for relu relu8_39
  W Adjust min from [-5.106467] to [0.] for relu relu13_67
  W Adjust min from [-5.106467] to [0.] for relu relu15_76
  W Adjust min from [-5.106467] to [0.] for relu relu17_85
  W Adjust min from [-4.101224] to [0.] for relu relu18_90
  W Adjust min from [-2.5565655] to [0.] for relu relu43_218
  W Adjust min from [-4.101224] to [0.] for relu relu23_118
  W Adjust min from [-4.101224] to [0.] for relu relu25_127
  W Adjust min from [-4.101224] to [0.] for relu relu27_136
  W Adjust min from [-3.4539304] to [0.] for relu relu28_141
  W Adjust min from [-24.208406] to [0.] for relu relu38_193
  W Adjust min from [-1.8253068] to [0.] for relu relu44_223
  done
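Author: jefferyzhang    Time: 2019-12-30 08:35
This is only a quantization precision-loss warning. ReLU never outputs negative values, so the negative part of the range is set to 0; this does not affect your model's inference.
Does the converted model work properly?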
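To see what the "Adjust min" warning amounts to, here is a minimal sketch of generic asymmetric (affine) uint8 quantization parameters. It is not RKNN Toolkit's internal code; the function names and the maximum value 6.0 are assumptions for illustration, since the log only prints the minimum. Raising the minimum of a ReLU output range from -5.106467 to 0 narrows the range, so each of the 256 levels covers a smaller step.

```python
def asymmetric_u8_params(rmin, rmax):
    """Return (scale, zero_point) mapping the real range [rmin, rmax] onto 0..255."""
    # Widen the range to include 0.0 so that real zero maps to an exact integer.
    rmin = min(rmin, 0.0)
    rmax = max(rmax, 0.0)
    scale = (rmax - rmin) / 255.0
    if scale == 0.0:
        return 1.0, 0  # degenerate all-zero range
    zero_point = int(round(-rmin / scale))
    return scale, max(0, min(255, zero_point))


def quantize(x, scale, zero_point):
    """Quantize a real value to uint8, saturating at the range ends."""
    return max(0, min(255, int(round(x / scale)) + zero_point))


def dequantize(q, scale, zero_point):
    """Map a uint8 code back to its real value."""
    return (q - zero_point) * scale


if __name__ == "__main__":
    ASSUMED_MAX = 6.0  # hypothetical; the warning does not print the maximum
    raw = asymmetric_u8_params(-5.106467, ASSUMED_MAX)
    adjusted = asymmetric_u8_params(0.0, ASSUMED_MAX)  # min raised to 0 for ReLU
    print("raw      scale=%.6f zero_point=%d" % raw)
    print("adjusted scale=%.6f zero_point=%d" % adjusted)
```

Under the assumed maximum, the step size shrinks from roughly 0.044 to roughly 0.024 once the unused negative range is dropped, which is why the adjustment by itself should not hurt accuracy.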
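Author: LSC    Time: 2019-12-30 09:45
jefferyzhang posted on 2019-12-30 08:35
This is only a quantization precision-loss warning. ReLU never outputs negative values, so the negative part of the range is set to 0; this does not affect your model's inference.
Does the converted model ...

Thanks for the reply, moderator. These warnings appear in step 2 of hybrid quantization, when I pass in the cfg file generated by step 1 without modifying it. The resulting rknn model performs very poorly.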
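Author: LSC    Time: 2019-12-30 09:50
jefferyzhang posted on 2019-12-30 08:35
This is only a quantization precision-loss warning. ReLU never outputs negative values, so the negative part of the range is set to 0; this does not affect your model's inference.
Does the converted model ...

Also, if I quantize directly (that is, set do_quantization=True at build time instead of using hybrid quantization) with a 5000-image dataset, batch_size=1 and epochs=-1, it fails every time at around iteration 200. What causes this?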
  ...
  D Process output_243 ...
  D RKNN output shape(output): (1 128 128 80)
  D Real output shape: (1, 128, 128, 80)
  I Build ctdet_coco_dlav0_1x_2 complete.
  I Running 5000 iterations
  D 0(0.02%), Queue size 0
  D 1(0.04%), Queue size 0
  D 2(0.05%), Queue size 0
  D 3(0.08%), Queue size 0
  ...
  D 195(3.92%), Queue size 0
  D 196(3.94%), Queue size 0
  D 197(3.96%), Queue size 0
  D 198(3.98%), Queue size 0
  D 199(4.00%), Queue size 0
  E Catch exception when building RKNN model!
  E Traceback (most recent call last):
  E   File "/home/lsc/anaconda3/envs/RKNN/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1278, in _do_call
  E     return fn(*args)
  E   File "/home/lsc/anaconda3/envs/RKNN/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1263, in _run_fn
  E     options, feed_dict, fetch_list, target_list, run_metadata)
  E   File "/home/lsc/anaconda3/envs/RKNN/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1350, in _call_tf_sessionrun
  E     run_metadata)
  E tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_0_fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
  E          [[Node: fifo_queue_Dequeue = QueueDequeueV2[component_types=[DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](fifo_queue)]]
  E During handling of the above exception, another exception occurred:
  E Traceback (most recent call last):
  E   File "rknn/api/rknn_base.py", line 584, in rknn.api.rknn_base.RKNNBase.build
  E   File "rknn/api/rknn_base.py", line 1424, in rknn.api.rknn_base.RKNNBase._quantize2
  E   File "rknn/base/RKNNlib/app/medusa/quantization.py", line 95, in rknn.base.RKNNlib.app.medusa.quantization.Quantization.run
  E   File "rknn/base/RKNNlib/app/medusa/quantization.py", line 44, in rknn.base.RKNNlib.app.medusa.quantization.Quantization._run_quantization
  E   File "rknn/base/RKNNlib/app/medusa/workspace.py", line 129, in rknn.base.RKNNlib.app.medusa.workspace.Workspace.run
  E   File "rknn/base/RKNNlib/app/medusa/workspace.py", line 110, in rknn.base.RKNNlib.app.medusa.workspace.Workspace._run_iteration
  E   File "rknn/base/RKNNlib/RKNN_session.py", line 30, in rknn.base.RKNNlib.RKNN_session.RKNNSession.run
  E   File "/home/lsc/anaconda3/envs/RKNN/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 877, in run
  E     run_metadata_ptr)
  E   File "/home/lsc/anaconda3/envs/RKNN/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1100, in _run
  E     feed_dict_tensor, options, run_metadata)
  E   File "/home/lsc/anaconda3/envs/RKNN/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1272, in _do_run
  E     run_metadata)
  E   File "/home/lsc/anaconda3/envs/RKNN/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1291, in _do_call
  E     raise type(e)(node_def, op, message)
  E tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_0_fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
  E          [[Node: fifo_queue_Dequeue = QueueDequeueV2[component_types=[DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](fifo_queue)]]
  E Caused by op 'fifo_queue_Dequeue', defined at:
  E   File "caffe2rknn.py", line 22, in <module>
  E     rknn.build(do_quantization=True, dataset='dataset.txt')
  E   File "/home/lsc/anaconda3/envs/RKNN/lib/python3.6/site-packages/rknn/api/rknn.py", line 166, in build
  E     ret = self.rknn_base.build(do_quantization=do_quantization, dataset=dataset, pack_vdata=pre_compile, batch_size=rknn_batch_size)
  E   File "/home/lsc/anaconda3/envs/RKNN/lib/python3.6/site-packages/tensorflow/python/ops/data_flow_ops.py", line 433, in dequeue
  E     self._queue_ref, self._dtypes, name=name)
  E   File "/home/lsc/anaconda3/envs/RKNN/lib/python3.6/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 3734, in queue_dequeue_v2
  E     timeout_ms=timeout_ms, name=name)
  E   File "/home/lsc/anaconda3/envs/RKNN/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
  E     op_def=op_def)
  E   File "/home/lsc/anaconda3/envs/RKNN/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
  E     return func(*args, **kwargs)
  E   File "/home/lsc/anaconda3/envs/RKNN/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
  E     op_def=op_def)
  E   File "/home/lsc/anaconda3/envs/RKNN/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1717, in __init__
  E     self._traceback = tf_stack.extract_stack()
  E OutOfRangeError (see above for traceback): FIFOQueue '_0_fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
  E          [[Node: fifo_queue_Dequeue = QueueDequeueV2[component_types=[DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](fifo_queue)]]
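Author: jefferyzhang    Time: 2019-12-30 10:01
1. Which version of RKNN Toolkit are you using? Early versions had this problem, but it was fixed later; at least 1.2.1 should not show this error.
2. Check whether your model expects 3-channel input while the dataset contains single-channel images; that also causes this error.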
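If a bad image is the cause, the iteration number in the log narrows down where to look. The following is a minimal sketch, assuming the toolkit reads dataset.txt sequentially with one image path per line, and that a prefetch queue may place the failing entry a few lines past the last printed iteration. Neither assumption is confirmed in this thread, and `suspect_entries` and its `window` parameter are hypothetical names.

```python
def suspect_entries(dataset_lines, last_ok_iteration, batch_size=1, window=8):
    """Return (position, path) pairs near the point where quantization stopped.

    dataset_lines: lines of dataset.txt, one image path per line (assumed layout).
    last_ok_iteration: last iteration index printed in the log (0-based).
    window: how many entries on each side to include, to allow for prefetching.
    """
    paths = [line.strip() for line in dataset_lines if line.strip()]
    center = (last_ok_iteration + 1) * batch_size  # first entry not yet confirmed good
    start = max(0, center - window)
    stop = min(len(paths), center + window)
    # Positions are 1-based and count non-empty lines only.
    return [(i + 1, paths[i]) for i in range(start, stop)]


if __name__ == "__main__":
    # Hypothetical file names standing in for a real dataset.txt.
    lines = ["images/%06d.jpg\n" % i for i in range(5000)]
    for position, path in suspect_entries(lines, last_ok_iteration=199):
        print(position, path)
```

For the log above, the last printed iteration is 199 with batch_size=1, so the entries around position 200 would be the first ones to inspect.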
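Author: LSC    Time: 2019-12-30 10:22
jefferyzhang posted on 2019-12-30 10:01
1. Which version of RKNN Toolkit are you using? Early versions had this problem, but it was fixed later; at least 1.2.1 should not show this ...

1. The version is 1.2.1.
2. The images were downloaded from coco_voc_2017, and I just verified with a script that they are all 3-channel.
Could this be related to memory? But isn't only one image processed at a time?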
Author: jefferyzhang    Time: 2019-12-30 10:25
Try 1.3.0: https://eyun.baidu.com/enterpris ... =201910153024123244
Password: rknn

It is unlikely to be related to memory. Are you running the quantization on a PC?
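Author: LSC    Time: 2019-12-30 10:37
jefferyzhang posted on 2019-12-30 10:25
Try 1.3.0: https://eyun.baidu.com/enterprise/share/init?cid=8272257679089781337&uk=2751701137&sid= ...

Yes, on a PC (Linux, CPU). With epochs=100 this error does not occur, but at detection time not a single box is detected.
OK, I will try 1.3.0.

One more question: if the cfg file generated in step 1 of hybrid quantization is passed to step 2 unmodified, is the result the same as direct quantization (not hybrid quantization, that is, do_quantization=True when converting the model)?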
Author: jefferyzhang    Time: 2019-12-30 11:23
LSC posted on 2019-12-30 10:37
Yes, on a PC (Linux, CPU). With epochs=100 this error does not occur, but at detection time not a single box is detected.
OK, I will try 1.3. ...

Yes.
Author: LSC    Time: 2019-12-30 11:27
jefferyzhang posted on 2019-12-30 11:23
Yes.

Thank you very much. I will try 1.3.0 first.
Author: LSC    Time: 2019-12-30 17:11
With v1.3.0 I still get the same error, but the output now contains some additional information:
  D 196(3.94%), Queue size 0
  D 197(3.96%), Queue size 0
  D 198(3.98%), Queue size 0
  D 199(4.00%), Queue size 0
  E Catch exception when building RKNN model!
  Exception in thread Thread-1:
  Traceback (most recent call last):
    File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/threading.py", line 916, in _bootstrap_inner
      self.run()
    File "rknn/base/acuitylib/provider/queue_provider.py", line 98, in rknn.base.acuitylib.provider.queue_provider.QueueProvider.run
    File "rknn/base/acuitylib/provider/queue_provider.py", line 102, in rknn.base.acuitylib.provider.queue_provider.QueueProvider.ru
    File "rknn/base/acuitylib/provider/text_provider.py", line 65, in rknn.base.acuitylib.provider.text_provider.TextProvider.get_ba
    File "rknn/base/acuitylib/provider/base_provider.py", line 163, in rknn.base.acuitylib.provider.base_provider.BaseProvider._tens_batch
    File "rknn/base/acuitylib/provider/base_provider.py", line 105, in rknn.base.acuitylib.provider.base_provider.BaseProvider._reshs
    File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 292, in reshape
      return _wrapfunc(a, 'reshape', newshape, order=order)
    File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 56, in _wrapfunc
      return getattr(obj, method)(*args, **kwds)
  ValueError: cannot reshape array of size 262144 into shape (1,512,512,3)

  E Traceback (most recent call last):
  E   File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1292, in _do_ca
  E     return fn(*args)
  E   File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1277, in _run_f
  E     options, feed_dict, fetch_list, target_list, run_metadata)
  E   File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1367, in _call_run
  E     run_metadata)
  E tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_0_fifo_queue' is closed and has insufficient elements (requurrent size 0)
  E        [[{{node fifo_queue_Dequeue}} = QueueDequeueV2[component_types=[DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replicadevice:CPU:0"](fifo_queue)]]
  E During handling of the above exception, another exception occurred:
  E Traceback (most recent call last):
  E   File "rknn/api/rknn_base.py", line 737, in rknn.api.rknn_base.RKNNBase.build
  E   File "rknn/api/rknn_base.py", line 1644, in rknn.api.rknn_base.RKNNBase._quantize2
  E   File "rknn/base/RKNNlib/app/medusa/quantization.py", line 105, in rknn.base.RKNNlib.app.medusa.quantization.Quantization.run
  E   File "rknn/base/RKNNlib/app/medusa/quantization.py", line 44, in rknn.base.RKNNlib.app.medusa.quantization.Quantization._run_qn
  E   File "rknn/base/RKNNlib/app/medusa/workspace.py", line 135, in rknn.base.RKNNlib.app.medusa.workspace.Workspace.run
  E   File "rknn/base/RKNNlib/app/medusa/workspace.py", line 116, in rknn.base.RKNNlib.app.medusa.workspace.Workspace._run_iteration
  E   File "rknn/base/RKNNlib/RKNN_session.py", line 30, in rknn.base.RKNNlib.RKNN_session.RKNNSession.run
  E   File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 887, in run
  E     run_metadata_ptr)
  E   File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1110, in _run
  E     feed_dict_tensor, options, run_metadata)
  E   File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1286, in _do_ru
  E     run_metadata)
  E   File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1308, in _do_ca
  E     raise type(e)(node_def, op, message)
  E tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_0_fifo_queue' is closed and has insufficient elements (requurrent size 0)
  E        [[{{node fifo_queue_Dequeue}} = QueueDequeueV2[component_types=[DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replicadevice:CPU:0"](fifo_queue)]]
  E Caused by op 'fifo_queue_Dequeue', defined at:
  E   File "caffe2rknn.py", line 22, in <module>
  E     rknn.build(do_quantization=True, dataset='dataset.txt')
  E   File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/site-packages/rknn/api/rknn.py", line 238, in build
  E     ret = self.rknn_base.build(do_quantization=do_quantization, dataset=dataset, pack_vdata=pre_compile, batch_size=rknn_batch_s
  E   File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/site-packages/tensorflow/python/ops/data_flow_ops.py", line 433, in dequ
  E     self._queue_ref, self._dtypes, name=name)
  E   File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 3735, inueue_v2
  E     timeout_ms=timeout_ms, name=name)
  E   File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, op_helper
  E     op_def=op_def)
  E   File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_f
  E     return func(*args, **kwargs)
  E   File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3272, in create_
  E     op_def=op_def)
  E   File "/home/lsc/anaconda3/envs/RKNN-130/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1768, in __init_
  E     self._traceback = tf_stack.extract_stack()
  E OutOfRangeError (see above for traceback): FIFOQueue '_0_fifo_queue' is closed and has insufficient elements (requested 1, curre
  E        [[{{node fifo_queue_Dequeue}} = QueueDequeueV2[component_types=[DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replicadevice:CPU:0"](fifo_queue)]]
  done
Author: LSC    Time: 2019-12-30 17:13
I then manually resized the dataset images to the input size of 512x512 beforehand, and the run succeeded.
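The ValueError in the v1.3.0 log carries a hint: 262144 = 512 × 512 × 1, whereas the shape (1,512,512,3) needs 786432 elements. Since 262144 = 2^18 is not divisible by 3, the array cannot hold a whole number of 3-channel pixels, which is what a single-channel image would produce. The thread does not confirm which file triggered it. A small sketch of that arithmetic follows; the helper name `infer_channels` is hypothetical.

```python
def infer_channels(element_count, height, width):
    """Return the channel count implied by a flat array size, or None if it does not divide evenly."""
    pixels = height * width
    if pixels <= 0 or element_count % pixels != 0:
        return None
    return element_count // pixels


if __name__ == "__main__":
    expected = 1 * 512 * 512 * 3          # elements required by shape (1,512,512,3)
    got = 262144                          # size reported in the ValueError
    print(expected)                       # 786432
    print(infer_channels(got, 512, 512))  # 1
    print(got % 3)                        # 1, so not a whole number of RGB pixels
```

Other single-channel image sizes with the same pixel count would give the same number, so this is a clue about the channel count, not a diagnosis of a specific file.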
Author: LSC    Time: 2019-12-30 22:03
However, when I load this model, a segmentation fault occurs at the init_runtime step.
  >>> from rknn.api import RKNN
  >>> import cv2
  >>> import numpy as np
  >>> rknn = RKNN()
  >>> rknn_model = "../ctdet_coco_dlav0_1x_q2.rknn"
  >>> rknn.load_rknn(rknn_model)
  0
  >>> rknn.init_runtime()
  Segmentation fault (core dumped)
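Author: jefferyzhang    Time: 2019-12-31 08:19
LSC posted on 2019-12-30 22:03
However, when I load this model, a segmentation fault occurs at the init_runtime step.

Send me the original pb file together with the conversion script, and I will forward them to my NPU colleagues to debug.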
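Author: LSC    Time: 2019-12-31 08:48
jefferyzhang posted on 2019-12-31 08:19
Send me the original pb file together with the conversion script, and I will forward them to my NPU colleagues to debug.

Version 1.3.0 segfaults but 1.2.0 does not, so I will keep using 1.2.0 for now.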
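Author: jefferyzhang    Time: 2019-12-31 14:18
LSC posted on 2019-12-31 08:48
Version 1.3.0 segfaults but 1.2.0 does not, so I will keep using 1.2.0 for now.

Right. The segfault is mainly because 1.3 does not match the driver in your firmware. We will release a new firmware version next month.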
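Author: LSC    Time: 2019-12-31 14:59
jefferyzhang posted on 2019-12-31 14:18
Right. The segfault is mainly because 1.3 does not match the driver in your firmware. We will release a new firmware version next month ...

OK, thank you for your reply.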




Welcome to Toybrick (https://t.rock-chips.com/) Powered by Discuz! X3.3