跳转至

QatConfig

Function

Quantization parameter configuration class, used to save the parameters configured during the quantization process.

Prototype

QatConfig(w_bit=8, a_bit=8, a_sym=False, amp_num=0, steps=1, ema=0.99, is_forward=False, ignore_head_tail_node=False, disable_names=None, has_init_quant=False, quant_mode=True, grad_scale=0.0, compressed_model_checkpoint=None, opset_version=11, save_params=False, input_names=None, output_names=None, save_onnx_name=None)

Parameters

Parameter Name Input Description Constraints
w_bit Input Weight quantization bit. Optional.
Data Type: int.
Default is 8, modification is not supported.
a_bit Input Activation layer quantization bit. Optional.
Data Type: int.
Default is 8, modification is not supported.
a_sym Input Whether to use symmetric quantization for activation values. Optional.
Data Type: bool.
Default is False.
amp_num Input Number of automatic fallback layers.
When accuracy drops significantly, you can increase the number of fallback layers. It is recommended to prioritize fallback of 1 to 3 layers. If accuracy recovery is not significant, then increase the number of fallback layers.
Optional.
Data Type: int.
Value Range is [0,10], default is 0. Inputs such as 1, 2, 3 are allowed.
steps Input Number of steps for automatic fallback. Optional.
Data Type: int.
Default is 1, value range is greater than or equal to 1.
ema Input Parameter in the Adam optimizer, the exponential moving average metric. Optional.
Data Type: float.
Value Range is [0.1,1.0], default is 0.99.
is_forward Input Whether to process the forward pass with reference to mmdetection. Optional.
Data Type: bool.
Default is False.
ignore_head_tail_node Input Whether to ignore the first and last layers and not quantize them. Optional.
Data Type: bool.
Default is False.
disable_names Input Names of nodes to be excluded from quantization, i.e., the names of quantization layers for manual fallback.
If the accuracy is poor, you can select quantization layers for fallback.
Optional.
Data Type: list[str].
Default is None.
has_init_quant Input Whether the model has undergone quantization initialization. Optional.
Data Type: bool.
Default is False.
quant_mode Input Whether to enable quantization mode. Optional.
Data Type: bool.
Default is True.
grad_scale Input Gradient compensation strength. Optional.
Data Type: float.
Default is 0.0, recommended configuration is 0.001.
compressed_model_checkpoint Input The weight file and its path for the pseudo-quantized model saved when exporting the ONNX model. Optional.
Data Type: string.
Default is None.
opset_version Input Version number when exporting the ONNX model. The corresponding ONNX version must be installed in advance. Optional.
Data Type: int.
Optional values are 11 and 13, default is 11.
save_params Input Whether to save quantization-related parameters as .npy files during export. Optional.
Data Type: bool.
Default is False.
input_names Input Input names for ONNX. Optional.
Data Type: list[str]
Default is None.
output_names Input Output names for ONNX. Optional.
Data Type: list[str]
Default is None.
save_onnx_name Input Pseudo-quantized model weights. Optional.
Data Type: str.
Default is None.

Sample

from msmodelslim.pytorch.quant.qat_tools import QatConfig
quant_config = QatConfig(grad_scale=0.001)