Specification of config.json
The config.json file stores the configurations used to quantize the floating-point coefficients in coefficient.npy.
Specification
Each item in config.json stands for the configuration of one layer. Take the following code as an example:
```
{
    "l1": { /* the configuration of layer l1 */ },
    "l2": { /* the configuration of layer l2 */ },
    "l3": { /* the configuration of layer l3 */ },
    ...
}
```
The key of each item is the layer name. The convert tool convert.py searches for the corresponding .npy files according to the layer name. For example, if a layer is named “l1”, the tool searches for l1’s filter coefficients in “l1_filter.npy”. The layer names in config.json should be consistent with the layer names embedded in the .npy file names.
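As a sketch of this naming convention, the following hypothetical helper (not part of convert.py) lists the .npy files a tool would look for based on the layer names in config.json:

```python
import json

def expected_npy_files(config_path):
    """List the .npy files implied by the layer names in config.json.

    Hypothetical helper for illustration only: it assumes the
    "<layer>_filter.npy" / "<layer>_bias.npy" / "<layer>_activation.npy"
    naming convention described above.
    """
    with open(config_path) as f:
        config = json.load(f)
    files = {}
    for layer_name, layer_cfg in config.items():
        names = [f"{layer_name}_filter.npy"]
        if layer_cfg.get("bias") == "True":
            names.append(f"{layer_name}_bias.npy")
        if "activation" in layer_cfg:
            names.append(f"{layer_name}_activation.npy")
        files[layer_name] = names
    return files
```

Renaming either a layer key or a .npy file without the other would break this lookup, which is why the names must stay consistent.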
The value of each item is the layer configuration. Fill in the layer-configuration arguments listed in Table 1:
Key | Type | Value
---|---|---
“operation” | string | the operation of the layer, e.g., “conv2d”
“feature_type” | string | the quantization type of the feature map: “s16” for int16 quantization, “s8” for int8 quantization
“filter_exponent” | integer | the exponent [1] used to quantize the filter coefficients; could be dropped [2]
“bias” | string | “True” if the layer has a bias; could be dropped otherwise
“output_exponent” | integer | Both output and bias are quantized according to the equation: value_float = value_int * 2^exponent. For now, “output_exponent” is effective only for “bias” coefficient conversion. “output_exponent” must be provided when using per-tensor quantization. If there is no “bias” in a specific layer, or when using per-channel quantization, “output_exponent” could be dropped. |
“input_exponent” | integer | When using per-channel quantization, the exponent of the bias is related to “input_exponent” and “filter_exponent”. “input_exponent” must be provided for “bias” coefficient conversion. If there is no “bias” in a specific layer, or when using per-tensor quantization, “input_exponent” could be dropped. |
“activation” | dict | the configuration of the activation, with the arguments listed in the table below

Notes:
1. exponent: the number of times the base is multiplied by itself for quantization. For a better understanding, please refer to the Quantization Specification.
2. dropped: to leave a specific argument empty.
Key | Type | Value
---|---|---
“type” | string | the type of the activation, e.g., “PReLU”
“exponent” | integer | the exponent used to quantize the activation coefficients; could be dropped
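The quantization equation above, value_float = value_int * 2^exponent, can be sketched in Python. The round-to-nearest and clipping behavior here is an assumption for illustration, not necessarily what convert.py does internally:

```python
import numpy as np

def quantize(value_float, exponent, dtype=np.int16):
    """Quantize floats so that value_float ~= value_int * 2**exponent."""
    info = np.iinfo(dtype)
    value_int = np.round(np.asarray(value_float) / 2.0 ** exponent)
    return np.clip(value_int, info.min, info.max).astype(dtype)

def dequantize(value_int, exponent):
    """Recover the float approximation: value_int * 2**exponent."""
    return np.asarray(value_int).astype(np.float32) * 2.0 ** exponent

# With exponent = -10, one integer step corresponds to 2**-10.
q = quantize([0.25, -0.5], exponent=-10)   # -> [256, -512]
f = dequantize(q, exponent=-10)            # -> [0.25, -0.5]
```

A more negative exponent gives finer resolution but a smaller representable range, since the integer type stays fixed.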
Example
Assume a one-layer model:
1. Using int16 per-tensor quantization:
   - layer name: “mylayer”
   - operation: Conv2D(input, filter) + bias
   - output_exponent: -10, the exponent for the result of the operation
   - feature_type: “s16”, which means int16 quantization
   - type of activation: PReLU
The config.json file should be written as:
```json
{
    "mylayer": {
        "operation": "conv2d",
        "feature_type": "s16",
        "bias": "True",
        "output_exponent": -10,
        "activation": {
            "type": "PReLU"
        }
    }
}
```
“filter_exponent” and the “exponent” of “activation” are dropped. “output_exponent” must be provided for the bias in this layer.
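For this per-tensor example, the bias conversion implied by the quantization equation could be sketched as follows; the float bias values and the round-to-nearest step are assumptions for illustration:

```python
import numpy as np

# Hypothetical float bias for "mylayer"; output_exponent = -10 comes
# from the config above.
bias_float = np.array([0.125, -0.0625], dtype=np.float32)
output_exponent = -10

# value_float = value_int * 2^exponent  <=>  value_int = value_float / 2^exponent
bias_int = np.round(bias_float / 2.0 ** output_exponent).astype(np.int16)
# Both inputs are exact multiples of 2**-10, so the conversion is
# lossless here: [128, -64] * 2**-10 == [0.125, -0.0625].
```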
2. Using int8 per-tensor quantization:
   - layer name: “mylayer”
   - operation: Conv2D(input, filter) + bias
   - output_exponent: -7, the exponent for the result of this layer
   - feature_type: “s8”
   - type of activation: PReLU
The config.json file should be written as:
```json
{
    "mylayer": {
        "operation": "conv2d",
        "feature_type": "s8",
        "bias": "True",
        "output_exponent": -7,
        "activation": {
            "type": "PReLU"
        }
    }
}
```
“output_exponent” must be provided for the bias in this layer.
3. Using int8 per-channel quantization:
   - layer name: “mylayer”
   - operation: Conv2D(input, filter) + bias
   - input_exponent: -7, the exponent for the input of this layer
   - feature_type: “s8”
   - type of activation: PReLU
The config.json file should be written as:
```json
{
    "mylayer": {
        "operation": "conv2d",
        "feature_type": "s8",
        "bias": "True",
        "input_exponent": -7,
        "activation": {
            "type": "PReLU"
        }
    }
}
```
“input_exponent” must be provided for the bias in this layer. Meanwhile, mylayer_filter.npy, mylayer_bias.npy, and mylayer_activation.npy should be ready.
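Putting this third example together, a script along these lines would lay out the files the convert tool expects; the coefficient shapes are hypothetical placeholders, not mandated by the specification:

```python
import json
import numpy as np

# Write the coefficient files for "mylayer". The shapes below are
# hypothetical placeholders chosen only for illustration.
np.save("mylayer_filter.npy", np.random.randn(3, 3, 8, 16).astype(np.float32))
np.save("mylayer_bias.npy", np.random.randn(16).astype(np.float32))
np.save("mylayer_activation.npy", np.random.randn(16).astype(np.float32))

# Write the matching config.json for int8 per-channel quantization.
config = {
    "mylayer": {
        "operation": "conv2d",
        "feature_type": "s8",
        "bias": "True",
        "input_exponent": -7,
        "activation": {"type": "PReLU"},
    }
}
with open("config.json", "w") as f:
    json.dump(config, f, indent=4)
```

The "mylayer" prefix of every .npy file matches the layer key in config.json, which is the consistency requirement stated at the top of this specification.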