FAQs
What’s this?
HGQ is a method for quantization-aware training of neural networks to be deployed on FPGAs, which allows for per-weight and per-activation bitwidth optimization. HGQ2 is the second generation of a library implementing this method.
Why is it useful?
Depending on the specific application, HGQ can achieve up to 10x resource reduction compared to the traditional AutoQKeras approach while maintaining the same accuracy. For more challenging tasks, where the model is already under-fitted, HGQ can still improve performance at the same on-board resource consumption. For more details, please refer to our paper at arXiv.
Can I use it?
The primary use of HGQ is to quantize your model for FPGA deployment. Currently, the recommended workflow is to train your model with HGQ and then convert the trained model to HLS C++ code with hls4ml. However, you can also use the quantized model for other purposes, such as deployment on other platforms, or simply for model compression.
For the training part, you can use HGQ layers as drop-in replacements for Keras layers (see the sketch after the list below). For the deployment part, the following conditions should be met:
- Your model is compatible with hls4ml (i.e. it can be converted to HLS C++ code).
- You are using Vitis or Vivado (deprecated) as your FPGA backend. Other backends MAY work if you don't use heterogeneous activation quantization.
- Your model is representable in HGQ2 layers. Note that some layers in HGQ2 are not supported by hls4ml under certain configurations.
If you meet all the above conditions, you can probably use HGQ to quantize your model.
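For illustration, here is a minimal sketch of the train-then-convert workflow. It is not an official example: the layer import (hgq.layers.QDense) is an assumption about the HGQ2 API, and the exact hls4ml conversion path for HGQ2 models may differ from the plain Keras one shown here, so check the HGQ2 and hls4ml documentation.

```python
import keras
import hls4ml
from hgq.layers import QDense  # assumed import path; check the HGQ2 API docs

# HGQ2 layers used as drop-in replacements for their Keras counterparts
model = keras.Sequential([
    keras.Input(shape=(16,)),
    QDense(32, activation='relu'),
    QDense(5),
])
model.compile(optimizer='adam', loss='mse')
# model.fit(x_train, y_train, ...)  # train as usual; bitwidths are learned here

# Convert the trained model to HLS C++ with hls4ml (Vitis backend recommended)
config = hls4ml.utils.config_from_keras_model(model, granularity='name')
hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, backend='Vitis', output_dir='hls_prj'
)
hls_model.compile()
```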
What’s the status of the project?
The project is still under development. The codebase and documentation are not stable yet, and we are working on them. However, the core functionality is already implemented and tested, and the APIs are considered semi-stable. If you encounter any issues, please feel free to contact us.
I’m getting terrible results during/after training, what should I do?
Do you observe a collapse of the loss value, together with EBOPs and accuracy? If so, you may want to try the following (see the sketch below):
- Increase the initial bitwidths of the quantizers (b, f, and i (for non-WRAP overflow modes) in the quantizer config).
- Reduce beta0 in the quantizer config.
- Reduce or remove bitwidth regularization in the quantizer config.
If it still doesn't work, please try using the quantizer type dummy to see whether the problem is caused by the quantizer. If not, it may be caused by the model itself. If the plain Keras model works but the HGQ model with the dummy quantizer does not, please report an issue.
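As an illustration of where these knobs live, here is a hypothetical sketch. The scope names (QuantizerConfigScope, LayerConfigScope) and their arguments are assumptions about the HGQ2 configuration API and may not match it exactly; consult the API documentation for the actual parameter names.

```python
import keras
from hgq.config import LayerConfigScope, QuantizerConfigScope  # assumed import path
from hgq.layers import QDense  # assumed import path

with (
    # Assumed: default quantizer type and overflow mode; larger initial
    # bitwidths (the b/f/i values mentioned above) would also be set here.
    QuantizerConfigScope(default_q_type='kbi', overflow_mode='WRAP'),
    # Assumed: a smaller beta0 weakens the EBOPs penalty so bitwidths are
    # not driven to zero early in training.
    LayerConfigScope(enable_ebops=True, beta0=1e-6),
):
    model = keras.Sequential([
        keras.Input(shape=(16,)),
        QDense(32, activation='relu'),
        QDense(5),
    ])
```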
ERROR: [XFORM 203-504] for Vivado HLS
Warning: Please consider switching to Vitis HLS, as Vivado HLS is considered deprecated. Vitis HLS does not have this limitation.
This is due to a hard-coded limitation in Vivado HLS, and there is no known way to disable this check. You can try to reduce the size of your layers, use parallel_factor for convolutional layers (see the sketch below), or split large dense layers manually.
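As a hedged illustration of the convolutional option, the following sketch sets a per-layer parallelization factor through the hls4ml configuration. The small model and the layer name conv2d are hypothetical, and the ParallelizationFactor key should be verified against your hls4ml version.

```python
import keras
import hls4ml

# Hypothetical small model, just to make the example self-contained
model = keras.Sequential([
    keras.Input(shape=(8, 8, 3)),
    keras.layers.Conv2D(4, 3, name='conv2d'),
    keras.layers.Flatten(),
    keras.layers.Dense(5),
])

config = hls4ml.utils.config_from_keras_model(model, granularity='name')
# Split the convolution into several partial instances instead of one fully
# unrolled block, which helps stay under the HLS partitioning/unroll limits.
config['LayerName']['conv2d']['ParallelizationFactor'] = 4

hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, backend='Vitis', output_dir='hls_prj_pf'
)
```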
QKeras?
Most QKeras models targeting hls4ml may be represented in HGQ2, but usually not vice versa. However, there are certain layers in QKeras that are not supported by HGQ2, such as the RNN family and the hard activation layers.
Permission and License
This library, HGQ2, is free software licensed under the LGPLv3 license. You can find the full license text in the LICENSE file in the root directory of the repository.
If you use this work in your research, we would appreciate a citation to our paper at arXiv.