High Granularity Quantization 2

HGQ2 (High Granularity Quantization 2) is a quantization-aware training framework built on Keras v3, targeting real-time deep learning applications on edge devices like FPGAs. It provides a comprehensive set of tools for creating and training quantized neural networks with minimal effort.

HGQ2 implements a gradient-based automatic bitwidth optimization and quantization-aware training algorithm. By leveraging gradients, it allows bitwidth optimization at arbitrary granularity, down to the per-weight and per-activation level.
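
As a rough illustration of the idea (a conceptual sketch only, not HGQ2's actual implementation; ToyQuantizedDense and ste_quantize are made-up names), a per-weight fractional bitwidth can be made trainable by combining a straight-through rounding estimator with a bitwidth penalty added to the loss:

import keras
from keras import ops

def ste_quantize(w, f):
    # Quantize w to f fractional bits. The straight-through estimator keeps
    # the gradient w.r.t. w intact, while f receives a surrogate gradient
    # proportional to the quantization error, via the scale factor.
    scale = ops.power(2.0, f)
    u = w * scale
    q = u + ops.stop_gradient(ops.round(u) - u)
    return q / scale

class ToyQuantizedDense(keras.layers.Layer):
    def __init__(self, units, beta=1e-5):
        super().__init__()
        self.units = units
        self.beta = beta

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units))
        # One trainable (continuous) bitwidth per weight: per-weight granularity
        self.f = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer=keras.initializers.Constant(4.0),
        )

    def call(self, x):
        # Penalize the total number of bits so the optimizer trades
        # bitwidth off against the task loss
        self.add_loss(self.beta * ops.sum(ops.relu(self.f)))
        return ops.matmul(x, ste_quantize(self.w, self.f))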

[Figure: HGQ overview]

Key Features

  • High Granularity: HGQ2 supports per-weight and per-activation bitwidth optimization, as well as any coarser granularity.

  • Automatic Quantization: Bitwidths are optimized via gradients, so in general there is no need to tune them manually.

  • What you see is what you get: The RTL model produces exactly what you get from the Keras model, subject only to machine float precision limitations.

  • Accurate Resource Estimation: The EBOPs estimated by HGQ2 give a good indication of actual on-FPGA resource usage, as an upper bound on LUTs (da4ml) or as LUT + 55 * DSP (hls4ml). A worked example follows this list.
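
To make the EBOPs metric concrete: assuming EBOPs counts every multiplication at the product of its operands' bitwidths (my reading of the metric; the exact accounting, e.g. for accumulators, may differ), a tiny dense layer would be costed like this, with all bitwidths made up for illustration:

import numpy as np

# Hypothetical optimized bitwidths for a 3-input, 2-output dense layer
bits_w = np.array([[3, 5],
                   [2, 4],
                   [6, 1]])      # per-weight bitwidths, shape (3, 2)
bits_x = np.array([4, 8, 4])     # per-feature input bitwidths

# Each multiply costs (input bits) * (weight bits); sum over all multiplies
ebops = int((bits_x[:, None] * bits_w).sum())
print(ebops)  # 108 bit-operations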

In addition, this framework improves upon the old HGQ implementation in the following aspects:

  • Scalability: Built on Keras v3, HGQ2 runs on the TensorFlow, JAX, and PyTorch backends. Since XLA compilation in JAX and TensorFlow can significantly speed up training, HGQ2 can be 1.2-5 times faster than the previous implementation (see the backend snippet after this list).

  • Quantizers:

      ◦ Fixed-point: While the previous implementation optimized only the number of fractional bits, with a single way of parametrizing the fixed-point numbers, HGQ2 supports multiple parametrizations and allows any part of them to be optimized via gradients (see the fixed-point sketch after this list).

      ◦ Minifloat: Training with minifloat quantization is supported, including surrogate gradient support (alpha quality).

  • More Layers: More layers are supported, including the powerful EinsumDense layer (with a fused-BatchNorm variant) and the MultiHeadAttention layer with bit-accurate softmax and scaled dot-product attention.
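
Since HGQ2 builds on Keras v3, the backend is selected the usual Keras way, via the KERAS_BACKEND environment variable before the first import:

import os

# Must be set before keras is imported anywhere in the process
os.environ['KERAS_BACKEND'] = 'jax'  # or 'tensorflow' / 'torch'

import keras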
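
As for the parametrizations, my understanding (an assumption; consult the documentation for the authoritative definitions) is that 'kbi' describes a fixed-point format by keep_negative/bits/integer bits, while 'kif' uses keep_negative/integer/fractional bits. A plain-NumPy sketch of quantizing to such a format, where quantize_kif is a hypothetical helper rather than HGQ2 API:

import numpy as np

def quantize_kif(x, k=1, i=3, f=4, overflow='WRAP'):
    # k sign bit(s), i integer bits, f fractional bits
    scale = 2.0 ** f
    q = np.round(x * scale)          # rounding mode is a free choice
    n = 2 ** (k + i + f)             # number of representable codes
    lo = -(n // 2) if k else 0
    if overflow == 'WRAP':
        q = (q - lo) % n + lo        # overflow wraps around
    else:                            # saturating modes clip instead
        q = np.clip(q, lo, lo + n - 1)
    return q / scale

print(quantize_kif(np.array([0.3, 9.7, -1.2])))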

Simple example
import keras
from hgq.layers import QDense, QConv2D
from hgq.config import LayerConfigScope, QuantizerConfigScope

# Set up the quantization configuration
# These are the default values, spelled out here only for demonstration
with (
    # Configuration scope setting the default quantization type and overflow mode;
    # the second scope overrides the first one for the 'datalane' place
    QuantizerConfigScope(place='all', default_q_type='kbi', overflow_mode='SAT_SYM'),
    QuantizerConfigScope(place='datalane', default_q_type='kif', overflow_mode='WRAP'),
    # Configuration scope enabling EBOPs and setting the beta0 value
    LayerConfigScope(enable_ebops=True, beta0=1e-5),
):
    model = keras.Sequential([
        QConv2D(32, (3, 3), activation='relu'),
        keras.layers.MaxPooling2D((2, 2)),
        keras.layers.Flatten(),
        QDense(10),
    ])
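
The resulting model then trains like any other Keras model. A minimal smoke test on random data (input shape and hyperparameters are arbitrary here):

import numpy as np

model.compile(
    optimizer='adam',
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

x = np.random.rand(64, 28, 28, 1).astype('float32')
y = np.random.randint(0, 10, size=(64,))
model.fit(x, y, epochs=1, batch_size=16)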
