HGQ.proxy package

Subpackages

HGQ.proxy.plugins package

Submodules

HGQ.proxy.convert module

class HGQ.proxy.convert.Namer

Bases: object

Helper class that generates unique names for layers when a layer is used multiple times.

next_name(name: str)
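The idea behind next_name can be sketched as follows. This is a minimal illustration, not HGQ's implementation: the class name UniqueNamer and the numeric suffix scheme are assumptions made for the example.

```python
from collections import defaultdict


class UniqueNamer:
    """Hypothetical stand-in for Namer: hands out a distinct name per use."""

    def __init__(self):
        # How many times each base name has been requested so far.
        self._counts = defaultdict(int)

    def next_name(self, name: str) -> str:
        count = self._counts[name]
        self._counts[name] += 1
        # First use keeps the base name; later uses get a numeric suffix
        # (assumed scheme, the real Namer may format names differently).
        return name if count == 0 else f"{name}_{count}"


namer = UniqueNamer()
names = [namer.next_name("dense"), namer.next_name("dense"), namer.next_name("relu")]
```

A layer reused twice in a model would then appear under two different names in the flattened result.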

class HGQ.proxy.convert.ProxyLayerXFormer(SAT='WRAP', use_uniary_lut=False)

Bases: object

get_kbiRS(layer: HLayerBase, **kwargs)

HGQ.proxy.convert.apply_layer(layer: ~keras.src.engine.base_layer.Layer | None, inp_tensors: list[~tensorflow.python.framework.tensor.Tensor] | ~tensorflow.python.framework.tensor.Tensor, namer: None | ~HGQ.proxy.convert.Namer = None, layer_xformer: ~collections.abc.Callable[[~keras.src.engine.base_layer.Layer], ~keras.src.engine.base_layer.Layer] = <function <lambda>>)

HGQ.proxy.convert.apply_layer(layer: ~keras.src.engine.training.Model, inp_tensors: list[~tensorflow.python.framework.tensor.Tensor] | ~tensorflow.python.framework.tensor.Tensor, namer: None | ~HGQ.proxy.convert.Namer = None, layer_xformer: ~collections.abc.Callable[[~keras.src.engine.base_layer.Layer], ~keras.src.engine.base_layer.Layer] = <function <lambda>>)

Apply a layer to the input tensors:

- If the layer is a model, apply the model recursively in a flattened manner.
- If the layer is a keras layer, transform it with layer_xformer repeatedly until exactly the same reference is returned.
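The "transform until the same reference is returned" rule can be sketched in plain Python. The helper name transform_until_stable, the max_steps guard, and the RawLayer/ProxyLayer classes are all hypothetical and exist only for this illustration.

```python
def transform_until_stable(layer, layer_xformer, max_steps=100):
    """Apply layer_xformer repeatedly until it returns the very same object."""
    for _ in range(max_steps):
        new_layer = layer_xformer(layer)
        if new_layer is layer:  # identity check, not equality
            return layer
        layer = new_layer
    # max_steps is a safety net added for this sketch only.
    raise RuntimeError("layer_xformer never returned the same reference")


class RawLayer:
    """Hypothetical layer type that still needs conversion."""


class ProxyLayer:
    """Hypothetical layer type that is already in its final form."""


def xformer(layer):
    # Convert raw layers; hand back final layers unchanged (same reference).
    if isinstance(layer, RawLayer):
        return ProxyLayer()
    return layer


result = transform_until_stable(RawLayer(), xformer)
```

A transformer that returns a fresh copy on every call would never satisfy the identity check, which is why the final layers must be returned by reference.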

HGQ.proxy.convert.convert_model(model: ~keras.src.engine.training.Model, layer_xformer: ~collections.abc.Callable[[~keras.src.engine.base_layer.Layer], ~keras.src.engine.base_layer.Layer] = <function <lambda>>)

For a keras model, convert each layer with layer_xformer, flatten all layers (remove every keras.Model used as a layer), and return a new model.

Parameters:

model (keras.Model) – Input keras model.
layer_xformer – A function that takes a keras layer and returns a keras layer (which can be a keras.Model). For the final set of layers, it must return the same reference rather than a copy, or the conversion will never terminate.

Returns:

A new keras model.

Return type:

keras.Model

Examples

Simply flatten a keras model: convert_model(model) Flatten a keras model, return a new model with all

HGQ.proxy.convert.copy_fused_weights(src: Layer, dst: Layer): Some HGQ layers have fused kernel and bias weights that differ from the raw ones (the processed weights are the ones used for deployment). This function copies the fused kernel and bias to the keras proxy.

HGQ.proxy.convert.get_all_nodes(model: Model) → set[Node]: Get all nodes in the model as a set.

HGQ.proxy.convert.get_weight(layer: Layer, name: str): Given a layer and a weight name, return the weight. The weight name may or may not contain the layer name. If the number index is missing, it is assumed to be 0.
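The lookup rule described for get_weight can be sketched as below. This assumes Keras-style weight names of the form layer_name/weight_name:index; the function find_weight and the dictionary of weights are hypothetical stand-ins, not the HGQ API.

```python
def find_weight(layer_name, weights, name):
    """Resolve a weight by name, tolerating a missing layer prefix or index.

    weights maps full Keras-style names such as 'dense/kernel:0' to values.
    """
    # A missing numeric index is assumed to be 0.
    if ":" not in name:
        name = f"{name}:0"
    # The layer name prefix is optional in the query.
    if not name.startswith(f"{layer_name}/"):
        name = f"{layer_name}/{name}"
    return weights[name]


weights = {"dense/kernel:0": "K", "dense/bias:0": "B"}
kernel = find_weight("dense", weights, "kernel")
bias = find_weight("dense", weights, "dense/bias:0")
```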

HGQ.proxy.convert.solve_dependencies(model: Model): Given a keras model, return the input nodes, output nodes and a list of (layer, requires, provides) tuples. Requires is a list of nodes that are parents of the layer, provides is the node that is the output of the layer.
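To illustrate how (layer, requires, provides) tuples can drive a rebuild of the model, here is a minimal scheduling sketch. Nodes and layers are represented by plain strings, and the function schedule is hypothetical; the real function works on Keras Node objects.

```python
def schedule(input_nodes, dependencies):
    """Order layers so each runs only after all nodes it requires exist."""
    available = set(input_nodes)
    pending = list(dependencies)
    order = []
    while pending:
        # A layer is ready once every node it requires has been produced.
        ready = [d for d in pending if all(r in available for r in d[1])]
        if not ready:
            raise ValueError("unsatisfiable dependencies")
        for layer, requires, provides in ready:
            order.append(layer)
            available.add(provides)
        pending = [d for d in pending if d not in ready]
    return order


deps = [
    ("add", ["n1", "n2"], "n3"),
    ("dense_a", ["in"], "n1"),
    ("dense_b", ["in"], "n2"),
]
order = schedule(["in"], deps)
```

Here add is listed first but is scheduled last, because both of its parent nodes must be produced before it can be applied.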

HGQ.proxy.convert.to_keras_layer(layer)
HGQ.proxy.convert.to_keras_layer(layer: ABSBaseLayer)

HGQ.proxy.convert.to_proxy_model(model: Model, aggressive: bool = True, accum_fp_max_offset: int | None = None, unary_lut_max_table_size=-1)

Given a HGQ model, return a hls4ml-ready keras model.

Parameters:

model – The HGQ model to be converted.
aggressive (default: True) – If True, use WRAP overflow mode. Significant performance degradation may occur if overflow occurs, but latency may be reduced. If False, use SAT overflow mode. Performance is more stable when it overflows, but latency may be increased.
accum_fp_max_offset (default: None) – If not None, autoset the accumulator such that the model is bit accurate (when no overflow occurs and up to fp precision). If set, use the specified number of floating bits plus result float bits as accumulator float bits. May improve latency in some rare cases, not recommended in general.
unary_lut_max_table_size (default: -1) – If greater than 0, use unary LUT for HActivation layers when the required table size is less than or equal to the specified value. If set to -1, do not use unary LUT.

HGQ.proxy.fixed_point_quantizer module

class HGQ.proxy.fixed_point_quantizer.FixedPointQuantizer(*args, **kwargs)

Bases: Layer

call(x, training=None)

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state, including tf.Variable instances and nested Layer instances, in __init__(), or in the build() method that is called automatically before call() executes for the first time.

Parameters:

inputs – Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:
- inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.
- NumPy array or Python scalar values in inputs get cast as tensors.
- Keras mask metadata is only collected from inputs.
- Layers are built (build(input_shape) method) using shape info from inputs only.
- input_spec compatibility is only checked against inputs.
- Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.
- The SavedModel input specification is generated using inputs only.
- Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.
*args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.
**kwargs – Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:
- training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.
- mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns:

A tensor or list/tuple of tensors.

classmethod from_config(config: dict)

Creates a layer from its config.

This method is the reverse of get_config, capable of instantiating the same layer from the config dictionary. It does not handle layer connectivity (handled by Network), nor weights (handled by set_weights).

Parameters:

config – A Python dictionary, typically the output of get_config.

Returns:

A layer instance.

property fusible: Delete this quantizer if no heterogeneity is detected.

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

property heterogeneous

property result_t_kif

HGQ.proxy.fixed_point_quantizer.RND(x)

HGQ.proxy.fixed_point_quantizer.RND_CONV(x)

HGQ.proxy.fixed_point_quantizer.RND_INF(x)

HGQ.proxy.fixed_point_quantizer.RND_MIN_INF(x)

HGQ.proxy.fixed_point_quantizer.RND_ZERO(x)

HGQ.proxy.fixed_point_quantizer.SAT(x, k, b)

HGQ.proxy.fixed_point_quantizer.SAT_SYM(x, k, b)

HGQ.proxy.fixed_point_quantizer.SAT_ZERO(x, k, b)

HGQ.proxy.fixed_point_quantizer.TRN(x)

HGQ.proxy.fixed_point_quantizer.TRN_ZERO(x)

HGQ.proxy.fixed_point_quantizer.WRAP(x, k, b)

HGQ.proxy.fixed_point_quantizer.WRAP_SM(x, k, b)

HGQ.proxy.fixed_point_quantizer.fixed(bits, integer_bits, RND='TRN', SAT='WRAP') → Callable

HGQ.proxy.fixed_point_quantizer.gfixed(keep_negative, bits, integer_bits, RND='TRN', SAT='WRAP') → Callable

HGQ.proxy.fixed_point_quantizer.gfixed_quantizer(x, keep_negative, bits, integer_bits, RND='TRN', SAT='WRAP'): Generalized fixed-point quantizer that should behave the same as ap_fixed/ap_ufixed. Supports high-granularity quantization and broadcasting of bitwidths. The RND and SAT modes must be strings.
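As a rough scalar illustration of what ap_fixed-style quantization does, the sketch below handles only the TRN and RND rounding modes and the WRAP and SAT overflow modes. The function fixed_quantize is hypothetical and is not the HGQ implementation, which operates on tensors and supports more modes. It assumes, as with ap_fixed, that integer_bits includes the sign bit when keep_negative is set.

```python
import math


def fixed_quantize(x, keep_negative, bits, integer_bits, rnd="TRN", sat="WRAP"):
    """Quantize a Python float to a fixed-point grid (scalar sketch)."""
    frac_bits = bits - integer_bits
    scale = 2.0 ** frac_bits
    scaled = x * scale
    # Rounding: TRN truncates toward minus infinity, RND rounds half up.
    if rnd == "TRN":
        q = math.floor(scaled)
    elif rnd == "RND":
        q = math.floor(scaled + 0.5)
    else:
        raise ValueError(f"unsupported rounding mode in this sketch: {rnd}")
    # Representable integer range for the given signedness and width.
    if keep_negative:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    # Overflow: SAT clamps to the range, WRAP wraps around modulo 2**bits.
    if sat == "SAT":
        q = min(max(q, lo), hi)
    elif sat == "WRAP":
        q = (q - lo) % (1 << bits) + lo
    else:
        raise ValueError(f"unsupported overflow mode in this sketch: {sat}")
    return q / scale


truncated = fixed_quantize(1.3, True, 8, 4)                # 1.25
rounded = fixed_quantize(1.3, True, 8, 4, rnd="RND")       # 1.3125
saturated = fixed_quantize(9.0, True, 8, 4, sat="SAT")     # 7.9375
wrapped = fixed_quantize(9.0, True, 8, 4, sat="WRAP")      # -7.0
```

The last two calls show the trade-off mentioned for to_proxy_model: saturation keeps an overflowing value near the true one, while wrapping can flip its sign.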

HGQ.proxy.fixed_point_quantizer.ufixed(bits, integer_bits, RND='TRN', SAT='WRAP') → Callable

HGQ.proxy.precision_derivation module

HGQ.proxy.precision_derivation.STREAM = False: This variable is not used for now.

HGQ.proxy.precision_derivation.activation_kif_forward(func: Callable, k: int, i: int, f: int): Given the input bitwidth (kif) of an activation function, get the output bitwidth (kif).

HGQ.proxy.precision_derivation.derive_result_kifRS_from_next_quantizers(layer: Layer) → tuple[int, int, int, str, str]: Get the result bitwidth of a layer that has a quantizer following immediately, as a tuple of (k, i, f, RND, SAT). In general, any InputLayer or layers with kernels will have a quantizer following immediately.

HGQ.proxy.precision_derivation.get_arr_container(arr: ndarray, silent=False): Get the minimal fixed-point format (in kif form) that can represent the array. If the result is greater than ~30, it is treated as inf (not representable by fixed point with a reasonable bitwidth).
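One way to find such a minimal format is sketched below for a plain list of floats. The function minimal_kif is hypothetical. It assumes that k is the sign bit, that i counts integer bits excluding the sign, and that f counts fractional bits; it searches only non-negative f and applies the cutoff of 30 to f alone. The actual function may differ on each of these points.

```python
import math


def minimal_kif(values, max_bits=30):
    """Return the smallest (k, i, f) that represents every value exactly."""
    # k: a sign bit is needed only if some value is negative.
    k = int(any(v < 0 for v in values))
    # f: smallest count of fractional bits making every value an integer
    # multiple of 2**-f; give up past max_bits and report inf.
    f = 0
    while not all((v * 2.0 ** f).is_integer() for v in values):
        f += 1
        if f > max_bits:
            return k, math.inf, math.inf
    ints = [int(v * 2.0 ** f) for v in values]
    # Magnitude bits needed per scaled integer (two's complement for negatives).
    need = [n.bit_length() if n >= 0 else (-n - 1).bit_length() for n in ints]
    i = max(need, default=0) - f
    return k, i, f


exact = minimal_kif([0.5, -1.25, 3.0])
unbounded = minimal_kif([1 / 3])
```

For the first call, 1 sign bit, 2 integer bits and 2 fractional bits cover the range from -4 to 3.75 in steps of 0.25, which contains all three values.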

HGQ.proxy.precision_derivation.get_config(layer: Layer, accum_fp_max_offset: None | int = None): Get the quantization configuration for a layer in the proxy model.

HGQ.proxy.precision_derivation.get_config_table_tablesize_result(layer: Activation): Get the quantization configuration for an activation layer in the proxy model.

HGQ.proxy.precision_derivation.get_config_wight_accum_result_bias(layer: Layer, accum_fp_max_offset: None | int = None): Get the quantization configuration for a layer with kernel in the proxy model.

HGQ.proxy.precision_derivation.get_input_kifs(layer: Layer) → tuple[tuple[int, int, int] | ndarray, ...]: Get the input bitwidth of a layer, as a tuple of (k, i, f).

HGQ.proxy.precision_derivation.get_produced_kif(layer) → tuple[int, int, int]
HGQ.proxy.precision_derivation.get_produced_kif(layer: FixedPointQuantizer)
HGQ.proxy.precision_derivation.get_produced_kif(layer: Activation | ReLU | LeakyReLU | Softmax)
HGQ.proxy.precision_derivation.get_produced_kif(layer: AveragePooling1D | AveragePooling2D | AveragePooling3D)
HGQ.proxy.precision_derivation.get_produced_kif(layer: Add | Subtract)
HGQ.proxy.precision_derivation.get_produced_kif(layer: Dot)
HGQ.proxy.precision_derivation.get_produced_kif(layer: Concatenate)
HGQ.proxy.precision_derivation.get_produced_kif(layer: Dense | Conv)
HGQ.proxy.precision_derivation.get_produced_kif(layer: Reshape | Flatten | MaxPooling3D | MaxPooling2D | MaxPooling1D | Permute)
HGQ.proxy.precision_derivation.get_produced_kif(layer: InputLayer)
HGQ.proxy.precision_derivation.get_produced_kif(x: UnaryLUT): Get the produced bitwidth of a layer, as a tuple of (k, i, f).

HGQ.proxy.precision_derivation.get_request_kif(layer: Layer) → tuple[int, int, int]
HGQ.proxy.precision_derivation.get_request_kif(layer: FixedPointQuantizer): Get the requested bitwidth of a layer, as a tuple of (k, i, f).

HGQ.proxy.precision_derivation.get_requested_kif(layer: Layer | FixedPointQuantizer) → tuple[int, int, int]: Get the bitwidth requested by downstream layers, as a tuple of (k, i, f). Here, requested means the maximum bitwidth that downstream layers may make use of.

HGQ.proxy.precision_derivation.get_result_kifRS(layer: Layer) → tuple[int, int, int, str, str]: Get the result bitwidth of a layer, as a tuple of (k, i, f, RND, SAT).
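To show how a (k, i, f, RND, SAT) tuple relates to an HLS fixed-point type, the sketch below formats one as an ap_fixed or ap_ufixed template string. The helper kifRS_to_type is hypothetical, and it assumes that i excludes the sign bit, so the total width is k + i + f and the ap_fixed integer width is k + i.

```python
def kifRS_to_type(k, i, f, rnd="TRN", sat="WRAP"):
    """Format a (k, i, f, RND, SAT) tuple as an ap_fixed-style type string."""
    width = k + i + f      # total number of bits
    integer = k + i        # ap_fixed counts the sign bit as an integer bit
    base = "ap_fixed" if k else "ap_ufixed"
    return f"{base}<{width},{integer},AP_{rnd},AP_{sat}>"


signed_type = kifRS_to_type(1, 3, 4, "RND", "SAT")
unsigned_type = kifRS_to_type(0, 2, 6)
```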

HGQ.proxy.precision_derivation.get_whatever_quantizer(layer: Layer): Find the quantizer before or after the layer.

HGQ.proxy.precision_derivation.merge_precision(available: tuple[int, int, int], request: tuple[int, int, int]): Given the available precision and the maximum precision that can be accepted downstream, return the precision that should be allocated for the data path.
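One plausible reading of this rule is an elementwise minimum: there is no point carrying more bits than are available upstream, nor more than downstream layers can use. The sketch below implements only that assumed rule under the hypothetical name merge_kif; the actual function may handle edge cases differently.

```python
def merge_kif(available, request):
    """Elementwise minimum of two (k, i, f) tuples (assumed rule)."""
    return tuple(min(a, r) for a, r in zip(available, request))


# Upstream offers 10 fractional bits, downstream uses at most 4.
allocated = merge_kif((1, 6, 10), (1, 3, 4))
```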

HGQ.proxy.precision_derivation.register_qconf(layer: Layer, accum_fp_max_offset: None | int = None): Get and register quantization configuration for a layer in the proxy model.

HGQ.proxy.precision_derivation.result_kifRS_layer_with_fusible_quantizer(layer: Layer): Get the result bitwidth of a layer that has a fusible quantizer following immediately, as a tuple of (k, i, f, RND, SAT). When the layer has exactly one quantizer following it, and the quantizer is not heterogeneous, the quantizer will be purged during synthesis, and the result bitwidth of the layer will be the same as that of the quantizer.

HGQ.proxy.unary_lut module

class HGQ.proxy.unary_lut.UnaryLUT(*args, **kwargs)

Bases: Layer

build(input_shape)

Creates the variables of the layer (for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters:

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs, **kwargs)

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state, including tf.Variable instances and nested Layer instances, in __init__(), or in the build() method that is called automatically before call() executes for the first time.

Parameters:

inputs – Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:
- inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.
- NumPy array or Python scalar values in inputs get cast as tensors.
- Keras mask metadata is only collected from inputs.
- Layers are built (build(input_shape) method) using shape info from inputs only.
- input_spec compatibility is only checked against inputs.
- Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.
- The SavedModel input specification is generated using inputs only.
- Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.
*args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.
**kwargs – Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:
- training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.
- mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns:

A tensor or list/tuple of tensors.

classmethod from_activation(activation: Layer | Callable, kif_in=None, kifRS_out=None)

get_config()

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns:

Python dictionary.

proxy_ready = True

HGQ.proxy.unary_lut.xfr_to_unary_lut(layer: Layer, max_table_size=1024)
