da4ml.cmvm.core package

Submodules

da4ml.cmvm.core.indexers module

da4ml.cmvm.core.indexers.idx_mc(state: DAState): Choose the pair with highest frequency.
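As a rough illustration of "choose the pair with highest frequency", the sketch below counts how often each 2-term pair co-occurs across output expressions and returns the most frequent one. It is not da4ml's implementation: the `rows` representation (one set of term ids per output) and the `most_common_pair` helper are hypothetical, and the real indexer works on a DAState.

```python
from collections import Counter
from itertools import combinations


def most_common_pair(rows):
    """Return ((a, b), count) for the most frequent 2-term pair, or None.

    rows: list of sets of term ids, one set per output (hypothetical layout).
    """
    freq = Counter()
    for terms in rows:
        # Every unordered pair of terms in one output is a candidate.
        freq.update(combinations(sorted(terms), 2))
    if not freq:
        return None
    # Highest count wins; ties go to the lexicographically smallest pair.
    return min(freq.items(), key=lambda kv: (-kv[1], kv[0]))


rows = [{0, 1, 2}, {0, 1}, {1, 2, 3}]
best = most_common_pair(rows)
```

Here the pairs (0, 1) and (1, 2) both occur twice, and the tie-break picks (0, 1).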

da4ml.cmvm.core.indexers.idx_mc_dc(state: DAState, absolute: bool = False): Choose the pair with the highest frequency, with a latency penalty. If absolute is True, return -1 if any latency overhead may be present.

da4ml.cmvm.core.indexers.idx_wmc(state: DAState): Choose the pair with the highest weighted most common subexpression (WMC) score.

da4ml.cmvm.core.indexers.idx_wmc_dc(state: DAState, absolute: bool = False): Choose the pair with the highest weighted most common subexpression (WMC) score, with latency and cost penalties. If absolute is True, return -1 if any latency overhead may be present.

da4ml.cmvm.core.indexers.overlap_and_accum(qint0: QInterval, qint1: QInterval): Calculate the overlap and total number of bits for two QIntervals, when represented in fixed-point format.
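One way to picture the overlap and total bit count of two fixed-point intervals is sketched below. This is an assumption-laden illustration, not the library's code: the `QInterval` stand-in with `lo`, `hi`, `step` fields, the helper names, the power-of-two step, and the choice to ignore the sign bit are all hypothetical.

```python
from math import ceil, log2
from typing import NamedTuple


class QInterval(NamedTuple):
    """Hypothetical stand-in: value range [lo, hi] on a grid of size step."""
    lo: float
    hi: float
    step: float


def bit_range(q):
    """Return (lsb, msb) bit positions, msb exclusive, sign bit ignored."""
    lsb = round(log2(q.step))  # assumes step is a power of two
    magnitude = max(abs(q.lo), abs(q.hi) + q.step)
    return lsb, ceil(log2(magnitude))


def overlap_and_total(q0, q1):
    """Bits shared by both operands, and bits spanned by either operand."""
    l0, h0 = bit_range(q0)
    l1, h1 = bit_range(q1)
    overlap = max(0, min(h0, h1) - max(l0, l1))
    total = max(h0, h1) - min(l0, l1)
    return overlap, total


a = QInterval(-128.0, 127.0, 1.0)   # bits 0..6
b = QInterval(-8.0, 7.5, 0.5)       # bits -1..2
result = overlap_and_total(a, b)
```

With these inputs the operands share bits 0 to 2 and together span bits -1 to 6.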

da4ml.cmvm.core.state_opr module

da4ml.cmvm.core.state_opr.cost_add(qint0: QInterval, qint1: QInterval, shift: int, sub: bool = False, adder_size: int = -1, carry_size: int = -1) → tuple[float, float]

Calculate the latency and cost of an addition operation.

Parameters:

qint0 (QInterval) – The first QInterval.
qint1 (QInterval) – The second QInterval.
sub (bool) – If True, the operation is a subtraction (a - b) instead of an addition (a + b).
adder_size (int) – The atomic size of the adder.
carry_size (int) – The size of the look-ahead carry.

Returns:

The latency and cost of the addition operation.

Return type:

tuple[float, float]
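The role of adder_size and carry_size can be sketched with a toy model. The formulas below are an assumption for illustration only, not the ones cost_add uses: with -1 the adder counts as one unit of cost and one level of latency, otherwise the operand width is split into chunks. The `toy_add_cost` helper and its `n_bits` argument are hypothetical.

```python
from math import ceil


def toy_add_cost(n_bits, adder_size=-1, carry_size=-1):
    """Return (latency, cost) for one n_bits-wide addition in a toy model."""
    # Cost: one unit per adder, or one unit per adder_size-bit chunk.
    cost = 1.0 if adder_size < 0 else float(ceil(n_bits / adder_size))
    # Latency: one level, or one level per carry_size-bit carry segment.
    latency = 1.0 if carry_size < 0 else float(ceil(n_bits / carry_size))
    return latency, cost


unbounded = toy_add_cost(12)                            # adders of any width
chunked = toy_add_cost(12, adder_size=4, carry_size=8)  # 3 chunks, 2 carry segments
```

The return order (latency first, then cost) follows the order given in the description above.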

da4ml.cmvm.core.state_opr.create_state(kernel: ndarray, qintervals: list[QInterval], inp_latencies: list[float], no_stat_init: bool = False)

da4ml.cmvm.core.state_opr.gather_matching_idxs(state: DAState, pair: Pair): Generate all (i_out, j0, j1) such that expr[i_out][in0, j0] and expr[i_out][in1, j1] correspond to the given pair.

da4ml.cmvm.core.state_opr.pair_to_op(pair: Pair, state: DAState, adder_size: int = -1, carry_size: int = -1)

da4ml.cmvm.core.state_opr.qint_add(qint0: QInterval, qint1: QInterval, shift: int, sub0=False, sub1=False) → QInterval

da4ml.cmvm.core.state_opr.update_expr(state: DAState, pair: Pair, adder_size: int, carry_size: int): Update the state by implementing the operation given by pair, except for the common 2-term pair frequency update.

da4ml.cmvm.core.state_opr.update_state(state: DAState, pair_chosen: Pair, adder_size: int, carry_size: int): Update the state by removing all occurrences of pair_chosen from the state, registering the op code, and updating the statistics.

da4ml.cmvm.core.state_opr.update_stats(state: DAState, pair: Pair): Update the statistics of every 2-term pair in the state that may be affected by implementing the operation given by pair.

Module contents

da4ml.cmvm.core.cmvm(kernel: ndarray, method: str = 'wmc', qintervals: list[QInterval] | None = None, inp_latencies: list[float] | None = None, adder_size: int = -1, carry_size: int = -1) → DAState

Optimizes the kernel using the CMVM algorithm.

Parameters:

kernel (np.ndarray) – The kernel to optimize.
method (str, optional) – Which indexing method to use, by default ‘wmc’ (weighted most common). Must be one of [mc, mc-dc, mc-pdc, wmc, wmc-dc, wmc-pdc, dummy].
qintervals (list[QInterval] | None, optional) – List of QIntervals for each input, by default None. If None, defaults to [-128., 127., 1.] for each input.
inp_latencies (list[float] | None, optional) – List of latencies for each input, by default None. If None, defaults to 0. for each input.
adder_size (int, optional) – The atomic size of the adder for cost computation, by default -1. If -1, each adder can be arbitrarily large, and the cost will be the number of adders.
carry_size (int, optional) – The size of the carry unit for latency computation, by default -1. If -1, each carry unit can be arbitrarily large, and the latency will be the depth of the adder tree.

Returns:

The optimized kernel as a DAState object.

Return type:

DAState
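The documented defaults and the allowed method names can be written out as a small sketch. It does not call da4ml: plain (min, max, step) tuples stand in for QInterval, and the helper names `default_cmvm_inputs` and `check_method` are hypothetical.

```python
# Method names as listed in the parameter description above.
METHODS = ("mc", "mc-dc", "mc-pdc", "wmc", "wmc-dc", "wmc-pdc", "dummy")


def default_cmvm_inputs(n_in):
    """Build the documented defaults for n_in inputs.

    Each input gets the interval [-128., 127., 1.] and a latency of 0.
    """
    qintervals = [(-128.0, 127.0, 1.0) for _ in range(n_in)]
    inp_latencies = [0.0] * n_in
    return qintervals, inp_latencies


def check_method(method="wmc"):
    """Reject method names outside the documented list."""
    if method not in METHODS:
        raise ValueError(f"method must be one of {METHODS}, got {method!r}")
    return method


qints, lats = default_cmvm_inputs(3)
```

Passing the resulting lists explicitly should be equivalent to leaving qintervals and inp_latencies as None, assuming the defaults behave as documented.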

da4ml.cmvm.core.to_solution(state: DAState, adder_size: int, carry_size: int)

Converts the DAState to a Solution object with balanced tree reduction for the non-extracted bits in the kernel.

Parameters:

state (DAState) – The DAState to convert.
adder_size (int) – The atomic size of the adder for cost computation. If -1, each adder can be arbitrarily large, and the cost will be the number of adders.
carry_size (int) – The size of the carry unit for latency computation. If -1, each carry unit can be arbitrarily large, and the latency will be the depth of the adder tree.

Returns:

The Solution object with the optimized kernel.

Return type:

Solution
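Balanced tree reduction, as mentioned in the description of to_solution, can be sketched as summing terms pairwise, level by level, so that the adder depth grows logarithmically with the number of terms. This is a generic illustration under that assumption; the `balanced_reduce` helper is hypothetical, and the library's version may also account for input latencies and bit widths.

```python
def balanced_reduce(terms):
    """Sum terms pairwise, level by level; return (total, depth)."""
    if not terms:
        return 0, 0
    level = list(terms)
    depth = 0
    while len(level) > 1:
        # Add neighbouring pairs; each pair costs one adder on this level.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            # An unpaired last term passes through to the next level.
            nxt.append(level[-1])
        level = nxt
        depth += 1
    return level[0], depth


total, depth = balanced_reduce([1, 2, 3, 4, 5])
```

Five terms need three levels here, whereas a sequential chain of adders would need four.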