Distributed Arithmetic for Machine Learning

arXiv: 2507.04535

[Figure: alkaid overview]

alkaid is a lightweight compiler for generating low-latency, static-dataflow kernels for FPGAs. It traces a quantized arithmetic graph into ALIR, applies distributed-arithmetic optimization through CMVM where useful, and emits RTL or HLS projects.

As a static-dataflow compiler, alkaid is specialized for kernels that are equivalent to combinational logic or an initiation-interval-one pipeline. The generated kernels are intended to be building blocks that users can compose into larger designs when resource sharing or time multiplexing is required.

alkaid performs distributed-arithmetic (DA) optimization to generate efficient kernels for linear DSP operations; the algorithm is described in the TRETS’25 paper. With DA optimization, linear DSP operations can be implemented with adders and lookup tables instead of hardened multipliers. Users can also exclude selected multiplication pairs from DA optimization.
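To illustrate why a multiplication by a known constant needs no multiplier, here is a minimal Python sketch. It is not alkaid's algorithm or API; the function names (`csd_digits`, `shift_add_multiply`, `shift_add_matvec`) are hypothetical and exist only for this example. It recodes each integer constant into canonical signed digits (CSD) and evaluates the product with shifts, additions, and subtractions only. It makes no attempt to share terms across rows or to minimize the adder count.

```python
def csd_digits(c: int) -> list[int]:
    """Return the canonical-signed-digit form of c, least significant digit first.

    Every digit is -1, 0, or +1, and no two adjacent digits are nonzero.
    """
    digits = []
    while c != 0:
        if c & 1:
            # +1 when c % 4 == 1, -1 when c % 4 == 3; this leaves c divisible by 4.
            d = 2 - (c & 3)
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def shift_add_multiply(x: int, c: int) -> int:
    """Compute x * c using only shifts, additions, and subtractions."""
    acc = 0
    for shift, d in enumerate(csd_digits(c)):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc


def shift_add_matvec(matrix: list[list[int]], x: list[int]) -> list[int]:
    """Multiply a constant integer matrix by a vector, one shift-add product at a time."""
    return [
        sum(shift_add_multiply(xj, c) for c, xj in zip(row, x))
        for row in matrix
    ]


print(csd_digits(7))  # [-1, 0, 0, 1], i.e. 7 = 8 - 1
print(shift_add_matvec([[3, -5], [7, 0]], [2, 4]))
```

In this sketch, multiplying by 7 costs one subtraction (`(x << 3) - x`) rather than the two additions a plain binary expansion would need. In hardware, shifts by a constant are just wiring, so only the additions and subtractions consume logic.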

Installation

pip install alkaid

Binary wheels are published for Linux x86_64 and macOS ARM64. Building from source requires Python 3.10 or newer, NumPy, meson-python, and a C++20 compiler with OpenMP support.
