Parallel Accurate Minifloat MACCs for Neural Network Inference on Versal FPGAs

Hans Jakob Damsgaard, Konstantin J. Hossfeld, Jari Nurmi, Thomas B. Preusser

Research output: Article, Scientific, peer-reviewed


Abstract

Machine Learning (ML) is ubiquitous in contemporary applications. Its need for efficient acceleration has driven vast research efforts into the quantization of neural networks with low-precision numerical formats. Models quantized with minifloat formats of eight or fewer bits have proven capable of outperforming models quantized into same-size integers. However, unlike integers, minifloats require accurate accumulation to prevent the introduction of rounding errors. We explore the design space of parallel accurate minifloat Multiply-Accumulators (MACCs) targeting the AMD Versal FPGA fabric. We experiment with three variations of the multiply-and-shift and adder tree components of a minifloat MACC. For comparison, we apply similar alterations to a parallel integer MACC. Our results show that custom compressor trees with external sign-inversion gates reduce the mean area of the minifloat MACCs by 17.7% and increase their clock frequency by 16.2%. In comparison, custom compressor trees with absorbed partial product generation gates reduce the mean area of integer MACCs by 28.1% and increase their clock frequency by 3.60%. Comparing the best-performing designs, we observe that minifloat MACCs consume 20% to 180% more resources than integer ones with same-size operands without accounting for a conversion back into a floating-point format, and 60% to 300% more resources when including it. Our data enable engineers to make informed decisions in their designs of deeply-integrated embedded ML solutions when trading off training and fine-tuning effort vs. resource cost.
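The requirement for accurate accumulation can be illustrated with a small software model. The sketch below is not the authors' hardware design; it assumes a hypothetical 8-bit layout with 1 sign bit, 4 exponent bits, and 3 mantissa bits (bias 7, subnormals supported, no Inf/NaN encodings), and the names `decode` and `macc_exact` are illustrative. Each product is formed as an integer significand product shifted left by the sum of the operand exponents and then added into one wide integer, so no rounding occurs until the final scaling.

```python
from fractions import Fraction

# Assumed format for illustration only: 1 sign, 4 exponent, 3 mantissa bits.
E_BITS, M_BITS = 4, 3
BIAS = (1 << (E_BITS - 1)) - 1  # 7 for a 4-bit exponent


def decode(bits):
    """Split an 8-bit code into (signed integer significand, left shift)."""
    sign = -1 if (bits >> (E_BITS + M_BITS)) & 1 else 1
    exp = (bits >> M_BITS) & ((1 << E_BITS) - 1)
    frac = bits & ((1 << M_BITS) - 1)
    # Normal numbers carry a hidden one; subnormals (exp == 0) do not.
    sig = frac if exp == 0 else frac | (1 << M_BITS)
    # Subnormals share the exponent of the smallest normal number.
    return sign * sig, max(exp, 1) - 1


def macc_exact(a_codes, b_codes):
    """Multiply-accumulate without intermediate rounding."""
    acc = 0  # Python integers are unbounded, modelling a wide accumulator
    for a, b in zip(a_codes, b_codes):
        sig_a, shift_a = decode(a)
        sig_b, shift_b = decode(b)
        # Multiply-and-shift: align the product to a common fixed point.
        acc += (sig_a * sig_b) << (shift_a + shift_b)
    # Every operand has an implicit scale of 2**(1 - BIAS - M_BITS);
    # a product therefore carries that scale squared.
    return acc * Fraction(1, 1 << (2 * (BIAS + M_BITS - 1)))


# 256 * 1 + 2**-9 * 1 - 256 * 1: the small term survives the cancellation.
print(macc_exact([0x78, 0x01, 0xF8], [0x38, 0x38, 0x38]))
```

In this model the tiny middle product is preserved even though it is many orders of magnitude below the two large terms that cancel. An accumulator that rounded after each addition to a narrow floating-point format could lose it, which is the kind of error that accurate accumulation avoids.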

Original language: English
Pages: 2181-2194
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Volume: 44
Issue number: 6
Early online date: 4 Dec 2024
Publication status: Published - 2025
OKM publication type: A1 Original article in a scientific journal

Publication Forum (Jufo) level

  • Jufo level 2

ASJC Scopus subject areas

  • Software
  • Computer Graphics and Computer-Aided Design
  • Electrical and Electronic Engineering
