- Published 2015 in Algorithmica

We consider the problem of constructing fast and small parallel prefix adders for non-uniform input arrival times. In modern computer chips, adders with up to hundreds of inputs occur frequently, and they are often embedded into more complex circuits, e.g. multipliers, leading to instance-specific non-uniform input arrival times. Most previous results are based on representing binary carry-propagate adders as parallel prefix graphs, in which pairs of generate and propagate signals are combined using complex gates called prefix gates. Examples of commonly-used adders are constructed based on the Kogge–Stone or Ladner–Fischer prefix graphs. Adders constructed in this model usually minimize the delay in terms of these prefix gates. However, the delay in terms of logic gates can be worse by a factor of two. In contrast, we aim to minimize the delay of the underlying logic circuit directly. We prove a lower bound on the delay of a carry bit computation achievable by any prefix carry bit circuit and develop an algorithm that computes a prefix carry bit circuit with optimum delay up to a small additive constant. Our algorithm improves the running time of a previous dynamic program for constructing a prefix carry bit from $$\mathcal {O}(n^3)$$ O ( n 3 ) to $$\mathcal {O}(n \log ^2 n)$$ O ( n log 2 n ) while simultaneously improving the delay and size guarantee, where n is the number of bits in the summands. Furthermore, we use this algorithm as a subroutine to compute a full adder in near-linear time, reducing the delay approximation factor of 2 from previous approaches to 1.441 for our algorithm.

@article{Held2015FastPA,
title={Fast Prefix Adders for Non-uniform Input Arrival Times},
author={Stephan Held and Sophie Theresa Spirkl},
journal={Algorithmica},
year={2015},
volume={77},
pages={287-308}
}