Corpus ID: 730782

Fast exact summation using small and large superaccumulators

@article{Neal2015FastES,
  title={Fast exact summation using small and large superaccumulators},
  author={R. Neal},
  journal={ArXiv},
  year={2015},
  volume={abs/1505.05571}
}
  • R. Neal
  • Published 2015
  • Mathematics, Computer Science
  • ArXiv
  • I present two new methods for exactly summing a set of floating-point numbers, and then correctly rounding to the nearest floating-point number. Higher accuracy than simple summation (rounding after each addition) is important in many applications, such as finding the sample mean of data. Exact summation also guarantees identical results with parallel and serial implementations, since the exact sum is independent of order. The new methods use variations on the concept of a "superaccumulator…
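
The contrast the abstract draws between simple summation and exact summation can be illustrated with a short Python sketch. This is not the superaccumulator method of the paper: the exact sum here is computed with rational arithmetic from the standard library, purely as a reference, and the names `naive_sum` and `exact_sum` are hypothetical helpers introduced for this illustration. It assumes IEEE 754 double-precision floats.

```python
import math
from fractions import Fraction


def naive_sum(values):
    """Simple summation: the running total is rounded after each addition."""
    total = 0.0
    for v in values:
        total += v
    return total


def exact_sum(values):
    """Illustrative exact summation (NOT the paper's superaccumulator).

    Each float converts to a Fraction without error, the Fractions are
    added exactly, and the final conversion back to float rounds once,
    to the nearest representable value.
    """
    return float(sum(Fraction(v) for v in values))


# The same four numbers in two different orders; the true sum is 2.
order_a = [1e16, 1.0, -1e16, 1.0]
order_b = [1.0, 1.0, 1e16, -1e16]

print(naive_sum(order_a), naive_sum(order_b))  # 1.0 2.0 (order-dependent)
print(exact_sum(order_a), exact_sum(order_b))  # 2.0 2.0 (order-independent)

# The standard library also offers math.fsum, which tracks partial sums
# to avoid intermediate rounding error.
print(math.fsum(order_a), math.fsum(order_b))  # 2.0 2.0
```

In `order_a`, adding 1.0 to 1e16 is lost to rounding because adjacent doubles near 1e16 are 2 apart, so simple summation returns 1.0. Because the exact sum does not depend on the order of the terms, a serial and a parallel reduction over the same data would agree, which is the reproducibility property the abstract describes.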
