# Number parsing at a gigabyte per second

@article{Lemire2021NumberPA, title={Number parsing at a gigabyte per second}, author={Daniel Lemire}, journal={Software: Practice and Experience}, year={2021}, volume={51}, pages={1700 - 1727} }

With disks and networks providing gigabytes per second, parsing decimal numbers from strings becomes a bottleneck. We consider the problem of parsing decimal numbers to the nearest binary floating‐point value. The general problem requires variable‐precision arithmetic. However, we need at most 17 digits to represent 64‐bit standard floating‐point numbers (IEEE 754). Thus, we can represent the decimal significand with a single 64‐bit word. By combining the significand and precomputed tables, we…
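The single-word representation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simplified decimal grammar and uses hypothetical helper names (`decompose`, `parse_fast_path`) that do not come from the paper. It splits a decimal string into an integer significand `w` and a power-of-ten exponent `q`, then applies the classic exact fast path: when `w` is at most 2^53 and `|q|` is at most 22, both `w` and `10**|q|` are exactly representable as doubles, so one multiplication or division is correctly rounded. This is not the table-based algorithm of the paper, only the easy case that such algorithms typically handle first.

```python
import re

# Simplified grammar (an assumption for this sketch): optional sign, digits,
# optional fraction, optional exponent. No infinities, NaNs, or hex floats.
_DECIMAL = re.compile(r"([+-]?)(\d*)(?:\.(\d*))?(?:[eE]([+-]?\d+))?")


def decompose(text):
    """Return (negative, w, q) such that the value equals +/- w * 10**q."""
    m = _DECIMAL.fullmatch(text)
    if m is None or not (m.group(2) or m.group(3)):
        raise ValueError(f"not a decimal number: {text!r}")
    sign, int_part, frac_part, exp = m.groups()
    frac_part = frac_part or ""
    w = int(int_part + frac_part)          # decimal significand as an integer
    q = int(exp or 0) - len(frac_part)     # each fractional digit lowers the exponent
    return sign == "-", w, q


def parse_fast_path(text):
    """Return the nearest double, or None when the exact fast path does not apply."""
    negative, w, q = decompose(text)
    if w > 2**53 or abs(q) > 22:
        return None                        # a full algorithm would take over here
    value = float(w)                       # exact: w fits in the 53-bit significand
    power = float(10 ** abs(q))            # exact: 10**22 is representable as a double
    value = value * power if q >= 0 else value / power
    return -value if negative else value
```

For example, `decompose("0.1")` yields `(False, 1, -1)`, and `parse_fast_path("1e23")` returns `None` because the exponent exceeds 22, which is the kind of input that needs the more general machinery.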

## 2 Citations

### Fast Number Parsing Without Fallback

- Computer Science
- Software: Practice and Experience
- 2023

The check that leads to a fallback function to ensure correctness is never triggered in practice, and it is proved that the fallback is unnecessary, which slightly simplifies the algorithm and its implementation.

## References


### Parsing gigabytes of JSON per second

- Computer Science
- The VLDB Journal
- 2019

This work presents the first standard-compliant JSON parser to process gigabytes of data per second on a single core, using commodity processors, and makes extensive use of single instruction, multiple data (SIMD) instructions.

### Printing floating-point numbers quickly and accurately with integers

- Computer Science
- PLDI '10
- 2010

We present algorithms for accurately converting floating-point numbers to decimal representation. They are fast (up to 4 times faster than commonly used algorithms that use high-precision integers)…

### Printing floating-point numbers: a faster, always correct method

- Computer Science
- POPL
- 2016

Errol is presented, a new complete algorithm that is guaranteed to produce correct and optimal results for all inputs while simultaneously being 2x faster than the incomplete Grisu3 and 4x faster than previous complete methods.

### How to print floating-point numbers accurately

- Computer Science
- SIGP
- 2004

Algorithms for accurately converting floating-point numbers to decimal representation, and a modification of the well-known algorithm for radix conversion of fixed-point fractions by multiplication, for use in fixed-format applications.

### Ryū revisited: printf floating point conversion

- Computer Science
- Proc. ACM Program. Lang.
- 2019

It is shown that both Ryū and Ryū Printf generalize to arbitrary number bases, which implies the existence of a fast algorithm to convert from base-10 to base-2, as long as the maximum precision of the input is known a priori.

### How to read floating point numbers accurately

- Computer Science
- PLDI '90
- 1990

This paper presents an efficient algorithm that always finds the best approximation, and when using 64 bits of precision to compute IEEE double precision results, the algorithm avoids higher-precision arithmetic over 99% of the time.

### A decimal floating-point specification

- Computer Science
- Proceedings 15th IEEE Symposium on Computer Arithmetic. ARITH-15 2001
- 2001

This paper proposes a decimal format which meets the requirements of existing standards for decimal arithmetic and is efficient for hardware implementation, in the hope that designers will consider providing decimal arithmetic in future microprocessors and that future decimal software specifications will consider hardware efficiencies.

### Faster remainder by direct computation: Applications to compilers and software libraries

- Mathematics, Computer Science
- Softw. Pract. Exp.
- 2019

A generally applicable algorithm is presented to compute the remainder of the division by a constant more directly than the usual route, which derives it from the quotient by a multiplication and a subtraction; new tight bounds are also derived on the precision required when representing the inverse of the divisor.

### A customized precision format based on mantissa segmentation for accelerating sparse linear algebra

- Computer Science
- Concurr. Comput. Pract. Exp.
- 2020

A customized precision memory format derived by splitting the mantissa (significand) of standard IEEE formats into segments, such that values can be accessed faster if lower accuracy is acceptable, is presented.

### Compiling for SIMD Within a Register

- Computer Science
- LCPC
- 1998

This paper focuses on how these missing operations can be implemented using either the existing SWAR hardware or even conventional 32-bit integer instructions, and offers a few new challenges for compiler optimization.