# Efficient Computation of Positional Population Counts Using SIMD Instructions

@article{Klarqvist2019EfficientCO, title={Efficient Computation of Positional Population Counts Using SIMD Instructions}, author={Marcus D. R. Klarqvist and Wojciech Mula and Daniel Lemire}, journal={ArXiv}, year={2019}, volume={abs/1911.02696} }

In several fields such as statistics, machine learning, and bioinformatics, categorical variables are frequently represented as one-hot encoded vectors. For example, given 8 distinct values, we map each value to a byte where only a single bit has been set. We are motivated to quickly compute statistics over such encodings. Given a stream of k-bit words, we seek to compute k distinct sums corresponding to bit values at indexes 0, 1, 2, ..., k-1. If the k-bit words are one-hot encoded then the…
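To make the problem statement concrete, here is a minimal scalar sketch of a positional population count. It is a plain reference loop for illustration only, not the SIMD method the paper proposes; the function name `positional_popcount` and the sample data are assumptions introduced for this example.

```python
def positional_popcount(words, k=8):
    """Return k sums; counts[i] is the number of words with bit i set."""
    counts = [0] * k
    for w in words:
        for i in range(k):
            # Extract bit i of the word and add it to the i-th sum.
            counts[i] += (w >> i) & 1
    return counts


# Hypothetical sample: category values in 0..7, one-hot encoded as
# bytes with exactly one bit set (value c becomes 1 << c).
categories = [0, 3, 3, 7, 1, 3]
encoded = [1 << c for c in categories]

print(positional_popcount(encoded))  # -> [1, 1, 0, 3, 0, 0, 0, 1]
```

With one-hot input, each sum is simply the number of occurrences of the corresponding category in the sample. The loop itself does not require one-hot words; it counts set bits per position for arbitrary k-bit input.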

