Multiple byte processing with full-word instructions

  • Leslie Lamport
  • Published 1 August 1975
  • Computer Science, Psychology
  • Communications of the ACM, pages 471-475
A method is described which allows parallel processing of packed data items using only ordinary full-word computer instructions, even though the processing requires operations whose execution is contingent upon the value of a datum. It provides a useful technique for processing small data items such as alphanumeric characters. 
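To make the idea concrete, here is a minimal Python sketch of one well-known way to do a data-dependent operation on packed bytes with whole-word arithmetic. It uppercases eight ASCII characters at once without inspecting any byte individually. It illustrates the general approach only and is not taken from the paper. The names `ge_mask`, `to_upper_packed`, and `upper8` are hypothetical, and the sketch assumes every byte is 7-bit ASCII, so the high bit of each byte is free to act as a guard bit.

```python
LANES = 8  # bytes packed into one 64-bit word (an assumption of this sketch)
H = int.from_bytes(b"\x80" * LANES, "little")  # high (guard) bit of every byte


def broadcast(byte: int) -> int:
    """Replicate one byte value into every byte lane of the word."""
    return int.from_bytes(bytes([byte]) * LANES, "little")


def ge_mask(x: int, c: int) -> int:
    """Set the high bit of each lane whose 7-bit value is >= c (1 <= c <= 128).

    Setting the guard bit before subtracting keeps every lane positive,
    so no borrow crosses a lane boundary.
    """
    return ((x | H) - broadcast(c)) & H


def to_upper_packed(x: int) -> int:
    """Uppercase all ASCII lowercase letters in a packed word of ASCII bytes."""
    at_least_a = ge_mask(x, ord("a"))
    past_z = ge_mask(x, ord("z") + 1)
    is_lower = at_least_a & ~past_z  # 0x80 in each lane holding 'a'..'z'
    return x ^ (is_lower >> 2)       # 0x80 >> 2 == 0x20, the ASCII case bit


def upper8(chunk: bytes) -> bytes:
    """Convenience wrapper: pack 8 ASCII bytes, convert, and unpack."""
    word = int.from_bytes(chunk, "little")
    return to_upper_packed(word).to_bytes(LANES, "little")
```

For example, `upper8(b"Hi, wrld")` returns `b"HI, WRLD"`. The per-byte condition is turned into a mask, and the mask is then used as an operand, so the conditional update needs no branch.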

BitWeaving: fast scans for main memory data processing

The proposed BitWeaving technique exploits the bit-level parallelism available in modern processors to deliver significant performance benefits over existing state-of-the-art methods, in some cases improving performance by more than an order of magnitude.

Accelerating aggregation using intra-cycle parallelism

  • Ziqiang Feng, Eric Lo
  • Computer Science
    2015 IEEE 31st International Conference on Data Engineering
  • 2015
A suite of bit-parallel algorithms is presented to accelerate all standard aggregation operations (SUM, MIN, MAX, AVG, MEDIAN, COUNT), designed to fully leverage the intra-cycle parallelism in CPU cores when aggregating words of packed values.
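As a rough illustration of aggregating packed values with ordinary word instructions, the sketch below sums the eight 8-bit fields of a 64-bit word using a generic mask-and-add reduction. This is a textbook technique shown under stated assumptions, not the algorithms from the paper above, and the function name `sum_packed_bytes` is hypothetical.

```python
M8 = 0x00FF00FF00FF00FF    # selects the even-numbered bytes of a 64-bit word
M16 = 0x0000FFFF0000FFFF   # selects the even-numbered 16-bit fields
M32 = 0x00000000FFFFFFFF   # selects the low 32-bit half


def sum_packed_bytes(word: int) -> int:
    """Return the sum of the eight unsigned bytes packed in a 64-bit word."""
    # Step 1: add neighbouring bytes, giving four 16-bit partial sums.
    s = (word & M8) + ((word >> 8) & M8)
    # Step 2: add neighbouring 16-bit sums, giving two 32-bit partial sums.
    s = (s & M16) + ((s >> 16) & M16)
    # Step 3: add the two halves to get the final total.
    return (s & M32) + (s >> 32)
```

Each step halves the number of partial sums while doubling the field width, so no partial sum can overflow into its neighbour.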

Efficient Lightweight Compression Alongside Fast Scans

The increasing main-memory capacity has allowed query execution to occur primarily in main memory. Database systems employ compression, not only to fit the data in main memory, but also to address

Vectorizing Database Column Scans with Complex Predicates

This paper presents a framework for vectorized scans with more complex predicates and shows that a performant vectorized implementation is possible using the new Intel AVX2 instruction set, and improves previous algorithms by leveraging the increased vector-width.

Broadword Implementation of Parenthesis Queries

This work proposes broadword algorithms for finding matching closed parentheses and the k-th far closed parenthesis, which work in time O(log w) on a word of w bits, and contain no branch and no test instruction.

Functions realizable with word-parallel logical and two's-complement addition instructions

There is more than one way to implement recursion using stacks, and the charging scheme adopted, whereby each basic test or assignment counts 1 unit of time irrespective of its type, can be modified without affecting the relative merits of the solutions.

Hacker's Delight

The term "hacker" in the title is meant in the original sense of an aficionado of computers, someone who enjoys making computers do new things, or do old things in a new and clever way.

Accelerating mono and multi-column selection predicates in modern main-memory database systems

This thesis tackles the aforementioned challenges of creating hardware-sensitive operator implementations automatically and exploiting the relation between multiple selection predicates and introduces the abstraction of code optimizations as a means to generate hardware-sensitive code variants automatically.

FPGA vs. SIMD: Comparison for Main Memory-Based Fast Column Scan

This paper investigates several well-known fast column scan techniques, implemented both with SIMD (Single Instruction Multiple Data) vectorization and with Field Programmable Gate Arrays (FPGAs), to determine the best column scan technique for each implementation mechanism.

Understanding and Optimizing Conjunctive Predicates Under Memory-Efficient Storage Layouts

A hybrid empirical/analytical cost model is proposed to unveil the performance characteristics of memory-efficient storage layouts when applied to predicate evaluation, and a simple execution scheme, Hebe, is proposed, which is order-oblivious while maintaining high performance.



On the Consecutive Retrieval Property in File Organization

Some observations about boundary conditions are added to Ghosh's study of the CR property, which is a property of a query that can be answered by the retrieval of consecutive records in a file.

File organization: the consecutive retrieval property

Conditions under which the consecutive retrieval property exists and remains invariant have been established, and an outline for designing an information retrieval system based on the consecutive retrieval property is discussed.

File Organization: Consecutive Storage of Relevant Records on Drum-Type Storage

On the theory of consecutive storage of relevant records