PaC-trees: supporting parallel and compressed purely-functional collections

  • Laxman Dhulipala, Guy E. Blelloch, Yan Gu, Yihan Sun
  • Published 12 April 2022
  • Computer Science
  • Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation
Many modern programming languages are shifting toward a functional style for collection interfaces such as sets, maps, and sequences. Functional interfaces offer many advantages, including being safe for parallelism and providing simple and lightweight snapshots. However, existing high-performance functional interfaces such as PAM, which are based on balanced purely-functional trees, incur large space overheads for large-scale data analysis due to storing every element in a separate node in a… 

Parallel Cover Trees and their Applications

This paper presents highly parallel, work-efficient cover tree algorithms that handle batch insertions (and thus construction) and batch deletions. Three key ideas guarantee work-efficiency: a prefix-doubling scheme, a careful design that limits the size of the graph on which MIS is applied, and a strategy for propagating information among the levels of the cover tree.

Many Sequential Iterative Algorithms Can Be Parallel and (Nearly) Work-efficient

This paper presents work-efficient and round-efficient algorithms for a variety of classic problems, proposes general approaches for obtaining them, and uses two types of general techniques to enable work-efficiency and high parallelism.

A Work-Efficient Parallel Algorithm for Longest Increasing Subsequence

This paper proposes a parallel LIS algorithm that costs $O(n \log k)$ work, $\tilde{O}(k)$ span, and $O(n)$ space, and is much simpler than previous parallel LIS algorithms.

Hierarchical Agglomerative Graph Clustering in Poly-Logarithmic Depth

It is shown that ParHAC obtains a 50.1x speedup on average over the best sequential baseline, while achieving quality similar to the exact HAC algorithm, and can cluster one of the largest publicly available graph datasets with 124 billion edges in a little over three hours using a commodity multicore machine.



Low-latency graph streaming using compressed purely-functional trees

This paper designs theoretically-efficient and practical algorithms for performing batch updates to C-trees, and shows that it can store massive dynamic real-world graphs using only a few bytes per edge, thereby achieving space usage close to that of the best static graph processing frameworks.

Purely functional data structures

This work describes several techniques for designing functional data structures, and numerous original data structures based on these techniques, including multiple variations of lists, queues, double-ended queues, and heaps, many supporting more exotic features such as random access or efficient catenation.

On Supporting Efficient Snapshot Isolation for Hybrid Workloads with Multi-Versioned Indexes

The Parallel Binary Tree (P-Tree) index structure is proposed, based on pure (immutable) data structures that use path-copying for updates for fast multi-versioning, to achieve SI and MVCC for multicore in-memory HTAP DBMSs.

PAM: parallel augmented maps

An interface for ordered maps that is augmented to support fast range queries and sums, and a parallel and concurrent library called PAM (Parallel Augmented Maps) that implements the interface are described.

Optimal Parallel Algorithms in the Binary-Forking Model

This paper explores techniques for designing optimal algorithms when limited to binary forking and assuming asynchrony, and develops the first algorithms with optimal work and span in the binary-forking model.

Implicitly-threaded parallelism in Manticore

This paper presents Manticore, a language for building parallel applications on commodity multicore hardware including a diverse collection of parallel constructs for different granularities of work, and focuses on the implicitly-threaded parallel constructs in the high-level functional language.

Getting to the Root of Concurrent Binary Search Tree Performance

This paper focuses on optimistic binary search trees and performs a detailed performance analysis of 10 state-of-the-art BSTs on large scale x86-64 hardware, using both microbenchmarks and an in-memory database system.

Theory and Practice of Chunked Sequences

This paper presents chunking techniques, one direct and one based on bootstrapping, that can reduce the practical overheads of sophisticated sequence data structures, such as finger trees, making them competitive in practice with special-purpose data structures.

Parallel Write-Efficient Algorithms and Data Structures for Computational Geometry

This paper designs parallel write-efficient geometric algorithms that perform asymptotically fewer writes than standard algorithms for the same problem, and introduces several techniques for obtaining write-efficiency, including DAG tracing, prefix doubling, and α-labeling.

Constant-time snapshots with applications to concurrent data structures

Given a concurrent data structure, this work presents an approach for efficiently taking snapshots of its constituent CAS objects that supports a constant-time operation that returns a snapshot handle that can later be used to read the value of any base object at the time the snapshot was taken.