• Corpus ID: 21675214

# An O(N) Sorting Algorithm: Machine Learning Sorting

@article{Zhao2018AnOS,
  title={An O(N) Sorting Algorithm: Machine Learning Sorting},
  author={Hanqing Zhao and Yuehan Luo},
  journal={ArXiv},
  year={2018},
  volume={abs/1805.04272}
}
• Published 11 May 2018
• Computer Science
• ArXiv
We propose an $O(N\cdot M)$ sorting algorithm based on a machine learning method, which shows huge potential for sorting big data. This sorting algorithm can be applied to parallel sorting and is suitable for GPU or TPU acceleration. Furthermore, we discuss the application of this algorithm to sparse hash tables.
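The abstract describes using a learned model to predict where each element belongs in the sorted output, then correcting residual disorder. As an illustration only, here is a minimal Python sketch in which a sampled empirical CDF stands in for the paper's trained model (that substitution is an assumption; the paper itself trains a model on the data distribution):

```python
import random

def ml_sort(data, num_bins=1000):
    """Sketch of learned sorting: approximate the data's CDF from a
    small random sample (a stand-in for the paper's trained model),
    predict each element's position, scatter into buckets in one O(N)
    pass, then clean up the small buckets and concatenate."""
    n = len(data)
    # "Train": estimate min/max (a crude CDF) from a random sample.
    sample = sorted(random.sample(data, min(n, 100)))
    lo, hi = sample[0], sample[-1]
    span = (hi - lo) or 1.0
    # Predict a bucket for each element and scatter: one linear pass.
    buckets = [[] for _ in range(num_bins)]
    for x in data:
        pos = int((x - lo) / span * (num_bins - 1))
        buckets[min(max(pos, 0), num_bins - 1)].append(x)
    # The bucket index is monotone in x, so sorting each small bucket
    # and concatenating yields a fully sorted result.
    out = []
    for b in buckets:
        out.extend(sorted(b))
    return out
```

Because the predicted bucket index is a non-decreasing function of the value, correctness does not depend on how accurate the learned model is; only the evenness of the bucket sizes (and hence the running time) does.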
## 2 Citations

• Computer Science
OSDI
• 2021
This work proposes a learning-based framework that explicitly optimizes concurrency control via offline training to maximize performance, and builds Polyjuice, a novel algorithm that can outperform existing algorithms by specializing to a given workload.
• Computer Science
ArXiv
• 2019
Doraemon caches previously trained models and incrementally fine-tunes them for similar access patterns and data distributions, improving query latency by 45.1% and reducing model re-training time to 1/20.

## References

SHOWING 1-10 OF 30 REFERENCES

• Computer Science
J. Parallel Distributed Comput.
• 1992
• Computer Science
2011 Frontiers of Information Technology
• 2011
This paper presents an analysis of parallel and sequential bitonic, odd-even, and rank-sort algorithms on different GPU and CPU architectures, written to exploit the task-parallelism model available on multi-core GPUs using the OpenCL specification.
• Computer Science
• 1998
A comparative performance evaluation of three different parallel sorting algorithms: bitonic sort, sample sort, and parallel radix sort shows that the relative performance of the algorithms differed on the various machines.
• Computer Science
J. Parallel Distributed Comput.
• 1998
A novel variation on sample sort that uses only two rounds of regular all-to-all personalized communication in a scheme that yields very good load balancing with virtually no overhead; unlike previous efficient algorithms, its performance is invariant over the set of input distributions.
• F. Leighton
• Computer Science, Mathematics
IEEE Transactions on Computers
• 1985
Tight upper and lower bounds are proved on the number of processors, information transfer, wire area, and time needed to sort N numbers in a bounded-degree fixed-connection network.
• Computer Science
2010 International Conference on Reconfigurable Computing and FPGAs
• 2010
The hardware implementation and optimization of parallel recursive algorithms that sort data using binary trees and a hierarchical finite state machine are described; the performance of sorting operations is improved over previous implementations.
• Computer Science
2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA)
• 2017
This paper evaluates a custom ASIC, called a Tensor Processing Unit (TPU), deployed in datacenters since 2015, which accelerates the inference phase of neural networks (NN), and compares it to a server-class Intel Haswell CPU and an Nvidia K80 GPU, contemporaries deployed in the same datacenters.
The toolkit leverages the principles of SECDA, a hardware/software co-design methodology, to reduce the design time of optimized DNN inference accelerators on edge devices with FPGAs and includes modules for cost-effective SystemC simulation, profiling, and AXI-based data communication.
• Art
• 2001
Here the authors haven’t even started the project yet, and already they’re forced to answer many questions: what will this thing be named, what directory will it be in, what type of module is it, how should it be compiled, and so on.