AVX-512
Known as: AVX3, AVX512, Advanced Vector Extensions 512
AVX-512 are 512-bit extensions to the 256-bit Advanced Vector Extensions SIMD instructions for the x86 instruction set architecture (ISA), proposed by…
Wikipedia
Related topics
24 relations
256-bit
APL
Addressing mode
Advanced Vector Extensions
Papers overview
Semantic Scholar uses AI to extract papers important to this topic.
2019
Designing efficient SIMD algorithms for direct Connected Component Labeling
A. Hennequin, I. Masliah, L. Lacassagne
WPMVP'19
2019
Corpus ID: 59337271
Connected Component Labeling (CCL) is a fundamental algorithm in computer vision, and is often required for real-time…
2019
An Efficient Convolutional Neural Network Computation using AVX-512 Instructions
Hiroki Kataoka, Kohei Yamashita, K. Nakano, Yasuaki Ito, Akihiko Kasagi, T. Tabaru
2019
Corpus ID: 198342690
Recently, Convolutional Neural Networks (CNNs) are widely used for image processing. Since the computation cost is high, it is…
2018
Fused Table Scans: Combining AVX-512 and JIT to Double the Performance of Multi-Predicate Scans
Markus Dreseler, Jan Kossmann, Johannes Frohnhofen, M. Uflacker, H. Plattner
IEEE 34th International Conference on Data…
2018
Corpus ID: 49572429
Recent work has started to combine two approaches for faster query execution: Vectorization and Just-in-Time Compilation (JIT…
2017
Practical Implementation of Lattice QCD Simulation on Intel Xeon Phi Knights Landing
I. Kanamori, H. Matsufuru
International Symposium on Computing and…
2017
Corpus ID: 13771669
We investigate implementation of lattice Quantum Chromodynamics (QCD) code on the Intel Xeon Phi Knights Landing (KNL). The most…
2017
Optimizing the Decoding Process of a Post-Quantum Cryptographic Algorithm
Antonio Guimarães, Diego F. Aranha, E. Borin
2017
Corpus ID: 53453861
QcBits is a state-of-the-art constant-time implementation of a code-based encryption scheme for post-quantum public key…
2017
From MPI to MPI+OpenACC: Conversion of a legacy FORTRAN PCG solver for the spherical Laplace equation
R. Caplan, Z. Mikić, J. Linker
arXiv.org
2017
Corpus ID: 24145579
A real-world example of adding OpenACC to a legacy MPI FORTRAN Preconditioned Conjugate Gradient code is described, and timing…
2017
A Hardware-Oblivious Optimizer for Data Stream Processing
Constantin Pohl
PhD@VLDB
2017
Corpus ID: 29805360
High throughput and low latency are key requirements for data stream processing. This is achieved typically through different…
2016
Portable Explicit Vectorization Intrinsics
P. Souza, L. Borges, Cedric Andreolli, P. Thierry
2016
Corpus ID: 64330171
2015
Efficient execution of recursive programs on commodity vector hardware
Bin Ren, Youngjoon Jo, S. Krishnamoorthy, Kunal Agrawal, Milind Kulkarni
ACM-SIGPLAN Symposium on Programming Language…
2015
Corpus ID: 6287970
The pursuit of computational efficiency has led to the proliferation of throughput-oriented hardware, from GPUs to increasingly…
2015
Optimizing Total Energy–Mass Flux (TEMF) Planetary Boundary Layer Scheme for Intel’s Many Integrated Core (MIC) Architecture
Jarno Mielikäinen, Bormin Huang, Hung-Lung Huang
IEEE Journal of Selected Topics in Applied Earth…
2015
Corpus ID: 9035362
In order to make use of the ever-improving microprocessor performance, the applications must be modified to take advantage of the…