Based on Intel's Many Integrated Core (MIC) architecture, Intel Xeon Phi is one of the few truly many-core CPUs, featuring around 60 fairly powerful cores, two levels of cache, and graphics memory, all interconnected by a very fast ring. Given its promised ease of use and high performance, we took Xeon Phi out for a test drive. In this paper, we present …
With at least 50 cores, Intel Xeon Phi is a true many-core architecture. Featuring fairly powerful cores, two cache levels, and very fast interconnections, the Xeon Phi can reach a theoretical peak of 1000 GFLOPs and over 240 GB/s. These numbers, as well as its flexibility (it can be used either as a coprocessor or as a stand-alone processor), are very tempting for …
With a minimum of 50 cores, Intel's Xeon Phi is a true many-core architecture. Featuring fairly powerful cores, two levels of caches, and a very fast interconnection, the Xeon Phi is able to achieve a theoretical peak of 1000 …
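A theoretical peak such as the 1000 GFLOPs quoted above is usually obtained by multiplying core count, clock rate, SIMD width, and floating-point operations per lane per cycle. The sketch below illustrates that arithmetic only. The specific values (60 cores, 1.053 GHz, 8 double-precision lanes per 512-bit vector unit, 2 operations per lane from a fused multiply-add) are assumptions chosen for illustration and are not taken from the abstracts, and `peak_gflops` is a hypothetical helper name.

```python
def peak_gflops(cores, clock_ghz, simd_lanes, flops_per_lane_per_cycle):
    """Theoretical peak in GFLOP/s: every core issues a full-width
    vector operation on every cycle (an upper bound, not a measurement)."""
    return cores * clock_ghz * simd_lanes * flops_per_lane_per_cycle


# Assumed configuration (not stated in the abstracts):
#   60 cores, 1.053 GHz clock,
#   8 double-precision lanes per 512-bit vector unit,
#   2 flops per lane per cycle (fused multiply-add).
xeon_phi_dp = peak_gflops(cores=60, clock_ghz=1.053,
                          simd_lanes=8, flops_per_lane_per_cycle=2)

print(f"Assumed double-precision peak: {xeon_phi_dp:.2f} GFLOP/s")
```

Under these assumed values the result lands close to the 1000 GFLOPs figure. Real applications typically reach only a fraction of such a peak, since it presumes perfect vectorization and no memory stalls.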
HOSTA is an in-house high-order CFD software package that can simulate complex flows with complex geometries. Large-scale high-order CFD simulations using HOSTA require massive HPC resources, which motivates us to port it onto modern GPU-accelerated supercomputers like Tianhe-1A. To achieve a greater speedup and fully tap the potential of Tianhe-1A, we collaborate …
This paper describes performance tuning experiences with a parallel CFD code to enhance its performance and flexibility on large-scale parallel computers. The code solves the incompressible Navier-Stokes equations based on the novel Slightly Compressible Model on three-dimensional structured grids. High-level loop transformations and argument-based code …
The semi-implicit time-stepping scheme in a non-hydrostatic compressible atmosphere model makes it necessary to solve 3-D Helmholtz equations, which are complicated by variable coefficients and cross-derivative terms. Since the ill-conditioned matrix is nonsymmetric, a preconditioned GMRES Krylov iterative algorithm is adopted. Based on PETSc and Hypre …
This paper comparatively evaluates the microarchitectural performance of two representative Computational Fluid Dynamics (CFD) applications on the Intel Many Integrated Core (MIC) product, the Intel Knights Corner (KNC) coprocessor, and the Intel Sandy Bridge (SNB) processor. A Performance Monitoring Unit-based measurement method is used, along with a …
We use the Intel Xeon Phi Many Integrated Core (MIC) coprocessor to accelerate our 3D full-band self-consistent ensemble Monte Carlo simulator. We offload the Quantum Correction part to the MIC, while the other parts are still processed on the CPU. We compare results between this newly developed MIC+CPU mode and the traditional all-on-CPU mode in three different situations. We find that MIC …