A parallel Lattice Boltzmann Method (pLBM), which is based on hierarchical spatial decomposition, is designed to perform large-scale flow simulations. The algorithm uses a critical section-free, dual representation in order to expose maximal concurrency and data locality. Performances of emerging multi-core platforms: PlayStation3 (Cell Broadband Engine) and ...
  • XIAO Zhi-bin, LIU Peng, YAO Ying-biao, YAO Qing-dong
  • 2006
The 32-bit extensible embedded processor RISC3200, originating from an RTL prototype core, is intended for low-cost consumer multimedia products. In order to incorporate the reduced instruction set and the multimedia extension instruction set in a unifying pipeline, a scalable super-pipeline technique is adopted. Several other optimization techniques are ...
PURPOSE: To develop a protocol to measure the intraocular pressure (IOP) of living mice and to determine the IOP of genetically different mouse strains. METHODS: Eyes of anesthetized animals were cannulated with a very fine fluid-filled glass microneedle. The microneedle was connected to a pressure transducer, and the pressure signal was analyzed with a ...
Stencil-based computation on structured grids is a common kernel in a broad range of scientific applications. The order of stencils increases with the required precision, and it is a challenge to optimize such high-order stencils on multicore architectures. Here, we propose a multilevel parallelization framework that combines: (1) inter-node parallelism by spatial ...
Stencil computation (SC) is of critical importance for broad scientific and engineering applications. However, it is a challenge to optimize complex, high-order SC on emerging clusters of multicore processors. We have developed a hierarchical SC parallelization framework that combines: (1) spatial decomposition based on message passing; (2) multithreading ...
Information Retrieval (IR) forms the basis of many information management tasks. Information management itself has become an extremely important area as the amount of electronically available information increases dramatically. There are numerous methods of performing the IR task both by utilising different techniques and through using different ...
In this paper, we apply in-core optimization techniques to high-order stencil computations, including: (1) cache blocking for efficient L2 cache use; (2) register blocking and data-level parallelism via single-instruction multiple-data (SIMD) techniques to increase L1 cache efficiency; and (3) software prefetching techniques. Our generic approach is tested ...
We have developed a scalable hierarchical parallelization framework for molecular dynamics (MD) simulation on emerging multicore clusters. The framework combines: (1) inter-node level parallelism by spatial decomposition using message passing; (2) intra-node (inter-core) level parallelism through a master/worker paradigm and cellular decomposition using ...
BACKGROUND: Acute myocardial infarction (AMI) is one of the leading causes of death in both developed and developing countries, and it is the single largest cause of death in the United States, responsible for 1 out of every 6 deaths. The objective of this study was to determine microRNA (miRNA) expression in AMI and determine whether miR-133, miR-1291 and ...
We have developed a scalable hierarchical parallelization scheme for molecular dynamics (MD) simulation on multicore clusters. The scheme explores multi-level parallelism combining: (1) internode parallelism using spatial decomposition via message passing; (2) intercore parallelism using cellular decomposition via multithreading employing a master/worker ...