Longxiang Chen

Boosting the performance and energy efficiency of scientific applications running on high performance computing systems has become crucial. Software- and hardware-based solutions for improving communication performance have been recognized as significant means of achieving performance gains, and thus energy savings, for such applications. As a fundamental(More)
Soft errors are one-time events that corrupt the state of a computing system but not its overall functionality. Soft errors normally do not interrupt the execution of the affected program, but the affected computation results can no longer be trusted. A well-known technique to correct soft errors in matrix-matrix multiplication is algorithm-based fault(More)
Keywords: Power and energy; Performance; Power management; Supercomputers; Numerical linear algebra; DVFS. Abstract: Extreme scale supercomputers available before the end of this decade are expected to have 100 million to 1 billion computing cores. The power and energy efficiency issue has become one of the primary concerns of extreme scale high performance(More)
The demand for better energy efficiency in high performance scientific applications has become pressing. Software-controlled hardware solutions based on Dynamic Voltage and Frequency Scaling (DVFS) have proven effective in a wide range of settings. Although DVFS is beneficial to green computing, introducing DVFS itself can incur non-negligible(More)
Keywords: Algorithm-based fault tolerance; Matrix multiplication; Fault tolerant linear algebra; On-line algorithm based fault tolerance. Abstract: Soft errors are one-time events that corrupt the state of a computing system but not its overall functionality. Soft errors normally do not interrupt the execution of the affected program, but the affected(More)
Excessive energy consumption has become one of the major challenges in high performance computing. Reducing the energy consumption of frequently used high performance computing applications not only saves energy costs but also reduces greenhouse gas emissions. This paper focuses on developing energy-efficient algorithms and software for the widely(More)
As the size and complexity of high performance computers increase, more soft errors will be encountered during computations. Algorithm-Based Fault Tolerance (ABFT) has been proved to be a highly efficient technique to detect soft errors in dense linear algebra operations including matrix multiplication, Cholesky and LU factorization. While ABFT can also be(More)
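For readers unfamiliar with ABFT, the checksum idea for matrix multiplication mentioned in the abstracts above can be illustrated with a short sketch. This is a minimal illustration of the classic checksum scheme in plain Python, not the method of any paper listed here. All function names are hypothetical, and the sketch assumes at most one corrupted data element in the product and a fixed comparison tolerance.

```python
def matmul(a, b):
    """Plain triple-loop matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def with_column_checksum(a):
    """Append one row that holds the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Append one column that holds the sum of each row of b."""
    return [row + [sum(row)] for row in b]


def locate_and_correct(cf, tol=1e-9):
    """Check a full-checksum product cf of shape (m+1) x (n+1).

    Returns None if every checksum is consistent. If exactly one data
    element is inconsistent, repairs it in place and returns its (row, col).
    Any other pattern (assumed out of scope here) raises ValueError.
    """
    m, n = len(cf) - 1, len(cf[0]) - 1
    bad_rows = [i for i in range(m)
                if abs(sum(cf[i][:n]) - cf[i][n]) > tol]
    bad_cols = [j for j in range(n)
                if abs(sum(cf[i][j] for i in range(m)) - cf[m][j]) > tol]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) != 1 or len(bad_cols) != 1:
        raise ValueError("not a single corrupted data element")
    i, j = bad_rows[0], bad_cols[0]
    # The row checksum tells us what the row should sum to.
    cf[i][j] += cf[i][n] - sum(cf[i][:n])
    return (i, j)


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
cf = matmul(with_column_checksum(a), with_row_checksum(b))
cf[0][1] += 100.0                # simulate a soft error in one element
position = locate_and_correct(cf)
```

The product of a column-checksum matrix and a row-checksum matrix carries both kinds of checksum, so a single corrupted element shows up as exactly one inconsistent row and one inconsistent column, and their intersection gives its position.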