Goal: improve the communication performance of distributed matrix multiplication to achieve energy efficiency. Devise a high-performance communication scheme: fully exploit the network bandwidth of distributed matrix multiplication via non-blocking pipeline broadcast with tuned chunk size. Model and quantify the communication time complexity of binomial…

- Panruo Wu, Chong Ding, Longxiang Chen, Feng Gao, Teresa Davies, Christer Karlsson +1 other
- ScalA@SC
- 2011

Keywords: Algorithm-based fault tolerance; Matrix multiplication; Fault tolerant linear algebra; On-line algorithm based fault tolerance
Abstract: Soft errors are one-time events that corrupt the state of a computing system but not its overall functionality. Soft errors normally do not interrupt the execution of the affected program, but the affected…

Keywords: Power and energy; Performance; Power management; Supercomputers; Numerical linear algebra; DVFS
Abstract: Extreme scale supercomputers available before the end of this decade are expected to have 100 million to 1 billion computing cores. The power and energy efficiency issue has become one of the primary concerns of extreme scale high performance…

