Efficient Realization of Householder Transform Through Algorithm-Architecture Co-Design for Acceleration of QR Factorization

@article{Merchant2018EfficientRO,
  title={Efficient Realization of Householder Transform Through Algorithm-Architecture Co-Design for Acceleration of QR Factorization},
  author={Farhad Merchant and Tarun Vatwani and Anupam Chattopadhyay and Soumyendu Raha and S. K. Nandy and Ranjani Narayan},
  journal={IEEE Transactions on Parallel and Distributed Systems},
  year={2018},
  volume={29},
  pages={1707--1720}
}
QR factorization is a ubiquitous operation in many engineering and scientific applications. In this paper, we present an efficient realization of Householder Transform (HT) based QR factorization through algorithm-architecture co-design, achieving a performance improvement of 3-90x in terms of Gflops/watt over state-of-the-art multicore processors, General Purpose Graphics Processing Units (GPGPUs), Field Programmable Gate Arrays (FPGAs), and the ClearSpeed CSX700. Theoretical and experimental analysis of…
