Kailash Gopalakrishnan

Training of large-scale deep neural networks is often constrained by the available computational resources. We study the effect of limited-precision data representation and computation on neural network training. Within the context of low-precision fixed-point computations, we observe that the rounding scheme plays a crucial role in determining the network’s …
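As an illustration of why the rounding scheme matters in low-precision fixed-point training, the sketch below implements stochastic rounding, one commonly studied scheme in this setting: each value is rounded up with probability equal to its fractional remainder, so small updates survive in expectation instead of being truncated to zero. The function name, bit widths, and update size are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def stochastic_round_fixed_point(x, frac_bits=8, word_bits=16, rng=None):
    """Round x onto a signed fixed-point grid <word_bits, frac_bits> using
    stochastic rounding: round up with probability equal to the fractional
    remainder, so the expected rounding error is zero."""
    rng = np.random.default_rng() if rng is None else rng
    scale = 2.0 ** frac_bits
    scaled = np.asarray(x, dtype=np.float64) * scale
    floor = np.floor(scaled)
    # Round up with probability equal to the distance above the floor value.
    rounded = floor + (rng.random(scaled.shape) < (scaled - floor))
    # Saturate to the representable range of the chosen word length.
    max_int = 2 ** (word_bits - 1) - 1
    min_int = -(2 ** (word_bits - 1))
    return np.clip(rounded, min_int, max_int) / scale

# Accumulating many updates smaller than half the grid step: round-to-nearest
# would leave the accumulator at zero, while stochastic rounding tracks the true sum.
w = 0.0
for _ in range(1000):
    w = stochastic_round_fixed_point(w + 1e-3, frac_bits=8)
print(w)  # close to 1.0 on average
```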
Storage-class memory (SCM) combines the benefits of a solid-state memory, such as high performance and robustness, with the archival capabilities and low cost of conventional hard-disk magnetic storage. Such a device would require a solid-state nonvolatile memory technology that could be manufactured at an extremely high effective areal density using some …
The memory capacity, computational power, communication bandwidth, energy consumption, and physical size of the brain all tend to scale with the number of synapses, which outnumber neurons by a factor of 10,000. Although progress in cortical simulations using modern digital computers has been rapid, the essential disparity between the classical von Neumann …
In this position paper, we present an argument in favor of employing stochastic computation for the approximate calculation of kernels commonly used in machine learning algorithms. The low cost and complexity of hardware implementations of stochastic computational blocks, coupled with the inherent error resilience of a wide range of machine learning …
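A canonical stochastic-computing block is a single AND gate that multiplies two values encoded as independent Bernoulli bit streams. The sketch below emulates that behavior in software to show the kind of approximate kernel arithmetic being argued for; the function names and stream length are illustrative assumptions, and a hardware realization would use bit-serial logic rather than NumPy arrays.

```python
import numpy as np

def to_stream(p, length, rng):
    """Encode a probability p in [0, 1] as a unipolar Bernoulli bit stream."""
    return rng.random(length) < p

def stochastic_multiply(a, b, length=4096, rng=None):
    """Approximate a * b for a, b in [0, 1]: AND two independent bit streams
    and read the result back as the fraction of ones. In hardware this is a
    single AND gate operating bit-serially on the two streams."""
    rng = np.random.default_rng() if rng is None else rng
    stream_a = to_stream(a, length, rng)
    stream_b = to_stream(b, length, rng)
    return float(np.mean(stream_a & stream_b))

print(stochastic_multiply(0.6, 0.5))  # roughly 0.30; error shrinks as the streams get longer
```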
Approximate computing is gaining traction as a computing paradigm for data analytics and cognitive applications that aim to extract deep insight from vast quantities of data. In this paper, we demonstrate that multiple approximation techniques can be applied to applications in these domains and can be combined to compound their benefits. In …
This paper highlights new opportunities for designing large-scale machine learning systems as a consequence of blurring traditional boundaries that have allowed algorithm designers and application-level practitioners to stay – for the most part – oblivious to the details of the underlying hardware-level implementations. The hardware/software co-design …
Deep Neural Networks (DNNs) have emerged as a powerful and versatile set of techniques showing successes on challenging artificial intelligence (AI) problems. Applications in domains such as image/video processing, autonomous cars, natural language processing, speech synthesis and recognition, genomics and many others have embraced deep learning as the …
We present an energy-efficient implementation of RGB-D simultaneous localization and mapping (SLAM) by applying approximate computing (AC) techniques such as loop perforation (LP) and reduced precision (RP). To reduce processing time and power consumption, LP and RP were applied to the two most computationally challenging portions of the multi-kernel …
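Loop perforation, one of the AC techniques named above, simply skips a fraction of loop iterations and accepts the resulting approximation in exchange for less work. The sketch below applies it to a generic per-point error reduction of the sort that dominates SLAM pipelines; the kernel, function name, and perforation rate are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def perforated_mean_error(point_errors, perforation_rate=0.5):
    """Loop perforation: evaluate only every k-th iteration of the per-point
    error loop and use that subsample in place of the full reduction.
    A rate of 0.5 does roughly half the work."""
    stride = max(1, int(round(1.0 / (1.0 - perforation_rate))))
    return float(np.mean(point_errors[::stride]))

# Hypothetical per-point alignment errors standing in for a SLAM kernel.
errors = np.abs(np.random.default_rng(0).normal(size=100_000))
print(perforated_mean_error(errors, perforation_rate=0.0))  # exact mean
print(perforated_mean_error(errors, perforation_rate=0.5))  # approximate, ~2x fewer terms
```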