Mohak Sukhwani

We propose an end-to-end, recurrent encoder-decoder based sequence learning approach for printed text Optical Character Recognition (OCR). In contrast to the existing state-of-the-art OCR solution [Graves et al. (2006)], which uses a CTC output layer, our approach makes minimal assumptions about the structure and length of the sequence. We use a two step...
Automatically describing videos has long been a fascinating problem. In this work, we attempt to describe videos from a specific domain: broadcast videos of lawn tennis matches. Given a video shot from a tennis match, we intend to generate a textual commentary similar to what a human expert would write on a sports website. Unlike many recent works that focus on...