Video Description Using Bidirectional Recurrent Neural Networks


Although traditionally used in the machine translation field, the encoder-decoder framework has been recently applied for the generation of video and image descriptions. The combination of Convolutional and Recurrent Neural Networks in these models has proven to outperform the previous state of the art, obtaining more accurate video descriptions. In this… (More)
DOI: 10.1007/978-3-319-44781-0_1


4 Figures and Tables


Citations per Year

Citation Velocity: 13

Averaging 13 citations per year over the last 2 years.

Learn more about how we calculate this metric in our FAQ.

Slides referencing similar topics