Learn More
Action Recognition in videos is an active research field that is fueled by an acute need, spanning several application domains. Still, existing systems fall short of the applications' needs in real-world scenarios, where the quality of the video is less than optimal and the viewpoint is uncontrolled and often not static. In this paper, we extend the Motion(More)
We extend the word2vec framework to capture meaning across languages. The input consists of a source text and a word-aligned parallel text in a second language. The joint word2vec tool then represents words in both languages within a common " semantic " vector space. The result can be used to enrich lexicons of under-resourced languages, to identify(More)
In video understanding, the spatial patterns formed by local space-time interest points hold discriminative information. We encode these spatial regularities using a word2vec neural network, a recently proposed tool in the field of text processing. Then, building upon recent accumulator based image representation solutions, input videos are represented in a(More)
Deep methods based on Convolutional Neural Networks serve as accurate facial points and body parts detectors. However, most methods do not provide a confidence score for the quality of the localization process. In real world applications, such a score could be invaluable. We, therefore, study the problem of estimating the success of the localization process(More)
  • 1