Corpus ID: 236428530

TinyAction Challenge: Recognizing Real-world Low-resolution Activities in Videos

Praveen Tirupattur, Aayush Jung Rana, Tushar Sangam, Shruti Vyas, Yogesh Singh Rawat, Mubarak Shah
This paper summarizes the TinyAction challenge, which was organized at the ActivityNet workshop at CVPR 2021. The challenge focuses on recognizing real-world low-resolution activities in videos. Action recognition currently centers on classifying actions in high-quality videos where the actors and the actions are clearly visible. While various approaches have been shown to be effective for the recognition task in recent works, they often do not deal with videos of lower resolution…
1 Citation

Video Action Understanding
This tutorial introduces and systematizes fundamental topics, basic concepts, and notable examples in supervised video action understanding. It clarifies a taxonomy of action problems, catalogs and highlights video datasets, and formalizes domain-specific metrics to baseline proposed solutions.


TinyVIRAT: Low-resolution Video Action Recognition
  • U. Demir, Y. Rawat, M. Shah
  • Computer Science, Engineering
  • 2020 25th International Conference on Pattern Recognition (ICPR)
  • 2021
A novel method for recognizing tiny actions in videos is proposed. It uses a progressive generative approach to improve the quality of low-resolution actions and includes a weakly trained attention mechanism that helps focus on the activity regions in the video.
A large-scale benchmark dataset for event recognition in surveillance video
We introduce a new large-scale video dataset designed to assess the performance of diverse visual event recognition algorithms with a focus on continuous visual event recognition (CVER) in outdoor…
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
I3D models considerably improve upon the state of the art in action classification, reaching 80.2% on HMDB-51 and 97.9% on UCF-101 after pre-training on Kinetics, and a new Two-Stream Inflated 3D ConvNet that is based on 2D ConvNet inflation is introduced.
ActivityNet: A large-scale video benchmark for human activity understanding
This paper introduces ActivityNet, a new large-scale video benchmark for human activity understanding that aims at covering a wide range of complex human activities that are of interest to people in their daily living.
SUSTech&HKU Submission to TinyAction Challenge 2021
This report describes the details of our solution to TinyAction Challenge 2021 that focuses on recognizing tiny actions in videos. To extract rich spatio-temporal features from low-resolution videos,…
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild
This work introduces UCF101, which is currently the largest dataset of human actions, and provides baseline action recognition results on this new dataset using a standard bag-of-words approach with an overall performance of 44.5%.
MEVA: A Large-Scale Multiview, Multimodal Video Dataset for Activity Detection
This work presents the Multiview Extended Video with Activities (MEVA) dataset, a new and very-large-scale dataset for human activity recognition, scripted to include diverse, simultaneous activities, along with spontaneous background activity.
Can humans fly? Action understanding with multiple classes of actors
This paper marks the first effort in the computer vision community to jointly consider various types of actors undergoing various actions in comprehensive action understanding and demonstrates that inference jointly over actors and actions outperforms inference independently over them.
HMDB: A large video database for human motion recognition
This paper uses the largest action video database to date, with 51 action categories that in total contain around 7,000 manually annotated clips extracted from a variety of sources ranging from digitized movies to YouTube, to evaluate the performance of two representative computer vision systems for action recognition and to explore the robustness of these methods under various conditions.
YouTube-8M: A Large-Scale Video Classification Benchmark
YouTube-8M is introduced, the largest multi-label video classification dataset, composed of ~8 million videos (500K hours of video), annotated with a vocabulary of 4800 visual entities, and various (modest) classification models are trained on the dataset.