TinyVIRAT: Low-resolution Video Action Recognition

  title={TinyVIRAT: Low-resolution Video Action Recognition},
  author={Uğur Demir and Yogesh Singh Rawat and Mubarak Shah},
  journal={2020 25th International Conference on Pattern Recognition (ICPR)},
  • U. Demir, Y. Rawat, M. Shah
  • Published 2021
  • Computer Science, Engineering
  • 2020 25th International Conference on Pattern Recognition (ICPR)
Existing research in action recognition mostly focuses on high-quality videos in which the action is distinctly visible. In real-world surveillance environments, actions are captured at a wide range of resolutions. Most activities occur at a distance and at low resolution, and recognizing them is a challenging problem. In this work, we focus on recognizing tiny actions in videos. We introduce a benchmark dataset, TinyVIRAT, which contains natural low-resolution…
TinyAction Challenge: Recognizing Real-world Low-resolution Activities in Videos
This work proposes a benchmark dataset, TinyVIRAT-v2, which comprises naturally occurring low-resolution actions. It extends the TinyVIRAT dataset, consists of actions with multiple labels, and is benchmarked with current state-of-the-art action recognition methods.
Extreme Low-Resolution Activity Recognition Using a Super-Resolution-Oriented Generative Adversarial Network
A super-resolution-driven generative adversarial network is proposed for activity recognition that outperforms several state-of-the-art low-resolution activity recognition approaches.
Video Action Understanding
This tutorial introduces and systematizes fundamental topics, basic concepts, and notable examples in supervised video action understanding. It clarifies a taxonomy of action problems, catalogs and highlights video datasets, and formalizes domain-specific metrics to baseline proposed solutions.
SUSTech&HKU Submission to TinyAction Challenge 2021
This report describes the details of our solution to the TinyAction Challenge 2021, which focuses on recognizing tiny actions in videos. To extract rich spatio-temporal features from low-resolution videos,…
"Knights": First Place Submission for VIPriors21 Action Recognition Challenge at ICCV 2021
This technical report presents our approach “Knights” to solve the action recognition task on a small subset of Kinetics-400, i.e., Kinetics400ViPriors, without using any extra data. Our approach has 3…


Temporal Segment Networks: Towards Good Practices for Deep Action Recognition
Deep convolutional networks have achieved great success for visual recognition in still images. However, for action recognition in videos, the advantage over traditional methods is not so evident.
Multi-view Action Recognition Using Cross-View Video Prediction
An unsupervised representation learning framework is proposed that encodes the scene dynamics in videos captured from multiple viewpoints by predicting actions from unseen views. It achieves state-of-the-art results with the depth modality and validates the generalization capability of the approach to other data modalities.
A large-scale benchmark dataset for event recognition in surveillance video
We introduce a new large-scale video dataset designed to assess the performance of diverse visual event recognition algorithms with a focus on continuous visual event recognition (CVER) in outdoor…
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
A new Two-Stream Inflated 3D ConvNet (I3D), based on 2D ConvNet inflation, is introduced. After pre-training on Kinetics, I3D models considerably improve upon the state of the art in action classification, reaching 80.2% on HMDB-51 and 97.9% on UCF-101.
Semi-Coupled Two-Stream Fusion ConvNets for Action Recognition at Extremely Low Resolutions
A semi-coupled, filter-sharing network that leverages high-resolution (HR) videos during training to assist an extremely low-resolution (eLR) ConvNet, and that outperforms state-of-the-art methods at extremely low resolutions on the IXMAS and HMDB datasets.
Two-Stream Action Recognition-Oriented Video Super-Resolution
This work proposes two video super-resolution (SR) methods, one each for the spatial and temporal streams, tailored for two-stream action recognition networks. It also proposes a Siamese network for temporal-oriented SR (ToSR) training that emphasizes the temporal continuity between consecutive frames.
ActivityNet: A large-scale video benchmark for human activity understanding
This paper introduces ActivityNet, a new large-scale video benchmark for human activity understanding that aims at covering a wide range of complex human activities that are of interest to people in their daily living.
HMDB: A large video database for human motion recognition
This paper uses the largest action video database to date, with 51 action categories that in total contain around 7,000 manually annotated clips extracted from a variety of sources ranging from digitized movies to YouTube. The database is used to evaluate the performance of two representative computer vision systems for action recognition and to explore the robustness of these methods under various conditions.
Fully-Coupled Two-Stream Spatiotemporal Networks for Extremely Low Resolution Action Recognition
A fully-coupled two-stream spatiotemporal architecture for reliable human action recognition on extremely low resolution videos is proposed, providing an efficient method to extract spatial and temporal features and to aggregate them into a robust feature representation for an entire action video sequence.
Extreme Low Resolution Activity Recognition with Multi-Siamese Embedding Learning
The approach of jointly learning a transform-robust low-resolution (LR) video representation and the classifier outperforms the previous state-of-the-art low-resolution recognition approaches on two public standard datasets by a meaningful margin.