Few-Shot Video Classification via Temporal Alignment
@inproceedings{Cao2019FewShotVC, title={Few-Shot Video Classification via Temporal Alignment}, author={Kaidi Cao and Jingwei Ji and Zhangjie Cao and C. Chang and Juan Carlos Niebles}, booktitle={2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, year={2020}, pages={10615-10624} }
Difficulty in collecting and annotating large-scale video data has raised growing interest in models that can recognize novel classes with only a few training examples. In this paper, we propose the Ordered Temporal Alignment Module (OTAM), a novel few-shot learning framework that can learn to classify previously unseen videos. While most previous work neglects long-term temporal ordering information, our proposed model explicitly leverages the temporal ordering information in video…
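The abstract is cut off above; as a rough illustration of the ordered temporal alignment idea, the sketch below scores a query video against a support video by finding the cheapest order-preserving (DTW-style) path through their frame-to-frame distances. This is a minimal NumPy sketch under my own assumptions (cosine frame distance, hard minimum over paths), not necessarily the authors' exact OTAM formulation.

```python
import numpy as np

def frame_distance(a, b):
    """Cosine distance between two per-frame embeddings."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def ordered_alignment_score(query, support):
    """DTW-style alignment cost between two videos.

    query:   (T1, D) array of per-frame embeddings
    support: (T2, D) array of per-frame embeddings
    Returns the minimum cumulative frame distance along a monotonic
    (order-preserving) alignment path; lower means more similar.
    """
    T1, T2 = len(query), len(support)
    D = np.array([[frame_distance(q, s) for s in support] for q in query])
    acc = np.full((T1, T2), np.inf)
    acc[0, 0] = D[0, 0]
    for i in range(T1):
        for j in range(T2):
            if i == 0 and j == 0:
                continue
            prev = min(
                acc[i - 1, j] if i > 0 else np.inf,                # advance query frame
                acc[i, j - 1] if j > 0 else np.inf,                # advance support frame
                acc[i - 1, j - 1] if i > 0 and j > 0 else np.inf,  # advance both
            )
            acc[i, j] = D[i, j] + prev
    return acc[T1 - 1, T2 - 1]

# Toy usage: pick the support video (class) with the lowest alignment cost.
rng = np.random.default_rng(0)
query = rng.normal(size=(8, 16))
supports = {"open_door": rng.normal(size=(10, 16)), "close_door": rng.normal(size=(6, 16))}
print(min(supports, key=lambda c: ordered_alignment_score(query, supports[c])))
```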
110 Citations
Learning Implicit Temporal Alignment for Few-shot Video Classification
- Computer Science · IJCAI
- 2021
This work introduces an implicit temporal alignment that estimates the similarity between a pair of videos in an accurate and robust manner, and designs an effective context encoding module that incorporates spatial and feature-channel context, resulting in better modeling of intra-class variations.
A Closer Look at Few-Shot Video Classification: A New Baseline and Benchmark
- Computer Science · BMVC
- 2021
This paper proposes a simple classifier-based baseline without any temporal alignment that surprisingly outperforms the state-of-the-art meta-learning based methods and presents a new benchmark with more base data to facilitate future few-shot video classification without pre-training.
Generalized Few-Shot Video Classification With Video Retrieval and Feature Generation
- Computer Science · IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2022
This work argues that previous methods underestimate the importance of video feature learning and proposes to learn spatiotemporal features with a 3D CNN in a two-stage approach that first learns video features on base classes and then fine-tunes the classifiers on novel classes.
Label Independent Memory for Semi-Supervised Few-Shot Video Classification
- Computer Science · IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2022
A label-independent memory (LIM) that caches label-related features, enabling similarity search over a large set of videos and producing class prototypes for few-shot training that are more robust to noisy video features.
TNT: Text-Conditioned Network with Transductive Inference for Few-Shot Video Classification
- Computer Science · BMVC
- 2021
This paper formulates a text-based task conditioner to adapt video features to the few-shot learning task and follows a transductive setting to improve the task-adaptation ability of the model by using the support textual descriptions and query instances to update a set of class prototypes.
TAEN: Temporal Aware Embedding Network for Few-Shot Action Recognition
- Computer Science · 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
- 2021
A Temporal Aware Embedding Network (TAEN) for few-shot action recognition that learns to represent actions in a metric space as trajectories, conveying both short-term semantics and longer-term connectivity between sub-actions.
Less than Few: Self-Shot Video Instance Segmentation
- Computer Science · ECCV
- 2022
This work proposes to automatically learn to find appropriate support videos given a query to bypass the need for labelled examples in few-shot video understanding at run time, and outlines a simple self-supervised learning method to generate an embedding space well-suited for unsupervised retrieval of relevant samples.
Temporal Alignment Prediction for Few-Shot Video Classification
- Computer Science · ArXiv
- 2021
Temporal Alignment Prediction (TAP) based on sequence similarity learning for few-shot video classification is proposed and its superiority over state-of-the-art methods is verified.
Few-Shot Learning for Video Object Detection in a Transfer-Learning Scheme
- Computer Science · ArXiv
- 2021
This paper defines the few-shot setting and creates a new benchmark dataset for few-shot video object detection derived from the widely used ImageNet VID dataset, and employs a transfer-learning framework to effectively train the video object detector on a large number of base-class objects and a few video clips of novel-class objects.
Few-Shot Video Object Detection
- Computer Science · ECCV
- 2022
Extensive experiments demonstrate that the FSVOD method produces significantly better detection results on two few-shot video object detection datasets compared to image-based methods and other naive video-based extensions.
References
Showing 1-10 of 62 references
Metric-Based Few-Shot Learning for Video Action Recognition
- Computer Science · ArXiv
- 2019
This work addresses the task of few-shot video action recognition with a set of two-stream models, and finds prototypical networks and pooled long short-term memory network embeddings to give the best performance as the few-shot method and video encoder, respectively.
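For context, the prototypical-network decision rule mentioned in this entry is simple to state: average each class's support embeddings into a prototype and assign the query to the nearest one. The sketch below illustrates that rule only; the two-stream models and pooled LSTM video encoder from the paper are abstracted into precomputed embeddings, which is my own simplification.

```python
import numpy as np

def prototypical_predict(support_embeddings, support_labels, query_embedding):
    """Prototypical-network decision rule (Snell et al., 2017).

    support_embeddings: (N, D) array of encoded support videos
    support_labels:     length-N list of class names
    query_embedding:    (D,) encoded query video
    Each class prototype is the mean of its support embeddings; the query
    is assigned to the class whose prototype is nearest in Euclidean distance.
    """
    classes = sorted(set(support_labels))
    prototypes = {
        c: np.mean([e for e, y in zip(support_embeddings, support_labels) if y == c], axis=0)
        for c in classes
    }
    return min(classes, key=lambda c: np.linalg.norm(query_embedding - prototypes[c]))

# Toy 2-way 2-shot episode with random stand-in embeddings.
rng = np.random.default_rng(1)
support = rng.normal(size=(4, 32))
labels = ["jump", "jump", "run", "run"]
query = support[0] + 0.01 * rng.normal(size=32)  # perturbed "jump" clip, so "jump" is expected
print(prototypical_predict(support, labels, query))
```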
A Closer Look at Few-shot Classification
- Computer Science · ICLR
- 2019
The results reveal that reducing intra-class variation is an important factor when the feature backbone is shallow, but not as critical when using deeper backbones, and a baseline method with a standard fine-tuning practice compares favorably against other state-of-the-art few-shot learning algorithms.
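The "standard fine-tuning practice" baseline referenced here amounts to freezing a pretrained feature extractor and fitting only a new linear classifier on the few support examples. The sketch below stands in scikit-learn's logistic regression for that linear layer; the choice of classifier and the random stand-in features are assumptions for illustration, not the authors' exact recipe.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def finetune_baseline_predict(support_feats, support_labels, query_feats):
    """Few-shot fine-tuning baseline: keep the pretrained backbone frozen and
    fit only a linear classifier on the support-set features.

    support_feats:  (N, D) features from a frozen, pretrained feature extractor
    support_labels: length-N array of class ids
    query_feats:    (M, D) features of the query examples
    """
    clf = LogisticRegression(max_iter=1000)  # stands in for the new linear layer
    clf.fit(support_feats, support_labels)
    return clf.predict(query_feats)

# Toy 2-way 5-shot episode with random stand-in features.
rng = np.random.default_rng(2)
support = np.vstack([rng.normal(0.0, 1.0, size=(5, 64)), rng.normal(3.0, 1.0, size=(5, 64))])
labels = np.array([0] * 5 + [1] * 5)
query = rng.normal(3.0, 1.0, size=(3, 64))  # drawn like class 1, so [1 1 1] is expected
print(finetune_baseline_predict(support, labels, query))
```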
TARN: Temporal Attentive Relation Network for Few-Shot and Zero-Shot Action Recognition
- Computer Science · BMVC
- 2019
The proposed TARN uses attention mechanisms to perform temporal alignment and learns a deep-distance measure on the aligned representations at the video-segment level, achieving competitive results in zero-shot action recognition.
Learning to Compare: Relation Network for Few-Shot Learning
- Computer Science · 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
A conceptually simple, flexible, and general framework for few-shot learning, where a classifier must learn to recognise new classes given only a few examples from each, which is easily extended to zero-shot learning.
Compound Memory Networks for Few-Shot Video Classification
- Computer Science · ECCV
- 2018
A multi-saliency embedding algorithm is introduced that encodes a variable-length video sequence into a fixed-size matrix representation by discovering multiple saliencies of interest.
ECO: Efficient Convolutional Network for Online Video Understanding
- Computer Science · ECCV
- 2018
A network architecture that takes long-term content into account while enabling fast per-video processing, achieving competitive performance across all datasets and running 10 to 80 times faster than state-of-the-art methods.
Temporal Segment Networks: Towards Good Practices for Deep Action Recognition
- Computer Science · ECCV
- 2016
Deep convolutional networks have achieved great success for visual recognition in still images. However, for action recognition in videos, the advantage over traditional methods is not so evident…
Large-Scale Video Classification with Convolutional Neural Networks
- Computer Science · 2014 IEEE Conference on Computer Vision and Pattern Recognition
- 2014
This work studies multiple approaches for extending the connectivity of a CNN in the time domain to take advantage of local spatio-temporal information, and suggests a multiresolution, foveated architecture as a promising way of speeding up the training.
Learning Temporal Action Proposals With Fewer Labels
- Computer Science · 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
This work proposes a semi-supervised learning algorithm specifically designed for training temporal action proposal networks and shows that this approach consistently matches or outperforms the fully supervised state-of-the-art approaches.