MotionSqueeze: Neural Motion Feature Learning for Video Understanding
- Heeseung Kwon, Manjin Kim, Suha Kwak, Minsu Cho
- European Conference on Computer Vision (ECCV)
- 20 July 2020
This work proposes a trainable neural module, dubbed MotionSqueeze, for effective motion feature extraction, and demonstrates that it yields a significant gain on four standard action recognition benchmarks at only a small additional cost, outperforming the state of the art on the Something-Something-V1 & V2 datasets.
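The core idea behind learnable motion modules of this kind is to correlate the feature map of one frame with a local neighborhood in the next frame, producing a cost volume from which motion can be learned. Below is a minimal NumPy sketch of that correlation step; the function name, shapes, and displacement range are illustrative, not the paper's implementation.

```python
import numpy as np

def local_correlation(feat_t, feat_tp1, max_disp=2):
    """Correlate each spatial location of frame t's feature map with a
    (2*max_disp+1)^2 neighborhood in frame t+1 -- the cost-volume idea
    underlying learnable motion estimation (shapes are illustrative)."""
    C, H, W = feat_t.shape
    k = 2 * max_disp + 1
    # Zero-pad frame t+1 so every displacement is defined at the borders.
    padded = np.pad(feat_tp1, ((0, 0), (max_disp, max_disp), (max_disp, max_disp)))
    corr = np.empty((k * k, H, W))
    idx = 0
    for dy in range(k):
        for dx in range(k):
            shifted = padded[:, dy:dy + H, dx:dx + W]
            # Channel-wise dot product = similarity for this displacement.
            corr[idx] = (feat_t * shifted).sum(axis=0) / C
            idx += 1
    return corr  # (k*k, H, W) correlation volume

feats = np.random.rand(2, 8, 16, 16)  # two frames, 8 channels, 16x16
vol = local_correlation(feats[0], feats[1])
print(vol.shape)  # (25, 16, 16)
```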
Future Transformer for Long-term Action Anticipation
- Dayoung Gong, Joonseok Lee, Manjin Kim, S. Ha, Minsu Cho
- Computer Vision and Pattern Recognition (CVPR)
- 27 May 2022
An end-to-end attention model for action anticipation, dubbed Future Transformer (FUTR), that leverages global attention over all input frames and output tokens to predict a minutes-long sequence of future actions.
Learning Self-Similarity in Space and Time as Generalized Motion for Action Recognition
- Heeseung Kwon, Manjin Kim, Suha Kwak, Minsu Cho
- arXiv.org
- 2021
A rich and robust motion representation based on spatio-temporal self-similarity (STSS), which effectively captures long-term interaction and fast motion in the video, leading to robust action recognition.
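Spatio-temporal self-similarity compares each (frame, position) feature vector against others in the same video, so the representation depends on relational structure rather than raw appearance. A toy NumPy version using cosine similarity over all pairs is sketched below; actual STSS models restrict comparisons to a local spatio-temporal neighborhood.

```python
import numpy as np

def stss(features):
    """Toy spatio-temporal self-similarity: cosine similarity between
    every pair of (frame, position) feature vectors. Real STSS models
    compare only within a local neighborhood; this global version just
    illustrates the representation."""
    T, C, H, W = features.shape
    flat = features.transpose(0, 2, 3, 1).reshape(T * H * W, C)
    # L2-normalize so the dot product is cosine similarity.
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)
    return flat @ flat.T  # (T*H*W, T*H*W) similarity matrix

x = np.random.rand(4, 8, 6, 6)  # 4 frames, 8 channels, 6x6
S = stss(x)
print(S.shape)  # (144, 144)
```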
Future Transformer for Long-term Action Anticipation – Supplementary Materials
- Dayoung Gong, Joonseok Lee, Manjin Kim, S. Ha, Minsu Cho
- 2022
Figure S1. FUTR variants with different decoding strategies. (a) FUTR-A autoregressively anticipates future actions using the output action labels from the previous predictions as input and utilizes…
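The autoregressive decoding strategy described for FUTR-A, where each predicted action label is fed back as input for the next prediction, can be sketched as a simple loop. Everything here is a hypothetical stand-in: `model` is any callable mapping the sequence so far to the next action label, not the paper's network.

```python
def autoregressive_anticipate(model, observed, n_future):
    """Toy autoregressive decoding loop in the spirit of FUTR-A: each
    predicted action label is appended to the sequence and fed back as
    input for the next step (`model` is a hypothetical callable)."""
    seq = list(observed)
    for _ in range(n_future):
        seq.append(model(seq))
    return seq[len(observed):]  # only the anticipated future actions

# Hypothetical stand-in model: next action cycles through three labels.
actions = ["wash", "cut", "cook"]
model = lambda seq: actions[len(seq) % 3]
print(autoregressive_anticipate(model, ["wash"], 4))
# → ['cut', 'cook', 'wash', 'cut']
```

A parallel decoder (as in the main FUTR model) would instead emit all future tokens in one pass, trading this feedback loop for speed.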