Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis
- Quoc V. Le, Will Y. Zou, Serena Yeung, A. Ng
- Computer Science; Computer Vision and Pattern Recognition
- 20 June 2011
This paper presents an extension of the Independent Subspace Analysis algorithm to learn invariant spatio-temporal features from unlabeled video data and finds that this method performs surprisingly well when combined with deep learning techniques such as stacking and convolution to learn hierarchical representations.
End-to-End Learning of Action Detection from Frame Glimpses in Videos
- Serena Yeung, Olga Russakovsky, Greg Mori, Li Fei-Fei
- Computer Science; Computer Vision and Pattern Recognition
- 22 November 2015
A fully end-to-end approach for action detection in videos that learns to directly predict the temporal bounds of actions and uses REINFORCE to learn the agent's decision policy.
Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos
- Serena Yeung, Olga Russakovsky, Ning Jin, M. Andriluka, Greg Mori, Li Fei-Fei
- Computer Science; International Journal of Computer Vision
- 21 July 2015
A novel variant of long short-term memory deep networks is defined for modeling temporal relations via multiple input and output connections, and this model is shown to improve action labeling accuracy and to enable deeper understanding tasks ranging from structured retrieval to action prediction.
Towards Viewpoint Invariant 3D Human Pose Estimation
- Albert Haque, Boya Peng, Zelun Luo, Alexandre Alahi, Serena Yeung, Li Fei-Fei
- Computer Science; European Conference on Computer Vision
- 23 March 2016
A viewpoint invariant model for 3D human pose estimation from a single depth image that leverages a convolutional and recurrent network architecture with a top-down error feedback mechanism to self-correct previous pose estimates in an end-to-end manner.
Faster CryptoNets: Leveraging Sparsity for Real-World Encrypted Inference
- Edward Chou, Josh Beal, Daniel Levy, Serena Yeung, Albert Haque, Li Fei-Fei
- Computer Science; ArXiv
- 25 November 2018
This work develops a pruning and quantization approach that leverages sparse representations in the underlying cryptosystem to accelerate inference and derives an optimal approximation for popular activation functions that achieves maximally-sparse encodings and minimizes approximation error.
Personalized Federated Learning with First Order Model Optimization
- Michael Zhang, Karan Sapra, S. Fidler, Serena Yeung, J. Álvarez
- Computer Science; International Conference on Learning…
- 15 December 2020
This work efficiently calculates optimal weighted model combinations for each client, based on how much a client can benefit from another's model, to achieve personalization in federated learning.
Tool Detection and Operative Skill Assessment in Surgical Videos Using Region-Based Convolutional Neural Networks
- Amy Jin, Serena Yeung, Li Fei-Fei
- Computer Science, Medicine; IEEE Workshop/Winter Conference on Applications…
- 24 February 2018
This work introduces an approach to automatically assess surgeon performance by tracking and analyzing tool movements in surgical videos, leveraging region-based convolutional neural networks, and is the first to not only detect presence but also spatially localize surgical tools in real-world laparoscopic surgical videos.
Scaling Human-Object Interaction Recognition Through Zero-Shot Learning
- Liyue Shen, Serena Yeung, Judy Hoffman, Greg Mori, Li Fei-Fei
- Computer Science; IEEE Workshop/Winter Conference on Applications…
- 12 March 2018
This work introduces a factorized model for HOI detection that disentangles reasoning on verbs and objects, and at test-time can therefore produce detections for novel verb-object pairs through a zero-shot learning approach.
VideoSET: Video Summary Evaluation through Text
- Serena Yeung, A. Fathi, Li Fei-Fei
- Computer Science; ArXiv
- 23 June 2014
This paper presents VideoSET, a method for Video Summary Evaluation through Text that can evaluate how well a video summary is able to retain the semantic information contained in its original video, and develops a text-based approach for the evaluation.
Dynamic Task Prioritization for Multitask Learning
- Michelle Guo, Albert Haque, De-An Huang, Serena Yeung, Li Fei-Fei
- Computer Science; European Conference on Computer Vision
- 8 September 2018
This work proposes dynamic task prioritization, which automatically prioritizes more difficult tasks by adaptively adjusting the mixing weight of each task’s loss objective; the method outperforms existing multitask methods and demonstrates competitive results with modern single-task models on the COCO and MPII datasets.