Skeleton-Based Mutually Assisted Interacted Object Localization and Human Action Recognition

Authors: Liang Xu, Cuiling Lan, Wenjun Zeng, Cewu Lu

Skeleton data carries valuable motion information and is widely explored in human action recognition. However, not only the motion information but also the interaction with the environment provides discriminative cues to recognize the action of persons. In this paper, we propose a joint learning framework for mutually assisted “interacted object localization” and “human action recognition” based on skeleton data. The two tasks are serialized together and collaborate to promote each other…


Learning Actionlet Ensemble for 3D Human Action Recognition

This paper proposes to characterize human actions with a novel actionlet ensemble model, which represents the interaction of a subset of human joints. The model is robust to noise, invariant to translational and temporal misalignment, and capable of characterizing both human motion and human-object interactions.

Hierarchical Soft Quantization for Skeleton-Based Human Action Recognition

A spatio-temporal hierarchical soft quantization method that extracts congenerous motion features, which reflect the cooperation relations among joints and body parts, and provides competitive results compared with state-of-the-art methods.

Enhanced skeleton visualization for view invariant human action recognition

Skeleton-Based Action Recognition With Directed Graph Neural Networks

A novel directed graph neural network is designed specifically to extract information about joints, bones, and their relations and to make predictions based on the extracted features. It is tested on two large-scale datasets, NTU-RGBD and Skeleton-Kinetics, and exceeds state-of-the-art performance on both.

A Cuboid CNN Model With an Attention Mechanism for Skeleton-Based Action Recognition

A cuboid arranging strategy is developed to organize the pairwise displacements between all body joints to obtain a cuboid action representation that allows deep CNN models to focus analyses on actions.

Human Action Recognition Using a Temporal Hierarchy of Covariance Descriptors on 3D Joint Locations

A novel approach to human action recognition from 3D skeleton sequences extracted from depth data that uses the covariance matrix for skeleton joint locations over time as a discriminative descriptor for a sequence to encode the relationship between joint movement and time.

Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning

A novel model with spatial reasoning and temporal stack learning (SR-TSL) for skeleton-based action recognition, which consists of a spatial reasoning network (SRN) and a temporal stack learning network (TSLN).

Where to Focus on for Human Action Recognition?

This paper proposes an attention mechanism based on 3D articulated pose that outperforms state-of-the-art methods on the largest human activity recognition dataset available to date and on a human action recognition dataset with object interaction.

Human Action Recognition: Pose-Based Attention Draws Focus to Hands

An extensive ablation study shows the strengths of this approach and the conditioning aspect of the attention mechanism. The method is evaluated on the largest currently available human action recognition dataset, NTU-RGB+D, where it reports state-of-the-art results.

An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition

A novel Attention Enhanced Graph Convolutional LSTM Network (AGC-LSTM) for human action recognition from skeleton data can not only capture discriminative features in spatial configuration and temporal dynamics but also explore the co-occurrence relationship between spatial and temporal domains.