Anticipating Daily Intention Using On-wrist Motion Triggered Sensing

@inproceedings{Wu2017AnticipatingDI,
  title={Anticipating Daily Intention Using On-wrist Motion Triggered Sensing},
  author={Tz-Ying Wu and Ting-An Chien and C. S. Chan and Chan-Wei Hu and Min Sun},
  booktitle={2017 IEEE International Conference on Computer Vision (ICCV)},
  year={2017},
  pages={48-56}
}
Anticipating human intention by observing one’s actions has many applications. For instance, picking up a cellphone, then a charger (actions) implies that one wants to charge the cellphone (intention) (Fig. 1). By anticipating the intention, an intelligent system can guide the user to the closest power outlet. We propose an on-wrist motion triggered sensing system for anticipating daily intentions, where the on-wrist sensors help us to persistently observe one’s actions. The core of the system… 
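To make the described pipeline concrete, here is a minimal sketch of a motion-triggered sensing loop: wrist-acceleration energy gates camera capture, and the resulting sequence of recognized actions drives a recurrent intention anticipator. The threshold, window size, model sizes, and the capture_frame/classify_action stubs are illustrative assumptions, not the authors' published configuration.

# Hypothetical sketch of an on-wrist, motion-triggered sensing loop.
# Threshold, window length, and model choices are illustrative only.
from collections import deque

import numpy as np
import torch
import torch.nn as nn

TRIGGER_THRESHOLD = 1.2   # assumed motion-energy threshold
WINDOW = 32               # assumed accelerometer samples per check

class IntentionRNN(nn.Module):
    """Toy recurrent anticipator: action observations -> intention logits."""
    def __init__(self, n_actions=20, n_intentions=10, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(n_actions, hidden)
        self.rnn = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_intentions)

    def forward(self, action_ids):          # (batch, time) action indices
        h, _ = self.rnn(self.embed(action_ids))
        return self.head(h[:, -1])          # logits for the intention

def motion_triggered(accel_window):
    """Fire the camera only when wrist motion energy is high enough."""
    energy = np.abs(np.diff(accel_window, axis=0)).mean()
    return energy > TRIGGER_THRESHOLD

def run(accel_stream, capture_frame, classify_action, model):
    """accel_stream yields (x, y, z) samples; capture/classify are stubs."""
    buf, actions = deque(maxlen=WINDOW), []
    for sample in accel_stream:
        buf.append(sample)
        if len(buf) == WINDOW and motion_triggered(np.asarray(buf)):
            actions.append(classify_action(capture_frame()))
            logits = model(torch.tensor([actions]))
            yield torch.softmax(logits, dim=-1)   # anticipated intention
            buf.clear()

The single energy threshold here stands in for whatever trigger policy the real system uses; the point of triggering is that the camera stays off between motion bursts while the wrist sensors observe persistently.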

Citations

Action Anticipation from Multimodal Data
TLDR
Experimental results indicate that prediction from multimodal data is feasible, and suggest that multimodality improves both classification and prediction.
Leveraging the Present to Anticipate the Future in Videos
TLDR
This work focuses on anticipating actions seconds before they start, proposing a fusion of a purely anticipatory model with a complementary model constrained to reason about the present, which predicts current actions and scene attributes and reasons about how they evolve over time.
Predicting the Future from First Person (Egocentric) Vision: A Survey
Self-Regulated Learning for Egocentric Video Activity Anticipation
TLDR
A simple and effective Self-Regulated Learning framework that consecutively regulates the intermediate representation to emphasize the novel information in the current frame, in contrast to previously observed content, while reflecting its correlation with previously observed frames.
Anticipating Activity from Multimodal Signals
TLDR
An embedding space is built in which inputs of different natures, but semantically correlated, are projected into a new representation space and exploited to anticipate future user activity.
Egocentric Prediction of Action Target in 3D
TLDR
A large multimodal dataset of more than 1 million frames of RGB-D and IMU streams is proposed, along with evaluation metrics based on high-quality 2D and 3D labels from semi-automatic annotation, demonstrating that this new task is worthy of further study by the robotics, vision, and learning communities.
Human Activity Recognition Algorithm Based on One-Dimensional Convolutional Neural Network
TLDR
A method based on a one-dimensional convolutional neural network (1D-CNN) classifies sensor signals from several different activities; it generalizes well across users while also providing higher accuracy.
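As a rough illustration of the 1D-CNN idea, the following sketch classifies fixed-length windows of tri-axial sensor data; the layer sizes, window length, and six-class output are assumptions, not the cited paper's exact architecture.

# Minimal 1D-CNN for windowed tri-axial sensor signals (illustrative sizes).
import torch
import torch.nn as nn

class HAR1DCNN(nn.Module):
    def __init__(self, n_channels=3, n_classes=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),          # global pooling over time
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):                     # x: (batch, channels, time)
        return self.classifier(self.features(x).squeeze(-1))

# Example: one batch of 16 windows, 3 axes, 128 samples each.
logits = HAR1DCNN()(torch.randn(16, 3, 128))  # -> (16, 6)
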
When will you do what? - Anticipating Temporal Occurrences of Activities
TLDR
This paper proposes two methods, a CNN and an RNN trained to learn future video labels from previously seen content, that predict a considerable number of future actions and their durations.
Mental State Analysis on Eyewear
TLDR
A brief state-of-the-art review of existing technologies for sensing mental states, and of their possible applications on eyewear, is presented.
Multi-sensor Data Fusion for Wearable Devices
TLDR
The goal of this research is to build a shared representation of data coming from different domains, such as images, audio signals, heart rate, acceleration, etc., in order to predict daily activities.
...

References

Showing 1-10 of 40 references
Recurrent Neural Networks for driver activity anticipation via sensory-fusion architecture
TLDR
A sensory-fusion architecture jointly learns to anticipate and to fuse information from multiple sensory streams, showing significant improvement over the state of the art in maneuver anticipation by increasing precision and recall.
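A minimal sketch of such sensory fusion, assuming one LSTM per stream whose final states are concatenated for the prediction head; the stream dimensions and maneuver count are illustrative, not the cited architecture.

# Illustrative sensory-fusion anticipator: one LSTM per stream, fused head.
import torch
import torch.nn as nn

class FusionRNN(nn.Module):
    def __init__(self, stream_dims=(12, 64), hidden=32, n_maneuvers=5):
        super().__init__()
        self.rnns = nn.ModuleList(nn.LSTM(d, hidden, batch_first=True)
                                  for d in stream_dims)
        self.head = nn.Linear(hidden * len(stream_dims), n_maneuvers)

    def forward(self, streams):               # list of (batch, time, dim)
        last = [rnn(s)[0][:, -1] for rnn, s in zip(self.rnns, streams)]
        return self.head(torch.cat(last, dim=-1))

# Example: fuse a 12-d vehicle-signal stream with a 64-d face-feature stream.
out = FusionRNN()([torch.randn(4, 30, 12), torch.randn(4, 30, 64)])
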
Anticipating Human Activities Using Object Affordances for Reactive Robotic Response
TLDR
This work represents each possible future using an anticipatory temporal conditional random field (ATCRF) that models rich spatio-temporal relations through object affordances; each ATCRF is treated as a particle, and the distribution over potential futures is represented by a set of particles.
Long-Term Activity Forecasting Using First-Person Vision
TLDR
This work develops the first forecasting framework that uses egocentric video input, an optimal vantage point for understanding human activities over large spaces, and makes it possible to combine a high-dimensional continuous state with a discrete state portion into a single state for behavior forecasting.
Intention-Sensing Recipe Guidance via User Accessing to Objects
TLDR
A strategy for recipe guidance systems that predicts the forthcoming intended subtask in a cooking task is investigated, supporting the use of “access to objects” to realize effective intention-sensing systems.
Probabilistic Modeling of Human Movements for Intention Inference
TLDR
The Intention-Driven Dynamics Model (IDDM), a latent variable model for inferring unknown human intentions, is proposed, and an efficient approximate inference algorithm for inferring the human’s intention from an ongoing movement is introduced.
Anticipating Accidents in Dashcam Videos
TLDR
The DSA-RNN learns to dynamically distribute soft attention to candidate objects to gather subtle cues, and models the temporal dependencies of all cues to robustly anticipate an accident, achieving the highest mean average precision (74.35%) and outperforming baselines without attention or an RNN.
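A toy rendition of the soft-attention step, assuming attention weights are scored from per-object features concatenated with the RNN hidden state; feature dimensions and the scoring function are assumptions, not the DSA-RNN's published formulation.

# Toy soft attention over per-object features at one time step (illustrative).
import torch
import torch.nn as nn

class ObjectAttention(nn.Module):
    def __init__(self, obj_dim=128, hidden=64):
        super().__init__()
        self.score = nn.Linear(obj_dim + hidden, 1)

    def forward(self, objects, h):            # objects: (B, N, D); h: (B, H)
        h_tiled = h.unsqueeze(1).expand(-1, objects.size(1), -1)
        weights = torch.softmax(
            self.score(torch.cat([objects, h_tiled], -1)).squeeze(-1), dim=-1)
        return (weights.unsqueeze(-1) * objects).sum(1), weights

# Attend over 8 candidate objects given the RNN hidden state.
ctx, w = ObjectAttention()(torch.randn(2, 8, 128), torch.randn(2, 64))
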
First-Person Activity Forecasting with Online Inverse Reinforcement Learning
TLDR
DARKO learns and forecasts from first-person visual observations of the user’s daily behaviors via an Online Inverse Reinforcement Learning (IRL) approach, and shows DARKO forecasts goals better than competing methods in both noisy and ideal settings.
Activity Forecasting
TLDR
The unified model uses state-of-the-art semantic scene understanding combined with ideas from optimal control theory to achieve accurate activity forecasting and shows how the same techniques can improve the results of tracking algorithms by leveraging information about likely goals and trajectories.
Anticipatory Planning for Human-Robot Teams
TLDR
A graphical model is used to represent the state space, modeling humans through their low-level kinematics as well as their high-level intent, and modeling their interactions with objects through physically grounded object affordances; this method performs better than various baselines for collaborative planning.
Predicting Motivations of Actions by Leveraging Text
TLDR
The problem of predicting why a person has performed an action in images is introduced and results suggest that transferring knowledge from language into vision can help machines understand why people in images might be performing an action.
...