Anticipating Daily Intention Using On-wrist Motion Triggered Sensing
@article{Wu2017AnticipatingDI,
  title={Anticipating Daily Intention Using On-wrist Motion Triggered Sensing},
  author={Tz-Ying Wu and Ting-An Chien and C. S. Chan and Chan-Wei Hu and Min Sun},
  journal={2017 IEEE International Conference on Computer Vision (ICCV)},
  year={2017},
  pages={48--56}
}
Anticipating human intention by observing one’s actions has many applications. For instance, picking up a cellphone, then a charger (actions) implies that one wants to charge the cellphone (intention) (Fig. 1). By anticipating the intention, an intelligent system can guide the user to the closest power outlet. We propose an on-wrist motion triggered sensing system for anticipating daily intentions, where the on-wrist sensors help us to persistently observe one’s actions. The core of the system…
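The abstract is cut off before it describes the system's internals, but the core idea of motion-triggered sensing can be illustrated with a minimal sketch: keep the low-power accelerometer always on, and wake the power-hungry wrist camera only when the motion signal becomes active. The thresholding rule and parameter values below are hypothetical illustrations, not the paper's actual trigger mechanism.

```python
import numpy as np

def motion_triggered(accel, threshold=0.5, window=16):
    """Return a boolean mask: True where the camera should be switched on.

    accel: (T, 3) array of accelerometer samples (in g).
    A window whose magnitude variance exceeds `threshold` counts as motion.
    """
    mag = np.linalg.norm(accel, axis=1)           # per-sample magnitude
    trigger = np.zeros(len(mag), dtype=bool)
    for t in range(0, len(mag) - window + 1):
        if np.var(mag[t:t + window]) > threshold:
            trigger[t:t + window] = True          # keep the camera on for the window
    return trigger

# Synthetic stream: rest, then a burst of wrist motion, then rest again.
rng = np.random.default_rng(0)
rest = rng.normal(1.0, 0.01, (64, 3))             # gravity plus sensor noise
move = rng.normal(1.0, 2.0, (32, 3))              # vigorous wrist motion
stream = np.vstack([rest, move, rest])
mask = motion_triggered(stream)
print(mask[:40].any(), mask[70:90].any())         # False True
```

Gating the camera this way is what makes persistent observation of one's actions practical on a wrist-worn, battery-powered device.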
11 Citations
Action Anticipation from Multimodal Data
- Computer Science, VISIGRAPP
- 2019
Experimental results pointed out that prediction from multimodal data seems a feasible task, suggesting that multimodality improves both classification and prediction.
Leveraging the Present to Anticipate the Future in Videos
- Computer Science, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
- 2019
This work focuses on anticipating actions seconds before they start, and proposes a fusion of a purely anticipatory model with a complementary model constrained to reason about the present that predicts present action and scene attributes, and reasons about how they evolve over time.
Predicting the Future from First Person (Egocentric) Vision: A Survey
- Computer Science, Comput. Vis. Image Underst.
- 2021
Self-Regulated Learning for Egocentric Video Activity Anticipation
- Computer Science, IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2021
A simple and effective Self-Regulated Learning framework that consecutively regulates the intermediate representation to emphasize the novel information in the frame at the current time-stamp, in contrast to previously observed content, while reflecting its correlation with previously observed frames.
Anticipating Activity from Multimodal Signals
- Computer Science, 2020 25th International Conference on Pattern Recognition (ICPR)
- 2021
An embedding space is built in which inputs of different natures, but semantically correlated, are projected into a new representation space and exploited to anticipate the future user activity.
Egocentric Prediction of Action Target in 3D
- Computer Science, ArXiv
- 2022
A large multimodal dataset of more than 1 million frames of RGB-D and IMU streams is proposed, together with evaluation metrics based on high-quality 2D and 3D labels from semi-automatic annotation, demonstrating that this new task is worthy of further study by the robotics, vision, and learning communities.
Human Activity Recognition Algorithm Based on One-Dimensional Convolutional Neural Network
- Computer Science, Rev. d'Intelligence Artif.
- 2020
A method based on the one-dimensional convolutional neural network (1D-CNN) classifies the sensor signals of several different activities, and experiments show that the proposed method generalizes well across users while also providing higher accuracy.
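The building block behind a 1D-CNN for sensor signals is a one-dimensional convolution slid along the time axis, followed by a nonlinearity and pooling. A minimal NumPy sketch of that operation (not the cited paper's actual architecture; the kernel weights are hypothetical stand-ins for learned parameters):

```python
import numpy as np

def conv1d(signal, kernel, stride=1):
    """Valid-mode 1-D convolution (really cross-correlation, as in CNNs)."""
    out_len = (len(signal) - len(kernel)) // stride + 1
    return np.array([
        np.dot(signal[i * stride : i * stride + len(kernel)], kernel)
        for i in range(out_len)
    ])

def relu(x):
    return np.maximum(x, 0.0)

# A smoothing kernel plus global max pooling: the kind of local feature a
# 1D-CNN layer can learn from a single accelerometer channel.
signal = np.array([0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0])
kernel = np.array([0.25, 0.5, 0.25])           # hypothetical learned weights
feature = relu(conv1d(signal, kernel)).max()   # global max pooling
print(feature)                                 # 2.0
```

A full classifier stacks several such convolution/pooling layers (with many kernels per layer) and feeds the pooled features to a small fully connected head.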
When will you do what? - Anticipating Temporal Occurrences of Activities
- Computer Science, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
This paper proposes two methods, a CNN and an RNN trained to learn future video labels from previously seen content, to predict a considerable number of future actions and their durations.
Mental State Analysis on Eyewear
- Computer Science, UbiComp/ISWC Adjunct
- 2018
A short state of the art of the existing technologies for sensing the mental states and the possible applications on eyewear is introduced.
Multi-sensor Data Fusion for Wearable Devices
- Computer Science
- 2018
The goal of this research is to build a shared representation between data coming from different domains, such as images, audio signals, heart rate, acceleration, etc., in order to predict daily activities.
References
Showing 1-10 of 40 references
Recurrent Neural Networks for driver activity anticipation via sensory-fusion architecture
- Computer Science, 2016 IEEE International Conference on Robotics and Automation (ICRA)
- 2016
A sensory-fusion architecture that jointly learns to anticipate and fuse information from multiple sensory streams, showing significant improvement over the state of the art in maneuver anticipation by increasing precision and recall.
Anticipating Human Activities Using Object Affordances for Reactive Robotic Response
- Computer Science, IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2016
This work represents each possible future using an anticipatory temporal conditional random field (ATCRF) that models rich spatial-temporal relations through object affordances, representing each ATCRF as a particle and the distribution over potential futures as a set of particles.
Long-Term Activity Forecasting Using First-Person Vision
- Computer Science, ACCV
- 2016
This work develops the first forecasting framework that uses egocentric video input, an optimal vantage point for understanding human activities over large spaces, making it possible to combine a high-dimensional continuous state with a discrete state into a single state for behavior forecasting.
Intention-Sensing Recipe Guidance via User Accessing to Objects
- Computer Science, Int. J. Hum. Comput. Interact.
- 2016
A strategy for recipe guidance systems that can predict the forthcoming intended subtask in a cooking task is investigated and the use of “access to objects” to realize effective intention-sensing systems is supported.
Probabilistic Modeling of Human Movements for Intention Inference
- Computer Science, Robotics: Science and Systems
- 2012
The Intention-Driven Dynamics Model (IDDM), a latent variable model for inferring unknown human intentions, is proposed and an efficient approximate inference algorithm to infer the human’s intention from an ongoing movement is introduced.
Anticipating Accidents in Dashcam Videos
- Computer Science, ACCV
- 2016
The DSA-RNN learns to dynamically distribute soft attention to candidate objects to gather subtle cues, and models the temporal dependencies of all cues to robustly anticipate an accident, achieving the highest mean average precision (74.35%) and outperforming baselines without attention or an RNN.
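Soft attention of this kind amounts to scoring each candidate object against the model's current state, normalizing the scores with a softmax, and summarizing the objects by their weighted average. A minimal sketch (dot-product scoring is one simple choice; the feature values and query are hypothetical, not the DSA-RNN's learned parameters):

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

def attend(object_features, query):
    """Weight candidate-object features by a softmax over similarity scores.

    object_features: (N, D) features for N candidate objects in a frame.
    query: (D,) state vector used to score each object.
    Returns the attention weights and the attended summary feature.
    """
    scores = object_features @ query        # dot-product scoring
    weights = softmax(scores)               # soft attention distribution
    summary = weights @ object_features     # weighted sum of object features
    return weights, summary

feats = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
query = np.array([2.0, 0.0])                # most similar to the first object
w, s = attend(feats, query)
print(w.argmax())                           # 0
```

Because the weights are a differentiable softmax rather than a hard selection, the whole mechanism can be trained end-to-end alongside the recurrent network that consumes the summary at each time step.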
First-Person Activity Forecasting with Online Inverse Reinforcement Learning
- Computer Science, 2017 IEEE International Conference on Computer Vision (ICCV)
- 2017
DARKO learns and forecasts from first-person visual observations of the user’s daily behaviors via an Online Inverse Reinforcement Learning (IRL) approach, and shows DARKO forecasts goals better than competing methods in both noisy and ideal settings.
Activity Forecasting
- Computer Science, ECCV
- 2012
The unified model uses state-of-the-art semantic scene understanding combined with ideas from optimal control theory to achieve accurate activity forecasting and shows how the same techniques can improve the results of tracking algorithms by leveraging information about likely goals and trajectories.
Anticipatory Planning for Human-Robot Teams
- Computer Science, ISER
- 2014
A graphical model is used to represent the state space, modeling the humans through their low-level kinematics as well as their high-level intent, and their interactions with objects through physically grounded object affordances; this method performs better than various baselines for collaborative planning.
Predicting Motivations of Actions by Leveraging Text
- Computer Science, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2016
The problem of predicting why a person has performed an action in images is introduced and results suggest that transferring knowledge from language into vision can help machines understand why people in images might be performing an action.