MemX: An Attention-Aware Smart Eyewear System for Personalized Moment Auto-capture

@article{memx,
  title={MemX: An Attention-Aware Smart Eyewear System for Personalized Moment Auto-capture},
  author={Yuhu Chang and Yingying Zhao and Mingzhi Dong and Yujiang Wang and Yutian Lu and Qin Lv and Robert P. Dick and Tun Lu and Ning Gu and Li Shang},
  journal={Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.}
}
YUHU CHANG, School of Computer Science and Shanghai Key Laboratory of Data Science, Fudan University, China
YINGYING ZHAO∗, School of Computer Science and Shanghai Key Laboratory of Data Science, Fudan University, China
MINGZHI DONG, School of Computer Science and Shanghai Key Laboratory of Data Science, Fudan University, China
YUJIANG WANG, Department of Computing, Imperial College London, United Kingdom
YUTIAN LU…


Gaze Estimation Using Residual Neural Network
This paper explored the use of a deep learning model, the Residual Neural Network (ResNet-18), to predict eye gaze on mobile devices using the large-scale public eye-tracking dataset GazeCapture, and found that head-pose information was a useful contribution to the proposed deep-learning network, while face-grid information did not help reduce test error.
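The residual (skip-connection) idea behind ResNet-18, combined with a regression head that outputs a 2-D gaze point, can be sketched in a few lines. This is a minimal illustrative sketch with dense layers and arbitrary dimensions, not the paper's actual convolutional architecture; all weight shapes here are assumptions.

```python
import numpy as np

def residual_block(x, w1, w2):
    """One residual block: y = relu(x + W2 @ relu(W1 @ x)).
    The skip connection (the '+ x') is what defines a ResNet block."""
    h = np.maximum(w1 @ x, 0.0)         # inner transform + ReLU
    return np.maximum(x + w2 @ h, 0.0)  # add the skip connection, then ReLU

def gaze_head(features, w_out):
    """Linear regression head mapping features to a 2-D gaze point (x, y)."""
    return w_out @ features

rng = np.random.default_rng(0)
d = 8  # illustrative feature dimension
x = rng.standard_normal(d)
w1 = rng.standard_normal((d, d)) * 0.1
w2 = rng.standard_normal((d, d)) * 0.1
w_out = rng.standard_normal((2, d)) * 0.1

features = residual_block(x, w1, w2)
gaze = gaze_head(features, w_out)
print(gaze.shape)  # (2,)
```

A real gaze estimator stacks many such blocks (with convolutions) and trains the weights by regression against recorded gaze labels.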
Classifying Attention Types with Thermal Imaging and Eye Tracking
The findings not only demonstrate the potential of thermal imaging and eye tracking for unobtrusive classification of different attention types but also pave the way for novel applications in attentive user interfaces and attention-aware computing.
Dynamic Face Video Segmentation via Reinforcement Learning
This work formulates online key-frame decision in dynamic video segmentation as a deep reinforcement learning problem and learns an efficient and effective scheduling policy from expert information about the decision history and from the process of maximising global return.
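The decision the learned scheduler makes per frame — run the expensive full segmentation network on a "key frame" versus cheaply propagate the previous result — can be illustrated with a hand-crafted heuristic stand-in. The paper learns this policy with deep RL; the threshold rule and forced-key-frame interval below are illustrative assumptions, not the learned policy.

```python
def schedule_keyframes(frame_diffs, threshold=0.5, max_gap=5):
    """Toy scheduling policy: declare a key frame when frame-to-frame
    change exceeds a threshold, or when too many frames have passed
    since the last key frame; otherwise propagate the previous result."""
    decisions = []
    since_key = 0
    for d in frame_diffs:
        if d > threshold or since_key >= max_gap:
            decisions.append("key")        # run full segmentation network
            since_key = 0
        else:
            decisions.append("propagate")  # reuse/warp previous mask cheaply
            since_key += 1
    return decisions

diffs = [0.1, 0.2, 0.9, 0.1]  # a large scene change at frame 2
print(schedule_keyframes(diffs))  # ['propagate', 'propagate', 'key', 'propagate']
```

An RL formulation replaces the fixed threshold with a policy trained to maximise a return that trades segmentation accuracy against computation cost.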
Video Instance Segmentation
This work extends the image instance segmentation problem to the video domain for the first time and proposes a novel algorithm, MaskTrack R-CNN, for this task of simultaneous detection, segmentation, and tracking of instances in videos.
Generalizing Eye Tracking With Bayesian Adversarial Learning
This work adds an adversarial component to a traditional CNN-based gaze estimator so that it learns features that are gaze-responsive yet generalize across appearance and pose variations, and extends the point-estimation-based deterministic model to a Bayesian framework so that gaze estimation can be performed using all parameters.
1D CNN with BLSTM for automated classification of fixations, saccades, and smooth pursuits
This work introduces a novel pipeline and metric for event detection in eye-tracking recordings, which enforce stricter criteria on the algorithmically produced events before considering them potentially correct detections, and shows that the deep approach outperforms all others, including the state-of-the-art multi-observer smooth-pursuit detector.
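For context on what the deep 1D CNN-BLSTM detector improves upon, the classical baseline for fixation/saccade detection is a simple velocity threshold (I-VT). A minimal sketch, with illustrative coordinates and an assumed threshold value; note that smooth pursuits are not separable by this rule, which is one motivation for learned detectors.

```python
def classify_ivt(xs, ys, ts, velocity_threshold=100.0):
    """Velocity-threshold (I-VT) event detection: samples whose
    point-to-point velocity exceeds the threshold (units per second,
    matching the coordinate units) are saccades; the rest are fixations."""
    labels = ["fixation"]  # first sample has no preceding velocity
    for i in range(1, len(xs)):
        dt = ts[i] - ts[i - 1]
        dist = ((xs[i] - xs[i - 1]) ** 2 + (ys[i] - ys[i - 1]) ** 2) ** 0.5
        labels.append("saccade" if dist / dt > velocity_threshold else "fixation")
    return labels

# Illustrative 100 Hz trace: small drift, then a rapid jump (saccade), then drift.
xs = [0.0, 0.1, 0.2, 5.0, 10.0, 10.1]
ys = [0.0, 0.0, 0.1, 3.0, 6.0, 6.1]
ts = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05]
print(classify_ivt(xs, ys, ts))
```

A learned 1D CNN-BLSTM instead ingests the raw velocity sequence and can model the temporal context needed to also separate smooth pursuits.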
Learning Unsupervised Video Object Segmentation Through Visual Attention
This paper quantitatively verified the high consistency of visual attention behavior among human observers, and found a strong correlation between human attention and explicit primary-object judgements during dynamic, task-driven viewing.
MPIIGaze: Real-World Dataset and Deep Appearance-Based Gaze Estimation
It is shown that image resolution and the use of both eyes affect gaze estimation performance, while head-pose and pupil-centre information are less informative; GazeNet, the first deep appearance-based gaze estimation method, is proposed.
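Appearance-based gaze estimators of this kind are typically evaluated by the angular error between predicted and ground-truth 3-D gaze direction vectors. A minimal sketch of that standard metric (the vector values below are illustrative):

```python
import math

def angular_error_deg(g_pred, g_true):
    """Angle in degrees between predicted and ground-truth 3-D gaze vectors,
    the standard evaluation metric for appearance-based gaze estimation."""
    dot = sum(a * b for a, b in zip(g_pred, g_true))
    norm = math.sqrt(sum(a * a for a in g_pred)) * math.sqrt(sum(b * b for b in g_true))
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against floating-point drift
    return math.degrees(math.acos(cos))

print(angular_error_deg([0.0, 0.0, -1.0], [0.0, 0.0, -1.0]))  # 0.0
print(angular_error_deg([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))    # 90.0
```

Reported results are usually the mean of this error over a test set, in degrees.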
Leveraging eye-gaze and time-series features to predict user interests and build a recommendation model for visual analysis
This work presents results on pre-attentive features and discusses the precision/recall of the model in comparison to final selections made by users; the resulting recommendation model helps users efficiently identify interesting time-series patterns.
Detecting Attended Visual Targets in Video
A novel architecture models the dynamic interaction between scene and head features to infer time-varying attention targets; the work also introduces a new annotated dataset, VideoAttentionTarget, containing complex and dynamic patterns of real-world gaze behavior.