Capturing Interactions in Meetings with Omnidirectional Cameras

@article{Stiefelhagen2005CapturingII,
  title={Capturing Interactions in Meetings with Omnidirectional Cameras},
  author={Rainer Stiefelhagen and Xilin Chen and Jie Yang},
  journal={Int. J. Distance Educ. Technol.},
  year={2005},
  volume={3},
  pages={34-47}
}
Human interaction is one of the most important characteristics of meetings. To explore complex human interactions in meetings, we must understand them and their components in detail. In this paper, we present our efforts in capturing human interactions in meetings using omnidirectional cameras. We present algorithms for person tracking, head pose estimation, and face recognition from omnidirectional images. We also discuss an approach for the estimation of who was talking to whom, based on… 
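
Since the algorithms above all operate on omnidirectional images, a common first step is to unwarp the ring-shaped camera image into a panoramic strip via a polar-to-Cartesian mapping. The sketch below illustrates that mapping with OpenCV; the center coordinates, radii, and output size are placeholder calibration values, and this is not necessarily the dewarping used in the paper.

```python
import numpy as np
import cv2  # OpenCV; used only for remapping (and optionally image I/O)

def unwarp_omni_to_panorama(omni_img, cx, cy, r_inner, r_outer,
                            out_width=1440, out_height=240):
    """Unwarp the ring-shaped region of an omnidirectional image into a
    panoramic strip using a simple polar-to-Cartesian mapping.

    cx, cy           -- image coordinates of the mirror/lens center (assumed known)
    r_inner, r_outer -- radii bounding the usable ring, in pixels
    """
    # For every pixel (column, row) of the output panorama, compute the
    # corresponding source pixel in the omnidirectional image.
    theta = np.linspace(0.0, 2.0 * np.pi, out_width, endpoint=False)  # azimuth per column
    radius = np.linspace(r_outer, r_inner, out_height)                # radius per row (top = outer)
    theta_grid, radius_grid = np.meshgrid(theta, radius)

    map_x = (cx + radius_grid * np.cos(theta_grid)).astype(np.float32)
    map_y = (cy + radius_grid * np.sin(theta_grid)).astype(np.float32)

    return cv2.remap(omni_img, map_x, map_y, interpolation=cv2.INTER_LINEAR)

# Example usage (file name and calibration values are placeholders):
# omni = cv2.imread("meeting_omni_frame.jpg")
# pano = unwarp_omni_to_panorama(omni, cx=320, cy=240, r_inner=60, r_outer=230)
# cv2.imwrite("meeting_panorama.jpg", pano)
```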

COLLABORATIVE CAPTURING AND DETECTION OF HUMAN INTERACTIONS IN MEETINGS

TLDR
This paper proposes a collaborative approach to capture and detect human interactions in meetings by employing multiple sensors, such as video cameras, microphones, and motion sensors, and adopts a support vector machine (SVM) classifier to recognize the interactions.
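
As a rough illustration of the SVM step described above, the following sketch trains a support vector machine on hypothetical per-segment feature vectors; the feature set, class labels, and data are invented for the example and do not come from the cited work.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Hypothetical per-segment feature vectors, e.g. [speaking_time, head_turns,
# gesture_energy, attention_received, ...]; the actual feature set in the
# cited work differs.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))            # 200 meeting segments, 6 features
y = rng.integers(0, 4, size=200)         # 4 interaction classes (e.g. propose,
                                         # comment, acknowledge, request-info)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
scores = cross_val_score(clf, X, y, cv=5)
print("cross-validated accuracy: %.2f +/- %.2f" % (scores.mean(), scores.std()))
```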

Capture, recognition, and visualization of human semantic interactions in meetings

TLDR
A multimodal method is proposed for interaction recognition based on a variety of contexts, including head gestures, attention from others, speech tone, speaking time, interaction occasion (spontaneous or reactive), and information about the previous interaction.

Multimodal sensing, recognizing and browsing group social dynamics

TLDR
Multimodal methods are proposed for human interaction recognition and group interest recognition based on a variety of features, and a graphical user interface, the MMBrowser, is presented for browsing group social dynamics.

Body Movement Synchrony Captured by an Omnidirectional Camera predicts the Degree of Information Transfer during Dialogue: Toward Automatic Evaluation of Verbal Communication

TLDR
An information transfer estimation method based on head movement synchrony during a conversation is proposed, using an omnidirectional camera that can easily capture body movements; synchrony was found to be high for interacting pairs, and a positive correlation was observed between synchrony and the degree of information transfer.
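
A minimal sketch of this kind of analysis: compute a synchrony score per pair as the peak normalized cross-correlation of two head-movement magnitude series, then correlate the scores with (here, placeholder) information-transfer measures. The lag range and the data are assumptions for illustration, not the cited method's actual parameters.

```python
import numpy as np
from scipy.stats import pearsonr

def movement_synchrony(motion_a, motion_b, max_lag=15):
    """Synchrony score for a pair: maximum correlation of the two
    z-normalized head-movement magnitude series over a small range of lags."""
    a = (motion_a - motion_a.mean()) / (motion_a.std() + 1e-8)
    b = (motion_b - motion_b.mean()) / (motion_b.std() + 1e-8)
    corrs = []
    for lag in range(-max_lag, max_lag + 1):
        if lag < 0:
            c = np.corrcoef(a[:lag], b[-lag:])[0, 1]
        elif lag > 0:
            c = np.corrcoef(a[lag:], b[:-lag])[0, 1]
        else:
            c = np.corrcoef(a, b)[0, 1]
        corrs.append(c)
    return max(corrs)

# Toy data: per-frame head movement magnitudes for 10 dialogue pairs,
# plus a (hypothetical) information-transfer score per pair.
rng = np.random.default_rng(1)
sync_scores, transfer_scores = [], []
for _ in range(10):
    shared = rng.normal(size=600)                 # shared rhythm component
    a = shared + 0.5 * rng.normal(size=600)
    b = shared + 0.5 * rng.normal(size=600)
    sync_scores.append(movement_synchrony(a, b))
    transfer_scores.append(rng.uniform(0, 1))     # placeholder questionnaire score

r, p = pearsonr(sync_scores, transfer_scores)
print(f"correlation between synchrony and information transfer: r={r:.2f}, p={p:.3f}")
```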

Weakly-Supervised Multi-Person Action Recognition in 360° Videos

TLDR
This work proposes a weakly-supervised method based on multi-instance multi-label learning, which trains the model to recognize and localize multiple actions in a video using only video-level action labels as supervision.
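
The sketch below shows one common way to set up multi-instance multi-label learning of this kind, assuming per-person feature vectors are already available: per-instance action scores are max-pooled into video-level scores and supervised with video-level labels only. The layer sizes, class count, and pooling choice are assumptions for illustration, not the cited model.

```python
import torch
import torch.nn as nn

NUM_ACTIONS = 8    # number of action classes (placeholder)
FEAT_DIM = 256     # per-person feature dimension (placeholder)

class MIMLHead(nn.Module):
    """Multi-instance multi-label head: each detected person in a 360° clip is
    an instance; max-pooling over instances yields video-level scores that can
    be supervised with video-level labels alone."""
    def __init__(self):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(FEAT_DIM, 128), nn.ReLU(),
            nn.Linear(128, NUM_ACTIONS),
        )

    def forward(self, person_feats):                       # (num_persons, FEAT_DIM)
        instance_logits = self.scorer(person_feats)        # per-person action scores
        video_logits = instance_logits.max(dim=0).values   # weak video-level prediction
        return video_logits, instance_logits               # instance scores give localization

model = MIMLHead()
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy bag: 5 persons detected in a clip, video-level multi-hot label over 8 actions.
person_feats = torch.randn(5, FEAT_DIM)
video_label = torch.zeros(NUM_ACTIONS)
video_label[[1, 4]] = 1.0

video_logits, instance_logits = model(person_feats)
loss = criterion(video_logits, video_label)
loss.backward()
optimizer.step()
```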

Inferring Human Interactions in Meetings: A Multimodal Approach

TLDR
The experimental results show that the SVM outperforms other inference models and can successfully infer human interactions with a recognition rate of around 80%, and the multimodal approach achieves robust and reliable results by leveraging the characteristics of each modality.

Automatic Head-size Equalization in Panorama Images for Video Conferencing

TLDR
A robust algorithm is proposed to automatically segment the table boundary, and a symmetry voting scheme is applied to filter out noisy points on the edge map to ensure robustness.

An Augmented Reality Setup with an Omnidirectional Camera Based on Multiple Object Detection

TLDR
A novel augmented reality (AR) setup with an omnidirectional camera on a tabletop display that allows the system to superimpose virtual visual effects onto the omnidirectional camera image, which acts as a mirror.

Weakly-Supervised Multi-Person Action Recognition in 360° Videos

TLDR
This work introduces 360Action, the first omnidirectional video dataset for multi-person action recognition, and proposes a weakly-supervised method based on multi-instance multi-label learning, which trains the model to recognize and localize multiple actions in a video using only video-level action labels as supervision.

Group Behavior Recognition

TLDR
This chapter presents some of the recent progress on group behavior sensing and recognition and discusses how to recognize the mobility level and structure of groups in the physical world by leveraging mobile devices.

References


Towards monitoring human activities using an omnidirectional camera

  • Xilin Chen, Jie Yang
  • Computer Science
    Proceedings. Fourth IEEE International Conference on Multimodal Interfaces
  • 2002
TLDR
The method provides an efficient way to monitor high-level human activities without exploring identities, and employs a deformable model to adapt the foreground models so that they optimally match objects at different positions within the field of view of the omnidirectional camera.

Simultaneous tracking of head poses in a panoramic view

TLDR
With this approach, it is possible to simultaneously track the locations of multiple people around a meeting table and estimate their gaze directions using only a panoramic camera.

Modeling focus of attention for meeting indexing

TLDR
An approach to detect who is looking at whom during a meeting is presented; Hidden Markov Models characterize participants' focus of attention using gaze information as well as knowledge about the number and positions of the people present in the meeting.
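
A minimal sketch of HMM-based focus-of-attention decoding under simple assumptions: hidden states are the possible focus targets, observations are quantized head-pan directions, and Viterbi decoding recovers the most likely target sequence. The transition and observation probabilities below are illustrative, not the cited paper's trained parameters.

```python
import numpy as np

def viterbi(obs, log_pi, log_A, log_B):
    """Most likely focus-of-attention state sequence for a sequence of
    discretized gaze/head-pan observations."""
    T, N = len(obs), log_pi.shape[0]
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        trans = delta[t - 1][:, None] + log_A     # score of (previous state, current state)
        psi[t] = trans.argmax(axis=0)
        delta[t] = trans.max(axis=0) + log_B[:, obs[t]]
    states = np.zeros(T, dtype=int)
    states[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):                # backtrack the best path
        states[t] = psi[t + 1, states[t + 1]]
    return states

# Toy model for one participant in a 4-person meeting:
# hidden states = which of the 3 other participants is being looked at,
# observations = head pan quantized into 3 angular bins.
pi = np.array([1 / 3] * 3)
A = np.array([[0.90, 0.05, 0.05],    # focus tends to persist between frames
              [0.05, 0.90, 0.05],
              [0.05, 0.05, 0.90]])
B = np.array([[0.8, 0.1, 0.1],       # pan bin is usually consistent with the target
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
obs = np.array([0, 0, 1, 1, 1, 2, 2, 0])
focus = viterbi(obs, np.log(pi), np.log(A), np.log(B))
print("decoded focus targets:", focus)
```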

Tracking focus of attention in meetings

  • R. Stiefelhagen
  • Psychology
    Proceedings. Fourth IEEE International Conference on Multimodal Interfaces
  • 2002
TLDR
A system is developed that estimates participants' focus of attention from multiple cues; it employs an omni-directional camera to simultaneously track the faces of participants sitting around a meeting table and uses neural networks to estimate their head poses.
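
The sketch below shows a small neural network of the general kind described, regressing head pan as a (sin, cos) pair from a normalized face crop (PyTorch). The architecture, input size, and training data are assumptions for illustration, not the network used in the cited system.

```python
import torch
import torch.nn as nn

class HeadPanNet(nn.Module):
    """Tiny CNN that regresses head pan as (sin, cos) from a normalized
    grayscale face crop; the (sin, cos) encoding avoids the wrap-around at ±180°."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(32 * 8 * 8, 64),
                                  nn.ReLU(), nn.Linear(64, 2))

    def forward(self, x):                               # x: (batch, 1, 32, 32)
        sincos = self.head(self.features(x))
        return sincos / (sincos.norm(dim=1, keepdim=True) + 1e-8)

model = HeadPanNet()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy batch: 32x32 face crops with known pan angles (degrees).
faces = torch.randn(16, 1, 32, 32)
pan_deg = torch.rand(16) * 360.0 - 180.0
target = torch.stack([torch.sin(torch.deg2rad(pan_deg)),
                      torch.cos(torch.deg2rad(pan_deg))], dim=1)

loss = criterion(model(faces), target)
loss.backward()
optimizer.step()

# Recover angles in degrees from the (sin, cos) output.
with torch.no_grad():
    pred = model(faces)
    pred_deg = torch.rad2deg(torch.atan2(pred[:, 0], pred[:, 1]))
    print(pred_deg[:4])
```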

Viewing meeting captured by an omni-directional camera

TLDR
A prototype system built using an omni-directional camera is reported, along with results from user studies of interface preferences expressed by viewers, indicating how much data needs to be stored on disk, what computation can be done on the server versus the client, and how much bandwidth is needed.

Face recognition in a meeting room

  • R. Gross, Jie Yang, A. Waibel
  • Computer Science
    Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580)
  • 2000
TLDR
The basic idea of the algorithm is to combine local features under certain spatial constraints; the experimental results indicate that the DSW approach outperforms the eigenface approach in both cases.
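
For reference, the eigenface baseline mentioned above amounts to PCA over face vectors followed by nearest-neighbour matching in the projected space; a minimal sketch on synthetic stand-in data follows (the DSW method itself is not implemented here).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Toy data standing in for cropped, grayscale, normalized face images
# (flattened to vectors); real meeting-room data would come from the tracker.
rng = np.random.default_rng(2)
n_people, imgs_per_person, img_dim = 6, 20, 32 * 32
prototypes = rng.normal(size=(n_people, img_dim))
X = np.vstack([p + 0.3 * rng.normal(size=(imgs_per_person, img_dim)) for p in prototypes])
y = np.repeat(np.arange(n_people), imgs_per_person)

# Eigenface baseline: project faces onto the leading principal components
# ("eigenfaces") and classify by nearest neighbour in that subspace.
eigenface_clf = make_pipeline(PCA(n_components=20, whiten=True),
                              KNeighborsClassifier(n_neighbors=1))
eigenface_clf.fit(X, y)
print("training accuracy:", eigenface_clf.score(X, y))
```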

Head orientation and gaze direction in meetings

TLDR
It is concluded that head orientation is a good indicator of focus of attention in human computer interaction applications.

Modeling focus of attention for meeting indexing based on multiple cues

TLDR
A system to estimate participants' focus of attention from gaze directions and sound sources is developed and can be used as an index for a multimedia meeting record and for analyzing a meeting.
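
A minimal sketch of one way such cues can be fused: per-target likelihoods from gaze and from sound-source localization are combined under a naive independence assumption. The numbers and the fusion rule are illustrative, not the cited system's model.

```python
import numpy as np

def fuse_focus(gaze_likelihood, sound_likelihood, prior=None):
    """Combine per-target likelihoods from head pose/gaze and from sound-source
    direction into a posterior over focus targets (naive independence assumption)."""
    gaze = np.asarray(gaze_likelihood, dtype=float)
    sound = np.asarray(sound_likelihood, dtype=float)
    prior = np.ones_like(gaze) / gaze.size if prior is None else np.asarray(prior, float)
    posterior = gaze * sound * prior
    return posterior / posterior.sum()

# Targets for one participant: the three other people around the table.
gaze_like = [0.6, 0.3, 0.1]     # from the head-pose estimate
sound_like = [0.2, 0.7, 0.1]    # from microphone-array sound-source localization
posterior = fuse_focus(gaze_like, sound_like)
print("posterior over focus targets:", posterior.round(3),
      "-> most likely target:", int(posterior.argmax()))
```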

Meeting Capture in a Media Enriched Conference Room

TLDR
This work describes a media-enriched conference room designed for capturing meetings; the capture setup is flexible, seamless, and unobtrusive in a public conference room used for everyday work.

Multimodal people ID for a multimedia meeting browser

TLDR
An approach is presented that identifies and tracks meeting participants by fusing multimodal inputs: face ID, speaker ID, color appearance ID, and sound source directional ID.
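
As a rough illustration of score-level fusion across such modalities, the sketch below combines per-modality posteriors over candidate identities with fixed weights; the modality names, weights, and scores are placeholders, not the cited system's actual fusion scheme.

```python
import numpy as np

def fuse_identity_scores(modality_scores, weights):
    """Weighted score-level fusion of per-modality posteriors over identities.
    modality_scores: dict of modality name -> scores over candidate IDs."""
    fused = None
    for name, scores in modality_scores.items():
        s = np.asarray(scores, dtype=float)
        s = s / s.sum()                      # normalize each modality to a distribution
        contrib = weights[name] * s
        fused = contrib if fused is None else fused + contrib
    return fused / fused.sum()

# Toy posteriors over 4 registered participants (values are illustrative only).
scores = {
    "face":             [0.50, 0.20, 0.20, 0.10],
    "speaker":          [0.40, 0.35, 0.15, 0.10],
    "color_appearance": [0.30, 0.30, 0.25, 0.15],
    "sound_direction":  [0.55, 0.15, 0.15, 0.15],
}
weights = {"face": 0.4, "speaker": 0.3, "color_appearance": 0.15, "sound_direction": 0.15}
fused = fuse_identity_scores(scores, weights)
print("fused identity posterior:", fused.round(3), "-> ID", int(fused.argmax()))
```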