Towards automatic estimation of conversation floors within F-formations

  • Chirag Raman, H. Hung
  • Published 19 July 2019
  • Sociology
  • 2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)
The detection of free-standing conversing groups has received significant attention in recent years. In the absence of a formal definition, most studies operationalize the notion of a conversation group either through a spatial or a temporal lens. Spatially, the most commonly used representation is the F-formation, defined by social scientists as the configuration in which people arrange themselves to sustain an interaction. However, the use of this representation is often accompanied with the… 


Comparing F-Formations Between Humans and On-Screen Agents
Differences in group formation based on the moderator embodiment are observed, but it is found that people's positions can be predicted as a function of the number of people in the group and the moderator's position.
Defining and Quantifying Conversation Quality in Spontaneous Interactions
A novel measure, perceived Conversation Quality, is designed to quantify spontaneous interactions by accounting for several socio-dimensional aspects of individual experience, and a questionnaire is devised to study such interactions quantitatively.
Social Processes: Self-Supervised Meta-Learning over Conversational Groups for Forecasting Nonverbal Social Cues
This work formulates the task of Social Cue Forecasting to leverage the large amount of unlabeled low-level behavior cues, and proposes the Social Process (SP) models: socially aware sequence-to-sequence (Seq2Seq) models within the Neural Process (NP) family.
Conversation Group Detection With Spatio-Temporal Context
This work proposes an approach for detecting conversation groups in social scenarios like cocktail parties and networking events, from overhead camera recordings, using a dynamic LSTM-based deep learning model that predicts continuous pairwise affinity values indicating how likely two people are in the same conversation group.
Detecting socially interacting groups using f-formation: A survey of taxonomy, methods, datasets, applications, challenges, and future research directions
A comprehensive survey of the existing work on social interaction and group detection using f-formation for robotics and other applications is provided and a novel holistic survey framework is put forward combining all the possible concerns and modules relevant to this problem.
Robocentric Conversational Group Discovery
An unsupervised conversational group detection method based on agglomerative hierarchical clustering is introduced, evaluated on a novel dataset called Robocentric Indoor Crowd Analysis (RICA).
Multimodal Joint Head Orientation Estimation in Interacting Groups via Proxemics and Interaction Dynamics
This work proposes an LSTM-based head orientation estimation method that combines the hidden representations of the group members and outperforms baseline methods that do not explicitly consider the group context, and generalizes to an unseen dataset from a different social event.
A Ulysses Pact with Artificial Systems. How to Deliberately Change the Objective Spirit with Cultured AI
The article introduces a concept of cultured technology, i.e. intelligent systems capable of interacting with humans and showing (or simulating) manners, of following customs, and of socio-sensitive…
ConfLab: A Rich Multimodal Multisensor Dataset of Free-Standing Social Interactions In-the-Wild
The ConfLab dataset is described, an instantiation of a new concept for multimodal multisensor data collection of real-life, in-the-wild, free-standing social interactions in the form of a Conference Living Lab, which aims to bridge the gap between traditional computer vision tasks and in-the-wild, ecologically valid, socially motivated tasks.
A Modular Approach for Synchronized Wireless Multimodal Multisensor Data Acquisition in Highly Dynamic Social Settings
It is argued and shown that the latency introduced by using NTP as a source reference is adequate for human behavior research, and the subsequent cost and modularity benefits are a desirable trade-off for applications in this domain.


Beyond F-Formations: Determining Social Involvement in Free Standing Conversing Groups from Static Images
  • Lu Zhang, H. Hung
  • Computer Science
  • 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2016
This paper presents the first attempt to analyse differing levels of social involvement in free-standing conversing groups (the so-called F-formations) from static images; it not only generates a richer model of the social interactions in a scene but also significantly improves F-formation detection.
On Social Involvement in Mingling Scenarios: Detecting Associates of F-Formations in Still Images
  • Lu Zhang, H. Hung
  • Computer Science
  • IEEE Transactions on Affective Computing
  • 2021
By embracing the subjectivity of social involvement, this paper not only generates a richer model of the social interactions in a scene but can use the detected associates to improve initial estimates of the full members of an F-formation.
Analyzing Free-standing Conversational Groups: A Multimodal Approach
This paper introduces a framework able to fuse multimodal data emanating from a combination of distributed and wearable sensors, taking into account the temporal consistency, the head/body coupling and the noise inherent to the scenario.
Deciphering the Silent Participant: On the Use of Audio-Visual Cues for the Classification of Listener Categories in Group Discussions
This paper devised a thin-sliced perception test where subjects were asked to assess listener roles and engagement levels in 15-second video-clips taken from a corpus of group interviews, and showed that humans are usually able to assess silent participant roles.
Parallel detection of conversational groups of free-standing people and tracking of their lower-body orientation
An alternating optimization procedure is proposed that estimates lower-body orientations and detects groups of interacting people, along with a new group detection algorithm based on F-formation detection that improves state-of-the-art detection of non-interacting people without sacrificing group detection accuracy.
Timing in turn-taking and its implications for processing models of language
This paper reviews the extensive literature on this system, adding new statistical analyses of behavioral data where they have been missing; it demonstrates that turn-taking has the systematic properties originally noted by Sacks et al. (1974) and sketches a first model of the mental processes involved for a participant preparing to speak next.
Who's got the floor?
This study into the nature of “the floor” actually began as an open-ended inquiry into sex differences that might occur beyond the sentence level in the multi-party interaction of five informal…
Social interaction discovery by statistical analysis of F-formations
A novel approach for detecting social interactions in a crowded scene employing solely visual cues, using the sociological notion of the F-formation: a set of possible configurations in space that people may assume while participating in a social interaction.
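The core geometric idea behind F-formation detection approaches such as the one above can be sketched compactly: each person "votes" for the centre of a shared o-space by projecting a point a fixed stride ahead along their facing direction, and people whose votes land close together are grouped. The sketch below is a minimal illustration under assumed inputs (known 2D positions in metres and body orientations in radians); the greedy grouping step and the `stride`/`radius` parameters are simplifications standing in for the full Hough-style vote accumulation with uncertainty used in the literature.

```python
import math

def detect_f_formations(people, stride=0.75, radius=0.5):
    """Sketch of F-formation detection via o-space centre voting.

    people: list of (x, y, theta) tuples -- position in metres and
    body orientation in radians. Each person votes for an o-space
    centre `stride` metres along their facing direction; people whose
    votes fall within `radius` of one another are grouped together.
    Returns a list of groups, each a list of indices into `people`.
    """
    # Project each person's vote for the shared o-space centre.
    votes = [(x + stride * math.cos(t), y + stride * math.sin(t))
             for x, y, t in people]

    # Greedy clustering of the votes (a stand-in for Hough accumulation).
    groups = []
    assigned = [False] * len(votes)
    for i, vote_i in enumerate(votes):
        if assigned[i]:
            continue
        group = [i]
        assigned[i] = True
        for j in range(i + 1, len(votes)):
            if not assigned[j] and math.dist(vote_i, votes[j]) <= radius:
                group.append(j)
                assigned[j] = True
        groups.append(group)
    return groups

# Two people facing each other 1.5 m apart vote for the same midpoint
# and are grouped; a distant third person forms their own singleton.
print(detect_f_formations([(0.0, 0.0, 0.0),
                           (1.5, 0.0, math.pi),
                           (5.0, 5.0, 0.0)]))  # -> [[0, 1], [2]]
```

Real systems additionally weight votes by orientation uncertainty and check that the resulting o-space is not occluded by non-members, which this sketch omits.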