Nikolaos Sarafianos

Speaker diarization aims to automatically answer the question “who spoke when” given a speech signal. In this work, we apply the FLsD approach, a semi-supervised version of Fisher Linear Discriminant analysis, to both the audio and the video signals to form a complete multimodal speaker diarization system. Extensive experiments have…
In this work, we investigate the problem of predicting gender from still images using human metrology. Since the values of the anthropometric measurements are difficult to estimate accurately with state-of-the-art computer vision algorithms, ratios of anthropometric measurements were used as features. Additionally, since several measurements will not be…
In this paper, we propose a novel regression-based method for employing privileged information to estimate the height using human metrology. The actual values of the anthropometric measurements are difficult to estimate accurately using state-of-the-art computer vision algorithms. Hence, we use ratios of anthropometric measurements as features. Since many…
Integrating robotic platforms in smart home environments can improve the monitoring quality of daily activities. In this study, we explore a scenario where a robot provides a service to the users, which in our case is delivering a cup of coffee. The users place their order via an application, which at the same time captures a short video from their…
In this paper, we predict a human's depression level on the BDI-II scale, using facial and voice features. Active orientation models (AOM) and several voice features were extracted from the video and audio modalities. Long-term and mid-term features were computed and fusion was performed in the feature space. Videos from the Depression Recognition…