Nikolaos Sarafianos

Speaker diarization aims to automatically answer the question “who spoke when” given a speech signal. In this work, we have focused on applying the FLsD approach, a semi-supervised version of Fisher Linear Discriminant analysis, both in the audio and the video signals to form a complete multimodal speaker diarization system. Extensive experiments have …
In this paper, we predict a human's depression level in the BDI-II scale, using facial and voice features. Active orientation models (AOM) and several voice features were extracted from the video and audio modalities. Long-term and mid-term features were computed and a fusion is performed in the feature space. Videos from the Depression Recognition …
Integrating robotic platforms in smart home environments can improve the monitoring quality of daily activities. In this study, we explore a scenario where a robot provides a service to the users, which in our case is delivering a cup of coffee. The users place their order via an application, which at the same time captures a short video from their …