Carlos Segura

Simultaneous speech poses a challenging problem for conventional speaker diarization systems. In meeting data, a substantial share of the missed speech error is due to speaker overlaps, since usually only one speaker label is assigned per segment. Furthermore, simultaneous speech included in training data can lead to corrupt speaker models and thus worse…
Acoustic event detection (AED) aims at determining the identity of sounds and their temporal position in audio signals. When applied to spontaneously generated acoustic events, AED based only on audio information produces a large number of errors, which are mostly due to temporal overlaps. In fact, temporal overlaps accounted for more than 70% of errors in…
Reliable measurements of person positions are needed for computational perception of human activities taking place in a smart-room environment. In this work, we present the person tracking systems developed at UPC for audio, video and audio-video modalities in the context of the EU-funded CHIL project research activities. The aim of the designed systems, and…
Acoustic events produced in meeting environments may contain useful information for perceptually aware interfaces and multimodal behavior analysis. This paper presents a system that detects and recognizes these events from a multimodal perspective by combining information from multiple cameras and microphones. First, spectral and temporal features are…
The Frequency Assignment Problem (FAP) is one of the key issues in the design of GSM (Global System for Mobile communications) networks, and will remain important in the foreseeable future. There are many versions of FAP, most of them benchmark-like problems. We use a formulation of FAP, developed in published work, that focuses on aspects which are…
Detecting the location and identity of users is a first step in creating context-aware applications for technologically endowed environments. We propose a system that makes use of motion detection, person tracking, face identification, feature-based identification, audio-based localization, and audio-based identification modules, fusing information with…
This article presents a multimodal approach to head pose estimation of individuals in environments equipped with multiple cameras and microphones, such as SmartRooms or automatic video conferencing. Determining an individual's head orientation is the basis for many forms of more sophisticated interactions between humans and technical devices and can also be…
Mobile communications are experiencing strong growth and becoming increasingly indispensable. One of the key issues in the design of mobile networks is the frequency assignment problem (FAP). This problem is crucial at present and will remain important in the foreseeable future. Real-world instances of FAP typically involve very large networks, which…
Most research on Strip Packing Problems focuses on the single-objective formulation of the problem. In this work, however, we deal with a more general and practical variant, which seeks to optimize not only the usage of the raw material but also the production process. To solve the problem, we have applied some of the best-known…