Javier Ruiz Hidalgo

This paper deals with the extraction and characterization of foreground objects in video sequences. The algorithm first computes the mosaic image representing the background information and then extracts foreground objects. In this last step, the foreground objects are progressively extracted taking into account the reliability of the contour information.
For the last few years, video indexing and video compression have been considered as two separate functionalities. However, multimedia content is growing at such a rate that multimedia services will need to consider both the compression and the indexing aspects of the content in order to efficiently manage this audio–video content. Therefore, it is …
The media industry is currently being pulled in the often-opposing directions of increased realism (high resolution, stereoscopic, large screen) and personalization (selection and control of content, availability on many devices). We investigate the feasibility of an end-to-end format-agnostic approach to support both these trends. In this paper, different …
At the Technical University of Catalonia (UPC), a smart room has been equipped with 85 microphones and 8 cameras. This paper describes the setup of the sensors, gives an overview of the underlying hardware and software infrastructure and indicates possibilities for high- and low-level multi-modal interaction. An example of usage of the information collected …
The main challenge in Super Resolution (SR) is to discover the mapping between the low- and high-resolution manifolds of image patches, a complex ill-posed problem which has recently been addressed through piecewise linear regression with promising results. In this paper we present a novel regression-based SR algorithm that benefits from an extended …
In this paper, we propose a gesture-based interface designed to interact with panoramic scenes. The system combines novel static gestures with a fast hand tracking method. Our proposal is to use static gestures as shortcuts to activate functionalities of the system (e.g., volume up/down, mute, pause), and hand tracking to freely explore the panoramic …
The performance of learning-based Super-Resolution (SR) methods depends strongly on the content of the training dataset. In [3] the dictionary is built by randomly sampling raw patches from a large set of images regardless of the image to be recovered, hence relying on gathering sufficiently diverse patches so that they can generalize for any patch to be …