Maurizio Rigamonti

Accessing the structured content of PDF documents is a difficult task that requires pre-processing and reverse engineering techniques. In this paper, we first present different methods to accomplish this task, based either on document image analysis or on electronic content extraction. Then, XCDF, a canonical format with well-defined properties, is…
PDF has become a very common format for exchanging printable documents. Furthermore, it can be easily generated from the major document formats, which makes a huge number of PDF documents available over the net. However, its use is limited to displaying and printing, which considerably reduces search and retrieval capabilities. For this reason, additional tools…
This article presents Xed, a reverse engineering tool for PDF documents that extracts the original document layout structure. Xed combines electronic extraction methods with state-of-the-art document analysis techniques and outputs the layout structure in a hierarchical canonical form, i.e. one that is universal and independent of the document type. This…
The aim of this report is to describe the browsers that have been developed by various groups within the IM2 project, highlighting the goals, design methodologies, key functionalities and evaluation methods used by each. The paper concludes with a tabular overview of the media, input and output modalities and special functionalities handled by each browser,…
The paper presents an extension to Excentric Labeling, a labeling technique that dynamically shows labels around a movable lens. Each label refers to one object within the lens and is connected to it by a line. The original implementation has several known limitations and potential improvements that we address in this work, such as high-density areas,…
This paper describes a novel browsing paradigm that takes advantage of the various types of links (e.g. thematic, temporal, references) that can be automatically built between multimedia documents. This browsing paradigm can help reveal the hidden structures of multimedia archives or expand search results to related media. The paper intends to present a…
This paper describes the DocMining platform, which aims to provide a general framework for document interpretation. The platform integrates processings that come from different sources and that communicate through the document. A task to be performed is represented by a scenario that describes the processings to be run, and each processing is…
This article presents an ego-centric approach for indexing and browsing meetings. The method relies on two concepts: the alignment of meeting data with personal information, which enables ego-centric browsing, and the live intentional annotation of meetings through tangible actions, which enables ego-centric indexing. The article first motivates and introduces these concepts…