Guilherme Tavares de Assis

Author name ambiguity is a hard problem that occurs when several authors publish articles under the same name or when the same author publishes articles under different names. Traditionally, automatic disambiguation methods process the author names of all citation records in a repository. Aiming at efficiency, incremental methods disambiguate author names ...
This paper describes the development of a multimedia information system to support the discourse analysis of video recordings of television programs. Although the TV system is one of the most fascinating media phenomena ever created by men, there is still a lack of information systems that allow an effective retrieval of TV information relevant to the ...
The main goal of focused crawlers is to crawl Web pages that are relevant to a specific topic or user interest, and they play an important role in a great variety of applications. In general, they work by trying to find and crawl all kinds of pages deemed related to an implicitly declared topic. However, users are often not simply interested in any ...
The genre-aware approach to focused crawling aims at crawling pages related to specific topics that can be expressed in terms of both genre and content information. Such an approach requires an expert to specify a set of terms that describe the genre and the content of the pages of interest. In this paper, we analyze the impact of term selection on this ...
This work addresses the development of a unified approach to content-based indexing and retrieval of digital videos from television archives. The proposed approach has been designed to deal with arbitrary television genres, making it suitable for various applications. To achieve this goal, the main steps of a content-based video retrieval system are ...
This paper presents a novel multimedia information system, called SAPTE, for supporting the discourse analysis and information retrieval of television programs from their corresponding video recordings. Unlike most common systems, SAPTE uses both content-independent and content-dependent metadata, which are determined by the application of discourse analysis ...
Focused crawlers attempt to crawl web pages that are relevant to a specific topic or user interest. Although these kinds of crawlers have been proven to be effective, they need to improve their efficiency. Focused crawlers usually use a Frontier of non-visited URLs to visit the web pages and gather relevant ones. In this work, we define and evaluate a ...
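
As a generic illustration of the Frontier idea mentioned in the last abstract, the following is a minimal sketch and not the method proposed in that paper, whose abstract is truncated here. It assumes a hypothetical term-overlap `relevance` score, a caller-supplied `fetch` function, and a toy in-memory link graph in place of real HTTP requests; all names are invented for the example.

```python
import heapq

def relevance(text, topic_terms):
    """Hypothetical score: fraction of topic terms that occur in the text."""
    words = set(text.lower().split())
    return sum(term in words for term in topic_terms) / len(topic_terms)

def focused_crawl(seeds, fetch, topic_terms, threshold=0.5, max_pages=100):
    """Visit pages in priority order and return the URLs judged relevant.

    `fetch(url)` is assumed to return a (text, outgoing_links) pair.
    """
    # The frontier holds non-visited URLs. heapq is a min-heap, so
    # priorities are stored negated to pop the most promising URL first.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    relevant = []
    visited = 0
    while frontier and visited < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = fetch(url)
        visited += 1
        score = relevance(text, topic_terms)
        if score >= threshold:
            relevant.append(url)
        # Assumption: a link inherits the score of the page it was found on.
        for link in links:
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score, link))
    return relevant

# Toy link graph standing in for the Web: url -> (page text, outgoing links).
TOY_WEB = {
    "a": ("focused crawler survey", ["b", "c"]),
    "b": ("cooking recipes", ["d"]),
    "c": ("crawler frontier design", ["e"]),
    "d": ("more recipes", []),
    "e": ("focused crawler evaluation", []),
}

def toy_fetch(url):
    return TOY_WEB[url]

result = focused_crawl(["a"], toy_fetch, ["focused", "crawler"])
print(result)
```

In this sketch, links found on relevant pages are visited before links found on irrelevant ones, which is one simple way a frontier ordering can affect how quickly relevant pages are gathered.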