• Corpus ID: 18318880

HNews: An Enhanced Multilingual Hyperlinking News Platform

Diego De Cao, Daniele Previtali, Roberto Basili

In this paper, we describe the HNews platform, a Web-based system addressing the general problem of aggregating and enriching news from different sources and languages. In the indexing stage, the news items gathered from RSS feeds or video streams are analyzed through Information Extraction tools. Their topical category information and Named Entity mentions are recognized and used to create semantic metadata so as to enrich the information available for each news item. Moreover, a robust…

RitroveRAI: A Web Application for Semantic Indexing and Hyperlinking of Multimedia News
Performance evaluation of the current system prototype confirms the viability of the RitroveRAI approach for realistic (i.e. 24 hours) applications and continuous monitoring and metadata extraction from multimedia news data.
Enriched Page Rank for Multilingual Word Sense Disambiguation
An adaptation of the PageRank algorithm proposed for WSD using distributional information is presented to preserve the achievable accuracy for the English language over a foreign language, and a variant called Personalized PageRank (PPR) is proposed in [7].
Robust and Efficient Page Rank for Word Sense Disambiguation
An adaptation of the PageRank algorithm recently proposed for Word Sense Disambiguation is presented that preserves the reachable accuracy while significantly reducing the requested processing time.
Personalizing PageRank for Word Sense Disambiguation
This paper proposes a new graph-based method that uses the knowledge in an LKB (based on WordNet) to perform unsupervised Word Sense Disambiguation, performing better than previous approaches on English all-words datasets.
An Algorithm that Learns What's in a Name
IdentiFinder™, a hidden Markov model that learns to recognize and classify names, dates, times, and numerical quantities, is evaluated and is competitive with approaches based on handcrafted rules on mixed case text and superior on text where case information is not available.
Information Extraction: A Multidisciplinary Approach to an Emerging Information Technology
This paper presents a meta-modelling architecture for multilingual information extraction that combines modeling and querying semi-structured data with formal ontological distinctions for information organization, extraction, and integration.
The Semantic Web - ISWC 2005, 4th International Semantic Web Conference, ISWC 2005, Galway, Ireland, November 6-10, 2005, Proceedings
Semantic Acceleration Helping Realize the Semantic Web Vision or "The Practical Web", research/Academic track.
NLP-driven IR: Evaluating Performances over a Text Classification task
A novel model for TC is defined, extending a well-known statistical model and applied to linguistic features; it represents an effective feature selection methodology that reaches the performance of the best known models.
Introduction to WordNet: An On-line Lexical Database
Standard alphabetical procedures for organizing lexical information put together words that are spelled alike and scatter words with similar or related meanings haphazardly through the list.
Creating Rich Metadata in the TV Broadcast Archives Environment: The PrestoSpace Project
A. Messina, L. Boch, Roberto Basili. 2006 Second International Conference on Automated Production of Cross Media Content for Multi-Channel Distribution (AXMEDIS'06), 2006.
The mission of the MAD system, inside the wider perspective of the PrestoSpace factory, is to generate, validate and deliver to the archive users metadata created through the employment of both automatic and manual information extraction tools.