Stefano Ortona

Learn More
Named entity extractors can be used to enrich both text and Web documents with semantic annotations. While originally focused on a few standard entity types, the ecosystem of annotators is becoming increasingly diverse, with recognition capabilities ranging from generic to specialised entity types. Both the overlap and the diversity in annotator(More)
Web scraping (or wrapping) is a popular means for acquiring data from the web. Recent advancements have made scal-able wrapper-generation possible and enabled data acquisition processes involving thousands of sources. This makes wrapper analysis and maintenance both needed and challenging as no scalable tools exists that support these tasks. We demonstrate(More)
Today the web has become the largest available source of information. The automatic extraction of structured data from web is a challenging problem that has been widely investigated. However, after the extraction process, the problem of identifying duplicates among the extracted web records must be solved in order to present clean data to the final user.(More)
  • 1