• Corpus ID: 250244139

Multi-Document Keyphrase Extraction: Dataset, Baselines and Review

Ori Shapira, Ramakanth Pasunuru, Ido Dagan, Yael Amsterdamer
Keyphrase extraction has been extensively researched within the single-document setting, with an abundance of methods, datasets and applications. In contrast, multi-document keyphrase extraction has been infrequently studied, despite its utility for describing sets of documents, and its use in summarization. Moreover, no prior dataset exists for multi-document keyphrase extraction, hindering the progress of the task. Recent advances in multi-text processing make the task an even more ap…

Automatic keyphrase extraction: a survey and trends

A comprehensive review of recent research efforts on the AKPE task and its related techniques is provided, including a comparison study of the best-performing techniques, an account of why some perform better than others, and recommendations for improving each stage of the AKPE process.

CorePhrase: Keyphrase Extraction for Document Clustering

Subjective as well as quantitative evaluations show that the algorithm outperforms keyword-based cluster-labeling algorithms and is capable of accurately discovering the topic, often ranking it among the top one or two extracted keyphrases.

Large Dataset for Keyphrases Extraction

A large dataset for machine learning-based automatic keyphrase extraction, built from 2,000 scientific papers in the computer science domain published by ACM, shows improved keyphrase recognition accuracy for refined texts.

CollabRank: Towards a Collaborative Approach to Single-Document Keyphrase Extraction

This paper proposes a novel approach named CollabRank to collaborative single-document keyphrase extraction by making use of mutual influences of multiple documents within a cluster context, and finds that the system performance relies positively on the quality of document clusters.

A review of keyphrase extraction

This article introduces keyphrase extraction, provides a well‐structured review of the existing work, offers interesting insights on the different evaluation approaches, highlights open issues and presents a comparative experimental study of popular unsupervised techniques on five datasets.

Capturing Global Informativeness in Open Domain Keyphrase Extraction

JointKPE, an open-domain KPE architecture built on pre-trained language models, is presented; it captures both local phraseness and global informativeness when extracting keyphrases, and shows significant advantages in predicting long and non-entity keyphrases, which are challenging for previous neural KPE methods.

Automatic query-based keyword and keyphrase extraction

The proposed approach is particularly practical when a user is interested in additional data, such as keywords/keyphrases related to a topic or query, and is based on the user's satisfaction.

PositionRank: An Unsupervised Approach to Keyphrase Extraction from Scholarly Documents

An unsupervised model for keyphrase extraction from scholarly documents that incorporates information from all positions of a word’s occurrences into a biased PageRank, which achieves remarkable improvements over PageRank models that do not take into account word positions.
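
The position-biased PageRank idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `position_biased_pagerank`, the co-occurrence `window`, the damping value of 0.85, and the fixed iteration count are all assumptions made for illustration, and the sketch scores individual words only, without any candidate-phrase selection step.

```python
from collections import defaultdict


def position_biased_pagerank(words, window=2, damping=0.85, iterations=50):
    """Score words with a PageRank whose teleport vector favours words that
    occur early and often (hypothetical helper, for illustration only)."""
    if not words:
        return {}

    # Undirected co-occurrence graph: an edge weight counts how often two
    # distinct words appear within `window` tokens of each other.
    weights = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + window + 1, len(words))):
            v = words[j]
            if v != w:
                weights[w][v] += 1.0
                weights[v][w] += 1.0

    # Position bias: sum of inverse 1-based positions over all occurrences
    # of a word, normalized so the bias values sum to 1.
    bias = defaultdict(float)
    for pos, w in enumerate(words, start=1):
        bias[w] += 1.0 / pos
    total = sum(bias.values())
    bias = {w: b / total for w, b in bias.items()}

    # Power iteration: each word receives score from its neighbours in
    # proportion to edge weight, plus a teleport share given by its bias.
    vocab = list(bias)
    scores = {w: 1.0 / len(vocab) for w in vocab}
    for _ in range(iterations):
        new_scores = {}
        for w in vocab:
            incoming = sum(
                scores[v] * weights[v][w] / sum(weights[v].values())
                for v in weights[w]
            )
            new_scores[w] = (1 - damping) * bias[w] + damping * incoming
        scores = new_scores
    return scores
```

Calling `position_biased_pagerank(tokens)` on a tokenized document returns a dictionary mapping each word to its score. Replacing `bias` with a uniform distribution would give an ordinary PageRank over the same graph, which is the kind of position-agnostic baseline the summary contrasts against.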

Interactive document summarisation using automatically extracted keyphrases

An evaluation of IDS summaries is reported, in which representative end-users of on-line documents identified relevant summary sentences in source documents, and the efficacy of the summaries was assessed using standard precision and recall measures.