Generalizing Cross-Document Event Coreference Resolution Across Multiple Corpora

Michael Bugert, Nils Reimers, Iryna Gurevych
Computational Linguistics
Cross-document event coreference resolution (CDCR) is an NLP task in which mentions of events need to be identified and clustered throughout a collection of documents. CDCR aims to benefit downstream multidocument applications, but despite recent progress on corpora and system development, downstream improvements from applying CDCR have not been shown yet. We make the observation that every CDCR system to date was developed, trained, and tested only on a single respective corpus. This raises… 
Qualitative and Quantitative Analysis of Diversity in Cross-document Coreference Resolution Datasets
A phrasing diversity metric (PD) is proposed that compares lexical diversity within coreference chains at a more detailed level than previously proposed metrics, e.g., the number of unique lemmas.
XCoref: Cross-document Coreference Resolution in the Wild
Outperforming an established CDCR model shows that new CDCR models need to be evaluated on semantically complex mentions with looser coreference relations to indicate their applicability to resolving mentions in the “wild” of political news articles.
Event Coreference Data (Almost) for Free: Mining Hyperlinks from Online News
This work automatically extracts event coreference data from hyperlinks in online news and demonstrates that collecting hyperlinks which point to the same article(s) produces extensive, high-quality CDCR data. It creates HyperCoref, a corpus of 2M documents and 2.7M silver-standard event mentions.


NewsReader at SemEval-2018 Task 5: Counting events by reasoning over event-centric-knowledge-graphs
The participation of the NewsReader system in the SemEval-2018 Task 5 on Counting Events and Participants in the Long Tail is described and the quality and potential of ECKGs are tested to establish event identity and reason over the result to answer the task queries.
A model-theoretic coreference scoring scheme
This note describes a scoring scheme for the coreference task in MUC6. It improves on the original approach by: (1) grounding the scoring scheme in terms of a model; (2) producing more intuitive
Breaking the Subtopic Barrier in Cross-Document Event Coreference Resolution
This work presents the first scalable approach for annotating cross-subtopic event coreference links, a highly valuable but rarely occurring type of cross-document link, and proposes crowdsourcing annotation at the sentence level to achieve scalability.
New Insights into Cross-Document Event Coreference: Systematic Comparison and a Simplified Approach
This study provides the first cross-validated evaluation on the ECB+ dataset; the first explicit evaluation of the pairwise event coreference classification step; and the first quantification of the effect of document clustering on system performance.
Paraphrasing vs Coreferring: Two Sides of the Same Coin
This work used annotations from an event coreference dataset as distant supervision to re-score heuristically-extracted predicate paraphrases, and used the same re-ranking features as additional inputs to a state-of-the-art event coreference resolution model, which yielded modest but consistent improvements to the model’s performance.
Event Coreference Resolution: A Survey of Two Decades of Research
This paper provides an overview of the major milestones made in event coreference research since its inception two decades ago.
KOI at SemEval-2018 Task 5: Building Knowledge Graph of Incidents
We present KOI (Knowledge of Incidents), a system that, given news articles as input, builds a knowledge graph (KOI-KG) of incidental events. KOI-KG can then be used to efficiently answer questions
Revisiting the Evaluation for Cross Document Event Coreference
A new evaluation methodology is suggested that overcomes limitations of past work, allows for an accurate assessment of CDEC systems, and better reflects the corpus-wide information aggregation ability of CDEC systems.
A Hierarchical Distance-dependent Bayesian Model for Event Coreference Resolution
A novel hierarchical distance-dependent Bayesian model for event coreference resolution that allows for the incorporation of pairwise distances between event mentions to guide the generative clustering process for better event clustering both within and across documents.
Joint Entity and Event Coreference Resolution across Documents
A novel coreference resolution system that models entities and events jointly and handles nominal and verbal events as well as entities; the joint formulation allows information from event coreference to help entity coreference, and vice versa.