The best current methods on MCTACO are found to trail human performance by about 20%, and several directions for improvement are discussed.
A more rigorous annotation paradigm for NLP is proposed that helps to close systematic gaps in the test data; it recommends that dataset authors manually perturb the test instances in small but meaningful ways that (typically) change the gold label, creating contrast sets.
A zero-shot entity typing approach is proposed that requires no annotated data and can flexibly identify newly defined types; it is shown to be competitive with state-of-the-art supervised NER systems and to outperform them on out-of-training datasets.
This paper introduces CogCompTime, a system that has these two important functionalities, incorporates the most recent progress, achieves state-of-the-art performance, and is publicly available at http://cogcomp.org/page/publication_view/844.
This work proposes a novel sequence modeling approach that exploits explicit and implicit mentions of temporal common sense, extracted from a large corpus, to build TacoLM, a temporal commonsense language model.
A neuro-symbolic temporal reasoning model, SymTime, is proposed, which exploits distant supervision signals from large-scale text, uses temporal rules to combine start times and durations to infer end times, and generalizes to other temporal reasoning tasks.
A new annotation paradigm for NLP is proposed that helps to close systematic gaps in the test data, and it is recommended that after a dataset is constructed, the dataset authors manually perturb the test instances in small but meaningful ways that change the gold label, creating contrast sets.
A new model, JEANS, is proposed, which jointly represents multilingual KGs and text corpora in a shared embedding scheme, and seeks to improve entity alignment with incidental supervision signals from text.
This work presents the library COGCOMPNLP, which simplifies the design and development of NLP applications by providing modules that address different challenges: a corpus-reader module that supports popular corpora in the NLP community, a module for various low-level data structures and operations, and an extensive suite of annotation modules for a wide range of semantic and syntactic tasks.