CORD-19: The COVID-19 Open Research Dataset
This paper describes the mechanics of dataset construction, highlighting challenges and key design decisions, gives an overview of how CORD-19 has been used, and describes several shared tasks built around the dataset.
S2ORC: The Semantic Scholar Open Research Corpus
This work introduces S2ORC, a large corpus of 81.1M English-language academic papers spanning many academic disciplines, which is expected to facilitate research and development of tools and tasks for text mining over academic text.
Construction of the Literature Graph in Semantic Scholar
This paper reduces literature graph construction to familiar NLP tasks, points out research challenges due to differences from standard formulations of these tasks, and reports empirical results for each task.
TREC-COVID: Constructing a Pandemic Information Retrieval Test Collection
TREC-COVID is a community evaluation designed to build a test collection that captures the information needs of biomedical researchers using the scientific literature during a pandemic. One of the…
Fact or Fiction: Verifying Scientific Claims
We introduce scientific claim verification, a new task to select abstracts from the research literature containing evidence that supports or refutes a given scientific claim, and to identify…
TREC-COVID: rationale and structure of an information retrieval shared task for COVID-19
TREC-COVID differs from traditional IR shared task evaluations with special considerations for the expected users, IR modality considerations, topic development, participant requirements, assessment process, relevance criteria, evaluation metrics, iteration process, projected timeline, and the implications of data use as a post-task test collection.
Ontology alignment in the biomedical domain using entity definitions and context
- Lucy Lu Wang, Chandra Bhagavatula, Mark Neumann, Kyle Lo, Christopher Wilhelm, Waleed Ammar
- Computer Science, Philosophy
- BioNLP
- 20 June 2018
This work proposes a method for enriching entities in an ontology with external definition and context information, uses this additional information for ontology alignment, and develops a neural architecture capable of encoding the additional information when available.
MS^2: Multi-Document Summarization of Medical Studies
- Jay DeYoung, Iz Beltagy, Madeleine van Zuylen, Bailey Kuehl, Lucy Lu Wang
- Computer Science
- EMNLP
- 13 April 2021
This work releases MS^2 (Multi-Document Summarization of Medical Studies), a dataset of over 470K documents and 20K summaries derived from the scientific literature that facilitates the development of systems that can assess and aggregate contradictory evidence across multiple studies, and is the first large-scale, publicly available multi-document summarization dataset in the biomedical domain.
What Do We Mean by “Accessibility Research”?: A Literature Survey of Accessibility Papers in CHI and ASSETS from 1994 to 2019
- K. Mack, Emma J. McDonnell, D. Jain, Lucy Lu Wang, Jon E. Froehlich, Leah Findlater
- 12 January 2021
A dataset of accessibility papers appearing at CHI and ASSETS since ASSETS’ founding in 1994 is created and analyzed to understand current and historical trends and highlight areas that have received disproportionate attention and those that are underserved.