• Publications
QASC: A Dataset for Question Answering via Sentence Composition
This work presents a multi-hop reasoning dataset, Question Answering via Sentence Composition (QASC), that requires retrieving facts from a large corpus and composing them to answer a multiple-choice question, and provides annotation for supporting facts as well as their composition.
From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project
Success is reported on the Grade 8 New York Regents Science Exam, where for the first time a system scores more than 90 percent on the exam’s non-diagram, multiple-choice (NDMC) questions, demonstrating that modern natural language processing methods can achieve mastery of this task.
Protinfo PPC: A web server for atomic level prediction of protein complexes
The Protinfo PPC server provides a fully automated, all-atom comparative modeling service for protein complexes, with capabilities ranging from prediction of protein complex interactions to identification of possible interaction sites, which will be useful for researchers studying these topics.
BIOVERSE: enhancements to the framework for structural, functional and contextual modeling of proteins and proteomes
An overview of the new features available in the Bioverse, which include an expanded number of organisms represented and the addition of new data sources and novel prediction techniques not available elsewhere, including network-based annotation.
A Dataset for Tracking Entities in Open Domain Procedural Text
This work presents the first dataset for tracking state changes in procedural text from arbitrary domains by using an unrestricted (open) vocabulary, and creates OPENPI, a high-quality, large-scale dataset comprising 29,928 state changes over 4,050 sentences from 810 procedural real-world paragraphs from WikiHow.com.
Improving the accuracy of template-based predictions by mixing and matching between initial models
This novel approach can be applied without any manual intervention to improve the quality of comparative predictions where multiple template/alignment combinations are available for modeling, producing conformational models of higher quality than the initial predictions.
GPU-Q-J, a fast method for calculating root mean square deviation (RMSD) after optimal superposition
GPU-Q-J relieves a major bottleneck in the clustering of large numbers of structures for NRW and has applications in structure comparison methods that involve multiple superposition and RMSD determination steps, particularly when such methods are applied on a proteome- and genome-wide scale.
INTEGRATOR: interactive graphical search of large protein interactomes over the Web
Integrator provides single and multiple protein searches of the Bioverse database, which contains experimentally derived and predicted protein-protein interactions; the interface provides animated local network views, rapid subgraph manipulation, and cross-referencing of functional annotations.
The Bioverse API and web application.
The Bioverse is a framework for creating, warehousing and presenting biological information based on hierarchical levels of organisation. The framework is guided by a deeper philosophy of desiring to
Computational representation of biological systems.
A systematic approach for the management and analysis of large biological data sets based on data warehouses and implemented in the Bioverse, a framework combining diverse protein information from a variety of knowledge areas such as molecular interactions, pathway localization, protein structure, and protein function.