IdeaReader: A Machine Reading System for Understanding the Idea Flow of Scientific Publications

Qi Li, Yuyang Ren, Xingli Wang, Luoyi Fu, Jiaxin Ding, Xinde Cao, Xinbing Wang, Cheng Zhou
Understanding the origin and influence of a publication’s ideas is critical to conducting scientific research. However, the proliferation of scientific publications makes it difficult for researchers to trace the evolution of all relevant literature. To this end, we present IdeaReader, a machine reading system that identifies which papers are most likely to have inspired or been influenced by the target publication and summarizes the ideas of these papers in natural language. Specifically, IdeaReader first…

MRT: Tracing the Evolution of Scientific Publications

This work proposed a practical framework called Master Reading Tree (MRT), which can build annotated evolution roadmaps for publications and identify important previous works or evolution tracks by generating expressive embeddings and clustering them into various groups.

Detecting topic evolution in scientific literature: how can citations help?

An iterative topic evolution learning framework is proposed by adapting the Latent Dirichlet Allocation model to the citation network, and a novel inheritance topic model is developed; the results clearly show that citations can help to understand topic evolution better.

Mining Algorithm Roadmap in Scientific Publications

This work defines a new problem, mining algorithm roadmaps in scientific publications, and proposes a new weakly supervised method to build the roadmap, which is shown to outperform the baseline methods on the proposed task.

AceMap: A Novel Approach towards Displaying Relationship among Academic Literatures

A novel academic system, AceMap, analyzes big scholarly data and presents the results through a "map" approach. It integrates several algorithms from network analysis and data mining and displays the information in a clear and intuitive way, aiming to help researchers in their work.

Automatic Generation of Related Work Sections in Scientific Papers: An Optimization Approach

The proposed Automatic Related Work Generation system called ARWG first exploits a PLSA model to split the sentence set of the given papers into different topic-biased parts, and then applies regression models to learn the importance of the sentences.

Slowed canonical progress in large fields of science

Significance: The size of scientific fields may impede the rise of new ideas. Examining 1.8 billion citations among 90 million papers across 241 subjects, we find a deluge of papers does not lead to

Automatic labeling of multinomial topic models

Probabilistic approaches to automatically labeling multinomial topic models in an objective way are proposed and can be applied to labeling topics learned through all kinds of topic models such as PLSA, LDA, and their variations.

SciBERT: A Pretrained Language Model for Scientific Text

SciBERT leverages unsupervised pretraining on a large multi-domain corpus of scientific publications to improve performance on downstream scientific NLP tasks and demonstrates statistically significant improvements over BERT.

A deep learning classifier for sentence classification in biomedical and computer science abstracts

A novel deep learning approach, based on a convolutional layer and a bidirectional gated recurrent unit, is proposed to classify sentences of abstracts in order to support scientific database querying, summarize relevant literature, and assist in the writing of new abstracts.

Text Summarization with Pretrained Encoders

This paper introduces a novel document-level encoder based on BERT, which is able to express the semantics of a document and obtain representations for its sentences. It also proposes a new fine-tuning schedule that adopts different optimizers for the encoder and the decoder as a means of alleviating the mismatch between the two.
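
The idea of giving the encoder and the decoder separate optimizers can be illustrated with two independent learning-rate schedules. The following is a minimal sketch, assuming a warmup followed by inverse-square-root decay; the function names, the schedule shape, and every numeric value are illustrative assumptions and are not stated in the summary above.

```python
from typing import Dict


def warmup_inverse_sqrt_lr(step: int, scale: float, warmup: int) -> float:
    """Learning rate that rises linearly for `warmup` steps, then decays as step**-0.5."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return scale * min(step ** -0.5, step * warmup ** -1.5)


# Hypothetical settings (assumptions for illustration only): the pretrained
# encoder gets a smaller scale and a longer warmup than the decoder, which
# is trained from scratch.
SCHEDULES = {
    "encoder": {"scale": 2e-3, "warmup": 20_000},
    "decoder": {"scale": 1e-1, "warmup": 10_000},
}


def learning_rates(step: int) -> Dict[str, float]:
    """Return the learning rate of each component at a given training step."""
    return {
        name: warmup_inverse_sqrt_lr(step, **cfg)
        for name, cfg in SCHEDULES.items()
    }


if __name__ == "__main__":
    for step in (1_000, 10_000, 20_000, 50_000):
        rates = learning_rates(step)
        print(f"step={step:>6}  encoder={rates['encoder']:.2e}  decoder={rates['decoder']:.2e}")
```

In a training loop, each rate would be fed to its own optimizer instance, so the pretrained encoder changes slowly while the decoder can take larger steps.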