Entity Set Search of Scientific Literature: An Unsupervised Ranking Approach

@article{Shen2018EntitySS,
  title={Entity Set Search of Scientific Literature: An Unsupervised Ranking Approach},
  author={Jiaming Shen and Jinfeng Xiao and Xinwei He and Jingbo Shang and Saurabh Sinha and Jiawei Han},
  journal={The 41st International ACM SIGIR Conference on Research \& Development in Information Retrieval},
  year={2018}
}
  • Jiaming Shen, Jinfeng Xiao, Jiawei Han
  • Published 29 April 2018
  • Computer Science
  • The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval
Literature search is critical for any scientific research. Different from Web or general domain search, a large portion of queries in scientific literature search are entity-set queries, that is, multiple entities of possibly different types. Entity-set queries reflect users' need to find documents that contain multiple entities and reveal inter-entity relationships, and thus pose non-trivial challenges to existing search algorithms that model each entity separately. However, entity-set… 

SetSearch+: Entity-Set-Aware Search and Mining for Scientific Literature
TLDR
SetSearch+ first leverages a data-driven text mining pipeline to extract typed entities for building entity-enhanced indices, and adopts a novel entity-set-aware ranking model for online document retrieval, which captures entity type information and relations among entity sets.
Mining Entity Synonyms with Efficient Neural Set Generation
TLDR
A new framework is proposed, named SynSetMine, that efficiently generates entity synonym sets from a given vocabulary, using example sets from external knowledge bases as distant supervision, and demonstrates both effectiveness and efficiency of SynSetMine for mining entity synonym sets.
Query-Specific Knowledge Summarization with Entity Evolutionary Networks
TLDR
To facilitate such a novel insightful search system, SetEvolve is proposed, which is a unified framework based on nonparanormal graphical models for evolutionary network construction from large text corpora.
A supervised and distributed framework for cold-start author disambiguation in large-scale publications
TLDR
This paper focuses on the cold-start disambiguation task with homonymous author names, i.e., distinguishing publications written by authors with identical names, and presents a supervised framework named DND (abbreviation for Distributed Framework for Name Disambiguation) to solve the author disambiguation problem efficiently.
SynSetExpan: An Iterative Framework for Joint Entity Set Expansion and Synonym Discovery
TLDR
This work hypothesizes that these two tasks are tightly coupled because two synonymous entities tend to have similar likelihoods of belonging to various semantic classes, and designs SynSetExpan, a novel framework that enables two tasks to mutually enhance each other.
EVIDENCEMINER: Textual Evidence Discovery for Life Sciences
TLDR
EVIDENCEMINER is a web-based system that lets users query a natural language statement and automatically retrieves textual evidence from a background corpus for life sciences, supported by novel data-driven methods for distantly supervised named entity recognition and open information extraction.
Knowledge Graphs: An Information Retrieval Perspective
TLDR
An overview of the literature on knowledge graphs (KGs) in the context of information retrieval (IR) is provided and how KGs can be employed to support IR tasks, including document and entity retrieval is discussed.
TaxoEnrich: Self-Supervised Taxonomy Completion via Structure-Semantic Representations
TLDR
Extensive experiments on four large real-world datasets from different domains show that TaxoEnrich achieves the best performance among all evaluation metrics and outperforms previous state-of-the-art methods by a large margin.
HierCon: Hierarchical Organization of Technical Documents Based on Concepts
TLDR
This work studies the hierarchical organization of technical documents, where given a set of documents and a hierarchy of categories, the goal is to assign documents to their corresponding categories by leveraging semantic information from concepts.
Discovering Hypernymy in Text-Rich Heterogeneous Information Network by Exploiting Context Granularity
TLDR
This work develops a new framework, named HyperMine, that exploits multi-granular contexts and combines signals from both text and network without human labeled data, and extends the definition of "context" to the scenario of text-rich HIN.

References

SHOWING 1-10 OF 48 REFERENCES
Latent entity space: a novel retrieval approach for entity-bearing queries
TLDR
Experimental results over TREC collections show that the proposed LES approach is effective in capturing latent semantic content and can significantly improve the search accuracy of several state-of-the-art retrieval models for entity-bearing queries.
Entity query feature expansion using knowledge base links
TLDR
A new technique, called entity query feature expansion (EQFE), which enriches the query with features from entities and their links to knowledge bases, including structured attributes and text, finds that entity-based feature expansion results in significant improvements in retrieval effectiveness over state-of-the-art text expansion approaches.
Query Expansion with Freebase
TLDR
A supervised model combines information derived from Freebase descriptions and categories to select terms that are effective for query expansion, and finds some methods have better win/loss ratios than baseline algorithms, with 50% fewer queries damaged.
On Type-Aware Entity Retrieval
TLDR
This paper performs a thorough analysis of three main aspects: the choice of type taxonomy, the representation of hierarchical type information, and the combination of type-based and term-based similarity in the retrieval model.
Named entity recognition in query
TLDR
Experimental results show that the proposed method based on WS-LDA can accurately perform NERQ, and outperform the baseline methods.
Word-Entity Duet Representations for Document Ranking
TLDR
Evaluation results on TREC Web Track ad-hoc task demonstrate that all of the four-way interactions in the duet are useful, the attention mechanism successfully steers the model away from noisy entities, and together they significantly outperform both word-based and entity-based learning to rank systems.
Bag-of-Entities Representation for Ranking
TLDR
This paper presents a new bag-of-entities representation for document ranking, with the help of modern knowledge bases and automatic entity linking, and demonstrates that current entity linking systems can provide sufficient coverage of the general domain search task.
Query dependent pseudo-relevance feedback based on wikipedia
TLDR
This work proposes and studies the effectiveness of three methods for expansion term selection, each modeling the Wikipedia-based pseudo-relevance information from a different perspective, and incorporates the expansion terms into the original query and uses language modeling IR to evaluate these methods.
Document Retrieval Using Entity-Based Language Models
We address the ad hoc document retrieval task by devising novel types of entity-based language models. The models utilize information about single terms in the query and documents as well as term
Explicit Semantic Ranking for Academic Search via Knowledge Graph Embedding
TLDR
Explicit Semantic Ranking is introduced, a new ranking technique that leverages knowledge graph embedding, represents queries and documents in the entity space, and ranks them based on their semantic connections from their knowledge graph embedding.