An Overview of Microsoft Academic Service (MAS) and Applications

@article{Sinha2015AnOO,
  title={An Overview of Microsoft Academic Service (MAS) and Applications},
  author={Arnab Sinha and Zhihong Shen and Yang Song and Hao Ma and Darrin Eide and Bo-June Paul Hsu and Kuansan Wang},
  journal={Proceedings of the 24th International Conference on World Wide Web},
  year={2015}
}
In this paper we describe a new release of a Web scale entity graph that serves as the backbone of Microsoft Academic Service (MAS), a major production effort with a broadened scope to the namesake vertical search engine that has been publicly available since 2008 as a research prototype. At the core of MAS is a heterogeneous entity graph comprised of six types of entities that model the scholarly activities: field of study, author, institution, paper, venue, and event. In addition to obtaining… 

Microsoft Academic Graph: When experts are not enough
TLDR
The design, schema, and technical and business motivations behind MAG are described, and how MAG can be used in analytics, search, and recommendation scenarios is elaborated.
Integration of Scholarly Communication Metadata Using Knowledge Graphs
TLDR
This work conducts an experimental study on an SCM-KG that merges scientific research metadata from the DBLP bibliographic source and the Microsoft Academic Graph, and demonstrates the benefits of exploiting semantic web technology to reconcile data about authors, papers, and conferences.
GRASP: Graph-based Mining of Scientific Papers
TLDR
This paper introduces GRASP, a search engine that retrieves scientific papers starting from a sub-graph query provided by the user, offering a list of time papers based on the query and a graph with papers and authors as vertices and edges being cited and published-by.
On the Use of Web Search to Improve Scientific Collections
TLDR
This paper proposes a novel search-driven framework for acquiring documents for scientific portals using publicly-available research paper titles and author names used as queries to a Web search engine.
The Microsoft Academic Knowledge Graph enhanced: Author name disambiguation, publication classification, and embeddings
TLDR
Methods for enhancing the Microsoft Academic Knowledge Graph (MAKG), a recently published large-scale knowledge graph containing metadata about scientific publications and associated authors, venues, and affiliations, are presented.
Improving Access to Scientific Literature with Knowledge Graphs
TLDR
A scholarly knowledge graph can be used to give a condensed overview on the state-of-the-art addressing a particular research quest, for example as a tabular comparison of contributions according to various characteristics of the approaches.
A Recommendation System Based on Hierarchical Clustering of an Article-Level Citation Network
TLDR
This paper introduces EigenfactorRecommends - a citation-based method for improving scholarly navigation that uses the hierarchical structure of scientific knowledge, making possible multiple scales of relevance for different users.
The Microsoft Academic Knowledge Graph: A Linked Data Source with 8 Billion Triples of Scholarly Data
TLDR
The Microsoft Academic Knowledge Graph (MAKG), a large RDF data set with over eight billion triples with information about scientific publications and related entities, such as authors, institutions, journals, and fields of study, is presented.
Building an Accessible, Usable, Scalable, and Sustainable Service for Scholarly Big Data
TLDR
This paper reviews the design, implementation, operational experience, and lessons of CiteSeerX, a real-world digital library search engine, and proposes a new design with a revised architecture and enhanced hardware and software infrastructure.
Developing a Temporal Bibliographic Data Set for Entity Resolution
TLDR
This paper describes the preparation of a temporal data set based on author profiles extracted from the Digital Bibliography and Library Project (DBLP) using the Microsoft Academic Graph to link temporal affiliation information for DBLP authors.

References

Efficient Name Disambiguation for Large-Scale Databases
TLDR
It is proved that by recasting transitivity as density reachability in DBSCAN, transitivity is guaranteed for core points.
Knowledge vault: a web-scale approach to probabilistic knowledge fusion
TLDR
The Knowledge Vault is a Web-scale probabilistic knowledge base that combines extractions from Web content (obtained via analysis of text, tabular data, page structure, and human annotations) with prior knowledge derived from existing knowledge repositories that computes calibrated probabilities of fact correctness.
Query Recommendation Using Query Logs in Search Engines
TLDR
A method is proposed that, given a query submitted to a search engine, suggests a list of related queries that are based on previously issued queries and can be issued by the user to the search engine to tune or redirect the search process.
ClusCite: effective citation recommendation by information network-based clustering
TLDR
A novel cluster-based citation recommendation framework, called ClusCite, which explores the principle that citations tend to be softly clustered into interest groups based on multiple types of relationships in the network, and learns group memberships for objects and the significance of relevance features for each interest group by solving a joint optimization problem.
Bing dialog model: intent, knowledge and user interaction
TLDR
Bing Dialog Model, Microsoft's decision engine, is designed to not just navigate users to a landing page through a blue link but to continue engaging with users to clarify intent and facilitate task completion.
Real life information retrieval: a study of user queries on the Web
We analyzed transaction logs of a set of 51,473 queries posed by 18,113 users of Excite, a major Internet search service. We provide data on: (i) queries --- the number of search terms, and the use
Efficient topic-based unsupervised name disambiguation
TLDR
This paper presents an efficient and effective two-stage approach to disambiguate person names within web pages and scientific documents and empirically addresses the issue of scalability by disambiguating authors in over 750,000 papers from the entire CiteSeer dataset.
Recommending citations for academic papers
TLDR
This work uses the text of previous literature as well as the citation graph that connects it to find relevant related material and finds an order of magnitude improvement in mean average precision as compared to a text similarity baseline.
Rise of the Rest: The Growing Impact of Non-Elite Journals
TLDR
The evolution of the impact of non-elite journals is examined to answer two questions: first, what fraction of the top-cited articles are published in non-elite journals and how has this changed over time; and second, now that finding and reading relevant articles in non-elite journals is about as easy as finding and reading articles in elite journals, researchers are increasingly building on and citing work published everywhere.
Are elite journals declining?
TLDR
This article examines citation patterns during the past 40 years of seven long-standing traditionally elite journals and six journals that have been increasing in importance during the past 20 years, to determine whether this diversification has also affected the handful of elite journals that are traditionally considered to be the best.