• Publications
An Overview of Microsoft Academic Service (MAS) and Applications
A knowledge-driven, highly interactive dialog that seamlessly combines reactive search with proactive suggestion, and a proactive heterogeneous entity recommendation, are demonstrated.
Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec
The NetMF method offers significant improvements over DeepWalk and LINE for conventional network mining tasks and provides theoretical connections between skip-gram-based network embedding algorithms and the theory of the graph Laplacian.
CORD-19: The COVID-19 Open Research Dataset
The mechanics of dataset construction are described, highlighting challenges and key design decisions, along with an overview of how CORD-19 has been used and of several shared tasks built around the dataset.
Heterogeneous Graph Transformer
The proposed HGT model consistently outperforms all the state-of-the-art GNN baselines by 9%–21% on various downstream tasks, and a heterogeneous mini-batch graph sampling algorithm, HGSampling, is designed for efficient and scalable training.
GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training
Graph Contrastive Coding (GCC), a self-supervised graph neural network pre-training framework, is designed to capture the universal network topological properties across multiple networks, leveraging contrastive learning to empower graph neural networks to learn intrinsic and transferable structural representations.
ERD'14: entity recognition and disambiguation challenge
The pooling technique is adapted to address the difficulties of gathering annotations for the entity linking task, and the task definition, issues encountered during annotation, and a detailed analysis of all the participating systems are provided.
Auditory representations of acoustic signals
An analytically tractable framework is presented to describe mechanical and neural processing in the early stages of the auditory system. Algorithms are developed to assess the integrity of the
Clickage: towards bridging semantic and intent gaps via mining click logs of search engines
It is argued that the massive amount of click data from commercial search engines provides a data set that is uniquely suited to bridging the semantic and intent gaps, and preliminary studies on the power of large-scale click data are presented.
GPT-GNN: Generative Pre-Training of Graph Neural Networks
The GPT-GNN framework initializes GNNs by generative pre-training, introducing a self-supervised attributed graph generation task to pre-train a GNN so that it can capture the structural and semantic properties of the graph.
Spectral shape analysis in the central auditory system
A model of spectral shape analysis in the central auditory system is developed based on neurophysiological mappings in the primary auditory cortex and on results from psychoacoustical experiments in human subjects, showing that this representation is equivalent to performing an affine wavelet transform of the spectral pattern.