Graph Regularized Transductive Classification on Heterogeneous Information Networks
This paper considers the transductive classification problem on heterogeneous networked data which share a common topic and proposes a novel graph-based regularization framework, GNetMine, to model the link structure in information networks with arbitrary network schema and arbitrary number of object/link types.
Generating High Quality Proposition Banks for Multilingual Semantic Role Labeling
This paper presents a two-stage method to enable the construction of SRL models for resource-poor languages by exploiting monolingual SRL and multilingual parallel data and shows that this method outperforms existing methods.
A phrase mining framework for recursive construction of a topical hierarchy
This paper proposes an algorithm for recursively constructing a hierarchy of topics from a collection of content-representative documents, characterizing each topic in the hierarchy by an integrated ranked list of mixed-length phrases.
Automatic Construction and Ranking of Topical Keyphrases on Collections of Short Documents
A framework for topical keyphrase generation and ranking, based on the output of a topic model run on a collection of short documents, is introduced; it can directly compare and rank phrases of different lengths.
Active Learning for BERT: An Empirical Study
The results demonstrate that AL can boost BERT performance, especially in the most realistic scenario in which the initial set of labeled examples is created using keyword-based queries, resulting in a biased sample of the minority class.
Ranking-based classification of heterogeneous information networks
A novel ranking-based iterative classification framework that not only generates more accurate classes than state-of-the-art classification methods on networked data, but also provides a meaningful ranking of objects within each class, serving as a more informative view of the data than traditional classification.
The Joint Inference of Topic Diffusion and Evolution in Social Communities
A novel and principled probabilistic model is proposed which casts this task as a joint inference problem, considers textual documents, social influences, and topic evolution in a unified way, and performs significantly better than existing methods.
Large-Scale Spectral Clustering on Graphs
The key idea is to repeatedly generate a small number of "supernodes" connected to the regular nodes, in order to compress the original graph into a sparse bipartite graph.
Creation and Interaction with Large-scale Domain-Specific Knowledge Bases
The Content Services system that provides cloud services for creating and querying high-quality domain-specific knowledge bases by analyzing and integrating multiple (un/semi)structured content sources is demonstrated.
Clustering in the Creative Industries: Insights from the Origins of Computer Software
We use several different sources (a 1970 Roster of Organizations in Data Processing and the 1960 and 1970 Censuses of Population) to study patterns of geographic clustering at the very origins of the ...