Analogical Reasoning on Chinese Morphological and Semantic Relations
TLDR
We investigate the linguistic regularities beneath Chinese, and propose an analogical reasoning task based on 68 morphological relations and 28 semantic relations.
  • 134
  • 10
Fast Single-Pair SimRank Computation
TLDR
SimRank is an intuitive and effective measure for link-based similarity that scores similarity between two nodes as the first-meeting probability of two random surfers based on the random surfer model.
  • 59
  • 8
Big data challenge: a data management perspective
TLDR
Virtually everyone, from big Web companies to traditional enterprises to physical science researchers, is either already experiencing or anticipating unprecedented growth in the amount of data available in their world, along with new opportunities and great untapped value.
  • 209
  • 7
Combining user preferences and user opinions for accurate recommendation
TLDR
We propose a novel recommendation algorithm based on the characteristics of online reviews to effectively extract the user's opinion from a customer review written in Chinese.
  • 100
  • 6
Detecting Event Rumors on Sina Weibo Automatically
TLDR
Sina Weibo has become one of the most popular social networks in China.
  • 72
  • 6
Structure Based User Identification across Social Networks
TLDR
We propose an unsupervised scheme, termed the Friend Relationship-based User Identification algorithm without Prior knowledge (FRUI-P), for identifying anonymous identical users across platforms.
  • 55
  • 5
Fast and Scalable Distributed Set Similarity Joins for Big Data Analytics
TLDR
We propose FS-Join, a highly scalable MapReduce-based string similarity join algorithm based on a novel partitioning technique (Vertical Partitioning).
  • 35
  • 5
Zero-shot Image Tagging by Hierarchical Semantic Embedding
TLDR
This paper proposes Hierarchical Semantic Embedding (HierSE), a simple model that exploits the WordNet hierarchy to improve label embedding and consequently image embedding.
  • 50
  • 5
Partially Supervised Text Classification with Multi-Level Examples
TLDR
A novel multi-level example based learning method for partially supervised text classification is proposed, which can make full use of all unlabeled examples.
  • 11
  • 4
A novel Bayesian classification for uncertain data
TLDR
We apply probabilistic and statistical theory to uncertain data and develop a novel method to calculate the conditional probabilities in Bayes' theorem for classification of uncertain data.
  • 38
  • 3