Tengjiao Wang

Labeling text data is time-consuming but essential for automatic text classification. In particular, manually creating multiple labels for each document may become impractical when a very large amount of data is needed to train multi-label text classifiers. To minimize human labeling effort, we propose a novel multi-label active learning ...
With the rapid growth of large graphs, we can no longer assume that a graph fits entirely in memory, so disk-based graph operations are inevitable. In this paper, we take shortest path discovery as an example to investigate the technical issues that arise when leveraging the existing infrastructure of relational databases (RDB) for graph data management. ...
With the advent of cloud computing, it has become desirable to use the cloud to efficiently process complex operations on large graphs without compromising their sensitive information. This paper studies shortest distance computing in the cloud, which aims at the following goals: i) protecting outsourced graphs from neighborhood attacks, ii) ...
Previous studies have shown that mining closed patterns provides more benefits than mining the complete set of frequent patterns, since closed pattern mining leads to more compact results and more efficient algorithms. This is especially useful in a data stream environment, where memory and computation power are major concerns. This paper studies the problem of mining ...
Density-based clustering is a class of clustering methods that can discover clusters of arbitrary shape and is insensitive to noisy data. As data sets grow larger, the efficiency of data mining algorithms becomes increasingly important. In this paper, we present a new fast clustering algorithm called CURD, which stands for Clustering Using References ...
In this paper, we develop a convolutional neural network for stance detection in tweets. According to the official results, our system ranks first on subtask B (among 9 teams) and second on subtask A (among 19 teams) on the Twitter test set of SemEval-2016 Task 6. The main contributions of our work are as follows. We design a "vote scheme" for prediction instead ...
With the wide application of large-scale graph data such as social networks, the problem of finding the top-k shortest paths is attracting increasing attention. This paper focuses on the discovery of the top-k simple shortest paths (paths without loops). The well-known algorithm for this problem is due to Yen, and the provided worst-case bound ...
Microblogs contain the most up-to-date and abundant opinion information on current events. Aspect-based opinion mining is a good way to obtain a comprehensive summary of events. The most popular aspect-based opinion mining models are applied to products and services. However, existing models are not suitable for event mining. In this paper we ...
This paper uses shortest path discovery to study efficient relational approaches to graph search queries. We first abstract three enhanced relational operators, based on which we introduce an FEM framework to bridge the gap between relational operations and graph operations. We show new features introduced by recent SQL standards, such as window ...