Wenxin Liang

Recently, more and more data are being published and exchanged as XML on the Internet. However, different XML data sources might contain the same data but have different structures. An efficient method is therefore required to integrate such XML data sources so that users can conveniently access and acquire more complete and useful information. The tree…
Seed sets are of significant importance for trust-propagation-based anti-spamming algorithms, e.g., TrustRank. Conventional approaches require manual evaluation to construct a seed set, which keeps the seed set small, since constructing a very large seed set manually would cost too much and may even be impossible. The small-sized seed…
The Twitter social network has become a target platform for both promoters and spammers to disseminate their target messages. There are a large number of campaigns containing coordinated spam or promoting accounts on Twitter, which are more harmful than traditional methods such as email spamming. Since traditional solutions mainly check individual…
Propagating trust/distrust from a set of seed (good/bad) pages to the entire Web has been widely used to combat Web spam. It has been mentioned that a combined use of good and bad seeds can lead to better results. However, little work is known to have realized this insight successfully. A serious issue of existing algorithms is that trust/distrust is…
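As a rough illustration of what propagating trust from good seeds means, the sketch below runs a generic seed-biased PageRank iteration in the style of TrustRank. It is not the algorithm proposed in the paper above: the function name `propagate_trust`, the damping value, the iteration count, and the toy graph are all assumptions made for the example, and distrust propagation from bad seeds is not modeled.

```python
def propagate_trust(out_links, seeds, damping=0.85, iterations=50):
    """Spread trust from seed pages along out-links (generic sketch).

    out_links: dict mapping a page to the list of pages it links to.
    seeds:     set of pages assumed to be good (hypothetical input).
    """
    # Collect every page that appears as a source or a target.
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)

    # Only seed pages receive the "teleport" mass, split evenly.
    seed_mass = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    trust = dict(seed_mass)

    for _ in range(iterations):
        # Each round restarts from the seeds with weight (1 - damping).
        nxt = {n: (1.0 - damping) * seed_mass[n] for n in nodes}
        for src in nodes:
            targets = out_links.get(src, [])
            if not targets:
                continue  # pages without out-links simply drop their trust
            share = damping * trust[src] / len(targets)
            for dst in targets:
                nxt[dst] += share
        trust = nxt
    return trust


# Toy graph: a trusted chain and a separate two-page link farm.
graph = {
    "good": ["a"],
    "a": ["b"],
    "spam": ["spam2"],
    "spam2": ["spam"],
}
scores = propagate_trust(graph, seeds={"good"})
```

In this toy graph, trust decays with distance from the seed, and the link farm receives none because no trusted page links into it.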
In this paper, we propose an effective method for segmenting large XML documents into independent meaningful subtrees based on two syntactic segmentation rates: vertical segmentation rate and horizontal segmentation rate. In the proposed method, we use DO-VLEI code to calculate the required parameters for the subtree segmentation. We conduct experiments to…
In this paper, we propose two methods that exploit path information, a direct-parent based method and a full-path based method, for syntax-based XML subtree matching in RDBs. For each proposed method, we discuss two ways of using the path information. One is to use the path information after matching the leaf nodes. The other is to use the path information…
A number of XML labeling methods have been proposed to store XML documents in relational databases. However, they share a weak point: insertion operations. We propose the variable length endless insertable (VLEI) code and apply it to XML labeling to reduce the cost of insertion operations. Results of our experiments indicate that a combination of…