Jiakai Xiao

Feature selection methods aim to obtain, from the original features, the subset that gives the most accurate prediction. So far, supervised and unsupervised feature selection methods have been discussed and developed separately. However, the two kinds of methods can be combined into a hybrid feature selection method for some data sets. …
This paper proposes a method to optimize the Nonorthogonal Space Distance (NoSD) using the Particle Swarm Optimization (PSO) algorithm, so as to increase estimation accuracy in analogy-based software cost estimation. NoSD is a measure of project similarity that uses a matrix defined from mutual information to take both feature redundancies and …
Among the different types of feature selection algorithms, feature clustering is an emerging subset generation paradigm. In this paper, a Minimum spanning tree based Feature Clustering (MFC) algorithm is proposed. The algorithm uses an information-theoretic measure, the variation of information, as the feature redundancy and relevance metric. At …
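As a rough illustration of the metric named in this abstract, the variation of information between two discrete features can be computed from entropies as VI(X, Y) = 2 H(X, Y) - H(X) - H(Y). The sketch below is not the MFC algorithm itself; the function names `entropy` and `variation_of_information` are hypothetical, and it assumes the features are already discretized.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def variation_of_information(x, y):
    """VI(X, Y) = 2*H(X, Y) - H(X) - H(Y) for two equal-length sequences.

    Assumption: x and y hold discrete (already binned) feature values.
    """
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    joint = list(zip(x, y))  # empirical joint distribution of (x_i, y_i)
    return 2 * entropy(joint) - entropy(x) - entropy(y)
```

A value of 0 means the two features carry the same information; larger values mean they share less, which is what makes the quantity usable as a distance between features.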
This research originated in a school-enterprise cooperation program that aims to improve the efficiency of data reconciliation between two large-scale data sources. This paper presents three typical algorithms: the standard Bloom filter (BF), the counting Bloom filter (CBF), and the invertible Bloom filter (IBF). With the purpose of evaluating their performance, mainly on …
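To make the difference between the first two structures concrete, here is a minimal sketch of a counting Bloom filter, which replaces the single bits of a standard Bloom filter with small counters so that items can also be removed. The class name, the default sizes, and the salted SHA-256 hashing scheme are assumptions made for illustration, not details taken from the paper.

```python
import hashlib


class CountingBloomFilter:
    """Illustrative counting Bloom filter (hypothetical parameters)."""

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size  # a standard BF would store bits here

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item):
        # Only meaningful for items that were added; a false positive
        # here would corrupt the counters of other items.
        if item in self:
            for pos in self._positions(item):
                self.counters[pos] -= 1

    def __contains__(self, item):
        # May return a false positive, never a false negative.
        return all(self.counters[pos] > 0 for pos in self._positions(item))
```

Membership tests use `in`, for example `"record-42" in cbf` after `cbf.add("record-42")`. The invertible Bloom filter additionally stores key and checksum sums per cell so that stored items can be listed back, which this sketch does not attempt.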
Traditional graph clustering methods perform poorly on real-world power-law graphs out of core. To tackle this challenge, we propose an algorithm to cluster such large power-law graphs when memory is limited. In the proposed method, clusters (connected components) are formed by removing the highest-degree nodes (hubs) from the graph. In order …
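The hub-removal step described here can be sketched in a few lines. This is an in-memory toy version only: the out-of-core handling is not shown, and since the abstract is cut off, what the paper does with the removed hubs afterwards is not known. The function name `cluster_without_hubs` and the parameter `num_hubs` are hypothetical.

```python
from collections import defaultdict


def cluster_without_hubs(edges, num_hubs):
    """Drop the num_hubs highest-degree nodes and return the connected
    components of the remaining graph as a list of sets."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Rank nodes by degree; ties are broken by name for determinism.
    ranked = sorted(adj, key=lambda n: (-len(adj[n]), str(n)))
    hubs = set(ranked[:num_hubs])

    seen, clusters = set(), []
    for start in adj:
        if start in hubs or start in seen:
            continue
        # Iterative depth-first search that never enters a hub.
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adj[node]:
                if neighbor not in hubs and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(component)
    return clusters
```

On a small graph where one node `h` links to `a`, `b`, `c`, and `d`, and the only other edges are `a-b` and `c-d`, removing the single top-degree node splits the graph into the two clusters `{a, b}` and `{c, d}`.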