Xuefeng Xian

Big data from the Internet of Things may create a big challenge for data classification. Most active learning approaches select either uncertain or representative unlabeled instances to query their labels. Although several active learning algorithms have been proposed to combine the two criteria for query selection, they are usually ad hoc in finding…
For many applications, finding rare instances or outliers can be more interesting than finding common patterns. Existing work in outlier detection never considers the context of the deep web. In this paper, we argue that, for many scenarios, it is more meaningful to detect outliers over the deep web. In the context of the deep web, users must submit queries through a…
An ever-increasing amount of valuable information is stored in web databases, "hidden" behind search interfaces. A new application area is emerging for information retrieval and integration. There may be hundreds or thousands of web databases providing data of relevance to a particular domain on the web. So a primary challenge to internet-scale hidden web…
In internet-scale hidden web data integration, the problem of source (web database) selection has been a primary challenge. This paper proposes a novel approach to web database selection for internet-scale hidden web data integration. This approach is based on a benefit function that evaluates how much benefit the web database brings to a given…
With the rapid development of the Web, more and more structured web databases are available for users to access. At the same time, domain searchers often have difficulty finding the right web database. In this paper, we study how to build an effective web database integration system, with the aim of making the system contain as much important…