Houping Xiao

The recent proliferation of human-carried mobile devices has given rise to crowd sensing systems. However, the sensory data provided by individual participants are usually not reliable. To identify truthful values from the crowd sensing data, the topic of truth discovery, whose goal is to estimate user quality and infer truths through quality-aware data ...
Driven by the proliferation of sensor-rich mobile devices, crowd sensing has emerged as a new paradigm of gathering information about the physical world. In crowd sensing applications, user observations are usually unevenly distributed across the monitored entities, and this gives rise to two major challenges: redundancy and sparsity. On one hand, ...
The recent proliferation of human-carried mobile devices has given rise to mobile crowd sensing (MCS) systems that outsource the collection of sensory data to the public crowd equipped with various mobile devices. A fundamental issue in such systems is to effectively <i>incentivize worker participation</i>. However, instead of being an isolated module, the ...
The demand for automatic extraction of true information (i.e., truths) from conflicting multi-source data has soared recently. A variety of <i>truth discovery</i> methods have achieved great success by jointly estimating source reliability and truths. All existing truth discovery methods focus on providing a point estimator for each object's truth, but ...
In the information age, people can easily collect information about the same set of entities from multiple sources, among which conflicts are inevitable. This leads to an important task, <i>truth discovery</i>, i.e., to identify true facts (truths) via iteratively updating truths and source reliability. However, the convergence to the truths is never ...
A vast ocean of data is collected every day, and numerous applications call for the extraction of actionable insights from data. One important task is to detect untrustworthy information because such information usually indicates critical, unusual, or suspicious activities. In this paper, we study the important problem of detecting untrustworthy information ...
Drug side effects, the fourth leading cause of death in the United States, have become a worldwide public health concern. The pharmaceutical industry has devoted tremendous effort to identifying drug side effects during drug development. However, it is impractical to identify all of them. Fortunately, drug side effects can also be reported on ...
In this paper, we investigate the problem of identifying inconsistent hosts in large-scale enterprise networks by mining multiple views of temporal data collected from the networks. The time-varying behavior of hosts is typically consistent across multiple views, and thus hosts that exhibit inconsistent behavior are possible anomalous points to be further ...
Sequential data modeling has received growing interest due to its impact on real-world problems. Sequential data is ubiquitous: financial transactions, advertising conversions, and disease evolution are all examples. A long-standing challenge in sequential data modeling is how to capture the strong hidden correlations among complex features in ...
Diabetes is a serious disease affecting a large number of people. Although there is no cure for diabetes, it can be managed. In particular, with advances in sensor technology, large amounts of data could improve diabetes management if properly mined. However, the observed behavioral data usually contain noise or errors, which poses challenges ...