Learn More
In crowdsourced data aggregation task, there exist conflicts in the answers provided by large numbers of sources on the same set of questions. The most important challenge for this task is to estimate source reliability and select answers that are provided by high-quality sources. Existing work solves this problem by simultaneously estimating sources'(More)
Topical Influential User Analysis (TIUA) is an important technique in Twitter. Existing techniques neglected relationship strength between users, which is a crucial aspect for TIUA. For modeling relationship strength, interaction frequency between users has not been considered in previous works. In this paper, we firstly introduce a poisson regression-based(More)
The demand for automatic extraction of true information (i.e., truths) from conflicting multi-source data has soared recently. A variety of <i>truth discovery</i> methods have witnessed great successes via jointly estimating source reliability and truths. All existing truth discovery methods focus on providing a point estimator for each object's truth, but(More)
In the past decade, commercial crowdsourcing platforms have revolutionized the ways of classifying and annotating data, especially for large datasets. Obtaining labels for a single instance can be inexpensive, but for large datasets, it is important to allocate budgets wisely. With limited budgets, requesters must trade-off between the quantity of labeled(More)
Discovering topics in short texts, such as news titles and tweets, has become an important task for many content analysis applications. However, due to the lack of rich context information in short texts, the performance of conventional topic models on short texts is usually unsatisfying. In this paper, we propose a novel topic model for short text corpus(More)
In the age of big data, information for the same entity can be obtained from different sources, which is inevitably conflicting. Therefore, aggregation methods are needed to identify the trustworthy information from such conflicting data. Truth discovery, which improves the aggregation results by estimating source trustworthiness and discovering truths(More)
In crowdsourced data aggregation task, there exist conflicts in the answers provided by large numbers of sources on the same set of questions. The most important challenge for this task is to estimate source reliability and select answers that are provided by high-quality sources. Existing work solves this problem by simultaneously estimating sources'(More)
An algorithm named SMHP is proposed, which aims at improving the efficiency of Topic Detection. In SMHP, a T-MI-TFIDF model is designed by introducing mutual information (MI) and enhancing the weight of terms in the title. Then VSM is constructed according to terms' weight, and the dimension is reduced by combining H-TOPN and PCA. Then topics are grouped(More)
  • 1