Data uncertainty is an inherent property in various applications due to reasons such as outdated sources or imprecise measurement. When data mining techniques are applied to these data, their uncertainty has to be considered to obtain high quality results. We present UK-means clustering, an algorithm that enhances the K-means algorithm to handle data …
We study the problem of clustering data objects whose locations are uncertain. A data object is represented by an uncertainty region over which a probability density function (pdf) is defined. One method to cluster uncertain objects of this sort is to apply the UK-means algorithm, which is based on the traditional K-means algorithm. In UK-means, an object …
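The data model in this abstract (an object as an uncertainty region with a pdf) can be illustrated with a small sketch. The abstract is truncated before it states how UK-means assigns objects, so everything below is an assumption made for illustration: each object is approximated by points sampled from its pdf, assignment uses a Monte Carlo estimate of the expected Euclidean distance to each cluster center, and centers are updated as the mean of the member objects' sample means. The names `expected_distance`, `uk_means_sketch`, and `sample_square` are hypothetical and do not come from the paper.

```python
import math
import random


def expected_distance(samples, center):
    """Monte Carlo estimate of E[dist(X, center)] from points sampled from an object's pdf."""
    return sum(math.dist(p, center) for p in samples) / len(samples)


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def uk_means_sketch(objects, centers, iterations=10):
    """K-means-style loop over uncertain objects (illustrative assumption, not the authors' code).

    objects: list of sample lists, one list of points per uncertain object.
    centers: initial cluster centers.
    """
    centers = list(centers)
    labels = []
    for _ in range(iterations):
        # Assignment step: pick the center with the smallest expected distance.
        labels = [
            min(range(len(centers)), key=lambda j: expected_distance(samples, centers[j]))
            for samples in objects
        ]
        # Update step: move each center to the mean of its members' sample means.
        for j in range(len(centers)):
            members = [centroid(s) for s, label in zip(objects, labels) if label == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = centroid(members)
    return labels, centers


def sample_square(cx, cy, half_width, n, rng):
    """Draw n points from a uniform pdf over a square uncertainty region."""
    return [
        (cx + rng.uniform(-half_width, half_width), cy + rng.uniform(-half_width, half_width))
        for _ in range(n)
    ]


rng = random.Random(0)
objects = [
    sample_square(0.0, 0.0, 1.0, 200, rng),
    sample_square(0.5, 0.5, 1.0, 200, rng),
    sample_square(10.0, 10.0, 1.0, 200, rng),
    sample_square(9.5, 9.5, 1.0, 200, rng),
]
labels, centers = uk_means_sketch(objects, [(1.0, 1.0), (9.0, 9.0)])
```

Sampling is only one way to approximate a pdf; it is used here because it keeps the sketch short. The cost of evaluating many expected distances per iteration is the kind of overhead a pruning method would aim to reduce.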
A major challenge facing all law-enforcement and intelligence-gathering organizations is accurately and efficiently analyzing the growing volumes of crime data. Detecting cybercrime can likewise be difficult because busy network traffic and frequent online transactions generate large amounts of data, only a small portion of which relates to illegal …
Blogs, often treated as the equivalent of online personal diaries, have become one of the fastest growing types of Web-based media. Everyone is free to express their opinions and emotions very easily through blogs. In the blogosphere, many communities have emerged, which include hate groups and racists that are trying to share their ideology, express their …
© 2005 Wiley Periodicals, Inc. Published online 31 August 2005 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/asi.20210. … that automatically collect Web pages and create an index that can be searched by users (Chau & Chen, 2003b). As these general-purpose search engines do not restrict themselves to particular domains or specialties, they …
The authors investigated censorship practices and the use of microblogs—or weibos, in Chinese—using 111 million microblogs collected between 1 January and 30 June 2012. To better control for alternative explanations for censorship decisions attributable to an individual's characteristics and choices, they used a matched case-control(More)
The Web has plenty of useful resources, but its dynamic, unstructured nature makes them difficult to locate. Search engines help, but the number of Web pages now exceeds two billion, making it difficult for general-purpose engines to maintain comprehensive, up-to-date search indexes. Moreover, as the Web grows ever larger, so does information overload in …
While the Web provides a lot of useful information to managers and decision makers in organizations for decision support, it requires a lot of time and cognitive effort for users to sift through a search result list returned by search engines to find useful information. Previous research in information visualization has shown that visualization techniques …