Y. Dora Cai

Real-time surveillance systems, telecommunication systems, and other dynamic environments often generate a tremendous (potentially infinite) volume of stream data: the volume is too large to be scanned multiple times. Much of such data resides at a rather low level of abstraction, whereas most analysts are interested in relatively high-level dynamic changes ...
Association rule mining often generates a huge number of rules, but a majority of them either are redundant or do not reflect the true correlation relationship among data objects. In this paper, we reexamine this problem and show that two interesting measures, all-confidence (denoted as α) and coherence (denoted as γ), both disclose genuine correlation ...
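The abstract above names the two measures without defining them. The following is a minimal sketch, assuming the definitions commonly used in the correlation-mining literature: all-confidence as the support of an itemset divided by the largest support of any single item in it, and coherence as the support of the itemset divided by the number of transactions containing at least one of its items. The function names and the toy transactions are hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, List


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> int:
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def all_confidence(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> float:
    """Assumed definition: sup(X) / max over items i in X of sup({i})."""
    max_item_support = max(support(frozenset([i]), transactions) for i in itemset)
    if max_item_support == 0:
        return 0.0
    return support(itemset, transactions) / max_item_support


def coherence(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> float:
    """Assumed definition: sup(X) / number of transactions sharing any item with X."""
    universe = sum(1 for t in transactions if itemset & t)
    if universe == 0:
        return 0.0
    return support(itemset, transactions) / universe


# Hypothetical toy data, for illustration only.
transactions = [
    frozenset({"a", "b"}),
    frozenset({"a", "b", "c"}),
    frozenset({"a"}),
    frozenset({"b", "c"}),
    frozenset({"c"}),
]
pair = frozenset({"a", "b"})
alpha = all_confidence(pair, transactions)
gamma = coherence(pair, transactions)
```

Under these assumed definitions, both values lie between 0 and 1, and adding an item to an itemset can never increase either value, which is the kind of property that allows pruning during mining.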
Real-time surveillance systems, network and telecommunication systems, and other dynamic processes often generate a tremendous (potentially infinite) volume of stream data. Effective analysis of such stream data poses great challenges to database and data mining researchers, due to its unique features, such as single-scan algorithm, multi-dimensional online ...
Massively Multiplayer Online Games (MMOGs) provide unique opportunities to investigate large social networks, such as player (working-group), trading, and communication (chat) networks. This paper presents a visualization tool, SocialMapExplorer, that allows users to explore these networks over temporal-geographic space. Implemented on the ...
Due to the huge volume and extreme complexity of online game data collections, selecting essential features for the analysis of massive game logs is not only necessary but also challenging. This study develops and implements a new XSEDE-enabled tool, FeatureSelector, which uses parallel processing techniques on high performance computers to perform ...
The Dark Energy Survey (DES) collaboration will study cosmic acceleration with a 5000 deg² griZY survey in the southern sky over 525 nights from 2011-2016. The DES data management (DESDM) system will be used to process and archive these data and the resulting science-ready data products. The DESDM system consists of an integrated archive, a processing ...
Grouping game players based on their online behaviors has attracted a lot of attention recently. However, due to the huge volume and extreme complexity of online game data collections, grouping players is a challenging task. This study has applied parallelized K-Means on Gordon, a supercomputer hosted at San Diego Supercomputer Center, to meet the ...
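For readers unfamiliar with the clustering method named above, here is a minimal serial sketch of standard K-Means (Lloyd's algorithm). It does not reproduce the study's parallelization or its setup on Gordon; the player feature vectors, function names, and the choice of the first k points as initial centroids are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points: Sequence[Point], centroids: Sequence[Point]) -> List[int]:
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]


def update(points: Sequence[Point], labels: Sequence[int],
           centroids: Sequence[Point]) -> List[Point]:
    """Move each centroid to the mean of its members; keep it if it has none."""
    new_centroids = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == c]
        if members:
            new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            new_centroids.append(old)
    return new_centroids


def k_means(points: Sequence[Point], k: int, max_iter: int = 100):
    """Alternate assignment and update until the labels stop changing."""
    centroids = [tuple(p) for p in points[:k]]  # assumed, deterministic initialization
    labels: List[int] = []
    for _ in range(max_iter):
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
        centroids = update(points, labels, centroids)
    return labels, centroids


# Hypothetical two-feature player vectors, for illustration only.
players = [(1.0, 2.0), (9.0, 8.0), (1.5, 1.8), (8.5, 9.0)]
labels, centroids = k_means(players, k=2)
```

The assignment step treats every point independently, so it is the part of the algorithm that is usually distributed across workers in parallel variants; whether the study split the work this way is not stated in the visible text.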
This study highlights the importance of considering gender and offline cultural context when working in virtual teams. To this end, we examine gender differences in performance and participation within virtual teams in a popular online game, drawing from behavioral game data from game servers in nine countries, each representing a distinct region of the ...