The new interdisciplinary field of Data Mining emerged in the early 1990s as a response to the profusion of digital data generated in numerous fields such as biology, chemistry, astronomy, advertising, banking and finance, retail market, stock market, and the WWW. In this paper, I describe an undergraduate course in Data Mining offered at the College of ...
Data from genomic and proteomic experiments are accumulating rapidly, producing huge amounts of raw data; consequently, the need to analyze such biological data, the understanding of which still lags far behind, has become prominent in the post-genomic era we are currently witnessing. In this paper we attempt to analyze ...
"One person's noise is another person's signal." Outlier detection is used to clean up datasets and also to discover useful anomalies, such as criminal activities in electronic commerce, computer intrusion attacks, terrorist threats, and agricultural pest infestations. Thus, outlier detection is critically important in an information-based society. This ...
Outlier detection can lead to discovering unexpected and interesting knowledge, which is critically important in areas such as monitoring of criminal activities in electronic commerce, credit card fraud, and the like. In this work, we propose an efficient outlier detection method that produces clusters as a by-product and scales to large datasets. ...
Association rule mining (ARM) finds all the association rules in data that satisfy some measures of interest, such as support and confidence. In certain situations where high support is not necessarily of interest, fixed-consequent association-rule mining for confident rules might be favored over traditional ARM. The need for fixed-consequent ARM is becoming ...
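To make the support and confidence measures concrete, the following is a minimal brute-force sketch of fixed-consequent rule mining. It is not the method of the paper summarized above (the abstract is truncated before describing one); the function name `fixed_consequent_rules`, its parameters, and the toy transactions are all hypothetical and chosen only for illustration. Here support is the fraction of transactions containing both the antecedent and the consequent, and confidence is the fraction of antecedent-containing transactions that also contain the consequent.

```python
from itertools import combinations


def fixed_consequent_rules(transactions, consequent, min_confidence,
                           max_antecedent_size=2):
    """Enumerate rules `antecedent -> consequent` by brute force.

    Hypothetical helper for illustration only. Returns a list of
    (antecedent, support, confidence) tuples for rules whose confidence
    is at least `min_confidence`. No minimum support is enforced, so
    rare but confident rules are kept.
    """
    n = len(transactions)
    # Candidate antecedent items: everything except the fixed consequent.
    items = sorted({item for t in transactions for item in t} - {consequent})
    rules = []
    for size in range(1, max_antecedent_size + 1):
        for antecedent in combinations(items, size):
            a = set(antecedent)
            # Transactions containing the antecedent.
            n_a = sum(1 for t in transactions if a <= t)
            # Transactions containing the antecedent and the consequent.
            n_both = sum(1 for t in transactions if a <= t and consequent in t)
            if n_a == 0 or n_both == 0:
                continue
            support = n_both / n
            confidence = n_both / n_a
            if confidence >= min_confidence:
                rules.append((antecedent, support, confidence))
    return rules


# Toy data (made up): "champagne" appears in only one of five transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"caviar", "champagne"},
]
rules = fixed_consequent_rules(transactions, "champagne", min_confidence=0.9)
for antecedent, support, confidence in rules:
    print(antecedent, "-> champagne", support, confidence)
```

In this toy data the rule caviar -> champagne has low support but full confidence, which is the kind of rule a high minimum-support threshold in traditional ARM would discard. The exhaustive enumeration is exponential in the antecedent size, so this sketch is only suitable for very small item sets.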
Vast amounts of information available online make plagiarism increasingly easy to commit, and this is particularly true of source code. The traditional approach to detecting copied work in a course setting is manual inspection. This is not only tedious but also typically misses code plagiarized from outside sources or even from an earlier offering of the ...