We have proposed replicator neural networks (RNNs) for outlier detection [8]. Here we compare RNNs for outlier detection with three other methods, using both publicly available statistical datasets (generally small) and data mining datasets (generally much larger and generally real data). The smaller datasets provide insights into the relative strengths and…
We consider the problem of finding outliers in large multivariate databases. Outlier detection can be applied during the data cleansing process of data mining to identify problems with the data itself, and to fraud detection, where groups of outliers are often of particular interest. We use replicator neural networks (RNNs) to provide a measure of the…
Adverse reactions to drugs are a leading cause of hospitalisation and death worldwide. Most post-marketing Adverse Drug Reaction (ADR) detection techniques analyse spontaneous ADR reports, which underestimate ADRs significantly. This paper aims to signal ADRs from administrative health databases in which data are collected routinely and are readily…
The work is motivated by real-world applications of detecting Adverse Drug Reactions (ADRs) from administrative health databases. ADRs are a leading cause of hospitalization and death worldwide. Almost all current postmarket ADR signaling techniques are based on spontaneous ADR case reports, which suffer from serious underreporting and latency. However,…
In various real-world applications, it is very useful to mine unanticipated episodes in which certain event patterns unexpectedly lead to outcomes, e.g., taking two medicines together sometimes causes an adverse reaction. These unanticipated episodes are usually unexpected and infrequent, which makes existing data mining techniques, mainly designed to find…
Australia has extensive administrative health data collected by Commonwealth and state agencies. The value of the health data is how quickly and effectively the data can be converted into useful knowledge independent of its quantity and quality. Using a unique cleaned and linked administrative health dataset, we address the problem of empirically defining…