Rob M. Konijn

An important subproblem in supervised tasks such as decision tree induction and subgroup discovery is finding an interesting binary feature (such as a node split or a subgroup refinement) based on a numeric or nominal attribute, with respect to some discrete or continuous target variable. Often one is faced with a trade-off between the expressiveness of …
Conventional techniques for detecting outliers address the problem of finding isolated observations that significantly differ from other observations that are stored in a database. For example, in the context of health insurance, one might be interested in finding unusual claims concerning prescribed medicines. Each claim record may contain information on …
We consider data where examples are not only labeled in the classical sense (positive or negative), but also have costs associated with them. In this sense, each example has two target attributes, and we aim to find clearly defined subsets of the data where the values of these two targets have an unusual distribution. In other words, we are focusing on a …
In Subgroup Discovery, one is interested in finding subgroups that behave differently from the ‘average’ behavior of the entire population. In many cases, such an approach works well because the general population is rather homogeneous, and the subgroup encompasses clear outliers. In more complex situations, however, the investigated population is a mixture …
BACKGROUND: Osteoporosis often does not involve symptoms, and so the actual number of patients with osteoporosis is higher than the number of diagnosed individuals. This underdiagnosis results in a treatment gap. OBJECTIVES: To estimate the total health care resource use and costs related to osteoporosis in the Netherlands, explicitly including fractures, …
In this paper we describe an interactive approach for finding outliers in large sets of records, such as those collected by banks, insurance companies, and web shops. The key idea behind our approach is the use of an easy-to-compute and easy-to-interpret outlier score function. This function is used to identify a set of potential outliers. The outliers, organized in …