Although popular and extremely well established in mainstream statistical data analysis, logistic regression is strangely absent in the field of data mining. There are two possible explanations of this phenomenon. First, there might be an assumption that any tool which can only produce linear classification boundaries is likely to be trumped by more modern… (More)
To my father, for curiosity. To my mother, for discipline.
Binary classification is a core data mining task. For large datasets or real-time applications, desirable classi-fiers are accurate, fast, and need no parameter tuning. We present a simple implementation of logistic regression that meets these requirements. A combination of regulariza-tion, truncated Newton methods, and iteratively re-weighted least squares… (More)
This paper has no novel learning or statistics: it is concerned with making a wide class of pre-existing statistics and learning algorithms com-putationally tractable when faced with data sets with massive numbers of records or attributes. It briefly reviews the static AD-tree structure of Moore and Lee (1998), and offers a new structure with more… (More)