Learn More
Time series classification is an important task with many challenging applications. A nearest neighbor (NN) classifier with dynamic time warping (DTW) distance is a strong solution in this context. On the other hand, feature-based approaches have been proposed as both classifiers and to provide insight into the series, but these approaches have problems(More)
Predictive models benefit from a compact, non-redundant subset of features that improves inter-pretability and generalization. Modern data sets are wide, dirty, mixed with both numerical and categorical predictors, and may contain interactive effects that require complex models. This is a challenge for filters, wrappers, and embedded feature selection(More)
A tree-ensemble method, referred to as time series forest (TSF), is proposed for time series classification. TSF employs a combination of entropy gain and a distance measure, referred to as the Entrance (entropy and distance) gain, for evaluating the splits. Experimental studies show that the Entrance gain improves the accuracy of TSF. TSF randomly samples(More)
— In contrast to typical variable selection methods such as CFS, tree-based ensemble methods can produce numerical importances of input variables of mixed type considering all variable interactions, not just one or two variables at a time. However, they do not indicate a cutoff point: how to set a threshold to the importance. This paper presents an(More)
Attribute importance measures for supervised learning are important for improving both learning accuracy and interpretability. However, it is well-known there could be bias when the predictor attributes have different numbers of values. We propose two methods to solve the bias problem. One uses an out-of-bag sampling method called OOBForest and one, based(More)
Associative classifiers have been proposed to achieve an accurate model with each individual rule being interpretable. However, existing associative clas-sifiers often consist of a large number of rules and, thus, can be difficult to interpret. We show that associative classifiers consisting of an ordered rule set can be represented as a tree model. From(More)
We address the problem of identifying a non-redundant subset of important variables. All modern feature selection approaches including filters, wrappers, and embedded methods experience problems in very general settings with massive mixed-type data, and with complex relationships between the inputs and the target. We propose an efficient ensemble-based(More)