Classification of Ordinal Data

Abstract

Predictive learning has traditionally been a standard inductive learning, where different subproblem formulations have been identified. One of the most representative is classification, consisting on the estimation of a mapping from the feature space into a finite class space. Depending on the cardinality of the finite class space we are left with binary or multiclass classification problems. Finally, the presence or absence or a “natural” order among classes will separate nominal from ordinal problems. Although two-class and nominal classification problems have been dissected in the literature, the ordinal sibling has not yet received a lot of attention, even with many learning problems involving classifying examples into classes which have a natural order. Scenarios in which it is natural to rank instances occur in many fields, such as information retrieval, collaborative filtering, econometric modeling and natural sciences. Conventional methods for nominal classes or for regression problems could be employed to solve ordinal data problems; however, the use of techniques designed specifically for ordered classes yields simpler classifiers, making it easier to interpret the factors that are being used to discriminate among classes, and generalises better. Although the ordinal formulation seems conceptually simpler than nominal, some technical difficulties to incorporate in the algorithms this piece of additional information – the order – may explain the widespread use of conventional methods to tackle the ordinal data problem. This dissertation addresses this void by proposing a nonparametric procedure for the classification of ordinal data based on the extension of the original dataset with additional variables, reducing the classification task to the well-known two-class problem. This framework unifies two well-known approaches for the classification of ordinal categorical data, the minimum margin principle and the generic approach by Frank and Hall. It also presents a probabilistic interpretation for the neural network model. A second novel model, the unimodal model, is also introduced and a parametric version is mapped into neural networks. Several case studies are presented to assert the validity of the proposed models.

Extracted Key Phrases

36 Figures and Tables

Cite this paper

@article{Cardoso2006ClassificationOO, title={Classification of Ordinal Data}, author={Jaime S. Cardoso}, journal={CoRR}, year={2006}, volume={abs/cs/0605123} }