Learn More
We extend the well-known BFGS quasi-Newton method and its memory-limited variant LBFGS to the optimization of nonsmooth convex objectives. This is done in a rigorous fashion by generalizing three components of BFGS to subdifferentials: the local quadratic model, the identification of a descent direction, and the Wolfe line search conditions. We prove that(More)
We develop stochastic variants of the well-known BFGS quasi-Newton optimization method, in both full and memory-limited (LBFGS) forms, for online optimization of convex functions. The resulting algorithm performs comparably to a well-tuned natural gradient descent but is scalable to very high-dimensional problems. On standard benchmarks in natural language(More)
We develop gain adaptation methods that improve convergence of the Kernel Hebbian Algorithm (KHA) for iterative kernel PCA (Kim et al., 2005). KHA has a scalar gain parameter which is either held constant or decreased according to a predetermined annealing schedule, leading to slow convergence. We accelerate it by incorporating the reciprocal of the current(More)
We present an investigation of recently proposed character and word sequence kernels for the task of authorship attribu-tion based on relatively short texts. Performance is compared with two corresponding probabilistic approaches based on Markov chains. Several configurations of the sequence kernels are studied on a relatively large dataset (50 authors),(More)
Non-metric dissimilarity measures may arise in practice e.g. when objects represented by sensory measurements or by structural descriptions are compared. It is an open issue whether such non-metric measures should be corrected in some way to be metric or even Euclidean. The reason for such corrections is the fact that pairwise metric distances are(More)
In off-line handwriting recognition, classifiers based on hidden Markov models (HMMs) have become very popular. However, while there exist well-established training algorithms , such as the Baum-Welsh procedure, which optimize the transition and output probabilities of a given HMM architecture , the architecture itself, and in particular the number of(More)
BACKGROUND When analysing microarray and other small sample size biological datasets, care is needed to avoid various biases. We analyse a form of bias, stratification bias, that can substantially affect analyses using sample-reuse validation techniques and lead to inaccurate results. This bias is due to imperfect stratification of samples in the training(More)
The study of multiple classifier systems has become an area of intensive research in pattern recognition recently. Also in handwriting recognition, systems combining several classifiers have been investigated. In this paper new methods for the creation of classifier ensembles based on feature selection algorithms are introduced. Those new methods are(More)