Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing based on a statistical latent class model for factor analysis of count data. Fitted to a training corpus of text documents by a generalization of the Expectation-Maximization algorithm, the model is able to deal with domain-specific synonymy as well …
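As a concrete reference point for the model referred to above, here is a minimal sketch of plain EM for the aspect model behind PLSI, written in NumPy. The paper's generalization (tempered EM) additionally anneals the E-step, which is omitted here; all variable names are illustrative rather than taken from the paper.

```python
import numpy as np

def plsa_em(counts, n_topics, n_iter=100, seed=0):
    """Plain EM for the aspect model: P(d,w) = sum_z P(z) P(d|z) P(w|z).

    counts: (n_docs, n_words) array of word counts per document.
    Returns P(z), P(d|z), P(w|z).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z = np.full(n_topics, 1.0 / n_topics)
    p_d_z = rng.random((n_topics, n_docs)); p_d_z /= p_d_z.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words)); p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: posterior P(z|d,w) for every (d, w) cell.
        joint = p_z[:, None, None] * p_d_z[:, :, None] * p_w_z[:, None, :]
        post = joint / np.maximum(joint.sum(axis=0, keepdims=True), 1e-12)
        # M-step: reweight posteriors by observed counts, then renormalize.
        weighted = post * counts[None, :, :]          # shape (z, d, w)
        p_d_z = weighted.sum(axis=2)
        p_w_z = weighted.sum(axis=1)
        p_z = p_d_z.sum(axis=1)
        p_d_z /= np.maximum(p_d_z.sum(axis=1, keepdims=True), 1e-12)
        p_w_z /= np.maximum(p_w_z.sum(axis=1, keepdims=True), 1e-12)
        p_z /= p_z.sum()
    return p_z, p_d_z, p_w_z
```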
This paper presents two new formulations of multiple-instance learning as a maximum-margin problem. The proposed extensions of the Support Vector Machine (SVM) learning approach lead to mixed-integer quadratic programs that can be solved heuristically. Our generalization of SVMs makes a state-of-the-art classification technique, including non-linear …
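One standard heuristic for such a mixed-integer program alternates between imputing instance labels and refitting an ordinary SVM, in the spirit of the mi-SVM formulation. The sketch below uses scikit-learn and assumes both positive and negative bags are present; all names are hypothetical.

```python
import numpy as np
from sklearn.svm import SVC

def mi_svm(bags, bag_labels, n_iter=10, C=1.0):
    """Heuristic for the mixed-integer MIL program: alternate between
    imputing instance labels and refitting a standard SVM.

    bags: list of (n_i, d) arrays; bag_labels: +1/-1 per bag.
    """
    X = np.vstack(bags)
    bag_idx = np.concatenate([[i] * len(b) for i, b in enumerate(bags)])
    # Initialize every instance with its bag's label.
    y = np.concatenate([[lbl] * len(b) for b, lbl in zip(bags, bag_labels)])
    clf = SVC(kernel="linear", C=C)
    for _ in range(n_iter):
        clf.fit(X, y)
        scores = clf.decision_function(X)
        new_y = y.copy()
        for i, lbl in enumerate(bag_labels):
            if lbl != 1:
                continue  # negative bags: all instances stay negative
            mask = bag_idx == i
            s = scores[mask]
            imputed = np.where(s > 0, 1, -1)
            # MIL constraint: at least one positive instance per positive bag.
            if (imputed == 1).sum() == 0:
                imputed[np.argmax(s)] = 1
            new_y[mask] = imputed
        if np.array_equal(new_y, y):
            break  # labels stable: a local optimum of the heuristic
        y = new_y
    return clf
```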
Probabilistic Latent Semantic Analysis is a novel statistical technique for the analysis of two-mode and co-occurrence data, which has applications in information retrieval and filtering, natural language processing, machine learning from text, and related areas. Compared to standard Latent Semantic Analysis, which stems from linear algebra and performs a …
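To make the contrast with the linear-algebraic method concrete, here is a small sketch on a hypothetical toy co-occurrence table: the truncated SVD used by LSA minimizes squared reconstruction error and can produce negative entries, whereas the probabilistic decomposition factors a normalized table into non-negative probabilities (fit, for instance, by an EM routine like the one sketched above).

```python
import numpy as np

# Toy document-word co-occurrence counts (hypothetical data).
counts = np.array([[4, 2, 0, 0],
                   [3, 1, 0, 1],
                   [0, 0, 5, 2],
                   [0, 1, 4, 3]], dtype=float)

# LSA: a rank-k truncated SVD minimizes squared (L2) reconstruction
# error, so the reconstruction may contain negative "counts".
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
k = 2
lsa_approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# pLSA instead factors the *normalized* table into non-negative
# probabilities, P(d, w) = sum_z P(z) P(d|z) P(w|z), fit by EM and
# evaluated by likelihood rather than L2 error.
p_dw = counts / counts.sum()
print("min LSA entry:", lsa_approx.min())   # can be negative
print("P(d,w) sums to", p_dw.sum())         # a proper distribution
```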
This paper presents a novel statistical method for factor analysis of binary and count data that is closely related to a technique known as Latent Semantic Analysis. In contrast to the latter method, which stems from linear algebra and performs a Singular Value Decomposition of co-occurrence tables, the proposed technique uses a generative latent class …
Learning general functional dependencies is one of the main goals in machine learning. Recent progress in kernel-based methods has focused on designing flexible and powerful input representations. This paper addresses the complementary issue of problems involving complex outputs, such as multiple dependent output variables and structured output spaces. We …
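The common thread in this line of work is a joint feature map Φ(x, y) with prediction as an argmax over the output space. The sketch below illustrates that framing with a structured perceptron on toy label sequences; this is a deliberately simpler update rule than the paper's max-margin training, and all names are hypothetical.

```python
import numpy as np
from itertools import product

def joint_features(xs, ys, n_labels, d):
    """Joint feature map Phi(x, y) for a label sequence:
    per-position emission blocks plus label-transition counts."""
    emit = np.zeros((n_labels, d))
    trans = np.zeros((n_labels, n_labels))
    for t, (x, y) in enumerate(zip(xs, ys)):
        emit[y] += x
        if t > 0:
            trans[ys[t - 1], y] += 1
    return np.concatenate([emit.ravel(), trans.ravel()])

def predict(w, xs, n_labels, d):
    """Exhaustive argmax over outputs -- fine only for toy sequences;
    real systems use Viterbi-style decoding instead."""
    cands = product(range(n_labels), repeat=len(xs))
    return max(cands, key=lambda ys: w @ joint_features(xs, ys, n_labels, d))

def structured_perceptron(data, n_labels, d, epochs=10):
    w = np.zeros(n_labels * d + n_labels * n_labels)
    for _ in range(epochs):
        for xs, ys in data:          # xs: (L, d) array, ys: tuple of labels
            y_hat = predict(w, xs, n_labels, d)
            if tuple(y_hat) != tuple(ys):
                # Move toward the correct output, away from the wrong one.
                w += joint_features(xs, ys, n_labels, d) \
                   - joint_features(xs, y_hat, n_labels, d)
    return w
```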
Most successful object recognition systems rely on binary classification, deciding only whether an object is present or not, but providing no information on the actual object location. To perform localization, one can take a sliding-window approach, but this strongly increases the computational cost, because the classifier function has to be evaluated over a …
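A sketch of the naive sliding-window loop the passage describes, with a placeholder scoring callable standing in for the trained classifier (all names hypothetical): the classifier is evaluated once per candidate window, so the cost grows with the number of positions and scales, which is exactly the expense the abstract points to.

```python
import numpy as np
from itertools import product

def localize(image, score_window, sizes, stride=8):
    """Naive sliding-window localization: evaluate the classifier score
    on every window and return the best-scoring box.

    image: (H, W) array; score_window: callable mapping a crop to a
    real-valued classifier score (placeholder for the trained model).
    """
    H, W = image.shape
    best, best_box = -np.inf, None
    n_evals = 0
    for wh, ww in sizes:
        for top, left in product(range(0, H - wh + 1, stride),
                                 range(0, W - ww + 1, stride)):
            s = score_window(image[top:top + wh, left:left + ww])
            n_evals += 1
            if s > best:
                best, best_box = s, (top, left, wh, ww)
    # The classifier runs once per window: cost scales with
    # (positions x scales), which is what makes naive search expensive.
    return best_box, best, n_evals
```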
This paper presents a novel discriminative learning technique for label sequences based on a combination of the two most successful learning algorithms, Support Vector Machines and Hidden Markov Models, which we call the Hidden Markov Support Vector Machine. The proposed architecture handles dependencies between neighboring labels using Viterbi decoding. In …
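The decoding step named above is standard Viterbi; here is a minimal sketch under the assumption that the per-position and transition scores come from a learned discriminant function, as in an HM-SVM-style model (parameter names are illustrative).

```python
import numpy as np

def viterbi(scores, trans):
    """Viterbi decoding: highest-scoring label sequence under
    per-position label scores plus label-transition scores.

    scores: (T, K) local scores (e.g. w . phi(x_t, y) in a
    discriminative sequence model); trans: (K, K) transition scores.
    Returns the argmax label path as a list of length T.
    """
    T, K = scores.shape
    delta = np.zeros((T, K))        # best score of any path ending in (t, k)
    back = np.zeros((T, K), dtype=int)
    delta[0] = scores[0]
    for t in range(1, T):
        cand = delta[t - 1][:, None] + trans      # (K_prev, K_cur)
        back[t] = cand.argmax(axis=0)
        delta[t] = cand.max(axis=0) + scores[t]
    # Backtrace from the best final label.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```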
In this paper, we propose a novel statistical language model to capture topic-related long-range dependencies. Topics are modeled in a latent variable framework, for which we derive an EM algorithm that performs a topic factor decomposition based on a segmented training corpus. The topic model is combined with a standard language model to be used for …
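The combination step is typically a probability mixture; below is a minimal sketch of linearly interpolating a standard n-gram model with a latent-topic unigram model. The interpolation weight and all names are assumptions for illustration, not details from the paper.

```python
import numpy as np

def mixed_word_prob(w, history, p_ngram, p_w_given_z, p_z_given_d, lam=0.7):
    """Interpolated language model:
        P(w | h, d) = lam * P_ngram(w | h)
                    + (1 - lam) * sum_z P(w | z) P(z | d).

    p_ngram: callable (word, history) -> probability (standard LM);
    p_w_given_z: (K, V) topic-conditional unigram probabilities;
    p_z_given_d: (K,) topic weights inferred for the current document.
    """
    p_topic = float(p_z_given_d @ p_w_given_z[:, w])
    return lam * p_ngram(w, history) + (1 - lam) * p_topic
```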
Collaborative filtering aims at learning predictive models of user preferences, interests, or behavior from community data, that is, a database of available user preferences. In this article, we describe a new family of model-based algorithms designed for this task. These algorithms rely on a statistical modelling technique that introduces latent class …
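Once such a latent class (aspect) model is fitted, for example by an EM procedure analogous to the pLSA sketch above, prediction reduces to a mixture over classes. A minimal ranking sketch with illustrative names:

```python
import numpy as np

def recommend(user, p_z_given_u, p_item_given_z, top_k=5):
    """Score items for a user under the latent class model:
        P(item | user) = sum_z P(item | z) P(z | user),
    then return the indices of the top-k items.

    p_z_given_u: (n_users, K); p_item_given_z: (K, n_items).
    """
    scores = p_z_given_u[user] @ p_item_given_z   # (n_items,)
    return np.argsort(scores)[::-1][:top_k]
```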