• Corpus ID: 15151742

A Topic Modeling Approach to Ranking

Weicong Ding, Prakash Ishwar, Venkatesh Saligrama
We propose a topic modeling approach to the prediction of preferences in pairwise comparisons. We develop a new generative model for pairwise comparisons that accounts for multiple shared latent rankings that are prevalent in a population of users. This new model also captures inconsistent user behavior in a natural way. We show how the estimation of latent rankings in the new generative model can be formally reduced to the estimation of topics in a statistically equivalent topic modeling… 
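The abstract is truncated and does not spell out the generative process, so the following is only a minimal sketch of one way a "multiple shared latent rankings" model could produce pairwise comparisons. It assumes each user holds mixing weights over K shared rankings, picks a pair of items uniformly at random, draws one latent ranking per comparison, and reports the order that ranking implies. The function name `sample_comparisons`, the uniform pair choice, and the toy data are illustrative assumptions, not the authors' specification.

```python
import random


def sample_comparisons(rankings, weights, n_comparisons, rng):
    """Draw (winner, loser) pairs for one hypothetical user.

    rankings: K permutations of the same items, best item first (shared latent rankings).
    weights:  this user's mixing weights over the K rankings (assumed given).
    """
    items = rankings[0]
    # position[k][item] is the rank of `item` under latent ranking k (0 is best).
    position = [{item: r for r, item in enumerate(sigma)} for sigma in rankings]
    comparisons = []
    for _ in range(n_comparisons):
        i, j = rng.sample(items, 2)  # pair to compare (uniform here, by assumption)
        k = rng.choices(range(len(rankings)), weights)[0]  # latent ranking for this comparison
        comparisons.append((i, j) if position[k][i] < position[k][j] else (j, i))
    return comparisons


rng = random.Random(0)
rankings = [["a", "b", "c", "d"], ["d", "c", "b", "a"]]
comparisons = sample_comparisons(rankings, [0.7, 0.3], 20, rng)
```

Because a fresh latent ranking is drawn for every comparison, the same user can report `a` over `b` once and `b` over `a` later, which is one simple way to picture the inconsistent behavior the abstract mentions.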

CompareLDA: A Topic Model for Document Comparison

This work develops a topic model supervised by pairwise comparisons of documents, which learns predictive topic distributions that comply with the pairwise comparison observations, and derives a maximum likelihood estimation method via an augmented variational approximation algorithm.

Rank-Integrated Topic Modeling: A General Framework

A new method to integrate topical ranking with topic modeling and a general framework for topic modeling of documents with link structures is introduced by interpreting the normalized topical ranking score vectors as topic distributions for documents.

Learning Mixed Membership Mallows Models from Pairwise Comparisons

A novel parameterized family of Mixed Membership Mallows Models (M4) is proposed to account for variability in pairwise comparisons generated by a heterogeneous population of noisy and inconsistent users and is empirically competitive with the current state-of-the-art approaches in predicting real-world preferences.

Most large topic models are approximately separable

It is proved that when the columns of the topic matrix are independently sampled from a Dirichlet distribution, the resulting topic matrix will be approximately separable with probability tending to one as the number of rows (vocabulary size) scales to infinity sufficiently faster than the number of columns (topics).

Inductive Pairwise Ranking: Going Beyond the n log(n) Barrier

The technique of matrix completion in the presence of side information is used to develop the Inductive Pairwise Ranking (IPR) algorithm that provably learns a good ranking under the Feature Low Rank (FLR) model, in a sample-efficient manner.

A Provably Efficient Algorithm for Separable Topic Discovery

We develop necessary and sufficient conditions and a novel provably consistent and efficient algorithm for discovering topics (latent factors) from observations (documents) that are realized from a

A Ranking Model Motivated by Nonnegative Matrix Factorization with Applications to Tennis Tournaments

An efficient, provably convergent, and numerically stable majorization-minimization-based algorithm is derived to maximize the likelihood of datasets under the proposed statistical model.

A Multiresolution Analysis Framework for the Statistical Analysis of Incomplete Rankings

It is shown that the MRA representation naturally overcomes both the statistical and computational challenges without any structural assumption on the data, providing a general and flexible framework for solving a wide variety of statistical problems in which the data take the form of incomplete rankings.

Multiresolution analysis of ranking data

A new representation for the data is introduced, which by construction overcomes the aforementioned statistical and computational challenges, offering a natural and efficient framework for the analysis of incomplete rankings.

Stochastically Transitive Models for Pairwise Comparisons: Statistical and Computational Issues

This paper studies a flexible model for pairwise comparisons, under which the probabilities of outcomes are required only to satisfy a natural form of stochastic transitivity, and proposes and studies algorithms that achieve the minimax rate over interesting sub-classes of the full stochastically transitive class.



Learning Mallows Models with Pairwise Preferences

A new algorithm, the generalized repeated insertion model (GRIM), is developed for sampling from arbitrary ranking distributions, and is used to build approximate samplers that are exact for many important special cases and have provable bounds with pairwise evidence.

New learning methods for supervised and unsupervised preference aggregation

This work gives a general treatment of the preference aggregation problem, in which multiple preferences over objects must be combined into a single consensus ranking, and introduces the Multinomial Preference model (MPM), which uses a multinomial generative process to model the observed preferences.

Probabilistic Topic Models

  • D. Blei • Computer Science • IEEE Signal Processing Magazine • 2010
In this article, we review probabilistic topic models: graphical models that can be used to summarize a large collection of documents with a smaller number of distributions over words. Those

Learning Topic Models -- Going beyond SVD

This paper formally justifies Nonnegative Matrix Factorization (NMF) as a main tool in this context, which is an analog of SVD where all vectors are nonnegative, and gives the first polynomial-time algorithm for learning topic models without the above two limitations.

Learning Mixtures of Ranking Models

This work presents the first polynomial time algorithm which provably learns the parameters of a mixture of two Mallows models, and makes a novel use of tensor decomposition techniques to learn the top-k prefix in both the rankings.

Evaluation methods for topic models

It is demonstrated experimentally that commonly-used methods are unlikely to accurately estimate the probability of held-out documents, and two alternative methods that are both accurate and efficient are proposed.

Probabilistic Matrix Factorization

The Probabilistic Matrix Factorization (PMF) model is presented, which scales linearly with the number of observations and performs well on the large, sparse, and very imbalanced Netflix dataset and is extended to include an adaptive prior on the model parameters.

A Practical Algorithm for Topic Modeling with Provable Guarantees

This paper presents an algorithm for topic model inference that is both provable and practical and produces results comparable to the best MCMC implementations while running orders of magnitude faster.

Collaborative topic modeling for recommending scientific articles

An algorithm to recommend scientific articles to users of an online community that combines the merits of traditional collaborative filtering and probabilistic topic modeling and can form recommendations about both existing and newly published articles is developed.

Efficient Distributed Topic Modeling with Provable Guarantees

This work considers topic modeling under the separability assumption and develops novel computationally efficient methods that provably achieve the statistical performance of the state-of-the-art centralized approaches while requiring insignificant communication between the distributed document collections.