Regression-based latent factor models
Proposes a novel latent factor model that accurately predicts response for large-scale dyadic data in the presence of features. The model induces a stochastic process on the dyadic space with a kernel given by a polynomial function of the features.
fLDA: matrix factorization through latent dirichlet allocation
We propose fLDA, a novel matrix factorization method to predict ratings in recommender system applications where a "bag-of-words" representation for item meta-data is natural. Such scenarios are
Response prediction using collaborative filtering with hierarchies and side-information
This paper shows how response prediction can be viewed as a matrix completion problem and proposes solving it with matrix factorization techniques from collaborative filtering (CF). It also shows how this factorization can be seamlessly combined with explicit features or side-information for pages and ads, combining the benefits of both approaches.
Amazon Redshift and the Case for Simpler Data Warehouses
Discusses an oft-overlooked differentiating characteristic of Amazon Redshift: simplicity. Redshift is designed to bring data warehousing to a mass market by making it easy to buy, easy to tune, and easy to manage while also being fast and cost-effective.
Multi-armed bandit problems with dependent arms
We provide a framework to exploit dependencies among arms in multi-armed bandit problems, when the dependencies are in the form of a generative model on clusters of arms. We find an optimal MDP-based
Predictive discrete latent factor models for large scale dyadic data
We propose a novel statistical method to predict large scale dyadic response variables in the presence of covariate information. Our approach simultaneously incorporates the effect of covariates and
Localized factor models for multi-context recommendation
This work proposes a new model that significantly improves predictive accuracy, especially in cold-start scenarios, and shows that the E-step can be fitted through a fast multi-resolution Kalman filter algorithm that ensures scalability.
Explore/Exploit Schemes for Web Content Optimization
Develops a Bayesian solution for finding the optimal trade-off between explore and exploit in web content publishing applications, where a dynamic set of items with short lifetimes, delayed feedback, and non-stationary reward distributions are typical.
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
The process of whittling down the initial 734 submissions to the final set of 133 accepted papers required the coordination and time of a large number of willing volunteers, and special care was taken to minimize the error of rejecting a potentially good paper at this stage.
Online Models for Content Optimization
Describes a new content publishing system that selects articles to serve to a user, choosing from an editorially programmed pool that is frequently refreshed. Deployed on a major Yahoo! portal, the system significantly increases the number of user clicks over the original manual approach.