Deepak Agarwal

Learn More
We propose a novel latent factor model to accurately predict response for large scale dyadic data in the presence of features. Our approach is based on a model that predicts response as a multiplicative function of row and column latent factors that are estimated through separate regressions on known row and column features. In fact, our model provides a(More)
We propose fLDA, a novel matrix factorization method to predict ratings in recommender system applications where a "bag-of-words" representation for item meta-data is natural. Such scenarios are commonplace in web applications like content recommendation, ad targeting and web search where items are articles, ads and web pages respectively. Because of data(More)
We propose a novel statistical method to predict large scale dyadic response variables in the presence of covariate information. Our approach simultaneously incorporates the effect of covariates and estimates local structure that is induced by interactions among the dyads through a discrete latent factor model. The discovered latent factors provide a(More)
We provide a framework to exploit dependencies among arms in multi-armed bandit problems, when the dependencies are in the form of a generative model on clusters of arms. We find an optimal MDP-based policy for the discounted reward case, and also give an approximation of it with formal error guarantee. We discuss lower bounds on regret in the undiscounted(More)
In online advertising, response prediction is the problem of estimating the probability that an advertisement is clicked when displayed on a content publisher's webpage. In this paper, we show how response prediction can be viewed as a problem of matrix completion, and propose to solve it using matrix factorization techniques from collaborative filtering(More)
We consider the problem of estimating rates of rare events for high dimensional, multivariate categorical data where several dimensions are hierarchical. Such problems are routine in several data mining applications including computational advertising, our main focus in this paper. We propose LMMH, a novel log-linear modeling method that scales to massive(More)
Spatial scan statistics are used to determine hotspots in spatial data, and are widely used in epidemiology and biosurveillance. In recent years, there has been much effort invested in designing efficient algorithms for finding such "high discrepancy" regions, with methods ranging from fast heuristics for special cases, to general grid-based methods, and to(More)
Combining correlated information from multiple contexts can significantly improve predictive accuracy in recommender problems. Such information from multiple contexts is often available in the form of several incomplete matrices spanning a set of entities like users, items, features, and so on. Existing methods simultaneously factorize these matrices by(More)
We describe a new content publishing system that selects articles to serve to a user, choosing from an editorially programmed pool that is frequently refreshed. It is now deployed on a major Yahoo! portal, and selects articles to serve to hundreds of millions of user visits per day, significantly increasing the number of user clicks over the original manual(More)
We propose novel multi-armed bandit (explore/exploit) schemes to maximize total clicks on a content module published regularly on Yahoo! Intuitively, one can ``explore'' each candidate item by displaying it to a small fraction of user visits to estimate the item's click-through rate (CTR), and then ``exploit'' high CTR items in order to maximize clicks.(More)