Large-scale graph-structured computation is central to tasks ranging from targeted advertising to natural language processing and has led to the development of several graph-parallel abstractions, including Pregel and GraphLab. However, the natural graphs commonly found in the real world have highly skewed power-law degree distributions, which challenge the ...
We study graph estimation and density estimation in high dimensions, using a family of density estimators based on forest-structured undirected graphical models. For density estimation, we do not assume the true distribution corresponds to a forest; rather, we form kernel density estimates of the bivariate and univariate marginals, and apply Kruskal's ...
Popular apps on the Apple iOS App Store can generate millions of dollars in profit and collect valuable personal user information. Fraudulent reviews could deceive users into downloading potentially harmful spam apps or unfairly ignoring apps that are victims of review spam. Thus, automatically identifying spam in the App Store is an important problem. This ...
We present algorithms for nonparametric regression in settings where the data are obtained sequentially. While traditional estimators select bandwidths that depend upon the sample size, for sequential data the effective sample size is dynamically changing. We propose a linear-time algorithm that adjusts the bandwidth for each new data point, and show that ...
We analyze the web access log of Zillow.com, one of the largest real estate websites, and present a hierarchical mixture model that learns clusters of users and sessions from the combination of web usage and content data. The model is able to exploit the hierarchical structure of the usage data, and learns stereotypical session types and user segments such ...