Learn More
We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying(More)
All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher. MIT Press books may be purchased at special quantity discounts for business or sales promotional use. p. cm. —(Adaptive(More)
a grant from Darpa in support of the CALO program. The authors wish to acknowledge helpful discussions with Lancelot James and Jim Pitman and the referees for useful comments. Abstract We consider problems involving groups of data, where each observation within a group is a draw from a mixture model, and where it is desirable to share mixture components(More)
Yair Weiss School of CS & Engr. The Hebrew Univ. Despite many empirical successes of spectral clustering methods-algorithms that cluster points using eigenvectors of matrices derived from the data-there are several unresolved issues. First, there are a wide variety of algorithms that use the eigenvectors in slightly different ways. Second, many of these(More)
Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many " plausible " ways, and if a clustering algorithm such as K-means initially fails to find one that is meaningful to a user, the only recourse may be for the user to manually tweak the metric until sufficiently good clusters are(More)
Kernel-based learning algorithms work by embedding the data into a Euclidean space, and then searching for linear relations among the embedded data points. The embedding is performed implicitly, by specifying the inner products between each pair of points in the embedding space. This information is contained in the so-called kernel matrix, a symmetric and(More)
We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coeecients and the mixture components are generalized linear models GLIM's. Learning is treated as a maximum likelihood problem; in particular, we present an Expectation-Maximization(More)