Clustering data by identifying a subset of representative examples is important for processing sensory signals and detecting patterns in data. Such "exemplars" can be found by randomly choosing an initial subset of data points and then iteratively refining it, but this works well only if that initial choice is close to a good solution. We devised a method…
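The random-initialization baseline that this abstract describes (not the authors' own method, which the truncated text does not spell out) can be sketched as a k-medoids-style loop. This is a minimal illustration under stated assumptions: the function name `refine_exemplars`, its parameters, and the toy data are hypothetical and do not come from the paper.

```python
import random


def refine_exemplars(points, k, distance, seed=0, max_iters=100):
    """Pick k random exemplars, then alternate assignment and exemplar updates.

    Hypothetical helper for illustration; returns sorted indices into `points`.
    """
    rng = random.Random(seed)
    exemplars = rng.sample(range(len(points)), k)
    for _ in range(max_iters):
        # Assignment step: attach every point to its nearest current exemplar.
        clusters = {e: [] for e in exemplars}
        for i, p in enumerate(points):
            nearest = min(exemplars, key=lambda e: distance(p, points[e]))
            clusters[nearest].append(i)
        # Update step: within each cluster, choose the member with the
        # smallest total distance to the other members as the new exemplar.
        new_exemplars = []
        for e, members in clusters.items():
            if not members:
                # Can only happen with duplicate points; keep the old exemplar.
                new_exemplars.append(e)
                continue
            best = min(
                members,
                key=lambda c: sum(distance(points[c], points[m]) for m in members),
            )
            new_exemplars.append(best)
        if set(new_exemplars) == set(exemplars):
            break  # No exemplar changed, so the refinement has converged.
        exemplars = new_exemplars
    return sorted(exemplars)


# Toy one-dimensional data with two well-separated groups.
toy_points = [0, 1, 2, 100, 101, 102]
print(refine_exemplars(toy_points, 2, lambda a, b: abs(a - b)))
```

On easy data like this the loop recovers the central point of each group from any starting subset. On harder data the result depends on the random start, which is the weakness the abstract points out.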

Unsupervised categorization of images or image parts is often needed for image and video summarization or as a preprocessing step in supervised methods for classification, tracking and segmentation. While many metric-based techniques have been applied to this problem in the vision community, often, the most natural measures of similarity (e.g., number of…

- Delbert Dueck, Sam Roweis, Geoff Hinton, Wei Yu I
- 2008

2009 Clustering data by identifying a subset of representative examples is important for detecting patterns in data and in processing sensory signals. Such "exemplars" can be found by randomly choosing an initial subset of data points as exemplars and then iteratively refining it, but this works well only if that initial choice is close to a good…

Clustering is a fundamental problem in machine learning and has been approached in many ways. Two general and quite different approaches include iteratively fitting a mixture model (e.g., using EM) and linking together pairs of training cases that have high affinity (e.g., using spectral methods). Pair-wise clustering algorithms need not compute sufficient…

MOTIVATION: We address the problem of multi-way clustering of microarray data using a generative model. Our algorithm, probabilistic sparse matrix factorization (PSMF), is a probabilistic extension of a previous hard-decision algorithm for this problem. PSMF allows for varying levels of sensor noise in the data, uncertainty in the hidden prototypes used to…

- Vincent Cheung, Inmar Givoni, Delbert Dueck, Brendan J Frey
- 2006

Effective visualization of biological data is often critical for subsequent analysis. The popular clustergram/dendrogram visualization rearranges rows and columns of a data matrix so as to highlight clusters of similar responses, but assumes each row or column belongs to only one cluster and cannot associate each row or column with multiple clusters. Such…

Many kinds of data can be viewed as consisting of a set of vectors, each of which is a noisy combination of a small number of noisy prototype vectors. Physically, these prototype vectors may correspond to different hidden variables that play a role in determining the measured data. For example, a gene's expression is influenced by the presence of…

- Delbert Dueck, Brendan J. Frey, Nebojsa Jojic, Vladimir Jojic, Guri Giaever, Andrew Emili +2 others
- RECOMB
- 2008

- Delbert Dueck, Jim Huang, Quaid D Morris, Brendan J Frey
- 2004

In the past decade, technologies have been developed that enable researchers to quantify the levels of specific DNA transcripts present in cultures and tissues. While the signal-to-noise ratios of these technologies have consistently improved over the years, the signals provide only an observable projection of hidden, complex underlying processes. A variety…

- Delbert Dueck, Ken Ferens, Fern Berard
- 2002

Common sense dictates that as Internet connection speeds increase, so too does the bandwidth of commonly transmitted data. Presumably, this means that today's Internet of text and images will evolve into an Internet with high-quality video conforming to the MPEG standard. This report begins by introducing MPEG, the MPEG-2 standard, and providing a…