• Publications
  • Influence
Deep Canonical Correlation Analysis
TLDR
We introduce Deep Canonical Correlation Analysis (DCCA), a method to learn complex nonlinear transformations of two views of data such that the resulting representations are highly linearly correlated.
  • 956
  • 238
A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models
TLDR
We describe the maximum-likelihood parameter estimation problem and how the Expectation-Maximization (EM) algorithm can be used for its solution.
  • 2,826
  • 168
On Deep Multi-View Representation Learning
TLDR
We consider learning representations (features) in the setting in which we have access to multiple unlabeled views of the data for representation learning while only one view is available at test time.
  • 397
  • 81
A Class of Submodular Functions for Document Summarization
TLDR
We design a class of submodular functions meant for document summarization tasks.
  • 605
  • 66
Unsupervised pattern discovery in human chromatin structure through genomic segmentation
We trained Segway, a dynamic Bayesian network method, simultaneously on chromatin data from multiple experiments, including positions of histone modifications, transcription-factor binding and open…
  • 433
  • 54
MVA Processing of Speech Features
TLDR
In this paper, we investigate a technique consisting of mean subtraction, variance normalization and time sequence filtering.
  • 229
  • 39
Multi-document Summarization via Budgeted Maximization of Submodular Functions
TLDR
We show, both theoretically and empirically, that a modified greedy algorithm can efficiently solve the budgeted submodular maximization problem near-optimally, and derive new approximation bounds in doing so.
  • 345
  • 37
An integrated encyclopedia of DNA elements in the human genome
The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has…
  • 2,018
  • 36
Factored Language Models and Generalized Parallel Backoff
TLDR
We introduce factored language models (FLMs) and generalized parallel backoff (GPB).
  • 332
  • 28
Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology
TLDR
We report on a BLAS GEMM-compatible, multi-level cache-blocked matrix multiply generator which produces code that achieves around 90% of peak on the Sparcstation-20/61, IBM RS/6000-590, HP 712/80i, SGI Power Challenge R8k, and SGI Octane R10k.
  • 467
  • 22