Towards link characterization from content

  • John Grothendieck, A. Gorin
  • Published 2008
  • Computer Science
  • 2008 IEEE International Conference on Acoustics, Speech and Signal Processing
In processing large volumes of speech and language data, we are often interested in the distribution of languages, speakers, topics, etc. For large data sets, these distributions are typically estimated at a given point in time using pattern classification technology. Such estimates can be highly biased, especially for rare classes. While these biases have been addressed in some applications, they have thus far been ignored in the speech and language literature. This neglect causes significant…
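To make the bias claim concrete, the following minimal sketch shows how counting classifier outputs can overstate the proportion of a rare class, and how a standard confusion-rate inversion (often called the adjusted count or Rogan-Gladen estimator) removes that bias when the error rates are known. This is an illustration under stated assumptions, not necessarily the correction the authors propose, since the abstract is truncated. The function names, the 1% prevalence, and the detector rates are all hypothetical.

```python
def observed_rate(prevalence, tpr, fpr):
    """Expected fraction of items a classifier labels positive.

    prevalence: true fraction of positives in the data
    tpr: true-positive rate of the classifier
    fpr: false-positive rate of the classifier
    """
    return prevalence * tpr + (1.0 - prevalence) * fpr


def adjusted_prevalence(observed, tpr, fpr):
    """Invert observed_rate to recover prevalence, clipped to [0, 1]."""
    if tpr <= fpr:
        raise ValueError("tpr must exceed fpr for the inversion to be meaningful")
    estimate = (observed - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, estimate))


# Hypothetical values chosen only for illustration.
true_prevalence = 0.01   # rare class: 1% of the data
tpr, fpr = 0.90, 0.05    # assumed detector operating point

naive = observed_rate(true_prevalence, tpr, fpr)
corrected = adjusted_prevalence(naive, tpr, fpr)

print(f"naive count estimate: {naive:.4f}")      # 0.0585
print(f"corrected estimate:   {corrected:.4f}")  # 0.0100
```

With these assumed rates, the naive count reports nearly six times the true proportion, because false alarms on the 99% majority outnumber the true detections. The correction is exact here only because the true error rates are plugged in; in practice they must themselves be estimated, which adds variance.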