The English all-words task
Multiple Aspect Ranking Using the Good Grief Algorithm
TLDR
An algorithm is presented that jointly learns ranking models for individual aspects by modeling the dependencies between assigned ranks, and it is proved that the agreement-based joint model is more expressive than individual ranking models.
Unsupervised Multilingual Learning for Morphological Segmentation
TLDR
A nonparametric Bayesian model is presented that jointly induces morpheme segmentations of each language under consideration and at the same time identifies cross-lingual morpheme patterns, or abstract morphemes, of multiple languages.
Unsupervised Multilingual Learning for POS Tagging
TLDR
A hierarchical Bayesian model is formulated for jointly predicting bilingual streams of part-of-speech tags that learns language-specific features while capturing cross-lingual patterns in tag distribution for aligned words.
A Parallel Proposition Bank II for Chinese and English
TLDR
This paper presents the results of the parallel PropBank II project, which adds these richer layers of semantic annotation to the first 100K of the Chinese Treebank and its English translation.
Unsupervised Multilingual Grammar Induction
TLDR
A generative Bayesian model is formulated which seeks to explain the observed parallel data through a combination of bilingual and monolingual parameters, and loosely binds parallel trees while allowing language-specific syntactic structure.
A Statistical Model for Lost Language Decipherment
TLDR
A method is presented for the automatic decipherment of lost languages that employs a non-parametric Bayesian framework to simultaneously capture both low-level character mappings and high-level morphemic correspondences.
Adding More Languages Improves Unsupervised Multilingual Part-of-Speech Tagging: a Bayesian Non-Parametric Approach
TLDR
A non-parametric Bayesian model is proposed that connects related tagging decisions across languages through the use of multilingual latent variables and shows that performance improves steadily as the number of languages increases.
PropBank as a Bootstrap for Richer Annotation Schemes
The success of interlingual annotation depends crucially on agreement as to the entities to be annotated, both in terms of the categories of entities and in terms of the names of the entities. Two ...
Multilingual Part-of-Speech Tagging: Two Unsupervised Approaches
TLDR
This work considers two ways of applying this intuition to the problem of unsupervised part-of-speech tagging: a model that directly merges tag structures for a pair of languages into a single sequence and a second model which instead incorporates multilingual context using latent variables.