The lack of annotated data is an obstacle to the development of many natural language processing applications; the problem is especially severe when the data is non-English. Previous studies suggested the possibility of acquiring resources for non-English languages by bootstrapping from high-quality English NLP tools and parallel corpora; however, the…
This work proposes a novel approach that combines a large amount of bilingual resources with a small amount of manually annotated data, and reports experiments on the English-Chinese language pair to train a Chinese part-of-speech tagger. Experimental results show that the proposed approach achieves a significant improvement over EM and self-training.
