This paper presents a novel supposition, One Sense Per N-gram (N>1), which we believe applies to more linguistic phenomena and can serve as a more general version of the celebrated One Sense Per Collocation supposition, at least in Chinese. This new supposition is based on our observation of the error detection process of annotated sense…
This paper describes our infrequent sense identification system, which participated in SemEval-2010 Task 15, Infrequent Sense Identification for Mandarin Text to Speech Systems. The core system is a supervised system based on ensembles of Naïve Bayesian classifiers. In order to solve the problem of unbalanced sense distribution, we intentionally extract…
  • Shiwen Yu
  • 2008
At the Peking University Computer Research Institute (PUCRI), a word-based method of inputting Chinese sentences has been developed. To reduce the difficulty of choosing one word among others that share the same feature, a grammatical parsing technique is applied to the method, and good results have been achieved. This article describes the outline of…
Figure relation extraction is an important and difficult task in information extraction. In this paper, aiming to improve the performance of relation extraction for historical figures, we propose a novel method based on bibliographic description. In the proposed method, by analyzing the types and co-occurrence relations of responsibility in a bibliographic…