Research on Automatic Pattern Acquisition Based on Construction Extension


Although entities are named according to specific rules, the sheer number of names and the variety of those rules make it difficult for computers to detect the entities in context. A rule representation that computers can apply easily to detect these names automatically would substantially reduce cost, save time, and improve extraction efficiency. This paper therefore discusses the naming rules for entities and encodes them as methods that allow computers to acquire new term patterns automatically. The patterns are represented by POS tags and indicator terms. One method is based on soft matching; the other is based on constituent extension. The constituent extension method recognizes new patterns according to the rules and logical relations among an entity's constituents, so each pattern can be extended and assembled logically. Patterns produced in this way are intended to be accurate. The experimental results for this method show that automatic recognition of new patterns increases the efficiency of entity extraction.
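
The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so the pattern encoding (a tuple of slots, each either a POS tag or a literal written as `term:...`), the function names `soft_match` and `extend`, the 0.75 threshold, and the `expansions` table are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical encoding: a pattern is a tuple of slots. A slot is either a
# POS tag such as "NNP" or a literal indicator term written "term:Institute".

def slot_matches(slot, token, pos):
    """A term slot must equal the token; any other slot must equal the POS tag."""
    if slot.startswith("term:"):
        return token == slot[len("term:"):]
    return pos == slot

def soft_match(pattern, tagged, threshold=0.75):
    """Assumed soft match: accept when the fraction of matching slots
    reaches the threshold. `tagged` is a list of (token, pos) pairs."""
    if not pattern or len(pattern) != len(tagged):
        return False
    hits = sum(slot_matches(s, tok, pos) for s, (tok, pos) in zip(pattern, tagged))
    return hits / len(pattern) >= threshold

def extend(pattern, expansions):
    """Assumed constituent extension: each slot may be rewritten by any of its
    alternative slot sequences; return every pattern assembled that way."""
    options = [expansions.get(slot, [(slot,)]) for slot in pattern]
    return {sum(choice, ()) for choice in product(*options)}
```

Under these assumptions, `extend(("NNP", "term:University"), {"NNP": [("NNP",), ("NNP", "NNP")]})` yields both the original pattern and a longer one with two proper-noun slots, while `soft_match` tolerates a single mismatched slot in a four-slot pattern at the default threshold.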

@article{Chen2010ResearchOA, title={Research on Automatic Pattern Acquisition Based on Construction Extension}, author={Yu Chen and Dequan Zheng and Bowen Zheng and Tiejun Zhao}, journal={JCIT}, year={2010}, volume={5}, pages={122-127} }