Learn More
For classification problem, the training data will significantly influence the classification accuracy. However, the data in real-world applications often are imbalanced class distribution, that is, most of the data are in majority class and little data are in minority class. In this case, if all the data are used to be the training data, the classifier(More)
In this paper, we present a three-step multilingual dependency parser based on a deterministic shift-reduce parsing algorithm. Different from last year, we separate the root-parsing strategy as sequential labeling task and try to link the neighbor word dependences via a near neighbor parsing. The outputs of the root and neighbor parsers were encoded as(More)
Phrase pattern recognition (phrase chunking) refers to automatic approaches for identifying predefined phrase structures in a stream of text. Support vector machines (SVMs)-based methods had shown excellent performance in many sequential text pattern recognition tasks such as protein name finding, and noun phrase (NP)-chunking. Even though they yield very(More)
The most important factor of classification for improving classification accuracy is the training data. However, the data in real-world applications often are imbalanced class distribution, that is, most of the data are in majority class and little data are in minority class. In this case, if all the data are used to be the training data, the classifier(More)
Department of Computer Science and Information Engineering, Ming Chuan University, No. 5, De-Ming Rd., Gweishan District, Taoyuan 333, Taiwan, ROC Department of Communication and Management, Ming Chuan University, No. 250, Zhong Shan N. Rd., Taipei 111, Taiwan, ROC Graduate Institute of Network Learning Technology, National Central University, No. 300,(More)
Mining frequent patterns refers to the discovery of the sets of items that frequently appear in a transaction database. Many approaches have been proposed for mining frequent itemsets from a large database, but a large number of frequent itemsets may be discovered. In order to present users fewer but more important patterns, researchers are interested in(More)
Several phrase chunkers have been proposed over the past few years. Some state-of-the-art chunkers achieved better performance via integrating external resources, e.g., parsers and additional training data, or combining multiple learners. However, in many languages and domains, such external materials are not easily available and the combination of multiple(More)