Dang Duc Pham

This paper describes RDRPOSTagger, our robust, easy-to-use and language-independent toolkit, which employs an error-driven approach to automatically construct a Single Classification Ripple Down Rules (SCRDR) tree of transformation rules for the POS tagging task. During the demonstration session, we will run the tagger on data sets in 15 different languages.
This paper presents a new approach to learning a rule-based system for the task of part-of-speech tagging. Our approach is based on an incremental knowledge acquisition methodology in which rules are stored in an exception structure and new rules are added only to correct errors of existing rules, thus allowing systematic control of the interaction between rules.
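The exception structure mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a generic Single Classification Ripple Down Rules tree in which each node has an "except" child (tried when the rule fires) and an "if-not" child (tried when it does not). All names here (`Node`, `classify`, `add_exception`, the feature keys `word` and `prev_tag`) are hypothetical and chosen for illustration; this is not the toolkit's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Context = dict  # hypothetical feature dictionary, e.g. {"word": "run", "prev_tag": "TO"}


@dataclass
class Node:
    condition: Callable[[Context], bool]
    conclusion: str
    except_child: Optional["Node"] = None  # tried when this rule fires
    if_not_child: Optional["Node"] = None  # tried when this rule does not fire


def classify(root: Node, ctx: Context):
    """Walk the tree; return (conclusion of the last fired rule, last evaluated node, whether it fired)."""
    node, conclusion = root, None
    last, fired = None, False
    while node is not None:
        last = node
        fired = node.condition(ctx)
        if fired:
            conclusion = node.conclusion
            node = node.except_child
        else:
            node = node.if_not_child
    return conclusion, last, fired


def add_exception(root: Node, ctx: Context, condition, conclusion: str) -> Node:
    """Attach a corrective rule at the end of the evaluation path for ctx.

    Existing rules are never modified; the new rule is only reachable
    along the path that produced the error.
    """
    _, last, fired = classify(root, ctx)
    new = Node(condition, conclusion)
    if fired:
        last.except_child = new
    else:
        last.if_not_child = new
    return new


# Default rule at the root: always fires and tags everything as a noun.
root = Node(lambda c: True, "NN")

# Correct an error: a word following "TO" should be tagged as a verb.
add_exception(root, {"word": "run", "prev_tag": "TO"},
              lambda c: c.get("prev_tag") == "TO", "VB")

# Correct another error: the word "the" should be tagged as a determiner.
add_exception(root, {"word": "the", "prev_tag": "."},
              lambda c: c.get("word") == "the", "DT")

tag_run = classify(root, {"word": "run", "prev_tag": "TO"})[0]
tag_the = classify(root, {"word": "the", "prev_tag": "."})[0]
tag_dog = classify(root, {"word": "dog", "prev_tag": "DT"})[0]
```

Because a new rule is attached only where the faulty evaluation path ended, earlier corrections (such as the "TO" rule) keep their behavior when later rules are added.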
Wikipedia is a free multilingual online encyclopedia covering a wide range of general and specific knowledge. Its content is continuously kept up to date and extended by a supporting community. In many cases, real-world events influence the collaborative editing of Wikipedia articles of the involved or affected entities. In this paper, we present ...
Crowdsourcing has become ubiquitous in machine learning as a cost-effective method to gather training labels. In this paper, we examine the challenges that appear when employing crowdsourcing for active learning, in an integrated environment where an automatic method and human labelers work together towards improving their performance at a certain task. By ...
Detecting duplicate entities, usually by examining metadata, has been the focus of much recent work. Several methods try to identify duplicate entities while focusing either on accuracy or on efficiency and speed, with still no perfect solution. We propose a combined layered approach for duplicate detection with the main advantage of using crowdsourcing ...
In this paper, we present our method of using Information Extraction techniques to tackle the task of automatically translating English weather bulletins into Vietnamese. It is simple yet effective in satisfying the constraints of low processing power and storage space for deployment on an embedded system. Experimental results are very promising with the ...
In this paper, we propose a new approach to construct a system of transformation rules for the Part-of-Speech (POS) tagging task. Our approach is based on an incremental knowledge acquisition method where rules are stored in an exception structure and new rules are only added to correct the errors of existing rules; thus allowing systematic control of the ...
In this demo we present WikipEvent, an exploratory system that captures and visualises continuously evolving complex event structures, along with the involved entities. The framework facilitates entity-centric and event-centric search, presented via a user-friendly interface and supported by temporal snippets from corresponding Wikipedia page versions. The ...