- Full text PDF available (30)
- This year (8)
- Last 5 years (28)
- Last 10 years (37)
Journals and Conferences
Data Set Used
Probabilistic topic models are widely used to discover latent topics in document collections, while latent feature vector representations of words have been used to obtain high performance in many NLP tasks. In this paper, we extend two different Dirichlet multinomial topic models by incorporating latent feature vector representations of words trained on… (More)
Knowledge bases of real-world facts about entities and their relationships are useful resources for a variety of natural language processing tasks. However, because knowledge bases are typically incomplete, it is useful to be able to perform link prediction, i.e., predict whether a relationship not in the knowledge base is likely to be true. This paper… (More)
This paper describes our robust, easyto-use and language independent toolkit namely RDRPOSTagger which employs an error-driven approach to automatically construct a Single Classification Ripple Down Rules tree of transformation rules for POS tagging task. During the demonstration session, we will run the tagger on data sets in 15 different languages.
BACKGROUND This large population-based study was conducted to estimate the prevalence of Dupuytren's disease in US adults and describe associated treatment patterns. METHODS A total of 23,103 individuals from an Internet-based research panel representative of the US population completed a brief online survey designed to identify individuals with symptoms,… (More)
This paper presents a framework for automatically constructing timeline summaries from collections of web news articles. We also evaluate our solution against manually created timelines and in comparison with related work.
Purpose. To estimate the US prevalence of Peyronie's disease (PD) from patient-reported data and to identify diagnosis and treatment patterns. Methods. 11,420 US males ≥18 years old completed a brief web-based survey regarding the presence of PD, past treatments, and penile symptoms (Phase 1). Phase 1 respondents with PD diagnosis, history of treatment, or… (More)
This paper presents a new conversion method to automatically transform a constituent-based Vietnamese Treebank into dependency trees. On a dependency Treebank created according to our new approach, we examine two stateof-the-art dependency parsers: the MSTParser and the MaltParser. Experiments show that the MSTParser outperforms the MaltParser. To the best… (More)
We present a new feature type named rating-based feature and evaluate the contribution of this feature to the task of document-level sentiment analysis. We achieve state-of-the-art results on two publicly available standard polarity movie datasets: on the dataset consisting of 2000 reviews produced by Pang and Lee (2004) we obtain an accuracy of 91.6% while… (More)
Knowledge bases are useful resources for many natural language processing tasks, however, they are far from complete. In this paper, we define a novel entity representation as a mixture of its neighborhood in the knowledge base and apply this technique on TransE—a well-known embedding model for knowledge base completion. Experimental results show that the… (More)
This paper presents an empirical comparison of different dependency parsers for Vietnamese, which has some unusual characteristics such as copula drop and verb serialization. Experimental results show that the neural network-based parsers perform significantly better than the traditional parsers. We report the highest parsing scores published to date for… (More)