Named Entity Recognition or Extraction (NER) is an important task in automated text processing for industry and academia engaged in language processing, intelligence gathering, and bioinformatics. In this paper we discuss the general problem of Named Entity Recognition, and more specifically the challenges of NER in languages that do not have ...
We are interested in contributing a small, publicly available Urdu corpus of written text to the natural language processing community. The Urdu text is stored in the Unicode character set, in its native Arabic script, and marked up according to the Corpus Encoding Standard (CES) XML Document Type Definition (DTD). All the tags and metadata are in English.
Hepatitis B and C are common in Pakistan, and various risk factors contribute to their spread. One thousand and fifty consecutive male cases suffering from chronic liver disease (327 HBV and 723 HCV) were selected from the OPD of a public sector hospital and a private clinic dealing exclusively with liver patients. To compare the results, 723 age and ...
Several algorithms based on link analysis have been developed to measure the importance of nodes in a graph, such as pages on the World Wide Web. PageRank and HITS are the most popular algorithms for ranking the nodes of a directed graph. However, both algorithms assign equal importance to all edges and nodes, ignoring the semantically rich ...
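To make the "equal importance to all the edges" point concrete, here is a minimal sketch of plain PageRank by power iteration. It is an illustration only, not the method of the paper summarized above; the function name `pagerank`, the adjacency-dict input format, the damping value of 0.85, and the fixed iteration count are all assumptions chosen for the example.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Plain PageRank by power iteration (illustrative sketch).

    graph: dict mapping each node to a list of nodes it links to.
    Returns a dict mapping each node to its rank; ranks sum to 1.
    """
    # Collect every node, including ones that only appear as link targets.
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n = len(nodes)

    # Start from a uniform distribution.
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every node receives the same "teleport" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                # Each outgoing edge carries an identical share of the rank:
                # this is the uniform edge weighting the abstract refers to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no outgoing links spreads its rank over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page graph: a and b link to each other, c links to a.
ranks = pagerank({"a": ["b"], "b": ["a"], "c": ["a"]})
```

In this toy graph, `a` ends up ranked highest because it has two incoming links, and `c` lowest because nothing links to it. A weighted variant would replace the uniform `share` with a per-edge weight.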
This paper explains the challenges pertaining to Urdu stemming and presents a rule-based prototype, with a few rules implemented for Urdu, to illustrate the intricacies. It shows that Urdu stemming is quite challenging because of Urdu’s diverse nature and because Arabic and Farsi stemmers cannot be used for Urdu. Dictionary-based error-correcting schemes used by ...
This paper describes a thesis proposal on concept search in non-English and non-European languages. Urdu is chosen as an example language because of its unique nature, its morphology, and its large number of speakers. Despite its importance, Urdu does not have adequate language resources for research in Information Retrieval (IR). It is shown ...
Named Entity Recognition (NER) seeks to locate and classify atomic elements in text into predefined categories such as the names of persons, organizations, and locations, quantities, percentages, etc. Named entities tell us the role of each meaning-bearing word in a sentence, and hence identifying these entities helps us to extract the essence of the ...
The goal of conferences such as TREC, TIPSTER, NTCIR, and CLEF is to judge the performance of different algorithms. Most of these conferences have tracks that deal with new and innovative information retrieval problems, but none has worked with Urdu data, primarily because of the lack of resources. In this paper we present a baseline for Urdu IR evaluation ...
Of the 149 women with 0 pregnancy losses, 7 (5%) had factor XI level ≥150% versus 5 of 31 (16%) women with recurrent pregnancy loss. Three of the 5 women with high factor XI and recurrent pregnancy loss, with 19 previous pregnancy losses and 0 live births, were given enoxaparin during 5 subsequent pregnancies, and had 6 term live births and 1 miscarriage.