Saeedeh Momtazi

Learn More
For the slot filling task of TAC KBP 2010 we developed as a system a simple pipeline architecture whose main components are a two-stage retrieval module and a relation extraction module. We use word-cluster features in the system as a method of achieving generalization by exploiting raw text. In the relation extraction module we use distant supervision in(More)
Persian is one of the Indo-European languages which has borrowed its script from Arabic, a member of Semitic language family. Since Persian and Arabic scripts are so similar, problems arise when we want to process an electronic text. In this paper, some of the common problems faced experimentally in developing a corpus for Persian are discussed. The sources(More)
In this paper a new method for automatic word clustering is presented. We used this method for building n-gram language models for Persian continuous speech recognition (CSR) systems. In this method, each word is specified by a feature vector that represents the statistics of parts of speech (POS) of that word. The feature vectors are clustered by k-means(More)
Expressing opinions and emotions on social media becomes a frequent activity in daily life. People express their opinions about various targets via social media and they are also interested to know about other opinions on the same target. Automatically identifying the sentiment of these texts and also the strength of the opinions is an enormous help for(More)
The prediction task in national language processing means to guess the missing letter, word, phrase, or sentence that likely follow in a given segment of a text. Since 1980s many systems with different methods were developed for different languages. In this paper an overview of the existing prediction methods that have been used for more than two decades(More)
We present the Alyssa QA system which participated in the TAC 2008 Question Answering track. The system consists of two parallel streams: the blogger stream which is used in order to deal with questions which ask for lists of blog authors, and the main stream which processes other opinion questions. We also use a named entity detection component specialized(More)
In this paper we propose a term clustering approach to improve the performance of sentence retrieval in Question Answering (QA) systems. As the search in question answering is conducted over smaller segments of data than in a document retrieval task, the problems of data sparsity and exact matching become more critical. In this paper we propose Language(More)
In this paper, we propose two different language modeling approaches, namely skip trigram and across sentence boundary, to capture the long range dependencies. The skip trigram model is able to cover more predecessor words of the present word compared to the normal trigram while the same memory space is required. The across sentence boundary model uses the(More)