Evaluation of Different Query Expansion Techniques by using Different Similarity Measures in Arabic Documents

Hayel Khafajeh and Nidal Yousef
Millions of users search daily for what they need on the internet and in other information stores by writing queries. Unfortunately, these queries may fail to retrieve what the users need; this failure is known as word mismatch. One way of handling word mismatch is to use a thesaurus, which shows the (usually semantic) relationships between terms. The main goal of this study is to design and build an automatic Arabic thesaurus using the Local Context Analysis technique that can be used in any… 

This study introduces a comparison between two query expansion techniques and shows their effectiveness. Although local context analysis has some advantages over the similarity thesaurus, the association thesaurus, which is global, is generally the most effective one.
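
To make the idea of thesaurus-based query expansion concrete, here is a minimal sketch of a global association thesaurus built from term co-occurrence. It is not the authors' implementation: the function names `build_association_thesaurus` and `expand_query`, the whitespace tokenizer, the binary cosine measure, and the toy English documents are all assumptions made for illustration, and no Arabic stemming or stop-word removal is applied.

```python
from collections import defaultdict
from math import sqrt


def build_association_thesaurus(docs):
    """Map each term to {other_term: cosine similarity}.

    Assumption: each term is represented by the set of documents it
    occurs in (a binary vector), and similarity is the cosine between
    two such vectors.
    """
    postings = defaultdict(set)  # term -> ids of documents containing it
    for doc_id, text in enumerate(docs):
        for term in text.split():  # naive whitespace tokenizer
            postings[term].add(doc_id)

    thesaurus = defaultdict(dict)
    terms = list(postings)
    for i, a in enumerate(terms):
        for b in terms[i + 1:]:
            shared = len(postings[a] & postings[b])
            if shared:
                sim = shared / sqrt(len(postings[a]) * len(postings[b]))
                thesaurus[a][b] = sim
                thesaurus[b][a] = sim
    return thesaurus


def expand_query(query, thesaurus, top_k=2):
    """Append the top_k most similar thesaurus terms for each query term."""
    terms = query.split()
    expanded = list(terms)
    for term in terms:
        # Highest similarity first; ties are broken alphabetically.
        related = sorted(thesaurus.get(term, {}).items(),
                         key=lambda kv: (-kv[1], kv[0]))
        for candidate, _ in related[:top_k]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


# Hypothetical toy collection, used only to exercise the functions.
docs = [
    "car engine repair",
    "automobile engine repair",
    "car automobile dealer",
    "fruit market",
]
thesaurus = build_association_thesaurus(docs)
expanded = expand_query("car", thesaurus)
```

In this sketch the thesaurus is global in the sense that it is computed once over the whole collection. Local context analysis would instead draw expansion terms from the top-ranked documents retrieved for each individual query.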

Novel Automatic Query Building Algorithm Using Similarity Thesaurus

A novel algorithm for query extraction from any collection of documents is suggested. The algorithm uses the similarity thesaurus for query extraction, which makes it applicable to any language.

Query Length and its Impact on Arabic Information Retrieval Performance

The main finding of this research is that using shorter queries improves both precision and recall in Arabic retrieval.

Enhanced Arabic Document Retrieval Using Optimized Query Paraphrasing

This article proposes an enhancement for Arabic information retrieval using a query paraphrasing technique. Two query paraphrasing optimization techniques are proposed to overcome the time complexity and exhaustive calculation of existing query paraphrasing techniques.

Semantic Based Query Expansion for Arabic Question Answering Systems

A method to add semantically equivalent keywords in the questions by using semantic resources is presented and it is suggested that the proposed research can deliver highly accurate answers for Arabic questions.

Information Retrieval from Unstructured Arabic Legal Data

An approach for enhancing the process of AIR based on transforming these texts into structured documents in XML format through a document ontology as well as a set of linguistic grammars is proposed for the enhancement of the search results.

Arabic Studies’ Progress in Information Retrieval

Light is shed on the current progress in the field of Arabic information retrieval, the challenges that hinder the progress of this science are identified, and suggestions for further research are proposed.

Contextual text categorization: an improved stemming algorithm to increase the quality of categorization in arabic text

An improved stemming algorithm, based on root extraction and the n-gram technique, is proposed; it returns Arabic words’ stems without using any morphological rules or grammatical patterns.

Development of Arabic evaluations in information retrieval

This paper practices the imaginative analytical technique to scrutinize the genuineness of Arabic educations in the field of information retrieval and to learn the difficulties that are being confronted in this area.



Query expansion using lexical-semantic relations

Examination of the utility of lexical query expansion in the large, diverse TREC collection shows this query expansion technique makes little difference in retrieval effectiveness if the original queries are relatively complete descriptions of the information being sought even when the concepts to be expanded are selected by hand.

Design and Implementation of Automatic Indexing for Information Retrieval with Arabic Documents

A long series of experiments has demonstrated that automatic indexing is at least as effective as manual indexing and more effective in some cases, and suggests that it can achieve a wider coverage of the literature with less money and produce as good results as with manual indexing.

Query Expansion by Mining User Logs

This study proposes a new method for query expansion based on user interactions recorded in user logs that extracts correlations between query terms and document terms by analyzing user logs and can produce much better results than both the classical search method and the other query expansion methods.

Stop-word removal algorithm for Arabic language

The new Arabic stop-word removal technique has been tested using a set of 242 Arabic abstracts chosen from the Proceedings of the Saudi Arabian National Computer conferences and another data set chosen from the holy Qur'an, and it gives impressive results that reached approximately 98%.

Effectiveness of query expansion in ranked-output document retrieval systems

An evaluation of three methods for the expansion of natural language queries in ranked-output retrieval systems based on term co-occurrence data, on Soundex codes, and on a string similarity measure suggests there is no significant difference in retrieval effectiveness between any of these methods and unexpanded searches.

An approach to the automatic construction of global thesauri

Query Expansion using an Automatically Constructed Thesaurus

This work evaluated the effectiveness of a thesaurus constructed from patents for invalidity search, and found that the method can improve upon traditional document retrieval systems.

Evaluating Relevance Ranking Strategies for MEDLINE Retrieval

Experimental results show that retrievals based on the two strategies had improved performance over the baseline performance, and that TF-IDF weighting is more effective in retrieving relevant documents based on a comparison between the two strategies.

A Thesaurus Construction Method from Large Scale Web Dictionaries

This paper proposes an efficient method to analyze the link structure of Web-based dictionaries to construct an association thesaurus, develops a search engine for evaluation, and conducts a number of experiments to compare the method with other traditional methods such as co-occurrence analysis.

Generating, integrating, and activating thesauri for concept-based document retrieval

A blackboard-based document management system that uses a neural network spreading-activation algorithm which lets users traverse multiple thesauri is discussed, and the system's query formation; the retrieving, ranking and selection of documents; and thesaurus activation are described.