Searching web documents using a summarization approach

@article{Qumsiyeh2016SearchingWD,
  title={Searching web documents using a summarization approach},
  author={Rani Qumsiyeh and Yiu-Kai Ng},
  journal={Int. J. Web Inf. Syst.},
  year={2016},
  volume={12},
  pages={83--101}
}
Purpose: This paper introduces a summarization method that enhances current web-search approaches by offering a summary of each clustered set of web-search results whose contents address the same topic, which should allow the user to quickly identify the information covered in the clustered search results. Web search engines, such as Google, Bing and Yahoo!, rank the set of documents S retrieved in response to a user query and represent each document D in S using…

Evaluating Search Results in Exploratory Search
TLDR
Exploratory search as a mechanism for conducting result-oriented search is reviewed, along with ways of evaluating the results obtained from an exploratory search.
Search and Aggregation in XML Documents
TLDR
This paper proposes and evaluates an aggregated search method, based on the Top-k Approximate Subtree Matching (TASM) algorithm, to obtain the most accurate and richest answers in XML fragment search, and proposes a new similarity function to improve the returned fragments.
Multi-Documents Summarization Based on TextRank and its Application in Online Argumentation Platform
TLDR
A method of speech text analysis is proposed based on TextRank, which takes into account the position of sentences in paragraphs, the weight of the key sentence, and the length of the sentence to extract a multi-document summary.
Searching Web Documents Using a Fuzzy-Based Method
TLDR
A solution is proposed based on a fuzzy linguistic description of the document, a linguistic variant of standard metadata types, which is used to qualitatively represent both meta-information and user needs.
Extractive Multi-Document Arabic Text Summarization Using Evolutionary Multi-Objective Optimization With K-Medoid Clustering
TLDR
This paper proposes an automatic, generic, and extractive Arabic multi-document summarization system that employs clustering-based and evolutionary multi-objective optimization methods and outperforms other peer systems on all ROUGE metrics using TAC 2011.
A Visualization Technique to Support Searching Filtering
TLDR
The main contribution of this paper is a review of previous exploratory-search-based works, the features they utilise, and their existing applications, with visualizations as the mechanism for developing filters to narrow down search results.
A Semi-Automatic Annotation Method of Effect Clue Words for Chinese Patents Based on Co-Training
TLDR
This article summarizes the classification and characteristics of effect clue words, and proposes a co-training-based method of extracting an effect clue word thesaurus from Chinese patents, suitable for various fields, through a strategy called self-filtering.
Improving Big Data Technologies with Visual Faceted Search
TLDR
This work proposes a new FS framework for visualizing the browsing and refinement of search results, which allows users to visually build complex search queries and can also address the problem of lexical uncertainty in current search engines and increase user interest.

References

Enhancing Web Search Using Query-Based Clusters and Labels
  • Rani Qumsiyeh, Yiu-Kai Ng
  • Computer Science
    2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT)
  • 2013
TLDR
Experimental results show that QCL is effective and efficient in generating high-quality clusters of documents on specific topics with informative labels, which saves the user's time and effort in searching for specific information of interest without having to browse through the documents one by one.
Automatic Multi-Document Summarization for Digital Libraries
TLDR
This paper reports three types of multi-document summaries generated for a set of research abstracts, using different summarization approaches: a sentence-based summary generated by a MEAD summarization system that extracts important sentences using various features, another sentence-based summary generated by extracting research objective sentences, and a variable-based summary focusing on research concepts and relationships.
Query enrichment for web-query classification
TLDR
It is shown that, despite the difficulty of an abundance of ambiguous queries and lack of training data, the query-enrichment technique can solve the problem satisfactorily through a two-phase classification framework.
QCS: A system for querying, clustering and summarizing documents
Generic Text Summarization Using Probabilistic Latent Semantic Indexing
TLDR
This paper presents a method for creating an extractive summary of a document by using PLSI to analyze document features such as term frequency and graph structure, and shows the results.
Search Engines - Information Retrieval in Practice
TLDR
This text provides the background and tools needed to evaluate, compare and modify search engines; numerous programming exercises make extensive use of Galago, a Java-based open source search engine.
Experiments in multidocument summarization
TLDR
A multidocument summarizer built upon research into the detection of new information uses several new strategies to select interesting and informative sentences, including an innovative measure of importance derived from the analysis of a large corpus.
Learning Sub-structures of Document Semantic Graphs for Document Summarization
paper we present a method for summarizing document by creating a semantic graph of the original document and identifying the substructure of such a graph that can be used to extract sentences for a
Incorporating prior knowledge into a transductive ranking algorithm for multi-document summarization
TLDR
This paper presents a transductive approach to learn ranking functions for extractive multi-document summarization by identifying topic themes within a document collection and iteratively trains a ranking function over these two sets of sentences.
Latent dirichlet allocation based multi-document summarization
TLDR
This article uses Latent Dirichlet Allocation to capture the events being covered by the documents and form the summary with sentences representing these different events and shows that the algorithms gave significantly better ROUGE-1 recall measures compared to DUC 2002 winners.