An Architectural Framework of a Crawler for Retrieving Highly Relevant Web Documents by Filtering Replicated Web Collections

  • S. Shekhar, Rohit Agrawal, K. V. Arya
  • 2010 International Conference on Advances in Computer Engineering
As the Web continues to grow, finding relevant information with traditional search engines has become difficult. Many index-based web search engines exist for searching information in various domains on the Web. However, the documents (URLs) that these engines retrieve for a searched topic are often of poor quality. Moreover, because the number of Web pages is growing rapidly, devising a personalized Web search is of great importance. This paper proposes a…
A WEBIR Crawling Framework for Retrieving Highly Relevant Web Documents: Evaluation Based on Rank Aggregation and Result Merging Algorithms
Clustering Retrieved Web Documents to Speed Up Web Searches
Semantic Based Image Retrieval using multi-agent model by searching and filtering replicated web images
Enhancing Web Search Using Query-Based Clusters and Labels
  • Rani Qumsiyeh, Yiu-Kai Ng
  • Computer Science
  • 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT)
  • 2013
Enhancing web search by using query-based clusters and multi-document summaries
Study of Web Crawler and its Different Types
Genetically optimizing query expansion for retrieving activities from the web
Searching made easy: A multithreading based desktop search engine
Linguistic structural framework for encoding transliteration variants for word origin detection using bilingual lexicon
  • S. Shekhar, D. Sharma, M. Beg
  • Computer Science
  • 2017 International Conference on Multimedia, Signal Processing and Communication Technologies (IMPACT)
  • 2017


Mining the Web's Link Structure
Intelligent crawling on the World Wide Web with arbitrary predicates
Searching for Hidden-Web Databases
An adaptive crawler for locating hidden-Web entry points
Crawling the Hidden Web
Semantic Web Content Analysis: A Study in Proximity-Based Collaborative Clustering
Cooperative crawling
  • M. Buzzi
  • Computer Science
  • Proceedings of the IEEE/LEOS 3rd International Conference on Numerical Simulation of Semiconductor Optoelectronic Devices (IEEE Cat. No.03EX726)
  • 2003
Design and implementation of a high-performance distributed Web crawler
Web mining: information and pattern discovery on the World Wide Web
A scalable comparison-shopping agent for the World-Wide Web