Finding buying guides with a Web carnivore

  title={Finding buying guides with a Web carnivore},
  author={Reiner Kraft and Raymie Stata},
  journal={Proceedings of the IEEE/LEOS 3rd International Conference on Numerical Simulation of Semiconductor Optoelectronic Devices (IEEE Cat. No.03EX726)},
  • Reiner Kraft, R. Stata
  • Published 10 November 2003
  • Computer Science, Business
  • Proceedings of the IEEE/LEOS 3rd International Conference on Numerical Simulation of Semiconductor Optoelectronic Devices (IEEE Cat. No.03EX726)
Research on buying behavior indicates that buying guides perform an important role in the overall buying process. However, while the Web contains many buying guides, finding those guides is difficult to impossible for the average consumer. Web search engines typically index many buying guides on many topics, but simple queries do not often return these results. Given this, we built a Web carnivore that finds buying guides on behalf of consumers. Web carnivores leverage the crawling, scrubbing… 


Enrichment in Performance of Focused Web Crawlers

This thesis discusses crawler basics, commonly used Web crawling techniques, and pseudocode for various basic crawling algorithms, together with their implementations in C and simplified flowcharts.

Web Page Content Block Partitioning for Focussed Crawling

To address the problem of retrieving relevant, high-quality web pages, an algorithm is designed that partitions each web page into blocks on the basis of headings and then calculates the relevancy of each block.

Correlation of Expert and Search Engine Rankings

To determine whether expert rankings of real-world entities correlate with search engine rankings of the corresponding web resources, 9 experiments are conducted using 8 expert rankings on a range of academic, athletic, financial and popular culture topics.

Agreeing to disagree: search engines and their public interfaces

This work provides the first in-depth quantitative analysis of the results produced by the Google, MSN and Yahoo API and WUI interfaces, and finds that MSN produces the most consistent results between its two interfaces.

Searching with context

Query rewriting (QR), rank-biasing (RB), and iterative filtering meta-search (IFM) represent a cost-effective design spectrum for contextual search, and it is shown that while QR works surprisingly well, relevance and recall can be improved using RB and substantially more using IFM.


The authors conclude that the paper provides an efficient mechanism for a focused crawler to index web pages that are more relevant to the topic.

Improving the performance of focused web crawlers

Using genre to improve web search

This dissertation explores the use of genre as a document descriptor to improve the effectiveness of web searching, developing a genre palette and showing that users agree on the genres of webpages when choosing from that palette.

Context-aware web search in ubiquitous sensor environments

This article proposes a new concept for a context-aware Web search method that automatically retrieves a webpage related to the daily activity that a user currently is engaged in and displays the

A Focused Crawler in order to Get Semantic Web Resources (CSR)

This research work proposes a focused crawler that downloads these resources automatically and stores them on disk, building a collection to be used for data processing.



Finding Relevant Website Queries

The system suggests queries based on the number of URLs each query has in common with the set of URLs related to the starting URL, and ranks the queries by that number.
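The ranking described in this summary can be illustrated with a minimal sketch. It assumes a mapping from each candidate query to the URLs it returns; the function name `suggest_queries`, the data layout, and the sample URLs are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def suggest_queries(related_urls, query_to_urls):
    """Rank candidate queries by how many URLs they share with related_urls.

    related_urls: iterable of URLs related to the starting URL (assumed given).
    query_to_urls: dict mapping a query string to the URLs it returns (assumed given).
    """
    related = set(related_urls)
    overlap = Counter()
    for query, urls in query_to_urls.items():
        shared = len(related & set(urls))
        if shared > 0:
            overlap[query] = shared
    # Highest overlap first; ties are broken alphabetically for stable output.
    return sorted(overlap.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample data, for illustration only.
related = {"example.com/a", "example.com/b", "example.com/c"}
candidates = {
    "digital camera guide": ["example.com/a", "example.com/b", "other.org/x"],
    "camera reviews": ["example.com/c"],
    "cheap flights": ["travel.net/y"],
}
ranking = suggest_queries(related, candidates)
print(ranking)
```

Queries with no URL in common with the related set are dropped rather than ranked last, which is one possible design choice among several.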

Mining topic-specific concepts and definitions on the web

The goal is to help people learn in-depth knowledge of a topic systematically on the Web, and the proposed techniques first identify those sub-topics or salient concepts of the topic, and then find and organize those informative pages, containing definitions and descriptions of the topic and sub-topics, just like those in a book.

Information retrieval on the web

Overall trends cited by the sources are consistent and point to exponential growth in the past and in the coming decade, and the development of new techniques targeted to resolve some of the problems associated with Web-based information retrieval is discussed.

Iterative Information Retrieval Using Fast Clustering and Usage-Specific Genres

This paper describes how collection-specific, empirically defined, stylistics-based genre prediction can be brought together with rapid topical clustering to build an interactive information

Improving automatic query expansion

Experimental results show that refining the set of documents used in query expansion often prevents the query drift caused by blind expansion and yields substantial improvements in retrieval effectiveness, both in terms of average precision and precision in the top twenty documents.

Query-Free News Search

A variety of algorithms were evaluated for finding news articles on the web that are relevant to news currently being broadcast, looking at the impact of inverse document frequency, stemming, compounds, history, and query length on the relevance and coverage of news articles returned in real time during a broadcast.

Implications of buyer decision theory for design of e-commerce websites

The contention is that significant investment and effort are required at any given website in order to create the decision support and search agents needed to properly support buyer decision-making.

Google hacks - 100 industrial-strength tips and tools

Google Hacks explores this unique interface of Google, demonstrating clever ways to perform a wide variety of tasks using Google, a powerful and highly customizable user interface for tapping the resources of the Internet.

Agent-Mediated Integrative Negotiation for Retail Electronic Commerce

This paper analyzes approaches from economic, behavioral, and software agent perspectives, then proposes integrative negotiation as a more suitable approach to retail electronic commerce and identifies promising techniques for implementing agent-mediated integrative negotiation.

Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL

This paper presents a simple unsupervised learning algorithm for recognizing synonyms, based on statistical data acquired by querying a Web search engine. The algorithm, called PMI-IR, uses Pointwise