Analysis of a very large web search engine query log

Craig Silverstein, Monika Henzinger, Hannes Marais, Michael Moricz
SIGIR Forum
In this paper we present an analysis of an AltaVista Search Engine query log consisting of approximately 1 billion entries for search requests over a period of six weeks. This represents almost 285 million user sessions, each an attempt to fill a single information need. We present an analysis of individual queries, query duplication, and query sessions. We also present results of a correlation analysis of the log entries, studying the interaction of terms within queries. Our data supports the…
Analysis of the query logs of a Web site search engine
An analysis of the search logs of the Utah state government Web site's search engine shows that some statistics of Web users are the same for general-purpose search engines and Web site search engines, but others are considerably different.
An Investigation Into the Use of Simple Queries On Web IR Systems
In general, increasing the complexity of the queries had little effect on the results, with greater than 70% overlap in results, on average.
Coverage, relevance, and ranking: The impact of query operators on Web search engine results
An examination of the effects of query operators on the performance of three major Web search engines concluded that the use of most query operators had no significant effect on coverage, relative precision, or ranking, although the effect varied depending on the search engine.
Analysis of long queries in a large scale search log
This paper analyzes the long queries in the search log with the aim of identifying the characteristics of the most commonly occurring types of queries, and the issues involved with using them effectively in a search engine.
Subject categorization of query terms for exploring Web users' search interests
This article presents a query categorization approach to automatically classifying Web query terms into broad subject categories and demonstrates that the approach is efficient in dealing with large numbers of queries and adaptable to the dynamic Web environment.
A General Classification of (Search) Queries and Terms
Time-dependent clusters of (search) terms around particular subjects are found and strategies for the design of search engines and Web pages which focus on the (information) consumer are developed.
Temporal analysis of a very large topically categorized Web query log
This study builds on the authors' previous work, which showed changes in popularity and uniqueness of topically categorized queries across the hours in a day, and outlines a method for studying both the static and topical properties of a very large query log over varying periods.
A Query Log Analysis of Dataset Search
The first query log analysis for dataset search, based on logs of four national open data portals, is presented to gain a better understanding of the typical users of these portals and the types of queries they issue, and frame the findings in the broader context of dataset search.
Characterizing Search Behavior in Web Archives
This work presents the first search behavior characterization of web archive users, finding strong evidence that users prefer the oldest documents over the newest, but mostly search without any temporal restriction.
Real time search user behavior
This research analyzes user interactions with a real time search engine and investigates aggregate usage of the search engine, such as number of users, queries, and terms, and the structure of queries and terms submitted by these users.


Real life information retrieval: a study of user queries on the Web
This work analyzed transaction logs of a set of 51,473 queries posed by 18,113 users of Excite, a major Internet search service, to provide data on the number of search terms, and the use of the service.
Categorical Data Analysis.