Information retrieval on the web

@article{Kobayashi2000InformationRO,
  title={Information retrieval on the web},
  author={Mei Kobayashi and Koichi Takeda},
  journal={ACM Comput. Surv.},
  year={2000},
  volume={32},
  pages={144--173}
}
In this paper we review studies of the growth of the Internet and technologies that are useful for information search and retrieval on the Web. We present data on the Internet from several different sources, e.g., current as well as projected number of users, hosts, and Web sites. Although numerical figures vary, overall trends cited by the sources are consistent and point to exponential growth in the past and in the coming decade. Hence it is not surprising that about 85% of Internet users…
Intelligent Web Agent for Search Engines
TLDR
This paper illustrates the different types of agents, crawlers, robots, etc., for mining the contents of the web in a methodical, automated manner and discusses the use of crawlers to gather specific types of information from Web pages, such as harvesting e-mail addresses.
A PRIMER ON THE WEB INFORMATION RETRIEVAL PARADIGM
The unabated growth of the Web and the increasing expectation placed by the user on the search engine to anticipate and infer his/her information needs and provide relevant results has fostered the…
Enhancing the Power of the Internet
TLDR
The design of any new intelligent search engine should be based on at least two main motivations: the web environment is, for the most part, unstructured and imprecise, and a logic that supports modes of reasoning which are approximate rather than exact is needed.
A SURVEY: FROM IR TO WEB IR
TLDR
A Normalized Google Distance (NGD) algorithm, which uses Google as a semantic corpus, is introduced; it can provide a new aspect for IR research and extract the most important keywords or keyword sequences for advanced knowledge discovery.
Information discovery and retrieval tools
  • M. Frame
  • Computer Science
  • Inf. Serv. Use
  • 2004
TLDR
This session will focus on the various Internet search engines, directories, and how to improve the user experience through the use of such techniques as metadata, meta-search engines, subject specific search tools, and other developing technologies.
Enhanced Web document retrieval using automatic query expansion
TLDR
This work describes a scheme that attempts to remedy the situation by automatically expanding the user query through the analysis of initially retrieved documents, and experimental results to demonstrate the effectiveness of the query expansion scheme are presented.
Web Information Retrieval
TLDR
This paper takes a deeper dive into the Web IR process, a variant of classical Information Retrieval, by clearly explaining its core concepts, the components, model categories, tools, tasks and the performance measures that quantify the quality of retrieval results.
COMPARATIVE STUDY OF SOME POPULAR WEB SEARCH ENGINES
Web search engines are veritable tools for information mining. These search engines differ in effectiveness at retrieving relevant documents from the web. A big question is “which search engine(s)…
Effective Retrieval of Information in Tables on the Internet
TLDR
Based on the similarity to a Web HTML document, the main purpose here is to do table parsing and construct a dictionary of table indexes for applying to the information retrieval system and thus enhance the accuracy.
Users search trends on WWW and their analysis
TLDR
A survey was conducted to carry out quantitative studies that can support a better understanding of users' behavior and requirements when using a search engine, and consequently help to improve how it works.

References

SHOWING 1-10 OF 298 REFERENCES
Searching the Web: general and scientific information access
  • S. Lawrence, C. Lee Giles
  • Computer Science
  • First IEEE/POPOV Workshop on Internet Technologies and Services. Proceedings (Cat. No.99EX391)
  • 1999
The World Wide Web has revolutionized the way that people access information, and has opened up new possibilities in areas such as digital libraries, general and scientific information dissemination…
Learning Information Retrieval Agents: Experiments with Automated Web Browsing
TLDR
A system which helps users keep abreast of new and interesting information. Every day it presents a selection of interesting web pages, and the user evaluates each page; given this feedback, the system adapts and attempts to produce better pages the following day.
WebQuery: Searching and Visualizing the Web Through Connectivity
TLDR
This work examines links among the nodes returned in a keyword-based query, finding “interesting” sites that are highly connected to those sites returned by the original query by finding ‘hot spots’ on the Web that contain information germane to a user's query.
Text and Image Metasearch on the Web
TLDR
Both the text and image metasearch functions of Inquirus are surprisingly fast, and the parallel architecture of the engine that provides this efficiency is described.
A Technique for Measuring the Relative Size and Overlap of Public Web Search Engines
TLDR
A standardized, statistical way of measuring search engine coverage and overlap through random queries is described that can be implemented by third-party evaluators using only public query interfaces and suggests the size of the static, public Web as of November was over 200 million pages.
Querying multiple document collections across the Internet
TLDR
GlOSS, a scalable system that chooses the best document sources for a query, is designed, and dSCAM, an "illegal copy" metasearcher that finds potential copies of a document over distributed text sources, is developed.
Human Performance on Clustering Web Pages: A Preliminary Study
TLDR
An initial study of human clustering of web pages, in the hope that it would provide some insight into the difficulty of automating this task, shows that subjects did not cluster identically; in fact, any two subjects had little similarity in their web-page clusters.
An Adaptive Agent for Automated Web Browsing
TLDR
A system which learns to browse the Internet on behalf of a user: every day it presents a selection of interesting Web pages, the user evaluates each page, and given this feedback the system adapts and attempts to produce better pages the following day.
The Anatomy of a Large-Scale Hypertextual Web Search Engine
TLDR
This paper provides an in-depth description of Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and looks at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
Multi-Service Search and Comparison Using the MetaCrawler
Standard Web search services, though useful, are far from ideal. There are over a dozen different search services currently in existence, each with a unique interface and a database covering a di…