This paper discusses the design and performance of a hierarchical proxy-cache designed to make Internet information systems scale better. The design was motivated by our earlier trace-driven simulation study of Internet traffic. We believe that the conventional wisdom, that the benefits of hierarchical file caching do not merit the costs, warrants…
It is increasingly difficult to make effective use of Internet information, given the rapid growth in data volume, user base, and data diversity. In this paper we introduce Harvest, a system that provides a scalable, customizable architecture for gathering, indexing, caching, replicating, and accessing Internet information.
Over the past several years, a number of information discovery and access tools have been introduced in the Internet, including Archie, Gopher, Netfind, and WAIS. These tools have become quite popular, and are helping to redefine how people think about wide-area network applications. Yet, they are not well suited to supporting the future information…
Rapid growth in data volume, user base, and data diversity renders Internet-accessible information increasingly difficult to use effectively. In this paper we introduce Harvest, a system that provides an integrated set of customizable tools for gathering information from diverse repositories, building topic-specific content indexes, flexibly searching the…
Ongoing increases in wide-area network connectivity promise vastly increased opportunities for collaboration and resource sharing. A fundamental problem confronting users of such networks is how to discover the existence of resources of interest, such as files, retail products, network services, or people. In this article we focus on the problem of…
This paper presents evidence that several, judiciously placed file caches could reduce the volume of FTP traffic by 42%, and hence the volume of all NSFNET backbone traffic by 21%. In addition, if FTP client and server software automatically compressed data, this savings could increase to 27%. We believe that a hierarchical architecture of whole file…
In the past several years, the number and variety of resources available on the Internet have increased dramatically. With this increase, many new systems have been developed that allow users to search for and access these resources. As these systems begin to interconnect with one another through "information gateways", the conceptual relationships between…
The author discusses aspects of the Internet's resource discovery problem: how users specify searches, the difference between discovering classes of resources and locating appropriate instances, system-management problems that can be cast as global state discovery searches, issues involved with characterizing resources and with the efficient distribution of…
In this paper we consider the problem of choosing among a collection of replicated servers, focusing on the question of how to make choices that segregate client/server traffic according to network topology. We explore the cost and effectiveness of a variety of approaches, ranging from those requiring routing layer support (e.g., anycast) to those that…
Heterogeneity in hardware and software is an inevitable consequence of experimental computer research. At the University of Washington, the Heterogeneous Computer Systems (HCS) project is a major research and development effort whose goal is to simplify the interconnection of heterogeneous computer systems.