This paper discusses the design and performance of a hierarchical proxy-cache designed to make Internet information systems scale better. The design was motivated by our earlier trace-driven simulation study of Internet traffic. We believe that the conventional wisdom, that the benefits of hierarchical file caching do not merit the costs, warrants...
It is increasingly difficult to make effective use of Internet information, given the rapid growth in data volume, user base, and data diversity. In this paper we introduce Harvest, a system that provides a scalable, customizable architecture for gathering, indexing, caching, replicating, and accessing Internet information.
Ongoing increases in wide-area network connectivity promise vastly increased opportunities for collaboration and resource sharing. A fundamental problem confronting users of such networks is how to discover the existence of resources of interest, such as files, retail products, network services, or people. In this article we focus on the problem of...
This paper presents evidence that several judiciously placed file caches could reduce the volume of FTP traffic by 42%, and hence the volume of all NSFNET backbone traffic by 21%. In addition, if FTP client and server software automatically compressed data, this savings could increase to 27%. We believe that a hierarchical architecture of whole file...
In the past several years, the number and variety of resources available on the Internet have increased dramatically. With this increase, many new systems have been developed that allow users to search for and access these resources. As these systems begin to interconnect with one another through "information gateways", the conceptual relationships between...
Discovering different types of file resources (such as documentation, programs, and images) in the vast amount of data contained within network file systems is useful for both users and system administrators. In this paper we discuss the Essence resource discovery system, which exploits file semantics to index both textual and binary files. By exploiting...
In this paper we present an architecture and prototype implementation for discovering key network characteristics, such as hosts, gateways, and topology. The Fremont system uses an extensible set of modules to discover information, based on a variety of different protocols and information sources, rather than a single network management protocol. This...
Indexing file contents is a powerful means of helping users locate documents, software, and other types of data among large repositories. In environments that contain many different types of data, content indexing requires type-specific processing to extract information effectively. We present a model for type-specific, user-customizable information...