Content-Addressable Network

  • Published 2010

Abstract

1 Introduction

A hash table is a data structure that efficiently maps "keys" onto "values" and serves as a core building block in the implementation of software systems. We conjecture that many large-scale distributed systems could likewise benefit from hash table functionality. We use the term Content-Addressable Network (CAN) to describe such a distributed, Internet-scale hash table.

Perhaps the best examples of current Internet systems that could potentially be improved by a CAN are the recently introduced peer-to-peer file sharing systems such as Napster and Gnutella. In these systems, files are stored at the end-user machines (peers) rather than at a central server and, as opposed to the traditional client-server model, files are transferred directly between peers. These peer-to-peer systems have become quite popular. Napster was introduced in mid-1999 and, as of December 2000, the software had been downloaded by 50 million users, making it the fastest growing application on the Web. New file sharing systems such as Scour, FreeNet, Ohaha, Jungle Monkey, and MojoNation have all been introduced within the last year.

While there remains some (quite justified) skepticism about the business potential of these file sharing systems, we believe their rapid and widespread deployment suggests that there are important advantages to peer-to-peer systems. Peer-to-peer designs harness huge amounts of resources (the content advertised through Napster has been observed to exceed 7 TB of storage on a single day) without requiring centralized planning or huge investments in hardware, bandwidth, or rack space. As such, peer-to-peer file sharing may lead to new content distribution models for applications such as software distribution, file sharing, and static web content delivery.

Unfortunately, most of the current peer-to-peer designs are not scalable. For example, in Napster a central server stores the index of all the files available within the Napster user community. To retrieve a file, a user queries this central server using the desired file's well-known name and obtains the IP address of a user machine storing the requested file. The file is then downloaded directly from this user machine. Thus, although Napster uses a peer-to-peer communication model for the actual file transfer, the process of locating a file is still very much centralized. This makes it both expensive (to scale the central directory) and vulnerable (since there is a single point of failure). Gnutella goes a step further and decentralizes the file location process as well. Users in a Gnutella …
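
The centralized lookup described above for Napster can be pictured as an ordinary hash table held on one server. The following is a minimal sketch under stated assumptions, not Napster's actual protocol: the class name `CentralIndex`, its methods, and the example addresses are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional, Set


class CentralIndex:
    """Hypothetical stand-in for a central directory server.

    Maps a file's well-known name (the key) to the IP addresses of
    peers that store it (the value). All names here are assumptions.
    """

    def __init__(self) -> None:
        self._locations: Dict[str, Set[str]] = {}

    def register(self, file_name: str, peer_ip: str) -> None:
        # A peer advertises that it stores the named file.
        self._locations.setdefault(file_name, set()).add(peer_ip)

    def lookup(self, file_name: str) -> Optional[str]:
        # Return one peer address for the file, or None if unknown.
        # The transfer itself would then happen directly between peers.
        peers = self._locations.get(file_name)
        if not peers:
            return None
        return sorted(peers)[0]  # deterministic choice for the sketch


index = CentralIndex()
index.register("song.mp3", "10.0.0.2")
index.register("song.mp3", "10.0.0.7")
peer = index.lookup("song.mp3")
```

Every lookup in this sketch goes through the single `index` object, which is the scaling cost and single point of failure the text attributes to a central directory. A CAN, as the term is used here, would spread the same key-to-value table across many machines.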

Cite this paper

@inproceedings{2010ContentAddressableN, title={Content-Addressable Network}, author={}, year={2010} }