Learn More
In wide area computing systems, it is often desirable to create remote read-only copies (replicas) of files. Replication can be used to reduce access latency, improve data locality, and/or increase robustness, scalability and performance for distributed applications. We define a replica location service (RLS) as a system that maintains and provides access(More)
The ever-increasing scale of scientific data has become a significant challenge for researchers that rely on networks to interact with remote computing systems and transfer results to collaborators worldwide. Despite the availability of high-capacity connections, scientists struggle with inadequate cyberinfrastructure that cripples data transfer(More)
Data replication is a key issue in a Data Grid and can be managed in different ways and at different levels of granularity: for example, at the file level or object level. In the High Energy Physics community, Data Grids are being developed to support the distributed analysis of experimental data. We have produced a prototype data replication tool, the Grid(More)
While wide-area Internet traffic has been heavily studied for many years, the characteristics of traffic inside Internet enterprises remain almost wholly unexplored. Nearly all of the studies of enterprise traffic available in the literature are well over a decade old and focus on individual LANs rather than whole sites. In this paper we present a broad(More)
In this work we present a NIDS cluster as a scalable solution for realizing high-performance, stateful network intrusion detection on commodity hardware. The design addresses three challenges: (i) distributing traffic evenly across an extensible set of analysis nodes in a fashion that minimizes the communication required for coordination, (ii) adapting the(More)
We describe a methodology that enables the real-time diagnosis of performance problems in complex high-performance distributed systems. The methodology includes tools for generating precision event logs that can be used to provide detailed end-to-end application and system level monitoring; a Java agent-based system for managing the large amount of logging(More)
Diagnosis and debugging of performance problems on complex distributed systems requires end-to-end performance information at both the application and system level. We describe a methodology, called NetLogger, that enables real-time diagnosis of performance problems in such systems. The methodology includes tools for generating precision event logs, an(More)
Developers and users of high-performance distributed systems often observe performance problems such as unexpectedly low throughput or high latency. Determining the source of the performance problems requires detailed end-to-end instrumentation of all components, including the applications, operating systems, hosts, and networks. In this paper we describe a(More)
A large number of tools that attempt to estimate network capacity and available bandwidth use algorithms that are based on measuring packet inter-arrival time. However in recent years network bandwidth has become faster than system input/output (I/O) bandwidth. This means that it is getting harder and harder to estimate capacity and available bandwidth(More)