Modern latency-critical online services often rely on composing results from a large number of server components. Hence the tail latency (e.g., the 99th percentile of response time), rather than the average, of these components determines the overall service performance. When hosted on a cloud environment, the components of a service typically co-locate with …
Modern latency-critical online services such as search engines often process requests by consulting large input data that spans many parallel components. Hence the tail latency of these components determines the service latency. To trade off result accuracy for tail latency reduction, existing techniques use the components responding before a specified …
Large-scale interactive services usually divide requests into multiple sub-requests and distribute them to a large number of server components for parallel execution. Hence the tail latency (i.e., the slowest component's latency) of these components determines the overall service latency. On a cloud platform, each component shares and competes for node resources …
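A common thread in these abstracts is that a fanned-out request only completes when its slowest sub-request does, so the tail of the component latency distribution, not the mean, sets the service latency. The following minimal Python sketch illustrates that fan-out effect; the 10 ms / 1000 ms latencies and the 1% slow-call rate are illustrative assumptions, not figures from the papers above.

import random

# Assume each sub-request normally takes 10 ms, but 1 in 100 calls
# stalls at 1000 ms (a heavy tail on one component).
SLOW_PROB, FAST_MS, SLOW_MS = 0.01, 10.0, 1000.0

def component_latency_ms():
    return SLOW_MS if random.random() < SLOW_PROB else FAST_MS

def request_latency_ms(fanout):
    # The request finishes only when all parallel sub-requests do,
    # so its latency is the maximum over the fan-out.
    return max(component_latency_ms() for _ in range(fanout))

for fanout in (1, 10, 100):
    samples = [request_latency_ms(fanout) for _ in range(20_000)]
    slow = sum(s >= SLOW_MS for s in samples) / len(samples)
    print(f"fan-out={fanout:>3}: {slow:5.1%} of requests hit the 1000 ms tail")

# Analytically, P(at least one slow component) = 1 - (1 - 0.01)**fanout:
# roughly 1% at fan-out 1, 10% at fan-out 10, and 63% at fan-out 100,
# which is why a rare per-component tail dominates overall service latency.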