Corpus ID: 18596682

Failure as a Service (FaaS): A Cloud Service for Large-Scale, Online Failure Drills

@inproceedings{Gunawi2011FailureAA,
  title={Failure as a Service (FaaS): A Cloud Service for Large-Scale, Online Failure Drills},
  author={Haryadi S. Gunawi and Thanh Do and J. M. Hellerstein and I. Stoica and D. Borthakur and J. Robbins},
  year={2011}
}
  • Haryadi S. Gunawi, Thanh Do, J. M. Hellerstein, I. Stoica, D. Borthakur, J. Robbins
  • Published 2011
  • Engineering
  • Cloud computing is pervasive, but cloud service outages still take place. One might say that the computing forecast for tomorrow is “cloudy with a chance of failure.” One main reason why major outages still occur is that there are many unknown large-scale failure scenarios in which recovery might fail. We propose a new type of cloud service, Failure as a Service (FaaS), which allows cloud services to routinely perform large-scale failure drills in real deployments. 
    53 Citations

    The Case for Drill-Ready Cloud Computing
    • 12
    Failure scenario as a service (FSaaS) for Hadoop clusters
    • 36
    Evolution of as-a-Service Era in Cloud
    • 41
    Phantom of the cloud: Towards improved cloud availability and dependability
    • 4
    Efficient Inter-cloud Replication for High-Availability Services
    • 11
    Approaches for Resilience against Cascading Failures in Cloud Datacenters
    • 6
    A Survey on Resiliency Techniques in Cloud Computing Infrastructures and Applications
    • 80
    Self-managing SLA compliance in cloud architectures: a market-based approach
    • 14
    The Hydra: A Layered, Redundant Configuration Management Approach for Cloud-Agnostic Disaster Recovery
    • K. Huang, Kyrre M. Begnum
    • Computer Science
    • 2013 IEEE 5th International Conference on Cloud Computing Technology and Science
    • 2013
    • 2
