Thanumalayan Sankaranarayana Pillai

We present the first comprehensive study of application-level crash-consistency protocols built atop modern file systems. We find that applications use complex update protocols to persist state, and that the correctness of these protocols is highly dependent on subtle behaviors of the underlying file system, which we term persistence properties. We develop…
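As an illustration of the kind of update protocol this abstract refers to, here is a minimal Python sketch of one widely used pattern: write the new contents to a temporary file, fsync it, rename it over the target, then fsync the parent directory. It is not taken from the paper. The helper name `atomic_write` and the `.tmp` suffix are hypothetical, a POSIX system is assumed, and whether each step is strictly needed depends on the persistence properties of the file system in use.

```python
import os

def atomic_write(path: str, data: bytes) -> None:
    """Replace the contents of `path` with `data` (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"  # assumed naming scheme for the temporary file

    # Step 1: write the new contents to a temporary file and force them to disk.
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

    # Step 2: rename the temporary file over the target name.
    os.replace(tmp_path, path)

    # Step 3: fsync the parent directory so the rename itself is persisted.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Skipping step 1 or step 3 is the sort of shortcut whose safety varies from one file system to another.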
Applications employ complex protocols to ensure consistency after system crashes. Such protocols are affected by the exact behavior of file systems. However, modern file systems vary widely in such behavior, reducing the correctness and performance of applications. In this paper, we study application-level crash consistency. Through the detailed study of…
We introduce optimistic crash consistency, a new approach to crash consistency in journaling file systems. Using an array of novel techniques, we demonstrate how to build an optimistic commit protocol that correctly recovers from crashes and delivers high performance. We implement this optimistic approach within a Linux ext4 variant which we call…
We present WiscKey, a persistent LSM-tree-based key-value store with a performance-oriented data layout that separates keys from values to minimize I/O amplification. The design of WiscKey is highly SSD optimized, leveraging both the sequential and random performance characteristics of the device. We demonstrate the advantages of WiscKey with both…
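The idea of separating keys from values can be illustrated with a toy Python sketch. This is not WiscKey's implementation: the class name `ValueLogStore` is hypothetical, a plain dict stands in for the LSM-tree, and garbage collection and crash recovery are omitted. Values go into an append-only log file, and the index keeps only each key and the location of its value.

```python
import os
from typing import Optional

class ValueLogStore:
    """Toy key-value separation: values in an append-only log, keys in an index."""

    def __init__(self, log_path: str) -> None:
        # Open for appending and reading; the file is created if missing.
        self._log = open(log_path, "a+b")
        # A dict stands in for the LSM-tree; it maps key -> (offset, length).
        self._index = {}

    def put(self, key: bytes, value: bytes) -> None:
        # Append the value to the end of the log and remember where it landed.
        self._log.seek(0, os.SEEK_END)
        offset = self._log.tell()
        self._log.write(value)
        self._log.flush()
        self._index[key] = (offset, len(value))

    def get(self, key: bytes) -> Optional[bytes]:
        # Look up the small index entry, then do one read into the value log.
        entry = self._index.get(key)
        if entry is None:
            return None
        offset, length = entry
        self._log.seek(offset)
        return self._log.read(length)

    def close(self) -> None:
        self._log.close()
```

Because the index holds only keys and small pointers, reorganizing it does not rewrite the values. Overwritten values stay in the log until some form of garbage collection reclaims them, which this sketch leaves out.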
Applications are deployed upon deep, diverse storage stacks that are constructed on demand. Although many storage stacks share a common API to allow portability, application behavior differs in subtle ways depending upon unspecified properties of the underlying storage stack. Currently, there is no way to test whether an application will behave correctly…
We present Quarantine, a system that enables data-driven selective isolation within concurrent server applications. Instead of constructing arbitrary isolation boundaries between components, Quarantine collects data to learn where such boundaries should be placed, and then instantiates said barriers to improve reliability. We present the case for…
Modern distributed storage systems employ complex protocols to update replicated data. In this paper, we study whether such update protocols work correctly in the presence of correlated crashes. We find that the correctness of such protocols hinges on how local file-system state is updated by each replica in the system. We build PACE, a framework that…
We introduce Fracture, a novel framework that transforms and modernizes the basic process abstraction. By "fracturing" an application into logical modules, Fracture enables powerful and novel run-time configurations that improve run-time testing, application availability, and general robustness, all in a generic and incremental manner. We demonstrate the…