P. P. S. Narayan

A fundamental concern of data integration in an XML context is the ability to embed one or more source documents in a target document so that (a) the target document conforms to a target schema and (b) the information in the source documents is preserved. In this paper, information preservation for XML is formally studied, and the results of …
Data warehouses and recording systems typically have a large continuous stream of incoming data that must be stored in a manner suitable for future access. Access to stored records is usually based on a key. Organizing the data on disk as the data arrives using standard techniques would result in either (a) one or more I/Os to store each incoming record …
An increasing number of applications use XML data published from relational databases. For speed and convenience, such applications routinely cache this XML data locally and access it through standard navigational interfaces such as DOM, sacrificing the consistency and integrity guarantees provided by a DBMS for speed. The ROLEX system is being built to …
Yahoo! is building a set of scalable, highly-available data storage and processing services, and deploying them in a cloud model to make application development and ongoing maintenance significantly easier. In this paper we discuss the vision and requirements, as well as the components that will go into the cloud. We highlight the challenges and research …
While the XML Stylesheet Language for Transformations (XSLT) was not designed as a query language, it is well-suited for many query-like operations on XML documents including selecting and restructuring data. Further, it actively fulfills the role of an XML query language in modern applications and is widely supported by …
Data management for stateful Web applications is extremely challenging. Applications must scale as they grow in popularity, serve their content with low latency on a global scale, and be highly available, even in the face of hardware failures. This need has generated a new class of Internet-scale data management systems. Yahoo has more than 100 user-facing …
General-purpose commercial disk-based database systems, though widely employed in practice, have failed to meet the performance requirements of applications requiring short, predictable response times, and extremely high throughput rates. Main memory is the only technology capable of these characteristics. DataBlitz is a main-memory storage manager product …
Sherpa is a large-scale distributed and globally replicated multi-tenant cloud data storage system. Sherpa scales by horizontally partitioning data into tablets and distributing these tablets across multiple servers. While Sherpa scales for increasing workload sizes by adding servers, it is vulnerable to load imbalance among tablets that cause hotspots to …
Service providers and enterprises all over the world are rapidly deploying Voice over IP (VoIP) networks because of reduced capital and operational expenditure, and easy creation of new services. Voice traffic has stringent requirements on the quality of service, such as strict delay and loss requirements and 99.999% network availability. However, IP …