Vijay Gadepally

Learn More
This paper presents BigDAWG, a reference implementation of a new architecture for " Big Data " applications. Such applications not only call for large-scale analytics, but also for real-time streaming support, smaller analytics at interactive speeds, data visualiza-tion, and cross-storage-system queries. Guided by the principle that " one size does not fit(More)
—The Apache Accumulo database is an open source relaxed consistency database that is widely used for government applications. Accumulo is designed to deliver high performance on unstructured data such as graphs of network data. This paper tests the performance of Accumulo using data from the Graph500 benchmark. The Dynamic Distributed Dimensional Data Model(More)
— The growing gap between data and users calls for innovative tools that address the challenges faced by big data volume, velocity and variety. Along with these standard three V's of big data, an emerging fourth " V " is veracity, which addresses the confidentiality, integrity, and availability of the data. Traditional cryptographic techniques that ensure(More)
—Big data and the Internet of Things era continue to challenge computational systems. Several technology solutions such as NoSQL databases have been developed to deal with this challenge. In order to generate meaningful results from large datasets, analysts often use a graph representation which provides an intuitive way to work with the data. Graph(More)
—The Apache Accumulo database excels at distributed storage and indexing and is ideally suited for storing graph data. Many big data analytics compute on graph data and persist their results back to the database. These graph calculations are often best performed inside the database server. The GraphBLAS standard provides a compact and efficient basis for a(More)
—Data processing systems impose multiple views on data as it is processed by the system. These views include spreadsheets, databases, matrices, and graphs. The common theme amongst these views is the need to store and operate on data as whole sets instead of as individual data elements. This work describes a common mathematical representation of these data(More)
—The ability to collect and analyze large amounts of data is a growing problem within the scientific community. The growing gap between data and users calls for innovative tools that address the challenges faced by big data volume, velocity and variety. Numerous tools exist that allow users to store, query and index these massive quantities of data. Each(More)
—The growing demand for cloud computing motivates the need to study the security of data received, stored, processed , and transmitted by a cloud. In this paper, we present a framework for such a study. We introduce a cloud computing model that captures a rich class of big-data use-cases and allows reasoning about relevant threats and security goals. We(More)