MATLAB is a popular choice for algorithm development in signal and image processing. While traditionally done using sequential MATLAB running on desktop systems, in recent years there has been a surge of interest in running MATLAB in parallel to take advantage of multiprocessor and multicore systems. In this article, we discuss three variations of …
Organizations are often faced with the challenge of providing data management solutions for large, heterogeneous datasets that may have different underlying data and programming models. For example, a medical dataset may have unstructured text, relational data, time series waveforms and imagery. Trying to fit such datasets in a single data management system …
This paper presents BigDAWG, a reference implementation of a new architecture for "Big Data" applications. Such applications not only call for large-scale analytics, but also for real-time streaming support, smaller analytics at interactive speeds, data visualization, and cross-storage-system queries. Guided by the principle that "one size does not fit …
The Apache Accumulo database is an open source relaxed consistency database that is widely used for government applications. Accumulo is designed to deliver high performance on unstructured data such as graphs of network data. This paper tests the performance of Accumulo using data from the Graph500 benchmark. The Dynamic Distributed Dimensional Data Model …
We present a framework for the estimation of driver behavior at intersections, with applications to autonomous driving and vehicle safety. The framework is based on modeling the driver behavior and vehicle dynamics as a hybrid-state system (HSS), with driver decisions being modeled as a discrete-state system and the vehicle dynamics modeled as a …
Big data and the Internet of Things era continue to challenge computational systems. Several technology solutions such as NoSQL databases have been developed to deal with this challenge. In order to generate meaningful results from large datasets, analysts often use a graph representation which provides an intuitive way to work with the data. Graph vertices …
The growing gap between data and users calls for innovative tools that address the challenges faced by big data volume, velocity and variety. Along with these standard three V's of big data, an emerging fourth “V” is veracity, which addresses the confidentiality, integrity, and availability of the data. Traditional cryptographic techniques …
The Apache Accumulo database excels at distributed storage and indexing and is ideally suited for storing graph data. Many big data analytics compute on graph data and persist their results back to the database. These graph calculations are often best performed inside the database server. The GraphBLAS standard provides a compact and efficient basis for a …
The ability to collect and analyze large amounts of data is a growing problem within the scientific community. The growing gap between data and users calls for innovative tools that address the challenges faced by big data volume, velocity and variety. Numerous tools exist that allow users to store, query and index these massive quantities of data. Each …
Software engineering studies have shown that programmer productivity is improved through the use of computational science integrated development environments (or CSIDE, pronounced "sea side") such as MATLAB. ParaM is a CSIDE distribution which provides parallel execution of MATLAB scripts for HPC systems. ParaM runs on a range of processor …