The production environment for analytical data management applications is rapidly changing. Many enterprises are shifting away from deploying their analytical databases on high-end proprietary machines, and moving towards cheaper, lower-end, commodity hardware, typically arranged in a shared-nothing MPP architecture, often in a virtualized environment ...
Commercial analytical database systems suffer from a high "time-to-first-analysis": before data can be processed, it must be modeled and schematized (a human effort), transferred into the database's storage layer, and optionally clustered and indexed (a computational effort). For many types of structured data, this upfront effort is unjustifiable, so the ...
Writing complex queries in SQL is a challenge for users. Prior work has developed several techniques to ease query specification, but none of these techniques are applicable to a particularly difficult class of queries: quantified queries. Our hypothesis is that users prefer to specify quantified queries interactively by trial-and-error. We ...
To help a user specify and verify quantified queries, a class of database queries known to be very challenging for all but the most expert users, one can question the user on whether certain data objects are answers or non-answers to her intended query. In this paper, we analyze the number of questions needed to learn or verify ...
Paper as a medium persists as the de facto standard for information collection, storage, and transfer in many low-resource developing contexts. Of these contexts, the microfinance industry continues to be fascinating in the ongoing ICTD conversation due, in part, to its elimination of paper by digitizing money transfers using mobile banking. This success ...
HadoopDB is a hybrid of MapReduce and DBMS technologies, designed to meet the growing demand of analyzing massive datasets on very large clusters of machines. Our previous work has shown that HadoopDB approaches parallel databases in performance and still yields the scalability and fault tolerance of MapReduce-based systems. In this demonstration, we focus ...
Digital games have the ability to engage children and adults alike. We are exploring the use of games for children with long-term treatment regimes, where motivation for compliance is a key factor in the success of the treatment. In this paper, we describe the game framework we are building for this purpose. This framework is meant to support the long ...
Automation of business processes, proliferation of digital devices. eBay has a 6.5 petabyte warehouse. Deep analysis over raw data: inefficient to push data from database into specialized analysis engines → process data in the database. Best ...
Urbanization has created transient, ethnically-varied, and densely-populated communities where meaningful human contact is difficult. Urban social norms such as "civil inattention" (a deliberate display of unwillingness to become more familiar with strangers) discourage social interactions among strangers. While these norms help reduce anxiety or fear ...
DataPlay is a query tool that encourages a trial-and-error approach to query specification. DataPlay uses a graphical query language to make a particularly challenging query specification task, quantification, easier. It constrains the relational data model to enable the presentation of non-answers, in addition to answers, to aid query interpretation. Two ...