Many "big data" applications must act on data in real time. Running these applications at ever-larger scales requires parallel platforms that automatically handle faults and stragglers. Unfortunately, current distributed stream processing models provide fault recovery in an expensive manner, requiring hot replication or long recovery times, and do not ...
We consider the problem of reconstructing vehicle trajectories from sparse sequences of GPS points, for which the sampling interval is between 10 seconds and 2 minutes. We introduce a new class of algorithms, collectively called the path inference filter (PIF), that maps GPS data in real time, for a variety of trade-offs and scenarios, and with a high ...
Controlling and analyzing cyberphysical and robotics systems is increasingly becoming a Big Data challenge. We study the case of predicting drivers' travel times in a large urban area from sparse GPS traces. We present a framework that can accommodate a wide variety of traffic distributions and spread all the computations on a cluster to achieve small ...
Most optimal routing problems focus on minimizing travel time or distance traveled. Oftentimes, a more useful objective is to maximize the probability of on-time arrival, which requires statistical distributions of travel times, rather than just mean values. We propose a method to estimate travel time distributions on large-scale road networks, using probe ...
We report on our experience scaling up the Mobile Millennium traffic information system using cloud computing and the Spark cluster computing framework. Mobile Millennium uses machine learning to infer traffic conditions for large metropolitan areas from crowdsourced data, and Spark was specifically designed to support such applications. Many studies of ...
We consider the problem of estimating real-time traffic conditions from sparse, noisy GPS probe vehicle data. We specifically address arterial roads, also known as the secondary road network (highways are considered the primary road network). We consider several estimation problems: historical traffic patterns, real-time traffic conditions, and ...
In this paper, we combine the most complete record of daily mobility, based on large-scale mobile phone data, with detailed Geographic Information System (GIS) data, uncovering previously hidden patterns in urban road usage. We find that the major usage of each road segment can be traced to its own--surprisingly few--driver sources. Based on this finding we ...
Human subjects can quickly adapt and maintain performance of arm reaching when experiencing novel physical environments such as robot-induced velocity-dependent force fields. Using anodal transcranial direct current stimulation (tDCS), this study showed that the primary motor cortex may play a role in motor adaptation of this sort. Subjects performed arm ...