Data Driven Hourly Taxi Drop-offs Prediction using TLC Trip Record Data

Chathurika S. Wickramasinghe, Daniel L. Marino, F. Yucel, Eyuphan Bulut, Milos Manic
2019 12th International Conference on Human System Interaction (HSI)
Crowdsourcing applications have proven to be a promising tool for gathering valuable information that can be used for a wide range of tasks, such as ensuring public safety. Traffic data collected with these applications have been used for efficient evacuation planning in large cities. In this paper, we propose regression-based machine learning methods to predict hourly taxi rides for a given location on a target day of the week and month. The presented method can be used for the following…
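The prediction task described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not name the regression models, so the sketch swaps in a group-mean baseline, which is what a least-squares regression on a fully interacted one-hot encoding of location, month, weekday, and hour reduces to. The record layout (a zone identifier paired with a drop-off timestamp), the zone number 161, and all function names are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_dropoff_counts(trips):
    """Count drop-offs per (zone, date, hour) from (zone, dropoff_datetime) records."""
    counts = defaultdict(int)
    for zone, ts in trips:
        counts[(zone, ts.date(), ts.hour)] += 1
    return counts


def fit_group_means(counts):
    """Average the hourly counts over all dates sharing zone, month, weekday, and hour.

    Hours with no recorded drop-off never appear in `counts`, so this baseline
    ignores empty hours instead of averaging them in as zeros.
    """
    groups = defaultdict(list)
    for (zone, day, hour), n in counts.items():
        groups[(zone, day.month, day.weekday(), hour)].append(n)
    return {key: mean(values) for key, values in groups.items()}


def predict(model, zone, month, weekday, hour, default=0.0):
    """Return the fitted mean for the group, or `default` for unseen groups."""
    return model.get((zone, month, weekday, hour), default)


# Hypothetical records: two Mondays in January 2019, hypothetical zone 161.
trips = [
    (161, datetime(2019, 1, 7, 8, 5)),
    (161, datetime(2019, 1, 7, 8, 40)),
    (161, datetime(2019, 1, 14, 8, 10)),
    (161, datetime(2019, 1, 14, 8, 20)),
    (161, datetime(2019, 1, 14, 8, 30)),
    (161, datetime(2019, 1, 14, 8, 55)),
]
model = fit_group_means(hourly_dropoff_counts(trips))
estimate = predict(model, zone=161, month=1, weekday=0, hour=8)  # Monday is weekday 0
```

A baseline like this gives a reference point against which any learned regressor can be compared. It cannot generalize to location and time combinations it has never seen, which is one reason to prefer a fitted regression model.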

Neural Network-Based Ridesharing Policy for Reducing Rider Transportation Cost
A neural network-based ridesharing policy is proposed that uses the available seats in an occupied taxi to accommodate other queueing passengers and reduce passenger taxi fares.
Hybrid genetic algorithm for ridesharing with timing constraints: efficiency analysis with real-world data
A Hybrid Genetic Algorithm that uses elements of Simulated Annealing and local search is proposed. It is well suited to ridesharing applications and efficiently handles advanced constraints such as timing windows for pick-up, detour time (or distance), and waiting-time minimization.
Data-driven Stochastic Anomaly Detection on Smart-Grid communications using Mixture Poisson Distributions
This paper uses Mixture Poisson distributions to model the packet communication between network devices. The communication is modeled as a directed graph in which each edge represents a Poisson distribution of the packets being transmitted.


Effective Traffic Flow Forecasting Using Taxi and Weather Data
A new methodology called WTFPredict is proposed for floating traffic flow prediction in weather-affected New York City, and a novel model that uses taxi trajectory data and weather information is presented.
Traffic Flow Forecasting for Urban Work Zones
Four models were developed to forecast traffic flow for planned work zone events, and the random forest model was shown to yield the most accurate long-term and short-term work zone traffic flow forecasts.
Predicting Bike Usage for New York City's Bike Sharing System
It is shown that aggregating stations in neighborhoods can substantially improve predictions, and the presented model can assist planners by predicting bike demand at a macroscopic level, between pairs of neighborhoods.
CROWDSAFE: crowd sourcing of crime incidents and safe routing on mobile devices
CROWDSAFE is presented, a novel convergence of Internet crowdsourcing and portable smart devices that enables real-time, location-based crime incident searching and reporting. It leverages crowdsourced data to provide novel features such as a Safety Router and value-added crime analytics.
Distributed Data Analytics Framework for Smart Transportation
Alexander Howard, Tim Lee, Sara Mahar, Paul Intrevado, D. Woodbridge
2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 2018
This study evaluates Logistic Regression, Random Forest Regressors and Classifiers, Principal Component Analysis, and Gradient Boosted Regression and Classification Tree machine learning techniques on a commodity computer as well as on a distributed system.
An evaluative study on mobile crowdsourcing applications for crime watch
Six mobile applications that use crowdsourcing to report crime-related incidents are evaluated against seven criteria identified as important in the literature.
Comparing Random Forest with Logistic Regression for Predicting Class-Imbalanced Civil War Onset Data
This article compares the performance of Random Forests with three versions of logistic regression, and finds that the algorithmic approach provides significantly more accurate predictions of civil war onset in out-of-sample data than any of the logistic regression models.