Towards distributed convoy pattern mining

  • Faisal Moeen Orakzai, Thomas Devogele, Toon Calders
  • Published 3 November 2015
  • Computer Science
  • Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems
Mining movement data to reveal interesting behavioral patterns has gained attention in recent years. One such pattern is the convoy pattern, which consists of at least m objects moving together for at least k consecutive time instants, where m and k are user-defined parameters. Existing algorithms for detecting convoy patterns, however, do not scale to real-life dataset sizes. A distributed algorithm for convoy mining is therefore needed. In this paper, we discuss the problem of convoy mining… 
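
The convoy definition above can be made concrete with a small sequential sketch. This is not the distributed algorithm discussed in the paper; it is a simplified illustration that assumes the "moving together" test has already been applied, so each time instant is given as a list of groups of object ids. The function name `mine_convoys` and the input layout are hypothetical, and the sketch may report convoys that are not maximal.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# (objects, first time instant, last time instant), both ends inclusive
Convoy = Tuple[FrozenSet[str], int, int]


def mine_convoys(snapshots: List[List[Set[str]]], m: int, k: int) -> List[Convoy]:
    """Report groups of >= m objects that stay together for >= k consecutive instants.

    Assumption: snapshots[t] lists the groups of objects that are
    "together" at time t (for example, the output of a clustering step).
    """
    results: Set[Convoy] = set()
    active: Dict[FrozenSet[str], int] = {}  # candidate group -> start time

    for t, clusters in enumerate(snapshots):
        nxt: Dict[FrozenSet[str], int] = {}
        for group, start in active.items():
            extended = False
            for cluster in clusters:
                common = frozenset(group & cluster)
                if len(common) >= m:
                    # Keep the earliest start time seen for this object set.
                    nxt[common] = min(start, nxt.get(common, start))
                    if common == group:
                        extended = True
            # The candidate ended at t - 1 after lasting t - start instants.
            if not extended and t - start >= k:
                results.add((group, start, t - 1))
        # Every sufficiently large cluster also starts a fresh candidate.
        for cluster in clusters:
            if len(cluster) >= m:
                nxt.setdefault(frozenset(cluster), t)
        active = nxt

    # Flush candidates that are still alive at the end of the data.
    end = len(snapshots)
    for group, start in active.items():
        if end - start >= k:
            results.add((group, start, end - 1))
    return sorted(results, key=lambda c: (c[1], c[2], sorted(c[0])))


# Hypothetical toy input: four time instants, object ids are single letters.
example = [
    [{"a", "b", "c"}, {"d", "e"}],
    [{"a", "b", "c", "d"}],
    [{"a", "b"}, {"c", "d"}],
    [{"a", "b"}],
]

if __name__ == "__main__":
    print(mine_convoys(example, m=2, k=3))
```

With m = 2 and k = 3, only objects a and b stay together long enough in this toy input, from instant 0 to instant 3. A sequential scan like this touches every time instant in order, which hints at why the abstract argues for a distributed approach on large datasets.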

Distributed Convoy Pattern Mining

A generic distributed convoy pattern mining algorithm is proposed, and it is shown how such an algorithm can be implemented using the MapReduce framework. Experimental results show that the distributed algorithm is scalable and more efficient than existing sequential convoy pattern mining algorithms.

k/2-hop: Fast Mining of Convoy Patterns With Effective Pruning

The experimental results show that k/2-hop outperforms existing sequential as well as parallel convoy pattern mining algorithms by orders of magnitude, and scales to larger datasets which existing algorithms fail on.

Distributed mining of convoys in large scale datasets

A generic distributed convoy pattern mining algorithm called DCM is proposed, and it is shown how such an algorithm can be implemented using the MapReduce framework. DCM achieves speed-ups of up to 16 times over SPARE, the state-of-the-art distributed co-movement pattern mining framework.

Querying Recurrent Convoys over Trajectory Data

This study proposes the problem of finding recurrent co-moving patterns from streaming trajectories, which enables the discovery of recent co-moving patterns that are repeated within a given time period. Results on real-life trajectory data verify the efficiency and effectiveness of the method.

Effective Following Patterns Mining Scheme for the Movements of Objects

Progress in pattern mining, mainly for frequent, periodic, and following patterns, is reviewed by comparing the methods and algorithms in detail. The review gives researchers a quick overview of the field and presents an effective following-pattern mining scheme for the movements of objects.

Spatio-Temporal Data Mining: From Big Data to Patterns

Technological advances in data acquisition make it possible to better monitor dynamic phenomena in various domains, including the environment, and call for new data analysis and knowledge discovery methods, including approaches aimed at discovering spatio-temporal patterns.

Distributed Human Trajectory Sensing and Partial Similarity Queries

  • Haotian Wang, Jie Gao
  • Computer Science
  • 2020 19th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN)
  • 2020
New partial similarity measures, categorized as time-sensitive, order-sensitive, and order-insensitive, are proposed. Experiments with real data show that they are more robust than classical measures and more suitable for generating meaningful query results in near-neighbor data mining applications.

MinHash Hierarchy for Privacy Preserving Trajectory Sensing and Query

This work builds a distributed data structure on the checkpoints, named the MinHash hierarchy, with which one can efficiently answer queries about popular paths and other traffic patterns. It provides privacy protection using a model inspired by the differential privacy model.



On-line discovery of flock patterns in spatio-temporal data

The on-line flock discovery problem is polynomial; a framework and several strategies to discover such patterns in streaming spatio-temporal data are proposed, and experiments show that the proposed algorithms are efficient and scalable.

Convoy Queries in Spatio-Temporal Databases

The main novelty of the methods is to approximate original trajectories by using line simplification methods and perform the discovery process over the simplified trajectories with bounded errors.

Discovery of convoys in trajectory databases

This paper formalizes the concept of a convoy query using density-based notions, in order to capture groups of arbitrary extents and shapes and develops three efficient algorithms for convoy discovery that adopt the well-known filter-refinement framework.

Computing longest duration flocks in trajectory data

This work considers the computational efficiency of computing two of the most basic spatio-temporal patterns in trajectories, namely flocks and meetings, and gives several exact and approximation algorithms.

On Discovering Moving Clusters in Spatio-temporal Data

This work provides a formal definition for moving clusters and describes three algorithms for their automatic discovery: a straightforward method based on the definition, a more efficient method that avoids redundant checks, and an approximate algorithm that trades accuracy for speed by borrowing ideas from MPEG-2 video encoding.

Discovery of Evolving Convoys

This work proposes the new concepts of dynamic convoys and evolving convoys, which reflect real-life scenarios, and develops algorithms to discover evolving convoys in an incremental manner.

Accurate Discovery of Valid Convoys from Moving Object Trajectories

This work proposes a new valid convoy discovery algorithm, called VCoDA, for the accurate discovery of valid convoys from moving object trajectories that improves the precision by a factor of 3 on average and the recall by up to 2 orders of magnitude as compared to an existing method.

A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise

DBSCAN, a new clustering algorithm that relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape, is presented. It requires only one input parameter and supports the user in determining an appropriate value for it.

MR-DBSCAN: a scalable MapReduce-based DBSCAN algorithm for heavily skewed data

This work presents MR-DBSCAN, a scalable DBSCAN algorithm using MapReduce that achieves desirable load balancing even on heavily skewed data, and proposes a novel data partitioning method based on computation cost estimation.

Managing Skew in Hadoop

This work gives an overview of some of the recent work that tackles the problem of load imbalance (a.k.a. skew) in parallel UDO evaluation and discusses the prevalence of skew in today’s applications and clusters.