Corpus ID: 215238505

GeoFlink: A Framework for the Real-time Processing of Spatial Streams

@article{Shaikh2020GeoFlinkAF,
  title={GeoFlink: A Framework for the Real-time Processing of Spatial Streams},
  author={Salman Ahmed Shaikh and Komal Mariam and Hiroyuki Kitagawa and Kyoung-Sook Kim},
  journal={ArXiv},
  year={2020},
  volume={abs/2004.03352}
}
Apache Flink is an open-source system for the scalable processing of batch and streaming data. Flink does not natively support efficient processing of spatial data streams, which many applications dealing with spatial data require. Besides Flink, other scalable spatial data processing platforms, including GeoSpark, SpatialHadoop, GeoMesa and Parallel Secondo, do not support streaming workloads and can handle only static/batch workloads. Hence this work presents GeoFlink, which…
Using Deep Learning for Big Spatial Data Partitioning
TLDR
The proposed model outperforms the baseline method in accuracy when choosing the best partitioning technique by analyzing only a summary of the datasets, and the applicability of the proposed technique is shown experimentally.
USING SYSTEMS OF PARALLEL AND DISTRIBUTED DATA PROCESSING TO BUILD HYDROLOGICAL MODELS BASED ON REMOTE SENSING DATA
Abstract. The article describes the possibilities and advantages of using distributed systems in the processing and analysis of remote sensing data. The preparation and processing of various types of…
A Scalable and Dependable Data Analytics Platform for Water Infrastructure Monitoring
TLDR
A scalable stream processing platform designed to monitor large and critical infrastructures of cities, and several non-functional requirements such as scalability, responsiveness and dependability are factored into the system architecture.

References

SHOWING 1-10 OF 29 REFERENCES
Real-Time Spatial Queries for Moving Objects Using Storm Topology
TLDR
This paper presents a distributed spatial index based on Apache Storm, an open-source distributed real-time computation system, and builds a secondary distributed index for spatial join queries based on the grid-partition index.
Spatial data management in apache spark: the GeoSpark perspective and beyond
TLDR
GeoSpark is presented, which extends the core engine of Apache Spark and SparkSQL to support spatial data types, indexes, and geometrical operations at scale and achieves up to two orders of magnitude faster run time performance than existing Hadoop-based systems.
SparkGIS: Resource Aware Efficient In-Memory Spatial Query Processing
TLDR
The comparative evaluation shows that SparkGIS performs on par with contemporary Spark-based platforms for relatively small queries and outperforms them for larger data and memory-intensive workflows through dynamic query rewriting and efficient spatial data management.
Hadoop-GIS: A High Performance Spatial Data Warehousing System over MapReduce
TLDR
Hadoop-GIS, a scalable and high-performance spatial data warehousing system for running large-scale spatial queries on Hadoop, is presented; it is integrated into Hive to support declarative spatial queries with an integrated architecture.
SpatialHadoop: A MapReduce framework for spatial data
  • A. Eldawy, M. Mokbel
  • Computer Science
  • 2015 IEEE 31st International Conference on Data Engineering
  • 2015
TLDR
SpatialHadoop is a comprehensive extension to Hadoop that injects spatial data awareness in each Hadoop layer, namely, the language, storage, MapReduce, and operations layers, with orders of magnitude better performance than Hadoop for spatial data processing.
LocationSpark: In-memory Distributed Spatial Query Processing and Optimization
TLDR
This paper introduces new techniques for handling query skew that commonly happens in practice, and minimizes communication costs accordingly, and proposes a distributed query scheduler that uses a new cost model to minimize the cost of spatial query processing.
GeoMesa: a distributed architecture for spatio-temporal fusion
TLDR
GeoMesa is a distributed spatio-temporal database built on top of Hadoop and column-family databases such as Accumulo and HBase that includes a suite of tools for indexing, managing and analyzing both vector and raster data.
Large-scale spatial join query processing in Cloud
TLDR
The designs and implementations of two prototype systems that are ready for Cloud deployments are reported: SpatialSpark based on Apache Spark and ISP-MC based on Cloudera Impala, which support indexed spatial joins based on point-in-polygon test and point-to-polyline distance computation.
Trees or grids?: indexing moving objects in main memory
TLDR
The study shows that the choice of index boils down to issues such as ease of implementation or support for spatially extended objects, and proposes update- and query-efficient variants of the R-tree and the grid.
STQL — A Spatio-Temporal Query Language
TLDR
This work presents the main aspects of an SQL-like, spatio-temporal query language, called STQL, and provides a framework in STQL that allows a user to build more and more complex predicates starting with a small set of elementary ones.