Towards Parallel Spatial Query Processing for Big Spatial Data
@article{Zhong2012TowardsPS,
  title={Towards Parallel Spatial Query Processing for Big Spatial Data},
  author={Yunqin Zhong and Jizhong Han and Tieying Zhang and Zhenhua Li and Jinyun Fang and Guihai Chen},
  journal={2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops \& PhD Forum},
  year={2012},
  pages={2085--2094}
}
In recent years, spatial applications have become more and more important in both scientific research and industry. Spatial query processing is the fundamental functioning component to support spatial applications. However, the state-of-the-art techniques of spatial query processing are facing significant challenges as the data expand and user accesses increase. In this paper we propose and implement a novel scheme (named VegaGiStore) to provide efficient spatial query processing over big…
73 Citations
Efficient spark-based framework for big geospatial data query processing and analysis
- Computer Science
- 2017 IEEE Symposium on Computers and Communications (ISCC)
- 2017
This paper introduces a generic framework for optimizing the performance of big spatial data queries on top of Apache Spark and supports advanced management functions including a unique self-adaptable load-balancing service to self-tune framework execution.
Hadoop-GIS: A High Performance Spatial Data Warehousing System over MapReduce
- Computer Science
- Proc. VLDB Endow.
- 2013
This paper presents Hadoop-GIS, a scalable, high-performance spatial data warehousing system for running large-scale spatial queries on Hadoop, integrated into Hive to support declarative spatial queries.
An extra spatial hierarchical schema in key-value store
- Computer Science
- Cluster Computing
- 2018
This paper advocates an extra spatial hierarchical schema inspired by geohash, designs a spatial query method based on the primary-key index, and tests the query accuracy and efficiency of this schema even without the help of a spatial index.
Haggis: turbocharge a MapReduce based spatial data warehousing system with GPU engine
- Computer Science
- BigSpatial@SIGSPATIAL
- 2014
This paper extends Hadoop-GIS, a MapReduce-based spatial query system, with GPU-accelerated spatial query processing at the engine level, and demonstrates that the GPU-accelerated system can gain considerable performance improvements.
Performance evaluation of SpatialHadoop for big web mapping data
- Computer Science
- 2016 Second International Conference on Web Research (ICWR)
- 2016
This study evaluates the performance of SpatialHadoop on a variety of datasets and operations, including index creation, k-nearest neighbor (KNN), and spatial join, and demonstrates that as the volume of data increases, SpatialHadoop scales well and performs better than the relational engine.
GeoSpark SQL: An Effective Framework Enabling Spatial Queries on Spark
- Computer Science
- ISPRS Int. J. Geo Inf.
- 2017
This paper aims to address the increasingly large-scale spatial query-processing requirement in the era of big data, and proposes an effective framework GeoSpark SQL, which enables spatial queries on Spark, and notes that Spark is not a panacea.
Big Data Storage Techniques for Spatial Databases: Implications of Big Data Architecture on Spatial Query Processing
- Computer Science
- 2015
This paper reviews the various Hadoop-based approaches to handling spatial data efficiently, categorizes the spatial queries reported in the testing, summarizes results, and identifies the strengths and weaknesses of each approach.
Spatio-Temporal Join on Apache Spark
- Computer Science
- SIGSPATIAL/GIS
- 2017
This paper details several variants of a spatial join operation that address spatial, temporal, and attribute-based joins and run in a commercial off-the-shelf (COTS) application.
An improved integrated Grid and MapReduce‐Hadoop architecture for spatial data: Hilbert TGS R‐Tree–based IGSIM
- Computer Science
- Concurr. Comput. Pract. Exp.
- 2019
A thorough literature survey of traditional spatial indexes from the serial programming environment was conducted, and the Hilbert TGS R‐Tree was selected on the basis of several parameters for parallel implementation, extending the spatial query efficiency work of the IGSIM.
Scalable and Fast Top-k Most Similar Trajectories Search Using MapReduce In-Memory
- Computer Science
- ADC
- 2016
This work proposes a distributed parallel approach for k-NN trajectories search in a multi-user environment using MapReduce in-memory, and proposes a space/time data partitioning based on Voronoi diagrams and time pages in order to provide both spatial-temporal data organization and process decentralization.
References
SJMR: Parallelizing spatial join with MapReduce on clusters
- Computer Science
- 2009 IEEE International Conference on Cluster Computing and Workshops
- 2009
This paper presents SJMR (Spatial Join with MapReduce), a novel parallel algorithm that relieves the problem of processing heterogeneous related data sets, which is common in operations like spatial joins.
Revisiting R-Tree Construction Principles
- Computer Science
- ADBIS
- 2002
This paper argues that dynamic R-tree construction is a typical clustering problem that can be addressed by incorporating existing clustering algorithms, and adopts the well-known k-means algorithm as a working example.
Supporting Complex Multi-Dimensional Queries in P2P Systems
- Computer Science
- 25th IEEE International Conference on Distributed Computing Systems (ICDCS'05)
- 2005
This paper proposes Network-R-tree (NR-tree), a P2P adaptation of the dominant spatial index, the R*-tree, that is capable of processing complex queries such as range queries and k-nearest neighbor queries.
Quadtree and R-tree indexes in oracle spatial: a comparison using GIS data
- Computer Science
- SIGMOD '02
- 2002
This paper first briefly describes the implementation of Quadtree and R-tree index structures and related optimizations in Oracle Spatial, then examines the relative merits of the two structures as implemented in Oracle Spatial and compares their performance for different types of queries and other operations.
An introduction to spatial database systems
- Computer Science
- The VLDB Journal
- 2005
This work surveys data modeling, querying, data structures and algorithms, and system architecture for spatial database systems, with the emphasis on describing known technology in a coherent manner, rather than listing open problems.
Bigtable: A Distributed Storage System for Structured Data
- Computer Science
- TOCS
- 2008
The simple data model provided by Bigtable is described, which gives clients dynamic control over data layout and format, and the design and implementation of Bigtable are described.
Using a distributed quadtree index in peer-to-peer networks
- Computer Science
- The VLDB Journal
- 2005
This paper describes a distributed quadtree index that adapts the MX-CIF quadtree to enable more powerful access to data in P2P networks; the index is easy to use, scalable, and exhibits good load-balancing properties.
Hadoop++
- Computer Science
- 2010
This paper proposes a new type of system named Hadoop++: it boosts task performance without changing the Hadoop framework at all (Hadoop does not even 'notice it'), and shows the superiority of Hadoop++ over both Hadoop and HadoopDB for tasks related to indexing and join processing.
Spatial databases - a tour
- Computer Science
- 2003
An introduction to Spatial Databases and Trends in Spatial Data Mining.
Implementing WebGIS on Hadoop: A case study of improving small file I/O performance on HDFS
- Computer Science
- 2009 IEEE International Conference on Cluster Computing and Workshops
- 2009
This paper proposes an approach to optimize the I/O performance of small files on HDFS by combining small files into large ones to reduce the number of files and building an index for each file.