1. Global multi-scale grid integer coding and spatial indexing: A novel approach for big earth observation data
- Author
Li He, Guo Congzhou, Guangling Lai, Lei Yi, Wu Xiangyu, Chunping Qiu, Xiaochong Tong, Zhang Yong, and Yongsheng Zhang
- Subjects
Earth observation, 010504 meteorology & atmospheric sciences, business.industry, Computer science, Data management, Search engine indexing, Big data, 0211 other engineering and technologies, Geohash, 02 engineering and technology, Grid, computer.software_genre, 01 natural sciences, Atomic and Molecular Physics, and Optics, Oracle, Computer Science Applications, Data mining, Computers in Earth Sciences, business, Cluster analysis, Engineering (miscellaneous), computer, 021101 geological & geomatics engineering, 0105 earth and related environmental sciences
- Abstract
With the exponential growth of earth observation data at sensor-specific resolutions and update frequencies, earth observation has irreversibly entered the Big Data era, enabling new insights in science and engineering. With great opportunity come great challenges for efficient and effective data management, because earth observation data spans different scales and is characterized by complex spatial relationships tied to the real world. Overcoming these challenges is crucial for applications such as data mining, land surveying, and especially emergency mapping for disaster response. To improve the querying efficiency of big earth observation data, we propose a novel data management approach: Global Multi-scale Grid Integer Coding and Spatial Indexing. Our contributions include: (1) proposing the Global Multi-scale Grid Integer Coding Model (GMGICM), which exhibits a clustering property in both the scale dimension and the spatial dimension and theoretically facilitates efficient querying; (2) applying GMGICM to the spatial indexing of multi-scale earth observation data, which yields a one-dimensional data index that can be queried using a simple B-tree, an inverted index, or other one-dimensional indexes; (3) designing a strategy to ensure the completeness of spatial querying, a problem not well solved by existing grid-based coding models. The advantages of the proposed approach have been demonstrated on both simulated and real remote sensing data, with spatial operations 20 times as fast as Geohash and spatial querying 10 times as fast as Oracle Spatial on average. The proposed approach can be easily adapted to three- or higher-dimensional earth observation data and could benefit all big earth observation data analytics projects.
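The general idea of turning two-dimensional, multi-scale grid cells into a one-dimensional integer index can be illustrated with a minimal sketch. It does not reproduce GMGICM, whose coding rules are not given in this record; it assumes an ordinary Z-order (Morton) bit interleaving over a longitude/latitude grid, and it uses a sorted Python list with `bisect` as a stand-in for a B-tree. All function names, the `MAX_LEVEL` value, and the sample points are hypothetical.

```python
from bisect import bisect_left, bisect_right

MAX_LEVEL = 16  # assumed finest grid level: 2**16 x 2**16 cells

def interleave(col: int, row: int, level: int) -> int:
    """Interleave the low `level` bits of col and row into one integer."""
    code = 0
    for i in range(level):
        code |= ((col >> i) & 1) << (2 * i)
        code |= ((row >> i) & 1) << (2 * i + 1)
    return code

def encode(lon: float, lat: float, level: int) -> int:
    """Integer code of the grid cell containing (lon, lat) at `level`."""
    n = 1 << level
    col = min(int((lon + 180.0) / 360.0 * n), n - 1)
    row = min(int((lat + 90.0) / 180.0 * n), n - 1)
    return interleave(col, row, level)

def descendant_range(code: int, level: int) -> tuple[int, int]:
    """Closed range of MAX_LEVEL codes that fall inside a coarser cell."""
    shift = 2 * (MAX_LEVEL - level)
    lo = code << shift
    return lo, lo + (1 << shift) - 1

# Hypothetical sample points (lon, lat), indexed by their finest-level code.
points = [(116.4, 39.9), (116.5, 39.8), (-74.0, 40.7), (2.35, 48.86)]
index = sorted((encode(lon, lat, MAX_LEVEL), (lon, lat)) for lon, lat in points)
keys = [key for key, _ in index]

def query_cell(lon: float, lat: float, level: int) -> list[tuple[float, float]]:
    """All indexed points in the level-`level` cell containing (lon, lat)."""
    lo, hi = descendant_range(encode(lon, lat, level), level)
    start, stop = bisect_left(keys, lo), bisect_right(keys, hi)
    return [index[i][1] for i in range(start, stop)]
```

Because every cell's descendants occupy one contiguous block of codes, a query for a coarse cell becomes a single range scan over the sorted keys, which is the kind of lookup a one-dimensional index handles well. The sketch indexes points only; objects that straddle cell boundaries need extra handling, which is the completeness problem the abstract refers to.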
- Published
2020