47 results for "Huang, Bormin"
Search Results
2. GPU Acceleration of Adaptive Local Kriging Applied to Retrieving Slant-Range Surface Motion Maps.
- Author
-
Chang, Wen-Yen, Wu, Meng-Che, Chang, Yang-Lang, Shih, Sheng-Yung, and Huang, Bormin
- Abstract
Differential interferometric synthetic aperture radar (DInSAR) is an effective technique to measure the surface displacement caused by strong earthquakes. However, in the area of most significant deformation along the fault zone, the coherence between pre- and post-earthquake SAR images is completely lost because of the earthquake-induced violent and chaotic destruction of the land surface, and consequently no surface displacement data can be measured. In our previous work, an adaptive local kriging (ALK) method was proposed to solve this problem. However, given the spatial distribution of the reference points used in the interpolation procedure of ALK, applying a moving window of varying size over all pixels in the image is time consuming. Thus, high-performance computing (HPC) is needed. As a result, a parallel image interpolation approach, referred to as the graphics processing unit (GPU)-based ALK method, is proposed in this study for retrieving slant-range surface motion maps derived by DInSAR. The proposed GPU-ALK exploits the parallelism of the original ALK architecture to recover the data lost along the fault zone. In this paper, an HPC GPU-ALK is proposed to speed up the interpolation. It uses performance profiling to analyze the serial version of ALK and performs parallel GPU computation to reduce the computation time efficiently. By employing one NVIDIA TITAN GPU, the proposed GPU-ALK achieves a speedup of 104.32× compared to its CPU counterpart. [ABSTRACT FROM AUTHOR] (An illustrative sketch of the per-pixel parallel pattern follows this entry.)
- Published
- 2018
- Full Text
- View/download PDF
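The abstract above parallelizes a local, per-pixel interpolation. As a hedged illustration of that pattern (not the authors' code), the CUDA kernel below assigns one thread per output pixel and fills incoherent pixels from valid neighbors in a local window; inverse-distance weighting stands in for the kriging weight solve, and all names and sizes are hypothetical.

```cuda
#include <cuda_runtime.h>

// One thread per output pixel; each scans a (2*win+1)^2 local window of
// valid reference points. IDW replaces the kriging weight solve here.
__global__ void localInterpKernel(const float* img, const unsigned char* valid,
                                  float* out, int width, int height, int win)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    int idx = y * width + x;
    if (valid[idx]) { out[idx] = img[idx]; return; }   // coherent pixel: copy

    float wsum = 0.0f, vsum = 0.0f;
    for (int dy = -win; dy <= win; ++dy)
        for (int dx = -win; dx <= win; ++dx) {
            int nx = x + dx, ny = y + dy;
            if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
            int n = ny * width + nx;
            if (!valid[n]) continue;                    // reference points only
            float w = 1.0f / (float)(dx * dx + dy * dy + 1);  // IDW weight
            wsum += w;
            vsum += w * img[n];
        }
    out[idx] = (wsum > 0.0f) ? vsum / wsum : 0.0f;      // empty window: zero
}
```

Because every pixel's window is independent, a 2-D grid launch saturates the GPU; the adaptive window size in ALK would make `win` per-pixel data rather than a constant.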
3. GPU-Accelerated Massively Parallel Computation of Electromagnetic Scattering of a Time-Evolving Oceanic Surface Model I: Time-Evolving Oceanic Surface Generation.
- Author
-
Linghu, Longxiang, Wu, Jiaji, Huang, Bormin, Wu, Zhensen, and Shi, Min
- Abstract
The development of a time-evolving oceanic surface model (TOSM) is important for accurately determining electromagnetic scattering properties of the sea surface in oceanic remote sensing, target detection, and synthetic aperture radar imagery at arbitrary incident angles, especially small grazing angles. The double superimposition model (DSM) is a well-known approach for TOSM, composed of large-scale gravity waves with small-scale capillary ripples superimposed on them. However, due to the real-time dynamic complexity and dimensionality of the TOSM, the traditional DSM algorithm may be very time consuming and can hardly meet real-time requirements. In this paper, the feasibility of using graphics processing units (GPUs) with diverse compute unified device architecture (CUDA) optimization techniques to speed up the generation of the TOSM is investigated. The entirely GPU-based TOSM implemented in this study includes the effective use of temporary arrays, fast-math compiler options, shared memory, preferring L1 cache over shared memory, and page-locked host memory. With respect to a single-threaded C program running on an Intel(R) Core(TM) i7-2600K CPU, the GPU-based TOSM achieves a speedup of 791× using a single NVIDIA Tesla K80 GPU. This new GPU-based TOSM implementation shows great potential for studying the scattering characteristics of electromagnetic backscattered echoes from dynamic sea surfaces. [ABSTRACT FROM AUTHOR] (Two of the listed optimizations are sketched after this entry.)
- Published
- 2018
- Full Text
- View/download PDF
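As a hedged sketch of two optimizations the abstract names, the host-side fragment below allocates page-locked (pinned) memory and requests a larger L1 cache for a kernel; the kernel itself and the buffer size are hypothetical placeholders, and fast-math is enabled at compile time (e.g., `nvcc --use_fast_math`).

```cuda
#include <cuda_runtime.h>

__global__ void tosmKernel(float* surface, int n) { /* surface update elided */ }

int main()
{
    const size_t n = 1 << 20;
    float *hostBuf, *devBuf;

    // Page-locked host memory: enables faster (and async-capable) transfers.
    cudaHostAlloc(&hostBuf, n * sizeof(float), cudaHostAllocDefault);
    cudaMalloc(&devBuf, n * sizeof(float));

    // Prefer a larger L1 cache over shared memory for this kernel,
    // mirroring the "more L1 cache use than shared memory" option.
    cudaFuncSetCacheConfig(tosmKernel, cudaFuncCachePreferL1);

    cudaMemcpy(devBuf, hostBuf, n * sizeof(float), cudaMemcpyHostToDevice);
    tosmKernel<<<(unsigned)((n + 255) / 256), 256>>>(devBuf, (int)n);
    cudaMemcpy(hostBuf, devBuf, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(devBuf);
    cudaFreeHost(hostBuf);
    return 0;
}
```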
4. Particle swarm optimization/impurity function class overlapping scheme based on multiple attribute decision making model for hyperspectral band selection
- Author
-
Chang, Yang-Lang, Chang, Lena, Fang, Jyh-Perng, Huang, Min-Yu, Lin, Kuo-Kai, Wu, Jen-Shian, and Huang, Bormin
- Published
- 2015
- Full Text
- View/download PDF
5. Acceleration of the WRF Monin–Obukhov–Janjic Surface Layer Parameterization Scheme on an MIC-Based Platform for Weather Forecast.
- Author
-
Huang, Melin, Huang, Bormin, and Huang, Hung-Lung Allen
- Abstract
A state-of-the-art numerical weather prediction (NWP) model, comprising the weather research and forecasting (WRF) model and analysis techniques, has been extensively used for weather forecasting all over the world. The WRF model, the core of NWP, comprises dynamic solvers and elaborate physical components for modeling fluid behavior, all of which are designed for both atmospheric research and operational weather forecasting. One salient physical ingredient in WRF is the surface layer simulation, which provides surface heat and moisture fluxes through calculation of surface friction velocities and exchange coefficients. The Monin–Obukhov–Janjic (MOJ) scheme is one popular surface layer option in WRF, and it is one of the schemes we choose to expedite toward an end-to-end accelerated weather model. One advantageous aspect of WRF is the independence among grid points, which facilitates parallel programming implementations. We here present a parallel construction of the MOJ module using the vectorization and efficient parallelization features furnished by the Intel many integrated core (MIC) architecture. To achieve high computing performance, apart from the fundamental usage of the Intel MIC architecture, this paper offers some new approaches related to code structure and optimization techniques. Finally, in comparison with the original code executing on one CPU core and on one CPU socket (eight cores) of an Intel Xeon E5-2670, the optimized MIC-based MOJ module running on a Xeon Phi coprocessor 7120P improves the computing performance by 9.6× and 1.5×, respectively. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
6. Parallel Construction of the WRF Pleim-Xiu Land Surface Scheme With Intel Many Integrated Core (MIC) Architecture.
- Author
-
Huang, Melin, Huang, Bormin, and Huang, Hung-Lung Allen
- Abstract
The weather research and forecast (WRF) model, a simulation model, is built for the needs of both operational weather forecasting and research in atmospheric science. The land-surface model (LSM), which describes one physical process in the atmosphere, supplies the heat and moisture fluxes over land points and sea-ice points. Among the several LSM schemes that have been developed and incorporated into WRF, the Pleim-Xiu (PX) scheme is a popular one. Processing the WRF simulation codes for weather prediction has benefited from dramatically increasing computing power with the advent of large-scale parallelism. Several merits of the Intel Many Integrated Core (MIC) architecture, such as vectorization, efficient parallelization, and its multiprocessor structure, allow us to accelerate the computational performance of the PX scheme's modeling code. This paper demonstrates that the optimized MIC-based PX scheme executing on a Xeon Phi coprocessor 7120P enhances the computing performance by 2.3× and 11.7×, respectively, in comparison to the initial CPU-based code running on one CPU socket (eight cores) and on one single CPU core of an Intel Xeon E5-2670. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
7. The Reconnection of Contour Lines from Scanned Color Images of Topographical Maps Based on GPU Implementation.
- Author
-
Song, Jianfeng, Wang, Panfeng, Miao, Qiguang, Liu, Ruyi, and Huang, Bormin
- Abstract
This paper presents a method for the reconnection of contour lines from scanned color images of topographical maps based on graphics processing unit (GPU) implementation. The extraction of contour lines, which are shown in brown on USGS maps, is a difficult process due to aliasing and false colors induced by the scanning process and due to closely spaced and intersecting/overlapping features inherent to the map. First, an effective CPU-based method is presented for contour line reconnection from scanned topographical maps. This method considers both the distance and direction between the two broken endpoints of the contour lines. It achieves good performance and a high connection rate, but its time complexity grows nonlinearly with the size of the topographical map. Second, the massively parallel computing capability of the GPU with the compute unified device architecture (CUDA) is exploited to improve the algorithm. Finally, better performance is achieved based on the open-source computer vision library. The experimental results show that the GPU implementation with loop-based patterns achieves a speedup of 1360× and produces results identical to the CPU implementation. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
8. Particle Swarm Optimization-Based Impurity Function Band Prioritization Using Weighted Majority Voting for Feature Extraction of High Dimensional Data Sets
- Author
-
Chang, Yang-Lang, Huang, Min-Yu, Wang, Ping-Hao, Hsieh, Tung-Ju, Fang, Jyh-Perng, and Huang, Bormin
- Published
- 2013
- Full Text
- View/download PDF
9. Further Improvement on GPU-Based Parallel Implementation of WRF 5-Layer Thermal Diffusion Scheme
- Author
-
Huang, Melin, Huang, Bormin, Mielikainen, Jarno, Huang, H.L. Allen, Goldberg, Mitchell D., and Mehta, Ajay
- Published
- 2013
- Full Text
- View/download PDF
10. Multisource data fusion for image classification using fisher criterion based nearest feature space approach
- Author
-
Chang, Yang-Lang, Wang, Yi Chun, Huang, Min-Yu, Liu, Jin Nan, Fu, Yi-Shiang, Huang, Bormin, and Han, Chin-Chuan
- Published
- 2013
- Full Text
- View/download PDF
11. Simulation of tsunami impact on Taiwan coastal area
- Author
-
Chang, Yang-Lang, Huang, Min-Yu, Wang, Yi Chun, Lin, Wen-Da, Fang, Jyh Perng, Huang, Bormin, and Hsieh, Tung-Ju
- Published
- 2013
- Full Text
- View/download PDF
12. GPU-Based Parallel Design of the Hyperspectral Signal Subspace Identification by Minimum Error (HySime).
- Author
-
Wu, Xin, Huang, Bormin, Wang, Lizhe, and Zhang, Jianqi
- Abstract
Signal subspace identification provides a performance improvement in hyperspectral applications such as target detection, spectral unmixing, and classification. The HySime method is a well-known unsupervised approach for hyperspectral signal subspace identification. It computes the estimated noise and signal correlation matrices, from which a subset of eigenvectors is selected to best represent the signal subspace in the least-squares sense. Depending on the complexity and dimensionality of the hyperspectral scene, the HySime algorithm may be computationally expensive. In this paper, we propose a massively parallel design of the HySime method for acceleration on NVIDIA's graphics processing units (GPUs). Our pure GPU-based implementation includes the optimal use of page-locked host memory, block size, and the number of registers per thread. The proposed implementation was validated in terms of accuracy and performance using NASA AVIRIS hyperspectral data. The benchmark with the NVIDIA GeForce GTX 580 and Tesla K20 GPUs shows significant speedups with regard to the optimized CPU-based serial counterpart. This new fast implementation of the HySime method demonstrates good potential for real-time hyperspectral applications. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
13. Hopfield Neural Network Approach for Supervised Nonlinear Spectral Unmixing.
- Author
-
Li, Jing, Li, Xiaorun, Huang, Bormin, and Zhao, Liaoying
- Abstract
Nonlinear unmixing, which has attracted considerable interest from researchers and developers, has been successfully applied in many real-world hyperspectral imaging scenarios. Hopfield neural network (HNN) machine learning has already proven successful in solving the linear mixture model; this study utilized an HNN machine learning approach to solve the generalized bilinear model (GBM) optimization problem. Two HNNs were constructed successively to solve the respective semi-nonnegative matrix factorization problems for abundance and nonlinear coefficient estimation. In the proposed HNN-based GBM unmixing method, both HNNs evolve to stable states after a number of iterations to obtain unmixing results related to the states of the neurons. In experiments on synthetic data, the proposed method showed more efficient performance with regard to abundance estimation accuracy than other GBM optimization algorithms, especially when given reliable endmember spectra. The proposed method was also applied to real hyperspectral data and still demonstrated notable advantages despite the obvious increase in unmixing difficulty. [ABSTRACT FROM PUBLISHER] (The standard GBM form is sketched after this entry.)
- Published
- 2016
- Full Text
- View/download PDF
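For orientation, the generalized bilinear model referenced above is commonly written as follows; this is the standard form from the GBM literature, not a quotation of this paper's notation:

$$\mathbf{y} \;=\; \sum_{i=1}^{R} a_i\,\mathbf{e}_i \;+\; \sum_{i=1}^{R-1}\sum_{j=i+1}^{R} \gamma_{ij}\, a_i a_j\,(\mathbf{e}_i \odot \mathbf{e}_j) \;+\; \mathbf{n}, \qquad a_i \ge 0,\;\; \sum_{i=1}^{R} a_i = 1,\;\; 0 \le \gamma_{ij} \le 1,$$

where $\mathbf{e}_i$ are the $R$ endmember spectra, $a_i$ their abundances, $\odot$ the elementwise product, $\gamma_{ij}$ the bilinear interaction coefficients, and $\mathbf{n}$ noise; setting all $\gamma_{ij}=0$ recovers the linear mixture model.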
14. Parallel Computation of Aerial Target Reflection of Background Infrared Radiation: Performance Comparison of OpenMP, OpenACC, and CUDA Implementations.
- Author
-
Guo, Xing, Wu, Jiaji, Wu, Zhensen, and Huang, Bormin
- Abstract
The infrared (IR) signature of an aerial target due to the reflection of radiation from the Sun, the Earth's surface, and the atmosphere plays an important role in aerial target detection and tracking. As the background radiation from the Earth's surface and atmosphere is distributed over the entire space and a wide spectrum, it is time-consuming to obtain an aerial target's reflected radiation. This problem is suitable for parallel implementation on a multicore CPU or many-core GPU because the reflection of background radiation incident from different directions at each spectral wavelength can be calculated in parallel. We consider three different parallel approaches: 1) CPU implementation using OpenMP (open multiprocessing); 2) GPU implementation using OpenACC (open accelerators); and 3) GPU implementation using CUDA (compute unified device architecture). An NVIDIA K20c GPU (with 2496 cores) and two Intel Xeon E5-2690 CPUs (with 8 cores each) are used in our experiment. Compared to their single-threaded CPU counterpart, the speedups obtained by the OpenMP, OpenACC, and CUDA implementations are 15×, 140×, and 426×, respectively. The results show that GPU implementations are promising for our problem. [ABSTRACT FROM PUBLISHER] (A sketch of the per-direction parallel mapping follows this entry.)
- Published
- 2016
- Full Text
- View/download PDF
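A hedged illustration of the parallel mapping the abstract describes: one CUDA thread per (incident direction, wavelength) pair, with the angular integration left as a subsequent reduction. The array names and the simple radiance-times-reflectance product are hypothetical placeholders, not the paper's radiometric model.

```cuda
// Each thread computes one independent (direction, band) contribution;
// summing over directions per band is a separate reduction step.
__global__ void reflectedContribKernel(const float* skyRadiance, // [nDir*nBand]
                                       const float* reflectance, // [nDir*nBand]
                                       float* contrib,           // [nDir*nBand]
                                       int nDir, int nBand)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // flattened pair index
    if (i < nDir * nBand)
        contrib[i] = skyRadiance[i] * reflectance[i];
}
```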
15. GPU Compute Unified Device Architecture (CUDA)-based Parallelization of the RRTMG Shortwave Rapid Radiative Transfer Model.
- Author
-
Mielikainen, Jarno, Price, Erik, Huang, Bormin, Huang, Hung-Lung Allen, and Lee, Tsengdar
- Abstract
Radiative transfer of electromagnetic radiation through a planetary atmosphere is computed using an atmospheric radiative transfer model (RTM). One RTM is the rapid RTM (RRTM), which calculates both longwave and shortwave atmospheric radiative fluxes and heating rates. The broadband radiative transfer code for general circulation model (GCM) applications, rapid RTM for global (RRTMG), is based on the single-column reference code, RRTM. The focus of this paper is on the RRTMG shortwave (RRTMG_SW) model. Due to its accuracy, RRTMG_SW has been implemented operationally in many weather forecast and climate models. In this paper, we examine the feasibility of using graphics processing units (GPUs) to accelerate RRTMG_SW for a massive number of atmospheric profiles. In recent years, GPUs have emerged as a low-cost, low-power, and very high-performance alternative to conventional central processing units (CPUs). GPUs can provide a substantial improvement in RRTMG speed by supporting the parallel computation of large numbers of independent radiative calculations in separate atmospheric profiles. A GPU-compatible version of RRTMG was implemented, and thorough testing was performed to ensure that the original level of accuracy is retained. Our results show that GPUs can provide significant speedup over conventional CPUs. In particular, NVIDIA's Tesla K40 GPU card can provide a speedup of 202× compared to its single-threaded Fortran counterpart running on an Intel Xeon E5-2603 CPU, whereas the speedup for four CPU cores, on one CPU socket, with respect to one CPU core is 5.6×. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
16. Microwave Unmixing With Video Segmentation for Inferring Broadleaf and Needleleaf Brightness Temperatures and Abundances From Mixed Forest Observations.
- Author
-
Gu, Lingjia, Zhao, Kai, and Huang, Bormin
- Subjects
OPTICAL properties, MIXED forests, MICROWAVES, GRISELINIA littoralis, BRIGHTNESS temperature - Abstract
Passive microwave sensors have a strong capability of penetrating forest layers to obtain more information from the forest canopy and ground surface. For forest management, it is useful to study passive microwave signals from forests. Passive microwave sensors can detect signals from needleleaf, broadleaf, and mixed forests. The observed brightness temperature of a mixed forest can be approximated by a linear combination of the needleleaf and broadleaf brightness temperatures weighted by their respective abundances. For a mixed forest observed by an N-band microwave radiometer with horizontal and vertical polarizations, there are 2N observed brightness temperatures. It is desirable to infer 4N + 2 unknowns: 2N broadleaf brightness temperatures, 2N needleleaf brightness temperatures, 1 broadleaf abundance, and 1 needleleaf abundance. This is a challenging underdetermined problem. In this paper, we devise a novel method that combines microwave unmixing with video segmentation for inferring broadleaf and needleleaf brightness temperatures and abundances from mixed forests. We propose an improved Otsu method for video segmentation to infer broadleaf and needleleaf abundances. The brightness temperatures of needleleaf and broadleaf trees can then be solved for by nonnegative least squares. For our mixed forest unmixing problem, it turns out that the ordinary least squares solution yields the desired positive brightness temperatures. The experimental results demonstrate that the proposed method is able to unmix broadleaf and needleleaf brightness temperatures and abundances well. The absolute differences between the reconstructed and observed brightness temperatures of the mixed forest are well within 1 K. [ABSTRACT FROM AUTHOR] (The mixing relation is written out after this entry.)
- Published
- 2016
- Full Text
- View/download PDF
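In generic notation (not necessarily the paper's), the linear mixing relation behind the counting argument above is:

$$T^{(k)}_{\mathrm{mix}} \;=\; \alpha_B\,T^{(k)}_B \;+\; \alpha_N\,T^{(k)}_N, \qquad \alpha_B+\alpha_N=1,\quad \alpha_B,\alpha_N\ge 0,$$

for each of the $k = 1,\dots,2N$ band-polarization channels. That gives 2N equations against 4N + 2 unknowns, which is why the abundances $\alpha_B,\alpha_N$ are fixed first (here via video segmentation) before the per-channel brightness temperatures are solved by least squares.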
17. Optimizing Purdue-Lin Microphysics Scheme for Intel Xeon Phi Coprocessor.
- Author
-
Mielikainen, Jarno, Huang, Bormin, and Huang, Hung-Lung Allen
- Abstract
Due to severe weather events, there is a growing need for more accurate weather predictions; climate change has increased both the frequency and severity of such events. Optimizing weather model source code results in reduced run times or more accurate weather predictions. One such weather model is the weather research and forecasting (WRF) model, which is designed for both numerical weather prediction (NWP) and atmospheric research. The WRF software infrastructure consists of several components such as dynamic solvers and physics schemes. The Purdue-Lin scheme is a relatively sophisticated microphysics scheme in the WRF model. The scheme includes six classes of hydrometeors: 1) water vapor; 2) cloud water; 3) rain; 4) cloud ice; 5) snow; and 6) graupel. The scheme is very suitable for massively parallel computation as there are no interactions among horizontal grid points. Thus, we present our optimization results for the Purdue-Lin microphysics scheme. Those optimizations included improved vectorization of the code to better utilize the multiple vector units inside each processor core. The performed optimizations improved the performance of the original unmodified Purdue-Lin microphysics code running natively on a Xeon Phi 7120P by a factor of 4.7×. Similarly, the same optimizations improved the performance of the Purdue-Lin microphysics scheme on a dual-socket configuration of eight-core Intel Xeon E5-2670 CPUs by a factor of 1.3× compared to the original code. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
18. Lossless Compression of Hyperspectral Imagery via Clustered Differential Pulse Code Modulation with Removal of Local Spectral Outliers.
- Author
-
Wu, Jiaji, Kong, Wanqiu, Mielikainen, Jarno, and Huang, Bormin
- Subjects
DIFFERENTIAL pulse code modulation ,HYPERSPECTRAL imaging systems ,LOSSLESS data compression ,OUTLIERS (Statistics) ,SPECTROMETERS - Abstract
A high-order clustered differential pulse code modulation method with removal of local spectral outliers (C-DPCM-RLSO) is proposed for the lossless compression of hyperspectral images. By adaptively removing the local spectral outliers, the C-DPCM-RLSO method improves the prediction accuracy of the high-order regression predictor and reduces the residuals between the predicted and original images. The experiments on a set of NASA Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) test images show that the C-DPCM-RLSO method has a comparable average compression gain but a much reduced execution time compared with previous lossless methods. [ABSTRACT FROM PUBLISHER] (A generic form of the interband predictor is given after this entry.)
- Published
- 2015
- Full Text
- View/download PDF
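As a hedged sketch (generic DPCM notation, not the paper's exact formulation), a high-order interband predictor for band $k$ takes the form:

$$\hat{x}_k(i) \;=\; \sum_{j=1}^{P} a_j\, x_{k-j}(i), \qquad r_k(i) \;=\; x_k(i) - \operatorname{round}\!\big(\hat{x}_k(i)\big),$$

where $x_k(i)$ is pixel $i$ in band $k$, $P$ is the prediction order, and the coefficients $a_j$ are fit by least-squares regression (per cluster, in the clustered variant). Lossless compression then entropy-codes the integer residuals $r_k(i)$; removing local spectral outliers before the fit is what tightens the residual distribution here.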
19. A GPU-based Implementation of WRF PBL/MYNN Surface Layer Scheme
- Author
-
Wu, Xianyun, Huang, Bormin, Huang, H.-L. Allen, and Goldberg, Mitchell D.
- Published
- 2012
- Full Text
- View/download PDF
20. Parallel Computation of the Weather Research and Forecast (WRF) WDM5 Cloud Microphysics on a Many-Core GPU
- Author
-
Wang, Jun, Huang, Bormin, Huang, Allen, and Goldberg, Mitchell D.
- Published
- 2011
- Full Text
- View/download PDF
21. Parallel Implementation of Edge-Directed Image Interpolation on a Graphics Processing Unit
- Author
-
Wu, Jiaji, Li, Tao, and Huang, Bormin
- Published
- 2011
- Full Text
- View/download PDF
22. Accelerating the Kalman Filter on a GPU
- Author
-
Huang, Min-Yu, Wei, Shih-Chieh, Huang, Bormin, and Chang, Yang-Lang
- Published
- 2011
- Full Text
- View/download PDF
23. GPU Implementation of Orthogonal Matching Pursuit for Compressive Sensing
- Author
-
Fang, Yong, Chen, Liang, Wu, Jiaji, and Huang, Bormin
- Published
- 2011
- Full Text
- View/download PDF
24. Development of GPU-based RTTOV-7 IASI and AMSU-A forward models
- Author
-
Mielikainen, Jarno, Huang, Bormin, Huang, Hung-Lung Allen, and Saunders, Roger
- Published
- 2011
- Full Text
- View/download PDF
25. GPU-based spatially divided predictive partitioned vector quantization for GIFTS ultraspectral data compression
- Author
-
Wei, Shih-Chieh and Huang, Bormin
- Published
- 2011
- Full Text
- View/download PDF
26. Real-Time Big Data Analytical Architecture for Remote Sensing Application.
- Author
-
Rathore, Muhammad Mazhar Ullah, Paul, Anand, Ahmad, Awais, Chen, Bo-Wei, Huang, Bormin, and Ji, Wen
- Abstract
The assets of the remote sensing digital world generate a massive volume of real-time data daily (commonly referred to as “Big Data”), whose insight information has potential significance if collected and aggregated effectively. In today's era, there is a great deal more to real-time remote sensing Big Data than meets the eye, and extracting the useful information in an efficient manner leads a system toward major computational challenges, such as analyzing, aggregating, and storing remotely collected data. Keeping in view the above-mentioned factors, there is a need for designing a system architecture that welcomes both real-time and offline data processing. Therefore, in this paper, we propose a real-time Big Data analytical architecture for remote sensing satellite applications. The proposed architecture comprises three main units: 1) remote sensing Big Data acquisition unit (RSDU); 2) data processing unit (DPU); and 3) data analysis decision unit (DADU). First, RSDU acquires data from the satellite and sends them to the Base Station, where initial processing takes place. Second, DPU plays a vital role in the architecture for efficient processing of real-time Big Data by providing filtration, load balancing, and parallel processing. Third, DADU is the upper-layer unit of the proposed architecture, responsible for compilation, storage of the results, and generation of decisions based on the results received from DPU. The proposed architecture has the capability of dividing, load balancing, and parallel processing of only useful data. Thus, it results in efficiently analyzing real-time remote sensing Big Data using an earth observatory system. Furthermore, the proposed architecture has the capability of storing incoming raw data to perform offline analysis on large stored dumps, when required. Finally, a detailed analysis of remotely sensed earth observatory Big Data for land and sea areas is provided using Hadoop. In addition, various algorithms are proposed for each level of RSDU, DPU, and DADU to detect land as well as sea areas to elaborate the working of the architecture. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
- Full Text
- View/download PDF
27. Optimizing Total Energy–Mass Flux (TEMF) Planetary Boundary Layer Scheme for Intel’s Many Integrated Core (MIC) Architecture.
- Author
-
Mielikainen, Jarno, Huang, Bormin, and Huang, Hung-Lung Allen
- Abstract
In order to make use of the ever-improving microprocessor performance, applications must be modified to take advantage of the parallelism of today's microprocessors. One such application that needs to be modernized is the weather research and forecasting (WRF) model, which is designed for numerical weather prediction and atmospheric research. The WRF software infrastructure consists of several components such as dynamic solvers and physics schemes. Numerical models are used to resolve the large-scale flow, while subgrid-scale parameterizations estimate small-scale properties (e.g., boundary layer turbulence and convection, clouds, radiation). These have a significant influence on the resolved scale due to the complex nonlinear nature of the atmosphere. For the cloudy planetary boundary layer (PBL), it is fundamental to parameterize vertical turbulent fluxes and subgrid-scale condensation in a realistic manner. A parameterization based on the total energy–mass flux (TEMF) that unifies turbulence and moist convection components produces better results than other PBL schemes. Thus, we present our optimization results for the TEMF PBL scheme. Those optimizations included vectorization of the code to utilize the multiple vector units inside each processor core. The optimizations improved the performance of the original TEMF code on a Xeon Phi 7120P by a factor of 25.9×. Furthermore, the same optimizations improved the performance of TEMF on a dual-socket configuration of eight-core Intel Xeon E5-2670 CPUs by a factor of 8.3× compared to the original TEMF code. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
- Full Text
- View/download PDF
28. Massively Parallel GPU Design of Automatic Target Generation Process in Hyperspectral Imagery.
- Author
-
Li, Xiaojie, Huang, Bormin, and Zhao, Kai
- Abstract
A popular algorithm for hyperspectral image interpretation is the automatic target generation process (ATGP). ATGP creates a set of targets from image data in an unsupervised fashion without prior knowledge. It can be used to search for a specific target in unknown scenes, even when the target's size is smaller than a single pixel. Its application has been demonstrated in many fields, including geology, agriculture, and intelligence. However, the algorithm requires a long processing time due to the massive amount of data. To expedite the process, graphics processing units (GPUs) are an attractive alternative to traditional CPU architectures. In this paper, we propose a GPU-based massively parallel version of ATGP, which provides real-time performance for the first time in the literature. The HYDICE image data (307×307 pixels and 210 spectral bands) are used for benchmarking. Our optimization efforts on the GPU-based ATGP algorithm using one NVIDIA Tesla K20 GPU with I/O transfer achieve a speedup of 362× with respect to its single-threaded CPU counterpart. We also tested the algorithm on the Airborne Visible/InfraRed Imaging Spectrometer (AVIRIS) WTC dataset (512×614 pixels, 224 bands) and the Cuprite dataset (350×350 pixels, 188 bands); the speedups were 416× and 320×, respectively, when the target number was 15. [ABSTRACT FROM PUBLISHER] (The projection step that dominates ATGP is written out after this entry.)
- Published
- 2015
- Full Text
- View/download PDF
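For orientation, ATGP as usually stated iterates an orthogonal subspace projection; this is the standard textbook form, not a quotation from this paper:

$$P^{\perp}_{U} \;=\; I - U\,(U^{T}U)^{-1}U^{T}, \qquad \mathbf{t}_{k+1} \;=\; \arg\max_{\mathbf{r}\,\in\,\text{pixels}} \big\| P^{\perp}_{U_k}\,\mathbf{r} \big\|^{2},$$

where $U_k = [\mathbf{t}_1,\dots,\mathbf{t}_k]$ stacks the targets found so far (with $\mathbf{t}_1$ the pixel of maximum norm). Each iteration scans every pixel independently, which is exactly the structure a GPU exploits.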
29. Massive Parallelization of the WRF GCE Model Toward a GPU-Based End-to-End Satellite Data Simulator Unit.
- Author
-
Huang, Melin, Huang, Bormin, Li, Xiaojie, Huang, Allen Hung-Lung, Goldberg, Mitchell D., and Mehta, Ajay
- Abstract
Modern weather satellites provide more detailed observations of cloud and precipitation processes. To harness these observations for better satellite data assimilation, a cloud-resolving model, known as the Goddard Cumulus Ensemble (GCE) model, was developed and used by the Goddard Satellite Data Simulator Unit (G-SDSU). The GCE model has also been incorporated as part of the widely used weather research and forecasting (WRF) model. The computation of the cloud-resolving GCE model is time-consuming. This paper details our massively parallel design of the GPU-based WRF GCE scheme. With one NVIDIA Tesla K40 GPU, the GPU-based GCE scheme achieves a speedup of 361× compared to its original Fortran counterpart running on one CPU core, whereas the speedup for one CPU socket (four cores) with respect to one CPU core is only 3.9×. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
- Full Text
- View/download PDF
30. Efficient Parallel GPU Design on WRF Five-Layer Thermal Diffusion Scheme.
- Author
-
Huang, Melin, Huang, Bormin, Chang, Yang-Lang, Mielikainen, Jarno, Huang, Hung-Lung Allen, and Goldberg, Mitchell D.
- Abstract
Satellite remote-sensing observations and ground-based radar can detect weather conditions from a distance and are widely used to monitor the weather all around the globe. The assimilated satellite/radar data are passed through weather models for weather forecasting. The five-layer thermal diffusion scheme is one such model component, handling an energy budget made up of sensible, latent, and radiative heat fluxes. The scheme's feature of no interactions among horizontal grid points makes it very favorable for parallel processing. This study demonstrates an implementation of this scheme using the graphics processing unit (GPU) massively parallel architecture. By employing one NVIDIA Tesla K40 GPU, our GPU optimization effort on this scheme achieves a speedup of 311× with respect to its counterpart Fortran code running on one CPU core of an Intel Xeon E5-2603, whereas the speedup for one CPU socket (four cores) with respect to one CPU core is only 3.1×. We can even boost the speedup of this scheme to 398× with respect to one CPU core when two NVIDIA Tesla K40 GPUs are applied. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
- Full Text
- View/download PDF
31. Performance and Scalability of the JCSDA Community Radiative Transfer Model (CRTM) on NVIDIA GPUs.
- Author
-
Mielikainen, Jarno, Huang, Bormin, Huang, Hung-Lung Allen, and Lee, Tsengdar
- Abstract
An atmospheric radiative transfer model calculates the radiative transfer of electromagnetic radiation through the Earth's atmosphere. The community radiative transfer model (CRTM) is a fast radiative transfer model for calculating the satellite infrared (IR) and microwave (MW) radiances of a given state of the Earth's atmosphere and its surface. The CRTM takes into account the radiance emission and absorption of various atmospheric gases as well as the emission and reflection of various surface types. Two different transmittance algorithms are currently available in the CRTM OPTRAN: optical depth in absorber space (ODAS) and optical depth in pressure space (ODPS). ODAS in the current CRTM allows two variable absorbers (water vapor and ozone). In this paper, we examine the feasibility of using graphics processing units (GPUs) to accelerate the CRTM with the ODAS transmittance model. Using commodity GPUs for accelerating the CRTM means that the hardware costs of adding high-performance accelerators to a computation hardware configuration are significantly reduced. Our results show that GPUs can provide significant speedup over conventional processors for the 8461-channel IASI sounder. In particular, one GPU on the dual-GPU NVIDIA GTX 590 card can provide a speedup of 375× for the single-precision version of the CRTM ODAS compared to its single-threaded Fortran counterpart running on an Intel i7 920 CPU, whereas the speedup for one CPU socket with respect to one CPU core is only 6.3×. Furthermore, two NVIDIA GTX 590s provided speedups of 201× and 1367× for the double-precision and single-precision versions of ODAS compared to the single-threaded Fortran code. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
- Full Text
- View/download PDF
32. Multisource Data Fusion and Fisher Criterion-Based Nearest Feature Space Approach to Landslide Classification.
- Author
-
Chang, Yang-Lang, Wang, Yi Chun, Fu, Yi-Shiang, Han, Chin-Chuan, Chanussot, Jocelyn, and Huang, Bormin
- Abstract
In this paper, a novel technique known as the Fisher criterion-based nearest feature space (FCNFS) approach is proposed for supervised classification of multisource images for the purpose of landslide hazard assessment. The method is developed for land cover classification based upon the fusion of remotely sensed images of the same scene collected from multiple sources. This paper presents a framework for data fusion of multisource remotely sensed images consisting of two approaches: 1) the band generation process (BGP); and 2) the FCNFS classifier. We propose the BGP to create a new set of additional bands that are specifically accommodated to the landslide class and are extracted from the original multisource images. In comparison to the original nearest feature space (NFS) method, the proposed FCNFS classifier uses the Fisher criterion of between-class and within-class discrimination to enhance the classifier. In the training phase, the labeled samples are discriminated by the Fisher criterion, which can be treated as a preprocessing step of the NFS method. After completion of the training, the classification results can be obtained from the NFS algorithm. In order for the proposed FCNFS to be effective for multispectral images, a multiple adaptive BGP is introduced to create an additional set of bands specially accommodated to landslide classes. Experimental results show that the proposed BGP/FCNFS framework is suitable for land cover classification in Earth remote sensing and improves the classification accuracy compared to conventional classifiers. [ABSTRACT FROM PUBLISHER] (The Fisher criterion is stated after this entry.)
- Published
- 2015
- Full Text
- View/download PDF
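The Fisher criterion invoked above, in its classical form (generic notation, not the paper's):

$$J(\mathbf{w}) \;=\; \frac{\mathbf{w}^{T} S_{B}\,\mathbf{w}}{\mathbf{w}^{T} S_{W}\,\mathbf{w}},$$

where $S_B$ is the between-class scatter matrix and $S_W$ the within-class scatter matrix; maximizing $J$ picks directions that separate class means while keeping each class compact, which is the discrimination step applied to the labeled samples before NFS classification.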
33. Burst Error Studies with DVB-S2 and 3D Wavelet Reversible Variable-Length Coding for Ultraspectral Sounder Data Compression
- Author
-
Huang, Bormin, Ahuja, Alok, and Sriraja, Y.
- Published
- 2006
- Full Text
- View/download PDF
34. Optimal Code Rate Allocation with Turbo Product Codes for JPEG2000 Compression of Ultraspectral Sounder Data
- Author
-
Huang, Bormin, Sriraja, Y., and Ahuja, Alok
- Published
- 2006
- Full Text
- View/download PDF
35. Real-Time Implementation of the Pixel Purity Index Algorithm for Endmember Identification on GPUs.
- Author
-
Wu, Xianyun, Huang, Bormin, Plaza, Antonio, Li, Yunsong, and Wu, Chengke
- Abstract
Spectral unmixing amounts to automatically finding the signatures of pure spectral components (called endmembers in the hyperspectral imaging literature) and their associated abundance fractions in each pixel of a hyperspectral image. Many algorithms have been proposed to automatically find spectral endmembers in hyperspectral data sets. Perhaps one of the most popular is the pixel purity index (PPI), which is available in the ENVI software from Exelis Visual Information Solutions. This algorithm identifies the endmembers as the pixels with maximum projection values after projection onto a large, randomly generated set of vectors (called skewers). Although the algorithm has been widely used in the spectral unmixing community, it is highly time consuming as its precision increases only asymptotically. Due to its high computational complexity, the PPI algorithm has recently been implemented on several high-performance computing architectures, including commodity clusters, heterogeneous and distributed systems, field-programmable gate arrays, and graphics processing units (GPUs). In this letter, we present an improved GPU implementation of the PPI algorithm, which provides real-time performance for the first time in the literature. [ABSTRACT FROM PUBLISHER] (A sketch of the skewer-projection kernel follows this entry.)
- Published
- 2014
- Full Text
- View/download PDF
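A hedged illustration of the PPI inner loop (not the paper's tuned implementation): one CUDA thread per random skewer, projecting every pixel and voting for the extremes. Array names and sizes are hypothetical; a tuned version would tile pixels through shared memory.

```cuda
// One thread per skewer: dot every pixel with the skewer, then record a
// purity vote for the pixels attaining the max and min projections.
__global__ void ppiKernel(const float* pixels,   // [nPixels * nBands]
                          const float* skewers,  // [nSkewers * nBands]
                          int* counts,           // [nPixels] purity counters
                          int nPixels, int nSkewers, int nBands)
{
    int s = blockIdx.x * blockDim.x + threadIdx.x;
    if (s >= nSkewers) return;

    float maxV = -1e30f, minV = 1e30f;
    int maxI = 0, minI = 0;
    for (int p = 0; p < nPixels; ++p) {
        float dot = 0.0f;
        for (int b = 0; b < nBands; ++b)
            dot += pixels[p * nBands + b] * skewers[s * nBands + b];
        if (dot > maxV) { maxV = dot; maxI = p; }
        if (dot < minV) { minV = dot; minI = p; }
    }
    atomicAdd(&counts[maxI], 1);   // extreme pixels score a purity vote
    atomicAdd(&counts[minI], 1);
}
```

Pixels whose counters exceed a threshold after many skewers are the endmember candidates, which is why PPI's precision grows only asymptotically with the number of skewers.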
36. GPU-Accelerated Computation for Electromagnetic Scattering of a Double-Layer Vegetation Model.
- Author
-
Su, Xiang, Wu, Jiaji, Huang, Bormin, and Wu, Zhensen
- Abstract
In this paper, we develop a graphics processing unit (GPU)-based massively parallel approach for efficient computation of electromagnetic scattering via a proposed double-layer vegetation model composed of vegetation and ground layers. The proposed vector radiative transfer (VRT) model for vegetation scattering considers different sizes and orientations of the leaves. It uses the Monte Carlo method to calculate the backward scattering coefficients of rough ground and vegetation, where the leaves are approximated as a large number of randomly oriented flat ellipsoids and the ground is treated as a Gaussian random rough surface. In the original CPU-based sequential code, the Monte Carlo simulation to calculate the electromagnetic scattering of vegetation takes up 97.2% of the total execution time. In this paper, we take advantage of the massively parallel compute capability of the NVIDIA Fermi GTX 480 with the Compute Unified Device Architecture (CUDA) to compute the multiple scattering of all the leaf groups simultaneously. Our parallel design includes registers for faster memory access, shared memory for parallel reduction, pipelined multiple-stream asynchronous transfer, a parallel random number generator, and CPU-GPU heterogeneous computation. Using these techniques, we achieved speedups of 213-fold on the NVIDIA GTX 480 GPU and 291-fold on the NVIDIA GTX 590 GPU as compared with the single-core CPU counterpart. [ABSTRACT FROM PUBLISHER] (The shared-memory reduction pattern is sketched after this entry.)
- Published
- 2013
- Full Text
- View/download PDF
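The abstract mentions shared memory for parallel reduction; the kernel below is the textbook tree-reduction pattern for summing per-leaf scattering contributions. The array names are hypothetical, and the block size is assumed to be a power of two.

```cuda
// Sum n per-leaf contributions into one partial sum per block using
// shared memory; a second pass (or host loop) sums the block results.
__global__ void reduceSum(const float* contrib, float* blockSums, int n)
{
    extern __shared__ float sdata[];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x * 2 + threadIdx.x;

    float v = 0.0f;                          // each thread loads two elements
    if (i < n)               v += contrib[i];
    if (i + blockDim.x < n)  v += contrib[i + blockDim.x];
    sdata[tid] = v;
    __syncthreads();

    for (int s = blockDim.x / 2; s > 0; s >>= 1) {   // tree reduction
        if (tid < s) sdata[tid] += sdata[tid + s];
        __syncthreads();
    }
    if (tid == 0) blockSums[blockIdx.x] = sdata[0];  // partial sum per block
}
```

A typical launch is `reduceSum<<<blocks, threads, threads * sizeof(float)>>>(...)`, where the third argument sizes the dynamic shared-memory buffer.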
37. Improved GPU/CUDA Based Parallel Weather and Research Forecast (WRF) Single Moment 5-Class (WSM5) Cloud Microphysics.
- Author
-
Mielikainen, Jarno, Huang, Bormin, Huang, Hung-Lung Allen, and Goldberg, Mitchell D.
- Abstract
The Weather Research and Forecasting (WRF) model is an atmospheric simulation system designed for both operational and research use. WRF is currently in operational use at the National Oceanic and Atmospheric Administration (NOAA)'s National Weather Service as well as at the Air Force Weather Agency and meteorological services worldwide. Getting weather predictions in time using the latest advances in atmospheric sciences is a challenge even on the fastest supercomputers. Timely weather predictions are particularly useful for severe weather events when lives and property are at risk. Microphysics is a crucial but computationally intensive part of WRF. The WRF Single Moment 5-class (WSM5) microphysics scheme represents fallout of various types of precipitation, condensation, and thermodynamic effects of latent heat release. Therefore, to expedite the computation process, graphics processing units (GPUs) appear an attractive alternative to traditional CPU architectures. In this paper, we accelerate the WSM5 microphysics scheme on GPUs and obtain a considerable speedup, thereby significantly reducing the processing time. Such high-performance and computationally efficient GPUs allow us to use higher-resolution WRF forecasts. The use of high-resolution WRF enables us to compute microphysical processes for increasingly small clouds and water droplets. To implement the WSM5 scheme on GPUs, the WRF code was rewritten into CUDA C, a high-level data-parallel programming language used on NVIDIA GPUs. We observed a reduction in processing time from 16928 ms on the CPU to 43.5 ms on the GPU. We obtained a speedup of 389× without I/O using a single GPU. Taking I/O transfer times into account, the speedup obtained is 206×. The speedup was further increased by using four GPUs, the speedups being 1556× and 357× without I/O and with I/O, respectively. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
- Full Text
- View/download PDF
38. GPU Implementation of Stony Brook University 5-Class Cloud Microphysics Scheme in the WRF.
- Author
-
Mielikainen, Jarno, Huang, Bormin, Huang, Hung-Lung Allen, and Goldberg, Mitchell D.
- Abstract
The Weather Research and Forecasting (WRF) model is a next-generation mesoscale numerical weather prediction system. It is designed to serve the needs of both operational forecasting and atmospheric research for a broad spectrum of applications across scales ranging from meters to thousands of kilometers. Microphysics plays an important role in weather and climate prediction and includes explicitly resolved water vapor, cloud, and precipitation processes. Several bulk water microphysics schemes are available within WRF, with different numbers of simulated hydrometeor classes and methods for estimating their sizes, fall speeds, distributions, and densities. The Stony Brook University scheme is a 5-class scheme with riming intensity predicted to account for mixed-phase processes. In this paper, we develop an efficient graphics processing unit (GPU)-based Stony Brook University scheme. The GPU-based scheme was compared to a CPU-based single-threaded counterpart on a computational domain of 422×297 horizontal grid points with 34 vertical levels. The original Fortran code was first rewritten into standard C. After that, the C code was verified against the Fortran code, and CUDA C extensions were added for data-parallel execution on GPUs. On a single GPU, we achieved a speedup of 213× with data I/O and 896× without I/O on an NVIDIA GTX 590. Using multiple GPUs, a speedup of 352× is achieved with I/O for 4 GPUs. We also discuss how data I/O would be less cumbersome if the complete WRF model were run on GPUs. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
- Full Text
- View/download PDF
39. GPU Acceleration of the Updated Goddard Shortwave Radiation Scheme in the Weather Research and Forecasting (WRF) Model.
- Author
-
Mielikainen, Jarno, Huang, Bormin, Huang, Hung-Lung Allen, and Goldberg, Mitchell D.
- Abstract
The next-generation mesoscale numerical weather prediction system, the Weather Research and Forecasting (WRF) model, is designed for dual use in forecasting and research. WRF offers multiple physics options that can be combined in any way. One of the physics options is radiance computation. The major source of energy for the Earth's climate is solar radiation; thus, it is imperative to accurately model the horizontal and vertical distribution of the heating. The Goddard solar radiative transfer model includes the absorption due to water vapor, O3, O2, CO2, clouds, and aerosols. The model computes the interactions among the absorption and scattering by clouds, aerosols, molecules, and the surface. Finally, fluxes are integrated over the entire shortwave spectrum from 0.175 µm to 10 µm. In this paper, we develop an efficient graphics processing unit (GPU)-based Goddard shortwave radiative scheme. The GPU-based Goddard shortwave scheme was compared to a CPU-based single-threaded counterpart on a computational domain of 422×297 horizontal grid points with 34 vertical levels. Both the original Fortran code on the CPU and the CUDA C code on the GPU use double-precision floating-point values for computation. The processing time for Goddard shortwave radiance on the CPU is 22106 ms. GPU-accelerated Goddard shortwave radiance on 4 GPUs can be computed in 208.8 ms and 157.1 ms with and without I/O, respectively. Thus, the speedups are 116× with data I/O and 141× without I/O on two NVIDIA GTX 590s. Using single-precision arithmetic and less accurate arithmetic modes, the speedups are increased to 536× and 259×, with and without I/O, respectively. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
- Full Text
- View/download PDF
40. Recent Developments in High Performance Computing for Remote Sensing: A Review.
- Author
-
Lee, Craig A., Gasster, Samuel D., Plaza, Antonio, Chang, Chein-I, and Huang, Bormin
- Abstract
Remote sensing data have become very widespread in recent years, and the exploitation of this technology has gone from developments mainly conducted by government intelligence agencies to those carried out by general users and companies. There is a great deal more to remote sensing data than meets the eye, and extracting that information turns out to be a major computational challenge. For this purpose, high performance computing (HPC) infrastructure such as clusters, distributed networks or specialized hardware devices provide important architectural developments to accelerate the computations related with information extraction in remote sensing. In this paper, we review recent advances in HPC applied to remote sensing problems; in particular, the HPC-based paradigms included in this review comprise multiprocessor systems, large-scale and heterogeneous networks of computers, grid and cloud computing environments, and hardware systems such as field programmable gate arrays (FPGAs) and graphics processing units (GPUs). Combined, these parts deliver a snapshot of the state-of-the-art and most recent developments in those areas, and offer a thoughtful perspective of the potential and emerging challenges of applying HPC paradigms to remote sensing problems. [ABSTRACT FROM PUBLISHER]
- Published
- 2011
- Full Text
- View/download PDF
41. Group and Region Based Parallel Compression Method Using Signal Subspace Projection and Band Clustering for Hyperspectral Imagery.
- Author
-
Chang, Lena, Chang, Yang-Lang, Tang, Z. S., and Huang, Bormin
- Abstract
In this study, a novel group- and region-based parallel compression approach is proposed for hyperspectral imagery. The proposed approach contains two algorithms: clustering signal subspace projection (CSSP) and maximum correlation band clustering (MCBC). The CSSP first divides the image into proper regions by transforming the high-dimensional image data into a one-dimensional projection length. The MCBC partitions the spectral bands into several groups according to their associated band correlation for each image region. Image data with high correlations in the spatial/spectral domains are thus gathered into groups. The grouped image data are then further compressed by Principal Component Analysis (PCA)-based spectral/spatial hyperspectral image compression techniques. Furthermore, to accelerate the computing efficiency, we present a parallel architecture of the proposed compression approach using parallel cluster computing techniques. Simulation results on AVIRIS images have shown that the proposed group- and region-based approach performs better than standard 3D hyperspectral image compression. Moreover, the proposed approach achieves better computational efficiency than the direct combination of PCA and JPEG2000 under the same compression ratio. [ABSTRACT FROM PUBLISHER]
- Published
- 2011
- Full Text
- View/download PDF
42. A GPU-Accelerated Wavelet Decompression System With SPIHT and Reed-Solomon Decoding for Satellite Images.
- Author
-
Song, Changhe, Li, Yunsong, and Huang, Bormin
- Abstract
The discrete wavelet transform (DWT)-based Set Partitioning in Hierarchical Trees (SPIHT) algorithm is widely used in many image compression systems. The time-consuming computation of the 9/7 discrete wavelet decomposition is usually the bottleneck of these systems. In order to perform real-time Reed-Solomon channel decoding and SPIHT+DWT source decoding on a massive bit stream of compressed images continuously down-linked from the satellite, we propose a novel graphics processing unit (GPU)-accelerated decoding system. In this system, the GPU computes the time-consuming inverse DWT, while multiple CPU threads run in parallel for the remaining parts of the system. Both the CPU and GPU parts were carefully designed to have approximately the same processing speed to obtain maximum throughput via a novel pipeline structure for processing continuous satellite images. As part of the SPIHT decoding system, the GPU-based inverse DWT is about 158 times faster than its CPU counterpart. Through the pipelined CPU-GPU heterogeneous computing, the entire decoding system approaches a speedup of 83× compared to its single-threaded CPU counterpart. The proposed channel and source decoding system is able to decompress 1024×1024 satellite images at a speed of 90 frames per second. [ABSTRACT FROM PUBLISHER] (The CPU-GPU pipelining pattern is sketched after this entry.)
- Published
- 2011
- Full Text
- View/download PDF
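A hedged sketch of the pipelined pattern the abstract describes: pinned buffers plus CUDA streams so one frame's transfers overlap another frame's inverse DWT on the GPU. The kernel, buffer names, and frame size are hypothetical placeholders; the real system's CPU threads (Reed-Solomon and SPIHT decoding) would fill the host buffers.

```cuda
#include <cuda_runtime.h>

__global__ void inverseDwtKernel(float* coeffs, int n) { /* 9/7 IDWT elided */ }

void decodeFrames(int nFrames, size_t frameSize)
{
    const int NSTREAMS = 2;                 // double-buffered pipeline
    cudaStream_t streams[NSTREAMS];
    float *hBuf[NSTREAMS], *dBuf[NSTREAMS];

    for (int i = 0; i < NSTREAMS; ++i) {
        cudaStreamCreate(&streams[i]);
        // Pinned host memory is required for truly asynchronous copies.
        cudaHostAlloc(&hBuf[i], frameSize * sizeof(float), cudaHostAllocDefault);
        cudaMalloc(&dBuf[i], frameSize * sizeof(float));
    }
    for (int f = 0; f < nFrames; ++f) {
        int s = f % NSTREAMS;
        cudaStreamSynchronize(streams[s]);  // wait until this buffer is free
        // CPU threads would deposit SPIHT-decoded coefficients in hBuf[s] here.
        cudaMemcpyAsync(dBuf[s], hBuf[s], frameSize * sizeof(float),
                        cudaMemcpyHostToDevice, streams[s]);
        inverseDwtKernel<<<(unsigned)((frameSize + 255) / 256), 256, 0,
                           streams[s]>>>(dBuf[s], (int)frameSize);
        cudaMemcpyAsync(hBuf[s], dBuf[s], frameSize * sizeof(float),
                        cudaMemcpyDeviceToHost, streams[s]);
    }
    for (int i = 0; i < NSTREAMS; ++i) {
        cudaStreamSynchronize(streams[i]);
        cudaFree(dBuf[i]); cudaFreeHost(hBuf[i]);
        cudaStreamDestroy(streams[i]);
    }
}
```

Balancing the CPU decode rate against the GPU IDWT rate, as the abstract notes, is what keeps both sides of this pipeline busy.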
43. A Parallel Simulated Annealing Approach to Band Selection for High-Dimensional Remote Sensing Images.
- Author
-
Chang, Yang-Lang, Chen, Kun-Shan, Huang, Bormin, Chang, Wen-Yen, Benediktsson, Jon Atli, and Chang, Lena
- Abstract
In this paper, a parallel band selection approach, referred to as parallel simulated annealing band selection (PSABS), is presented for high-dimensional remote sensing images. The approach is based on the simulated annealing band selection (SABS) scheme, which was originally designed to group highly correlated hyperspectral bands into a smaller subset of modules regardless of the original order in terms of wavelengths. SABS selects sets of correlated hyperspectral bands based on the simulated annealing (SA) algorithm and utilizes the inherent separability of different classes to reduce dimensionality. The proposed PSABS improves the computational performance by using parallel computing techniques. It allows multiple Markov chains (MMC) to be traced simultaneously and fully utilizes the parallelism of SABS to create a set of SABS modules on each parallel node. Two parallel implementations, namely the message passing interface (MPI) cluster-based library and the open multi-processing (OpenMP) multicore-based application programming interface, are applied to three different MMC techniques: non-interacting MMC, periodic exchange MMC, and asynchronous MMC. The effectiveness of the proposed PSABS is evaluated using NASA MODIS/ASTER (MASTER) airborne simulator data sets and airborne synthetic aperture radar (SAR) images for land cover classification during the Pacrim II campaign. The results demonstrate that the MMC techniques of PSABS can significantly improve the computational performance and provide a more reliable quality of solution compared to the original SABS method. [ABSTRACT FROM PUBLISHER]
- Published
- 2011
- Full Text
- View/download PDF
44. GPU-Accelerated Multi-Profile Radiative Transfer Model for the Infrared Atmospheric Sounding Interferometer.
- Author
-
Mielikainen, Jarno, Huang, Bormin, and Huang, Hung-Lung Allen
- Abstract
In this paper, we develop a novel Graphics Processing Unit (GPU)-based high-performance Radiative Transfer Model (RTM) for the Infrared Atmospheric Sounding Interferometer (IASI), launched in 2006 onboard the first European meteorological polar-orbiting satellite, METOP-A. The proposed GPU RTM processes more than one profile at a time in order to gain a significant speedup compared to processing just one profile at a time. Radiative transfer model performance in operational numerical weather prediction systems still limits the number of channels used from hyperspectral sounders to only a few hundred. To take full advantage of such high-resolution infrared observations, a computationally efficient radiative transfer model is needed. Our GPU-based IASI radiative transfer model is developed to run on a low-cost personal supercomputer with 4 NVIDIA Tesla C1060 GPUs with 960 cores in total, delivering a theoretical peak performance of nearly 4 TFlops. The model exhibited linear scaling with the number of graphics processing units. Computing 10 IASI radiance spectra simultaneously on a GPU, we reached a 763× speedup for 1 GPU and a 3024× speedup for all 4 GPUs, both with respect to the original single-threaded Fortran CPU code. The significant 3024× speedup means that the proposed GPU-based high-performance forward model is able to compute one day's amount of 1,296,000 IASI spectra within 6 minutes, whereas the original CPU-based version would impractically take more than 10 days. The GPU-based high-performance IASI radiative transfer model is suitable for the assimilation of IASI radiance observations into the operational numerical weather forecast model. [ABSTRACT FROM PUBLISHER]
- Published
- 2011
- Full Text
- View/download PDF
45. Accelerating Regular LDPC Code Decoders on GPUs.
- Author
-
Chang, Cheng-Chun, Chang, Yang-Lang, Huang, Min-Yu, and Huang, Bormin
- Abstract
Modern active and passive satellite and airborne sensors with higher temporal, spectral, and spatial resolutions for Earth remote sensing result in a significant increase in data volume. This poses a challenge for data transmission over error-prone wireless links to a ground receiving station. Low-density parity-check (LDPC) codes have been adopted in modern communication systems for robust error correction. Demands for LDPC decoders at a ground receiving station for efficient and flexible data communication links have inspired the use of a cost-effective high-performance computing device. In this paper, we propose a graphics-processing-unit (GPU)-based regular LDPC decoder with the log sum-product iterative decoding algorithm (log-SPA). The GPU code was written for NVIDIA GPUs using the compute unified device architecture (CUDA) language, with a novel implementation of asynchronous data transfer for LDPC decoding. Experimental results show that the proposed GPU-based high-throughput regular LDPC decoder achieves a significant 271× speedup compared to its CPU-based single-threaded counterpart written in the C language. [ABSTRACT FROM PUBLISHER]
- Published
- 2011
- Full Text
- View/download PDF
46. GPU Acceleration of Predictive Partitioned Vector Quantization for Ultraspectral Sounder Data Compression.
- Author
-
Wei, Shih-Chieh and Huang, Bormin
- Abstract
For the large-volume ultraspectral sounder data, compression is desirable to save storage space and transmission time. To retrieve the geophysical parameters without losing precision, ultraspectral sounder data compression has to be lossless. Recently, there has been a boom in the use of graphics processing units (GPUs) for speeding up scientific computations. By identifying the time-dominant portions of the code that can be executed in parallel, significant speedup can be achieved using a GPU. Predictive partitioned vector quantization (PPVQ) has been proven to be an effective lossless compression scheme for ultraspectral sounder data. It consists of linear prediction, bit-depth partitioning, vector quantization, and entropy coding. The two most time-consuming stages, linear prediction and vector quantization, are chosen for GPU-based implementation. By exploiting the data-parallel characteristics of these two stages, a spatial-division design shows a speedup of 72× in our four-GPU-based implementation of the PPVQ compression scheme. [ABSTRACT FROM PUBLISHER]
- Published
- 2011
- Full Text
- View/download PDF
47. Foreword to the Special Issue on Big Data in Remote Sensing.
- Author
-
Chi, Mingmin, Plaza, Antonio J., Benediktsson, Jon Atli, Zhang, Bing, and Huang, Bormin
- Abstract
The papers in this special issue focus on the deployment of Big Data applications in remote sensing. The issue is intended to introduce the latest techniques to manage, exploit, process, and analyze big data in remote sensing applications, and contains 11 papers that exhibit the latest advances in Big Data in Remote Sensing. To understand big data, three facets should usually be taken into account: data ownership, data methods, and data applications. Together they contribute to a single big data life cycle, including identification of applications, data collection, data processing, data analysis, data visualization, data evaluation, and so on. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
- Full Text
- View/download PDF