6,040 results on '"Transforms"'
Search Results
2. Frequency-Guided Spatial Adaptation for Camouflaged Object Detection.
- Author
-
Zhang, Shizhou, Kong, Dexuan, Xing, Yinghui, Lu, Yue, Ran, Lingyan, Liang, Guoqiang, Wang, Hexu, and Zhang, Yanning
- Published
- 2025
- Full Text
- View/download PDF
3. DNP-AUT: Image Compression Using Double-Layer Non-Uniform Partition and Adaptive U Transform.
- Author
-
Zhang, Yumo and Cai, Zhanchuan
- Published
- 2025
- Full Text
- View/download PDF
4. JPEG Image Encryption With DC Rotation and Undivided RSV-Based AC Group Permutation.
- Author
-
Yuan, Yuan, He, Hongjie, Yang, Yaolin, Amirpour, Hadi, Timmerer, Christian, and Chen, Fan
- Published
- 2025
- Full Text
- View/download PDF
5. An Area and Energy Efficient Serial-Multiplier.
- Author
-
Khan, Mohd. Tasleem and Hazarika, Jinti
- Abstract
In this work, we present an area- and energy-efficient serial multiplier. Specifically, we exploit symmetries in odd and even partial products (PPs) in its radix-$\gamma$ implementation. Subsequently, we express them as $\mp(2^{k} \pm 1)$ with $1 \leq k \leq \log_{2}\gamma - 1$, which enables a reduction in hardware resources. For $\gamma \geq 16$, the above representation becomes invalid, requiring additional power-of-two terms and raising hardware costs. To address this, we utilize recursive symmetries in PPs, which enable time-sharing and reduce the logic resources for efficient realization. ASIC synthesis results show the proposed design has substantial savings in area and energy over the state-of-the-art design. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. Exploiting Symmetry in Dynamics for Model-Based Reinforcement Learning With Asymmetric Rewards
- Author
-
Sonmez, Yasin, Junnarkar, Neelay, and Arcak, Murat
- Subjects
Control Engineering, Mechatronics and Robotics, Engineering, Behavioral and Social Science, Basic Behavioral and Social Science, Training, Reinforcement learning, Task analysis, Computational modeling, Transforms, Dynamic programming, Data models, Symmetry, neural networks, reinforcement learning, Control engineering, mechatronics and robotics
- Abstract
Recent work in reinforcement learning has leveraged symmetries in the model to improve sample efficiency in training a policy. A commonly used simplifying assumption is that the dynamics and reward both exhibit the same symmetry; however, in many real-world environments, the dynamical model exhibits symmetry independent of the reward model. In this letter, we assume only the dynamics exhibit symmetry, extending the scope of problems in reinforcement learning and learning in control theory to which symmetry techniques can be applied. We use Cartan's moving frame method to introduce a technique for learning dynamics that, by construction, exhibit specified symmetries. Numerical experiments demonstrate that the proposed method learns a more accurate dynamical model.
- Published
- 2024
7. Cervical‐YOSA: Utilizing prompt engineering and pre‐trained large‐scale models for automated segmentation of multi‐sequence MRI images in cervical cancer
- Author
-
Yanwei Xia, Zhengjie Ou, Lihua Tan, Qiang Liu, Yanfen Cui, Da Teng, and Dan Zhao
- Subjects
biomedical MRI, image segmentation, transforms, Photography, TR1-1050, Computer software, QA76.75-76.765
- Abstract
Cervical cancer is a major health concern, particularly in developing countries with limited medical resources. This study introduces two models aimed at improving cervical tumor segmentation: a semi‐automatic model that fine‐tunes the Segment Anything Model (SAM) and a fully automated model designed for efficiency. Evaluations were conducted using a dataset of 8586 magnetic resonance imaging (MRI) slices, where the semi‐automatic model achieved a Dice Similarity Coefficient (DSC) of 0.9097, demonstrating high accuracy. The fully automated model also performed robustly with a DSC of 0.8526, outperforming existing methods. These models offer significant potential to enhance cervical cancer diagnosis and treatment, especially in resource‐limited settings.
- Published
- 2024
- Full Text
- View/download PDF
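For readers comparing the DSC figures quoted in this record, the following is a minimal sketch of how a Dice similarity coefficient is commonly computed for two binary masks. It is not taken from the paper: the function name `dice_coefficient`, the flat 0/1 mask representation, and the convention for two empty masks are assumptions made for illustration only.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient of two binary masks given as flat 0/1 sequences.

    DSC = 2 * |pred AND truth| / (|pred| + |truth|)
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    # Count pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumed convention: two empty masks are treated as a perfect match.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(dice_coefficient(predicted, reference))
```

A value of 1.0 means the masks coincide and 0.0 means they share no foreground pixels, so the reported 0.9097 and 0.8526 both indicate substantial overlap with the reference segmentation.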
8. Cervical‐YOSA: Utilizing prompt engineering and pre‐trained large‐scale models for automated segmentation of multi‐sequence MRI images in cervical cancer.
- Author
-
Xia, Yanwei, Ou, Zhengjie, Tan, Lihua, Liu, Qiang, Cui, Yanfen, Teng, Da, and Zhao, Dan
- Subjects
CERVICAL cancer diagnosis, MAGNETIC resonance imaging, IMAGE segmentation, CERVICAL cancer, DEVELOPING countries
- Abstract
Cervical cancer is a major health concern, particularly in developing countries with limited medical resources. This study introduces two models aimed at improving cervical tumor segmentation: a semi‐automatic model that fine‐tunes the Segment Anything Model (SAM) and a fully automated model designed for efficiency. Evaluations were conducted using a dataset of 8586 magnetic resonance imaging (MRI) slices, where the semi‐automatic model achieved a Dice Similarity Coefficient (DSC) of 0.9097, demonstrating high accuracy. The fully automated model also performed robustly with a DSC of 0.8526, outperforming existing methods. These models offer significant potential to enhance cervical cancer diagnosis and treatment, especially in resource‐limited settings. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. A Methodical Framework Utilizing Transforms and Biomimetic Intelligence-Based Optimization with Machine Learning for Speech Emotion Recognition.
- Author
-
Prabhakar, Sunil Kumar and Won, Dong-Ok
- Subjects
*FEATURE extraction, *MACHINE learning, *ARTIFICIAL intelligence, *FEATURE selection, *EMOTION recognition
- Abstract
Speech emotion recognition (SER) tasks are conducted to extract emotional features from speech signals. The characteristic parameters are analyzed, and the speech emotional states are judged. At present, SER is an important aspect of artificial psychology and artificial intelligence, as it is widely implemented in many applications in the human–computer interface, medical, and entertainment fields. In this work, six transforms, namely, the synchrosqueezing transform, fractional Stockwell transform (FST), K-sine transform-dependent integrated system (KSTDIS), flexible analytic wavelet transform (FAWT), chirplet transform, and superlet transform, are initially applied to speech emotion signals. Once the transforms are applied and the features are extracted, the essential features are selected using three techniques: the Overlapping Information Feature Selection (OIFS) technique followed by two biomimetic intelligence-based optimization techniques, namely, Harris Hawks Optimization (HHO) and the Chameleon Swarm Algorithm (CSA). The selected features are then classified with the help of ten basic machine learning classifiers, with special emphasis given to the extreme learning machine (ELM) and twin extreme learning machine (TELM) classifiers. An experiment is conducted on four publicly available datasets, namely, EMOVO, RAVDESS, SAVEE, and Berlin Emo-DB. The best results are obtained as follows: the Chirplet + CSA + TELM combination obtains a classification accuracy of 80.63% on the EMOVO dataset, the FAWT + HHO + TELM combination obtains a classification accuracy of 85.76% on the RAVDESS dataset, the Chirplet + OIFS + TELM combination obtains a classification accuracy of 83.94% on the SAVEE dataset, and, finally, the KSTDIS + CSA + TELM combination obtains a classification accuracy of 89.77% on the Berlin Emo-DB dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. NUMERICAL INVESTIGATION OF THE ALPHA PARAMETRIZED DTM WITH THE CLASSICAL DTM.
- Author
-
FARID, GHULAM, REHMAN, FAIZA, ARSHAD, MUHAMMAD, SHAHZADI, KIRAN, and ZAFAR, TAYYBA
- Subjects
*DIFFERENTIAL equations
- Abstract
In this paper, solutions of some differential equations are found using the alpha parametrized differential transform method (α-PDTM). The results are compared with the exact solution and with the results found from the classical differential transform method (DTM). The optimal values of α are found for which the error at some particular domain point is minimal. It is found that the alpha parametrized differential transform method is a more accurate and precise method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
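As background for this record, the sketch below illustrates the classical DTM only, on a test problem chosen here for illustration: y' = y with y(0) = 1, whose exact solution is e^x. The α-parametrized variant is not shown because the abstract does not define it. Under the classical DTM, the transform of y' is (k + 1)·Y(k + 1), so the equation becomes the recurrence (k + 1)·Y(k + 1) = Y(k). The function names are hypothetical.

```python
from fractions import Fraction


def dtm_coefficients(n_terms):
    """Classical DTM coefficients Y(0..n_terms-1) for y' = y, y(0) = 1.

    The differential equation transforms to (k + 1) * Y(k + 1) = Y(k),
    with Y(0) = y(0) = 1.
    """
    coeffs = [Fraction(1)]
    for k in range(n_terms - 1):
        coeffs.append(coeffs[k] / (k + 1))
    return coeffs


def dtm_evaluate(coeffs, x):
    """Evaluate the truncated series sum_k Y(k) * x**k."""
    return sum(float(c) * x ** k for k, c in enumerate(coeffs))


if __name__ == "__main__":
    series = dtm_coefficients(15)
    print(series[:4])
    print(dtm_evaluate(series, 1.0))
```

The recurrence reproduces the Taylor coefficients 1/k!, so the truncated series can be compared directly with the exact solution, which is the kind of error comparison the abstract describes.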
11. Time-inhomogeneous Hawkes processes and its financial applications.
- Author
-
Suhyun Lee, Mikyoung Ha, Young-Ju Lee, and Youngsoo Seol
- Subjects
INTEREST rates, BOND prices, TIME-based pricing, MARKET pricing, PRICES
- Abstract
We consider time-inhomogeneous Hawkes processes with an exponential kernel, and we analyze some properties of the model. Time-inhomogeneity for the Hawkes process is indispensable for short rate models or for other calibration purposes, while financial applications for the time-homogeneous case are already well known. Distributional properties for such a model generate computational tractability for a financial application. In this paper, moments and the Laplace transform of time-inhomogeneous Hawkes processes are obtained from the distributional properties of the underlying processes. As an application to finance, we investigate the pricing formula for zero-coupon bonds when short-term interest rates are governed by the time-inhomogeneous Hawkes process. Numerical illustrations are also provided. As an illustrative example, we apply the derived moments and Laplace transform of time-inhomogeneous Hawkes processes to the pricing of zero-coupon bonds within a financial context. By considering the short-term interest rate as driven by inhomogeneous Hawkes processes, we develop explicit formulae for valuing zero-coupon bonds. This application is particularly relevant for modeling interest rate dynamics in real-world scenarios, allowing for a more nuanced understanding of pricing dynamics. Through numerical illustrations, we demonstrate the computational tractability of our approach, showcasing its practical utility for financial practitioners and providing insights into the intricate interplay between time-inhomogeneous Hawkes processes and bond pricing in dynamic markets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. Efficient PAPR Reduction Techniques and Performance of DWT-OFDM
- Author
-
Thilagaraj, M., Arul Murugan, C., Kottaimalai, R., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Kumar, Sandeep, editor, K., Balachandran, editor, Kim, Joong Hoon, editor, and Bansal, Jagdish Chand, editor
- Published
- 2024
- Full Text
- View/download PDF
13. Improved Stockwell Transform for Image Compression and Reconstruction
- Author
-
Babu, Padigala Prasanth, Prasad, T. Jayachandra, Soundararajan, K., Kacprzyk, Janusz, Series Editor, Gunjan, Vinit Kumar, editor, Zurada, Jacek M., editor, and Singh, Ninni, editor
- Published
- 2024
- Full Text
- View/download PDF
14. Correlation of protein binding pocket properties with hits’ chemistries used in generation of ultra-large virtual libraries
- Author
-
Song, Robert X., Nicklaus, Marc C., and Tarasova, Nadya I.
- Published
- 2024
- Full Text
- View/download PDF
15. FINITELY ADDITIVE FUNCTIONS IN MEASURE THEORY AND APPLICATIONS.
- Author
-
Alpay, Daniel and Jorgensen, Palle
- Subjects
*ADDITIVE functions, *MEASURE theory, *FRACTIONAL calculus, *COMPOSITION operators, *PROBABILITY theory, *ADJOINT differential equations
- Abstract
In this paper, we consider, and make precise, a certain extension of the Radon-Nikodym derivative operator, to functions which are additive, but not necessarily sigma-additive, on a subset of a given sigma-algebra. We give applications to probability theory; in particular, to the study of μ-Brownian motion, to stochastic calculus via generalized Itô-integrals, and their adjoints (in the form of generalized stochastic derivatives), to systems of transition probability operators indexed by families of measures μ, and to adjoints of composition operators. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Algorithms for time–frequency imaging and analysis: introduction to mixed-model spectral decomposition.
- Author
-
Pant, Animesh, Ghosal, Dibakar, Puryear, Charles, and Verma, Shashank Narayan
- Subjects
*TIME-frequency analysis, *WAVELET transforms, *IMAGE analysis, *CLASSIFICATION algorithms, *ALGORITHMS, *RECOMMENDER systems
- Abstract
Time–frequency algorithms help discern and filter hidden information from signals, but their growing abundance induces non-uniqueness, thus complicating selection. Classification of these algorithms into approaches can bring simplification and structure to improve our selection and estimates. This study focuses on algorithms we classify here as fixed window-based projection approach, wavelet-based projection approach, greedy-based approach and combinational-based approach while omitting heuristic-based approach and numerical-autoregressive-based approach classes. It describes the basic theory of transforms under the classes and compares them for effective stability, effective localization and resolution capabilities of time–frequency spectra for wavelet estimation and interfering beds, with results demonstrating subtle advantages for each depending on the nature of the signal and the model behind the algorithm. The combinational-based mixed-model approach wavelet-assisted constrained least squares spectral analysis concatenates a wavelet-based approach with a fixed window-based approach and effectively functions to reassign complex amplitude coefficients from their apparent positions to their true positions. A comparison of the results suggests that it demonstrates good scope as an effective alternative general tool for hydrocarbon detection and resolution of thin beds. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. The Interdependence of the Sustainable Development Goals: Network Analysis as a Methodology for Policy Impact Evaluation.
- Author
-
Sasse, Beatriz C. D.
- Subjects
*SUSTAINABLE development, *POLICY analysis, *STATISTICAL power analysis, *DATABASES, *ACCOUNTING policies
- Abstract
Network analysis using mixed graphical modeling is a computational social science approach with the potential to transform decades of sustainable development goals (SDGs) open data from the official SDG Indicators Database into digestible insights using open-source software R. Flexible and free, R provides the ability to perform network analysis. As we demonstrate, network analysis is a promising tool for exploratory analysis in policy evaluation accounting for the systemic nature of the SDG relationships, with a new opportunity to benefit from its statistical power based on large sample sizes without limiting its interpretability and approachability. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. AC-DC Bidirectional Converter-based Flexible Interconnection for Low Voltage Side in Power Systems
- Author
-
KONG, Y., WANG, Y., Li, Y., ZHAO, Z., GUO, Y., and ZHONG, J.
- Subjects
ac-dc power converter, load management, power system interconnection, power system reliability, transforms, Electrical engineering. Electronics. Nuclear engineering, TK1-9971, Computer engineering. Computer hardware, TK7885-7895
- Abstract
Against the background of 'Emission Peak, Carbon Neutrality', there is a significant integration of new energy into the power grid. The characteristics of ‘double high’ and ‘double peak’ in power systems become more pronounced, posing a substantial challenge to the stability and energy coordination of the power grid. To address this, we propose a flexible interconnection scheme based on an AC-DC bidirectional converter. It can solve the problem of high-load caused by new energy generation. Firstly, high-load and adjacent substations are flexibly interconnected through the DC bus on the low-voltage side, enabling load management within a small range. Secondly, AC-DC power transfers the excess new energy generation to other substations. It can promote the absorption of new energy. This method can improve the utilization efficiency of distribution transformers, realize capacity sharing between different transformers, and improve the reliability of power systems in low-voltage distribution networks. Finally, based on a DC interconnection project in Shandong Province (China), the effectiveness of the proposed flexible interconnection scheme is verified. The new energy consumption increases from 5% to 24.4%, and the load rate of high-load transforms drops below 70%.
- Published
- 2024
- Full Text
- View/download PDF
19. Dual-Stream Complex-Valued Convolutional Network for Authentic Dehazed Image Quality Assessment.
- Author
-
Guan, Tuxin, Li, Chaofeng, Zheng, Yuhui, Wu, Xiaojun, and Bovik, Alan C.
- Subjects
*CONVOLUTIONAL neural networks, *PERCEPTUAL illusions, *PERCEPTUAL learning
- Abstract
Effectively evaluating the perceptual quality of dehazed images remains an under-explored research issue. In this paper, we propose a no-reference complex-valued convolutional neural network (CV-CNN) model to conduct automatic dehazed image quality evaluation. Specifically, a novel CV-CNN is employed that exploits the advantages of complex-valued representations, achieving better generalization capability on perceptual feature learning than real-valued ones. To learn more discriminative features to analyze the perceptual quality of dehazed images, we design a dual-stream CV-CNN architecture. The dual-stream model comprises a distortion-sensitive stream that operates on the dehazed RGB image, and a haze-aware stream on a novel dark channel difference image. The distortion-sensitive stream accounts for perceptual distortion artifacts, while the haze-aware stream addresses the possible presence of residual haze. Experimental results on three publicly available dehazed image quality assessment (DQA) databases demonstrate the effectiveness and generalization of our proposed CV-CNN DQA model as compared to state-of-the-art no-reference image quality assessment algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. AC-DC Bidirectional Converter-based Flexible Interconnection for Low Voltage Side in Power Systems.
- Author
-
Yuhui KONG, Yaqian WANG, Yating Li, Zhan ZHAO, Yujing GUO, and Jianying ZHONG
- Subjects
AC DC transformers, LOW voltage systems, CARBON offsetting, ENERGY consumption, RELIABILITY in engineering, ELECTRIC power distribution, ELECTRIC power distribution grids
- Abstract
Against the background of 'Emission Peak, Carbon Neutrality', there is a significant integration of new energy into the power grid. The characteristics of 'double high' and 'double peak' in power systems become more pronounced, posing a substantial challenge to the stability and energy coordination of the power grid. To address this, we propose a flexible interconnection scheme based on an AC-DC bidirectional converter. It can solve the problem of high-load caused by new energy generation. Firstly, high-load and adjacent substations are flexibly interconnected through the DC bus on the low-voltage side, enabling load management within a small range. Secondly, AC-DC power transfers the excess new energy generation to other substations. It can promote the absorption of new energy. This method can improve the utilization efficiency of distribution transformers, realize capacity sharing between different transformers, and improve the reliability of power systems in low-voltage distribution networks. Finally, based on a DC interconnection project in Shandong Province (China), the effectiveness of the proposed flexible interconnection scheme is verified. The new energy consumption increases from 5% to 24.4%, and the load rate of high-load transforms drops below 70%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. Robust Region Feature Extraction With Salient MSER and Segment Distance-Weighted GLOH for Remote Sensing Image Registration.
- Author
-
Zhao, Zilu, Wang, Feng, and You, Hongjian
- Abstract
Remote sensing image registration is one of the crucial steps in remote sensing image processing, where ground control information is essential. Maintenance of control point databases is complex and expensive. Consequently, lightweight feature databases are emerging. Lightweight feature databases need to store stable and reproducible features. In this context, region features exhibit a distinct advantage. In feature registration methods, the reproducibility of regional features is typically stronger than with individual points. A popular feature region matching method is currently the combination of maximally stable extremal regions (MSER) and scale-invariant feature transform (SIFT). However, the direct combining of MSER and SIFT has difficulties primarily due to redundancy and overlap in regions extracted by MSER, as well as the conflict in applying texture descriptors on homogeneous regions. In this research, we first suggest a salient MSER detection approach that combines frequency-tuned salient region detection and effective nonmaximum suppression filtering to get rid of redundant information and enhance the stability and dependability of the feature region; afterward, we describe the feature region using the unique, enhanced segment distance-weighted gradient location-orientation histogram, which aims to comprehensively describe the feature regions by incorporating more information about the gradient at the edges of the regions. In the experimental phase, we validate the proposed method using multiple remote sensing images. The experimental results confirm the superiority of the proposed method and demonstrate the significant potential and advantages of feature region matching in the context of lightweight feature databases and remote sensing image registration. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Power Line Detection for Aerial Images Using Object-Based Markov Random Field With Discrete Multineighborhood System.
- Author
-
Zhao, Le, Yao, Hongtai, Fan, Yajun, Ma, Haihua, Li, Zhihui, and Tian, Meng
- Abstract
Robust power line detection from aerial images is an essential step for intelligent unmanned aerial vehicle (UAV) inspection. However, compared to regular object detection tasks, the generalization ability and accuracy of current methods are seriously insufficient due to the variability of power line scenes and the low proportion of power lines in the image. In this letter, we propose an object-based Markov random field with a discrete multineighborhood system (OMRF-DMNS) model, which can provide a solution for tackling power line detection challenges. First, the proposed method constructs a discrete multineighborhood system (DMNS) for the noncontact nodes. This not only greatly improves the antinoise ability of the model, but also lays the foundation for the transfer of context information among nodes under different receptive fields. Second, a multilevel logical model with multineighborhood joint reasoning (MLL-MNJR) is defined on the DMNS, which makes full use of the line features and location information to achieve semantic inference of nodes. Therefore, the OMRF-DMNS model can fully mine the multidimensional feature associations and context information among nodes, enabling it to effectively distinguish between power line segments and interference noises. Experimental results demonstrate that the proposed method significantly outperforms previous methods in different power line scenes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. Unified algorithms for generalized new Mersenne number transforms.
- Author
-
Abdulla, Lujain S., Hussein, Abdulmutalib A-Wahab, and Hamood, Mounir Taha
- Subjects
*ALGORITHMS, *DATA mapping
- Abstract
The generalized new Mersenne number transforms (GNMNTs) have proved to be significant number theoretic transforms (NTTs) used to calculate convolutions and correlations accurately. In this paper, by applying the principles of the decimation-in-frequency (DIF) approach with appropriate relations in finite field modulo Mersenne primes, two new fast algorithms for computing odd NMNT (ONMNT) and odd-squared NMNT (O²NMNT) are introduced. Moreover, by formulating a unified index mapping scheme for data sequence, a close relationship between the structures of the developed algorithms has been established. As a result, it has been shown that only a single universal butterfly structure is adequate to execute both algorithms. Consequently, a unified implementation platform can be used to compute the ONMNT as well as the O²NMNT. The validity of the development has been checked via an example for fast calculations of different types of convolutions, using both the GNMNTs and the proposed algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
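As background for this record, the sketch below shows how a number theoretic transform over a Mersenne prime field yields exact cyclic convolutions. It is a plain, naive O(N²) transform with modulus 2^7 − 1 = 127 and root 2 (which has order 7, since 2^7 = 128 ≡ 1 mod 127). It is not the NMNT, ONMNT, or O²NMNT of the paper, nor the decimation-in-frequency algorithms the abstract describes; the constants and function names are assumptions chosen for illustration. The modular inverses use `pow(x, -1, m)`, which requires Python 3.8 or later.

```python
M = 2 ** 7 - 1  # Mersenne prime 127, the modulus
N = 7           # transform length
ROOT = 2        # 2 has multiplicative order 7 modulo 127


def mnt(x, root=ROOT):
    """Naive length-N number theoretic transform modulo M."""
    return [
        sum(x[n] * pow(root, n * k, M) for n in range(N)) % M
        for k in range(N)
    ]


def imnt(X):
    """Inverse transform: forward transform with root^-1, scaled by N^-1."""
    inv_n = pow(N, -1, M)
    inv_root = pow(ROOT, -1, M)
    return [(inv_n * v) % M for v in mnt(X, inv_root)]


def cyclic_convolution(a, b):
    """Length-N cyclic convolution modulo M via pointwise products."""
    A, B = mnt(a), mnt(b)
    return imnt([(u * v) % M for u, v in zip(A, B)])


if __name__ == "__main__":
    a = [1, 2, 3, 0, 0, 0, 0]
    b = [4, 5, 0, 0, 0, 0, 0]
    print(cyclic_convolution(a, b))
```

Because all arithmetic is done in integers modulo 127, there is no rounding error; the result equals the ordinary cyclic convolution whenever every true output value is below the modulus.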
24. Stochastics and Dynamics of Fractals
- Author
-
Jorgensen, Palle E. T., Tian, James, Gohberg, Israel, Founding Editor, Ball, Joseph A., Series Editor, Böttcher, Albrecht, Series Editor, Dym, Harry, Series Editor, Langer, Heinz, Series Editor, Tretter, Christiane, Series Editor, Alpay, Daniel, editor, Behrndt, Jussi, editor, Colombo, Fabrizio, editor, Sabadini, Irene, editor, and Struppa, Daniele C., editor
- Published
- 2023
- Full Text
- View/download PDF
25. A multiple gated boosting network for multi‐organ medical image segmentation
- Author
-
Feiniu Yuan, Zhaoda Tang, Chunmei Wang, Qinghua Huang, and Jinting Shi
- Subjects
medical image processing, transforms, Photography, TR1-1050, Computer software, QA76.75-76.765
- Abstract
Segmentations provide important clues for diagnosing diseases. U‐shaped neural networks with skip connections have become one of popular frameworks for medical image segmentation. Skip connections really reduce loss of spatial details caused by down‐sampling, but they cannot handle well semantic gaps between low‐ and high‐level features. It is quite challenging to accurately separate out long, narrow, and small organs from human bodies. To solve these problems, the authors propose a Multiple Gated Boosting Network (MGB‐Net). To boost spatial accuracy, the authors first adopt Gated Recurrent Units (GRU) to design multiple Gated Skip Connections (GSC) at different levels, which efficiently reduce the semantic gap between the shallow and deep features. The Update and Reset gates of GRUs enhance features beneficial to segmentation and suppress information adverse to final results in a recurrent way. To obtain more scale invariances, the authors propose a module of Multi‐scale Weighted Channel Attention (MWCA). The module first uses convolutions with different kernel sizes and group numbers to generate multi‐scale features, and then adopts learnable weights to emphasize the importance of each scale for capturing attention features. Blocks of Transformer Self‐Attention (TSA) are sequentially stacked to extract long‐range dependency features. To effectively fuse and boost the features of MWCA and TSA, the authors use GRUs again to propose a Gated Dual Attention module (GDA), which enhances beneficial features and suppresses adverse information in a gated learning way. Experiments show that the authors’ method achieves an average Dice coefficient of 80.66% on the Synapse multi‐organ segmentation dataset. The authors’ method outperforms the state‐of‐the‐art methods on medical images.
In addition, the authors’ method achieves a Dice segmentation accuracy of 62.77% on difficult objects such as pancreas, significantly exceeding the current average accuracy, so multiple gated boosting (MGB) methods are reliably effective for improving the ability of feature representations. The authors’ code is publicly available at https://github.com/DAgalaxy/MGB‐Net.
- Published
- 2023
- Full Text
- View/download PDF
26. Saak Transform-Based Machine Learning for Light-Sheet Imaging of Cardiac Trabeculation
- Author
-
Ding, Yichen, Gudapati, Varun, Lin, Ruiyuan, Fei, Yanan, Packard, Ren R Sevag, Song, Sibo, Chang, Chih-Chiang, Baek, Kyung In, Wang, Zhaoqiang, Roustaei, Mehrdad, Kuang, Dengfeng, Kuo, C-C Jay, and Hsiai, Tzung K
- Subjects
Computer Vision and Multimedia Computation, Information and Computing Sciences, Engineering, Machine Learning, Biomedical Imaging, Cardiovascular, Bioengineering, Networking and Information Technology R&D (NITRD), Heart Disease, Algorithms, Heart, Image Processing, Computer-Assisted, Microscopy, Fluorescence, Neural Networks, Computer, Transforms, Image segmentation, Kernel, Feature extraction, Random forests, Neural networks, Biomedical optical imaging, machine learning, cardiology, principal component analysis, Artificial Intelligence and Image Processing, Biomedical Engineering, Electrical and Electronic Engineering, Biomedical engineering, Electronics, sensors and digital hardware, Computer vision and multimedia computation
- Abstract
Objective: Recent advances in light-sheet fluorescence microscopy (LSFM) enable 3-dimensional (3-D) imaging of cardiac architecture and mechanics in toto. However, segmentation of the cardiac trabecular network to quantify cardiac injury remains a challenge. Methods: We hereby employed "subspace approximation with augmented kernels (Saak) transform" for accurate and efficient quantification of the light-sheet image stacks following chemotherapy-treatment. We established a machine learning framework with augmented kernels based on the Karhunen-Loeve Transform (KLT) to preserve linearity and reversibility of rectification. Results: The Saak transform-based machine learning enhances computational efficiency and obviates iterative optimization of cost function needed for neural networks, minimizing the number of training datasets for segmentation in our scenario. The integration of forward and inverse Saak transforms can also serve as a light-weight module to filter adversarial perturbations and reconstruct estimated images, salvaging robustness of existing classification methods. The accuracy and robustness of the Saak transform are evident following the tests of dice similarity coefficients and various adversary perturbation algorithms, respectively. The addition of edge detection further allows for quantifying the surface area to volume ratio (SVR) of the myocardium in response to chemotherapy-induced cardiac remodeling. Conclusion: The combination of Saak transform, random forest, and edge detection augments segmentation efficiency by 20-fold as compared to manual processing. Significance: This new methodology establishes a robust framework for post light-sheet imaging processing, and creating a data-driven machine learning for automated quantification of cardiac ultra-structure.
- Published
- 2021
27. A multiple gated boosting network for multi‐organ medical image segmentation.
- Author
-
Yuan, Feiniu, Tang, Zhaoda, Wang, Chunmei, Huang, Qinghua, and Shi, Jinting
- Subjects
IMAGE segmentation, DIAGNOSTIC imaging, ORGANS (Anatomy), CRANES (Birds), HUMAN body, PROBLEM solving
- Abstract
Segmentations provide important clues for diagnosing diseases. U‐shaped neural networks with skip connections have become one of the popular frameworks for medical image segmentation. Skip connections do reduce the loss of spatial details caused by down‐sampling, but they cannot handle semantic gaps between low‐ and high‐level features well. It is quite challenging to accurately separate out long, narrow, and small organs from human bodies. To solve these problems, the authors propose a Multiple Gated Boosting Network (MGB‐Net). To boost spatial accuracy, the authors first adopt Gated Recurrent Units (GRU) to design multiple Gated Skip Connections (GSC) at different levels, which efficiently reduce the semantic gap between the shallow and deep features. The Update and Reset gates of GRUs enhance features beneficial to segmentation and suppress information adverse to final results in a recurrent way. To obtain more scale invariances, the authors propose a module of Multi‐scale Weighted Channel Attention (MWCA). The module first uses convolutions with different kernel sizes and group numbers to generate multi‐scale features, and then adopts learnable weights to emphasize the importance of each scale for capturing attention features. Blocks of Transformer Self‐Attention (TSA) are sequentially stacked to extract long‐range dependency features. To effectively fuse and boost the features of MWCA and TSA, the authors use GRUs again to propose a Gated Dual Attention module (GDA), which enhances beneficial features and suppresses adverse information in a gated learning way. Experiments show that the authors' method achieves an average Dice coefficient of 80.66% on the Synapse multi‐organ segmentation dataset. The authors' method outperforms the state‐of‐the‐art methods on medical images. In addition, the authors' method achieves a Dice segmentation accuracy of 62.77% on difficult objects such as pancreas, significantly exceeding the current average accuracy, so multiple gated boosting (MGB) methods are reliably effective for improving the ability of feature representations. The authors' code is publicly available at https://github.com/DAgalaxy/MGB-Net. [ABSTRACT FROM AUTHOR]
- Published
- 2023
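The Dice coefficient quoted in the abstract above measures the overlap between a predicted mask and a reference mask. A minimal sketch of the standard definition follows; the function name and the flat-list mask representation are illustrative assumptions, not taken from the MGB-Net code.

```python
def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flat binary masks.

    `pred` and `target` are equal-length sequences of 0/1 (or False/True).
    The name and signature are hypothetical, chosen for illustration.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Two empty masks agree perfectly; define their Dice score as 1.0.
    return 1.0 if total == 0 else 2.0 * intersection / total


if __name__ == "__main__":
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
```

A multi-organ score such as the reported average would typically be the mean of per-organ Dice values, although the abstract does not spell out the averaging.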
28. Analysis of Image Hashing in Wafer Map Failure Pattern Recognition.
- Author
-
Piao, Minghao and Jin, Cheng Hao
- Subjects
- *
PATTERN recognition systems , *DEEP learning , *IMAGE analysis , *FEATURE extraction , *CLASSIFICATION algorithms - Abstract
There are mainly two types of wafer map failure pattern recognition, i.e., traditional classification based and deep learning based approaches. Traditional classification usually needs feature engineering, whereas deep learning requires less human intervention and feature engineering. Both share two requirements: noise filtering and inputs of identical dimension size. Feature engineering and dimension resizing are manual work and vary from study to study. In our study, we proposed an image hashing based wafer map resizing and pattern information retrieval framework. Traditional classification methods and CNN are employed for the wafer map failure pattern recognition. The experiments show that the image hashing based framework is the more appropriate method when compared to feature extraction and image resizing methods. We also found that proper input data manipulation can result in acceptable performance whether the used method is traditional classification algorithms or CNN. [ABSTRACT FROM AUTHOR]
- Published
- 2023
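The abstract above does not say which hash the authors use. As a stand-in, the sketch below uses a simple average hash: it reduces a 2-D map of any size to a fixed grid of block means and thresholds them, which illustrates how hashing can give inputs of identical dimension. All names and the choice of hash are assumptions.

```python
def average_hash(image, hash_size=8):
    """Map a 2-D list of numbers to hash_size*hash_size bits (average hash).

    Illustrative stand-in only; not the hashing scheme of the paper.
    """
    rows, cols = len(image), len(image[0])
    if rows < hash_size or cols < hash_size:
        raise ValueError("image must be at least hash_size in each dimension")
    cells = []
    for i in range(hash_size):
        r0, r1 = i * rows // hash_size, (i + 1) * rows // hash_size
        for j in range(hash_size):
            c0, c1 = j * cols // hash_size, (j + 1) * cols // hash_size
            block = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            cells.append(sum(block) / len(block))  # block mean
    mean = sum(cells) / len(cells)
    return tuple(1 if v > mean else 0 for v in cells)


def hamming(a, b):
    """Number of differing bits between two equal-length hashes."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    small = [[0, 0, 1, 1] for _ in range(4)]            # 4x4 map
    large = [[0, 0, 0, 1, 1, 1] for _ in range(6)]      # same pattern, 6x6
    print(hamming(average_hash(small, 2), average_hash(large, 2)))
```

Maps of different sizes that show the same pattern end up with the same fixed-length code, so a downstream classifier can consume them without a separate resizing step.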
29. Toward the Achievable Rate-Distortion Bound of VVC Intra Coding: A Beam Search-Based Joint Optimization Scheme.
- Author
-
Zhang, Yingwen, Wang, Meng, Li, Junru, Wang, Shiqi, Ma, Siwei, and Lin, Weisi
- Subjects
- *
SCALABILITY , *SYNTAX (Grammar) , *COPPER , *ENCODING , *CUSTOMIZATION , *VIDEO coding - Abstract
In this paper, we present the first attempt at determining where the achievable rate-distortion (R-D) performance bound in versatile video coding (VVC) intra coding is when considering the mutual dependency in the rate-distortion optimization (RDO) process. In particular, the abundant search space of encoding parameters in VVC intra coding is practically explored with a beam search-based joint rate-distortion optimization (BSJRDO) scheme. As such, the partitioning, prediction and transform decisions are jointly optimized across different coding units (CUs) with a customized search subset instead of the full space. To make the beam search process implementation-friendly for VVC, the dependencies among the CUs are truncated at different depths. To facilitate finer computational scalability, the beam size is flexibly adjusted based on the characteristics of the CUs, such that the operational points that satisfy different complexity demands for diverse applications can be practically obtained. The proposed BSJRDO approach, which fully conforms to the VVC decoding syntax, can serve as both the way toward the optimal RDO bound and a practical performance-boosting solution. BSJRDO is further implemented on a VVC coding platform (VVC Test model (VTM) 12.0), and extensive experiments show that BSJRDO can achieve 1.30% and 3.22% bit rate savings compared to the VTM anchor under the common test condition and low-bit-rate coding scenarios, respectively. Moreover, the performance gain can also be flexibly customized with different computational overheads. [ABSTRACT FROM AUTHOR]
- Published
- 2023
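The beam search idea in the abstract above can be illustrated with a generic sketch: at each decision step, keep only the `beam_size` cheapest partial decision sequences, where the cost of a decision may depend on earlier ones. The cost function here is a toy stand-in for a rate-distortion cost; none of the names come from VTM or the BSJRDO implementation.

```python
def beam_search(choices_per_step, step_cost, beam_size):
    """Return (total_cost, decisions) of the best sequence found.

    choices_per_step: list of option lists, one per decision step.
    step_cost(prefix, option): cost of `option` given earlier decisions.
    All names are hypothetical and chosen for illustration.
    """
    beam = [(0.0, ())]  # (accumulated cost, decisions so far)
    for options in choices_per_step:
        candidates = [
            (cost + step_cost(prefix, opt), prefix + (opt,))
            for cost, prefix in beam
            for opt in options
        ]
        candidates.sort(key=lambda c: c[0])
        beam = candidates[:beam_size]  # prune to the cheapest partial paths
    return beam[0]


def toy_cost(prefix, opt):
    """Toy dependent cost: choosing 'a' first is cheap but makes step 2 costly."""
    if not prefix:
        return {"a": 1.0, "b": 1.5}[opt]
    if prefix[-1] == "a":
        return 3.0
    return {"a": 1.0, "b": 0.5}[opt]


if __name__ == "__main__":
    steps = [["a", "b"], ["a", "b"]]
    print(beam_search(steps, toy_cost, beam_size=1))  # greedy
    print(beam_search(steps, toy_cost, beam_size=2))  # wider beam
```

With a beam of one the search is greedy and commits to the locally cheapest first decision; widening the beam lets a slightly costlier first decision survive long enough to win overall, which is the dependency effect the abstract describes.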
30. PLGAN: Generative Adversarial Networks for Power-Line Segmentation in Aerial Images.
- Author
-
Abdelfattah, Rabab, Wang, Xiaofeng, and Wang, Song
- Subjects
- *
GENERATIVE adversarial networks , *ELECTRIC lines , *COMPUTER vision , *IMAGE segmentation , *FEATURE extraction , *HOUGH transforms - Abstract
Accurate segmentation of power lines in various aerial images is very important for UAV flight safety. The complex background and very thin structures of power lines, however, make it an inherently difficult task in computer vision. This paper presents PLGAN, a simple yet effective method based on generative adversarial networks, to segment power lines from aerial images with different backgrounds. Instead of directly using the adversarial networks to generate the segmentation, we take certain of their decoding features and embed them into another semantic segmentation network by considering more context, geometry, and appearance information of power lines. We further exploit the appropriate form of the generated images for high-quality feature embedding and define a new loss function in the Hough-transform parameter space to enhance the segmentation of very thin power lines. Extensive experiments and comprehensive analysis demonstrate that our proposed PLGAN outperforms the prior state-of-the-art methods for semantic segmentation and line detection. [ABSTRACT FROM AUTHOR]
- Published
- 2023
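The Hough-transform parameter space mentioned in the abstract above maps each pixel (x, y) to the curve rho = x cos(theta) + y sin(theta), so collinear pixels vote for the same (rho, theta) cell. The sketch below shows only this classical voting step, not the PLGAN loss; the function names and bin sizes are assumptions.

```python
import math


def hough_accumulate(points, n_theta=180, rho_step=1.0):
    """Vote in (rho, theta) space for a list of (x, y) pixel coordinates.

    Returns a dict mapping (rho_bin, theta_index) to a vote count.
    Hypothetical helper for illustration.
    """
    votes = {}
    for x, y in points:
        for k in range(n_theta):
            theta = math.pi * k / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            key = (round(rho / rho_step), k)
            votes[key] = votes.get(key, 0) + 1
    return votes


def strongest_line(points, n_theta=180, rho_step=1.0):
    """Return (rho, theta, votes) for the most-voted accumulator cell."""
    votes = hough_accumulate(points, n_theta, rho_step)
    (rho_bin, k), count = max(votes.items(), key=lambda kv: kv[1])
    return rho_bin * rho_step, math.pi * k / n_theta, count


if __name__ == "__main__":
    vertical = [(5, y) for y in range(10)]  # pixels on the line x = 5
    print(strongest_line(vertical))
```

Because a thin line concentrates its votes in very few cells, comparing predicted and reference masks in this space can reward line-like structure even when few pixels are involved, which is plausibly why the authors define a loss there.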
31. Learned Image Compression Using Cross-Component Attention Mechanism.
- Author
-
Duan, Wenhong, Chang, Zheng, Jia, Chuanmin, Wang, Shanshe, Ma, Siwei, Song, Li, and Gao, Wen
- Subjects
- *
VIDEO coding , *IMAGE reconstruction , *DATA distribution - Abstract
Learned image compression methods have achieved satisfactory results in recent years. However, existing methods are typically designed for RGB format, which are not suitable for YUV420 format due to the variance of different formats. In this paper, we propose an information-guided compression framework using cross-component attention mechanism, which can achieve efficient image compression in YUV420 format. Specifically, we design a dual-branch advanced information-preserving module (AIPM) based on the information-guided unit (IGU) and attention mechanism. On the one hand, the dual-branch architecture can prevent changes in original data distribution and avoid information disturbance between different components. The feature attention block (FAB) can preserve the important information. On the other hand, IGU can efficiently utilize the correlations between Y and UV components, which can further preserve the information of UV by the guidance of Y. Furthermore, we design an adaptive cross-channel enhancement module (ACEM) to reconstruct the details by utilizing the relations from different components, which makes use of the reconstructed Y as the textural and structural guidance for UV components. Extensive experiments show that the proposed framework can achieve the state-of-the-art performance in image compression for YUV420 format. More importantly, the proposed framework outperforms Versatile Video Coding (VVC) with 8.37% BD-rate reduction on common test conditions (CTC) sequences on average. In addition, we propose a quantization scheme for context model without model retraining, which can overcome the cross-platform decoding error caused by the floating-point operations in context model and provide a reference approach for the application of neural codec on different platforms. [ABSTRACT FROM AUTHOR]
- Published
- 2023
32. Secure Outsourced SIFT: Accurate and Efficient Privacy-Preserving Image SIFT Feature Extraction.
- Author
-
Liu, Xiang, Zhao, Xueli, Xia, Zhihua, Feng, Qian, Yu, Peipeng, and Weng, Jian
- Subjects
- *
INFORMATION technology , *ABSOLUTE value , *ADDITION (Mathematics) , *TASK analysis , *BIG data - Abstract
Cloud computing has become an important IT infrastructure in the big data era; more and more users are motivated to outsource the storage and computation tasks to the cloud server for convenient services. However, privacy has become the biggest concern, and tasks are expected to be processed in a privacy-preserving manner. This paper proposes a secure SIFT feature extraction scheme with better integrity, accuracy and efficiency than the existing methods. SIFT includes many complex steps, including the construction of the DoG scale space, extremum detection, extremum location adjustment, rejection of low-contrast extremum points, elimination of the edge response, orientation assignment, and descriptor generation. These complex steps need to be disassembled into elementary operations such as addition, multiplication, and comparison for secure implementation. We adopt a series of secret-sharing protocols for better accuracy and efficiency. In addition, we design a secure absolute value comparison protocol to support absolute value comparison operations in the secure SIFT feature extraction. The SIFT feature extraction steps are completely implemented in the ciphertext domain, and the communications between the clouds are appropriately packed to reduce the communication rounds. We carefully analyzed the accuracy and efficiency of our scheme. The experimental results show that our scheme outperforms the existing state-of-the-art. [ABSTRACT FROM AUTHOR]
- Published
- 2023
33. An Algorithm for Learning Orthonormal Matrix Codebooks for Adaptive Transform Coding.
- Author
-
Boragolla, Rashmi and Yahampath, Pradeepa
- Subjects
- *
MACHINE learning , *APPROXIMATION algorithms , *COVARIANCE matrices , *CONSTRAINED optimization , *TWO-dimensional bar codes - Abstract
This paper proposes a novel data-driven approach to designing orthonormal transform matrix codebooks for adaptive transform coding of any non-stationary vector processes which can be considered locally stationary. Our algorithm, which belongs to the class of block-coordinate descent algorithms, relies on simple probability models such as Gaussian or Laplacian for transform coefficients to directly minimize with respect to the orthonormal transform matrix the mean square error (MSE) of scalar quantization and entropy coding of transform coefficients. A difficulty commonly encountered in such minimization problems is imposing the orthonormality constraint on the matrix solution. We get around this difficulty by mapping the constrained problem in Euclidean space to an unconstrained problem on the Stiefel manifold and leveraging known algorithms for unconstrained optimization on manifolds. While the basic design algorithm directly applies to non-separable transforms, an extension to separable transforms is also proposed. We present experimental results for adaptive transform coding of still images and video inter-frame prediction residuals, comparing the transforms designed using the proposed method and a number of other content-adaptive transforms recently reported in the literature. [ABSTRACT FROM AUTHOR]
- Published
- 2023
34. Multiplex Transformed Tensor Decomposition for Multidimensional Image Recovery.
- Author
-
Feng, Lanlan, Zhu, Ce, Long, Zhen, Liu, Jiani, and Liu, Yipeng
- Subjects
- *
SINGULAR value decomposition , *MATRIX decomposition , *DISCRETE cosine transforms , *UNITARY groups , *SIGNAL processing - Abstract
Low-rank tensor completion aims to recover the missing entries of multi-way data, which has become popular and vital in many fields such as signal processing and computer vision. It varies with different tensor decomposition frameworks. Compared with matrix SVD, the recently emerging transform t-SVD can better characterize the low-rank structure of order-3 data. However, it suffers from rotation sensitivity and a dimensional limitation (i.e., it is only effective for order-3 tensors). To alleviate these deficiencies, we develop a novel multiplex transformed tensor decomposition (MTTD) framework, which can characterize the global low-rank structure along all modes for any order-$N$ tensor. Based on MTTD, we propose a related multi-dimensional square model for low-rank tensor completion. Besides, a total variation term is also introduced to utilize the local piecewise smoothness of the tensor data. The classic alternating direction method of multipliers is used to solve the convex optimization problems. For performance testing, we choose three linear invertible transforms including FFT, DCT, and a group of unitary transform matrices for our proposed methods. The simulated and real-data experiments demonstrate the superior recovery accuracy and computational efficiency of our method compared with state-of-the-art ones. [ABSTRACT FROM AUTHOR]
- Published
- 2023
35. A Convolutional Neural Network-Based Conditional Random Field Model for Structured Multi-Focus Image Fusion Robust to Noise.
- Author
-
Bouzos, Odysseas, Andreadis, Ioannis, and Mitianoudis, Nikolaos
- Subjects
- *
CONVOLUTIONAL neural networks , *NOISE measurement , *DEPTH of field , *DEEP learning , *RANDOM fields , *NOISE , *IMAGE fusion - Abstract
The limited depth of field of optical lenses makes multi-focus image fusion (MFIF) algorithms of vital importance. Lately, Convolutional Neural Networks (CNN) have been widely adopted in MFIF methods; however, their predictions mostly lack structure and are limited by the size of the receptive field. Moreover, since images have noise due to various sources, the development of MFIF methods robust to image noise is required. A novel noise-robust Convolutional Neural Network-based Conditional Random Field (mf-CNNCRF) model is introduced. The model takes advantage of the powerful mapping between input and output of CNN networks and the long range interactions of the CRF models in order to reach structured inference. Rich priors for both unary and smoothness terms are learned by training CNN networks. The $\alpha$-expansion graph-cut algorithm is used to reach structured inference for MFIF. A new dataset, which includes clean and noisy image pairs, is introduced and is used to train the networks of both CRF terms. A low-light MFIF dataset is also developed to demonstrate real-life noise introduced by the camera sensor. Qualitative and quantitative evaluations show that mf-CNNCRF outperforms state-of-the-art MFIF methods for clean and noisy input images, while being more robust to different noise types without requiring prior knowledge of noise. [ABSTRACT FROM AUTHOR]
- Published
- 2023
36. JPEG 2000 Extensions for Scalable Coding of Discontinuous Media.
- Author
-
Mathew, Reji, Naman, Aous Thabit, Li, Yue, and Taubman, David
- Subjects
- *
DEPTH maps (Digital image processing) , *DISCRETE wavelet transforms , *BINARY sequences , *JPEG (Image coding standard) , *OPTICAL flow - Abstract
In this paper we propose novel extensions to JPEG 2000 for the coding of discontinuous media which includes piecewise smooth imagery such as depth maps and optical flows. These extensions use breakpoints to model discontinuity boundary geometry and apply a breakpoint dependent Discrete Wavelet Transform (BP-DWT) to the input imagery. The highly scalable and accessible coding features provided by the JPEG 2000 compression framework are preserved by our proposed extensions, with the breakpoint and transform components encoded as independent bit streams that can be progressively decoded. Comparative rate-distortion results are provided along with corresponding visual examples which highlight the advantages of using breakpoint representations with accompanying BP-DWT and embedded bit-plane coding. Recently our proposed extensions have been adopted and are in the process of being published as a new Part 17 to the JPEG 2000 family of coding standards. [ABSTRACT FROM AUTHOR]
- Published
- 2023
37. Learned Video Compression With Efficient Temporal Context Learning.
- Author
-
Jin, Dengchao, Lei, Jianjun, Peng, Bo, Pan, Zhaoqing, Li, Li, and Ling, Nam
- Subjects
- *
IMAGE compression , *VIDEO coding , *CODECS , *VIDEO compression , *SIGNALS & signaling - Abstract
In contrast to image compression, the key of video compression is to efficiently exploit the temporal context for reducing the inter-frame redundancy. Existing learned video compression methods generally rely on utilizing short-term temporal correlations or image-oriented codecs, which prevents further improvement of the coding performance. This paper proposes a novel temporal context-based video compression network (TCVC-Net) for improving the performance of learned video compression. Specifically, a global temporal reference aggregation (GTRA) module is proposed to obtain an accurate temporal reference for motion-compensated prediction by aggregating long-term temporal context. Furthermore, in order to efficiently compress the motion vector and residue, a temporal conditional codec (TCC) is proposed to preserve structural and detailed information by exploiting the multi-frequency components in temporal context. Experimental results show that the proposed TCVC-Net outperforms public state-of-the-art methods in terms of both PSNR and MS-SSIM metrics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
38. Motion-Compensated Predictive RAHT for Dynamic Point Clouds.
- Author
-
Souto, Andre L., De Queiroz, Ricardo L., and Dorea, Camilo
- Subjects
- *
IMAGE color analysis , *POINT cloud , *COLOR codes - Abstract
We study the use of predictive approaches alongside the region-adaptive hierarchical transform (RAHT) in attribute compression of dynamic point clouds. The use of intra-frame prediction with RAHT was shown to improve attribute compression performance over pure RAHT and represents the state-of-the-art in attribute compression of point clouds, being part of MPEG’s geometry-based test model. We studied a combination of inter-frame and intra-frame prediction for RAHT for the compression of dynamic point clouds. An adaptive zero-motion-vector (ZMV) scheme and an adaptive motion-compensated scheme are developed. The simple adaptive ZMV approach is able to achieve sizable gains over pure RAHT and over the intra-frame predictive RAHT (I-RAHT) for point clouds with little or no motion while ensuring similar compression performance to I-RAHT for point clouds with intense motion. The motion-compensated approach, more complex and more powerful, is able to achieve large gains across all of the tested dynamic point clouds. [ABSTRACT FROM AUTHOR]
- Published
- 2023
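The core step of RAHT, on which the predictive schemes in the abstract above build, is a weighted two-point orthonormal transform that merges neighbouring occupied voxels. A minimal sketch of that butterfly and its inverse follows; the function names are illustrative assumptions, and the prediction and motion-compensation stages described in the abstract are not shown.

```python
import math


def raht_butterfly(a1, w1, a2, w2):
    """Merge two neighbouring attributes a1, a2 with weights w1, w2.

    Returns (low_pass, high_pass, merged_weight). The weights count how
    many points each coefficient already represents. Hypothetical helper.
    """
    norm = math.sqrt(w1 + w2)
    s1, s2 = math.sqrt(w1) / norm, math.sqrt(w2) / norm
    low = s1 * a1 + s2 * a2      # passed on to the next tree level
    high = -s2 * a1 + s1 * a2    # quantised and entropy coded
    return low, high, w1 + w2


def raht_inverse(low, high, w1, w2):
    """Invert raht_butterfly (the transform matrix is orthonormal)."""
    norm = math.sqrt(w1 + w2)
    s1, s2 = math.sqrt(w1) / norm, math.sqrt(w2) / norm
    return s1 * low - s2 * high, s2 * low + s1 * high


if __name__ == "__main__":
    low, high, w = raht_butterfly(10.0, 1, 10.0, 1)
    print(low, high, w)
    print(raht_inverse(low, high, 1, 1))
```

When neighbouring attributes are equal, the high-pass coefficient is zero, which is why smooth regions compress well; a predictive variant would transform a residual against a prediction instead of the raw attributes.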
39. Hierarchical Hashing Learning for Image Set Classification.
- Author
-
Sun, Yuan, Wang, Xu, Peng, Dezhong, Ren, Zhenwen, and Shen, Xiaobo
- Subjects
- *
IMAGE recognition (Computer vision) , *TASK analysis , *SET functions , *SEMANTICS , *BINARY codes , *RECOGNITION (Psychology) - Abstract
With the development of video networks, image set classification (ISC) has received a lot of attention and can be used for various practical applications, such as video based recognition, action recognition, and so on. Although the existing ISC methods have obtained promising performance, they often have extremely high complexity. Due to the superiority in storage space and complexity cost, learning to hash becomes a powerful solution scheme. However, existing hashing methods often ignore complex structural information and hierarchical semantics of the original features. They usually adopt a single-layer hashing strategy to transform high-dimensional data into short-length binary codes in one step. This sudden drop of dimension could result in the loss of advantageous discriminative information. In addition, they do not take full advantage of intrinsic semantic knowledge from whole gallery sets. To tackle these problems, in this paper, we propose a novel Hierarchical Hashing Learning (HHL) for ISC. Specifically, a coarse-to-fine hierarchical hashing scheme is proposed that utilizes a two-layer hash function to gradually refine the beneficial discriminative information in a layer-wise fashion. Besides, to alleviate the effects of redundant and corrupted features, we impose the $\ell_{2,1}$ norm on the layer-wise hash function. Moreover, we adopt a bidirectional semantic representation with the orthogonal constraint to keep intrinsic semantic information of all samples in whole image sets adequately. Comprehensive experiments demonstrate HHL acquires significant improvements in accuracy and running time. We will release the demo code on https://github.com/sunyuan-cs. [ABSTRACT FROM AUTHOR]
- Published
- 2023
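The $\ell_{2,1}$ norm named in the abstract above sums the Euclidean norms of a matrix's rows, so penalising it pushes whole rows toward zero. A minimal sketch follows, assuming the row-wise convention (some papers sum over columns instead) and a hypothetical function name.

```python
import math


def l21_norm(matrix):
    """Sum of the Euclidean norms of the rows of a 2-D list.

    Row-wise convention assumed; the paper may use the column-wise form.
    """
    return sum(math.sqrt(sum(v * v for v in row)) for row in matrix)


if __name__ == "__main__":
    print(l21_norm([[3, 4], [0, 0], [5, 12]]))
```

Unlike an entry-wise penalty, this norm is unaffected by rows that are already zero and only shrinks the remaining ones, which is what makes it useful for discarding redundant or corrupted features as a group.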
40. Deep Face Video Inpainting via UV Mapping.
- Author
-
Yang, Wenqi, Chen, Zhenfang, Chen, Chaofeng, Chen, Guanying, and Wong, Kwan-Yee K.
- Subjects
- *
IMAGE reconstruction , *DEEP learning , *INPAINTING , *TASK analysis , *STEREOLITHOGRAPHY - Abstract
This paper addresses the problem of face video inpainting. Existing video inpainting methods primarily target natural scenes with repetitive patterns. They do not make use of any prior knowledge of the face to help retrieve correspondences for the corrupted face. They therefore only achieve sub-optimal results, particularly for faces under large pose and expression variations where face components appear very differently across frames. In this paper, we propose a two-stage deep learning method for face video inpainting. We employ 3DMM as our 3D face prior to transform a face between the image space and the UV (texture) space. In Stage I, we perform face inpainting in the UV space. This helps to largely remove the influence of face poses and expressions and makes the learning task much easier with well aligned face features. We introduce a frame-wise attention module to fully exploit correspondences in neighboring frames to assist the inpainting task. In Stage II, we transform the inpainted face regions back to the image space and perform face video refinement that inpaints any background regions not covered in Stage I and also refines the inpainted face regions. Extensive experiments have been carried out which show our method can significantly outperform methods based merely on 2D information, especially for faces under large pose and expression variations. Project page: https://ywq.github.io/FVIP. [ABSTRACT FROM AUTHOR]
- Published
- 2023
41. GiT: Graph Interactive Transformer for Vehicle Re-Identification.
- Author
-
Shen, Fei, Xie, Yi, Zhu, Jianqing, Zhu, Xiaobin, and Zeng, Huanqiang
- Subjects
- *
CONVOLUTIONAL neural networks , *TRANSFORMER models , *COMPUTER vision , *REPRESENTATIONS of graphs , *FEATURE extraction - Abstract
Transformers are increasingly popular in computer vision; they treat an image as a sequence of patches and learn robust global features from the sequence. However, pure transformers are not entirely suitable for vehicle re-identification because vehicle re-identification requires both robust global features and discriminative local features. For that, a graph interactive transformer (GiT) is proposed in this paper. In the macro view, a list of GiT blocks are stacked to build a vehicle re-identification model, in which graphs extract discriminative local features within patches and transformers extract robust global features among patches. In the micro view, graphs and transformers are in an interactive status, bringing effective cooperation between local and global features. Specifically, one current graph is embedded after the former level’s graph and transformer, while the current transformer is embedded after the current graph and the former level’s transformer. In addition to the interaction between graphs and transformers, the graph is a newly-designed local correction graph, which learns discriminative local features within a patch by exploring nodes’ relationships. Extensive experiments on three large-scale vehicle re-identification datasets demonstrate that our GiT method is superior to state-of-the-art vehicle re-identification approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2023
42. Perceptually Optimizing Color Look-up Tables.
- Author
-
Reinhard, Johann and Urban, Philipp
- Subjects
- *
COLORIMETRY , *IMAGE color analysis , *COLOR printing , *DIGITAL printing - Abstract
The quality of ICC profiles with embedded look-up tables (LUTs) depends on multiple factors: 1. the accuracy of the optical printer model, 2. the exploitation of the available gamut combined with the quality of the gamut mapping approach encoded in the B2A-LUTs (backwards LUTs) and 3. the tonal smoothness as well as the color accuracy of the backwards LUTs. It can be shown that optimizing the smoothness of the LUTs comes at the expense of color accuracy and requires gamut reduction because of internal tonal edges. We present a method to optimize backwards LUTs of existing ICC profiles with respect to accuracy, smoothness, gamut exploitation and mapping, which can be extended beyond color, e.g. to joint color and translucency backward LUTs. The approach is based on a perceptual difference metric that is used to optimize the LUT’s tonal smoothness constrained to preserve both the accuracy of and the relationship between colors. [ABSTRACT FROM AUTHOR]
- Published
- 2023
43. Monitoring Level of Hypnosis Using Stationary Wavelet Transform and Singular Value Decomposition Entropy With Feedforward Neural Network.
- Author
-
Dutt, Muhammad Ibrahim and Saadeh, Wala
- Subjects
FEEDFORWARD neural networks ,PREDICTION algorithms ,WAVELET transforms ,REGRESSION analysis ,DEEP learning - Abstract
Classifying the patient’s level of hypnosis (LoH) into a few distinct states may lead to inappropriate drug administration. To tackle the problem, this paper presents a robust and computationally efficient framework that predicts a continuous LoH index scale from 0–100 in addition to the LoH state. This paper proposes a novel approach for accurate LoH estimation based on Stationary Wavelet Transform (SWT) and fractal features. The deep learning model adopts an optimized temporal, fractal, and spectral feature set to identify the patient sedation level irrespective of age and the type of anesthetic agent. This feature set is then fed into a multilayer perceptron network (MLP), a class of feed-forward neural networks. A comparative analysis of regression and classification is made to measure the performance of the chosen features on the neural network architecture. The proposed LoH classifier outperforms the state-of-the-art LoH prediction algorithms with the highest accuracy of 97.1% while utilizing a minimized feature set and an MLP classifier. Moreover, for the first time, the LoH regressor achieves the highest performance metrics ($R^{2}=0.9$, MAE = 1.5) as compared to previous work. This study is very helpful for developing highly accurate monitoring for LoH which is important for intraoperative and postoperative patients’ health. [ABSTRACT FROM AUTHOR]
- Published
- 2023
44. Revised Tunable Q-Factor Wavelet Transform for EEG-Based Epileptic Seizure Detection.
- Author
-
Liu, Zhen, Zhu, Bingyu, Hu, Manfeng, Deng, Zhaohong, and Zhang, Jingxiang
- Subjects
WAVELET transforms ,FEATURE extraction ,TIME series analysis ,EPILEPSY ,DECISION trees ,ELECTROENCEPHALOGRAPHY - Abstract
Electroencephalogram (EEG) signals are an essential tool for the detection of epilepsy. Because of the complex time series and frequency features of EEG signals, traditional feature extraction methods have difficulty meeting the requirements of recognition performance. The tunable Q-factor wavelet transform (TQWT), which is a constant-Q transform that is easily invertible and modestly oversampled, has been successfully used for feature extraction of EEG signals. Because the constant-Q is set in advance and cannot be optimized, further applications of the TQWT are restricted. To solve this problem, the revised tunable Q-factor wavelet transform (RTQWT) is proposed in this paper. RTQWT is based on the weighted normalized entropy and overcomes the problems of a nontunable Q-factor and the lack of an optimized tunable criterion. In contrast to the continuous wavelet transform and the raw tunable Q-factor wavelet transform, the wavelet transform corresponding to the revised Q-factor, i.e., RTQWT, is sufficiently better adapted to the nonstationary nature of EEG signals. Therefore, the precise and specific characteristic subspaces obtained can improve the classification accuracy of EEG signals. The classification of the extracted features was performed using the decision tree, linear discriminant, naive Bayes, SVM and KNN classifiers. The performance of the new approach was tested by evaluating the accuracies of five time-frequency distributions: FT, EMD, DWT, CWT and TQWT. The experiments showed that the RTQWT proposed in this paper can be used to extract detailed features more effectively and improve the classification accuracy of EEG signals. [ABSTRACT FROM AUTHOR]
- Published
- 2023
45. GTMSiam: Gated Transmitting-Based Multiscale Siamese Network for Hyperspectral Image Change Detection.
- Author
-
Wang, Xianghai, Zhao, Keyun, Zhao, Xiaoyang, and Li, Siyao
- Abstract
Hyperspectral image change detection (HSI-CD) is a technique that detects changes in land cover occurring in a specific area within a closed time. At present, most existing methods for HSI-CD employ exceedingly intricate network architectures, leading to a high model complexity that hampers the achievement of a favorable tradeoff between change detection (CD) accuracy and timeliness. Furthermore, existing methods often confine the feature extraction process to a single scale rather than multiple diverse scales. However, employing a multiscale approach for feature extraction allows for capturing fine-grained features encompassing more intricate details, as well as coarse-grained features that aggregate local information over a larger range. On the other hand, most existing methods overemphasize the complexity of the feature extraction process and underestimate the importance of the conversion process from bitemporal features to valuable change features. To this end, a gated transmitting-based multiscale Siamese network (GTMSiam) is proposed, which mainly contains the following two portions: 1) dual branches with the Siamese structure, which capture spatial features of the HSIs at multiple scales while preserving rich spectral information. Moreover, the Siamese design effectively reduces the network parameters, thereby alleviating the computational complexity of the model and 2) gated change information transmitting module (GTM), which utilizes gated neural units to transform bitemporal image features into land cover change information, while progressively transmitting change information at different scales. This enables the network to leverage diverse scale change information for comprehensive discrimination of land object changes. Experimental results on three publicly available datasets demonstrate the superior performance of the proposed GTMSiam. Simultaneously, the complexity analysis experiment proves that the GTMSiam can consider both detection performance and timeliness. The source code of this letter will be released at https://github.com/zkylnnu/GTMSiam. [ABSTRACT FROM AUTHOR]
- Published
- 2023
46. Channel Migration Correction for Low-Altitude Airborne SAR Tomography Based on Keystone Transform.
- Author
-
Lin, Yuqing, Qiu, Xiaolan, Li, Hang, Wang, Wei, Jiao, Zekun, and Ding, Chibiao
- Abstract
Synthetic aperture radar (SAR) tomography (TomoSAR) has a 3-D resolving ability. Most existing 3-D imaging methods for TomoSAR assume that the layovers for different channels are the same. However, in low-altitude airborne cases, the variation of layovers between channels becomes non-negligible. In this letter, we studied the channel migration (CM) phenomenon of sparse TomoSAR under low-altitude scenarios. We derived the model of the differential range from a particular target to different antennas, which is nearly a linear function along the array dimension. Therefore, we applied the Keystone transform (KT) on the array axis to correct this CM, and then the traditional compressed sensing-based 3-D imaging methods for TomoSAR can be applied to get the final 3-D reconstruction results. The approach was tested with simulation data and also the real data of the MV3DSAR system, which is a mini drone-borne TomoSAR system. The results demonstrate that the proposed method can provide a more accurate solution to the low-altitude sparse TomoSAR reconstruction problem. [ABSTRACT FROM AUTHOR]
- Published
- 2023
47. PCA-CNN Hybrid Approach for Hyperspectral Pansharpening.
- Author
-
Guarino, Giuseppe, Ciotola, Matteo, Vivone, Gemine, Poggi, Giovanni, and Scarpa, Giuseppe
- Abstract
This work proposes a simple yet effective method to adapt unsupervised convolutional neural networks (CNNs) from multispectral (MS) to hyperspectral (HS) pansharpening. Thus, it focuses on the fusion of a single high-resolution panchromatic (PAN) band with a low-resolution HS data cube. This is achieved by means of a decorrelation transform, following the principal component analysis (PCA) approach, which enables the compression of a significant portion of the HS image energy into a few bands. Afterward, a suitably adapted pansharpening network designed for four spectral bands is used to super-resolve only the principal components (PCs). Experiments demonstrate high performance in both quantitative and qualitative evaluations, favorably comparing against state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
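The decorrelation step mentioned in the abstract above (PCA compressing most of the HS image energy into a few bands) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names `pca_compress` and `pca_restore`, the `(height, width, bands)` array layout, and the synthetic cube in the usage note are all assumptions made for illustration, and the pansharpening network itself is not shown.

```python
import numpy as np

def pca_compress(cube, n_components):
    """Project an (H, W, B) cube onto its leading principal components.

    Hypothetical helper: returns the PCs as (H, W, k), the basis (B, k),
    the per-band mean (B,), and the fraction of variance the PCs retain.
    """
    h, w, b = cube.shape
    pixels = cube.reshape(-1, b).astype(np.float64)
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    # Band-by-band covariance of the centered spectra.
    cov = centered.T @ centered / (centered.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # reorder to descending
    eigvals = eigvals[order]
    basis = eigvecs[:, order[:n_components]]    # (B, k) leading directions
    pcs = centered @ basis
    energy = eigvals[:n_components].sum() / eigvals.sum()
    return pcs.reshape(h, w, n_components), basis, mean, energy

def pca_restore(pcs, basis, mean):
    """Map (possibly super-resolved) PCs back to the original band space."""
    h, w, k = pcs.shape
    return (pcs.reshape(-1, k) @ basis.T + mean).reshape(h, w, -1)
```

As a usage example, a synthetic 32-band cube built from three latent sources, e.g. `rng.normal(size=(16, 16, 3)) @ rng.normal(size=(3, 32))` plus small noise, should have nearly all of its variance in the first four PCs. Keeping four components matches the four-band network mentioned in the abstract, but how the components are actually fed to the network is not specified there.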
48. Weighted Order-$p$ Tensor Nuclear Norm Minimization and Its Application to Hyperspectral Image Mixed Denoising.
- Author
-
He, Chengxun, Cao, Qiujie, Xu, Yang, Sun, Le, Wu, Zebin, and Wei, Zhihui
- Abstract
Recently, tensor singular value decomposition (t-SVD) has demonstrated excellent performance in various high-dimensional information processing applications. However, in adapting t-SVD to handle the typical tensor data restoration tasks, such as hyperspectral image (HSI) denoising, the following questions remain inadequately addressed: 1) the existing tensor nuclear norm minimization (TNN) regime treats all tensor singular values alike; thus, it lacks flexibility and dominance in dealing with the sophisticated HSI tensor; 2) the existing t-SVD-based denoising methods cannot directly process order-$p$ ($p>3$) tensors; thus, they fail to comprehensively exploit the high-dimensional structural correlation of the HSI tensor along different modes. To address the above challenges, in this study, we first generalize a novel weighted order-$p$ TNN minimization regime, which integrates the adaptively reweighting strategy for matrix, third-order, and order-$p$ tensors in a unified architecture. Subsequently, an efficient subspace low-rank learning model is established, using HSI denoising tasks as an application example to corroborate the superiority of the proposed regime in approximating the high-dimensional low-rank structure of natural tensor data. Extensive experimental results substantiate that our effort surpasses existing state-of-the-art low-rank tensor recovery methods in both restoration accuracy and efficiency. The source code is available at https://github.com/CX-He/WTNN.git. [ABSTRACT FROM AUTHOR]
- Published
- 2023
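The idea of not treating all singular values alike can be illustrated in the simplest (matrix) case with weighted singular value thresholding. The sketch below is a simplified stand-in, not the order-$p$ tensor method of the abstract above: the names `weighted_svt` and `reweight`, the inverse-magnitude weighting rule, and the constant `c` are assumptions chosen for illustration. It relies on the known result that, when the weights are non-descending while the singular values are sorted in descending order, per-value soft-thresholding solves the weighted nuclear norm proximal problem for matrices.

```python
import numpy as np

def reweight(singular_values, c=1.0, eps=1e-8):
    """Hypothetical reweighting rule: larger singular values get smaller weights."""
    return c / (singular_values + eps)

def weighted_svt(matrix, c=1.0):
    """Shrink each singular value by its own weight instead of a single threshold."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)  # s is sorted descending
    weights = reweight(s, c)                               # hence non-descending
    shrunk = np.maximum(s - weights, 0.0)                  # weighted soft-thresholding
    return (u * shrunk) @ vt
```

With this rule, a singular value survives only if it exceeds roughly `sqrt(c)`, so dominant components are almost untouched while noise-level components are removed. A plain nuclear norm threshold would instead subtract the same amount from every singular value.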
49. Adaptive Time-Synchroextracting S Transform and Its Application in Fault Identification.
- Author
-
Wu, Xuefeng, Zhang, Huixing, He, Bingshou, and Guo, Meng
- Abstract
Time-frequency (TF) analysis is an important tool for seismic signal analysis, and through traditional methods, it is difficult to achieve high TF resolution and energy aggregation. In this letter, we propose the adaptive time-synchroextracting S transform (ATSEST) for seismic data processing and interpretation. The method first calculates the scale parameter of the window function through the Fourier spectrum of the signal to obtain the adaptive S transform spectrum. After that, the final result is obtained by extracting the TF coefficients at the group delay (GD) and removing a large amount of fuzzy energy in the TF spectrum. The results of the synthetic example TF analysis show that the method can improve the localization ability of transient features of seismic signals. We apply the method to the fault identification of field data, and the results show that the coherent attribute slices extracted by using ATSEST can well characterize the fault. [ABSTRACT FROM AUTHOR]
- Published
- 2023
50. Automatic Seismic Lithology Interpretation via Multiattribute Integrated Deep Learning.
- Author
-
Pan, Lele, Gao, Jinghuai, Yang, Yang, Wang, Zhiguo, and Gao, Zhaoqi
- Abstract
Seismic lithology interpretation based on seismic data is an important task to delineate oil and gas reservoirs. However, this is an extremely unstable work when only utilizing seismic data, which would result in multiple solutions. We suggest a multiattribute integrated deep learning (MAIDL) workflow for automatic seismic lithology interpretation. To implement the proposed model, we first propose to apply the wavelet scattering transform (WST) to seismic data for multiscale features extraction. Note that the WST has local deformation stability and translation invariance for analyzing seismic data, which would be proven to promote seismic lithology interpretation. Next, the MAIDL model is suggested to combine the multiscale features extracted by the WST and seismic data simultaneously, which can improve the accuracy of automatic seismic lithology prediction. Afterward, the Res-UNet, which incorporates residual blocks into the UNet, is introduced to avoid the over-fitting and the degradation problem of the proposed MAIDL model. Finally, a 2-D synthetic data and a 2-D post-stack field data are adopted to test the effectiveness of the suggested MAIDL model for automatic seismic lithology interpretation. [ABSTRACT FROM AUTHOR]
- Published
- 2023