29 results for "spatial transformer networks"
Search Results
2. Adversarial and Random Transformations for Robust Domain Adaptation and Generalization.
- Author
- Xiao, Liang, Xu, Jiaolong, Zhao, Dawei, Shang, Erke, Zhu, Qi, and Dai, Bin
- Subjects
- ARTIFICIAL neural networks; DATA augmentation; MACHINE learning; REINFORCEMENT learning; GENERALIZATION; SEARCH algorithms
- Abstract
Data augmentation has been widely used to improve generalization in training deep neural networks. Recent works show that using worst-case transformations or adversarial augmentation strategies can significantly improve accuracy and robustness. However, due to the non-differentiable properties of image transformations, search algorithms such as reinforcement learning or evolution strategies have to be applied, which are not computationally practical for large-scale problems. In this work, we show that simply applying consistency training with random data augmentation yields state-of-the-art results on domain adaptation (DA) and generalization (DG). To further improve the accuracy and robustness with adversarial examples, we propose a differentiable adversarial data augmentation method based on spatial transformer networks (STNs). The combined adversarial and random-transformation-based method outperforms the state-of-the-art on multiple DA and DG benchmark datasets. Furthermore, the proposed method shows desirable robustness to corruption, which is also validated on commonly used datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2023
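The differentiable adversarial augmentation this abstract describes can be pictured as gradient ascent on affine warp parameters pushed through an STN-style sampler. A minimal PyTorch sketch, assuming a classifier `model`; the step count, step size, and bound `eps` are illustrative, not the authors' settings:

```python
# Hedged sketch of differentiable adversarial augmentation via an affine
# warp (the STN sampler). `model`, step count, step size, and the bound
# `eps` are illustrative assumptions, not the authors' settings.
import torch
import torch.nn.functional as F

def adversarial_affine(model, x, y, steps=3, lr=0.01, eps=0.1):
    """Gradient ascent on affine parameters to find a loss-maximizing warp."""
    identity = torch.tensor([[1., 0., 0.], [0., 1., 0.]], device=x.device)
    delta = torch.zeros(x.size(0), 2, 3, device=x.device, requires_grad=True)
    for _ in range(steps):
        theta = identity + delta                      # perturbed affine params
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        x_adv = F.grid_sample(x, grid, align_corners=False)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += lr * grad.sign()                 # ascend the loss
            delta.clamp_(-eps, eps)                   # keep the warp small
    grid = F.affine_grid(identity + delta.detach(), x.size(), align_corners=False)
    return F.grid_sample(x, grid, align_corners=False)
```

Because the sampler is differentiable in the warp parameters, the worst-case transformation can be found by plain gradient steps, avoiding the reinforcement-learning or evolutionary search the abstract cites as impractical.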
3. Recognition of the Multioriented Text Based on Deep Learning
- Author
- Priyadarsini, K., Janahan, Senthil Kumar, Thirumal, S., Bindu, P., Raj, T. Ajith Bosco, and Majji, Sankararao
- Published
- 2022
4. A Detection Network United Local Feature Points and Components for Fine-Grained Image Classification
- Author
- Du, Yong, Yu, Bin, Hong, Peng, Pan, Wei, Wang, Yang, and Wang, Yu
- Published
- 2022
5. Scalable Handwritten Text Recognition System for Lexicographic Sources of Under-Resourced Languages and Alphabets
- Author
- Idziak, Jan, Šeļa, Artjoms, Woźniak, Michał, Leśniak, Albert, Byszuk, Joanna, and Eder, Maciej
- Published
- 2021
6. Revisiting Data Augmentation for Rotational Invariance in Convolutional Neural Networks
- Author
- Quiroga, Facundo, Ronchetti, Franco, Lanzarini, Laura, and Bariviera, Aurelio F.
- Published
- 2020
7. Cascaded Region Proposal Networks for Proposal-Based Tracking
- Author
- Zhang, Ximing, Fan, Xuewu, and Luo, Shujuan
- Published
- 2020
8. A Deep Learning Framework for Audio Deepfake Detection.
- Author
- Khochare, Janavi, Joshi, Chaitali, Yenarkar, Bakul, Suratkar, Shraddha, and Kazi, Faruk
- Subjects
- CONVOLUTIONAL neural networks; DEEP learning; SPEECH synthesis; MACHINE learning; DEEPFAKES; CLASSIFICATION algorithms
- Abstract
Audio deepfakes have been increasingly emerging as a potential source of deceit with the development of avant-garde methods of synthetic speech generation. Differentiating fake audio from real audio is becoming ever more difficult owing to the increasing accuracy of text-to-speech models, posing a serious threat to speaker verification systems. Within the domain of audio deepfake detection, the majority of experiments have been based on the ASVSpoof or AVSpoof datasets, using various machine learning and deep learning approaches. In this work, experiments were performed on a more recent dataset, the Fake or Real (FoR) dataset, which contains data generated using some of the best text-to-speech models. Two approaches have been adopted to solve the problem: a feature-based approach and an image-based approach. The feature-based approach involves converting audio data into a dataset consisting of various spectral features of the audio samples, which are fed to machine learning algorithms for the classification of audio as fake or real. In the image-based approach, audio samples are converted into mel-spectrograms, which are input to deep learning algorithms, namely the Temporal Convolutional Network (TCN) and the Spatial Transformer Network (STN). TCN was chosen because it is a sequential model that has been shown to give good results on sequential data. A comparison between the performances of both approaches shows that the deep learning algorithms, particularly TCN, outperform the machine learning algorithms by a significant margin, with a 92 percent test accuracy. This solution presents a model for audio deepfake classification with an accuracy comparable to traditional CNN models such as VGG16 and XceptionNet. [ABSTRACT FROM AUTHOR]
- Published
- 2022
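The image-based pipeline above (audio to mel-spectrogram to a temporal convolutional classifier) can be sketched as follows; the file name, layer sizes, and the binary fake/real head are assumptions for illustration, not the paper's configuration:

```python
# A minimal sketch of the image-based pipeline described above: audio to
# mel-spectrogram, then a small dilated-convolution (TCN-style) classifier.
# The file name, layer sizes, and binary fake/real head are assumptions.
import librosa
import numpy as np
import torch
import torch.nn as nn

y, sr = librosa.load("clip.wav", sr=16000)            # hypothetical audio file
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64)
mel_db = librosa.power_to_db(mel, ref=np.max)         # (64, frames), in dB

x = torch.tensor(mel_db, dtype=torch.float32).unsqueeze(0)  # (1, 64, frames)

tcn = nn.Sequential(                                  # dilated temporal stack
    nn.Conv1d(64, 32, kernel_size=3, padding=1, dilation=1), nn.ReLU(),
    nn.Conv1d(32, 32, kernel_size=3, padding=2, dilation=2), nn.ReLU(),
    nn.Conv1d(32, 32, kernel_size=3, padding=4, dilation=4), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(32, 2),                                 # logits: fake vs. real
)
logits = tcn(x)
```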
9. Computationally Efficient ANN Model for Small-Scale Problems
- Author
- Sharma, Shikhar, Shivhare, Shiv Naresh, Singh, Navjot, and Kumar, Krishan
- Published
- 2019
10. Improving Deep Image Clustering with Spatial Transformer Layers
- Author
- Souza, Thiago V. M., and Zanchettin, Cleber
- Published
- 2019
11. A Reinforcement Learning Approach for Sequential Spatial Transformer Networks
- Author
- Azimi, Fatemeh, Raue, Federico, Hees, Jörn, and Dengel, Andreas
- Published
- 2019
12. Unsupervised Domain Adaptation From Axial to Short-Axis Multi-Slice Cardiac MR Images by Incorporating Pretrained Task Networks.
- Author
- Koehler, Sven, Hussain, Tarique, Blair, Zach, Huffaker, Tyler, Ritzmann, Florian, Tandon, Animesh, Pickardt, Thomas, Sarikouch, Samir, Latus, Heiner, Greil, Gerald, Wolf, Ivo, and Engelhardt, Sandy
- Subjects
- MAGNETIC resonance imaging; CARDIAC magnetic resonance imaging; CARDIAC imaging; CONGENITAL heart disease; CARDIOVASCULAR diseases
- Abstract
Anisotropic multi-slice Cardiac Magnetic Resonance (CMR) images are conventionally acquired in patient-specific short-axis (SAX) orientation. In specific cardiovascular diseases that affect right ventricular (RV) morphology, acquisitions in standard axial (AX) orientation are preferred by some investigators, due to potential superiority in RV volume measurement for treatment planning. Unfortunately, due to the rare occurrence of these diseases, data in this domain is scarce. Recent research in deep learning-based methods has mainly focused on SAX CMR images and has proven very successful. In this work, we show that there is a considerable domain shift between AX and SAX images, and therefore, direct application of existing models yields sub-optimal results on AX samples. We propose a novel unsupervised domain adaptation approach, which uses task-related probabilities in an attention mechanism. Beyond that, cycle consistency is imposed on the learned patient-individual 3D rigid transformation to improve stability when automatically re-sampling the AX images to SAX orientation. The network was trained on 122 registered 3D AX-SAX CMR volume pairs from a multi-centric patient cohort. A mean 3D Dice of 0.86 ± 0.06 for the left ventricle, 0.65 ± 0.08 for the myocardium, and 0.77 ± 0.10 for the right ventricle was achieved. This is an improvement of 25% in Dice for the RV in comparison to direct application on axial slices. To conclude, our pre-trained task module has seen neither CMR images nor labels from the target domain, but is able to segment them after the domain gap is reduced. Code: https://github.com/Cardio-AI/3d-mri-domain-adaptation [ABSTRACT FROM AUTHOR]
- Published
- 2021
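The cycle-consistency idea in this abstract, that the composed AX→SAX→AX transform should be close to the identity, can be written as a small penalty term. A hedged sketch, assuming the learned rigid transforms are given as homogeneous 4×4 matrices (the paper's exact parameterization may differ):

```python
# Sketch of the cycle-consistency idea from the abstract: composing the
# learned AX->SAX transform with its inverse-direction counterpart should
# give the identity. Representing the rigid transforms as homogeneous 4x4
# matrices is an assumption; the paper's parameterization may differ.
import torch

def cycle_consistency_loss(T_ax2sax, T_sax2ax):
    """Both arguments: (N, 4, 4) batched rigid transformation matrices."""
    composed = torch.bmm(T_sax2ax, T_ax2sax)          # AX -> SAX -> AX
    eye = torch.eye(4, device=composed.device).expand_as(composed)
    return ((composed - eye) ** 2).mean()             # penalize drift from identity
```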
13. A Refined Spatial Transformer Network
- Author
- Shu, Chang, Chen, Xi, Yu, Chong, and Han, Hua
- Published
- 2018
14. An effective recognition approach for contactless palmprint.
- Author
- Xu, Nuoya, Zhu, Qi, Xu, Xiangyu, and Zhang, Daoqiang
- Subjects
- CONVOLUTIONAL neural networks; IMAGE registration; SIGNAL convolution
- Abstract
Biometric characteristics have been widely used for individual identification and verification. The palmprint, as one such biological feature, contains abundant discriminative information and has attracted a lot of interest. In this work, we focus on the identification and verification of contactless palmprint images. Considering the main differences between contact and contactless images, including orientation and deformation, we use a deep network combined with image alignment to further improve the recognition performance on contactless palmprint images. Convolutional neural networks can solve many classification problems well, and researchers have proposed many networks with different architectures. We exploit the residual network in our framework, which achieves promising performance on the image classification problem. In order to improve the accuracy of verification, a spatial transformer network is used to align the images. The proposed method is tested on two public palmprint databases, CASIA and GPDS. Extensive experiments are carried out with several state-of-the-art approaches as comparison, and the results demonstrate the effectiveness of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2021
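The alignment step described above is the standard spatial transformer: a localization network regresses affine parameters, which drive a differentiable grid sampler. A minimal sketch, assuming single-channel input crops; layer sizes are illustrative, not the authors' configuration:

```python
# Minimal spatial transformer in the spirit of the alignment step above:
# a localization network regresses six affine parameters that drive a
# differentiable grid sampler. Layer sizes assume single-channel crops
# and are illustrative, not the authors' configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class STN(nn.Module):
    def __init__(self):
        super().__init__()
        self.loc = nn.Sequential(                     # localization network
            nn.Conv2d(1, 8, 7), nn.MaxPool2d(2), nn.ReLU(),
            nn.Conv2d(8, 10, 5), nn.MaxPool2d(2), nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.Flatten(), nn.LazyLinear(32), nn.ReLU(), nn.Linear(32, 6),
        )
        self.fc[-1].weight.data.zero_()               # start as the identity
        self.fc[-1].bias.data.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

    def forward(self, x):
        theta = self.fc(self.loc(x)).view(-1, 2, 3)   # predicted affine params
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)
```

Initializing the regressor to the identity transform is the usual trick for stable training: the sampler starts as a no-op and learns corrective warps gradually; the aligned output is then fed to the residual network.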
15. gvnn: Neural Network Library for Geometric Computer Vision
- Author
- Handa, Ankur, Bloesch, Michael, Pătrăucean, Viorica, Stent, Simon, McCormac, John, and Davison, Andrew
- Published
- 2016
16. An Improved Deep Residual Network Prediction Model for the Early Diagnosis of Alzheimer’s Disease
- Author
- Haijing Sun, Anna Wang, Wenhui Wang, and Chen Liu
- Subjects
- residual network; Mish; spatial transformer networks; non-local attention mechanism; Alzheimer’s disease
- Abstract
The early diagnosis of Alzheimer’s disease (AD) can allow patients to take preventive measures before irreversible brain damage occurs. It can be seen from cross-sectional imaging studies of AD that the features of the lesion areas in AD patients, as observed by magnetic resonance imaging (MRI), show significant variation, and these features are distributed throughout the image space. Since the convolutional layer of the general convolutional neural network (CNN) cannot satisfactorily extract long-distance correlation in the feature space, a deep residual network (ResNet) model, based on spatial transformer networks (STN) and the non-local attention mechanism, is proposed in this study for the early diagnosis of AD. In this ResNet model, a new Mish activation function is selected in the ResNet-50 backbone to replace the ReLU function, STN is introduced between the input layer and the improved ResNet-50 backbone, and a non-local attention mechanism is introduced between the fourth and the fifth stages of the improved ResNet-50 backbone. This ResNet model can extract more information from the layers by deepening the network structure through deep ResNet. The introduced STN can transform the spatial information in MRI images of Alzheimer’s patients into another space and retain the key information. The introduced non-local attention mechanism can find the relationship between the lesion areas and normal areas in the feature space. This model can solve the problem of local information loss in traditional CNN and can extract the long-distance correlation in feature space. The proposed method was validated using the ADNI (Alzheimer’s disease neuroimaging initiative) experimental dataset, and compared with several models. The experimental results show that the classification accuracy of the algorithm proposed in this study can reach 97.1%, the macro precision can reach 95.5%, the macro recall can reach 95.3%, and the macro F1 value can reach 95.4%. The proposed model is more effective than other algorithms.
- Published
- 2021
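Two of the ingredients above are easy to make concrete: Mish is x · tanh(softplus(x)), and swapping it for ReLU in a torchvision ResNet-50 is a recursive module replacement. A rough sketch under those assumptions; the STN in front of the backbone and the non-local block between stages four and five are only indicated in comments, not reproduced:

```python
# Mish is x * tanh(softplus(x)). A sketch of swapping it for ReLU
# throughout a torchvision ResNet-50; the class count is an assumption.
import torch
import torch.nn as nn
from torchvision.models import resnet50

class Mish(nn.Module):
    """Mish activation: x * tanh(softplus(x))."""
    def forward(self, x):
        return x * torch.tanh(nn.functional.softplus(x))

def swap_relu_for_mish(module):
    """Recursively replace every ReLU submodule with Mish."""
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, Mish())
        else:
            swap_relu_for_mish(child)

backbone = resnet50(num_classes=3)   # class count here is an assumption
swap_relu_for_mish(backbone)
# Paper's wiring: input -> STN -> improved ResNet-50 (with a non-local
# attention block between the fourth and fifth stages) -> prediction.
```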
17. Learning transform-aware attentive network for object tracking.
- Author
- Lu, Xiankai, Ni, Bingbing, Ma, Chao, and Yang, Xiaokang
- Subjects
- OBJECT tracking (Computer vision); ARTIFICIAL satellite tracking; MOTION
- Abstract
Existing trackers often decompose the task of visual tracking into multiple independent components, such as target appearance sampling, classifier learning, and target state inference. In this paper, we present a transform-aware attentive tracking framework, which uses a deep attentive network to directly predict the target states via spatial transform parameters. During off-line training, the proposed network learns generic motion patterns of target objects from auxiliary large-scale videos. These learned motion patterns are then applied to track target objects in test sequences. Built on the Spatial Transformer Network (STN), the proposed attentive network is fully differentiable and can be trained in an end-to-end manner. Notably, we only fine-tune the pre-trained network in the initial frame. The proposed tracker requires neither online model update nor appearance sampling during the tracking process. Extensive experiments on the OTB-2013, OTB-2015, VOT-2014 and UAV-123 datasets demonstrate the competitive performance of our method against state-of-the-art attentive tracking methods. [ABSTRACT FROM AUTHOR]
- Published
- 2019
18. An Automatic Modulation Recognition Method with Low Parameter Estimation Dependence Based on Spatial Transformer Networks.
- Author
- Li, Mingxuan, Li, Ou, Liu, Guangyi, and Zhang, Ce
- Subjects
- DEEP learning; PREDICATE calculus; PARAMETER estimation
- Abstract
Recently, automatic modulation recognition has been an important research topic in wireless communication. With the application of deep learning, it has become promising to use convolutional neural networks on raw in-phase and quadrature signals for developing automatic modulation recognition methods. However, the errors introduced during signal reception and processing will greatly deteriorate the classification performance, which affects the practical application of such methods. Therefore, we first analyze and quantify the errors introduced by signal detection and isolation in noncooperative communication through a baseline convolutional neural network. In response to these errors, we then design a signal spatial transformer module based on the attention model to eliminate errors via a priori learning of signal structure. By cascading the signal spatial transformer module in front of the baseline classification network, we propose a method that can adaptively resample the signal capture to adjust time drift, symbol rate, and clock recovery. In addition, it can automatically apply a perturbation to the signal carrier to correct frequency offset. By applying this improved model to automatic modulation recognition, we obtain a significant improvement in classification performance compared with several existing methods. Our method significantly improves the prospects for applying deep learning-based automatic modulation recognition under nonideal synchronization. [ABSTRACT FROM AUTHOR]
- Published
- 2019
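The "adaptively resample the signal capture" step can be emulated with the same differentiable sampler STNs use for images, by treating the (I, Q) capture as a one-pixel-high image. A sketch with fixed, illustrative shift and rate values; in the paper these would come from the learned transformer module:

```python
# Resampling an I/Q capture with a differentiable sampler, mirroring the
# "adaptively resample the signal" idea above. The shift and rate values
# are fixed placeholders; the paper's module predicts them.
import torch
import torch.nn.functional as F

def resample_iq(iq, shift=0.05, rate=1.02):
    """iq: (N, 2, T) raw in-phase/quadrature samples."""
    n, _, t = iq.shape
    x = iq.unsqueeze(2)                               # (N, 2, 1, T) pseudo-image
    base = torch.linspace(-1, 1, t, device=iq.device)
    xs = (base * rate + shift).view(1, 1, t, 1).expand(n, 1, t, 1)
    grid = torch.cat([xs, torch.zeros_like(xs)], dim=-1)  # (N, 1, T, 2)
    out = F.grid_sample(x, grid, align_corners=True)  # linear interpolation
    return out.squeeze(2)                             # back to (N, 2, T)
```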
19. A deep learning method for image super-resolution based on geometric similarity.
- Author
- Lu, Jian, Hu, Weidong, and Sun, Yi
- Subjects
- DEEP learning; DIGITAL image processing; HIGH resolution imaging; ALGORITHMS; ARTIFICIAL neural networks
- Abstract
A single-image super-resolution (SR) algorithm that combines deep convolutional neural networks (CNNs) with multi-scale similarity is presented in this work. The aim of this method is to address the inability of existing CNN methods to exploit the latent information in the image itself. To exploit this information, image patches that look similar within the same scale and across different scales are first searched for inside the input image. Subsequently, spatial transformer networks (STNs) are embedded into the CNNs to bring the similar patches into good alignment. The STNs give the CNNs the ability to spatially manipulate data. Finally, when SR is performed through the proposed pyramid-shaped CNNs, the high-resolution (HR) image is predicted gradually according to the complementary information provided by these aligned patches. The experimental results confirm the effectiveness of the proposed method and demonstrate that it is competitive with state-of-the-art approaches for single-image SR. [ABSTRACT FROM AUTHOR]
- Published
- 2019
20. Sequence recognition of Chinese license plates.
- Author
- Wang, Jianlin, Huang, He, Qian, Xusheng, Cao, Jinde, and Dai, Yakang
- Subjects
- AUTOMOBILE license plates; FEATURE selection; ARTIFICIAL neural networks; ACCURACY; MATHEMATICAL models
- Abstract
The recognition of license plates is very important for intelligent transportation systems. Generally, the performance of an intelligent recognition algorithm is greatly affected by different shooting angles, illumination conditions and backgrounds of the license plate images. This paper presents a sequence recognition approach for intelligent recognition of Chinese license plates. Firstly, a spatial transformer network (STN) is employed to adjust inclined and deformed license plates such that all the plates have a uniform orientation and thus are easier to recognize. Then, an improved convolutional neural network (CNN) is designed to extract sequence features of the rectified license plates. The features of different convolutional layers are integrated as input to a bi-directional recurrent neural network (BRNN), so character segmentation is not needed. Finally, the recognition is accomplished by the BRNN and connectionist temporal classification (CTC). Due to the lack of adequate Chinese license plates, an effective training method is presented in which the network is pre-trained on a sufficiently large set of synthetic license plates and fine-tuned on our collected real Chinese license plates. The experimental results show that our model achieves better recognition accuracy and lower average edit distance than some existing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2018
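The segmentation-free recognition step rests on connectionist temporal classification; in PyTorch this is `torch.nn.CTCLoss`. A minimal usage sketch with invented sizes (70 character classes, 7-character plates):

```python
# Minimal sketch of the segmentation-free recognition objective: PyTorch's
# built-in CTC loss over per-timestep character probabilities. The alphabet
# size, sequence length, and 7-character plates are invented for illustration.
import torch
import torch.nn as nn

T, N, C = 20, 4, 70                                   # time steps, batch, classes
log_probs = torch.randn(T, N, C).log_softmax(2)       # stand-in for BRNN output
targets = torch.randint(1, C, (N, 7), dtype=torch.long)  # 7-character plates
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 7, dtype=torch.long)

ctc = nn.CTCLoss(blank=0)                             # class 0 reserved as blank
loss = ctc(log_probs, targets, input_lengths, target_lengths)
```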
21. Ω-Net (Omega-Net): Fully automatic, multi-view cardiac MR detection, orientation, and segmentation with deep neural networks.
- Author
- Vigneault, Davis M., Xie, Weidi, Ho, Carolyn Y., Bluemke, David A., and Noble, J. Alison
- Subjects
- CARDIAC magnetic resonance imaging; IMAGE segmentation; ARTIFICIAL neural networks; HYPERTROPHIC cardiomyopathy; IMAGE processing
- Abstract
Pixelwise segmentation of the left ventricular (LV) myocardium and the four cardiac chambers in 2-D steady state free precession (SSFP) cine sequences is an essential preprocessing step for a wide range of analyses. Variability in contrast, appearance, orientation, and placement of the heart between patients, clinical views, scanners, and protocols makes fully automatic semantic segmentation a notoriously difficult problem. Here, we present Ω-Net (Omega-Net): a novel convolutional neural network (CNN) architecture for simultaneous localization, transformation into a canonical orientation, and semantic segmentation. First, an initial segmentation is performed on the input image; second, the features learned during this initial segmentation are used to predict the parameters needed to transform the input image into a canonical orientation; and third, a final segmentation is performed on the transformed image. In this work, Ω-Nets of varying depths were trained to detect five foreground classes in any of three clinical views (short axis, SA; four-chamber, 4C; two-chamber, 2C), without prior knowledge of the view being segmented. This constitutes a substantially more challenging problem compared with prior work. The architecture was trained using three-fold cross-validation on a cohort of patients with hypertrophic cardiomyopathy (HCM, N = 42) and healthy control subjects (N = 21). Network performance, as measured by weighted foreground intersection-over-union (IoU), was substantially improved for the best-performing Ω-Net compared with U-Net segmentation without localization or orientation (0.858 vs 0.834). In addition, to be comparable with other works, Ω-Net was retrained from scratch using five-fold cross-validation on the publicly available 2017 MICCAI Automated Cardiac Diagnosis Challenge (ACDC) dataset. The Ω-Net outperformed the state-of-the-art method in segmentation of the LV and RV blood pools, and performed slightly worse in segmentation of the LV myocardium. We conclude that this architecture represents a substantive advancement over prior approaches, with implications for biomedical image segmentation more generally. [ABSTRACT FROM AUTHOR]
- Published
- 2018
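The three-stage design (segment, predict a canonical-orientation transform from what was learned, segment again) can be expressed schematically; every module below is an untrained single-layer stand-in, not Ω-Net itself:

```python
# Schematic forward pass of the three-stage idea above (initial segmentation,
# orientation prediction from its output, final segmentation). All modules
# are untrained stand-ins, not the actual Omega-Net.
import torch
import torch.nn as nn
import torch.nn.functional as F

seg_initial = nn.Conv2d(1, 5, 3, padding=1)           # stage 1: rough segmenter
loc = nn.Sequential(nn.Flatten(), nn.LazyLinear(6))   # stage 2: transform regressor
seg_final = nn.Conv2d(1, 5, 3, padding=1)             # stage 3: final segmenter

x = torch.rand(2, 1, 64, 64)                          # toy batch of images
initial = seg_initial(x)
theta = loc(initial).view(-1, 2, 3)                   # canonical-orientation params
grid = F.affine_grid(theta, x.size(), align_corners=False)
x_canonical = F.grid_sample(x, grid, align_corners=False)
final = seg_final(x_canonical)                        # segment the transformed image
```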
22. An Automatic Modulation Recognition Method with Low Parameter Estimation Dependence Based on Spatial Transformer Networks
- Author
- Mingxuan Li, Ou Li, Guangyi Liu, and Ce Zhang
- Subjects
- deep learning; automatic modulation recognition; spatial transformer networks; signal processing
- Abstract
Recently, automatic modulation recognition has been an important research topic in wireless communication. With the application of deep learning, it has become promising to use convolutional neural networks on raw in-phase and quadrature signals for developing automatic modulation recognition methods. However, the errors introduced during signal reception and processing will greatly deteriorate the classification performance, which affects the practical application of such methods. Therefore, we first analyze and quantify the errors introduced by signal detection and isolation in noncooperative communication through a baseline convolutional neural network. In response to these errors, we then design a signal spatial transformer module based on the attention model to eliminate errors via a priori learning of signal structure. By cascading the signal spatial transformer module in front of the baseline classification network, we propose a method that can adaptively resample the signal capture to adjust time drift, symbol rate, and clock recovery. In addition, it can automatically apply a perturbation to the signal carrier to correct frequency offset. By applying this improved model to automatic modulation recognition, we obtain a significant improvement in classification performance compared with several existing methods. Our method significantly improves the prospects for applying deep learning-based automatic modulation recognition under nonideal synchronization.
- Published
- 2019
23. Understanding when spatial transformer networks do not support invariance, and what to do about it
- Author
- Finnveden, Lukas, Jansson, Ylva, and Lindeberg, Tony
- Abstract
Spatial transformer networks (STNs) were designed to enable convolutional neural networks (CNNs) to learn invariance to image transformations. STNs were originally proposed to transform CNN feature maps as well as input images. This enables the use of more complex features when predicting transformation parameters. However, since STNs perform a purely spatial transformation, they do not, in the general case, have the ability to align the feature maps of a transformed image with those of its original. STNs are therefore unable to support invariance when transforming CNN feature maps. We present a simple proof for this and study the practical implications, showing that this inability is coupled with decreased classification accuracy. We therefore investigate alternative STN architectures that make use of complex features. We find that while deeper localization networks are difficult to train, localization networks that share parameters with the classification network remain stable as they grow deeper, which allows for higher classification accuracy on difficult datasets. Finally, we explore the interaction between localization network complexity and iterative image alignment.
- Published
- 2021
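The central claim, that spatially transforming CNN feature maps does not reproduce the feature maps of the transformed image unless the features are invariant, is easy to check numerically. A toy demonstration with an arbitrary random convolution and a 90° rotation:

```python
# Toy numeric check of the paper's claim: extracting features from a rotated
# image differs from rotating the features of the original, unless the
# features themselves are invariant. Network and angle are arbitrary.
import torch
import torch.nn as nn
import torchvision.transforms.functional as TF

conv = nn.Sequential(nn.Conv2d(1, 4, 3, padding=1), nn.ReLU())
img = torch.rand(1, 1, 32, 32)

feat_of_rotated = conv(TF.rotate(img, 90))            # transform, then features
rotated_feat = TF.rotate(conv(img), 90)               # features, then transform
print((feat_of_rotated - rotated_feat).abs().max())   # generally nonzero
```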
24. Understanding when spatial transformer networks do not support invariance, and what to do about it
- Author
- Lukas Finnveden, Tony Lindeberg, and Ylva Jansson
- Subjects
- spatial transformer networks; invariant neural networks; convolutional neural networks; deep learning; pattern recognition; spatial transformation; image alignment; network complexity; Computer Vision and Robotics (Autonomous Systems)
- Abstract
Spatial transformer networks (STNs) were designed to enable convolutional neural networks (CNNs) to learn invariance to image transformations. STNs were originally proposed to transform CNN feature maps as well as input images. This enables the use of more complex features when predicting transformation parameters. However, since STNs perform a purely spatial transformation, they do not, in the general case, have the ability to align the feature maps of a transformed image with those of its original. STNs are therefore unable to support invariance when transforming CNN feature maps. We present a simple proof for this and study the practical implications, showing that this inability is coupled with decreased classification accuracy. We therefore investigate alternative STN architectures that make use of complex features. We find that while deeper localization networks are difficult to train, localization networks that share parameters with the classification network remain stable as they grow deeper, which allows for higher classification accuracy on difficult datasets. Finally, we explore the interaction between localization network complexity and iterative image alignment.
- Published
- 2021
25. Understanding when spatial transformer networks do not support invariance, and what to do about it
- Author
- Finnveden, Lukas, Jansson, Ylva, and Lindeberg, Tony
- Abstract
Spatial transformer networks (STNs) were designed to enable convolutional neural networks (CNNs) to learn invariance to image transformations. STNs were originally proposed to transform CNN feature maps as well as input images. This enables the use of more complex features when predicting transformation parameters. However, since STNs perform a purely spatial transformation, they do not, in the general case, have the ability to align the feature maps of a transformed image with those of its original. STNs are therefore unable to support invariance when transforming CNN feature maps. We present a simple proof for this and study the practical implications, showing that this inability is coupled with decreased classification accuracy. We therefore investigate alternative STN architectures that make use of complex features. We find that while deeper localization networks are difficult to train, localization networks that share parameters with the classification network remain stable as they grow deeper, which allows for higher classification accuracy on difficult datasets. Finally, we explore the interaction between localization network complexity and iterative image alignment.
- Published
- 2020
26. Inability of spatial transformations of CNN feature maps to support invariant recognition
- Author
- Jansson, Ylva, Maydanskiy, Maksim, Finnveden, Lukas, and Lindeberg, Tony
- Abstract
A large number of deep learning architectures use spatial transformations of CNN feature maps or filters to better deal with variability in object appearance caused by natural image transformations. In this paper, we prove that spatial transformations of CNN feature maps cannot align the feature maps of a transformed image to match those of its original, for general affine transformations, unless the extracted features are themselves invariant. Our proof is based on elementary analysis for both the single- and multi-layer network case. The results imply that methods based on spatial transformations of CNN feature maps or filters cannot replace image alignment of the input and cannot enable invariant recognition for general affine transformations, specifically not for scaling transformations or shear transformations. For rotations and reflections, spatially transforming feature maps or filters can enable invariance, but only for networks with learnt or hardcoded rotation- or reflection-invariant features.
- Published
- 2020
27. The problems with using STNs to align CNN feature maps
- Author
- Finnveden, Lukas, Jansson, Ylva, and Lindeberg, Tony
- Abstract
Spatial transformer networks (STNs) were designed to enable CNNs to learn invariance to image transformations. STNs were originally proposed to transform CNN feature maps as well as input images. This enables the use of more complex features when predicting transformation parameters. However, since STNs perform a purely spatial transformation, they do not, in the general case, have the ability to align the feature maps of a transformed image and its original. We present a theoretical argument for this and investigate the practical implications, showing that this inability is coupled with decreased classification accuracy. We advocate taking advantage of more complex features in deeper layers by instead sharing parameters between the classification and the localisation network.
- Published
- 2020
28. An Improved Deep Residual Network Prediction Model for the Early Diagnosis of Alzheimer's Disease.
- Author
- Sun, Haijing, Wang, Anna, Wang, Wenhui, and Liu, Chen
- Subjects
- MAGNETIC resonance imaging; ALZHEIMER'S disease; EARLY diagnosis; CONVOLUTIONAL neural networks; CROSS-sectional imaging; CLASSIFICATION algorithms
- Abstract
The early diagnosis of Alzheimer's disease (AD) can allow patients to take preventive measures before irreversible brain damage occurs. It can be seen from cross-sectional imaging studies of AD that the features of the lesion areas in AD patients, as observed by magnetic resonance imaging (MRI), show significant variation, and these features are distributed throughout the image space. Since the convolutional layer of the general convolutional neural network (CNN) cannot satisfactorily extract long-distance correlation in the feature space, a deep residual network (ResNet) model, based on spatial transformer networks (STN) and the non-local attention mechanism, is proposed in this study for the early diagnosis of AD. In this ResNet model, a new Mish activation function is selected in the ResNet-50 backbone to replace the ReLU function, STN is introduced between the input layer and the improved ResNet-50 backbone, and a non-local attention mechanism is introduced between the fourth and the fifth stages of the improved ResNet-50 backbone. This ResNet model can extract more information from the layers by deepening the network structure through deep ResNet. The introduced STN can transform the spatial information in MRI images of Alzheimer's patients into another space and retain the key information. The introduced non-local attention mechanism can find the relationship between the lesion areas and normal areas in the feature space. This model can solve the problem of local information loss in traditional CNN and can extract the long-distance correlation in feature space. The proposed method was validated using the ADNI (Alzheimer's disease neuroimaging initiative) experimental dataset, and compared with several models. The experimental results show that the classification accuracy of the algorithm proposed in this study can reach 97.1%, the macro precision can reach 95.5%, the macro recall can reach 95.3%, and the macro F1 value can reach 95.4%. The proposed model is more effective than other algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2021
29. Automatic target recognition with convolutional neural networks.
- Author
- Baili, Nada
- Subjects
- Automatic target recognition, convolutional neural networks, spatial transformer networks, wide residual neural networks, DSIAC, Other Computer Engineering
- Abstract
Automatic Target Recognition (ATR) refers to the ability of an algorithm or device to identify targets or other objects based on data obtained from sensors, commonly thermal. ATR is an important technology for both civilian and military computer vision applications. However, the current level of performance available is largely deficient compared to the requirements. This is mainly due to the difficulty of acquiring targets in realistic environments, and also to limitations on the distribution of classified data to the academic community for research purposes. This thesis proposes to solve the ATR task using Convolutional Neural Networks (CNN). We present three learning approaches using WideResNet-28-2 as a backbone CNN. The first method uses random initialization of the network weights. The second method explores transfer learning. Finally, the third approach relies on spatial transformer networks to enhance the geometric invariance of the model. To validate, analyze and compare our three proposed models, we use a large-scale real benchmark dataset that includes civilian and military vehicles. These targets are captured at different viewing angles, different resolutions, and different times of the day. We evaluate the effectiveness of our methods by studying their robustness to realistic case scenarios where no ground truth data is available and targets are automatically detected. We show that the method that uses spatial transformer networks achieves the best results and demonstrates the most robustness to various perturbations that can be encountered in real applications.
- Published
- 2020