9 results for "Image corruption"
Search Results
2. On the Detection of Anomalous or Out-of-Distribution Data in Vision Models Using Statistical Techniques
- Author
- O’Mahony, Laura, O’Sullivan, David JP, and Nikolov, Nikola S.
- Published
- 2023
3. Surgical-VQLA++: Adversarial contrastive learning for calibrated robust visual question-localized answering in robotic surgery.
- Author
- Bai, Long, Wang, Guankun, Islam, Mobarakol, Seenivasan, Lalithkumar, Wang, An, and Ren, Hongliang
- Subjects
- SURGICAL equipment, SURGICAL robots, SURGICAL education, IMAGE transmission, LEARNING strategies
- Abstract
Medical visual question answering (VQA) bridges the gap between visual information and clinical decision-making, enabling doctors to extract understanding from clinical images and videos. In particular, surgical VQA can enhance the interpretation of surgical data, aiding in accurate diagnoses, effective education, and clinical interventions. However, the inability of VQA models to visually indicate the regions of interest corresponding to the given questions results in incomplete comprehension of the surgical scene. To tackle this, we propose surgical visual question localized-answering (VQLA) for precise and context-aware responses to specific queries regarding surgical images. Furthermore, to address the strong demand for safety in surgical scenarios and potential corruptions in image acquisition and transmission, we propose a novel approach called Calibrated Co-Attention Gated Vision-Language (C²G-ViL) embedding to integrate and align multimodal information effectively. Additionally, we leverage an adversarial sample-based contrastive learning strategy to boost our performance and robustness. We also extend our EndoVis-18-VQLA and EndoVis-17-VQLA datasets to broaden the scope and application of our data. Extensive experiments on the aforementioned datasets demonstrate the remarkable performance and robustness of our solution. Our solution can effectively combat real-world image corruption. Thus, our proposed approach can serve as an effective tool for assisting surgical education, patient care, and enhancing surgical outcomes. Our code and data will be released at https://github.com/longbai1006/Surgical-VQLAPlus.
• We propose a Surgical-VQLA++ framework to connect answering and localization.
• We incorporate feature calibration and adversarial contrastive training techniques.
• We expand our datasets by including additional queries related to surgical tools.
• Extensive experiments prove the effectiveness and robustness of our solution.
- Published
- 2025
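The C²G-ViL embedding itself is not specified in the abstract above. As a rough illustration only, the following is a minimal sketch of the gated vision-language fusion step such an embedding builds on; all names, dimensions, and the gating scheme are assumptions, and the published module additionally uses co-attention, calibration, and adversarial contrastive training.

```python
import torch
import torch.nn as nn

class GatedVLFusion(nn.Module):
    """Illustrative gated blend of projected vision and language features."""
    def __init__(self, vis_dim, txt_dim, emb_dim):
        super().__init__()
        self.vis_proj = nn.Linear(vis_dim, emb_dim)
        self.txt_proj = nn.Linear(txt_dim, emb_dim)
        self.gate = nn.Sequential(nn.Linear(2 * emb_dim, emb_dim), nn.Sigmoid())

    def forward(self, vis_feat, txt_feat):
        v = self.vis_proj(vis_feat)
        t = self.txt_proj(txt_feat)
        g = self.gate(torch.cat([v, t], dim=-1))  # per-dimension gate in (0, 1)
        return g * v + (1.0 - g) * t              # gated multimodal embedding
```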
4. Robust Image Classification Using a Low-Pass Activation Function and DCT Augmentation
- Author
- Md Tahmid Hossain, Shyh Wei Teng, Ferdous Sohel, and Guojun Lu
- Subjects
- Robust image classification, activation function, low-pass filtering, input corruption, image corruption, data augmentation, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Convolutional Neural Network’s (CNN’s) performance disparity on clean and corrupted datasets has recently come under scrutiny. In this work, we analyse common corruptions in the frequency domain, i.e., High Frequency corruptions (HFc, e.g., noise) and Low Frequency corruptions (LFc, e.g., blur). Although a simple solution to HFc is low-pass filtering, ReLU, a widely used Activation Function (AF), does not have any filtering mechanism. In this work, we instill low-pass filtering into the AF (LP-ReLU) to improve robustness against HFc. To deal with LFc, we complement LP-ReLU with Discrete Cosine Transform based augmentation. LP-ReLU, coupled with DCT augmentation, enables a deep network to tackle the entire spectrum of corruption. We use CIFAR-10-C and Tiny ImageNet-C for evaluation and demonstrate improvements of 5% and 7.3% in accuracy respectively, compared to the State-Of-The-Art (SOTA). We further evaluate our method’s stability on a variety of perturbations in CIFAR-10-P and Tiny ImageNet-P, achieving new SOTA in these experiments as well. To further strengthen our understanding regarding CNN’s lack of robustness, a decision space visualisation process is proposed and presented in this work.
- Published
- 2021
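The abstract does not spell out the DCT augmentation recipe. The sketch below shows one plausible form of DCT-domain augmentation of the kind described, with the low-frequency block size `cutoff` and noise scale `sigma` as illustrative assumptions rather than the paper's settings.

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_augment(img, cutoff=8, sigma=0.1, rng=None):
    """Perturb the low-frequency DCT block of a 2-D image with values in [0, 1]."""
    rng = rng or np.random.default_rng()
    coeffs = dctn(img, norm='ortho')
    # Scale-aware noise on the top-left (low-frequency) coefficients,
    # emulating blur-like, low-frequency corruption during training.
    block = coeffs[:cutoff, :cutoff]
    block += rng.normal(0.0, sigma, block.shape) * np.abs(block)
    return np.clip(idctn(coeffs, norm='ortho'), 0.0, 1.0)
```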
5. Enhanced robustness of convolutional networks with a push–pull inhibition layer.
- Author
- Strisciuglio, Nicola, Lopez-Antequera, Manuel, and Petkov, Nicolai
- Subjects
- CONVOLUTIONAL neural networks, NEURAL inhibition, INTERNEURONS
- Abstract
Convolutional neural networks (CNNs) lack robustness to test image corruptions that are not seen during training. In this paper, we propose a new layer for CNNs that increases their robustness to several types of corruptions of the input images. We call it a 'push–pull' layer and compute its response as the combination of two half-wave rectified convolutions, with kernels of different size and opposite polarity. Its implementation is based on a biologically motivated model of certain neurons in the visual system that exhibit response suppression, known as push–pull inhibition. We validate our method by replacing the first convolutional layer of the LeNet, ResNet and DenseNet architectures with our push–pull layer. We train the networks on original training images from the MNIST and CIFAR data sets and test them on images with several corruptions, of different types and severities, that are unseen by the training process. We experiment with various configurations of the ResNet and DenseNet models on a benchmark test set with typical image corruptions constructed on the CIFAR test images. We demonstrate that our push–pull layer contributes to a considerable improvement in robustness of classification of corrupted images, while maintaining state-of-the-art performance on the original image classification task. We released the code and trained models at http://github.com/nicstrisc/Push-Pull-CNN-layer.
- Published
- 2020
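The abstract describes the push-pull response precisely enough to sketch: two half-wave rectified convolutions with kernels of different size and opposite polarity, combined by subtraction. Below is a minimal PyTorch rendering of that description; the kernel-upsampling scheme and the inhibition weight `alpha` are assumptions, and the authors' released code at the linked repository is authoritative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PushPull2d(nn.Module):
    """Sketch of a push-pull layer: an excitatory convolution minus a rectified,
    opposite-polarity convolution whose larger kernel is derived from the same weights."""
    def __init__(self, in_ch, out_ch, kernel_size=3, alpha=1.0):
        super().__init__()
        self.push = nn.Conv2d(in_ch, out_ch, kernel_size,
                              padding=kernel_size // 2, bias=False)
        self.pull_size = 2 * kernel_size - 1   # larger support for the pull kernel
        self.alpha = alpha                     # strength of the inhibition (assumed)

    def forward(self, x):
        push = F.relu(self.push(x))            # half-wave rectified excitation
        # Pull kernel: the push kernel upsampled and negated (opposite polarity).
        pull_w = -F.interpolate(self.push.weight, size=(self.pull_size,) * 2,
                                mode='bilinear', align_corners=True)
        pull = F.relu(F.conv2d(x, pull_w, padding=self.pull_size // 2))
        return push - self.alpha * pull        # response suppression

# Usage, per the paper's protocol: replace the first conv layer of e.g. a ResNet.
```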
6. Why You Trust in Visual Saliency
- Author
- Ardizzone, Edoardo, Bruno, Alessandro, Greco, Luca, and La Cascia, Marco
- Published
- 2015
7. Robust Image Classification Using a Low-Pass Activation Function and DCT Augmentation
- Author
- Guojun Lu, Ferdous Sohel, Tahmid Hossain, and Shyh Wei Teng
- Subjects
- Robust image classification, activation function, low-pass filtering, input corruption, image corruption, data augmentation, contextual image classification, convolutional neural network, discrete cosine transform, frequency domain, noise, pattern recognition, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Convolutional Neural Network’s (CNN’s) performance disparity on clean and corrupted datasets has recently come under scrutiny. In this work, we analyse common corruptions in the frequency domain, i.e., High Frequency corruptions (HFc, e.g., noise) and Low Frequency corruptions (LFc, e.g., blur). Although a simple solution to HFc is low-pass filtering, ReLU, a widely used Activation Function (AF), does not have any filtering mechanism. In this work, we instill low-pass filtering into the AF (LP-ReLU) to improve robustness against HFc. To deal with LFc, we complement LP-ReLU with Discrete Cosine Transform based augmentation. LP-ReLU, coupled with DCT augmentation, enables a deep network to tackle the entire spectrum of corruption. We use CIFAR-10-C and Tiny ImageNet-C for evaluation and demonstrate improvements of 5% and 7.3% in accuracy respectively, compared to the State-Of-The-Art (SOTA). We further evaluate our method’s stability on a variety of perturbations in CIFAR-10-P and Tiny ImageNet-P, achieving new SOTA in these experiments as well. To further strengthen our understanding regarding CNN’s lack of robustness, a decision space visualisation process is proposed and presented in this work.
- Published
- 2021
8. Why you trust in visual saliency
- Author
- Ardizzone, Edoardo, Bruno, Alessandro, Greco, Luca, and La Cascia, Marco
- Subjects
- Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni (Information Processing Systems), ground truth, saliency map, image compression, image corruption, visual attention, digital image, computer vision, visual saliency
- Abstract
Image understanding is a simple task for a human observer. Visual attention is automatically drawn to interesting regions, first by a natural, objective stimulus and then by prior knowledge. Saliency maps try to simulate the human response and use actual eye-movement measurements as ground truth. An interesting question is: how much can corruption in a digital image affect saliency detection with respect to the original image? One of the contributions of this work is to compare the performance of standard approaches across different types of image corruption and different threshold values on saliency maps. If the corruption can be estimated and/or the threshold is fixed, the results of this work can also be used to help select the method with the best performance.
- Published
- 2015
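As a hedged sketch of the kind of comparison the abstract describes: threshold the saliency maps of an original and a corrupted image, then score their agreement. Both the thresholding rule and the IoU score below are stand-ins for the paper's actual measures.

```python
import numpy as np

def binarize(sal, t):
    """Threshold a saliency map at fraction t of its maximum."""
    return sal >= t * sal.max()

def saliency_agreement(sal_orig, sal_corrupt, t=0.5):
    """IoU of thresholded salient regions before and after corruption."""
    a, b = binarize(sal_orig, t), binarize(sal_corrupt, t)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0
```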
9. Single-Sample Face Recognition with Image Corruption and Misalignment via Sparse Illumination Transfer
- Author
- Zhuang, Liansheng, Yang, Allen Y., Zhou, Zihan, Sastry, S. S., and Ma, Yi (California Univ Berkeley, Dept of Electrical Engineering and Computer Sciences)
- Abstract
Single-sample face recognition is one of the most challenging problems in face recognition. We propose a novel face recognition algorithm to address this problem based on a sparse representation-based classification (SRC) framework. The new algorithm is robust to image misalignment and pixel corruption, and is able to reduce required training images to one sample per class. To compensate for the missing illumination information typically provided by multiple training images, a sparse illumination transfer (SIT) technique is introduced. The SIT algorithms seek additional illumination examples of face images from one or more additional subject classes, and form an illumination dictionary. By enforcing a sparse representation of the query image, the method can recover and transfer the pose and illumination information from the alignment stage to the recognition stage. Our extensive experiments have demonstrated that the new algorithms significantly outperform the existing algorithms in the single-sample regime and with fewer restrictions. In particular, the face alignment accuracy is comparable to that of the well-known Deformable SRC algorithm using multiple training images, and the face recognition accuracy exceeds those of the SRC and Extended SRC algorithms using hand-labeled alignment initialization. Presented at the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013), held in Portland, Oregon, on 23-28 June 2013. Sponsored in part by the Army Research Office through grant 63092-MA-II. U.S. Government or Federal Rights License.
- Published
- 2013
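For orientation, a compact sketch of the SRC step this work builds on: code the query sparsely over the training dictionary and classify by the smallest class-wise reconstruction residual. Alignment and the SIT illumination dictionary are omitted here, and scikit-learn's Lasso stands in for the paper's l1 solver.

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(query, dictionary, labels, alpha=0.01):
    """query: (d,) vector; dictionary: (d, n) matrix whose columns are
    training faces; labels: (n,) array with one class label per column."""
    # Sparse code of the query over all training faces (l1-regularized).
    x = Lasso(alpha=alpha, max_iter=5000).fit(dictionary, query).coef_
    best, best_res = None, np.inf
    for c in np.unique(labels):
        mask = labels == c                       # keep class-c coefficients only
        res = np.linalg.norm(query - dictionary[:, mask] @ x[mask])
        if res < best_res:
            best, best_res = c, res
    return best
```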