332 results for "Model robustness"
Search Results
2. Counterfactual Contrastive Learning: Robust Representations via Causal Image Synthesis
- Author
-
Roschewitz, Mélanie, de Sousa Ribeiro, Fabio, Xia, Tian, Khara, Galvin, and Glocker, Ben
- Published
- 2025
- Full Text
- View/download PDF
3. Research on Fine-Tuning Optimization Strategies for Large Language Models in Tabular Data Processing.
- Author
-
Zhao, Xiaoyong, Leng, Xingxin, Wang, Lei, and Wang, Ningning
- Subjects
LANGUAGE models, NATURAL language processing, DATA structures, ELECTRONIC data processing, LANGUAGE acquisition - Abstract
Recent advancements in natural language processing (NLP) have been significantly driven by the development of large language models (LLMs). Despite their impressive performance across various language tasks, these models still encounter challenges when processing tabular data. This study investigates the optimization of fine-tuning strategies for LLMs specifically in the context of tabular data processing. The focus is on the effects of decimal truncation, multi-dataset mixing, and the ordering of JSON key–value pairs on model performance. Experimental results indicate that decimal truncation reduces data noise, thereby enhancing the model's learning efficiency. Additionally, multi-dataset mixing improves the model's generalization and stability, while the random shuffling of key–value pair orders increases the model's adaptability to changes in data structure. These findings underscore the significant impact of these strategies on model performance and robustness. The research provides novel insights into improving the practical effectiveness of LLMs and offers effective data processing methods for researchers in related fields. By thoroughly analyzing these strategies, this study aims to establish theoretical foundations and practical guidance for the future optimization of LLMs across a broader range of application scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
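The fine-tuning data strategies summarized in the record above (decimal truncation and random shuffling of JSON key–value pairs) are simple preprocessing operations. Below is a minimal, hedged Python sketch of what such preprocessing might look like; the function names, truncation precision, and example row are illustrative assumptions rather than the paper's implementation.

```python
import json
import random

def truncate_decimals(value, ndigits=2):
    """Truncate (not round) a float to a fixed number of decimals to reduce numeric noise."""
    if isinstance(value, float):
        factor = 10 ** ndigits
        return int(value * factor) / factor
    return value

def row_to_shuffled_json(row: dict, ndigits=2, seed=None) -> str:
    """Serialize one tabular row as JSON with truncated decimals and randomly ordered keys."""
    rng = random.Random(seed)
    items = [(k, truncate_decimals(v, ndigits)) for k, v in row.items()]
    rng.shuffle(items)  # varying key order exposes the model to changes in data structure
    return json.dumps(dict(items))

# Example: one row drawn from a mixed pool of tabular datasets
row = {"age": 52, "glucose": 117.34921, "bmi": 31.2789, "label": 1}
print(row_to_shuffled_json(row, seed=42))
```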
4. Towards Robust Semantic Segmentation against Patch-Based Attack via Attention Refinement.
- Author
-
Yuan, Zheng, Zhang, Jie, Wang, Yude, Shan, Shiguang, and Chen, Xilin
- Subjects
CONVOLUTIONAL neural networks, TRANSFORMER models, SPINE - Abstract
The attention mechanism has proven effective on various visual tasks in recent years. In semantic segmentation, attention is applied in a range of methods, with both convolutional neural networks and vision transformers as backbones. However, we observe that the attention mechanism is vulnerable to patch-based adversarial attacks. Through analysis of the effective receptive field, we attribute this to the wide receptive field brought by global attention, which may spread the influence of the adversarial patch. To address this issue, in this paper, we propose a robust attention mechanism (RAM) to improve the robustness of the semantic segmentation model, which notably relieves the vulnerability to patch-based attacks. Compared with the vanilla attention mechanism, RAM introduces two novel modules, max attention suppression and random attention dropout, both of which refine the attention matrix and limit the influence of a single adversarial patch on the segmentation results at other positions. Extensive experiments demonstrate the effectiveness of RAM in improving the robustness of semantic segmentation models against various patch-based attack methods under different attack settings. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
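The two refinement modules named in the abstract above, max attention suppression and random attention dropout, both operate on the attention matrix. A hedged sketch of what such a refinement could look like on raw attention weights; the clamping threshold, drop probability, and renormalization step are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def refine_attention(attn: torch.Tensor, max_weight: float = 0.1, drop_p: float = 0.1) -> torch.Tensor:
    """Illustrative attention refinement: cap individual attention weights (max attention
    suppression) and randomly zero some of them (random attention dropout), then renormalize."""
    # attn: (batch, heads, queries, keys); each row sums to 1 after softmax
    attn = attn.clamp(max=max_weight)                      # limit the influence of any single key/patch
    mask = (torch.rand_like(attn) > drop_p).float()        # randomly drop attention entries
    attn = attn * mask
    attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-8)  # renormalize rows
    return attn

# Example usage on a dummy attention matrix
attn = torch.softmax(torch.randn(2, 4, 16, 16), dim=-1)
robust_attn = refine_attention(attn)
```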
5. The application of Bayesian inference under SAFE model.
- Author
-
Wu, Lunshuai
- Subjects
CREDIT ratings, CREDIT analysis, ENVIRONMENTAL, social, & governance factors, BAYESIAN field theory, CORPORATE finance - Abstract
This paper responds to Professor Paolo Giudici's call for papers in 'Safe Machine Learning' and explores the application of Bayesian inference within the SAFE model framework, aiming to enhance the accuracy and reliability of environmental, social, and governance (ESG) factor analysis in the financial sector. The paper begins by introducing the basic concept of the SAFE model, which integrates ESG factors into the assessment of corporate credit ratings to promote sustainable development in the financial field. Furthermore, this paper discusses forecast estimation, uncertainty quantification, Gaussian processes, iterative optimization, and model robustness within the SAFE model framework. It is important to note that the SAFE model is not limited to the aforementioned applications; it also has broad utility in other financial sectors, reflecting the model's comprehensive scope. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. Advancing Model Generalization in Continuous Cyclic Test-Time Adaptation with Matrix Perturbation Noise.
- Author
-
Jiang, Jinshen, Yang, Hao, Yang, Lin, and Zhou, Yun
- Subjects
DEEP learning, GENERALIZATION, NOISE, STATISTICS - Abstract
Test-time adaptation (TTA) aims to optimize source-pretrained model parameters to target domains using only unlabeled test data. However, traditional TTA methods often risk overfitting to the specific, localized test domains, leading to compromised generalization. Moreover, these methods generally presume static target domains, neglecting the dynamic and cyclic nature of real-world settings. To alleviate this limitation, this paper explores the continuous cyclic test-time adaptation (CycleTTA) setting. Our unique approach within this setting employs matrix-wise perturbation noise in batch-normalization statistics to enhance the adaptability of source-pretrained models to dynamically changing target domains, without the need for additional parameters. We demonstrated the effectiveness of our method through extensive experiments, where our approach reduced the average error by 39.8% on the CIFAR10-C dataset using the WideResNet-28-10 model, by 38.8% using the WideResNet-40-2 model, and by 33.8% using the PreActResNet-18 model. Additionally, on the CIFAR100-C dataset with the WideResNet-40-2 model, our method reduced the average error by 5.3%, showcasing significant improvements in model generalization in continuous cyclic testing scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
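The approach above injects matrix-wise perturbation noise into batch-normalization statistics without adding parameters. A minimal, hedged sketch of perturbing the running statistics of a pretrained model's BN layers; the Gaussian noise form and scale are assumptions for illustration, not the paper's exact perturbation.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def perturb_bn_statistics(model: nn.Module, noise_scale: float = 0.01) -> None:
    """Add small random perturbations to BatchNorm running statistics, introducing no new parameters."""
    for module in model.modules():
        if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d)):
            module.running_mean.add_(noise_scale * torch.randn_like(module.running_mean))
            module.running_var.add_(noise_scale * torch.randn_like(module.running_var))
            module.running_var.clamp_(min=1e-5)  # keep variances positive

# Example: perturb a source-pretrained network before adapting to a shifted test batch
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
perturb_bn_statistics(model)
```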
7. RobustE2E: Exploring the Robustness of End-to-End Autonomous Driving.
- Author
-
Jiang, Wei, Wang, Lu, Zhang, Tianyuan, Chen, Yuwei, Dong, Jian, Bao, Wei, Zhang, Zichao, and Fu, Qiang
- Subjects
AUTONOMOUS vehicles, DEEP learning, CORRUPTION, NOISE, WEATHER - Abstract
Autonomous driving technology has advanced significantly with deep learning, but noise and attacks threaten its real-world deployment. While research has revealed vulnerabilities in individual intelligent tasks, a comprehensive evaluation of these impacts across complete end-to-end systems is still underexplored. To address this void, we thoroughly analyze the robustness of four end-to-end autonomous driving systems against various types of noise and build the RobustE2E Benchmark, including five traditional adversarial attacks and a newly proposed Module-Wise Attack specifically targeting end-to-end autonomous driving in white-box settings, as well as four major categories of natural corruptions (a total of 17 types, with five severity levels) in black-box settings. Additionally, we extend the robustness evaluation from the open-loop model level to closed-loop case studies at the autonomous driving system level. Our comprehensive evaluation and analysis provide valuable insights into the robustness of end-to-end autonomous driving, which may offer guidance for targeted improvements to models. For example, (1) even the most advanced end-to-end models suffer large planning failures under minor perturbations, with perception tasks showing the most substantial decline; (2) among adversarial attacks, our Module-Wise Attack poses the greatest threat to end-to-end autonomous driving models, while PGD-ℓ2 is the weakest, and among the four categories of natural corruptions, noise and weather are the most harmful, followed by blur and digital distortion, which are less severe; (3) the integrated, multitask approach results in significantly higher robustness and reliability compared with the simpler design, highlighting the critical role of collaborative multitasking in autonomous driving; and (4) the autonomous driving systems amplify the model's lack of robustness, among other findings. Our research contributes to developing more resilient autonomous driving models and their deployment in the real world. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. AI-Q: a Framework to Measure the Robustness of Text Classification Models.
- Author
-
Mysliwiec, Konrad, Chinea-Ríos, Mara, Borrego-Obrador, Ian, and Franco-Salvador, Marc
- Subjects
DATA augmentation, FRAGMENTED landscapes, AREA studies, CLASSIFICATION, DATA modeling - Abstract
Robustness analysis of text classification models through adversarial attacks has gained substantial attention in recent research. This area studies the consistent behavior of text classification models under attack. These attacks use perturbation methods based on applying semantic and label-preserving changes to the inputs. However, the fragmented landscape of individual attack implementations, dispersed across code repositories, complicates the development and application of comprehensive adversarial strategies for model enhancement. To address these challenges, this paper introduces AI-Q, a Python framework specifically designed for text classification adversarial attacks and data augmentation. One of the major strengths of our framework lies in its extensive library of perturbation methods for adversarial attacks (24 in total) and its evaluation metrics for model robustness. The framework exhibits versatility by supporting both custom models and those from the HuggingFace ecosystem, ensuring broad compatibility with leading benchmarks in the field. Beyond adversarial attacks, AI-Q can be used for data augmentation, enabling users to harness the components of adversarial attacks to increase dataset diversity. Finally, our evaluation, including human annotations, highlights AI-Q's potential for improving model robustness. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. Voxel-Wise Medical Image Generalization for Eliminating Distribution Shift.
- Author
-
Li, Feifei, Wang, Yuanbin, Beyan, Oya, Schöneck, Mirjam, and Caldeira, Liliana Lourenco
- Subjects
FEDERATED learning, DIAGNOSTIC imaging, GENERATIVE adversarial networks, SUPERVISED learning, DATA augmentation, IMAGE segmentation, MACHINE theory, MARKOV random fields - Abstract
Currently, the medical field is witnessing an increase in the use of machine learning techniques. Supervised learning methods adopted in classification, prediction, and segmentation tasks for medical images always experience decreased performance when the training and testing datasets do not follow the independent and identically distributed assumption. These distribution shift situations seriously influence machine learning applications' robustness, fairness, and trustworthiness in the medical domain. Hence, in this article, we adopt the CycleGAN (generative adversarial network) method to cycle train the computed tomography data from different scanners/manufacturers. It aims to eliminate the distribution shift from diverse data terminals based on our previous work [14]. However, due to the model collapse problem and generative mechanisms of the GAN-based model, the images we generated contained serious artifacts. To remove the boundary marks and artifacts, we adopt score-based diffusion generative models to refine the images voxel-wisely. This innovative combination of two generative models enhances the quality of the generated data while maintaining significant features. Meanwhile, we use five paired patients' medical images for the evaluation experiments, using structural similarity index measure metrics and a comparison of segmentation model performance. We conclude that CycleGAN can be utilized as an efficient data augmentation technique rather than a distribution-shift-eliminating method. In contrast, the denoising diffusion model is more suitable for dealing with the distribution shift problem arising from the different terminal modules. The limitation of generative methods applied in medical images is the difficulty in obtaining large and diverse datasets that accurately capture the complexity of biological structure and variability. In our following research, we plan to assess the initial and generated datasets to explore more possibilities to overcome the above limitation. We will also incorporate the generative methods into the federated learning architecture, which can maintain their advantages and resolve the distribution shift issue on a larger scale. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Assessment of robustness of machine learning-assisted modelling approach to describe growth kinetics of microorganisms using Monte Carlo simulation.
- Author
-
Tarlak, Fatih and Yücel, Özgün
- Subjects
MONTE Carlo method, STANDARD deviations, KRIGING, MICROBIAL growth, FOOD microbiology - Abstract
Understanding the growth behaviour of microorganisms is crucial for various fields such as microbiology, food safety and biotechnology. Traditional modelling approaches face challenges in accurately capturing the dynamic and complex nature of microbial growth, especially when high variation is seen. In contrast, machine learning techniques offer a promising avenue for creating more accurate and adaptable models. This study aimed to develop a new modelling method, a machine learning-assisted modelling approach, and compare the robustness of machine learning-assisted and traditional modelling approaches in describing microbial growth behaviour, employing Monte Carlo simulation. The research involved subjecting both machine learning-assisted and traditional modelling approaches to 10, 50 and 500 trials. The results showed that the machine learning approach led to more robust results than the traditional modelling approach, providing an adjusted coefficient of determination (R²adj) above 0.919 and a root mean square error (RMSE) below 0.319. These findings suggest that the machine learning-assisted modelling approach, particularly with Gaussian process regression, has the potential to serve as a highly reliable prediction method for describing the growth behaviour of microorganisms within the framework of predictive food microbiology. The study provides insights into the practical application of machine learning in enhancing our understanding and predictive capabilities of microbial growth dynamics. [ABSTRACT FROM AUTHOR]
- Published
- 2024
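The study above pits a machine learning-assisted model (Gaussian process regression) against traditional growth models over repeated Monte Carlo trials. A hedged scikit-learn sketch of one such comparison loop; the synthetic logistic growth curve, noise level, and kernel choice are illustrative assumptions rather than the study's actual data or settings.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
t = np.linspace(0, 48, 25).reshape(-1, 1)            # time (h)
true_logN = 3 + 5 / (1 + np.exp(-0.25 * (t - 20)))   # placeholder logistic growth curve, log CFU/g

rmses = []
for trial in range(50):                               # Monte Carlo repetitions with resampled noise
    y = true_logN.ravel() + rng.normal(0, 0.3, size=t.shape[0])
    gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
    gpr.fit(t, y)
    y_pred = gpr.predict(t)
    rmses.append(mean_squared_error(true_logN.ravel(), y_pred) ** 0.5)

print(f"mean RMSE over trials: {np.mean(rmses):.3f}")
```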
11. Construction and Practical Exploration of a Precise Management Model for Student Aid: A Multi-Dimensional Data-Driven Methodology
- Author
-
Wang, Chao, Lv, Xiao, Sun, Yang, and Liu, Zhe
- Published
- 2024
- Full Text
- View/download PDF
12. Inclusive Counterfactual Generation: Leveraging LLMs in Identifying Online Hate
- Author
-
Qureshi, M. Atif, Younus, Arjumand, and Caton, Simon
- Published
- 2024
- Full Text
- View/download PDF
13. Through the Eyes of the Expert: Aligning Human and Machine Attention for Industrial AI
- Author
-
Koebler, Alexander, Greisinger, Christian, Paulus, Jan, Thon, Ingo, and Buettner, Florian
- Published
- 2024
- Full Text
- View/download PDF
14. The CLEF-2024 CheckThat! Lab: Check-Worthiness, Subjectivity, Persuasion, Roles, Authorities, and Adversarial Robustness
- Author
-
Barrón-Cedeño, Alberto, Alam, Firoj, Chakraborty, Tanmoy, Elsayed, Tamer, Nakov, Preslav, Przybyła, Piotr, Struß, Julia Maria, Haouari, Fatima, Hasanain, Maram, Ruggeri, Federico, Song, Xingyi, and Suwaileh, Reem
- Published
- 2024
- Full Text
- View/download PDF
15. Robust Analysis of Visual Question Answering Based on Irrelevant Visual Contextual Information
- Author
-
Qin, Jun, Zhang, Zejin, Ye, Zheng, Liu, Zhou, and Cheng, Yong
- Published
- 2024
- Full Text
- View/download PDF
16. Can You Really Reason: A Novel Framework for Assessing Natural Language Reasoning Datasets and Models
- Author
-
Huang, Shanshan
- Published
- 2024
- Full Text
- View/download PDF
17. Robustness-Enhanced Assertion Generation Method Based on Code Mutation and Attack Defense
- Author
-
Li, Min, Chen, Shizhan, Fan, Guodong, Zhang, Lu, Wu, Hongyue, Xue, Xiao, and Feng, Zhiyong
- Published
- 2024
- Full Text
- View/download PDF
18. Structural Adversarial Attack for Code Representation Models
- Author
-
Zhang, Yuxin, Wu, Ruoting, Liao, Jie, and Chen, Liang
- Published
- 2024
- Full Text
- View/download PDF
19. SCME: A Self-contrastive Method for Data-Free and Query-Limited Model Extraction Attack
- Author
-
Liu, Renyang, Zhang, Jinhong, Lam, Kwok-Yan, Zhao, Jun, and Zhou, Wei
- Published
- 2024
- Full Text
- View/download PDF
20. DeepAstroUDA: semi-supervised universal domain adaptation for cross-survey galaxy morphology classification and anomaly detection
- Author
-
Ćiprijanović, A, Lewis, A, Pedro, K, Madireddy, S, Nord, B, Perdue, GN, and Wild, SM
- Subjects
Information and Computing Sciences, Machine Learning, domain adaptation, convolutional neural networks, deep learning, model robustness, galaxy morphological classification, sky surveys, Applied computing, Machine learning - Abstract
Artificial intelligence methods show great promise in increasing the quality and speed of work with large astronomical datasets, but the high complexity of these methods leads to the extraction of dataset-specific, non-robust features. Therefore, such methods do not generalize well across multiple datasets. We present a universal domain adaptation method, DeepAstroUDA, as an approach to overcome this challenge. This algorithm performs semi-supervised domain adaptation (DA) and can be applied to datasets with different data distributions and class overlaps. Non-overlapping classes can be present in any of the two datasets (the labeled source domain, or the unlabeled target domain), and the method can even be used in the presence of unknown classes. We apply our method to three examples of galaxy morphology classification tasks of different complexities (three-class and ten-class problems), with anomaly detection: (1) datasets created after different numbers of observing years from a single survey (Legacy Survey of Space and Time mock data of one and ten years of observations); (2) data from different surveys (Sloan Digital Sky Survey (SDSS) and DECaLS); and (3) data from observing fields with different depths within one survey (wide field and Stripe 82 deep field of SDSS). For the first time, we demonstrate the successful use of DA between very discrepant observational datasets. DeepAstroUDA is capable of bridging the gap between two astronomical surveys, increasing classification accuracy in both domains (up to 40 % on the unlabeled data), and making model performance consistent across datasets. Furthermore, our method also performs well as an anomaly detection algorithm and successfully clusters unknown class samples even in the unlabeled target dataset.
- Published
- 2023
21. Towards Defending Multiple ℓp-Norm Bounded Adversarial Perturbations via Gated Batch Normalization.
- Author
-
Liu, Aishan, Tang, Shiyu, Chen, Xinyun, Huang, Lei, Qin, Haotong, Liu, Xianglong, and Tao, Dacheng
- Subjects
ARTIFICIAL neural networks - Abstract
There has been extensive evidence demonstrating that deep neural networks are vulnerable to adversarial examples, which motivates the development of defenses against adversarial attacks. Existing adversarial defenses typically improve model robustness against individual specific perturbation types (e.g., ℓ∞-norm bounded adversarial examples). However, adversaries are likely to generate multiple types of perturbations in practice (e.g., ℓ1, ℓ2, and ℓ∞ perturbations). Some recent methods improve model robustness against adversarial attacks in multiple ℓp balls, but their performance against each perturbation type is still far from satisfactory. In this paper, we observe that different ℓp bounded adversarial perturbations induce different statistical properties that can be separated and characterized by the statistics of Batch Normalization (BN). We thus propose Gated Batch Normalization (GBN) to adversarially train a perturbation-invariant predictor for defending against multiple ℓp bounded adversarial perturbations. GBN consists of a multi-branch BN layer and a gated sub-network. Each BN branch in GBN is in charge of one perturbation type to ensure that the normalized output is aligned towards learning a perturbation-invariant representation. Meanwhile, the gated sub-network is designed to separate inputs added with different perturbation types. We perform an extensive evaluation of our approach on commonly used datasets including MNIST, CIFAR-10, and Tiny-ImageNet, and demonstrate that GBN outperforms previous defense proposals against multiple perturbation types (i.e., ℓ1, ℓ2, and ℓ∞ perturbations) by large margins. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
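GBN, as described above, pairs a multi-branch BN layer (one branch per perturbation type) with a gated sub-network that routes inputs to the right branch. A minimal, hedged PyTorch sketch of the multi-branch part only; the gating network, adversarial training loop, and branch count are omitted or assumed for illustration.

```python
import torch
import torch.nn as nn

class GatedBatchNorm2d(nn.Module):
    """Illustrative multi-branch BN: one BN branch per perturbation type, selected by a gate index."""
    def __init__(self, num_features: int, num_branches: int = 3):
        super().__init__()
        self.branches = nn.ModuleList([nn.BatchNorm2d(num_features) for _ in range(num_branches)])

    def forward(self, x: torch.Tensor, branch: int) -> torch.Tensor:
        # In GBN the branch would be chosen by a gated sub-network that classifies
        # the input's perturbation type; here it is passed explicitly for simplicity.
        return self.branches[branch](x)

# Example: route an l_inf-perturbed batch through branch 0
gbn = GatedBatchNorm2d(num_features=8, num_branches=3)
out = gbn(torch.randn(4, 8, 16, 16), branch=0)
```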
22. Robust semantic segmentation method of urban scenes in snowy environment.
- Author
-
Yin, Hanqi, Yin, Guisheng, Sun, Yiming, Zhang, Liguo, and Tian, Ye
- Abstract
Semantic segmentation plays a crucial role in various computer vision tasks, such as autonomous driving in urban scenes, and related research has made significant progress. However, since most research focuses on how to enhance the performance of semantic segmentation models, there is a noticeable lack of attention given to the performance deterioration of these models in severe weather. To address this issue, we study the robustness of the multimodal semantic segmentation model in a snowy environment, which represents a subset of severe weather conditions. The proposed method generates realistically simulated snowy environment images by combining unpaired image translation with adversarial snowflake generation, effectively misleading the segmentation model's predictions. These generated adversarial images are then utilized for model robustness learning, enabling the model to adapt to the harshest snowy environments and enhancing its robustness to artificial adversarial perturbations to some extent. The experimental visualization results show that the proposed method can generate approximately realistic snowy environment images and yield satisfactory visual effects for both daytime and nighttime scenes. Moreover, quantitative results on the MFNet dataset indicate that, compared with the model without enhancement, the proposed method achieves average improvements of 4.82% and 3.95% on mAcc and mIoU, respectively. These improvements enhance the adaptability of the multimodal semantic segmentation model to snowy environments and contribute to road safety. Furthermore, the proposed method demonstrates excellent applicability, as it can be seamlessly integrated into various multimodal semantic segmentation models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. Efficient Model-Free Subsampling Method for Massive Data.
- Author
-
Zhou, Zheng, Yang, Zebin, Zhang, Aijun, and Zhou, Yongdao
- Subjects
STATISTICAL learning, SAMPLE size (Statistics), PARALLEL programming - Abstract
Subsampling plays a crucial role in tackling problems associated with the storage and statistical learning of massive datasets. However, most existing subsampling methods are model-based, which means their performances can drop significantly when the underlying model is misspecified. Such an issue calls for model-free subsampling methods that are robust under diverse model specifications. Recently, several model-free subsampling methods have been developed. However, the computing time of these methods grows explosively with the sample size, making them impractical for handling massive data. In this article, an efficient model-free subsampling method is proposed, which segments the original data into some regular data blocks and obtains subsamples from each data block by the data-driven subsampling method. Compared with existing model-free subsampling methods, the proposed method has a significant speed advantage and performs more robustly for datasets with complex underlying distributions. As demonstrated in simulation experiments, the proposed method is an order of magnitude faster than other commonly used model-free subsampling methods when the sample size of the original dataset reaches the order of 10⁷. Moreover, simulation experiments and case studies show that the proposed method is more robust than other model-free subsampling methods under diverse model specifications and subsample sizes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
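The method above segments the data into regular blocks and draws a subsample from each block. A hedged NumPy sketch of that block-then-subsample idea; it uses uniform random sampling within each block, whereas the paper's own data-driven within-block criterion is not reproduced here, and the sizes are illustrative.

```python
import numpy as np

def blockwise_subsample(X: np.ndarray, n_blocks: int, per_block: int, seed: int = 0) -> np.ndarray:
    """Split the rows of X into contiguous blocks and draw a subsample from each block."""
    rng = np.random.default_rng(seed)
    indices = []
    for block in np.array_split(np.arange(len(X)), n_blocks):
        take = min(per_block, len(block))
        indices.extend(rng.choice(block, size=take, replace=False))
    return X[np.sort(indices)]

# Example: reduce 1 million rows to roughly 10,000 rows
X = np.random.default_rng(1).normal(size=(1_000_000, 5))
sub = blockwise_subsample(X, n_blocks=100, per_block=100)
print(sub.shape)  # (10000, 5)
```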
24. WordBlitz: An Efficient Hard-Label Textual Adversarial Attack Method Jointly Leveraging Adversarial Transferability and Word Importance.
- Author
-
Li, Xiangge, Luo, Hong, and Sun, Yan
- Subjects
NATURAL language processing, DATA augmentation, SEARCH algorithms, LEXICAL access - Abstract
Existing textual attacks mostly perturb keywords in sentences to generate adversarial examples by relying on the prediction confidence of victim models. In practice, attackers can only access the prediction label, meaning that the victim model can easily defend against such hard-label attacks by denying access based on the attack's frequency. In this paper, we propose an efficient hard-label attack approach, called WordBlitz. First, based on the adversarial transferability, we train a substitute model to initialize the attack parameter set, including a candidate pool and two weight tables of keywords and candidate words. Then, adversarial examples are generated and optimized under the guidance of the two weight tables. During optimization, we design a hybrid local search algorithm with word importance to find the globally optimal solution while updating the two weight tables according to the attack results. Finally, the non-adversarial text generated during perturbation optimization is added to the training of the substitute model as data augmentation to improve the adversarial transferability. Experimental results show that WordBlitz surpasses the baselines in effectiveness, efficiency, and cost. Its efficiency is especially pronounced in scenarios with broader search spaces, and its attack success rate on a Chinese dataset is higher than that of the baselines. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. AMPLIFY: attention-based mixup for performance improvement and label smoothing in transformer.
- Author
-
Yang, Leixin and Xiang, Yu
- Subjects
TRANSFORMER models, DATA augmentation, PROBLEM solving - Abstract
Mixup is an effective data augmentation method that generates new augmented samples by aggregating linear combinations of different original samples. However, if there are noises or aberrant features in the original samples, mixup may propagate them to the augmented samples, leading to over-sensitivity of the model to these outliers. To solve this problem, this paper proposes a new mixup method called AMPLIFY. This method uses the attention mechanism of Transformer itself to reduce the influence of noises and aberrant values in the original samples on the prediction results, without increasing additional trainable parameters, and the computational cost is very low, thereby avoiding the problem of high resource consumption in common mixup methods such as Sentence Mixup. The experimental results show that, under a smaller computational resource cost, AMPLIFY outperforms other mixup methods in text classification tasks on seven benchmark datasets, providing new ideas and new ways to further improve the performance of pre-trained models based on the attention mechanism, such as BERT, ALBERT, RoBERTa, and GPT. Our code can be obtained at https://github.com/kiwi-lilo/AMPLIFY. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
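The abstract above describes an attention-based mixup that aggregates linear combinations of samples while leaning on the Transformer's own attention to dampen noisy features. Since the record does not spell out the exact mechanics, the following is only a generic, hedged sketch of mixup applied to hidden representations with soft labels; AMPLIFY's actual use of the attention weights may differ.

```python
import torch

def hidden_state_mixup(h_a: torch.Tensor, h_b: torch.Tensor,
                       y_a: torch.Tensor, y_b: torch.Tensor, lam: float = 0.7):
    """Generic hidden-representation mixup: linearly combine the attention-layer outputs of two
    samples and their one-hot labels. This is an assumption-level sketch, not AMPLIFY itself."""
    h_mix = lam * h_a + (1 - lam) * h_b
    y_mix = lam * y_a + (1 - lam) * y_b   # soft labels act as a form of label smoothing
    return h_mix, y_mix

# Example on dummy Transformer hidden states (batch=2, seq_len=8, dim=16) and 3-class one-hot labels
h_a, h_b = torch.randn(2, 8, 16), torch.randn(2, 8, 16)
y_a = torch.tensor([[1., 0., 0.], [0., 1., 0.]])
y_b = torch.tensor([[0., 0., 1.], [1., 0., 0.]])
h_mix, y_mix = hidden_state_mixup(h_a, h_b, y_a, y_b)
```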
26. DART: A Solution for decentralized federated learning model robustness analysis
- Author
-
Chao Feng, Alberto Huertas Celdrán, Jan von der Assen, Enrique Tomás Martínez Beltrán, Gérôme Bovet, and Burkhard Stiller
- Subjects
Decentralized federated learning, Poisoning attack, Cybersecurity, Model robustness, Computer engineering. Computer hardware, TK7885-7895, Electronic computers. Computer science, QA75.5-76.95 - Abstract
Federated Learning (FL) has emerged as a promising approach to address privacy concerns inherent in Machine Learning (ML) practices. However, conventional FL methods, particularly those following the Centralized FL (CFL) paradigm, utilize a central server for global aggregation, which exhibits limitations such as a bottleneck and a single point of failure. To address these issues, the Decentralized FL (DFL) paradigm has been proposed, which removes the client–server boundary and enables all participants to engage in model training and aggregation tasks. Nevertheless, like CFL, DFL remains vulnerable to adversarial attacks, notably poisoning attacks that undermine model performance. While existing research on model robustness has predominantly focused on CFL, there is a noteworthy gap in understanding the model robustness of the DFL paradigm. In this paper, a thorough review of poisoning attacks targeting the model robustness in DFL systems, as well as their corresponding countermeasures, is presented. Additionally, a solution called DART is proposed to evaluate the robustness of DFL models, which is implemented and integrated into a DFL platform. Through extensive experiments, this paper compares the behavior of CFL and DFL under diverse poisoning attacks, pinpointing key factors affecting attack spread and effectiveness within the DFL. It also evaluates the performance of different defense mechanisms and investigates whether defense mechanisms designed for CFL are compatible with DFL. The empirical results provide insights into research challenges and suggest ways to improve the robustness of DFL models for future research.
- Published
- 2024
- Full Text
- View/download PDF
27. Flipover outperforms dropout in deep learning
- Author
-
Yuxuan Liang, Chuang Niu, Pingkun Yan, and Ge Wang
- Subjects
Model robustness, Regularization, Flipover, Dropout, Adversarial defense, Drawing. Design. Illustration, NC1-1940, Computer applications to medicine. Medical informatics, R858-859.7, Computer software, QA76.75-76.765 - Abstract
Flipover, an enhanced dropout technique, is introduced to improve the robustness of artificial neural networks. In contrast to dropout, which involves randomly removing certain neurons and their connections, flipover randomly selects neurons and reverts their outputs using a negative multiplier during training. This approach offers stronger regularization than conventional dropout, refining model performance by (1) mitigating overfitting, matching or even exceeding the efficacy of dropout; (2) amplifying robustness to noise; and (3) enhancing resilience against adversarial attacks. Extensive experiments across various neural networks affirm the effectiveness of flipover in deep learning.
- Published
- 2024
- Full Text
- View/download PDF
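Because the abstract above gives the mechanism explicitly (randomly selected neurons have their outputs reverted by a negative multiplier during training), it is easy to sketch as a drop-in replacement for dropout. The flip rate and multiplier below are illustrative assumptions, and details such as output rescaling are not taken from the paper.

```python
import torch
import torch.nn as nn

class Flipover(nn.Module):
    """Illustrative flipover layer: during training, randomly selected activations are multiplied
    by a negative factor instead of being zeroed as in dropout."""
    def __init__(self, p: float = 0.1, multiplier: float = -1.0):
        super().__init__()
        self.p = p
        self.multiplier = multiplier

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.training:
            return x
        flip = (torch.rand_like(x) < self.p).to(x.dtype)   # mask of neurons to flip
        return x * (1 - flip) + x * flip * self.multiplier

# Example: drop-in replacement for nn.Dropout in a small classifier
net = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), Flipover(p=0.1), nn.Linear(64, 10))
out = net(torch.randn(4, 32))
```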
28. Distance Matters: Euclidean Embedding Distances for Improved Language Model Generalization and Adaptability
- Author
-
Sultan Alshamrani
- Subjects
Language models, natural language processing, embeddings, model generalization, model robustness, model performance, Electrical engineering. Electronics. Nuclear engineering, TK1-9971 - Abstract
Large language models (LLMs) have revolutionized natural language processing (NLP), enabling machines to process, understand and generate human-like text with high accuracy. However, the current practices in training and evaluating these models often overlook the relationship between the embeddings of training and testing samples, leading to potential overfitting and limited generalization capabilities. This paper introduces a new approach to enhancing the performance, reliability, and generalization of LLMs by curating training and testing samples based on the Euclidean distances between their embeddings. The central hypothesis is that training models on samples with high Euclidean distances between training and testing embeddings, coupled with evaluations spanning diverse distances, will improve the models’ robustness and adaptability to inputs diverging from the training data distribution. The comprehensive evaluation across multiple datasets and architectures shows that models trained on samples with high Euclidean distances from the testing samples generally exhibit superior generalization and robustness compared to those trained on low-distance samples. The proposed evaluation methodology, assessing performance across a range of distances, provides a more reliable measure of a model’s true adaptability. This study provides insights into the relationship between training data diversity and model reliability, paving the way for more robust and generalizable LLMs.
- Published
- 2024
- Full Text
- View/download PDF
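The record above curates training samples by the Euclidean distance between training and testing embeddings, favoring high-distance samples for training. A hedged sketch of one way such a selection could be computed; the placeholder embeddings, the mean-distance criterion, and the farthest-first rule are assumptions rather than the paper's exact procedure.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances

def select_far_training_samples(train_emb: np.ndarray, test_emb: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k training samples whose mean Euclidean distance
    to the test embeddings is largest (a farthest-first reading of high-distance curation)."""
    dists = euclidean_distances(train_emb, test_emb)   # shape: (n_train, n_test)
    mean_dist = dists.mean(axis=1)
    return np.argsort(mean_dist)[::-1][:k]

# Example with precomputed embeddings from any text encoder (placeholder random vectors here)
rng = np.random.default_rng(0)
train_emb = rng.normal(size=(1000, 384))
test_emb = rng.normal(size=(200, 384))
hard_indices = select_far_training_samples(train_emb, test_emb, k=100)
```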
29. Research on Fine-Tuning Optimization Strategies for Large Language Models in Tabular Data Processing
- Author
-
Xiaoyong Zhao, Xingxin Leng, Lei Wang, and Ningning Wang
- Subjects
data preprocessing, data noise, fine-tuning, generalization ability, large language models, model robustness, Technology - Abstract
Recent advancements in natural language processing (NLP) have been significantly driven by the development of large language models (LLMs). Despite their impressive performance across various language tasks, these models still encounter challenges when processing tabular data. This study investigates the optimization of fine-tuning strategies for LLMs specifically in the context of tabular data processing. The focus is on the effects of decimal truncation, multi-dataset mixing, and the ordering of JSON key–value pairs on model performance. Experimental results indicate that decimal truncation reduces data noise, thereby enhancing the model’s learning efficiency. Additionally, multi-dataset mixing improves the model’s generalization and stability, while the random shuffling of key–value pair orders increases the model’s adaptability to changes in data structure. These findings underscore the significant impact of these strategies on model performance and robustness. The research provides novel insights into improving the practical effectiveness of LLMs and offers effective data processing methods for researchers in related fields. By thoroughly analyzing these strategies, this study aims to establish theoretical foundations and practical guidance for the future optimization of LLMs across a broader range of application scenarios.
- Published
- 2024
- Full Text
- View/download PDF
30. DeepAdversaries: examining the robustness of deep learning models for galaxy morphology classification
- Author
-
Ćiprijanović, Aleksandra, Kafkes, Diana, Snyder, Gregory, Sánchez, F Javier, Perdue, Gabriel Nathan, Pedro, Kevin, Nord, Brian, Madireddy, Sandeep, and Wild, Stefan M
- Subjects
Information and Computing Sciences, Machine Learning, Bioengineering, Machine Learning and Artificial Intelligence, Networking and Information Technology R&D (NITRD), convolutional neural networks, deep learning, model robustness, adversarial attacks, galaxy morphological classification, sky surveys, Applied computing, Machine learning - Abstract
With increased adoption of supervised deep learning methods for work with cosmological survey data, the assessment of data perturbation effects (that can naturally occur in the data processing and analysis pipelines) and the development of methods that increase model robustness are increasingly important. In the context of morphological classification of galaxies, we study the effects of perturbations in imaging data. In particular, we examine the consequences of using neural networks when training on baseline data and testing on perturbed data. We consider perturbations associated with two primary sources: (a) increased observational noise as represented by higher levels of Poisson noise and (b) data processing noise incurred by steps such as image compression or telescope errors as represented by one-pixel adversarial attacks. We also test the efficacy of domain adaptation techniques in mitigating the perturbation-driven errors. We use classification accuracy, latent space visualizations, and latent space distance to assess model robustness in the face of these perturbations. For deep learning models without domain adaptation, we find that processing pixel-level errors easily flip the classification into an incorrect class and that higher observational noise makes the model trained on low-noise data unable to classify galaxy morphologies. On the other hand, we show that training with domain adaptation improves model robustness and mitigates the effects of these perturbations, improving the classification accuracy up to 23% on data with higher observational noise. Domain adaptation also increases up to a factor of ≈2.3 the latent space distance between the baseline and the incorrectly classified one-pixel perturbed image, making the model more robust to inadvertent perturbations. Successful development and implementation of methods that increase model robustness in astronomical survey pipelines will help pave the way for many more uses of deep learning for astronomy.
- Published
- 2022
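The two perturbation sources examined above (higher Poisson observational noise and one-pixel adversarial changes) are straightforward to reproduce on image data. A hedged NumPy sketch of both perturbation types on a placeholder image; the exposure scaling, pixel choice, and image itself are illustrative assumptions, not the paper's pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.uniform(0, 100, size=(64, 64)).astype(np.float32)   # placeholder galaxy cutout (counts)

# (a) Increased observational noise: resample pixel counts from a Poisson distribution
#     whose expectation is a scaled version of the clean image (scale is an assumption).
exposure_scale = 0.5
noisy = rng.poisson(image * exposure_scale).astype(np.float32) / exposure_scale

# (b) One-pixel perturbation: push a single pixel to the image maximum,
#     mimicking data-processing or compression artifacts.
perturbed = image.copy()
y, x = rng.integers(0, 64, size=2)
perturbed[y, x] = image.max()

# A robustness check would compare model predictions on `image`, `noisy`, and `perturbed`.
```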
31. DocXclassifier: towards a robust and interpretable deep neural network for document image classification
- Author
-
Saifullah, Saifullah, Agne, Stefan, Dengel, Andreas, and Ahmed, Sheraz
- Published
- 2024
- Full Text
- View/download PDF
32. Flipover outperforms dropout in deep learning
- Author
-
Liang, Yuxuan, Niu, Chuang, Yan, Pingkun, and Wang, Ge
- Published
- 2024
- Full Text
- View/download PDF
33. A Post-training Framework for Improving the Performance of Deep Learning Models via Model Transformation.
- Author
-
Jiang, Jiajun, Yang, Junjie, Zhang, Yingyi, Wang, Zan, You, Hanmo, and Chen, Junjie
- Subjects
DEEP learning, ARTIFICIAL neural networks, REGRESSION analysis - Abstract
Deep learning (DL) techniques have attracted much attention in recent years and have been applied to many application scenarios. To improve the performance of DL models regarding different properties, many approaches have been proposed in the past decades, such as improving the robustness and fairness of DL models to meet the requirements for practical use. Among existing approaches, post-training is an effective method that has been widely adopted in practice due to its high efficiency and good performance. Nevertheless, its performance is still limited due to the incompleteness of training data. Additionally, existing approaches are always designed for specific tasks, such as improving model robustness, and cannot be used for other purposes. In this article, we aim to fill this gap and propose an effective and general post-training framework, which can be adapted to improve the model performance from different aspects. Specifically, it incorporates a novel model transformation technique that transforms a classification model into an isomorphic regression model for fine-tuning, which can effectively overcome the problem of incomplete training data by forcing the model to strengthen the memory of crucial input features and thus improve the model performance eventually. To evaluate the performance of our framework, we have adapted it to two emerging tasks for improving DL models, i.e., robustness and fairness improvement, and conducted extensive studies by comparing it with state-of-the-art approaches. The experimental results demonstrate that our framework is indeed general, as it is effective in both tasks. Specifically, in the task of robustness improvement, our approach Dare achieved the best results in 61.1% of cases (vs. 11.1% of cases for the baselines). In the task of fairness improvement, our approach FMT can effectively improve the fairness without sacrificing the accuracy of the models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Spectroscopy‐based prediction of 73 wheat quality parameters and insights for practical applications.
- Author
-
Nagel‐Held, Johannes, El Hassouni, Khaoula, Longin, Friedrich, and Hitzmann, Bernd
- Abstract
Background and Objectives: Quality assessment of bread wheat is time‐consuming and requires the determination of many complex characteristics. Because of its simplicity, protein content prediction using near‐infrared spectroscopy (NIRS) serves as the primary quality attribute in wheat trade. To enable the prediction of more complex traits, information from Raman and fluorescence spectra is added to the NIR spectra of whole grain and extracted flour. Model robustness is assessed by predictions across cultivars, locations, and years. The prediction error is corrected for the measurement error of the reference methods. Findings: Successful prediction, robustness testing, and measurement error correction were achieved for several parameters. Predicting loaf volume yielded a corrected prediction error RMSECV of 27.5 mL/100 g flour and an R² of 0.86. However, model robustness was limited due to data distribution, environmental factors, and temporal influences. Conclusions: The proposed method was proven to be suitable for applications in the wheat value chain. Furthermore, the study provides valuable insights for practical implementations. Significance and Novelty: With up to 1200 wheat samples, this is the largest study on predicting complex characteristics comprising agronomic traits; dough rheological parameters measured by Extensograph, micro‐doughLAB, and GlutoPeak; baking trial parameters like loaf volume; and specific ingredients, such as grain protein content, sugars, and minerals. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. AMPLIFY: attention-based mixup for performance improvement and label smoothing in transformer
- Author
-
Leixin Yang and Yu Xiang
- Subjects
Mixup, Attention mechanism, Data augmentation, Model robustness, Label smoothing, Electronic computers. Computer science, QA75.5-76.95 - Abstract
Mixup is an effective data augmentation method that generates new augmented samples by aggregating linear combinations of different original samples. However, if there are noises or aberrant features in the original samples, mixup may propagate them to the augmented samples, leading to over-sensitivity of the model to these outliers. To solve this problem, this paper proposes a new mixup method called AMPLIFY. This method uses the attention mechanism of Transformer itself to reduce the influence of noises and aberrant values in the original samples on the prediction results, without increasing additional trainable parameters, and the computational cost is very low, thereby avoiding the problem of high resource consumption in common mixup methods such as Sentence Mixup. The experimental results show that, under a smaller computational resource cost, AMPLIFY outperforms other mixup methods in text classification tasks on seven benchmark datasets, providing new ideas and new ways to further improve the performance of pre-trained models based on the attention mechanism, such as BERT, ALBERT, RoBERTa, and GPT. Our code can be obtained at https://github.com/kiwi-lilo/AMPLIFY.
- Published
- 2024
- Full Text
- View/download PDF
36. Assessment of landslide susceptibility mapping based on XGBoost model: A case study of Yanshan Township
- Author
-
Hongyang WU, Chao ZHOU, Xin LIANG, Pengcheng YUAN, and Lanbing YU
- Subjects
landslides, landslide susceptibility mapping, extreme gradient boosting model, prediction accuracy, model robustness, Geology, QE1-996.5 - Abstract
Landslide susceptibility assessment forms the foundation for precise evaluation of landslide risk. To enhance the accuracy and robustness of landslide susceptibility mapping, a state-of-the-art machine learning algorithm, the extreme gradient boosting model (XGBoost), was introduced in this study. Yanshan Town in Wanzhou district, Three Gorges reservoir, was chosen as a case study. Nine influencing factors, including engineering geological lithology and thickness of the deposit layer, were selected to construct the landslide susceptibility evaluation index system. The relationship between landslide development and these indicators was quantitatively analyzed using the information value model. Subsequently, 70% of landslide samples were randomly assigned for training, while the remaining 30% were used for validation. The XGBoost model was then employed for landslide susceptibility mapping. The outputs were compared with those of the decision tree model (DT) and gradient boosting decision tree (GBDT) in terms of prediction accuracy and model stability. The findings revealed that distance to the Yangtze River, soil thickness, and lithology were the primary factors influencing landslide development. The XGBoost model demonstrated the highest average prediction accuracy (97.3%) in 100 repeated trials, surpassing the DT (91.3%) and GBDT models. Moreover, the XGBoost model exhibited superior robustness, with a standard deviation and coefficient of variation of 0.01, lower than those of the other two models. It also achieved the highest accuracy (94.3%) and prediction accuracy (97.3%) in the validation process. The proposed XGBoost model serves as a reliable assessment method and yields optimal results in regional landslide susceptibility mapping.
- Published
- 2023
- Full Text
- View/download PDF
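The workflow above (nine influencing factors, a 70/30 train–validation split, and an XGBoost classifier scored on prediction accuracy) maps onto a short training script. A hedged sketch with the xgboost and scikit-learn libraries; the synthetic feature matrix, labels, and hyperparameters are placeholders, not the study's data or tuned settings.

```python
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Placeholder feature matrix: nine influencing factors per mapping unit
# (e.g., distance to river, soil thickness, lithology class, slope, ...), with landslide labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 9))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=2000) > 0).astype(int)

# 70% of samples for training, 30% for validation, mirroring the study's split
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=42)

model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1, eval_metric="logloss")
model.fit(X_tr, y_tr)
print("validation accuracy:", accuracy_score(y_va, model.predict(X_va)))
```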
37. Advancing Model Generalization in Continuous Cyclic Test-Time Adaptation with Matrix Perturbation Noise
- Author
-
Jinshen Jiang, Hao Yang, Lin Yang, and Yun Zhou
- Subjects
deep learning, model generalization, distribution shift, test-time adaptation, model robustness, Mathematics, QA1-939 - Abstract
Test-time adaptation (TTA) aims to optimize source-pretrained model parameters to target domains using only unlabeled test data. However, traditional TTA methods often risk overfitting to the specific, localized test domains, leading to compromised generalization. Moreover, these methods generally presume static target domains, neglecting the dynamic and cyclic nature of real-world settings. To alleviate this limitation, this paper explores the continuous cyclic test-time adaptation (CycleTTA) setting. Our unique approach within this setting employs matrix-wise perturbation noise in batch-normalization statistics to enhance the adaptability of source-pretrained models to dynamically changing target domains, without the need for additional parameters. We demonstrated the effectiveness of our method through extensive experiments, where our approach reduced the average error by 39.8% on the CIFAR10-C dataset using the WideResNet-28-10 model, by 38.8% using the WideResNet-40-2 model, and by 33.8% using the PreActResNet-18 model. Additionally, on the CIFAR100-C dataset with the WideResNet-40-2 model, our method reduced the average error by 5.3%, showcasing significant improvements in model generalization in continuous cyclic testing scenarios.
- Published
- 2024
- Full Text
- View/download PDF
38. MIRS: [MASK] Insertion Based Retrieval Stabilizer for Query Variations
- Author
-
Liu, Junping, Gong, Mingkang, Hu, Xinrong, Yang, Jie, and Guo, Yi
- Published
- 2023
- Full Text
- View/download PDF
39. Robust Educational Dialogue Act Classifiers with Low-Resource and Imbalanced Datasets
- Author
-
Lin, Jionghao, Tan, Wei, Nguyen, Ngoc Dang, Lang, David, Du, Lan, Buntine, Wray, Beare, Richard, Chen, Guanliang, and Gašević, Dragan
- Published
- 2023
- Full Text
- View/download PDF
40. Discrepant Semantic Diffusion Boosts Transfer Learning Robustness.
- Author
-
Gao, Yajun, Bai, Shihao, Zhao, Xiaowei, Gong, Ruihao, Wu, Yan, and Ma, Yuqing
- Subjects
ARTIFICIAL intelligence, BOOSTING algorithms, GENERALIZATION - Abstract
Transfer learning could improve the robustness and generalization of the model, reducing potential privacy and security risks. It operates by fine-tuning a pre-trained model on downstream datasets. This process not only enhances the model's capacity to acquire generalizable features but also ensures an effective alignment between upstream and downstream knowledge domains. Transfer learning can effectively speed up the model convergence when adapting to novel tasks, thereby leading to the efficient conservation of both data and computational resources. However, existing methods often neglect the discrepant downstream–upstream connections. Instead, they rigidly preserve the upstream information without an adequate regularization of the downstream semantic discrepancy. Consequently, this results in weak generalization, issues with collapsed classification, and an overall inferior performance. The main reason lies in the collapsed downstream–upstream connection due to the mismatched semantic granularity. Therefore, we propose a discrepant semantic diffusion method for transfer learning, which could adjust the mismatched semantic granularity and alleviate the collapsed classification problem to improve the transfer learning performance. Specifically, the proposed framework consists of a Prior-Guided Diffusion for pre-training and a discrepant diffusion for fine-tuning. Firstly, the Prior-Guided Diffusion aims to empower the pre-trained model with the semantic-diffusion ability. This is achieved through a semantic prior, which consequently provides a more robust pre-trained model for downstream classification. Secondly, the discrepant diffusion focuses on encouraging semantic diffusion. Its design intends to avoid the unwanted semantic centralization, which often causes the collapsed classification. Furthermore, it is constrained by the semantic discrepancy, serving to elevate the downstream discrimination capabilities. Extensive experiments on eight prevalent downstream classification datasets confirm that our method can outperform a number of state-of-the-art approaches, especially for fine-grained datasets or datasets dissimilar to upstream data (e.g., 3.75% improvement for Cars dataset and 1.79% improvement for SUN dataset under the few-shot setting with 15% data). Furthermore, the experiments of data sparsity caused by privacy protection successfully validate our proposed method's effectiveness in the field of artificial intelligence security. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
41. Self-supervised machine learning using adult inpatient data produces effective models for pediatric clinical prediction tasks.
- Author
-
Lemmon, Joshua, Guo, Lin Lawrence, Steinberg, Ethan, Morse, Keith E, Fleming, Scott Lanyon, Aftandilian, Catherine, Pfohl, Stephen R, Posada, Jose D, Shah, Nigam, Fries, Jason, and Sung, Lillian
- Abstract
Objective Development of electronic health record (EHR)-based machine learning models for pediatric inpatients is challenged by limited training data. Self-supervised learning using adult data may be a promising approach to creating robust pediatric prediction models. The primary objective was to determine whether a self-supervised model trained in adult inpatients was noninferior to logistic regression models trained in pediatric inpatients for pediatric inpatient clinical prediction tasks. Materials and Methods This retrospective cohort study used EHR data and included patients with at least one admission to an inpatient unit. One admission per patient was randomly selected. Adult inpatients were 18 years or older, while pediatric inpatients were more than 28 days old and less than 18 years old. Admissions were temporally split into training (January 1, 2008 to December 31, 2019), validation (January 1, 2020 to December 31, 2020), and test (January 1, 2021 to August 1, 2022) sets. The primary comparison was a self-supervised model trained in adult inpatients versus count-based logistic regression models trained in pediatric inpatients. The primary outcome was the mean area under the receiver operating characteristic curve (AUROC) across 11 distinct clinical outcomes. Models were evaluated in pediatric inpatients. Results When evaluated in pediatric inpatients, the mean AUROC of the self-supervised model trained in adult inpatients (0.902) was noninferior to that of count-based logistic regression models trained in pediatric inpatients (0.868) (mean difference = 0.034, 95% CI = 0.014–0.057; P < .001 for noninferiority and P = .006 for superiority). Conclusions Self-supervised learning in adult inpatients was noninferior to logistic regression models trained in pediatric inpatients. This finding suggests that self-supervised models trained in adult patients can transfer to pediatric patients without requiring costly model retraining. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
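A minimal sketch of the comparison protocol described above — a count-based logistic regression baseline scored by AUROC against a pretrained model's scores, with a noninferiority margin — is given below. Everything in it is an illustrative assumption: the count features, labels, pretrained scores, and margin are placeholders, not the study's data or pipeline.
```python
# Minimal sketch of a count-based logistic regression baseline evaluated by AUROC.
# Hypothetical data: X holds per-admission counts of clinical codes, y a binary outcome.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_train, n_test, n_codes = 2000, 500, 300

# Placeholder count features (e.g., diagnosis/medication code counts per admission).
X_train = rng.poisson(0.3, size=(n_train, n_codes))
X_test = rng.poisson(0.3, size=(n_test, n_codes))
y_train = rng.integers(0, 2, size=n_train)
y_test = rng.integers(0, 2, size=n_test)

# Count-based baseline: L2-regularized logistic regression on raw code counts.
baseline = LogisticRegression(max_iter=1000, C=1.0)
baseline.fit(X_train, y_train)
auroc_baseline = roc_auc_score(y_test, baseline.predict_proba(X_test)[:, 1])

# A pretrained model's scores would be compared on the same test admissions;
# here they are random stand-ins purely to show the comparison step.
scores_pretrained = rng.random(n_test)
auroc_pretrained = roc_auc_score(y_test, scores_pretrained)

# Noninferiority check against a chosen margin (the value here is illustrative only).
margin = 0.05
print(f"baseline AUROC={auroc_baseline:.3f}, pretrained AUROC={auroc_pretrained:.3f}")
print("noninferior" if auroc_pretrained >= auroc_baseline - margin else "inferior")
```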
42. Securing recommender system via cooperative training.
- Author
-
Wang, Qingyang, Wu, Chenwang, Lian, Defu, and Chen, Enhong
- Subjects
- *
RECOMMENDER systems , *BILEVEL programming , *ELECTRONIC data processing , *POISONING - Abstract
Recommender systems are often susceptible to well-crafted fake profiles, leading to biased recommendations. Among existing defense methods, data-processing-based methods inevitably exclude normal samples, while model-based methods struggle to enjoy both generalization and robustness. To this end, we propose to integrate data processing with robust model design in a general framework, Triple Cooperative Defense (TCD), which employs three cooperative models that mutually enhance the data and thereby improve recommendation robustness. Furthermore, considering that existing attacks struggle to balance bi-level optimization and efficiency, we revisit poisoning attacks in recommender systems and introduce an efficient attack strategy, Co-training Attack (CoAttack), which cooperatively optimizes attack generation and model training, respecting the bi-level setting while maintaining attack efficiency. Moreover, we reveal that a potential reason for the insufficient threat of existing attacks is their default assumption of optimizing attacks in undefended scenarios; this overly optimistic setting limits their potential. Consequently, we put forth a Game-based Co-training Attack (GCoAttack), which frames the proposed CoAttack and TCD as a game-theoretic process, thoroughly exploring CoAttack's attack potential in the cooperative training of attack and defense. Extensive experiments on three real datasets demonstrate TCD's superiority in enhancing model robustness. Additionally, we verify that the two proposed attack strategies significantly outperform existing attacks, with the game-based GCoAttack posing a greater poisoning threat than CoAttack. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
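The cooperative idea behind TCD — three models that mutually enhance each other's training data — can be illustrated schematically. The sketch below uses three generic classifiers and plain confidence-based pseudo-labelling; it is not the paper's recommender-system formulation, and the models, thresholds, and data are assumptions for illustration only.
```python
# Schematic tri-model cooperative training: when two models agree confidently on an
# unlabeled example, the agreed label is added to the third model's training data.
# This only illustrates the cooperation pattern; it is not the TCD recommender setup.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X_lab = rng.normal(size=(200, 10))
y_lab = (X_lab[:, 0] > 0).astype(int)          # toy labeled set
X_unl = rng.normal(size=(1000, 10))            # toy unlabeled pool

models = [LogisticRegression(max_iter=500),
          RandomForestClassifier(n_estimators=50, random_state=0),
          DecisionTreeClassifier(max_depth=5, random_state=0)]
train_sets = [(X_lab.copy(), y_lab.copy()) for _ in models]

for round_ in range(3):
    for m, (X, y) in zip(models, train_sets):
        m.fit(X, y)
    for k in range(3):                          # augment model k using the other two
        i, j = [idx for idx in range(3) if idx != k]
        p_i = models[i].predict_proba(X_unl)[:, 1]
        p_j = models[j].predict_proba(X_unl)[:, 1]
        agree_pos = (p_i > 0.9) & (p_j > 0.9)
        agree_neg = (p_i < 0.1) & (p_j < 0.1)
        mask = agree_pos | agree_neg
        if mask.any():
            X_k, y_k = train_sets[k]
            train_sets[k] = (np.vstack([X_k, X_unl[mask]]),
                             np.concatenate([y_k, agree_pos[mask].astype(int)]))
```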
43. On Checking Robustness on Named Entity Recognition with Pre-trained Transformers Models.
- Author
-
GARCÍA-PABLOS, Aitor, MANDRAVICKAITĖ, Justina, and VERŠINSKIENĖ, Egidija
- Subjects
SOCIAL networks ,SOCIAL media - Abstract
In this paper, we conduct a series of experiments with several state-of-the-art models based on the Transformer architecture to perform Named Entity Recognition and Classification (NERC) on text of different styles (social networks vs. news) and languages, and with different levels of noise. We use publicly available datasets such as WNUT17, CoNLL2002, and CoNLL2003. Furthermore, we synthetically add extra levels of noise (random capitalization, random character additions/replacements/removals, etc.) to study its impact and the robustness of the models. The Transformer models we compare (mBERT, CANINE, mDeBERTa) use different tokenisation strategies (subword-based vs. character-based), which may exhibit different levels of robustness towards certain types of noise. The experiments show that the subword-based models (mBERT and mDeBERTa) tend to achieve higher scores, especially on clean text. However, as the amount of noise increases, character-based tokenisation exhibits a smaller performance drop, suggesting that models such as CANINE may be better candidates for dealing with noisy text. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
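The synthetic-noise protocol described in the abstract above (random capitalization plus random character additions, replacements, and removals) is easy to reproduce. Below is a minimal sketch; the noise rate and the mix of edit operations are assumed parameters, not the paper's exact settings.
```python
# Minimal sketch of the kind of synthetic character noise used to stress-test NER models:
# random capitalization plus random character insertions, replacements, and deletions.
import random
import string

def add_noise(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Corrupt roughly `rate` of characters; the rate and edit mix are illustrative."""
    rng = random.Random(seed)
    out = []
    for ch in text:
        if rng.random() >= rate:
            out.append(ch)          # keep most characters untouched
            continue
        op = rng.choice(["case", "insert", "replace", "delete"])
        if op == "case":
            out.append(ch.swapcase())
        elif op == "insert":
            out.append(ch + rng.choice(string.ascii_lowercase))
        elif op == "replace":
            out.append(rng.choice(string.ascii_lowercase))
        # "delete": append nothing, dropping the character
    return "".join(out)

print(add_noise("Barack Obama visited Berlin in 2013.", rate=0.2))
```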
44. Landslide susceptibility evaluation and zoning of Yanshan Township in the Three Gorges Reservoir area based on the XGBoost model.
- Author
-
吴宏阳, 周超, 梁鑫, 袁鹏程, and 余蓝冰
- Abstract
Copyright of Chinese Journal of Geological Hazard & Control is the property of China Institute of Geological Environmental Monitoring (CIGEM) Editorial Department and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2023
- Full Text
- View/download PDF
45. Model Robustness Optimization Method Using GAN and Feature Pyramid
- Author
-
SUN Jiaze, TANG Yanmei, WANG Shuyan
- Subjects
generative adversarial network (gan) ,deep neural networks ,adversarial sample ,feature pyramid ,model robustness ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
In adversarial settings, deep neural networks are clearly vulnerable to adversarial samples. To improve model robustness in such environments, this paper proposes AdvRob, a robustness optimization method for deep neural network models. Firstly, the target model is transformed into a feature pyramid structure; then the prior knowledge of latent features is used to generate more aggressive adversarial samples for adversarial training. Experiments on the MNIST and CIFAR-10 datasets show that the adversarial samples generated using latent features have a higher attack success rate, greater diversity, and stronger transferability than those of the AdvGAN method. Under strong perturbations on the MNIST dataset, compared with the original model, AdvRob improves the defense against FGSM and JSMA attacks by at least 4 times and against PGD, BIM, and C&W attacks by at least 10 times. On the CIFAR-10 dataset, the defense against FGSM, PGD, C&W, BIM, and JSMA attacks is improved by at least 5 times over the original model, and the defensive effect is pronounced. On the SVHN dataset, compared with FGSM adversarial training, PGD adversarial training, defensive distillation, and robustness optimization methods that add external modules, AdvRob has the most significant defensive effect against white-box attacks. It thus provides an efficient robustness optimization method for DNN models in adversarial environments.
- Published
- 2023
- Full Text
- View/download PDF
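AdvRob itself (feature-pyramid restructuring plus latent-feature-guided adversarial sample generation) is specific to the paper, but the adversarial-training loop it builds on and is compared against (e.g., FGSM adversarial training) can be sketched generically. The model, epsilon, and data loader below are placeholders, not the paper's implementation.
```python
# Generic FGSM adversarial-training loop of the kind AdvRob is compared against;
# not the AdvRob method itself. Model, epsilon, and loader are illustrative stand-ins.
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, eps):
    """Craft an FGSM-perturbed copy of x that maximizes the loss within an eps ball."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()

def adversarial_training_epoch(model, loader, optimizer, eps=8 / 255, device="cpu"):
    model.train()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = fgsm_example(model, x, y, eps)        # perturbed inputs, crafted on the fly
        optimizer.zero_grad()
        loss = 0.5 * (F.cross_entropy(model(x), y)            # clean term
                      + F.cross_entropy(model(x_adv), y))      # adversarial term
        loss.backward()
        optimizer.step()
```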
46. WordBlitz: An Efficient Hard-Label Textual Adversarial Attack Method Jointly Leveraging Adversarial Transferability and Word Importance
- Author
-
Xiangge Li, Hong Luo, and Yan Sun
- Subjects
natural language processing ,textual attack ,hard label ,adversarial samples ,model robustness ,Technology ,Engineering (General). Civil engineering (General) ,TA1-2040 ,Biology (General) ,QH301-705.5 ,Physics ,QC1-999 ,Chemistry ,QD1-999 - Abstract
Existing textual attacks mostly perturb keywords in sentences to generate adversarial examples, relying on the prediction confidence of victim models. In practice, attackers can often access only the predicted label, and the victim can readily defend against such hard-label attacks by denying access based on query frequency. In this paper, we propose an efficient hard-label attack approach called WordBlitz. First, exploiting adversarial transferability, we train a substitute model to initialize the attack parameter set, which includes a candidate pool and two weight tables of keywords and candidate words. Then, adversarial examples are generated and optimized under the guidance of the two weight tables. During optimization, we design a hybrid local search algorithm with word importance to find the globally optimal solution while updating the two weight tables according to the attack results. Finally, the non-adversarial text generated during perturbation optimization is added to the training of the substitute model as data augmentation to improve adversarial transferability. Experimental results show that WordBlitz surpasses the baselines in effectiveness, efficiency, and cost. Its efficiency advantage is especially pronounced in scenarios with broader search spaces, and its attack success rate on a Chinese dataset is higher than that of the baselines.
- Published
- 2024
- Full Text
- View/download PDF
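WordBlitz's full machinery (substitute model, dual weight tables, hybrid local search) is involved, but the outer hard-label loop it refines — rank words by an importance estimate, then greedily substitute candidates until the predicted label flips — can be sketched as follows. The victim function, importance scores, and candidate pool are assumed inputs, not the paper's components.
```python
# Schematic hard-label word-substitution attack: greedily replace the most important
# words with candidates until the victim's predicted label changes. This shows only the
# outer loop that WordBlitz refines; importance scores and candidates are placeholders.
from typing import Callable, Dict, List

def hard_label_attack(
    tokens: List[str],
    victim_label: Callable[[List[str]], int],    # returns only a class id (hard label)
    importance: Dict[str, float],                # e.g., estimated with a substitute model
    candidates: Dict[str, List[str]],            # per-word substitution pool
    max_changes: int = 5,
) -> List[str]:
    original = victim_label(tokens)
    adv = list(tokens)
    # Visit positions in decreasing estimated importance.
    order = sorted(range(len(adv)),
                   key=lambda i: importance.get(adv[i], 0.0), reverse=True)
    changes = 0
    for i in order:
        if changes >= max_changes:
            break
        for cand in candidates.get(adv[i], []):
            trial = adv[:i] + [cand] + adv[i + 1:]
            if victim_label(trial) != original:   # success: predicted label flipped
                return trial
        # No flip yet: commit the first candidate and keep perturbing other positions.
        pool = candidates.get(adv[i], [])
        if pool:
            adv[i] = pool[0]
            changes += 1
    return adv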
47. AI-Based Glioma Grading for a Trustworthy Diagnosis: An Analytical Pipeline for Improved Reliability.
- Author
-
Pitarch, Carla, Ribas, Vicent, and Vellido, Alfredo
- Subjects
- *
RELIABILITY (Personality trait) , *DIGITAL image processing , *DEEP learning , *CLINICAL decision support systems , *GLIOMAS , *MACHINE learning , *MAGNETIC resonance imaging , *ARTIFICIAL intelligence , *RESEARCH funding , *AUTOMATION , *COMPUTER-aided diagnosis , *ARTIFICIAL neural networks , *PREDICTION models , *TUMOR grading , *ALGORITHMS , *TRUST - Abstract
Simple Summary: Accurately grading gliomas, which are the most common and aggressive malignant brain tumors in adults, poses a significant challenge for radiologists. This study explores the application of Deep Learning techniques in assisting tumor grading using Magnetic Resonance Images (MRIs). By analyzing a glioma database sourced from multiple public datasets and comparing different settings, the aim of this study is to develop a robust and reliable grading system. The study demonstrates that by focusing on the tumor region of interest and augmenting the available data, there is a significant improvement in both the accuracy and confidence of tumor grade classifications. While successful in differentiating low-grade gliomas from high-grade gliomas, the accurate classification of grades 2, 3, and 4 remains challenging. The research findings have significant implications for advancing the development of a non-invasive, robust, and trustworthy data-driven system to support clinicians in the diagnosis and therapy planning of glioma patients. Glioma is the most common type of tumor in humans originating in the brain. According to the World Health Organization, gliomas can be graded on a four-stage scale, ranging from the most benign to the most malignant. The grading of these tumors from image information is a far from trivial task for radiologists and one in which they could be assisted by machine-learning-based decision support. However, the machine learning analytical pipeline is also fraught with perils stemming from different sources, such as inadvertent data leakage, adequacy of 2D image sampling, or classifier assessment biases. In this paper, we analyze a glioma database sourced from multiple datasets using a simple classifier, aiming to obtain a reliable tumor grading and, on the way, we provide a few guidelines to ensure such reliability. Our results reveal that by focusing on the tumor region of interest and using data augmentation techniques we significantly enhanced the accuracy and confidence in tumor classifications. Evaluation on an independent test set resulted in an AUC-ROC of 0.932 in the discrimination of low-grade gliomas from high-grade gliomas, and an AUC-ROC of 0.893 in the classification of grades 2, 3, and 4. The study also highlights the importance of providing, beyond generic classification performance, measures of how reliable and trustworthy the model's output is, thus assessing the model's certainty and robustness. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
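The two levers credited above for the accuracy and confidence gains — cropping to the tumor region of interest and augmenting the training images — together with a per-case confidence readout can be sketched in a few lines. The crop box, transforms, and model below are illustrative assumptions, not the study's pipeline.
```python
# Illustrative ROI-crop + augmentation pipeline and a per-case confidence readout;
# crop coordinates, transforms, and the model are placeholders, not the study's setup.
import torch
import torchvision.transforms as T

def crop_to_roi(image: torch.Tensor, box: tuple) -> torch.Tensor:
    """Crop a CxHxW tensor to a (top, left, height, width) tumor bounding box."""
    top, left, h, w = box
    return image[:, top:top + h, left:left + w]

# Augmentations applied to ROI crops during training (illustrative choices).
train_augment = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomVerticalFlip(p=0.5),
    T.RandomRotation(degrees=15),
    T.Resize((224, 224), antialias=True),
])

def predict_with_confidence(model: torch.nn.Module, image: torch.Tensor, box: tuple):
    """Return the predicted grade index and the softmax probability backing it."""
    model.eval()
    x = T.Resize((224, 224), antialias=True)(crop_to_roi(image, box)).unsqueeze(0)
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=1).squeeze(0)
    grade = int(torch.argmax(probs))
    confidence = float(probs[grade])     # report alongside the prediction
    return grade, confidence
```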
48. EBCDet: Energy-Based Curriculum for Robust Domain Adaptive Object Detection
- Author
-
Amin Banitalebi-Dehkordi, Abdollah Amirkhani, and Alireza Mohammadinasab
- Subjects
Object detection ,domain adaptation ,energy ,model robustness ,curriculum learning ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
This paper proposes a new method for addressing the problem of unsupervised domain adaptation for robust object detection. To this end, we propose an energy-based curriculum for progressively adapting a model, thereby mitigating the pseudo-label noise caused by domain shifts. Throughout the adaptation process, we also make use of spatial domain mixing as well as knowledge distillation to improve pseudo-label reliability. Our method does not require any modifications in the model architecture or any special training tricks or complications. Our end-to-end pipeline, although simple, proves effective in adapting object detector neural networks. To verify our method, we perform an extensive systematic set of experiments on: synthetic-to-real scenario, cross-camera setup, cross-domain artistic datasets, and image corruption benchmarks, and establish a new state-of-the-art in several cases. For example, compared to the best existing baselines, our Energy-Based Curriculum learning method for robust object Detection (EBCDet) achieves: 1–3% AP50 improvement on Sim10k-to-Cityscapes and KITTI-to-Cityscapes, 3–4% AP50 boost on Pascal-VOC-to-Comic, WaterColor, and ClipArt, and 1–5% relative robustness improvement on Pascal-C, COCO-C, and Cityscapes-C (1–2% absolute mPC). Code is available at: https://github.com/AutomotiveML/EBCDet.
- Published
- 2023
- Full Text
- View/download PDF
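The curriculum ingredient of EBCDet — ordering target-domain samples by an energy score so that adaptation proceeds from the most to the least reliable pseudo-labels — can be sketched independently of the detection pipeline. The logits and stage fractions below are placeholders, not EBCDet's actual schedule.
```python
# Sketch of energy-based curriculum ordering for self-training: score each unlabeled
# sample by the free energy of its logits (lower energy ~ more in-distribution, hence
# a more reliable pseudo-label) and feed samples to adaptation in easy-to-hard stages.
import torch

def energy_score(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Free energy E(x) = -T * logsumexp(logits / T); lower means more confident."""
    return -temperature * torch.logsumexp(logits / temperature, dim=1)

def curriculum_stages(logits: torch.Tensor, fractions=(0.3, 0.6, 1.0)):
    """Yield index sets of increasing difficulty, sorted by energy (easiest first)."""
    order = torch.argsort(energy_score(logits))      # ascending energy
    n = logits.shape[0]
    for frac in fractions:
        yield order[: int(frac * n)]

# Example with random stand-in logits for 1000 target-domain samples and 20 classes.
logits = torch.randn(1000, 20)
for stage, idx in enumerate(curriculum_stages(logits)):
    # In a real pipeline, pseudo-label and adapt the detector on `idx` at this stage.
    print(f"stage {stage}: {len(idx)} samples")
```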
49. Towards Robust Recommender Systems via Triple Cooperative Defense
- Author
-
Wang, Qingyang, Lian, Defu, Wu, Chenwang, Chen, Enhong, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Chbeir, Richard, editor, Huang, Helen, editor, Silvestri, Fabrizio, editor, Manolopoulos, Yannis, editor, and Zhang, Yanchun, editor
- Published
- 2022
- Full Text
- View/download PDF
50. Improving Robustness by Enhancing Weak Subnets
- Author
-
Guo, Yong, Stutz, David, Schiele, Bernt, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Avidan, Shai, editor, Brostow, Gabriel, editor, Cissé, Moustapha, editor, Farinella, Giovanni Maria, editor, and Hassner, Tal, editor
- Published
- 2022
- Full Text
- View/download PDF