5,428 results for "Xu, Yao"
Search Results
2. Neural-Symbolic Collaborative Distillation: Advancing Small Language Models for Complex Reasoning Tasks
- Author
Liao, Huanxuan, He, Shizhu, Xu, Yao, Zhang, Yuanzhe, Liu, Kang, and Zhao, Jun
- Subjects
Computer Science - Computation and Language
- Abstract
In this paper, we propose $\textbf{Ne}$ural-$\textbf{Sy}$mbolic $\textbf{C}$ollaborative $\textbf{D}$istillation ($\textbf{NesyCD}$), a novel knowledge distillation method for learning the complex reasoning abilities of Large Language Models (LLMs, e.g., > 13B). We argue that complex reasoning tasks are difficult for Small Language Models (SLMs, e.g., $\leq$ 7B), as these tasks demand not only general cognitive abilities but also specialized knowledge, which is often sparse and difficult for these neural-based SLMs to effectively capture. Therefore, NesyCD distills the general capabilities and specialized knowledge in LLMs in different manners. On the one hand, we distill only general abilities from teacher LLMs into the student SLMs of parameterized neural networks. On the other hand, for the specialized abilities and uncommon knowledge of a complex reasoning task, we employ a symbolic knowledge distillation approach to obtain and store the specialized knowledge within a symbolic knowledge base (KB). By decoupling general and specialized capabilities, the proposed NesyCD can achieve superior performance cost-effectively, utilizing smaller models and blending parameterized neural networks with a symbolic KB. Moreover, the specialized KB generalizes well and can be comprehended and manipulated by humans. Our experiments show that NesyCD significantly boosts SLMs' complex reasoning performance on in-domain (BBH, GSM8K) and out-of-domain (AGIEval, ARC) datasets. Notably, our approach enabled the LLaMA3-8B and Qwen2-7B to surpass GPT-3.5-turbo in performance and come close to matching LLaMA3-70B, despite the latter having nine times more parameters. Our code will be available at https://github.com/Xnhyacinth/NesyCD.
- Published
- 2024
3. Enhancing Outlier Knowledge for Few-Shot Out-of-Distribution Detection with Extensible Local Prompts
- Author
Zeng, Fanhu, Cheng, Zhen, Zhu, Fei, and Zhang, Xu-Yao
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Out-of-Distribution (OOD) detection, aiming to distinguish outliers from known categories, has gained prominence in practical scenarios. Recently, the advent of vision-language models (VLM) has heightened interest in enhancing OOD detection for VLM through few-shot tuning. However, existing methods mainly focus on optimizing global prompts, ignoring refined utilization of local information with regard to outliers. Motivated by this, we freeze global prompts and introduce a novel coarse-to-fine tuning paradigm to emphasize regional enhancement with local prompts. Our method comprises two integral components: global prompt guided negative augmentation and local prompt enhanced regional regularization. The former utilizes frozen, coarse global prompts as guiding cues to incorporate negative augmentation, thereby leveraging local outlier knowledge. The latter employs trainable local prompts and a regional regularization to capture local information effectively, aiding in outlier identification. We also propose a regional-related metric to empower the enrichment of OOD detection. Moreover, since our approach explores enhancing local prompts only, it can be seamlessly integrated with trained global prompts during inference to boost performance. Comprehensive experiments demonstrate the effectiveness and potential of our method. Notably, our method reduces average FPR95 by 5.17% against the state-of-the-art method in 4-shot tuning on the challenging ImageNet-1k dataset, even outperforming the 16-shot results of previous methods.
- Published
- 2024
4. Recoverable Compression: A Multimodal Vision Token Recovery Mechanism Guided by Text Information
- Author
Chen, Yi, Xu, Jian, Zhang, Xu-Yao, Liu, Wen-Zhuo, Liu, Yang-Yang, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
With the advancement of large-scale language modeling techniques, large multimodal models combining visual encoders with large language models have demonstrated exceptional performance in various visual tasks. Most of the current large-scale multimodal models achieve this by mapping visual features obtained from the visual encoder into a large language model and using them as inputs alongside text for downstream tasks. Therefore, the number of visual tokens directly affects the training and inference speed of the model. There has been significant work on token pruning for visual transformers, but for large multimodal models, only relying on visual information for token pruning or compression may lead to significant loss of important information. On the other hand, the textual input in the form of a question may contain valuable information that can aid in answering the question, providing additional knowledge to the model. To address the potential oversimplification and excessive pruning that can occur with most purely visual token pruning methods, we propose a text information-guided dynamic visual token recovery mechanism that does not require training. This mechanism leverages the similarity between the question text and visual tokens to recover visually meaningful tokens with important text information while merging other less important tokens. Experimental results demonstrate that our proposed method achieves comparable performance to the original approach while compressing the visual tokens to an average of 10% of the original quantity. Our source code will be made publicly available following acceptance.
- Published
- 2024
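The recovery mechanism described in the abstract above — scoring visual tokens by similarity to the question text, recovering the most relevant ones, and merging the rest — can be illustrated with a minimal sketch. The function name, the cosine-similarity scoring, and the single averaged merge token are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def recover_tokens(visual_tokens, text_embedding, keep_ratio=0.1):
    """Keep the visual tokens most similar to the question-text embedding;
    merge the remaining tokens into one averaged token."""
    # Cosine similarity between each visual token and the text embedding.
    v = visual_tokens / np.linalg.norm(visual_tokens, axis=1, keepdims=True)
    t = text_embedding / np.linalg.norm(text_embedding)
    sim = v @ t
    k = max(1, int(len(visual_tokens) * keep_ratio))
    keep_idx = np.argsort(sim)[-k:]          # indices of the top-k tokens
    drop_mask = np.ones(len(visual_tokens), dtype=bool)
    drop_mask[keep_idx] = False
    kept = visual_tokens[keep_idx]
    if drop_mask.any():
        # Less important tokens are compressed into a single merged token.
        merged = visual_tokens[drop_mask].mean(axis=0, keepdims=True)
        kept = np.concatenate([kept, merged], axis=0)
    return kept
```

With `keep_ratio=0.1`, 100 input tokens are reduced to 10 recovered tokens plus one merged token, matching the roughly 10% compression rate reported in the abstract.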
5. Adapting Vision-Language Models to Open Classes via Test-Time Prompt Tuning
- Author
Gao, Zhengqing, Ao, Xiang, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Adapting pre-trained models to open classes is a challenging problem in machine learning. Vision-language models fully explore the knowledge of the text modality, demonstrating strong zero-shot recognition performance that is naturally suited for various open-set problems. More recently, some research has focused on fine-tuning such models for downstream tasks. Prompt tuning methods have achieved substantial improvements by learning context vectors on few-shot data. However, through evaluation under an open-set adaptation setting where the test data include new classes, we find a dilemma: learned prompts have worse generalization abilities than hand-crafted prompts. In this paper, we combine the advantages of both and propose a test-time prompt tuning approach, which leverages maximum concept matching (MCM) scores as dynamic weights to generate an input-conditioned prompt for each image at test time. Through extensive experiments on 11 different datasets, we show that our proposed method outperforms all comparison methods on average, considering both base and new classes. The code is available at https://github.com/gaozhengqing/TTPT.
Comment: PRCV 2024
- Published
- 2024
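The MCM-weighted combination described above can be sketched as follows. The function names, the per-prompt softmax weighting, and the temperature value are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def softmax(x, temp=1.0):
    z = np.exp((x - np.max(x)) / temp)
    return z / z.sum()

def mcm_score(image_feat, class_text_feats, temp=0.07):
    """Maximum concept matching: the highest softmax probability over
    cosine similarities between the image and each class text feature."""
    img = image_feat / np.linalg.norm(image_feat)
    txt = class_text_feats / np.linalg.norm(class_text_feats, axis=1, keepdims=True)
    return softmax(txt @ img, temp).max()

def input_conditioned_prompt_feats(image_feat, per_prompt_text_feats):
    """Weight each candidate prompt's class text features by its MCM score
    (softmax over prompts) and return the mixture plus the weights."""
    scores = np.array([mcm_score(image_feat, f) for f in per_prompt_text_feats])
    weights = softmax(scores)
    mixed = sum(w * f for w, f in zip(weights, per_prompt_text_feats))
    return mixed, weights
```

The weights depend on the test image, so each input effectively selects its own blend of hand-crafted and learned prompts, as the abstract describes.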
6. StylePrompter: Enhancing Domain Generalization with Test-Time Style Priors
- Author
Zhang, Jiao, Xu, Jian, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
In real-world applications, the sample distribution at the inference stage often differs from the one at the training stage, causing performance degradation of trained deep models. The research on domain generalization (DG) aims to develop robust algorithms that can improve the generalized performance in unseen domains by training on a few domains. However, the domain-agnostic vision model, trained on a limited number of domains using traditional domain generalization methods, cannot guarantee its effectiveness in dealing with unseen domains. The introduction of language can break the closed cognition space of the vision model, providing additional semantic information that cannot be inferred from vision-only datasets. In this paper, we propose to overcome the challenge in previous DG methods by introducing the style prompt in the language modality to adapt the trained model dynamically. In particular, we train a style prompter to extract style information of the current image into an embedding in the token embedding space and place it in front of the candidate category words as prior knowledge to prompt the model. Our open space partition of the style token embedding space and the hand-crafted style regularization enable the trained style prompter to handle data from unknown domains effectively. Extensive experiments verify the effectiveness of our method and demonstrate state-of-the-art performances on multiple public datasets. Codes will be available after the acceptance of this paper.
- Published
- 2024
7. Enabling Practical Transparent Checkpointing for MPI: A Topological Sort Approach
- Author
Xu, Yao and Cooperman, Gene
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing; D.1.3
- Abstract
MPI is the de facto standard for parallel computing on a cluster of computers. Checkpointing is an important component in any strategy for software resilience and for long-running jobs that must be executed by chaining together time-bounded resource allocations. This work solves an old problem: a practical and general algorithm for transparent checkpointing of MPI that is both efficient and compatible with most of the latest network software. Transparent checkpointing is attractive due to its generality and ease of use for most MPI application developers. Earlier efforts at transparent checkpointing for MPI, one decade ago, faced two difficult problems: (i) reliance on a specific MPI implementation tied to a specific network technology; and (ii) failure to demonstrate sufficiently low runtime overhead. Problem (i) (network dependence) was already solved in 2019 by MANA's introduction of split processes. Problem (ii) (efficient runtime overhead) is solved in this work. This paper introduces an approach that avoids these limitations, employing a novel topological sort to algorithmically determine a safe future synchronization point. The algorithm is valid for both blocking and non-blocking collective communication in MPI. We demonstrate the efficacy and scalability of our approach through both micro-benchmarks and a set of five real-world MPI applications, notably including the widely used VASP (Vienna Ab Initio Simulation Package), which is responsible for 11% of the workload on the Perlmutter supercomputer at Lawrence Berkeley National Laboratory. VASP was previously cited as a special challenge for checkpointing, in part due to its multi-algorithm codes.
Comment: 22 pages, 9 figures and 1 table, accepted to IEEE Cluster'24
- Published
- 2024
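The approach in the abstract above hinges on a topological sort over dependent communication operations. The paper's algorithm is considerably more involved; as a point of reference, a plain Kahn-style topological sort with cycle detection looks like this (the MPI-flavored node names in the usage are purely illustrative):

```python
from collections import deque

def topological_sort(nodes, edges):
    """Kahn's algorithm: order nodes so that every edge (u, v) places u
    before v. Returns None if the dependency graph contains a cycle."""
    indeg = {n: 0 for n in nodes}
    adj = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
    queue = deque(n for n in nodes if indeg[n] == 0)  # ready operations
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    # If some node was never emitted, a dependency cycle exists.
    return order if len(order) == len(nodes) else None
```

A cycle (order is `None`) would mean no safe global ordering exists for the given dependencies, which is exactly the situation a checkpoint coordinator must avoid.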
8. PASS++: A Dual Bias Reduction Framework for Non-Exemplar Class-Incremental Learning
- Author
Zhu, Fei, Zhang, Xu-Yao, Cheng, Zhen, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning
- Abstract
Class-incremental learning (CIL) aims to recognize new classes incrementally while maintaining the discriminability of old classes. Most existing CIL methods are exemplar-based, i.e., storing a part of old data for retraining. Without relearning old data, those methods suffer from catastrophic forgetting. In this paper, we figure out two inherent problems in CIL, i.e., representation bias and classifier bias, that cause catastrophic forgetting of old knowledge. To address these two biases, we present a simple and novel dual bias reduction framework that employs self-supervised transformation (SST) in input space and prototype augmentation (protoAug) in deep feature space. On the one hand, SST alleviates the representation bias by learning generic and diverse representations that can transfer across different tasks. On the other hand, protoAug overcomes the classifier bias by explicitly or implicitly augmenting prototypes of old classes in the deep feature space, which poses tighter constraints to maintain previously learned decision boundaries. We further propose hardness-aware prototype augmentation and multi-view ensemble strategies, leading to significant improvements. The proposed framework can be easily integrated with pre-trained models. Without storing any samples of old classes, our method can perform comparably with state-of-the-art exemplar-based approaches which store plenty of old data. We hope to draw the attention of researchers back to non-exemplar CIL by rethinking the necessity of storing old samples in CIL.
- Published
- 2024
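The protoAug idea above — augmenting old-class prototypes in deep feature space — can be sketched as sampling pseudo-features around each stored prototype. The fixed noise radius here is an assumption for illustration; the method estimates the scale from feature statistics:

```python
import numpy as np

def proto_augment(prototypes, class_ids, n_per_class, radius, seed=0):
    """Generate pseudo deep features for old classes by perturbing each
    stored class prototype with isotropic Gaussian noise."""
    rng = np.random.default_rng(seed)
    feats, labels = [], []
    for proto, cid in zip(prototypes, class_ids):
        noise = rng.normal(scale=radius, size=(n_per_class, proto.shape[0]))
        feats.append(proto + noise)          # pseudo-features for class cid
        labels.extend([cid] * n_per_class)
    return np.vstack(feats), np.array(labels)
```

Training the classifier on these pseudo-features alongside new-class features keeps old decision boundaries constrained without storing any old samples.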
9. From Instance Training to Instruction Learning: Task Adapters Generation from Instructions
- Author
Liao, Huanxuan, Xu, Yao, He, Shizhu, Zhang, Yuanzhe, Hao, Yanchao, Liu, Shengping, Liu, Kang, and Zhao, Jun
- Subjects
Computer Science - Computation and Language
- Abstract
Large language models (LLMs) have acquired the ability to solve general tasks by utilizing instruction finetuning (IFT). However, IFT still relies heavily on instance training of extensive task data, which greatly limits the adaptability of LLMs to real-world scenarios where labeled task instances are scarce and broader task generalization becomes paramount. Contrary to LLMs, humans acquire skills and complete tasks not merely through repeated practice but also by understanding and following instructional guidelines. This paper is dedicated to simulating human learning to address the shortcomings of instance training, focusing on instruction learning to enhance cross-task generalization. Within this context, we introduce Task Adapters Generation from Instructions (TAGI), which automatically constructs the task-specific model in a parameter generation manner based on the given task instructions without retraining for unseen tasks. Specifically, we utilize knowledge distillation to enhance the consistency between TAGI developed through Learning with Instruction and task-specific models developed through Training with Instance, by aligning the labels, output logits, and adapter parameters between them. TAGI is endowed with cross-task generalization capabilities through a two-stage training process that includes hypernetwork pretraining and finetuning. We evaluate TAGI on the Super-Natural Instructions and P3 datasets. The experimental results demonstrate that TAGI can match or even outperform traditional meta-trained models and other hypernetwork models, while significantly reducing computational requirements.
- Published
- 2024
10. Differentiable Proximal Graph Matching
- Author
Tan, Haoru, Wang, Chuang, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Graph matching is a fundamental tool in computer vision and pattern recognition. In this paper, we introduce an algorithm for graph matching based on the proximal operator, referred to as differentiable proximal graph matching (DPGM). Specifically, we relax and decompose the quadratic assignment problem for graph matching into a sequence of convex optimization problems. The whole algorithm can be considered as a differentiable map from the graph affinity matrix to the prediction of node correspondence. Therefore, the proposed method can be organically integrated into an end-to-end deep learning framework to jointly learn both the deep feature representation and the graph affinity matrix. In addition, we provide a theoretical guarantee to ensure that the proposed method converges to a stable point within a reasonable number of iterations. Numerical experiments show that DPGM outperforms existing graph matching algorithms on diverse datasets such as synthetic data and CMU House. Meanwhile, DPGM can fully harness the capability of deep feature extractors and achieve state-of-the-art performance on PASCAL VOC keypoints.
- Published
- 2024
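As context for the relaxation described above: a generic way to optimize a relaxed quadratic assignment objective over (approximately) doubly stochastic matrices is to alternate multiplicative gradient steps with Sinkhorn normalization. The following is a sketch of that generic scheme only, not DPGM's proximal iteration; all names and step sizes are assumptions:

```python
import numpy as np

def sinkhorn(M, iters=200):
    """Alternately normalize rows and columns of a positive matrix so it
    approaches the doubly stochastic set."""
    for _ in range(iters):
        M = M / M.sum(axis=1, keepdims=True)
        M = M / M.sum(axis=0, keepdims=True)
    return M

def relaxed_qap_match(A, n, steps=30, lr=0.2):
    """Gradient ascent on the relaxed quadratic assignment objective
    x^T A x with x = vec(X), projecting back toward the doubly
    stochastic set after each multiplicative update."""
    X = np.full((n, n), 1.0 / n)          # uniform soft assignment
    for _ in range(steps):
        grad = (2.0 * A @ X.ravel()).reshape(n, n)
        X = sinkhorn(X * np.exp(lr * grad))
    return X
```

Rounding the returned soft assignment (e.g., row-wise argmax or the Hungarian algorithm) produces the final node correspondence.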
11. Mapping dissolved carbon in space and time: An experimental technique for the measurement of pH and total carbon concentration in density driven convection of CO$_2$ dissolved in water
- Author
Birggison, Hilmar Yngvi, Xu, Yao, Moura, Marcel, Flekkøy, Eirik Grude, and Måløy, Knut Jørgen
- Subjects
Physics - Fluid Dynamics
- Abstract
We present an experimental technique for determining the pH and the total carbon concentration when CO$_2$ diffuses and flows in water. The technique employs three different pH indicators which, when combined with an image analysis technique, provide a dynamic range in pH from 4.0 to 9.5. In contrast to usual techniques in which a single pH indicator is used, the methodology presented allows one not only to produce a binary classification (pH larger or smaller than a given threshold) but also to access a much more complete continuous spatial distribution of pH and concentration levels in the system. We calibrate the method against benchmark solutions and further demonstrate its potential by measuring the pH and total carbon concentration in a density driven convection (DDC) of carbon-enriched water. The motivation for testing the method in this particular experiment comes from the fact that DDC plays a pivotal role in the efficiency of engineered carbon storage processes. The application of the technique presented here provided a direct window for the analysis of the spatial distribution of captured carbon in the DDC flow.
Comment: Supplementary Material containing videos of spatiotemporal pH and carbon concentration can be found in Zenodo via the link: https://doi.org/10.5281/zenodo.11148678
- Published
- 2024
12. Generate-on-Graph: Treat LLM as both Agent and KG in Incomplete Knowledge Graph Question Answering
- Author
Xu, Yao, He, Shizhu, Chen, Jiabei, Wang, Zihao, Song, Yangqiu, Tong, Hanghang, Liu, Kang, and Zhao, Jun
- Subjects
Computer Science - Computation and Language; Computer Science - Artificial Intelligence
- Abstract
To address the issue of insufficient knowledge and the tendency to generate hallucinations in Large Language Models (LLMs), numerous studies have endeavored to integrate LLMs with Knowledge Graphs (KGs). However, all these methods are evaluated on conventional Knowledge Graph Question Answering (KGQA) with complete KGs, where the factual triples involved in each question are entirely covered by the given KG. In this situation, the LLM mainly acts as an agent that finds answer entities by exploring the KG, rather than effectively integrating internal and external knowledge sources. However, in real-world scenarios, KGs are often too incomplete to cover all the knowledge required to answer questions. To simulate real-world scenarios and evaluate the ability of LLMs to integrate internal and external knowledge, in this paper we propose leveraging LLMs for QA under Incomplete Knowledge Graphs (IKGQA), where the given KG doesn't include all the factual triples involved in each question. To handle IKGQA, we propose a training-free method called Generate-on-Graph (GoG) that can generate new factual triples while exploring KGs. Specifically, we propose a selecting-generating-answering framework, which not only treats the LLM as an agent to explore KGs, but also treats it as a KG to generate new facts based on the explored subgraph and its inherent knowledge. Experimental results on two datasets demonstrate that our GoG can solve IKGQA to a certain extent, while almost all previous methods cannot perform well on IKGQA.
- Published
- 2024
13. Unified Entropy Optimization for Open-Set Test-Time Adaptation
- Author
Gao, Zhengqing, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Test-time adaptation (TTA) aims at adapting a model pre-trained on the labeled source domain to the unlabeled target domain. Existing methods usually focus on improving TTA performance under covariate shifts, while neglecting semantic shifts. In this paper, we delve into a realistic open-set TTA setting where the target domain may contain samples from unknown classes. Many state-of-the-art closed-set TTA methods perform poorly when applied to open-set scenarios, which can be attributed to the inaccurate estimation of data distribution and model confidence. To address these issues, we propose a simple but effective framework called unified entropy optimization (UniEnt), which is capable of simultaneously adapting to covariate-shifted in-distribution (csID) data and detecting covariate-shifted out-of-distribution (csOOD) data. Specifically, UniEnt first mines pseudo-csID and pseudo-csOOD samples from test data, followed by entropy minimization on the pseudo-csID data and entropy maximization on the pseudo-csOOD data. Furthermore, we introduce UniEnt+ to alleviate the noise caused by hard data partition, leveraging sample-level confidence. Extensive experiments on CIFAR benchmarks and Tiny-ImageNet-C show the superiority of our framework. The code is available at https://github.com/gaozhengqing/UniEnt.
Comment: CVPR 2024
- Published
- 2024
14. Quantum gravity of the Heisenberg algebra
- Author
Almheiri, Ahmed, Goel, Akash, and Hu, Xu-Yao
- Subjects
High Energy Physics - Theory; Condensed Matter - Strongly Correlated Electrons; General Relativity and Quantum Cosmology
- Abstract
We consider a simplified model of double scaled SYK (DSSYK) in which the Hamiltonian is the position operator of the harmonic oscillator. This model captures the high temperature limit of DSSYK but could also be defined as a quantum theory in its own right. We study properties of the emergent geometry, including its dynamics in response to inserting matter particles. In particular, we find that the model displays de Sitter-like properties, such as infalling matter reducing the rate of growth of geodesic slices between the two boundaries. The simplicity of the model allows us to compute the full generating functional for correlation functions of the length mode or any number of matter operators. We provide evidence that the effective action of the geodesic length between boundary points is non-local. Furthermore, we use the on-shell solution for the geodesic lengths between any two boundary points to reconstruct an effective bulk metric and reverse engineer the dilaton gravity theory that generates this metric as a solution.
Comment: 30 pages + appendices; v2: typos corrected, references added
- Published
- 2024
15. Awakening Augmented Generation: Learning to Awaken Internal Knowledge of Large Language Models for Question Answering
- Author
Liao, Huanxuan, He, Shizhu, Xu, Yao, Zhang, Yuanzhe, Liu, Kang, Liu, Shengping, and Zhao, Jun
- Subjects
Computer Science - Computation and Language
- Abstract
Retrieval-Augmented-Generation and Generation-Augmented-Generation have been proposed to enhance the knowledge required for question answering with Large Language Models (LLMs) by leveraging richer context. However, the former relies on external resources, and both require incorporating explicit documents into the context, which increases execution costs and susceptibility to noisy data during inference. Recent works indicate that LLMs model rich knowledge, but it is often not effectively activated and awakened. Inspired by this, we propose a novel knowledge-augmented framework, $\textbf{Awakening-Augmented-Generation}$ (AAG), which mimics the human ability to answer questions using only thinking and recalling to compensate for knowledge gaps, thereby awakening relevant knowledge in LLMs without relying on external resources. AAG consists of two key components for awakening richer context. Explicit awakening fine-tunes a context generator to create a synthetic, compressed document that functions as symbolic context. Implicit awakening utilizes a hypernetwork to generate adapters based on the question and synthetic document, which are inserted into LLMs to serve as parameter context. Experimental results on three datasets demonstrate that AAG exhibits significant advantages in both open-domain and closed-book settings, as well as in out-of-distribution generalization. Our code will be available at https://github.com/Xnhyacinth/IAG.
- Published
- 2024
16. Ensemble Quadratic Assignment Network for Graph Matching
- Author
Tan, Haoru, Wang, Chuang, Wu, Sitong, Zhang, Xu-Yao, Yin, Fei, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Graph matching is a commonly used technique in computer vision and pattern recognition. Recent data-driven approaches have improved graph matching accuracy remarkably, whereas some traditional algorithm-based methods are more robust to feature noise, outlier nodes, and global transformations (e.g., rotation). In this paper, we propose a graph neural network (GNN) based approach to combine the advantages of data-driven and traditional methods. In the GNN framework, we transform traditional graph-matching solvers into single-channel GNNs on the association graph and extend the single-channel architecture to a multi-channel network. The proposed model can be seen as an ensemble method that fuses multiple algorithms at every iteration. Instead of averaging the estimates at the end of the ensemble, in our approach the independent iterations of the ensembled algorithms exchange their information after each iteration via a 1x1 channel-wise convolution layer. Experiments show that our model improves the performance of traditional algorithms significantly. In addition, we propose a random sampling strategy to reduce the computational complexity and GPU memory usage, so the model applies to matching graphs with thousands of nodes. We evaluate the performance of our method on three tasks: geometric graph matching, semantic feature matching, and few-shot 3D shape classification. The proposed model performs comparably to or outperforms the best existing GNN-based methods.
Comment: Accepted by IJCV in 2024
- Published
- 2024
17. Active Generalized Category Discovery
- Author
Ma, Shijie, Zhu, Fei, Zhong, Zhun, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Generalized Category Discovery (GCD) is a pragmatic and challenging open-world task, which endeavors to cluster unlabeled samples from both novel and old classes, leveraging some labeled data of old classes. Given that knowledge learned from old classes is not fully transferable to new classes, and that novel categories are fully unlabeled, GCD inherently faces intractable problems, including imbalanced classification performance and inconsistent confidence between old and new classes, especially in the low-labeling regime. Hence, some annotations of new classes are deemed necessary. However, labeling new classes is extremely costly. To address this issue, we take the spirit of active learning and propose a new setting called Active Generalized Category Discovery (AGCD). The goal is to improve the performance of GCD by actively selecting a limited amount of valuable samples for labeling from the oracle. To solve this problem, we devise an adaptive sampling strategy, which jointly considers novelty, informativeness, and diversity to adaptively select novel samples with proper uncertainty. However, owing to the varied orderings of label indices caused by the clustering of novel classes, the queried labels are not directly applicable to subsequent training. To overcome this issue, we further propose a stable label mapping algorithm that transforms ground truth labels to the label space of the classifier, thereby ensuring consistent training across different active selection stages. Our method achieves state-of-the-art performance on both generic and fine-grained datasets. Our code is available at https://github.com/mashijie1028/ActiveGCD.
Comment: Accepted to CVPR 2024
- Published
- 2024
18. Revisiting Confidence Estimation: Towards Reliable Failure Prediction
- Author
Zhu, Fei, Zhang, Xu-Yao, Cheng, Zhen, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning
- Abstract
Reliable confidence estimation is a challenging yet fundamental requirement in many risk-sensitive applications. However, modern deep neural networks are often overconfident in their incorrect predictions, i.e., misclassified samples from known classes, and out-of-distribution (OOD) samples from unknown classes. In recent years, many confidence calibration and OOD detection methods have been developed. In this paper, we find a general, widely existing but largely neglected phenomenon: most confidence estimation methods are harmful for detecting misclassification errors. We investigate this problem and reveal that popular calibration and OOD detection methods often lead to worse confidence separation between correctly classified and misclassified examples, making it difficult to decide whether to trust a prediction or not. Finally, we propose to enlarge the confidence gap by finding flat minima, which yields state-of-the-art failure prediction performance under various settings, including balanced, long-tailed, and covariate-shift classification scenarios. Our study not only provides a strong baseline for reliable confidence estimation but also acts as a bridge between understanding calibration, OOD detection, and failure prediction. The code is available at https://github.com/Impression2805/FMFP.
Comment: Accepted by IEEE TPAMI. arXiv admin note: text overlap with arXiv:2303.02970; text overlap with arXiv:2007.01458 by other authors
- Published
- 2024
19. Open-world Machine Learning: A Review and New Outlooks
- Author
Zhu, Fei, Ma, Shijie, Cheng, Zhen, Zhang, Xu-Yao, Zhang, Zhaoxiang, and Liu, Cheng-Lin
- Subjects
Computer Science - Machine Learning; Computer Science - Computer Vision and Pattern Recognition
- Abstract
Machine learning has achieved remarkable success in many applications. However, existing studies are largely based on the closed-world assumption, which assumes that the environment is stationary, and the model is fixed once deployed. In many real-world applications, this fundamental and rather naive assumption may not hold because an open environment is complex, dynamic, and full of unknowns. In such cases, rejecting unknowns, discovering novelties, and then incrementally learning them, could enable models to be safe and evolve continually as biological systems do. This paper provides a holistic view of open-world machine learning by investigating unknown rejection, novel class discovery, and class-incremental learning in a unified paradigm. The challenges, principles, and limitations of current methodologies are discussed in detail. Finally, we discuss several potential directions for future research. This paper aims to provide a comprehensive introduction to the emerging open-world machine learning paradigm, to help researchers build more powerful AI systems in their respective fields, and to promote the development of artificial general intelligence.
- Published
- 2024
20. PILoRA: Prototype Guided Incremental LoRA for Federated Class-Incremental Learning
- Author
Guo, Haiyang, Zhu, Fei, Liu, Wenzhuo, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Existing federated learning methods have effectively dealt with decentralized learning in scenarios involving data privacy and non-IID data. However, in real-world situations, each client dynamically learns new classes, requiring the global model to classify all seen classes. To effectively mitigate catastrophic forgetting and data heterogeneity under low communication costs, we propose a simple and effective method named PILoRA. On the one hand, we adopt prototype learning to learn better feature representations and leverage the heuristic information between prototypes and class features to design a prototype re-weight module that solves the classifier bias caused by data heterogeneity without retraining the classifier. On the other hand, we view incremental learning as the process of learning distinct task vectors and encoding them within different LoRA parameters. Accordingly, we propose Incremental LoRA to mitigate catastrophic forgetting. Experimental results on standard datasets indicate that our method outperforms the state-of-the-art approaches significantly. More importantly, our method exhibits strong robustness and superiority in different settings and degrees of data heterogeneity. The code is available at https://github.com/Ghy0501/PILoRA.
Comment: ECCV 2024
- Published
- 2024
21. Breaking the Limits of Reliable Prediction via Generated Data
- Author
-
Cheng, Zhen, Zhu, Fei, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Published
- 2024
- Full Text
- View/download PDF
22. Curcumin Alleviates Microglia-Mediated Neuroinflammation and Neuronal Ferroptosis Following Experimental Subarachnoid Hemorrhage by Modulating the Nrf2/HO-1 Signaling Pathway
- Author
-
Xu, Yao, Liu, Yongsheng, Wu, Yan, Sun, Jingshan, Lu, Xiaocheng, Dai, Kun, Zhang, Yiting, Luo, Chengliang, and Zhang, Jian
- Published
- 2024
- Full Text
- View/download PDF
23. CA1 Modulates the Osteogenic Differentiation of Dental Follicle Stem Cells by Activating the BMP Signaling Pathway In Vitro
- Author
-
Zhao, Jin-ze, Ge, Ying-Ying, Xue, Ling-fa, Xu, Yao-xiang, Yue, Jin, Li, Cong, and Xiao, Wen-lin
- Published
- 2024
- Full Text
- View/download PDF
24. Dynamics of soil properties and microbial communities by crop rotation length: unveiling the key factors for enhanced sugar yield
- Author
-
Li, Bingchen, Geng, Gui, Li, Tai, Song, Shoujie, Xu, Yao, Yu, Lihua, and Wang, Yuguang
- Published
- 2024
- Full Text
- View/download PDF
25. Construction of iron oxyhydroxide/nickel sulfate hydroxide hybrid electrocatalyst for efficient oxygen evolution
- Author
-
Guo, Bing-Rong, Chen, Meng-Xin, Li, Si-Wei, Gao, Ru-Hai, Sang, Bo-Han, Ren, Xiao-Qian, Liu, Zhe, Cao, Xun, Liu, Jia, Ding, Ya-Ni, Xu, Ping, and Xu, Yao
- Published
- 2024
- Full Text
- View/download PDF
26. Unified Classification and Rejection: A One-versus-All Framework
- Author
-
Cheng, Zhen, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Machine Learning - Abstract
Open-world pattern recognition involves classifying patterns of known classes and rejecting ambiguous and novel (also called out-of-distribution, OOD) inputs. Deep neural network models usually excel in closed-set classification while performing poorly at rejecting OOD inputs. To tackle this problem, numerous methods have been designed to perform open set recognition (OSR) or OOD rejection/detection tasks. Previous methods mostly use post-training score transformation or hybrid models to ensure low scores on OOD inputs while separating known classes. In this paper, we attempt to build a unified framework for open set classifiers that handles both classification and OOD rejection. We formulate the open set recognition of $ K $ known classes as a $ (K+1) $-class classification problem with a model trained on known-class samples only. By decomposing the $ K $-class problem into $ K $ one-versus-all (OVA) binary classification tasks and binding some parameters, we show that combining the scores of OVA classifiers can give $ (K+1) $-class posterior probabilities, which enables classification and OOD rejection in a unified framework. To maintain the closed-set classification accuracy of the OVA-trained classifier, we propose a hybrid training strategy combining OVA loss and multi-class cross-entropy loss. We implement the OVA framework and hybrid training strategy on the recently proposed convolutional prototype network and on a prototype classifier with a vision transformer (ViT) backbone. Experiments on popular OSR and OOD detection datasets demonstrate that the proposed framework, using a single multi-class classifier, yields competitive performance in closed-set classification, OOD detection, and misclassification detection., Comment: Published in Machine Intelligence Research (https://link.springer.com/article/10.1007/s11633-024-1514-4)
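The score-combination step described in this abstract — turning $ K $ one-versus-all sigmoid scores into a $ (K+1) $-class posterior with an extra rejection class — can be illustrated with a minimal sketch. The combination rule below (class $ k $ wins when head $ k $ fires and all others stay silent; the input is rejected when every head stays silent) is a plausible reading of the abstract, not necessarily the paper's exact formula:

```python
import math

def ova_to_posterior(logits):
    """Combine K one-versus-all logits into a (K+1)-class posterior.

    The extra class (index K) models "none of the known classes",
    i.e. OOD rejection. Illustrative sketch only.
    """
    # sigmoid score of each of the K one-versus-all heads
    s = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    # class k wins: head k fires while every other head stays silent
    unnorm = [s[k] * math.prod(1.0 - sj for j, sj in enumerate(s) if j != k)
              for k in range(len(s))]
    # (K+1)-th class: every head stays silent -> reject as OOD
    unnorm.append(math.prod(1.0 - sj for sj in s))
    total = sum(unnorm)
    return [p / total for p in unnorm]
```

With logits `[4.0, -3.0, -5.0]` the posterior mass concentrates on class 0; when every logit is strongly negative, the rejection class dominates.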
- Published
- 2023
- Full Text
- View/download PDF
27. Query2Triple: Unified Query Encoding for Answering Diverse Complex Queries over Knowledge Graphs
- Author
-
Xu, Yao, He, Shizhu, Wang, Cunguang, Cai, Li, Liu, Kang, and Zhao, Jun
- Subjects
Computer Science - Artificial Intelligence - Abstract
Complex Query Answering (CQA) is a challenging task over Knowledge Graphs (KGs). Due to the incompleteness of KGs, query embedding (QE) methods have been proposed to encode queries and entities into the same embedding space, and to treat logical operators as neural set operators to obtain answers. However, these methods train KG embeddings and neural set operators concurrently on both simple (one-hop) and complex (multi-hop and logical) queries, which causes performance degradation on simple queries and low training efficiency. In this paper, we propose Query to Triple (Q2T), a novel approach that decouples the training for simple and complex queries. Q2T divides the training into two stages: (1) pre-training a neural link predictor on simple queries to predict tail entities based on the head entity and relation; and (2) training a query encoder on complex queries to encode diverse complex queries into a unified triple form that can be efficiently solved by the pretrained neural link predictor. Our proposed Q2T is not only efficient to train but also modular, and thus easily adaptable to various well-studied neural link predictors. Extensive experiments demonstrate that, even without explicit modeling of neural set operators, Q2T still achieves state-of-the-art performance on diverse complex queries over three public benchmarks., Comment: Accepted by EMNLP 2023 findings
- Published
- 2023
28. LC–MS metabolomics analysis of serum metabolites during neoadjuvant chemoradiotherapy in locally advanced rectal cancer
- Author
-
Peng, Qiliang, Jiang, Lili, Shen, Yi, Xu, Yao, Shen, Xinan, Zou, Li, Zhu, Yaqun, and Shen, Yuntian
- Published
- 2024
- Full Text
- View/download PDF
29. A sensitive fluorescence biosensor based on ligation-transcription and CRISPR/Cas13a-assisted cascade amplification strategies to detect the H1N1 virus
- Author
-
Xue, Lulu, Bu, Shengjun, Xu, Mengyao, Wei, Jiaqi, Zhou, Hongyu, Xu, Yao, Hao, Zhuo, Li, Zehong, and Wan, Jiayu
- Published
- 2024
- Full Text
- View/download PDF
30. Fixed-time synchronization of time-varying coupled competitive neural networks with impulsive effects
- Author
-
Xu, Yao, Wang, Haodong, Mao, Yuheng, Wu, Yongbao, and Li, Wenxue
- Published
- 2024
- Full Text
- View/download PDF
31. Increased Nasal Blimp1 + Treg Cells After Sublingual Immunotherapy Reflect the Efficacy of Treatment in Allergic Rhinitis
- Author
-
Pan, Yue, Zhang, Xinxin, Geng, Huanting, Yu, Yan, Liu, Jianyong, Li, Menglin, Yang, Huijun, Yuan, Yifang, Xu, Yao, Wu, Yujia, Wu, Geping, Ma, Xingkai, and Cheng, Lei
- Published
- 2024
- Full Text
- View/download PDF
32. Liensinine reduces acute lung injury brought on by lipopolysaccharide by inhibiting the activation of the NF-κB signaling pathway through modification of the Src/TRAF6/TAK1 axis
- Author
-
Chen, Huizhen, Liu, Feixue, Dai, Dapeng, Ming, Yuanyuan, Xu, Yao, Huang, Zhengqian, Zhang, Le, and Sun, Yong
- Published
- 2024
- Full Text
- View/download PDF
33. Implementation-Oblivious Transparent Checkpoint-Restart for MPI
- Author
-
Xu, Yao, Belyaev, Leonid, Jain, Twinkle, Schafer, Derek, Skjellum, Anthony, and Cooperman, Gene
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing - Abstract
This work presents experience with traditional use cases of checkpointing on a novel platform. A single codebase (MANA) transparently checkpoints production workloads for major available MPI implementations: "develop once, run everywhere". The new platform enables application developers to compile their application against any of the available standards-compliant MPI implementations, and test each MPI implementation according to performance or other features., Comment: 17 pages, 4 figures
- Published
- 2023
34. Towards Reliable Domain Generalization: A New Dataset and Evaluations
- Author
-
Zhang, Jiao, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Machine Learning - Abstract
Distribution shifts are ubiquitous in the real world. However, deep neural networks (DNNs) are easily biased towards the training set, which causes severe performance degradation when they receive out-of-distribution data. Many methods have been studied to train models that generalize under various distribution shifts in the literature on domain generalization (DG). However, the recent DomainBed and WILDS benchmarks have challenged the effectiveness of these methods. To address the problems in existing research, we propose a new domain generalization task for handwritten Chinese character recognition (HCCR) to enrich the application scenarios of DG research. We evaluate eighteen DG methods on the proposed PaHCC (Printed and Handwritten Chinese Characters) dataset and show that the performance of existing methods on this dataset is still unsatisfactory. Besides, under a designed dynamic DG setting, we reveal more properties of DG methods and argue that relying on the leave-one-domain-out protocol alone is unreliable. We advocate that researchers in the DG community consider the dynamic performance of methods for more comprehensive and reliable evaluation. Our dataset and evaluations bring new perspectives to the community for more substantial progress. We will make our dataset public upon publication of the article to facilitate the study of domain generalization.
- Published
- 2023
35. Finite-dimensionality of attractors for wave equations with degenerate nonlocal damping
- Author
-
Tang, Zhijun, Yan, Senlin, Xu, Yao, and Zhong, Chengkui
- Subjects
Mathematics - Analysis of PDEs ,Mathematics - Dynamical Systems ,37L30, 35B41, 35B40 - Abstract
In this paper we study the fractal dimension of global attractors for a class of wave equations with (single-point) degenerate nonlocal damping. Both the equation and its linearization degenerate into linear wave equations at the degenerate point and the usual approaches to bound the dimension of the entirety of attractors do not work directly. Instead, we develop a new process concerning the dimension near the degenerate point individually and show the finite dimensionality of the attractor., Comment: 33 pages
- Published
- 2023
36. Towards Trustworthy Dataset Distillation
- Author
-
Ma, Shijie, Zhu, Fei, Cheng, Zhen, and Zhang, Xu-Yao
- Subjects
Computer Science - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Efficiency and trustworthiness are two eternal pursuits when applying deep learning in real-world applications. With regard to efficiency, dataset distillation (DD) endeavors to reduce training costs by distilling the large dataset into a tiny synthetic dataset. However, existing methods merely concentrate on in-distribution (InD) classification in a closed-world setting, disregarding out-of-distribution (OOD) samples. On the other hand, OOD detection aims to enhance models' trustworthiness, which is always inefficiently achieved in full-data settings. For the first time, we simultaneously consider both issues and propose a novel paradigm called Trustworthy Dataset Distillation (TrustDD). By distilling both InD samples and outliers, the condensed datasets are capable of training models competent in both InD classification and OOD detection. To alleviate the requirement of real outlier data, we further propose to corrupt InD samples to generate pseudo-outliers, namely Pseudo-Outlier Exposure (POE). Comprehensive experiments on various settings demonstrate the effectiveness of TrustDD, and POE surpasses the state-of-the-art method Outlier Exposure (OE). Compared with the preceding DD, TrustDD is more trustworthy and applicable to open-world scenarios. Our code is available at https://github.com/mashijie1028/TrustDD, Comment: Accepted to Pattern Recognition 2024
- Published
- 2023
- Full Text
- View/download PDF
37. TensorGPT: Efficient Compression of the Embedding Layer in LLMs based on the Tensor-Train Decomposition
- Author
-
Xu, Mingxue, Xu, Yao Lei, and Mandic, Danilo P.
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning ,Computer Science - Neural and Evolutionary Computing ,Mathematics - Numerical Analysis - Abstract
High-dimensional token embeddings underpin Large Language Models (LLMs), as they can capture subtle semantic information and significantly enhance the modelling of complex language patterns. However, the associated high dimensionality also introduces a considerable number of model parameters and prohibitively high model storage requirements. To address this issue, this work proposes an approach based on the Tensor-Train Decomposition (TTD), where each token embedding is treated as a Matrix Product State (MPS) that can be efficiently computed in a distributed manner. The experimental results on GPT-2 demonstrate that, with our approach, the embedding layer can be compressed by a factor of up to 38.40, and at a compression factor of 3.31 it even outperforms the original GPT-2 model.
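The entry above describes compressing the embedding layer via a Tensor-Train Decomposition. A minimal sequential-SVD (TT-SVD) sketch is given below; the shapes `(4, 4, 4)` and `max_rank` are illustrative assumptions, and the paper's exact factorization pipeline for GPT-2 may differ:

```python
import numpy as np

def tt_decompose(vec, dims, max_rank=8):
    """Factor a flat embedding vector into tensor-train cores via sequential SVD."""
    cores, r = [], 1
    c = np.asarray(vec, dtype=float).reshape(1, -1)
    for n in dims[:-1]:
        c = c.reshape(r * n, -1)
        u, s, vt = np.linalg.svd(c, full_matrices=False)
        rank = min(max_rank, len(s))
        cores.append(u[:, :rank].reshape(r, n, rank))   # core G_k: (r_{k-1}, n_k, r_k)
        c = s[:rank, None] * vt[:rank]                  # carry the remainder forward
        r = rank
    cores.append(c.reshape(r, dims[-1], 1))             # last core closes the train
    return cores

def tt_reconstruct(cores):
    """Contract the cores back into the original flat vector."""
    out = cores[0]
    for g in cores[1:]:
        out = np.tensordot(out, g, axes=([-1], [0]))
    return out.reshape(-1)
```

At full rank the reconstruction is exact; compression comes from truncating `max_rank` when the embedding dimension is factored into many small modes.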
- Published
- 2023
38. The role of reactive oxygen species in plant-virus interactions
- Author
-
Xu, Yao, Zhang, Sutong, Zhang, Mengyuan, Jiao, Sibo, Guo, Yifan, and Jiang, Tong
- Published
- 2024
- Full Text
- View/download PDF
39. Research progress on the mechanism of the Hedgehog signaling pathway during mandibular development
- Author
-
XU Yao, LI Wenjin
- Subjects
hedgehog signaling pathway ,mandible ,condyle ,intramembranous ossification ,sonic hedgehog ,indian hedgehog ,mechanism ,abnormal development of the mandible ,temporomandibular osteoarthritis ,Medicine - Abstract
The source and process of mandibular development differ significantly from those of other bones in the body, and abnormal development can lead to various bone-related diseases, seriously affecting patients' quality of life. In recent years, the role of the Hedgehog signaling pathway in bone development has received increasing attention. The Hedgehog gene family includes three members: Sonic Hedgehog (Shh), Indian Hedgehog (Ihh), and Desert Hedgehog (Dhh). Shh and Ihh participate in bone metabolism regulation through various pathways, with Shh primarily involved in limb development and Ihh playing a key role in endochondral osteogenesis. The Hedgehog signaling pathway comprises Hedgehog ligands, Patched (Ptch) receptors, Smoothened (Smo) receptors, nuclear transcription factors, glioma-associated oncogene homologues (Gli), and downstream target genes. Activation of the canonical Hedgehog signaling pathway requires the involvement of Gli, whereas non-canonical Hedgehog signaling is mainly regulated by Ptch, Smo, and other components. Shh regulates various biological processes during early vertebrate embryogenesis, such as organ differentiation, neural stem cell formation, stem cell differentiation and proliferation, limb bone development, and tooth germ development. During bone cell differentiation, Shh, Ptch1, and Gli1 are expressed in osteoblasts, further promoting the differentiation of bone marrow mesenchymal stem cells into osteoblasts and chondrocytes. Ihh plays an indispensable role in bone growth, development, and homeostasis, and participates in intramembranous bone collar formation and in the proliferation and maturation of chondrocytes. Ihh is expressed in mature skull osteoblasts and can regulate Ptch and bone morphogenetic protein (BMP) expression to induce intramembranous ossification.
Brain and muscle ARNT-like protein 1 (BMAL1) can regulate the Hedgehog signaling pathway by binding to Ptch1 and Ihh, playing a crucial role in cartilage formation and endochondral osteogenesis in the temporomandibular joint. Hedgehog signal activators can improve the reduction in mandibular bone mass caused by BMAL1 deficiency. Hedgehog signaling imbalance can have a significant impact on bone development and lead to a series of bone diseases, such as abnormal bone development, fractures, osteoporosis, and osteoarthritis. The mechanism of the Hedgehog signaling pathway in relation to mandibular diseases has not been fully elucidated, and future research should seek to further explore Hedgehog signaling as a potential target for treating mandibular developmental-related diseases.
- Published
- 2024
- Full Text
- View/download PDF
40. Quantum gravity of the Heisenberg algebra
- Author
-
Ahmed Almheiri, Akash Goel, and Xu-Yao Hu
- Subjects
2D Gravity ,Field Theories in Lower Dimensions ,Models of Quantum Gravity ,Gauge-Gravity Correspondence ,Nuclear and particle physics. Atomic energy. Radioactivity ,QC770-798 - Abstract
Abstract We consider a simplified model of double scaled SYK (DSSYK) in which the Hamiltonian is the position operator of the Harmonic oscillator. This model captures the high temperature limit of DSSYK but could also be defined as a quantum theory in its own right. We study properties of the emergent geometry including its dynamics in response to inserting matter particles. In particular, we find that the model displays de Sitter-like properties such as that infalling matter reduces the rate of growth of geodesic slices between the two boundaries. The simplicity of the model allows us to compute the full generating functional for correlation functions of the length mode or any number of matter operators. We provide evidence that the effective action of the geodesic length between boundary points is non-local. Furthermore, we use the on-shell solution for the geodesic lengths between any two boundary points to reconstruct an effective bulk metric and reverse engineer the dilaton gravity theory that generates this metric as a solution.
- Published
- 2024
- Full Text
- View/download PDF
41. 5α-Epoxyalantolactone from Inula macrophylla attenuates cognitive deficits in scopolamine-induced Alzheimer’s disease mice model
- Author
-
Rui Ma, Xu-Yao Feng, Jiang-Jiang Tang, Wei Ha, and Yan-Ping Shi
- Subjects
Alzheimer’s disease ,5α-Epoxyalantolactone (5α-EAL) ,Anti-neuroinflammation ,Attenuates cognitive deficits ,Botany ,QK1-989 - Abstract
Abstract Alzheimer’s disease (AD) is a complex neurodegenerative condition. 5α-Epoxyalantolactone (5α-EAL), a eudesmane-type sesquiterpene isolated from the herb Inula macrophylla, has various pharmacological effects. This work investigated the ameliorative effect of 5α-EAL on cognitive impairment. 5α-EAL inhibited the generation of nitric oxide (NO) in BV-2 cells stimulated with lipopolysaccharide (LPS), with an EC50 of 6.2 μM. 5α-EAL significantly reduced the production of prostaglandin E2 (PGE2) and tumor necrosis factor-α (TNF-α), while also inhibiting the expression of cyclooxygenase-2 (COX-2) and inducible nitric oxide synthase (iNOS) proteins. The ability of 5α-EAL to penetrate the blood–brain barrier (BBB) was confirmed via a parallel artificial membrane permeation assay. A scopolamine (SCOP)-induced AD mouse model was employed to assess the ameliorative effects of 5α-EAL on cognitive impairment in vivo. After the mice were pretreated with 5α-EAL (10 and 30 mg/kg per day, i.p.) for 21 days, behavioral experiments indicated that administration of 5α-EAL could alleviate cognitive and memory impairments. 5α-EAL significantly reduced AChE activity in the brains of SCOP-induced AD mice. In summary, these findings highlight the beneficial effects of the natural product 5α-EAL as a potential bioactive compound for attenuating cognitive deficits in AD, given its pharmacological profile.
- Published
- 2024
- Full Text
- View/download PDF
42. GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark
- Author
-
Li, Dongyang, Ding, Ruixue, Zhang, Qiang, Li, Zheng, Chen, Boli, Xie, Pengjun, Xu, Yao, Li, Xin, Guo, Ning, Huang, Fei, and He, Xiaofeng
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
With the fast pace of development of geographic applications, it is essential to design automatable and intelligent models to handle the large volume of information. However, few researchers focus on geographic natural language processing, and there has never been a benchmark establishing a unified standard. In this work, we propose a GeoGraphic Language Understanding Evaluation benchmark, named GeoGLUE. We collect data from openly released geographic resources and introduce six natural language understanding tasks, including geographic textual similarity on recall, geographic textual similarity on rerank, geographic elements tagging, geographic composition analysis, geographic where what cut, and geographic entity alignment. We also provide evaluation experiments and analysis of general baselines, indicating the effectiveness and significance of the GeoGLUE benchmark.
- Published
- 2023
43. Prototype Calibration with Synthesized Samples for Zero-Shot Chinese Character Recognition.
- Author
-
Xiang Ao 0002, Xiaohui Li, Xu-Yao Zhang, and Chenglin Liu 0001
- Published
- 2024
- Full Text
- View/download PDF
44. Collaborative Defense Method Against DDoS Attacks on SDN-Architected Cloud Servers
- Author
-
Zhang, Yiying, Xu, Yao, Han, Longzhe, Liang, Kun, Li, Wenjing, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Huang, De-Shuang, editor, Zhang, Chuanlei, editor, and Chen, Wei, editor
- Published
- 2024
- Full Text
- View/download PDF
45. Sugar accumulation stage in sugar beets is a key stage in response to continuous cropping soil microbial community assembly
- Author
-
Li, Tai, Cui, Rufei, Geng, Gui, Dong, Yinzhuang, Xu, Yao, Sun, Yanchun, Stevanato, Piergiorgio, Yu, Lihua, Liu, Jiahui, Nurminsky, Vadim N., and Wang, Yuguang
- Published
- 2024
- Full Text
- View/download PDF
46. Effect of Plant Spacing on Growth and Yield Formation of Sugar Beet Taproot
- Author
-
Xu, Yao, Liu, Danyang, Shi, Jing, Wang, Xu, Geng, Gui, Liu, Jiahui, Yu, Lihua, Lu, Yuncai, and Wang, Yuguang
- Published
- 2024
- Full Text
- View/download PDF
47. KIF4A promotes epithelial–mesenchymal transition by activating the TGF-β/SMAD signaling pathway in glioma cells
- Author
-
Xu, Yao, Xue, Guangren, Zhou, Lei, Wu, Gaotian, Hu, Lingji, Ma, Shuchen, Zhang, Jian, and Li, Xiangdong
- Published
- 2024
- Full Text
- View/download PDF
48. Direct conversion of CO and H2O to hydrocarbons at atmospheric pressure using a TiO2−x/Ni photothermal catalyst
- Author
-
Qin, Xuetao, Xu, Ming, Guan, Jianxin, Feng, Li, Xu, Yao, Zheng, Lirong, Wang, Meng, Zhao, Jian-Wen, Chen, Jia-Lan, Zhang, Jie, Xie, Jinglin, Yu, Zhihao, Zhang, Ruiqi, Li, Xinmao, Liu, Xi, Liu, Jin-Xun, Zheng, Junrong, and Ma, Ding
- Published
- 2024
- Full Text
- View/download PDF
49. Robust Transformer-based model for spatiotemporal PM2.5 prediction in California
- Author
-
Tong, Weitian, Limperis, Jordan, Hamza-Lup, Felix, Xu, Yao, and Li, Lixin
- Published
- 2024
- Full Text
- View/download PDF
50. OpenMix: Exploring Outlier Samples for Misclassification Detection
- Author
-
Zhu, Fei, Cheng, Zhen, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Reliable confidence estimation for deep neural classifiers is a challenging yet fundamental requirement in high-stakes applications. Unfortunately, modern deep neural networks are often overconfident for their erroneous predictions. In this work, we exploit the easily available outlier samples, i.e., unlabeled samples coming from non-target classes, for helping detect misclassification errors. Particularly, we find that the well-known Outlier Exposure, which is powerful in detecting out-of-distribution (OOD) samples from unknown classes, does not provide any gain in identifying misclassification errors. Based on these observations, we propose a novel method called OpenMix, which incorporates open-world knowledge by learning to reject uncertain pseudo-samples generated via outlier transformation. OpenMix significantly improves confidence reliability under various scenarios, establishing a strong and unified framework for detecting both misclassified samples from known classes and OOD samples from unknown classes. The code is publicly available at https://github.com/Impression2805/OpenMix., Comment: Accepted by CVPR 2023 (Highlight)
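The abstract describes generating pseudo-samples via outlier transformation and learning to reject them. A hedged sketch of one such transformation — mixing an in-distribution sample with an outlier and shifting label mass to an extra reject class — is shown below; the mixing rule and `alpha` are illustrative assumptions, not the paper's exact recipe:

```python
import random

def openmix_pseudo(x_ind, y_ind, x_outlier, num_classes, alpha=0.3):
    """Mix an in-distribution sample with an outlier and soften its label
    toward an extra 'reject' class (index num_classes). Illustrative sketch."""
    lam = random.betavariate(alpha, alpha)          # mixup coefficient in (0, 1)
    x_mix = [lam * a + (1 - lam) * b for a, b in zip(x_ind, x_outlier)]
    y_mix = [0.0] * (num_classes + 1)
    y_mix[y_ind] = lam                              # residual known-class evidence
    y_mix[num_classes] = 1.0 - lam                  # mass on the reject class
    return x_mix, y_mix
```

Training a (K+1)-way classifier on such pairs teaches it to place probability on the reject class for uncertain inputs, which is the learning-to-reject behavior the abstract describes.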
- Published
- 2023