Author: "Li, Aoxue" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Li, Aoxue"' showing total 139 results

1. GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing

Author: Wang, Zhenyu, Li, Aoxue, Li, Zhenguo, and Liu, Xihui
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Despite the success achieved by existing image generation and editing methods, current models still struggle with complex problems including intricate text prompts, and the absence of verification and self-correction mechanisms makes the generated images unreliable. Meanwhile, a single model tends to specialize in particular tasks and possess the corresponding capabilities, making it inadequate for fulfilling all user requirements. We propose GenArtist, a unified image generation and editing system, coordinated by a multimodal large language model (MLLM) agent. We integrate a comprehensive range of existing models into the tool library and utilize the agent for tool selection and execution. For a complex problem, the MLLM agent decomposes it into simpler sub-problems and constructs a tree structure to systematically plan the procedure of generation, editing, and self-correction with step-by-step verification. By automatically generating missing position-related inputs and incorporating position information, the appropriate tool can be effectively employed to address each sub-problem. Experiments demonstrate that GenArtist can perform various generation and editing tasks, achieving state-of-the-art performance and surpassing existing models such as SDXL and DALL-E 3, as can be seen in Fig. 1. Project page is https://zhenyuw16.github.io/GenArtist_page.
Published: 2024

2. Towards Understanding the Working Mechanism of Text-to-Image Diffusion Model

Author: Yi, Mingyang, Li, Aoxue, Xin, Yi, and Li, Zhenguo
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Recently, the strong latent Diffusion Probabilistic Model (DPM) has been applied to high-quality Text-to-Image (T2I) generation (e.g., Stable Diffusion), by injecting the encoded target text prompt into the gradually denoised diffusion image generator. Despite the success of DPM in practice, the mechanism behind it remains to be explored. To fill this gap, we begin by examining the intermediate states during the gradual denoising generation process in DPM. The empirical observations indicate that the shape of the image is reconstructed after the first few denoising steps, and then the image is filled with details (e.g., texture). This phenomenon arises because the low-frequency signal (shape relevant) of the noisy image is not corrupted until the final stage of the forward noise-adding process (the initial stage of generation) in DPM. Inspired by the observations, we proceed to explore the influence of each token in the text prompt during the two stages. After a series of T2I generation experiments conditioned on a set of text prompts, we conclude that in the earlier generation stage, the image is mostly decided by the special token [EOS] in the text prompt, and the information in the text prompt is already conveyed in this stage. After that, the diffusion model completes the details of generated images using information from the images themselves. Finally, we propose to apply this observation to accelerate the process of T2I generation by properly removing text guidance, which accelerates sampling by up to 25%+.
Published: 2024

3. Enhancing Text-to-Image Editing via Hybrid Mask-Informed Fusion

Author: Li, Aoxue, Yi, Mingyang, and Li, Zhenguo
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Recently, text-to-image (T2I) editing has been greatly pushed forward by applying diffusion models. Despite the visual promise of the generated images, inconsistencies with the expected textual prompt remain prevalent. This paper aims to systematically improve the text-guided image editing techniques based on diffusion models, by addressing their limitations. Notably, the common approach in diffusion-based editing first reconstructs the source image via inversion techniques, e.g., DDIM Inversion, and then follows a fusion process that carefully integrates the source intermediate (hidden) states (obtained by inversion) with those of the target image. Unfortunately, such a standard pipeline fails in many cases due to the interference between texture retention and the creation of new characters in some regions. To mitigate this, we incorporate human annotation as external knowledge to confine editing within a "Mask-informed" region. Then we carefully Fuse the edited image with the source image and a constructed intermediate image within the model's Self-Attention module. Extensive empirical results demonstrate that the proposed "MaSaFusion" significantly improves the existing T2I editing techniques.
Published: 2024

4. Open-Vocabulary Object Detection with Meta Prompt Representation and Instance Contrastive Optimization

Author: Wang, Zhao, Li, Aoxue, Zhou, Fengwei, Li, Zhenguo, and Dou, Qi
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Classical object detectors are incapable of detecting novel class objects that have not been encountered before. To address this issue, Open-Vocabulary Object Detection (OVOD) has been proposed, which aims to detect the objects in the candidate class list. However, current OVOD models suffer from overfitting on the base classes, heavy reliance on large-scale extra data, and a complex training process. To overcome these issues, we propose a novel framework with Meta prompt and Instance Contrastive learning (MIC) schemes. Firstly, we simulate a novel-class-emerging scenario to help the prompt learner that learns class and background prompts generalize to novel classes. Secondly, we design an instance-level contrastive strategy to promote intra-class compactness and inter-class separation, which benefits generalization of the detector to novel class objects. Without using knowledge distillation, ensemble models, or extra training data during detector training, our proposed MIC outperforms previous SOTA methods trained with these complex techniques on LVIS. Most importantly, MIC shows great generalization ability on novel classes, e.g., with +4.3% and +1.9% AP improvement compared with previous SOTA on COCO and Objects365, respectively.
Comment: BMVC 2023
Published: 2024

5. Efficient Transferability Assessment for Selection of Pre-trained Detectors

Author: Wang, Zhao, Li, Aoxue, Li, Zhenguo, and Dou, Qi
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Large-scale pre-training followed by downstream fine-tuning is an effective solution for transferring deep-learning-based models. Since fine-tuning all possible pre-trained models is computationally costly, we aim to predict the transferability performance of these pre-trained models in a computationally efficient manner. Different from previous work that seeks out suitable models for downstream classification and segmentation tasks, this paper studies the efficient transferability assessment of pre-trained object detectors. To this end, we build a detector transferability benchmark which contains a large and diverse zoo of pre-trained detectors with various architectures, source datasets and training schemes. Given this zoo, we adopt 7 target datasets from 5 diverse domains as the downstream target tasks for evaluation. Further, we propose to assess classification and regression sub-tasks simultaneously in a unified framework. Additionally, we design a complementary metric for evaluating tasks with varying objects. Experimental results demonstrate that our method outperforms other state-of-the-art approaches in assessing transferability under different target domains while reducing wall-clock time by 32× and requiring a mere 5.2% of the memory footprint compared to brute-force fine-tuning of all pre-trained detectors.
Comment: WACV 2024
Published: 2024

6. Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation

Author: Wang, Zhenyu, Xie, Enze, Li, Aoxue, Wang, Zhongdao, Liu, Xihui, and Li, Zhenguo
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Despite significant advancements in text-to-image models for generating high-quality images, these methods still struggle to ensure the controllability of text prompts over images in the context of complex text prompts, especially when it comes to retaining object attributes and relationships. In this paper, we propose CompAgent, a training-free approach for compositional text-to-image generation, with a large language model (LLM) agent as its core. The fundamental idea underlying CompAgent is premised on a divide-and-conquer methodology. Given a complex text prompt containing multiple concepts including objects, attributes, and relationships, the LLM agent initially decomposes it, which entails the extraction of individual objects, their associated attributes, and the prediction of a coherent scene layout. These individual objects can then be independently conquered. Subsequently, the agent performs reasoning by analyzing the text, plans and employs the tools to compose these isolated objects. The verification and human feedback mechanism is finally incorporated into our agent to further correct the potential attribute errors and refine the generated images. Guided by the LLM agent, we propose a tuning-free multi-concept customization model and a layout-to-image generation model as the tools for concept composition, and a local image editing method as the tool to interact with the agent for verification. The scene layout controls the image generation process among these tools to prevent confusion among multiple objects. Extensive experiments demonstrate the superiority of our approach for compositional text-to-image generation: CompAgent achieves more than 10% improvement on T2I-CompBench, a comprehensive benchmark for open-world compositional T2I generation. The extension to various related tasks also illustrates the flexibility of our CompAgent for potential applications.
Published: 2024

7. CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects

Author: Wang, Zhao, Li, Aoxue, Zhu, Lingting, Guo, Yong, Dou, Qi, and Li, Zhenguo
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Customized text-to-video generation aims to generate high-quality videos guided by text prompts and subject references. Current approaches for personalizing text-to-video generation struggle to handle multiple subjects, which is a more challenging and practical scenario. In this work, our aim is to promote multi-subject guided text-to-video customization. We propose CustomVideo, a novel framework that can generate identity-preserving videos with the guidance of multiple subjects. To be specific, firstly, we encourage the co-occurrence of multiple subjects via composing them in a single image. Further, upon a basic text-to-video diffusion model, we design a simple yet effective attention control strategy to disentangle different subjects in the latent space of the diffusion model. Moreover, to help the model focus on the specific area of the object, we segment the object from given reference images and provide a corresponding object mask for attention learning. Also, we collect a multi-subject text-to-video generation dataset as a comprehensive benchmark, with 63 individual subjects from 13 different categories and 68 meaningful pairs. Extensive qualitative, quantitative, and user study results demonstrate the superiority of our method compared to previous state-of-the-art approaches. The project page is https://kyfafyd.wang/projects/customvideo.
Comment: 18 pages, 11 figures, 7 tables
Published: 2024

8. Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning

Author: Gou, Yunhao, Liu, Zhili, Chen, Kai, Hong, Lanqing, Xu, Hang, Li, Aoxue, Yeung, Dit-Yan, Kwok, James T., and Zhang, Yu
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Instruction tuning of Large Vision-language Models (LVLMs) has revolutionized the development of versatile models with zero-shot generalization across a wide range of downstream vision-language tasks. However, the diversity of training tasks of different sources and formats would lead to inevitable task conflicts, where different tasks conflict for the same set of model parameters, resulting in sub-optimal instruction-following abilities. To address that, we propose the Mixture of Cluster-conditional LoRA Experts (MoCLE), a novel Mixture of Experts (MoE) architecture designed to activate the task-customized model parameters based on the instruction clusters. A separate universal expert is further incorporated to improve generalization capabilities of MoCLE for novel instructions. Extensive experiments on InstructBLIP and LLaVA demonstrate the effectiveness of MoCLE.
Comment: Project website: https://gyhdog99.github.io/projects/mocle/
Published: 2023

9. UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation

Author: Wang, Haiyang, Tang, Hao, Shi, Shaoshuai, Li, Aoxue, Li, Zhenguo, Schiele, Bernt, and Wang, Liwei
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Jointly processing information from multiple sensors is crucial to achieving accurate and robust perception for reliable autonomous driving systems. However, current 3D perception research follows a modality-specific paradigm, leading to additional computation overheads and inefficient collaboration between different sensor data. In this paper, we present an efficient multi-modal backbone for outdoor 3D perception named UniTR, which processes a variety of modalities with unified modeling and shared parameters. Unlike previous works, UniTR introduces a modality-agnostic transformer encoder to handle these view-discrepant sensor data for parallel modal-wise representation learning and automatic cross-modal interaction without additional fusion steps. More importantly, to make full use of these complementary sensor types, we present a novel multi-modal integration strategy by both considering semantic-abundant 2D perspective and geometry-aware 3D sparse neighborhood relations. UniTR is also a fundamentally task-agnostic backbone that naturally supports different 3D perception tasks. It sets a new state-of-the-art performance on the nuScenes benchmark, achieving +1.1 NDS higher for 3D object detection and +12.0 higher mIoU for BEV map segmentation with lower inference latency. Code will be available at https://github.com/Haiyang-W/UniTR.
Comment: Accepted by ICCV2023
Published: 2023

10. ContraNeRF: Generalizable Neural Radiance Fields for Synthetic-to-real Novel View Synthesis via Contrastive Learning

Author: Yang, Hao, Hong, Lanqing, Li, Aoxue, Hu, Tianyang, Li, Zhenguo, Lee, Gim Hee, and Wang, Liwei
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Although many recent works have investigated generalizable NeRF-based novel view synthesis for unseen scenes, they seldom consider the synthetic-to-real generalization, which is desired in many practical applications. In this work, we first investigate the effects of synthetic data in synthetic-to-real novel view synthesis and surprisingly observe that models trained with synthetic data tend to produce sharper but less accurate volume densities. For pixels where the volume densities are correct, fine-grained details will be obtained. Otherwise, severe artifacts will be produced. To maintain the advantages of using synthetic data while avoiding its negative effects, we propose to introduce geometry-aware contrastive learning to learn multi-view consistent features with geometric constraints. Meanwhile, we adopt cross-view attention to further enhance the geometry perception of features by querying features across input views. Experiments demonstrate that under the synthetic-to-real setting, our method can render images with higher quality and better fine-grained details, outperforming existing generalizable novel view synthesis methods in terms of PSNR, SSIM, and LPIPS. When trained on real data, our method also achieves state-of-the-art results.
Published: 2023

11. CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds

Author: Wang, Haiyang, Ding, Lihe, Dong, Shaocong, Shi, Shaoshuai, Li, Aoxue, Li, Jianan, Li, Zhenguo, and Wang, Liwei
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D. Our proposed method first generates some high-quality 3D proposals by leveraging the class-aware local group strategy on the object surface voxels with the same semantic predictions, which considers semantic consistency and diverse locality abandoned in previous bottom-up approaches. Then, to recover the features of missed voxels due to incorrect voxel-wise segmentation, we build a fully sparse convolutional RoI pooling module to directly aggregate fine-grained spatial information from the backbone for further proposal refinement. It is memory-and-computation efficient and can better encode the geometry-specific features of each 3D proposal. Our model achieves state-of-the-art 3D detection performance with remarkable gains of +3.6% on ScanNet V2 and +2.6% on SUN RGB-D in terms of mAP@0.25. Code will be available at https://github.com/Haiyang-W/CAGroup3D.
Comment: Accepted by NeurIPS2022
Published: 2022

12. STC-IDS: Spatial-Temporal Correlation Feature Analyzing based Intrusion Detection System for Intelligent Connected Vehicles

Author: Cheng, Pengzhou, Han, Mu, Li, Aoxue, and Zhang, Fengwei
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence
Abstract: Intrusion detection is an important defensive measure for automotive communications security. Accurate frame detection models help vehicles avoid malicious attacks. Uncertainty and diversity regarding attack methods make this task challenging. However, existing works are limited in that they consider only local features or rely on a weak feature mapping of multi-features. To address these limitations, we present a novel model for automotive intrusion detection by spatial-temporal correlation features of in-vehicle communication traffic (STC-IDS). Specifically, the proposed model exploits an encoding-detection architecture. In the encoder part, spatial and temporal relations are encoded simultaneously. To strengthen the relationship between features, the attention-based convolutional network captures spatial and channel features to increase the receptive field, while attention-LSTM builds meaningful relationships from previous time series or crucial bytes. The encoded information is then passed to the detector for generating forceful spatial-temporal attention features and enabling anomaly classification. In particular, single-frame and multi-frame models are constructed, each offering different advantages. Under automatic hyper-parameter selection based on Bayesian optimization, the model is trained to attain the best performance. Extensive empirical studies based on a real-world vehicle attack dataset demonstrate that STC-IDS outperforms baseline methods and achieves lower false-alarm rates while maintaining efficiency.
Published: 2022

13. Persistent transcriptional changes in cardiac adaptive immune cells following myocardial infarction: New evidence from the re-analysis of publicly available single cell and nuclei RNA-sequencing data sets

Author: de Winter, Natasha, Ji, Jiahui, Sintou, Amalia, Forte, Elvira, Lee, Michael, Noseda, Michela, Li, Aoxue, Koenig, Andrew L., Lavine, Kory J., Hayat, Sikander, Rosenthal, Nadia, Emanueli, Costanza, Srivastava, Prashant K., and Sattler, Susanne
Published: 2024

14. Predicting Cd accumulation in rice and identifying nonlinear effects of soil nutrient elements based on machine learning methods

Author: Li, Aoxue, Kong, Linglan, Peng, Chi, Feng, Wenli, Zhang, Yan, and Guo, Zhaohui
Published: 2024

15. Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection

Author: Hu, Hanzhe, Bai, Shuai, Li, Aoxue, Cui, Jinshi, and Wang, Liwei
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Conventional deep learning based methods for object detection require a large amount of bounding box annotations for training, and such high-quality annotated data is expensive to obtain. Few-shot object detection, which learns to adapt to novel classes with only a few annotated examples, is very challenging since the fine-grained features of novel objects can be easily overlooked with only a few data available. In this work, aiming to fully exploit features of annotated novel objects and capture fine-grained features of the query object, we propose Dense Relation Distillation with Context-aware Aggregation (DCNet) to tackle the few-shot detection problem. Built on the meta-learning based framework, the Dense Relation Distillation module aims to fully exploit support features, where support features and query features are densely matched, covering all spatial locations in a feed-forward fashion. The abundant usage of the guidance information endows the model with the capability to handle common challenges such as appearance changes and occlusions. Moreover, to better capture scale-aware features, the Context-aware Aggregation module adaptively harnesses features from different scales for a more comprehensive feature representation. Extensive experiments illustrate that our proposed approach achieves state-of-the-art results on PASCAL VOC and MS COCO datasets. Code will be made available at https://github.com/hzhupku/DCNet.
Comment: Accepted by CVPR2021
Published: 2021

16. ZooKT: Task-adaptive knowledge transfer of Model Zoo for few-shot learning

Author: Zhang, Baoquan, Shan, Bingqi, Li, Aoxue, Luo, Chuyao, Ye, Yunming, and Li, Zhenguo
Published: 2025

17. Boosting Few-Shot Learning With Adaptive Margin Loss

Author: Li, Aoxue, Huang, Weiran, Lan, Xu, Feng, Jiashi, Li, Zhenguo, and Wang, Liwei
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Few-shot learning (FSL) has attracted increasing attention in recent years but remains challenging, due to the intrinsic difficulty in learning to generalize from a few examples. This paper proposes an adaptive margin principle to improve the generalization ability of metric-based meta-learning approaches for few-shot learning problems. Specifically, we first develop a class-relevant additive margin loss, where semantic similarity between each pair of classes is considered to separate samples in the feature embedding space from similar classes. Further, we incorporate the semantic context among all classes in a sampled training task and develop a task-relevant additive margin loss to better distinguish samples from different classes. Our adaptive margin method can be easily extended to a more realistic generalized FSL setting. Extensive experiments demonstrate that the proposed method can boost the performance of current metric-based meta-learning approaches, under both the standard FSL and generalized FSL settings.
Comment: Accepted by CVPR 2020
Published: 2020
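
The class-relevant additive margin idea described in the abstract of entry 17 can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the cosine similarity measure, and the linear margin rule `alpha * similarity + beta` are assumptions chosen for illustration. The sketch adds a similarity-dependent margin to every non-target logit before a standard softmax cross-entropy, so that semantically similar classes are penalized more strongly.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def adaptive_margin_loss(logits, label, class_embeddings, alpha=1.0, beta=0.0):
    """Softmax cross-entropy with a margin added to each non-target logit.

    The margin for class k grows with the semantic similarity between the
    target class and class k (hypothetical linear rule: alpha * sim + beta).
    """
    target_emb = class_embeddings[label]
    shifted = []
    for k, score in enumerate(logits):
        if k == label:
            shifted.append(score)
        else:
            margin = alpha * cosine(target_emb, class_embeddings[k]) + beta
            shifted.append(score + margin)
    # Numerically stable log-sum-exp over the shifted logits.
    top = max(shifted)
    log_sum = top + math.log(sum(math.exp(s - top) for s in shifted))
    return log_sum - shifted[label]


# Toy example: class 1 is semantically close to class 0, class 2 is not.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
plain = adaptive_margin_loss([2.0, 1.0, 0.0], 0, embeddings, alpha=0.0)
with_margin = adaptive_margin_loss([2.0, 1.0, 0.0], 0, embeddings, alpha=1.0)
```

With `alpha=0` and `beta=0` the function reduces to ordinary cross-entropy; a positive `alpha` raises the loss for the same logits, which pushes the model to separate similar classes by a larger gap.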

18. Few-Shot Learning with Global Class Representations

Author: Luo, Tiange, Li, Aoxue, Xiang, Tao, Huang, Weiran, and Wang, Liwei
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this paper, we propose to tackle the challenging few-shot learning (FSL) problem by learning global class representations using both base and novel class training samples. In each training episode, an episodic class mean computed from a support set is registered with the global representation via a registration module. This produces a registered global class representation for computing the classification loss using a query set. Though following a similar episodic training pipeline as existing meta learning based approaches, our method differs significantly in that novel class training samples are involved in the training from the beginning. To compensate for the lack of novel class training samples, an effective sample synthesis strategy is developed to avoid overfitting. Importantly, by joint base-novel class training, our approach can be easily extended to a more practical yet challenging FSL setting, i.e., generalized FSL, where the label space of test data is extended to both base and novel classes. Extensive experiments show that our approach is effective for both FSL settings.
Comment: Accepted by ICCV2019
Published: 2019
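
The "episodic class mean computed from a support set" mentioned in the abstract of entry 18 can be sketched as follows. This covers only the mean computation and a nearest-mean classification of a query; the registration module and the global representations from the paper are not modeled. The function names and the toy data are hypothetical.

```python
def class_means(support):
    """Average the feature vectors of each class in a support set.

    support: dict mapping a class label to a list of feature vectors.
    Returns a dict mapping each label to its mean feature vector.
    """
    means = {}
    for label, feats in support.items():
        dim = len(feats[0])
        means[label] = [sum(f[d] for f in feats) / len(feats) for d in range(dim)]
    return means


def nearest_class(query, means):
    """Assign a query vector to the class whose mean is closest (squared L2)."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(means, key=lambda label: sq_dist(query, means[label]))


# Toy two-way, two-shot episode.
support_set = {
    "cat": [[0.0, 0.0], [2.0, 0.0]],
    "dog": [[10.0, 10.0], [12.0, 10.0]],
}
episode_means = class_means(support_set)
prediction = nearest_class([1.5, 0.5], episode_means)
```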

19. A New Microscopic Traffic Model Using a Spring-Mass-Damper-Clutch System

Author: Li, Zhaojian, Khasawneh, Firas, Yin, Xiang, Li, Aoxue, and Song, Ziyou
Subjects: Computer Science - Systems and Control
Abstract: Microscopic traffic models describe how cars interact with their neighbors in an uninterrupted traffic flow and are frequently used for reference in advanced vehicle control design. In this paper, we propose a novel mechanical system inspired microscopic traffic model using a mass-spring-damper-clutch system. This model naturally captures the ego vehicle's resistance to large relative speed and deviation from a (driver and speed dependent) desired relative distance when following the lead vehicle. Compared to existing car following (CF) models, this model offers physically interpretable insights on the underlying CF dynamics, and is able to characterize the impact of the ego vehicle on the lead vehicle, which is neglected in existing CF models. Thanks to the nonlinear wave propagation analysis techniques available for mechanical systems, the proposed model scales well: multiple mass-spring-damper-clutch systems can be chained to study the macroscopic traffic flow. We investigate the stability of the proposed model with respect to the system parameters and the time delay using the spectral element method. We also develop a parallel recursive least squares with inverse QR decomposition (PRLS-IQR) algorithm to identify the model parameters online. These real-time estimated parameters can be used to predict the driving trajectory that can be incorporated in advanced vehicle longitudinal control systems for improved safety and fuel efficiency. The PRLS-IQR is computationally efficient and numerically stable so it is suitable for online implementation. The traffic model and the parameter identification algorithm are validated on both simulations and naturalistic driving data from multiple drivers. Promising performance is demonstrated.
Published: 2019
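
The spring-and-damper intuition in the abstract of entry 19 can be illustrated with a simplified simulation. This sketch is an assumption-laden reduction, not the paper's model: it omits the clutch, the effect of the ego vehicle on the lead vehicle, and any time delay, and all parameter values and names are hypothetical. The "spring" term resists deviation from a desired gap and the "damper" term resists relative speed.

```python
def simulate_following(k=0.5, c=1.2, desired_gap=20.0, lead_speed=15.0,
                       initial_gap=40.0, ego_speed=10.0, dt=0.05, steps=4000):
    """Simulate an ego vehicle following a lead vehicle at constant speed.

    Ego acceleration = k * (gap - desired_gap) + c * (lead_speed - ego_speed),
    integrated with a semi-implicit Euler step of size dt.
    Returns the final gap (m) and the final ego speed (m/s).
    """
    lead_pos = initial_gap
    ego_pos = 0.0
    for _ in range(steps):
        gap = lead_pos - ego_pos
        accel = k * (gap - desired_gap) + c * (lead_speed - ego_speed)
        ego_speed += accel * dt          # update speed first
        ego_pos += ego_speed * dt        # then position with the new speed
        lead_pos += lead_speed * dt      # lead vehicle cruises at constant speed
    return lead_pos - ego_pos, ego_speed


final_gap, final_speed = simulate_following()
```

With these illustrative parameters the gap error behaves like a damped oscillator, so after 200 simulated seconds the ego vehicle settles at the desired gap and matches the lead vehicle's speed.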

20. An algebraic projection procedure for construction of the basis vectors of irreducible representations of U(4) in the SU_S(2)⊗SU_T(2) basis

Author: Pan, Feng, Wu, Yingxin, Li, Aoxue, Zhang, Yuqing, Dai, Lianrong, and Draayer, J. P.
Published: 2023

21. 3D Object Detection under Urban Road Traffic Scenarios Based on Dual-Layer Voxel Features Fusion Augmentation

Author: Jiang, Haobin, Ren, Junhao, and Li, Aoxue
Published: 2024

22. Mitochondrial abnormalities: a hub in metabolic syndrome-related cardiac dysfunction caused by oxidative stress

Author: Li, Aoxue, Zheng, Ningning, and Ding, Xudong
Published: 2022

23. Zero and Few Shot Learning with Semantic Feature Synthesis and Competitive Learning

Author: Lu, Zhiwu, Guan, Jiechao, Li, Aoxue, Xiang, Tao, Zhao, An, and Wen, Ji-Rong
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Zero-shot learning (ZSL) is made possible by learning a projection function between a feature space and a semantic space (e.g., an attribute space). Key to ZSL is thus to learn a projection that is robust against the often large domain gap between the seen and unseen class domains. In this work, this is achieved by unseen class data synthesis and robust projection function learning. Specifically, a novel semantic data synthesis strategy is proposed, by which semantic class prototypes (e.g., attribute vectors) are used to simply perturb seen class data for generating unseen class ones. As in any data synthesis/hallucination approach, there are ambiguities and uncertainties on how well the synthesised data can capture the targeted unseen class data distribution. To cope with this, the second contribution of this work is a novel projection learning model termed competitive bidirectional projection learning (BPL) designed to best utilise the ambiguous synthesised data. Specifically, we assume that each synthesised data point can belong to any unseen class; and the most likely two class candidates are exploited to learn a robust projection function in a competitive fashion. As a third contribution, we show that the proposed ZSL model can be easily extended to few-shot learning (FSL) by again exploiting semantic (class prototype guided) feature synthesis and competitive BPL. Extensive experiments show that our model achieves the state-of-the-art results on both problems.
Comment: Submitted to IEEE TPAMI
Published: 2018

24. Transferrable Feature and Projection Learning with Class Hierarchy for Zero-Shot Learning

Author: Li, Aoxue, Lu, Zhiwu, Guan, Jiechao, Xiang, Tao, Wang, Liwei, and Wen, Ji-Rong
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Zero-shot learning (ZSL) aims to transfer knowledge from seen classes to unseen ones so that the latter can be recognised without any training samples. This is made possible by learning a projection function between a feature space and a semantic space (e.g. attribute space). Considering the seen and unseen classes as two domains, a big domain gap often exists which challenges ZSL. Inspired by the fact that an unseen class is not exactly 'unseen' if it belongs to the same superclass as a seen class, we propose a novel inductive ZSL model that leverages superclasses as the bridge between seen and unseen classes to narrow the domain gap. Specifically, we first build a class hierarchy of multiple superclass layers and a single class layer, where the superclasses are automatically generated by data-driven clustering over the semantic representations of all seen and unseen class names. We then exploit the superclasses from the class hierarchy to tackle the domain gap challenge in two aspects: deep feature learning and projection function learning. First, to narrow the domain gap in the feature space, we integrate a recurrent neural network (RNN) defined with the superclasses into a convolutional neural network (CNN), in order to enforce the superclass hierarchy. Second, to further learn a transferrable projection function for ZSL, a novel projection function learning method is proposed by exploiting the superclasses to align the two domains. Importantly, our transferrable feature and projection learning methods can be easily extended to a closely related task: few-shot learning (FSL). Extensive experiments show that the proposed model significantly outperforms the state-of-the-art alternatives in both ZSL and FSL tasks.
Comment: Submitted to IJCV
Published: 2018

25. Zero-Shot Fine-Grained Classification by Deep Feature Learning with Semantics

Author: Li, Aoxue, Lu, Zhiwu, Wang, Liwei, Xiang, Tao, Li, Xinqi, and Wen, Ji-Rong
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Fine-grained image classification, which aims to distinguish images with subtle distinctions, is a challenging task due to two main issues: lack of sufficient training data for every class and difficulty in learning discriminative features for representation. In this paper, to address the two issues, we propose a two-phase framework for recognizing images from unseen fine-grained classes, i.e. zero-shot fine-grained classification. In the first feature learning phase, we finetune deep convolutional neural networks using hierarchical semantic structure among fine-grained classes to extract discriminative deep visual features. Meanwhile, a domain adaptation structure is induced into deep convolutional neural networks to avoid domain shift from training data to test data. In the second label inference phase, a semantic directed graph is constructed over attributes of fine-grained classes. Based on this graph, we develop a label propagation algorithm to infer the labels of images in the unseen classes. Experimental results on two benchmark datasets demonstrate that our model outperforms the state-of-the-art zero-shot learning models. In addition, the features obtained by our feature learning model also yield significant gains when they are used by other zero-shot learning models, which shows the flexibility of our model in zero-shot fine-grained classification. Comment: This paper has been submitted to IEEE TIP for peer-review
Published: 2017

26. Accurate Pulmonary Nodule Detection in Computed Tomography Images Using Deep Convolutional Neural Networks

Author: Ding, Jia, Li, Aoxue, Hu, Zhiqiang, and Wang, Liwei
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Early detection of pulmonary cancer is the most promising way to enhance a patient's chance for survival. Accurate pulmonary nodule detection in computed tomography (CT) images is a crucial step in diagnosing pulmonary cancer. In this paper, inspired by the successful use of deep convolutional neural networks (DCNNs) in natural image recognition, we propose a novel pulmonary nodule detection approach based on DCNNs. We first introduce a deconvolutional structure to Faster Region-based Convolutional Neural Network (Faster R-CNN) for candidate detection on axial slices. Then, a three-dimensional DCNN is presented for the subsequent false positive reduction. Experimental results of the LUng Nodule Analysis 2016 (LUNA16) Challenge demonstrate the superior detection performance of the proposed approach on nodule detection (average FROC-score of 0.891, ranking first among all submitted results). Comment: MICCAI 2017 accepted
Published: 2017

27. Risk Assessment of Roundabout Scenarios in Virtual Testing Based on an Improved Driving Safety Field.

Author: Chen, Wentao, Li, Aoxue, and Jiang, Haobin
Subjects: *TRAFFIC safety, *TRAFFIC flow, *MOTOR vehicle driving, *AUTONOMOUS vehicles, *SOCIAL forces, *TRAFFIC circles
Abstract: With the advancement of autonomous driving technology, scenario-based testing has become the mainstream testing method for intelligent vehicles. However, traditional risk indicators often fail in roundabout scenarios and cannot accurately define dangerous situations. To accurately quantify driving risks in roundabout scenarios, an improved driving safety field model is proposed in this paper. First, considering the unique traffic flow characteristics of roundabouts, the dynamic characteristics of vehicles during diverging or merging were taken into account, and the driving safety field model was improved to accurately quantify the driving risks in roundabout scenarios. Second, based on data from the rounD dataset, the model parameters were calibrated using the social force model. Finally, a DENCLUE-like method was used to extract collision systems, calculate vehicle risk degree, and analyze these risks for both the temporal and the spatial dimensions, providing guidance for virtual testing. The proposed method significantly improves detection efficiency, increasing the number of identified dangerous scenarios by 175% compared to the Time to Collision (TTC) method. Moreover, this method can more accurately quantify driving risks in roundabout scenarios and enhance the efficiency of generating dangerous scenarios, contributing to promoting the safety of autonomous vehicles. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

28. A review of traffic behaviour and intelligent driving at roundabouts based on a microscopic perspective.

Author: Jiang, Haobin, Shen, Qingyuan, Li, Aoxue, and Yin, Chenhui
Abstract: The contradiction between increasing traffic and the relatively poor roundabout infrastructure is getting stronger. The control and optimization of the macroscopic traffic flow need to be improved to resolve congestion and safety problems at roundabouts and the connected road network. In order to better understand the gaps and trends in this field, we have systematically reviewed the main research and developments in traffic phenomena, driving behaviour, autonomous vehicles (AVs), intelligent connected vehicles and real vehicle trajectory data sets at roundabouts. The study is based on 388 papers about roundabouts, selected through a comprehensive literature search. The review demonstrates that based on a microscopic perspective, sensing, prediction, decision-making, planning and control aspects of AVs and intelligent connected vehicles can be designed and optimized to fundamentally and significantly improve traffic capacity and driving safety at roundabouts. However, the generation mechanism of traffic conflicts among traffic participants at roundabouts is complex, which is a tremendous challenge for the systematic design of AVs. Therefore, based on naturalistic driving data and machine learning theory, it is an important research direction to build driver models by learning and imitating human driver decision-making and driving behaviours. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

29. Targeting RNA‐binding motif protein 39 for arginine reduction: unveiling metabolic vulnerability in arginine‐dependent liver cancer.

Author: Li, Aoxue, Cui, Hongjuan, and Zhao, Erhu
Subjects: RNA-binding proteins, LIVER cancer, METABOLIC reprogramming, ARGININE, AMINO acid metabolism, METABOLIC disorders
Abstract: Cancer is increasingly acknowledged as a metabolic disease, characterized by metabolic reprogramming as its hallmark. However, the precise mechanisms behind this phenomenon and the factors contributing to tumorigenicity are still poorly understood. In a recent publication in Cell, Mossmann and colleagues reported a study unveiling arginine as a molecule with second messenger‐like properties that reshapes metabolism to facilitate tumor development in hepatocellular carcinoma (HCC). Their research revealed that the RNA‐binding motif protein 39 (RBM39)‐mediated increase in asparagine synthesis results in increased arginine uptake. This establishes a positive feedback loop that sustains elevated levels of arginine and facilitates oncogenic metabolic reprogramming. Additionally, Mossmann et al. demonstrated that depleting RBM39 with indisulam effectively disrupts the proto‐oncogenic metabolic reprogramming in HCC. This discovery presents a novel treatment strategy for arginine‐dependent liver cancers. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

30. Efficient Transferability Assessment for Selection of Pre-trained Detectors

Author: Wang, Zhao, primary, Li, Aoxue, additional, Li, Zhenguo, additional, and Dou, Qi, additional
Published: 2024
Full Text: View/download PDF

31. Predicting Cd accumulation in rice and identifying nonlinear effects of soil nutrient elements based on machine learning methods

Author: Li, Aoxue, primary, Kong, Linglan, additional, Peng, Chi, additional, Feng, Wenli, additional, Zhang, Yan, additional, and Guo, Zhaohui, additional
Published: 2023
Full Text: View/download PDF

32. Transferrable Feature and Projection Learning with Class Hierarchy for Zero-Shot Learning

Author: Li, Aoxue, Lu, Zhiwu, Guan, Jiechao, Xiang, Tao, Wang, Liwei, and Wen, Ji-Rong
Published: 2020
Full Text: View/download PDF

33. Effect of Anesthesia Intensive Care Unit during the COVID-19 Pandemic

Author: Chen, Hong, Zhang, Lili, Wang, Yuwen, Li, Aoxue, Zhang, Ye, and Wu, Yun
Published: 2023
Full Text: View/download PDF

34. Extended Heine-Stieltjes polynomials related to the isovector pairing model

Author: Pan, Feng, He, Yingwen, Li, Aoxue, Wang, Yu, Wu, Yingxin, and Draayer, J. P.
Published: 2021
Full Text: View/download PDF

35. Mean-field plus quadrupole–quadrupole and pairing model in the ds-shell

Author: Pan, Feng, He, Yingwen, Li, Aoxue, Wu, Yingxin, Zhou, Dan, and Draayer, Jerry P.
Published: 2021
Full Text: View/download PDF

36. ContraNeRF: Generalizable Neural Radiance Fields for Synthetic-to-real Novel View Synthesis via Contrastive Learning

Author: Yang, Hao, primary, Hong, Lanqing, additional, Li, Aoxue, additional, Hu, Tianyang, additional, Li, Zhenguo, additional, Lee, Gim Hee, additional, and Wang, Liwei, additional
Published: 2023
Full Text: View/download PDF

37. An exact solution of the homogenous trimer Bose-Hubbard model

Author: Pan, Feng, primary, Li, Aoxue, additional, Wu, Yingxin, additional, and Draayer, J P, additional
Published: 2023
Full Text: View/download PDF

38. An Algebraic Projection Procedure for Construction of the Basis Vectors of Irreducible Representations of U(4) in the SU_S(2)⊗SU_T(2) Basis

Author: Pan, Feng, primary, Wu, Yingxin, additional, Li, Aoxue, additional, Dai, Lianrong, additional, and Draayer, Jerry P., additional
Published: 2023
Full Text: View/download PDF

39. The Rotor-Vibrator Plus Multi-Particle-Hole Description of 154Gd

Author: Wu, Yingxin, primary, Li, Aoxue, additional, Pan, Feng, additional, Dai, Lianrong, additional, and Draayer, Jerry P., additional
Published: 2022
Full Text: View/download PDF

40. The Particle-Rotor-Quadrupole-Coupling Model for Transitional Odd-A Nuclei

Author: Li, Aoxue, primary, Wu, Yingxin, additional, Zhang, Yu, additional, Feng, Ziwei, additional, Pan, Feng, additional, and Dai, Lianrong, additional
Published: 2022
Full Text: View/download PDF

41. Federated learning‐based trajectory prediction model with privacy preserving for intelligent vehicle

Author: Han, Mu, primary, Xu, Kai, additional, Ma, Shidian, additional, Li, Aoxue, additional, and Jiang, Haobin, additional
Published: 2022
Full Text: View/download PDF

42. STC‐IDS: Spatial–temporal correlation feature analyzing based intrusion detection system for intelligent connected vehicles

Author: Cheng, Pengzhou, primary, Han, Mu, additional, Li, Aoxue, additional, and Zhang, Fengwei, additional
Published: 2022
Full Text: View/download PDF

43. Accurate Pulmonary Nodule Detection in Computed Tomography Images Using Deep Convolutional Neural Networks

Author: Ding, Jia, primary, Li, Aoxue, additional, Hu, Zhiqiang, additional, and Wang, Liwei, additional
Published: 2017
Full Text: View/download PDF

44. Conservation and Development: Spatial Identification of Relative Poverty Areas Affected by Protected Areas in China and Its Spatiotemporal Evolutionary Characteristics

Author: He, Xi, primary, Li, Aoxue, additional, Li, Junhong, additional, and Zhuang, Youbo, additional
Published: 2022
Full Text: View/download PDF

45. Semi-Supervised Object Detection via Multi-instance Alignment with Global Class Prototypes

Author: Li, Aoxue, primary, Yuan, Peng, additional, and Li, Zhenguo, additional
Published: 2022
Full Text: View/download PDF

46. Dynamic Local Path Planning for Intelligent Vehicles Based on Sampling Area Point Discrete and Quadratic Programming

Author: Jiang, Haobin, primary, Pi, Jian, additional, Li, Aoxue, additional, and Yin, Chenhui, additional
Published: 2022
Full Text: View/download PDF

47. Zero and Few Shot Learning With Semantic Feature Synthesis and Competitive Learning

Author: Guan, Jiechao, primary, Lu, Zhiwu, additional, Xiang, Tao, additional, Li, Aoxue, additional, Zhao, An, additional, and Wen, Ji-Rong, additional
Published: 2021
Full Text: View/download PDF

48. Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection

Author: Hu, Hanzhe, primary, Bai, Shuai, additional, Li, Aoxue, additional, Cui, Jinshi, additional, and Wang, Liwei, additional
Published: 2021
Full Text: View/download PDF

49. Transformation Invariant Few-Shot Object Detection

Author: Li, Aoxue, primary and Li, Zhenguo, additional
Published: 2021
Full Text: View/download PDF

50. Mitochondrial abnormalities: a hub in metabolic syndrome-related cardiac dysfunction caused by oxidative stress

Author: Li, Aoxue, primary, Zheng, Ningning, additional, and Ding, Xudong, additional
Published: 2021
Full Text: View/download PDF
