Author: "Xu, Xinxing" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Xu, Xinxing"' showing total 530 results

1. BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays

Author: Zhou, Yang, Faith, Tan Li Hui, Xu, Yanyu, Leng, Sicong, Xu, Xinxing, Liu, Yong, and Goh, Rick Siow Mong
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Medical Vision-Language Pretraining (MedVLP) shows promise in learning generalizable and transferable visual representations from paired and unpaired medical images and reports. MedVLP can provide useful features to downstream tasks and facilitate adapting task-specific models to new setups using fewer examples. However, existing MedVLP methods often differ in terms of datasets, preprocessing, and finetuning implementations. This poses great challenges in evaluating how well a MedVLP method generalizes to various clinically relevant tasks due to the lack of a unified, standardized, and comprehensive benchmark. To fill this gap, we propose BenchX, a unified benchmark framework that enables head-to-head comparison and systematic analysis between MedVLP methods using public chest X-ray datasets. Specifically, BenchX is composed of three components: 1) Comprehensive datasets covering nine datasets and four medical tasks; 2) Benchmark suites to standardize data preprocessing, train-test splits, and parameter selection; 3) Unified finetuning protocols that accommodate heterogeneous MedVLP methods for consistent task adaptation in classification, segmentation, and report generation, respectively. Utilizing BenchX, we establish baselines for nine state-of-the-art MedVLP methods and find that the performance of some early MedVLP methods can be enhanced to surpass more recent ones, prompting a revisiting of the developments and conclusions from prior works in MedVLP. Our code is available at https://github.com/yangzhou12/BenchX.
Comment: Accepted to NeurIPS24 Datasets and Benchmarks Track
Published: 2024

2. Enhancing Community Vision Screening -- AI Driven Retinal Photography for Early Disease Detection and Patient Trust

Author: Lei, Xiaofeng, Tham, Yih-Chung, Goh, Jocelyn Hui Lin, Feng, Yangqin, Bai, Yang, Da Soh, Zhi, Goh, Rick Siow Mong, Xu, Xinxing, Liu, Yong, and Cheng, Ching-Yu
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Community vision screening plays a crucial role in identifying individuals with vision loss and preventing avoidable blindness, particularly in rural communities where access to eye care services is limited. Currently, there is a pressing need for a simple and efficient process to screen and refer individuals with significant eye disease-related vision loss to tertiary eye care centers for further care. An ideal solution should seamlessly and readily integrate with existing workflows, providing comprehensive initial screening results to service providers, thereby enabling precise patient referrals for timely treatment. This paper introduces the Enhancing Community Vision Screening (ECVS) solution, which addresses the aforementioned concerns with a novel and feasible solution based on simple, non-invasive retinal photography for the detection of pathology-based visual impairment. Our study employs four distinct deep learning models: RETinal photo Quality Assessment (RETQA), Pathology Visual Impairment detection (PVI), Eye Disease Diagnosis (EDD) and Visualization of Lesion Regions of the eye (VLR). We conducted experiments on over 10 datasets, totaling more than 80,000 fundus photos collected from various sources. The models integrated into ECVS achieved impressive AUC scores of 0.98 for RETQA, 0.95 for PVI, and 0.90 for EDD, along with a DICE coefficient of 0.48 for VLR. These results underscore the promising capabilities of ECVS as a straightforward and scalable method for community-based vision screening.
Comment: 11 pages, 4 figures, published in MICCAI2024 OMIA XI workshop
Published: 2024

3. A New Perspective to Boost Performance Fairness for Medical Federated Learning

Author: Yan, Yunlu, Zhu, Lei, Li, Yuexiang, Xu, Xinxing, Goh, Rick Siow Mong, Liu, Yong, Khan, Salman, and Feng, Chun-Mei
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security, Computer Science - Computers and Society, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Improving the fairness of federated learning (FL) benefits healthy and sustainable collaboration, especially for medical applications. However, existing fair FL methods ignore the specific characteristics of medical FL applications, i.e., domain shift among the datasets from different hospitals. In this work, we propose Fed-LWR to improve performance fairness from the perspective of feature shift, a key issue influencing the performance of medical FL systems caused by domain shift. Specifically, we dynamically perceive the bias of the global model across all hospitals by estimating the layer-wise difference in feature representations between local and global models. To minimize global divergence, we assign higher weights to hospitals with larger differences. The estimated client weights help us to re-aggregate the local models per layer to obtain a fairer global model. We evaluate our method on two widely used federated medical image segmentation benchmarks. The results demonstrate that our method achieves better and fairer performance compared with several state-of-the-art fair FL methods.
Comment: 11 pages, 2 Figures
Published: 2024
Full Text: View/download PDF
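
The layer-wise re-aggregation described in the abstract of record 3 can be pictured with a minimal sketch. It assumes each hospital already has one scalar per layer measuring how far its local features are from those of the global model. The abstract does not state the exact weighting rule, so plain proportional normalization is used here as a stand-in, and the function names are hypothetical rather than taken from the authors' code.

```python
def layer_weights(differences):
    """Turn per-client feature differences for one layer into weights.

    Assumption: weights are proportional to the differences, so clients
    whose features deviate more from the global model get higher weight.
    """
    total = sum(differences)
    if total == 0:
        # No measurable difference: fall back to a plain average.
        return [1.0 / len(differences)] * len(differences)
    return [d / total for d in differences]


def reaggregate_layer(client_params, differences):
    """Weighted average of one layer's parameters across clients."""
    weights = layer_weights(differences)
    dim = len(client_params[0])
    return [
        sum(w * params[i] for w, params in zip(weights, client_params))
        for i in range(dim)
    ]


# Two hypothetical hospitals, one layer with two parameters.
params = [[0.0, 4.0], [4.0, 0.0]]
diffs = [1.0, 3.0]  # the second hospital deviates more from the global model
print(layer_weights(diffs))
print(reaggregate_layer(params, diffs))
```

Repeating the same call for every layer, each with its own difference values, gives a per-layer aggregation in the spirit of the abstract.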

4. UrFound: Towards Universal Retinal Foundation Models via Knowledge-Guided Masked Modeling

Author: Yu, Kai, Zhou, Yang, Bai, Yang, Da Soh, Zhi, Xu, Xinxing, Goh, Rick Siow Mong, Cheng, Ching-Yu, and Liu, Yong
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Retinal foundation models aim to learn generalizable representations from diverse retinal images, facilitating label-efficient model adaptation across various ophthalmic tasks. Despite their success, current retinal foundation models are generally restricted to a single imaging modality, such as Color Fundus Photography (CFP) or Optical Coherence Tomography (OCT), limiting their versatility. Moreover, these models may struggle to fully leverage expert annotations and overlook the valuable domain knowledge essential for domain-specific representation learning. To overcome these limitations, we introduce UrFound, a retinal foundation model designed to learn universal representations from both multimodal retinal images and domain knowledge. UrFound is equipped with a modality-agnostic image encoder and accepts either CFP or OCT images as inputs. To integrate domain knowledge into representation learning, we encode expert annotation in text supervision and propose a knowledge-guided masked modeling strategy for model pre-training. It involves reconstructing randomly masked patches of retinal images while predicting masked text tokens conditioned on the corresponding retinal image. This approach aligns multimodal images and textual expert annotations within a unified latent space, facilitating generalizable and domain-specific representation learning. Experimental results demonstrate that UrFound exhibits strong generalization ability and data efficiency when adapting to various tasks in retinal image analysis. By training on ~180k retinal images, UrFound significantly outperforms the state-of-the-art retinal foundation model trained on up to 1.6 million unlabelled images across 8 public retinal datasets. Our code and data are available at https://github.com/yukkai/UrFound.
Published: 2024

5. CPT: Consistent Proxy Tuning for Black-box Optimization

Author: He, Yuanyang, Huang, Zitong, Xu, Xinxing, Goh, Rick Siow Mong, Khan, Salman, Zuo, Wangmeng, Liu, Yong, and Feng, Chun-Mei
Subjects: Computer Science - Machine Learning
Abstract: Black-box tuning has attracted recent attention because the structure or inner parameters of advanced proprietary models are not accessible. Proxy-tuning provides a test-time output adjustment for tuning black-box language models. It applies the difference of the output logits before and after tuning a smaller white-box "proxy" model to improve the black-box model. However, this technique serves only as a decoding-time algorithm, leading to an inconsistency between training and testing which potentially limits overall performance. To address this problem, we introduce Consistent Proxy Tuning (CPT), a simple yet effective black-box tuning method. Unlike Proxy-tuning, CPT additionally exploits the frozen large black-box model and another frozen small white-box model, ensuring consistency between the training-stage optimization objective and the test-time proxies. This consistency benefits Proxy-tuning and enhances model performance. Note that our method focuses solely on logit-level computation, which makes it model-agnostic and applicable to any task involving logit classification. Extensive experimental results demonstrate the superiority of our CPT in both black-box tuning of Large Language Models (LLMs) and Vision-Language Models (VLMs) across various datasets. The code is available at https://github.com/chunmeifeng/CPT.
Comment: 10 pages, 2 figures plus supplementary materials
Published: 2024
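
The test-time logit adjustment that the abstract of record 5 attributes to Proxy-tuning can be illustrated with a small sketch: the black-box logits are shifted by the difference between a tuned and an untuned small proxy model. The function names and the toy logits are hypothetical, and the sketch covers only this decoding-time step, not CPT's training objective.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def proxy_tuned_logits(black_box, small_tuned, small_untuned):
    """Shift black-box logits by the change that tuning caused in the proxy."""
    return [b + (t - u) for b, t, u in zip(black_box, small_tuned, small_untuned)]


# Toy logits over three classes (made-up numbers for illustration).
black_box = [1.0, 2.0, 0.5]
small_tuned = [0.0, 1.0, 3.0]
small_untuned = [0.0, 1.0, 1.0]

adjusted = proxy_tuned_logits(black_box, small_tuned, small_untuned)
probs = softmax(adjusted)
print(adjusted)
print(probs.index(max(probs)))
```

Only output logits are needed from the large model, which is why the abstract calls the approach applicable to models whose parameters are inaccessible.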

6. Clinical Domain Knowledge-Derived Template Improves Post Hoc AI Explanations in Pneumothorax Classification

Author: Yuan, Han, Hong, Chuan, Jiang, Pengtao, Zhao, Gangming, Tran, Nguyen Tuan Anh, Xu, Xinxing, Yan, Yet Yen, and Liu, Nan
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Background: Pneumothorax is an acute thoracic disease caused by abnormal air collection between the lungs and chest wall. To address the opaqueness often associated with deep learning (DL) models, explainable artificial intelligence (XAI) methods have been introduced to outline regions related to pneumothorax diagnoses made by DL models. However, these explanations sometimes diverge from actual lesion areas, highlighting the need for further improvement. Method: We propose a template-guided approach to incorporate the clinical knowledge of pneumothorax into model explanations generated by XAI methods, thereby enhancing the quality of these explanations. Utilizing one lesion delineation created by radiologists, our approach first generates a template that represents potential areas of pneumothorax occurrence. This template is then superimposed on model explanations to filter out extraneous explanations that fall outside the template's boundaries. To validate its efficacy, we carried out a comparative analysis of three XAI methods with and without our template guidance when explaining two DL models in two real-world datasets. Results: The proposed approach consistently improved baseline XAI methods across twelve benchmark scenarios built on three XAI methods, two DL models, and two datasets. The average incremental percentages, calculated by the performance improvements over the baseline performance, were 97.8% in Intersection over Union (IoU) and 94.1% in Dice Similarity Coefficient (DSC) when comparing model explanations and ground-truth lesion areas. Conclusions: In the context of pneumothorax diagnoses, we proposed a template-guided approach for improving AI explanations. We anticipate that our template guidance will forge a fresh approach to elucidating AI models by integrating clinical domain expertise.
Published: 2024
Full Text: View/download PDF
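
The template filtering and the two overlap metrics named in the abstract of record 6 can be sketched on tiny binary masks. This is an illustrative example only: the masks are made up, the function names are hypothetical, and how the template is generated from the radiologist's delineation is not covered.

```python
def apply_template(explanation, template):
    """Keep only explanation pixels that fall inside the template."""
    return [[e & t for e, t in zip(e_row, t_row)]
            for e_row, t_row in zip(explanation, template)]


def _flatten(mask):
    return [v for row in mask for v in row]


def iou(pred, truth):
    """Intersection over Union of two binary masks."""
    p, t = _flatten(pred), _flatten(truth)
    inter = sum(a & b for a, b in zip(p, t))
    union = sum(a | b for a, b in zip(p, t))
    return inter / union if union else 1.0


def dice(pred, truth):
    """Dice Similarity Coefficient of two binary masks."""
    p, t = _flatten(pred), _flatten(truth)
    inter = sum(a & b for a, b in zip(p, t))
    size = sum(p) + sum(t)
    return 2 * inter / size if size else 1.0


explanation = [[1, 1, 0], [0, 1, 1]]  # raw XAI output (toy values)
template = [[1, 1, 1], [0, 0, 0]]     # plausible pneumothorax region (toy values)
truth = [[1, 1, 0], [0, 0, 0]]        # ground-truth lesion (toy values)

filtered = apply_template(explanation, template)
print(iou(explanation, truth), iou(filtered, truth))
print(dice(explanation, truth), dice(filtered, truth))
```

In this toy case the two explanation pixels outside the template are removed, so both metrics rise after filtering.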

7. RLPeri: Accelerating Visual Perimetry Test with Reinforcement Learning and Convolutional Feature Extraction

Author: Verma, Tanvi, Dinh, Linh Le, Tan, Nicholas, Xu, Xinxing, Cheng, Chingyu, and Liu, Yong
Subjects: Computer Science - Artificial Intelligence
Abstract: Visual perimetry is an important eye examination that helps detect vision problems caused by ocular or neurological conditions. During the test, a patient's gaze is fixed at a specific location while light stimuli of varying intensities are presented in central and peripheral vision. Based on the patient's responses to the stimuli, the visual field mapping and sensitivity are determined. However, maintaining high levels of concentration throughout the test can be challenging for patients, leading to increased examination times and decreased accuracy. In this work, we present RLPeri, a reinforcement learning-based approach to optimize visual perimetry testing. By determining the optimal sequence of locations and initial stimulus values, we aim to reduce the examination time without compromising accuracy. Additionally, we incorporate reward shaping techniques to further improve the testing performance. To monitor the patient's responses over time during testing, we represent the test's state as a pair of 3D matrices. We apply two different convolutional kernels to extract spatial features across locations as well as features across different stimulus values for each location. Through experiments, we demonstrate that our approach results in a 10-20% reduction in examination time while maintaining the accuracy as compared to state-of-the-art methods. With the presented approach, we aim to make visual perimetry testing more efficient and patient-friendly, while still providing accurate results.
Comment: Published at AAAI-24
Published: 2024

8. Learning Prompt with Distribution-Based Feature Replay for Few-Shot Class-Incremental Learning

Author: Huang, Zitong, Chen, Ze, Chen, Zhixing, Zhou, Erjin, Xu, Xinxing, Goh, Rick Siow Mong, Liu, Yong, Zuo, Wangmeng, and Feng, Chunmei
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Few-shot Class-Incremental Learning (FSCIL) aims to continuously learn new classes based on very limited training data without forgetting the old ones encountered. Existing studies rely solely on pure visual networks, while in this paper we solve FSCIL by leveraging a Vision-Language model (e.g., CLIP) and propose a simple yet effective framework, named Learning Prompt with Distribution-based Feature Replay (LP-DiF). We observe that simply using CLIP for zero-shot evaluation can substantially outperform the most influential methods. A prompt tuning technique is then applied to further improve its adaptation ability, allowing the model to continually capture specific knowledge from each session. To prevent the learnable prompt from forgetting old knowledge in the new session, we propose a pseudo-feature replay approach. Specifically, we preserve the old knowledge of each class by maintaining a feature-level Gaussian distribution with a diagonal covariance matrix, which is estimated by the image features of training images and synthesized features generated from a VAE. When progressing to a new session, pseudo-features are sampled from old-class distributions combined with training images of the current session to optimize the prompt, thus enabling the model to learn new knowledge while retaining old knowledge. Experiments on three prevalent benchmarks, i.e., CIFAR100, mini-ImageNet, CUB-200, and two more challenging benchmarks, i.e., SUN-397 and CUB-200$^*$ proposed in this paper, showcase the superiority of LP-DiF, achieving new state-of-the-art (SOTA) in FSCIL. Code is publicly available at https://github.com/1170300714/LP-DiF.
Published: 2024
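
The pseudo-feature replay idea in the abstract of record 8 (a per-class Gaussian with diagonal covariance, later sampled to stand in for old-class features) can be sketched as follows. This is a simplified illustration under stated assumptions: the features are toy lists rather than CLIP image features, the VAE-synthesized features mentioned in the abstract are omitted, and the function names are hypothetical.

```python
import math
import random


def fit_diag_gaussian(features):
    """Estimate a per-dimension mean and variance from one class's features."""
    n = len(features)
    dim = len(features[0])
    mean = [sum(f[i] for f in features) / n for i in range(dim)]
    var = [sum((f[i] - mean[i]) ** 2 for f in features) / n for i in range(dim)]
    return mean, var


def sample_pseudo_features(mean, var, count, rng):
    """Draw pseudo-features, treating each dimension as independent."""
    return [
        [rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, var)]
        for _ in range(count)
    ]


# Toy features of one old class (made-up numbers for illustration).
old_class_features = [[1.0, 2.0], [3.0, 4.0]]
mean, var = fit_diag_gaussian(old_class_features)

rng = random.Random(0)  # fixed seed for repeatability
replayed = sample_pseudo_features(mean, var, count=5, rng=rng)
print(mean, var, len(replayed))
```

Storing only a mean and a variance vector per class keeps memory small compared with keeping the old training images themselves.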

9. VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering

Author: Feng, Chun-Mei, Bai, Yang, Luo, Tao, Li, Zhen, Khan, Salman, Zuo, Wangmeng, Xu, Xinxing, Goh, Rick Siow Mong, and Liu, Yong
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Although progress has been made in Composed Image Retrieval (CIR), we empirically find that a certain percentage of failed retrieval results are not consistent with their relative captions. To address this issue, this work provides a Visual Question Answering (VQA) perspective to boost the performance of CIR. The resulting VQA4CIR is a post-processing approach and can be directly plugged into existing CIR methods. Given the top-C retrieved images by a CIR method, VQA4CIR aims to decrease the adverse effect of the failed retrieval results being inconsistent with the relative caption. To find the retrieved images inconsistent with the relative caption, we resort to the "QA generation to VQA" self-verification pipeline. For QA generation, we suggest fine-tuning an LLM (e.g., LLaMA) to generate several pairs of questions and answers from each relative caption. We then fine-tune an LVLM (e.g., LLaVA) to obtain the VQA model. By feeding the retrieved image and question to the VQA model, one can find the images inconsistent with the relative caption when the answer by VQA is inconsistent with the answer in the QA pair. Consequently, the CIR performance can be boosted by modifying the ranks of inconsistently retrieved images. Experimental results show that our proposed method outperforms state-of-the-art CIR methods on the CIRR and Fashion-IQ datasets.
Published: 2023

10. Partially Supervised Unpaired Multi-modal Learning for Label-Efficient Medical Image Segmentation

Author: Zhu, Lei, Xu, Yanyu, Fu, Huazhu, Xu, Xinxing, Goh, Rick Siow Mong, and Liu, Yong
Editor: Xu, Xuanang, Cui, Zhiming, Rekik, Islem, Ouyang, Xi, and Sun, Kaicong
Series Editors: Goos, Gerhard (Series Editor), Hartmanis, Juris (Founding Editor), Bertino, Elisa, Gao, Wen, Steffen, Bernhard, and Yung, Moti (Editorial Board Members)
Published: 2025
Full Text: View/download PDF

11. Enhancing Community Vision Screening: AI-Driven Retinal Photography for Early Disease Detection and Patient Trust

Author: Lei, Xiaofeng, Tham, Yih-Chung, Goh, Jocelyn Hui Lin, Feng, Yangqin, Bai, Yang, Soh, Zhi Da, Goh, Rick Siow Mong, Xu, Xinxing, Liu, Yong, and Cheng, Ching-Yu
Editor: Bhavna, Antony, Chen, Hao, Fang, Huihui, Fu, Huazhu, and Lee, Cecilia S.
Series Editors: Goos, Gerhard (Series Editor), Hartmanis, Juris (Founding Editor), Bertino, Elisa, Gao, Wen, Steffen, Bernhard, and Yung, Moti (Editorial Board Members)
Published: 2025
Full Text: View/download PDF

12. A Surrogate-Assisted Extended Generative Adversarial Network for Parameter Optimization in Free-Form Metasurface Design

Author: Dai, Manna, Jiang, Yang, Yang, Feng, Chattoraj, Joyjit, Xia, Yingzhi, Xu, Xinxing, Zhao, Weijiang, Dao, My Ha, and Liu, Yong
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing, Physics - Optics
Abstract: Metasurfaces have widespread applications in fifth-generation (5G) microwave communication. Among the metasurface family, free-form metasurfaces excel in achieving intricate spectral responses compared to regular-shape counterparts. However, conventional numerical methods for free-form metasurfaces are time-consuming and demand specialized expertise. Alternatively, recent studies demonstrate that deep learning has great potential to accelerate and refine metasurface designs. Here, we present XGAN, an extended generative adversarial network (GAN) with a surrogate for high-quality free-form metasurface designs. The proposed surrogate provides a physical constraint to XGAN so that XGAN can accurately generate metasurfaces monolithically from input spectral responses. In comparative experiments involving 20000 free-form metasurface designs, XGAN achieves 0.9734 average accuracy and is 500 times faster than the conventional methodology. This method facilitates the metasurface library building for specific spectral responses and can be extended to various inverse design problems, including optical metamaterials, nanophotonic devices, and drug discovery.
Published: 2023

13. Sentence-level Prompts Benefit Composed Image Retrieval

Author: Bai, Yang, Xu, Xinxing, Liu, Yong, Khan, Salman, Khan, Fahad, Zuo, Wangmeng, Goh, Rick Siow Mong, and Feng, Chun-Mei
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Composed image retrieval (CIR) is the task of retrieving specific images by using a query that involves both a reference image and a relative caption. Most existing CIR models adopt the late-fusion strategy to combine visual and language features. Besides, several approaches have also been suggested to generate a pseudo-word token from the reference image, which is further integrated into the relative caption for CIR. However, these pseudo-word-based prompting methods have limitations when the target image involves complex changes to the reference image, e.g., object removal and attribute modification. In this work, we demonstrate that learning an appropriate sentence-level prompt for the relative caption (SPRC) is sufficient for achieving effective composed image retrieval. Instead of relying on pseudo-word-based prompts, we propose to leverage pretrained V-L models, e.g., BLIP-2, to generate sentence-level prompts. By concatenating the learned sentence-level prompt with the relative caption, one can readily use existing text-based image retrieval models to enhance CIR performance. Furthermore, we introduce both image-text contrastive loss and text prompt alignment loss to enforce the learning of suitable sentence-level prompts. Experiments show that our proposed method performs favorably against the state-of-the-art CIR methods on the Fashion-IQ and CIRR datasets. The source code and pretrained model are publicly available at https://github.com/chunmeifeng/SPRC
Published: 2023

14. DS-Depth: Dynamic and Static Depth Estimation via a Fusion Cost Volume

Author: Miao, Xingyu, Bai, Yang, Duan, Haoran, Huang, Yawen, Wan, Fan, Xu, Xinxing, Long, Yang, and Zheng, Yefeng
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Self-supervised monocular depth estimation methods typically rely on the reprojection error to capture geometric relationships between successive frames in static environments. However, this assumption does not hold for dynamic objects in a scene, leading to errors during the view synthesis stage, such as feature mismatch and occlusion, which can significantly reduce the accuracy of the generated depth maps. To address this problem, we propose a novel dynamic cost volume that exploits residual optical flow to describe moving objects, improving incorrectly occluded regions in static cost volumes used in previous work. Nevertheless, the dynamic cost volume inevitably generates extra occlusions and noise, so we alleviate this by designing a fusion module that makes static and dynamic cost volumes compensate for each other. In other words, occlusion from the static volume is refined by the dynamic volume, and incorrect information from the dynamic volume is eliminated by the static volume. Furthermore, we propose a pyramid distillation loss to reduce photometric error inaccuracy at low resolutions and an adaptive photometric error loss to alleviate the flow direction of the large gradient in the occlusion regions. We conducted extensive experiments on the KITTI and Cityscapes datasets, and the results demonstrate that our model outperforms previously published baselines for self-supervised monocular depth estimation.
Published: 2023
Full Text: View/download PDF

15. Towards Instance-adaptive Inference for Federated Learning

Author: Feng, Chun-Mei, Yu, Kai, Liu, Nian, Xu, Xinxing, Khan, Salman, and Zuo, Wangmeng
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating local training. However, the performance of the global model is often hampered by non-i.i.d. distribution among the clients, requiring extensive efforts to mitigate inter-client data heterogeneity. Going beyond inter-client data heterogeneity, we note that intra-client heterogeneity can also be observed on complex real-world data and seriously deteriorate FL performance. In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework. Instead of huge instance-adaptive models, we resort to a parameter-efficient fine-tuning method, i.e., scale and shift deep features (SSF), upon a pre-trained model. Specifically, we first train an SSF pool for each client, and aggregate these SSF pools on the server side, thus still maintaining a low communication cost. To enable instance-adaptive inference, for a given instance, we dynamically find the best-matched SSF subsets from the pool and aggregate them to generate an adaptive SSF specified for the instance, thereby reducing the intra-client as well as the inter-client heterogeneity. Extensive experiments show that our FedIns outperforms state-of-the-art FL algorithms, e.g., a 6.64% improvement against the top-performing method with less than 15% communication cost on Tiny-ImageNet. Our code and models will be publicly released.
Comment: Proceedings of the IEEE/CVF International Conference on Computer Vision 2023
Published: 2023

16. Optical coherence tomography choroidal enhancement using generative deep learning

Author: Bellemo, Valentina, Kumar Das, Ankit, Sreng, Syna, Chua, Jacqueline, Wong, Damon, Shah, Janika, Jonas, Rahul, Tan, Bingyao, Liu, Xinyu, Xu, Xinxing, Tan, Gavin Siew Wei, Agrawal, Rupesh, Ting, Daniel Shu Wei, Yong, Liu, and Schmetterer, Leopold
Published: 2024
Full Text: View/download PDF

17. Uncertainty-inspired Open Set Learning for Retinal Anomaly Identification

Author: Wang, Meng, Lin, Tian, Wang, Lianyu, Lin, Aidi, Zou, Ke, Xu, Xinxing, Zhou, Yi, Peng, Yuanyuan, Meng, Qingquan, Qian, Yiming, Deng, Guoyao, Wu, Zhiqun, Chen, Junhong, Lin, Jianhong, Zhang, Mingzhi, Zhu, Weifang, Zhang, Changqing, Zhang, Daoqiang, Goh, Rick Siow Mong, Liu, Yong, Pang, Chi Pui, Chen, Xinjian, Chen, Haoyu, and Fu, Huazhu
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: Failure to recognize samples from the classes unseen during training is a major limitation of artificial intelligence in the real-world implementation for recognition and classification of retinal anomalies. We established an uncertainty-inspired open-set (UIOS) model, which was trained with fundus images of 9 retinal conditions. Besides assessing the probability of each category, UIOS also calculated an uncertainty score to express its confidence. Our UIOS model with thresholding strategy achieved an F1 score of 99.55%, 97.01% and 91.91% for the internal testing set, external target categories (TC)-JSIEC dataset and TC-unseen testing set, respectively, compared to the F1 score of 92.20%, 80.69% and 64.74% by the standard AI model. Furthermore, UIOS correctly predicted high uncertainty scores, which would prompt the need for a manual check in the datasets of non-target categories retinal diseases, low-quality fundus images, and non-fundus images. UIOS provides a robust method for real-world screening of retinal anomalies.
Published: 2023

18. Learning Federated Visual Prompt in Null Space for MRI Reconstruction

Author: Feng, Chun-Mei, Li, Bangjun, Xu, Xinxing, Liu, Yong, Fu, Huazhu, and Zuo, Wangmeng
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Federated Magnetic Resonance Imaging (MRI) reconstruction enables multiple hospitals to collaborate in a distributed manner without aggregating local data, thereby protecting patient privacy. However, the data heterogeneity caused by different MRI protocols, insufficient local training data, and limited communication bandwidth inevitably impair global model convergence and updating. In this paper, we propose a new algorithm, FedPR, to learn federated visual prompts in the null space of the global prompt for MRI reconstruction. FedPR is a new federated paradigm that adopts a powerful pre-trained model while only learning and communicating the prompts with few learnable parameters, thereby significantly reducing communication costs and achieving competitive performance on limited local data. Moreover, to deal with catastrophic forgetting caused by data heterogeneity, FedPR also updates efficient federated visual prompts that project the local prompts into an approximate null space of the global prompt, thereby suppressing the interference of gradients on the server performance. Extensive experiments on federated MRI show that FedPR significantly outperforms state-of-the-art FL algorithms with <6% of communication costs when given the limited amount of local training data.
Comment: 8 pages, Proceedings of the IEEE/CVF International Conference on Computer Vision
Published: 2023

19. Federated Uncertainty-Aware Aggregation for Fundus Diabetic Retinopathy Staging

Author: Wang, Meng, Wang, Lianyu, Xu, Xinxing, Zou, Ke, Qian, Yiming, Goh, Rick Siow Mong, Liu, Yong, and Fu, Huazhu
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Deep learning models have shown promising performance in the field of diabetic retinopathy (DR) staging. However, collaboratively training a DR staging model across multiple institutions remains a challenge due to non-iid data, client reliability, and confidence evaluation of the prediction. To address these issues, we propose a novel federated uncertainty-aware aggregation paradigm (FedUAA), which considers the reliability of each client and produces a confidence estimation for the DR staging. In our FedUAA, an aggregated encoder is shared by all clients for learning a global representation of fundus images, while a novel temperature-warmed uncertainty head (TWEU) is utilized for each client for local personalized staging criteria. Our TWEU employs an evidential deep layer to produce the uncertainty score with the DR staging results for client reliability evaluation. Furthermore, we developed a novel uncertainty-aware weighting module (UAW) to dynamically adjust the weights of model aggregation based on the uncertainty score distribution of each client. In our experiments, we collect five publicly available datasets from different institutions to construct a dataset for federated DR staging that satisfies the real non-iid condition. The experimental results demonstrate that our FedUAA achieves better DR staging performance with higher reliability compared to other federated learning methods. Our proposed FedUAA paradigm effectively addresses the challenges of collaboratively training DR staging models across multiple institutions, and provides a robust and reliable solution for the deployment of DR diagnosis models in real-world clinical scenarios.
Published: 2023
Full Text: View/download PDF

20. Medical Phrase Grounding with Region-Phrase Context Contrastive Alignment

Author: Chen, Zhihao, Zhou, Yang, Tran, Anh, Zhao, Junting, Wan, Liang, Ooi, Gideon, Cheng, Lionel, Thng, Choon Hua, Xu, Xinxing, Liu, Yong, and Fu, Huazhu
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Medical phrase grounding (MPG) aims to locate the most relevant region in a medical image, given a phrase query describing certain medical findings, which is an important task for medical image analysis and radiological diagnosis. However, existing visual grounding methods rely on general visual features for identifying objects in natural images and are not capable of capturing the subtle and specialized features of medical findings, leading to sub-optimal performance in MPG. In this paper, we propose MedRPG, an end-to-end approach for MPG. MedRPG is built on a lightweight vision-language transformer encoder and directly predicts the box coordinates of mentioned medical findings, which can be trained with limited medical data, making it a valuable tool in medical image analysis. To enable MedRPG to locate nuanced medical findings with better region-phrase correspondences, we further propose Tri-attention Context contrastive alignment (TaCo). TaCo seeks context alignment to pull both the features and attention outputs of relevant region-phrase pairs close together while pushing those of irrelevant regions far away. This ensures that the final box prediction depends more on its finding-specific regions and phrases. Experimental results on three MPG datasets demonstrate that our MedRPG outperforms state-of-the-art visual grounding approaches by a large margin. Additionally, the proposed TaCo strategy is effective in enhancing finding localization ability and reducing spurious region-phrase correlations.
Published: 2023

21. Extended R-matrix description of two-proton radioactivity

Author: Zhang, Zhaozhan, Yuan, Cenxi, Qi, Chong, Cai, Boshuai, and Xu, Xinxing
Subjects: Nuclear Theory
Abstract: Two-proton ($2p$) radioactivity provides fundamental knowledge on the three-body decay mechanism and the residual nuclear interaction. In this work, we propose decay width formulae in the extended R-matrix framework for different decay mechanisms, including sequential $2p$ decay, diproton decay, tri-body decay, and sequential two-diproton decay. The diproton and tri-body formulae, combined with information on the two-nucleon transfer amplitude and Wigner single-particle reduced width, can reproduce well experimental $2p$ radioactivity half-lives. For the case of $^{67}$Kr, theoretical predictions for direct $2p$ decay give much larger half-lives than the recent measurement from RIKEN. A combination of direct and sequential $2p$ emission is analyzed by considering a small negative one-proton separation energy and a possible enhanced contribution from the $p$-wave component. The present method predicts that $^{71}$Sr and $^{74}$Zr may be the most promising candidates for future study on $2p$ radioactivity. Our model gives an upper limit of 55(4) keV for the decay width of $4p$ emission in the recently found four-proton resonant nuclide, $^{18}$Mg, which agrees with the observed width of 115(100) keV.
Comment: version accepted for publication in Physics Letters B
Published: 2023
Full Text: View/download PDF

22. Multi-Scale Region-Aware Implicit Neural Network for Medical Images Matting

Author: Xu, Yanyu, Xia, Yingzhi, Fu, Huazhu, Goh, Rick Siow Mong, Liu, Yong, Xu, Xinxing, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Linguraru, Marius George, editor, Dou, Qi, editor, Feragen, Aasa, editor, Giannarou, Stamatia, editor, Glocker, Ben, editor, Lekadir, Karim, editor, and Schnabel, Julia A., editor
Published: 2024
Full Text: View/download PDF

23. Localizing Anatomical Landmarks in Ocular Images using Zoom-In Attentive Networks

Author: Lei, Xiaofeng, Li, Shaohua, Xu, Xinxing, Fu, Huazhu, Liu, Yong, Tham, Yih-Chung, Feng, Yangqin, Tan, Mingrui, Xu, Yanyu, Goh, Jocelyn Hui Lin, Goh, Rick Siow Mong, and Cheng, Ching-Yu
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Localizing anatomical landmarks is an important task in medical image analysis. However, the landmarks to be localized often lack prominent visual features. Their locations are elusive and easily confused with the background, and thus precise localization highly depends on the context formed by their surrounding areas. In addition, the required precision is usually higher than in segmentation and object detection tasks. Therefore, localization has its unique challenges different from segmentation or detection. In this paper, we propose a zoom-in attentive network (ZIAN) for anatomical landmark localization in ocular images. First, a coarse-to-fine, or "zoom-in" strategy is utilized to learn the contextualized features in different scales. Then, an attentive fusion module is adopted to aggregate multi-scale features, which consists of 1) a co-attention network with a multiple regions-of-interest (ROIs) scheme that learns complementary features from the multiple ROIs, 2) an attention-based fusion module which integrates the multi-ROIs features and non-ROI features. We evaluated ZIAN on two open challenge tasks, i.e., the fovea localization in fundus images and scleral spur localization in AS-OCT images. Experiments show that ZIAN achieves promising performances and outperforms state-of-the-art localization methods. The source code and trained models of ZIAN are available at https://github.com/leixiaofeng-astar/OMIA9-ZIAN.
Published: 2022

24. The preparation, characterization and gastroprotective activity of fermented oyster hydrolysate

Author: Liu, Li, Liu, Xue, Yang, Xinyi, Xu, Xinxing, and Zeng, Mingyong
Published: 2024
Full Text: View/download PDF

25. CRAFT: Cross-Attentional Flow Transformer for Robust Optical Flow

Author: Sui, Xiuchao, Li, Shaohua, Geng, Xue, Wu, Yan, Xu, Xinxing, Liu, Yong, Goh, Rick, and Zhu, Hongyuan
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Optical flow estimation aims to find the 2D motion field by identifying corresponding pixels between two images. Despite the tremendous progress of deep learning-based optical flow methods, it remains a challenge to accurately estimate large displacements with motion blur. This is mainly because the correlation volume, the basis of pixel matching, is computed as the dot product of the convolutional features of the two images. The locality of convolutional features makes the computed correlations susceptible to various noises. On large displacements with motion blur, noisy correlations could cause severe errors in the estimated flow. To overcome this challenge, we propose a new architecture "CRoss-Attentional Flow Transformer" (CRAFT), aiming to revitalize the correlation volume computation. In CRAFT, a Semantic Smoothing Transformer layer transforms the features of one frame, making them more global and semantically stable. In addition, the dot-product correlations are replaced with transformer Cross-Frame Attention. This layer filters out feature noises through the Query and Key projections, and computes more accurate correlations. On Sintel (Final) and KITTI (foreground) benchmarks, CRAFT has achieved new state-of-the-art performance. Moreover, to test the robustness of different models on large motions, we designed an image shifting attack that shifts input images to generate large artificial motions. Under this attack, CRAFT performs much more robustly than two representative methods, RAFT and GMA. The code of CRAFT is available at https://github.com/askerlee/craft.
Comment: CVPR 2022 camera ready
Published: 2022

26. A review of phycocyanin: Production, extraction, stability and food applications

Author: Mao, Mengxia, Han, Guixin, Zhao, Yilin, Xu, Xinxing, and Zhao, Yuanhui
Published: 2024
Full Text: View/download PDF

27. Anti-solvent precipitation for encapsulation of oyster protein hydrolysate nanoparticles: Effect on off-flavor elimination and bioaccessibility

Author: Liu, Li, Zhang, Weijia, Zhao, Yuanhui, and Xu, Xinxing
Published: 2024
Full Text: View/download PDF

28. Preparation and characterization of kelp polysaccharide and its research on anti-influenza A virus activity

Author: Pi, Tianxiang, Sun, Lishan, Li, Wei, Wang, Wei, Dong, Minghui, Xu, Xinxing, Xu, He, and Zhao, Yuanhui
Published: 2024
Full Text: View/download PDF

29. Comparative Analysis of Vision Transformers and Conventional Convolutional Neural Networks in Detecting Referable Diabetic Retinopathy

Author: Goh, Jocelyn Hui Lin, Ang, Elroy, Srinivasan, Sahana, Lei, Xiaofeng, Loh, Johnathan, Quek, Ten Cheer, Xue, Cancan, Xu, Xinxing, Liu, Yong, Cheng, Ching-Yu, Rajapakse, Jagath C., and Tham, Yih-Chung
Published: 2024
Full Text: View/download PDF

30. Multi-omics combined approach to analyze the mechanism of flavor evolution in sturgeon caviar (Acipenser gueldenstaedtii) during refrigeration storage

Author: Liu, Li, Liu, Yihuan, Bai, Fan, Wang, Jinlin, Xu, He, Jiang, Xiaoming, Lu, Shixue, Wu, Jihong, Zhao, Yuanhui, and Xu, Xinxing
Published: 2024
Full Text: View/download PDF

31. Mechanism of low-voltage electrostatic field on flavor retention in refrigerated sturgeon caviar: Insights from phospholipids

Author: Jiang, Xinyu, Liu, Yihuan, Liu, Li, Bai, Fan, Wang, Jinlin, Xu, He, Dong, Shiyuan, Jiang, Xiaoming, Wu, Jihong, Zhao, Yuanhui, and Xu, Xinxing
Published: 2024
Full Text: View/download PDF

32. Composition-dependent reversal of anomalous Hall effect in Co1-xPdx single layer

Author: Chen, Zehan, Liu, Lin, Luo, Weikai, Yang, Hui, Xu, Xinxing, and An, Hongyu
Published: 2024
Full Text: View/download PDF

33. Highly efficient selective extraction of Li from spent LiNixCoyMnzO2 assisted with activated pyrite in a subcritical water system

Author: Su, Fanyun, Zhou, Xiangyang, Liu, Xiaojian, Zhu, Yong, Tang, Jingjing, Chen, Yanxi, Liu, Guangli, Xu, Xinxing, Wang, Hui, and Yang, Juan
Published: 2024
Full Text: View/download PDF

34. Flavor formation mechanisms based on phospholipid fermentation simulation system in oyster juice co-fermented by Lactiplantibacillus plantarum and Saccharomyces cerevisiae

Author: Li, Ke, Han, Guixin, Liu, Li, Zhao, Yuanhui, Liu, Tianhong, Wang, Hongjiang, and Xu, Xinxing
Published: 2025
Full Text: View/download PDF

35. Hypoglycemic peptide preparation from Bacillus subtilis fermented with Pyropia: Identification, molecular docking, and in vivo confirmation

Author: Han, Guixin, Xu, Yuxian, Li, Jiayu, Li, Ke, Xu, Xinxing, Gao, Xin, Zhao, Yuanhui, Jiang, Hong, and Mao, Xiangzhao
Published: 2025
Full Text: View/download PDF

36. REFUGE2 Challenge: A Treasure Trove for Multi-Dimension Analysis and Evaluation in Glaucoma Screening

Author: Fang, Huihui, Li, Fei, Wu, Junde, Fu, Huazhu, Sun, Xu, Son, Jaemin, Yu, Shuang, Zhang, Menglu, Yuan, Chenglang, Bian, Cheng, Lei, Baiying, Zhao, Benjian, Xu, Xinxing, Li, Shaohua, Fumero, Francisco, Sigut, José, Almubarak, Haidar, Bazi, Yakoub, Guo, Yuanhao, Zhou, Yating, Baid, Ujjwal, Innani, Shubham, Guo, Tianjiao, Yang, Jie, Orlando, José Ignacio, Bogunović, Hrvoje, Zhang, Xiulan, and Xu, Yanwu
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: With the rapid development of artificial intelligence (AI) in medical image processing, deep learning in color fundus photography (CFP) analysis is also evolving. Although there are some open-source, labeled datasets of CFPs in the ophthalmology community, large-scale datasets for screening only have labels of disease categories, and datasets with annotations of fundus structures are usually small in size. In addition, labeling standards are not uniform across datasets, and there is no clear information on the acquisition device. Here we release a multi-annotation, multi-quality, and multi-device color fundus image dataset for glaucoma analysis on an original challenge -- Retinal Fundus Glaucoma Challenge 2nd Edition (REFUGE2). The REFUGE2 dataset contains 2000 color fundus images with annotations of glaucoma classification, optic disc/cup segmentation, as well as fovea localization. Meanwhile, the REFUGE2 challenge sets three sub-tasks of automatic glaucoma diagnosis and fundus structure analysis and provides an online evaluation framework. Based on the characteristics of multi-device and multi-quality data, some methods with strong generalizations are provided in the challenge to make the predictions more robust. This shows that REFUGE2 brings attention to the characteristics of real-world multi-domain data, bridging the gap between scientific research and clinical application.
Comment: 29 pages, 21 figures
Published: 2022

37. GAMMA Challenge: Glaucoma grAding from Multi-Modality imAges

Author: Wu, Junde, Fang, Huihui, Li, Fei, Fu, Huazhu, Lin, Fengbin, Li, Jiongcheng, Huang, Lexing, Yu, Qinji, Song, Sifan, Xu, Xinxing, Xu, Yanyu, Wang, Wensai, Wang, Lingxiao, Lu, Shuai, Li, Huiqi, Huang, Shihua, Lu, Zhichao, Ou, Chubin, Wei, Xifei, Liu, Bingyuan, Kobbi, Riadh, Tang, Xiaoying, Lin, Li, Zhou, Qiang, Hu, Qiang, Bogunovic, Hrvoje, Orlando, José Ignacio, Zhang, Xiulan, and Xu, Yanwu
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Color fundus photography and Optical Coherence Tomography (OCT) are the two most cost-effective tools for glaucoma screening. Both imaging modalities have prominent biomarkers that indicate suspected glaucoma. Clinically, it is often recommended to take both of the screenings for a more accurate and reliable diagnosis. However, although numerous algorithms are proposed based on fundus images or OCT volumes in computer-aided diagnosis, there are still few methods leveraging both of the modalities for the glaucoma assessment. Inspired by the success of Retinal Fundus Glaucoma Challenge (REFUGE) we held previously, we set up the Glaucoma grAding from Multi-Modality imAges (GAMMA) Challenge to encourage the development of fundus & OCT-based glaucoma grading. The primary task of the challenge is to grade glaucoma from both the 2D fundus images and 3D OCT scanning volumes. As part of GAMMA, we have publicly released a glaucoma annotated dataset with both 2D fundus color photography and 3D OCT volumes, which is the first multi-modality dataset for glaucoma grading. In addition, an evaluation framework is also established to evaluate the performance of the submitted methods. During the challenge, 1272 results were submitted, and finally, top-10 teams were selected to the final stage. We analyze their results and summarize their methods in the paper. Since all these teams submitted their source code in the challenge, a detailed ablation study is also conducted to verify the effectiveness of the particular modules proposed. We find many of the proposed techniques are practical for the clinical diagnosis of glaucoma. As the first in-depth study of fundus & OCT multi-modality glaucoma grading, we believe the GAMMA Challenge will be an essential starting point for future research.
Published: 2022

38. Inhibition effect of non-contact biocontrol bacteria and plant essential oil mixture on the generation of N-nitrosamines in deli meat during storage

Author: Li, Ke, Han, Guixin, Lu, Shixue, Xu, Xinxing, Dong, Hao, Wang, Haiyan, Luan, Fulei, Jiang, Xiaoming, Liu, Tianhong, and Zhao, Yuanhui
Published: 2024
Full Text: View/download PDF

39. Enhancement of flavor quality in oyster hydrolysate through fermentation with oyster-derived lactic acid bacteria

Author: Lu, Xiangzhi, Shi, Min, Liu, Li, Chen, Zefan, Xu, Xinxing, Feng, Guangxin, and Zeng, Mingyong
Published: 2024
Full Text: View/download PDF

40. A surrogate-assisted extended generative adversarial network for parameter optimization in free-form metasurface design

Author: Dai, Manna, Jiang, Yang, Yang, Feng, Chattoraj, Joyjit, Xia, Yingzhi, Xu, Xinxing, Zhao, Weijiang, Dao, My Ha, and Liu, Yong
Published: 2024
Full Text: View/download PDF

41. The influence mechanism of phospholipids structure and composition changes caused by oxidation on the formation of flavor substances in sturgeon caviar

Author: Zhang, Weijia, Jiang, Xinyu, Liu, Li, Zhao, Yuanhui, Bai, Fan, Wang, Jinlin, Gao, Ruichang, and Xu, Xinxing
Published: 2024
Full Text: View/download PDF

42. Identification and validation of core microbes for the formation of the characteristic flavor of fermented oysters (Crassostrea gigas)

Author: Liu, Li, Liu, Tianhong, Wang, Hongjiang, Zhao, Yuanhui, Xu, Xinxing, and Zeng, Mingyong
Published: 2024
Full Text: View/download PDF

43. Contribution of phospholipase B to the formation of characteristic flavor in steamed sturgeon meat

Author: Yang, Zhuyu, Liu, Yahui, Bai, Fan, Wang, Jinlin, Gao, Ruichang, Zhao, Yuanhui, and Xu, Xinxing
Published: 2024
Full Text: View/download PDF

44. Few-Shot Domain Adaptation with Polymorphic Transformers

Author: Li, Shaohua, Sui, Xiuchao, Fu, Jie, Fu, Huazhu, Luo, Xiangde, Feng, Yangqin, Xu, Xinxing, Liu, Yong, Ting, Daniel, and Goh, Rick Siow Mong
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Deep neural networks (DNNs) trained on one set of medical images often experience a severe performance drop on unseen test images, due to various domain discrepancies between the training images (source domain) and the test images (target domain), which raises a domain adaptation issue. In clinical settings, it is difficult to collect enough annotated target domain data in a short period. Few-shot domain adaptation, i.e., adapting a trained model with a handful of annotations, is highly practical and useful in this case. In this paper, we propose a Polymorphic Transformer (Polyformer), which can be incorporated into any DNN backbones for few-shot domain adaptation. Specifically, after the polyformer layer is inserted into a model trained on the source domain, it extracts a set of prototype embeddings, which can be viewed as a "basis" of the source-domain features. On the target domain, the polyformer layer adapts by only updating a projection layer which controls the interactions between image features and the prototype embeddings. All other model weights (except BatchNorm parameters) are frozen during adaptation. Thus, the chance of overfitting the annotations is greatly reduced, and the model can perform robustly on the target domain after being trained on a few annotated images. We demonstrate the effectiveness of Polyformer on two medical segmentation tasks (i.e., optic disc/cup segmentation, and polyp segmentation). The source code of Polyformer is released at https://github.com/askerlee/segtran.
Comment: MICCAI'2021 camera ready
Published: 2021

45. Federated benchmarking of medical artificial intelligence with MedPerf

Author: Karargyris, Alexandros, Umeton, Renato, Sheller, Micah J., Aristizabal, Alejandro, George, Johnu, Wuest, Anna, Pati, Sarthak, Kassem, Hasan, Zenk, Maximilian, Baid, Ujjwal, Narayana Moorthy, Prakash, Chowdhury, Alexander, Guo, Junyi, Nalawade, Sahil, Rosenthal, Jacob, Kanter, David, Xenochristou, Maria, Beutel, Daniel J., Chung, Verena, Bergquist, Timothy, Eddy, James, Abid, Abubakar, Tunstall, Lewis, Sanseviero, Omar, Dimitriadis, Dimitrios, Qian, Yiming, Xu, Xinxing, Liu, Yong, Goh, Rick Siow Mong, Bala, Srini, Bittorf, Victor, Puchala, Sreekar Reddy, Ricciuti, Biagio, Samineni, Soujanya, Sengupta, Eshna, Chaudhari, Akshay, Coleman, Cody, Desinghu, Bala, Diamos, Gregory, Dutta, Debo, Feddema, Diane, Fursin, Grigori, Huang, Xinyuan, Kashyap, Satyananda, Lane, Nicholas, Mallick, Indranil, Mascagni, Pietro, Mehta, Virendra, Moraes, Cassiano Ferro, Natarajan, Vivek, Nikolov, Nikola, Padoy, Nicolas, Pekhimenko, Gennady, Reddi, Vijay Janapa, Reina, G. Anthony, Ribalta, Pablo, Singh, Abhishek, Thiagarajan, Jayaraman J., Albrecht, Jacob, Wolf, Thomas, Miller, Geralyn, Fu, Huazhu, Shah, Prashant, Xu, Daguang, Yadav, Poonam, Talby, David, Awad, Mark M., Howard, Jeremy P., Rosenthal, Michael, Marchionni, Luigi, Loda, Massimo, Johnson, Jason M., Bakas, Spyridon, and Mattson, Peter
Published: 2023
Full Text: View/download PDF

46. Study on antimicrobial activity of sturgeon skin mucus polypeptides (Rational Design, Self-Assembly and Application)

Author: Yang, Beining, Li, Wei, Mao, Yuxuan, Zhao, Yuanhui, Xue, Yong, Xu, Xinxing, Zhao, Yilin, and Liu, Kang
Published: 2024
Full Text: View/download PDF

47. Interactions between phosvitin and aldehydes affect the release of flavor from Russian sturgeon caviar

Author: Zhang, Weijia, Liu, Li, Zhao, Yuanhui, Liu, Tianhong, Bai, Fan, Wang, Jinlin, Xu, He, Gao, Ruichang, Jiang, Xiaoming, and Xu, Xinxing
Published: 2024
Full Text: View/download PDF

48. Medical Image Segmentation Using Squeeze-and-Expansion Transformers

Author: Li, Shaohua, Sui, Xiuchao, Luo, Xiangde, Xu, Xinxing, Liu, Yong, and Goh, Rick
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Medical image segmentation is important for computer-aided diagnosis. Good segmentation demands the model to see the big picture and fine details simultaneously, i.e., to learn image features that incorporate large context while keeping high spatial resolutions. To approach this goal, the most widely used methods -- U-Net and variants, extract and fuse multi-scale features. However, the fused features still have small "effective receptive fields" with a focus on local image cues, limiting their performance. In this work, we propose Segtran, an alternative segmentation framework based on transformers, which have unlimited "effective receptive fields" even at high feature resolutions. The core of Segtran is a novel Squeeze-and-Expansion transformer: a squeezed attention block regularizes the self attention of transformers, and an expansion block learns diversified representations. Additionally, we propose a new positional encoding scheme for transformers, imposing a continuity inductive bias for images. Experiments were performed on 2D and 3D medical image segmentation tasks: optic disc/cup segmentation in fundus images (REFUGE'20 challenge), polyp segmentation in colonoscopy images, and brain tumor segmentation in MRI scans (BraTS'19 challenge). Compared with representative existing methods, Segtran consistently achieved the highest segmentation accuracy, and exhibited good cross-domain generalization capabilities. The source code of Segtran is released at https://github.com/askerlee/segtran.
Comment: Camera ready for IJCAI'2021
Published: 2021

49. Minimal-Supervised Medical Image Segmentation via Vector Quantization Memory

Author: Xu, Yanyu, Zhou, Menghan, Feng, Yangqin, Xu, Xinxing, Fu, Huazhu, Goh, Rick Siow Mong, Liu, Yong, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Greenspan, Hayit, editor, Madabhushi, Anant, editor, Mousavi, Parvin, editor, Salcudean, Septimiu, editor, Duncan, James, editor, Syeda-Mahmood, Tanveer, editor, and Taylor, Russell, editor
Published: 2023
Full Text: View/download PDF

50. Category-Independent Visual Explanation for Medical Deep Network Understanding

Author: Qian, Yiming, Li, Liangzhi, Fu, Huazhu, Wang, Meng, Peng, Qingsheng, Tham, Yih Chung, Cheng, Chingyu, Liu, Yong, Goh, Rick Siow Mong, Xu, Xinxing, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Greenspan, Hayit, editor, Madabhushi, Anant, editor, Mousavi, Parvin, editor, Salcudean, Septimiu, editor, Duncan, James, editor, Syeda-Mahmood, Tanveer, editor, and Taylor, Russell, editor
Published: 2023
Full Text: View/download PDF
