TransDD: A transformer-based dual-path decoder for improving the performance of thoracic diseases classification using chest X-ray.

Authors :
Jiang, Xiaoben
Zhu, Yu
Liu, Yatong
Cai, Gan
Fang, Hao
Source :
Biomedical Signal Processing & Control; May2024, Vol. 91, pN.PAG-N.PAG, 1p
Publication Year :
2024

Abstract

• Introduce a learnable label embedding as queries to detect and match class-related features from the feature maps.
• Spatial reduction attention is employed to reduce the complexity of global self-attention.
• Dual-path attention is designed to establish a compact connection between visual features and their corresponding labels of thoracic diseases.
• A classification attention block is used to balance two classification scores based on feature output and label output.
• Achieve the highest average per-class AUC of 83.1% on the ChestX-ray14 dataset for multi-label classification of thoracic diseases, while yielding the highest ACC of 94.3% for multi-class classification.

Manually and accurately detecting thoracic diseases from chest X-ray (CXR) images is a time-consuming task that requires experienced radiologists. Automated thoracic disease classification is therefore of great significance. However, most existing methods rely solely on the feature maps extracted from CXR images to classify thoracic diseases, without effectively modeling the correlation between the local discriminative lesion features and their corresponding labels. To address this issue, we introduce a learnable label embedding as queries to detect and match class-related features from the feature maps; the queries and features are then processed by a novel Transformer-based dual-path decoder (TransDD) to facilitate interaction. The proposed TransDD comprises three key components: spatial reduction attention (SRA), dual-path attention (DPA), and a feature enhancement module (FEM). SRA is employed to reduce the complexity of self-attention, while DPA is specifically designed to capture the explicit correlation between the features and labels. Moreover, FEM is used to boost the expressiveness of local features. Subsequently, a classification attention block is used to balance two classification scores based on the feature output and label output, respectively.
The proposed TransDD-PVT attained state-of-the-art (SOTA) performance on the ChestX-ray14 dataset, achieving a mean area under the receiver operating characteristic curve (AUC) of 83.1% across all 14 classes. Our method also achieves 94.31% accuracy and 93.31% sensitivity on three-class classification. Extensive experiments conducted on several datasets demonstrate the ability of TransDD to improve the performance of thoracic disease classification. It can serve as a plug-and-play structure to improve the classification performance of both CNNs and recent Transformer-based backbones. [ABSTRACT FROM AUTHOR]
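
The idea of using label embeddings as queries over a spatially reduced feature map can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`spatial_reduce`, `label_query_attention`), the tensor shapes, and the use of average pooling for the spatial reduction are all assumptions made for illustration, and the learned projections, multi-head structure, dual-path attention, and feature enhancement module described in the abstract are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def spatial_reduce(feat, r):
    """Average-pool an (H, W, C) feature map by factor r.

    Assumption: average pooling stands in for the reduction step;
    the paper's SRA may use a different (learned) operator.
    Returns an array of shape ((H // r) * (W // r), C).
    """
    H, W, C = feat.shape
    feat = feat[: H - H % r, : W - W % r]  # crop so H and W divide by r
    pooled = feat.reshape(H // r, r, W // r, r, C).mean(axis=(1, 3))
    return pooled.reshape(-1, C)

def label_query_attention(label_emb, feat, r=2):
    """Cross-attention with label embeddings as queries.

    label_emb: (K, C) array, one (hypothetical) embedding per class.
    feat:      (H, W, C) feature map from a backbone.
    Returns per-label outputs (K, C) and attention weights (K, N).
    """
    kv = spatial_reduce(feat, r)  # keys and values, N = (H//r)*(W//r)
    scores = label_emb @ kv.T / np.sqrt(label_emb.shape[1])
    weights = softmax(scores, axis=-1)
    return weights @ kv, weights

rng = np.random.default_rng(0)
label_emb = rng.normal(size=(14, 32))  # 14 classes, as in ChestX-ray14
feat = rng.normal(size=(8, 8, 32))     # toy 8x8 feature map
out, attn = label_query_attention(label_emb, feat, r=2)
print(out.shape, attn.shape)
```

With a reduction factor of 2, each label query attends over 16 positions instead of 64, which is the sense in which spatial reduction lowers the cost of attention.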

Details

Language :
English
ISSN :
17468094
Volume :
91
Database :
Supplemental Index
Journal :
Biomedical Signal Processing & Control
Publication Type :
Academic Journal
Accession number :
176072266
Full Text :
https://doi.org/10.1016/j.bspc.2023.105937