
Simulating doctors' thinking logic for chest X-ray report generation via Transformer-based Semantic Query learning.

Authors :
Gao D
Kong M
Zhao Y
Huang J
Huang Z
Kuang K
Wu F
Zhu Q
Source :
Medical image analysis [Med Image Anal] 2024 Jan; Vol. 91, pp. 102982. Date of Electronic Publication: 2023 Sep 29.
Publication Year :
2024

Abstract

Medical report generation can be treated as a process in which doctors observe, understand, and describe images from different perspectives. Following this process, this paper proposes a Transformer-based Semantic Query learning paradigm (TranSQ). Briefly, the paradigm learns a set of intention embeddings, uses them to make semantic queries to the visual features, generates intent-compliant sentence candidates, and assembles them into a coherent report. A bipartite matching mechanism is applied during training to realize a dynamic correspondence between the intention embeddings and the sentences, so that medical concepts are induced into the observation intentions. Experimental results on two major radiology reporting datasets (i.e., IU X-ray and MIMIC-CXR) demonstrate that the model outperforms state-of-the-art models in generation effectiveness and clinical efficacy. In addition, comprehensive ablation experiments validate the TranSQ model's innovation and interpretability. The code is available at https://github.com/zjukongming/TranSQ.

Competing Interests: Declaration of competing interest. We declare that we have no financial and personal relationships with other people or organizations that can inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service, and/or company that could be construed as influencing the position presented in or the review of the manuscript.

(Copyright © 2023 Elsevier B.V. All rights reserved.)
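
The bipartite matching step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it only shows the general idea of assigning each ground-truth sentence to a distinct query output so that the total matching cost is minimal. The function names, the cosine-distance cost, and the toy vectors are all assumptions made for illustration, and the exhaustive search is only practical for very small inputs.

```python
from itertools import permutations
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors (illustrative cost, an assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def match_queries_to_sentences(query_outputs, sentence_targets):
    """Return (query_index, sentence_index) pairs with minimal total cost.

    Hypothetical helper: each target sentence is assigned to a different
    query output; surplus queries stay unmatched. Brute force over all
    assignments, so suitable only for tiny examples.
    """
    n_queries, n_sentences = len(query_outputs), len(sentence_targets)
    if n_queries < n_sentences:
        raise ValueError("need at least as many queries as sentences")

    # cost[q][s] is the cost of pairing query q with sentence s
    cost = [[cosine_distance(q, s) for s in sentence_targets]
            for q in query_outputs]

    best_total, best_assignment = math.inf, ()
    # perm[s] is the query assigned to sentence s
    for perm in permutations(range(n_queries), n_sentences):
        total = sum(cost[q][s] for s, q in enumerate(perm))
        if total < best_total:
            best_total, best_assignment = total, perm

    return sorted((q, s) for s, q in enumerate(best_assignment))


# Toy data: three query outputs, two target sentence embeddings.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sentences = [[0.0, 2.0], [3.0, 0.0]]
pairs = match_queries_to_sentences(queries, sentences)
print(pairs)  # [(0, 1), (1, 0)]
```

In this toy case query 0 is paired with sentence 1 and query 1 with sentence 0, while query 2 is left unmatched. A practical system would typically replace the exhaustive search with a polynomial-time assignment solver such as the Hungarian algorithm.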

Details

Language :
English
ISSN :
1361-8423
Volume :
91
Database :
MEDLINE
Journal :
Medical image analysis
Publication Type :
Academic Journal
Accession number :
37837692
Full Text :
https://doi.org/10.1016/j.media.2023.102982