Self-supervised multi-modal training from uncurated images and reports enables monitoring AI in radiology.

Authors :
Park S
Lee ES
Shin KS
Lee JE
Ye JC
Source :
Medical image analysis [Med Image Anal] 2024 Jan; Vol. 91, pp. 103021. Date of Electronic Publication: 2023 Nov 07.
Publication Year :
2024

Abstract

The escalating demand for artificial intelligence (AI) systems that can monitor and supervise human errors and abnormalities in healthcare presents unique challenges. Recent advances in vision-language models reveal the challenges of monitoring AI by understanding both visual and textual concepts and their semantic correspondences. However, there has been limited success in the application of vision-language models in the medical domain. Current vision-language models and learning strategies for photographic images and captions call for a web-scale data corpus of image and text pairs, which is not often feasible in the medical domain. To address this, we present a model named medical cross-attention vision-language model (Medical X-VL), which leverages key components tailored to the medical domain. The model is based on the following components: self-supervised unimodal models in the medical domain and a fusion encoder to bridge them, momentum distillation, sentence-wise contrastive learning for medical reports, and sentence similarity-adjusted hard negative mining. We experimentally demonstrated that our model enables various zero-shot tasks for monitoring AI, ranging from zero-shot classification to zero-shot error correction. Our model outperformed current state-of-the-art models on two medical image datasets, suggesting a novel clinical application of our monitoring AI model to alleviate human errors. Our method demonstrates a more specialized capacity for fine-grained understanding, which presents a distinct advantage particularly applicable to the medical domain.

Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

(Copyright © 2023 Elsevier B.V. All rights reserved.)
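
The abstract names sentence-wise contrastive learning as one component but does not give its loss function. As a rough illustration only, the sketch below shows a generic symmetric InfoNCE-style contrastive loss over matched image and sentence embeddings. It is not the authors' implementation: the function names `l2_normalize` and `info_nce`, the temperature of 0.07, and the use of plain Python lists as embeddings are all assumptions made for this example.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(image_embs, sentence_embs, temperature=0.07):
    """Generic symmetric InfoNCE loss (hypothetical helper, not from the paper).

    Row i of image_embs and row i of sentence_embs are assumed to be a
    matched pair; every other row in the batch acts as a negative.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    sents = [l2_normalize(v) for v in sentence_embs]
    n = len(imgs)

    # Temperature-scaled cosine similarity between every image and sentence.
    logits = [
        [sum(a * b for a, b in zip(imgs[i], sents[j])) / temperature
         for j in range(n)]
        for i in range(n)
    ]

    def cross_entropy_diag(rows):
        # Mean cross-entropy where the correct class for row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    # Average the image-to-sentence and sentence-to-image directions.
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(transposed))
```

With this sketch, a batch whose image and sentence embeddings are correctly paired yields a loss near zero, while swapping the pairing yields a large loss. The hard negative mining and momentum distillation mentioned in the abstract are not modelled here.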

Details

Language :
English
ISSN :
1361-8423
Volume :
91
Database :
MEDLINE
Journal :
Medical image analysis
Publication Type :
Academic Journal
Accession number :
37952385
Full Text :
https://doi.org/10.1016/j.media.2023.103021