Semantic Retrieval of Similar Radiological Images using Vision Transformers

Authors :
Anjali Thakrar
Michael Jayasuriya
Adrian Serapio
Xiao Wu
Eric Davis
Jamie Schroeder
Maya Vella
Jae Ho Sohn
Publication Year :
2023
Publisher :
Cold Spring Harbor Laboratory, 2023.

Abstract

Background: Identifying visually and semantically similar radiological images in a database can facilitate the creation of decision support tools, teaching files, and research cohorts. Existing content-based image retrieval tools are often limited to searching by pixel-wise difference or vector distance of model predictions. Vision transformers (ViT) use attention to simultaneously take into account radiological diagnosis and visual appearance.

Purpose: We aim to develop a ViT-based image retrieval framework and evaluate the algorithm on NIH Chest Radiographs (CXR) and NLST Chest CTs.

Materials and Methods: The model was trained on 112,120 CXR and 111,955 CT images. For CXR, a ViT binary classifier was trained on 4 ground truth labels (Cardiomegaly, Opacity, Emphysema, No Finding) and ensembled to produce multilabel classifications for each CXR. For CT, a regression model was trained to minimize L1 loss on the continuous ground truth labels of patient weight. The ViT image embedding layer was treated as a global image descriptor, using the L2 distance between descriptors as a similarity measure. To qualitatively evaluate the model, five radiologists performed a reader performance study with random query images (25 CT, 25 CXR). For each image, they chose the 5 most similar images from a set of 10 images (the 5 closest and 5 furthest images from the query in model space). Inter-radiologist and radiologist-model agreement statistics were calculated.

Results: The CXR model achieved nDCG@5 of 0.73 (p…

Conclusion: Our ViT architecture retrieved visually and semantically similar radiological images.

Summary Statement: This study evaluates the efficacy of using ViT-based image embeddings for CBIR tasks for CXR and CT images, finding that it performs well on visual and semantic recognition tasks.

Key Results:
The CXR model achieved nDCG@5 of 0.73 (p…
The CT model achieved nDCG of 16.85 (p…
Inter-radiologist Fleiss Kappa of 0.51 and radiologist consensus to model Cohen’s Kappa of 0.65 were observed.
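
The retrieval and scoring steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy 2-D descriptors, and the nDCG formulation (linear gain with a log2 discount) are assumptions made for illustration, and the abstract does not state which nDCG variant was used. The sketch ranks database descriptors by L2 distance to a query, picks the k closest and k furthest items (mirroring the candidate sets shown to readers), and scores a ranked list with nDCG@k.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(query, database):
    """Return database indices ordered from closest to furthest from the query."""
    return sorted(range(len(database)),
                  key=lambda i: l2_distance(query, database[i]))


def closest_and_furthest(query, database, k=5):
    """Indices of the k closest and k furthest descriptors.

    The two sets are disjoint only when the database holds at least 2*k items.
    """
    order = rank_by_distance(query, database)
    return order[:k], order[-k:]


def ndcg_at_k(relevances, k=5):
    """nDCG@k for relevance scores given in retrieved order.

    Assumed formulation: linear gain, log2 position discount.
    """
    def dcg(scores):
        return sum(r / math.log2(i + 2) for i, r in enumerate(scores[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical toy descriptors; real ViT embeddings have hundreds of dimensions.
toy_database = [[0, 0], [1, 0], [0, 2], [3, 3], [10, 10], [5, 5]]
toy_query = [0, 0]
near, far = closest_and_furthest(toy_query, toy_database, k=2)
```

In practice the descriptors would come from the trained model's embedding layer, and the relevance scores passed to `ndcg_at_k` would come from label overlap or reader judgments; both are outside the scope of this sketch.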

Details

Database :
OpenAIRE
Accession number :
edsair.doi...........bafb9dae9275771f03d0b9cb78750621