Background
Lung cancer is one of the most common types of cancer in the United Kingdom, and it is often diagnosed late. The 5-year survival rate for lung cancer is below 10%. Early diagnosis may improve survival. Software that has an artificial intelligence-developed algorithm might be useful in assisting with the identification of suspected lung cancer.

Objectives
This review sought to identify evidence on adjunct artificial intelligence software for analysing chest X-rays for suspected lung cancer, and to develop a conceptual cost-effectiveness model to inform discussion of what would be required to develop a fully executable cost-effectiveness model for future economic evaluation.

Data sources
The data sources were MEDLINE All, EMBASE, Cochrane Database of Systematic Reviews, Cochrane CENTRAL, Epistemonikos, ACM Digital Library, World Health Organization International Clinical Trials Registry Platform, the Tufts Cost-Effectiveness Analysis Registry, company submissions and clinical experts. Searches were conducted from 25 November 2022 to 18 January 2023.

Methods
Rapid evidence synthesis methods were employed, and data from companies were scrutinised. The eligibility criteria were: (1) primary care populations referred for chest X-ray due to symptoms suggestive of lung cancer or for reasons unrelated to lung cancer; (2) study designs comparing radiology specialists assessing chest X-rays with adjunct artificial intelligence software versus radiology specialists alone; and (3) outcomes relating to test accuracy, the practical implications of using artificial intelligence software and patient-related outcomes. A conceptual decision-analytic model was developed to inform a potential full cost-effectiveness evaluation of adjunct artificial intelligence software for analysing chest X-ray images to identify suspected lung cancer.

Results
None of the studies identified in the searches or submitted by the companies met the inclusion criteria of the review. Contextual information from six studies that did not meet the inclusion criteria provided some evidence that sensitivity for lung cancer detection (but not nodule detection) might be higher when chest X-rays are interpreted by radiology specialists in combination with artificial intelligence software than when they are interpreted by radiology specialists alone. No significant differences were observed for specificity, positive predictive value or the number of cancers detected. None of the six studies provided evidence on the clinical effectiveness of adjunct artificial intelligence software. The conceptual model highlighted a paucity of input data along the diagnostic pathway and identified the key assumptions required for evidence linkage.

Limitations
This review employed rapid evidence synthesis methods: only one reviewer conducted all elements of the review, and targeted searches were conducted in English only. No eligible studies were identified.

Conclusions
There is currently no evidence applicable to this review on the use of adjunct artificial intelligence software for the detection of suspected lung cancer on chest X-ray in either people referred from primary care with symptoms of lung cancer or people referred from primary care for other reasons.

Future work
Future research is required to understand the accuracy of adjunct artificial intelligence software in detecting lung nodules and cancers, as well as its impact on clinical decision-making and patient outcomes.
Research generating key input parameters for the conceptual model will enable refinement of the model structure, and conversion to a fully working model, to analyse the cost-effectiveness of artificial intelligence software for this indication.

Study registration
This study is registered as PROSPERO CRD42023384164.

Funding
This award was funded by the National Institute for Health and Care Research (NIHR) Evidence Synthesis programme (NIHR award ref: NIHR135755) and is published in full in Health Technology Assessment; Vol. 28, No. 50. See the NIHR Funding and Awards website for further award information.

Plain language summary
Lung cancer is one of the most common types of cancer in the United Kingdom. Early diagnosis may improve survival, as lung cancer is often diagnosed late. Chest X-rays can be used to identify features of lung cancer, but there can be delays in getting X-rays, and sometimes features of lung cancer are not seen on them. Artificial intelligence software may help by finding features of cancer on chest X-rays and highlighting them; a radiologist then looks at the X-rays together with the information from the software. There is a lack of information about how lung cancer diagnosis could change if artificial intelligence software is used and what the costs may be to the National Health Service. This project looked at the use of artificial intelligence software in the detection of lung cancer in people referred from primary care. Software companies were invited to provide evidence. There were no studies that looked at this topic among people referred from primary care, so we summarised the closest evidence we could find instead. All of this evidence had flaws, so we could not tell whether the results were accurate or helpful to this review. It was not clear whether artificial intelligence helped to find cancers or improve people's health. We made a theoretical model to discuss the best way to assess whether artificial intelligence software might be cost-effective in detecting lung cancer, and what evidence would be needed to do this in a fully working model. Costs and alternative pricing models provided by five companies were used to calculate the cost of adding artificial intelligence software to the review of chest X-rays in people referred by their general practitioner, for the first 5 years, based on one National Health Service trust. Future studies are needed to identify the impact of adjunct artificial intelligence on test accuracy, clinical decision-making and patient outcomes (e.g. mortality and morbidity).

Scientific summary

Background
Lung cancer occurs when abnormal cells multiply in an uncontrolled way to form a tumour in the lung. It is one of the most common types of cancer in the UK, with over 43,000 new cases diagnosed each year. In the early stages of the disease, people usually do not have symptoms, which means that lung cancer is often diagnosed late. The 5-year survival rate for lung cancer is low, at below 10%. Early diagnosis may improve survival. The National Institute for Health and Care Excellence (NICE) has identified software that has an artificial intelligence (AI)-developed algorithm (referred to hereafter as AI software) as potentially useful in assisting with the identification of suspected lung cancer. AI combines computer science and data sets to enable problem-solving. Machine learning and deep learning are subfields of AI, comprising algorithms that seek to create expert systems to make predictions or classifications based on data input.
This assessment covers the use of AI software as an adjunct to an appropriate radiology specialist to assist in the identification of suspected lung cancer on chest X-rays (CXRs). The AI technologies subject to this assessment are standalone software platforms developed with deep-learning algorithms to interpret CXRs; the algorithms are fixed in use but updated periodically. The AI software automatically interprets radiology images from the CXR to identify abnormalities or suspected abnormalities. The abnormalities detected, and the methods of flagging their location and type, differ between AI technologies. For example, a CXR may be flagged as suspected lung cancer when a lung nodule, lung mass or hilar enlargement, or a combination of these, is identified. A technology may classify CXRs into those with and without a nodule, or it may identify several different abnormalities or lung diseases.

Objectives
The overall aim of this early value assessment (EVA) is to identify evidence on adjunct AI software for analysing CXRs for suspected lung cancer and to identify evidence gaps to help direct data collection and further research. A conceptual modelling process was undertaken to inform discussion of what would be required to develop a fully executable cost-effectiveness model for future economic evaluation. The assessment is not intended to replace the need for a full assessment (Diagnostic Assessment Report) or to provide sufficient detail or synthesis to enable a recommendation to be made about whether AI software can be implemented in clinical practice at the present time.

There are two populations of interest in this EVA: (1) people referred from primary care for a CXR because they have symptoms suggestive of lung cancer (symptomatic population) and (2) people referred from primary care for a CXR for reasons unrelated to lung cancer (incidental population).

Based on the scope produced by NICE, we defined the following questions to inform future assessment of the benefits, harms and costs of adjunct AI for analysing CXRs for suspected lung cancer compared with a human reader alone in these populations:
- What is the test accuracy and test failure rate of adjunct AI software to detect lung cancer on CXRs?
- What are the practical implications of adjunct AI to detect lung cancer on CXRs?
- What is the clinical effectiveness of adjunct AI software applied to CXRs?
- What are the cost and resource use considerations relating to the use of adjunct AI to detect lung cancer?
- What would a health economic model to estimate the cost-effectiveness of adjunct AI to detect lung cancer look like?

Methods

Data sources
MEDLINE All (via Ovid), EMBASE (via Ovid), Cochrane Database of Systematic Reviews (via Wiley), Cochrane CENTRAL (via Wiley), Epistemonikos, ACM Digital Library, World Health Organization International Clinical Trials Registry Platform, clinical experts and company submissions.

Eligibility criteria
- Population: people referred for CXR from primary care because they have symptoms suggestive of lung cancer, and people referred for CXR from primary care for reasons unrelated to lung cancer.
- Intervention: radiology specialist with adjunct AI.
- Comparator: radiology specialist without adjunct AI.
- Outcomes: test accuracy, patient management, clinical effectiveness.

Study selection, data extraction and assessment of risks of bias
Titles and abstracts of all identified records were screened by one reviewer against the review eligibility criteria, with a random 20% screened by a second reviewer.
Full texts of records considered potentially relevant by either reviewer were retrieved and assessed for inclusion by one reviewer; a random 20% sample was assessed independently by a second reviewer, with any disagreements resolved by consensus or discussion with a third reviewer. We planned to extract data into a piloted form, assess risk of bias and synthesise data using the methods described in the research protocol; however, no studies met the inclusion criteria. Post hoc methods were therefore determined, following discussions with NICE, to select and summarise the evidence closest to the review inclusion criteria. Studies were selected that assessed eligible AI software in conjunction with radiologists compared with radiologists alone, but in which the referral status and symptomatic status of the population were unclear. Data were extracted by one reviewer, with a random 20% checked by a second reviewer. Results were summarised narratively, and key biases were noted.

Data synthesis
A narrative data synthesis was performed.

Modelling
The conceptual modelling process explored both the structure of, and the evidence requirements for, parameter inputs for future model development. An iterative approach was taken to facilitate the identification of cost outcomes, potential value drivers of AI software for this indication and, where time allowed, evidence linkage requirements for longer-term outcomes. Costs associated with implementing AI software were also considered. Information to inform the conceptual model was obtained from a variety of sources, including a literature review, current clinical guidelines, discussion with specialist clinical experts and the companies submitting evidence on AI software. Given the time available, the diagnostic component of the model was the primary focus of the health economics aspect of the report (an illustrative sketch of this component is provided after the Results section below). Priority was given to the following considerations:
- input parameters to populate the model, including consideration of the type of evidence required, the sources available and gaps in the evidence
- relevant outcome measures to compare the cost-effectiveness and clinical effectiveness of AI software in the detection of lung cancer
- identification of potential value drivers of the model, with recommendations for how these can be measured for inclusion in a cost-effectiveness model.

Results

Test accuracy, practical implications and clinical effectiveness
No studies met the inclusion criteria of the review. Two ongoing studies with unclear eligibility were identified. In the absence of available evidence, we summarised data from six studies that had unclear populations but included a comparison of CXRs read by readers with and without the use of commercial AI software. Statistical comparisons were not undertaken in most of the studies, but there was some evidence that sensitivity might be higher among specialist radiologists reading with AI than among specialist radiologists reading without AI; this finding was not consistent between studies, however. No significant differences were observed for specificity, positive predictive value or the number of cancers detected. None of the studies provided evidence on the clinical effectiveness of adjunct AI software. The summarised studies were small retrospective studies with important methodological limitations, and their generalisability to the UK population is unclear.
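For reference, the test accuracy measures discussed above follow the standard definitions, where TP, FP, TN and FN denote the counts of true-positive, false-positive, true-negative and false-negative CXR interpretations against the reference standard:

\[
\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad
\text{Specificity} = \frac{TN}{TN + FP}, \qquad
\text{PPV} = \frac{TP}{TP + FP}
\]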
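To make the conceptual modelling discussion concrete, the sketch below illustrates the diagnostic component referred to in the Modelling section, assuming a simple decision-tree structure in which CXRs flagged as suspicious proceed to confirmatory computed tomography (CT). It is an illustrative skeleton only, not the conceptual model itself, and all parameter values (prevalence, reader accuracy, CT cost and the per-CXR software fee) are hypothetical placeholders; the review identified no applicable evidence with which to populate such a model.

```python
# Illustrative decision-tree skeleton for the diagnostic component of a
# cost-effectiveness model of adjunct AI CXR reading.
# ALL numbers are hypothetical placeholders; no applicable evidence was
# identified by the review to populate a model of this kind.
from dataclasses import dataclass

@dataclass
class Strategy:
    name: str
    sensitivity: float   # probability a cancer is flagged on CXR
    specificity: float   # probability a non-cancer is not flagged
    cost_per_cxr: float  # reading cost per CXR, including any AI fee (GBP)

def diagnostic_outcomes(s: Strategy, cohort: float, prevalence: float,
                        ct_cost: float) -> dict:
    """Expected counts and costs for one cohort passing through CXR triage.

    Flagged CXRs (true and false positives) are referred for confirmatory CT.
    Linking these counts to stage at diagnosis and survival is the key
    evidence gap identified by the conceptual modelling exercise.
    """
    cancers = cohort * prevalence
    tp = cancers * s.sensitivity
    fn = cancers - tp
    fp = (cohort - cancers) * (1 - s.specificity)
    cost = cohort * s.cost_per_cxr + (tp + fp) * ct_cost
    return {"TP": tp, "FN": fn, "FP": fp, "cost": cost}

if __name__ == "__main__":
    cohort, prevalence, ct_cost = 10_000, 0.02, 150.0   # placeholders
    reader = Strategy("Reader alone", 0.80, 0.90, 20.0)
    reader_ai = Strategy("Reader + AI", 0.85, 0.90, 21.0)  # assumed small AI fee
    base = diagnostic_outcomes(reader, cohort, prevalence, ct_cost)
    adj = diagnostic_outcomes(reader_ai, cohort, prevalence, ct_cost)
    extra_cancers = adj["TP"] - base["TP"]
    extra_cost = adj["cost"] - base["cost"]
    print(f"Additional cancers detected: {extra_cancers:.1f}")
    print(f"Incremental cost: GBP {extra_cost:,.0f}")
    if extra_cancers > 0:
        print(f"Cost per additional cancer detected: "
              f"GBP {extra_cost / extra_cancers:,.0f}")
```

Converting such a skeleton into a fully executable model would require linking the additional true-positive and false-positive detections to downstream pathway measures, stage at diagnosis and survival, which is precisely the evidence linkage that the conceptual modelling exercise found to be lacking.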
Conclusions
There is currently no evidence applicable to this review on the use of adjunct AI software for the detection of suspected lung cancer on CXRs in either people referred from primary care with symptoms of lung cancer or people referred from primary care for other reasons.

Implications for service provision
Lung cancer pathways are complex and contain many routes to diagnosis. Although national guidance and timelines for diagnosis exist, practice varies widely across radiology departments and lung cancer teams, both within and across NHS trusts. With many ways to achieve these targets, changes in any area of the diagnostic pathway may have a significant impact elsewhere. There is some evidence on the impact of CXR results on the diagnostic pathway when CXRs are read without AI assistance, as is current practice; this evidence is limited, and results are difficult to compare because of the different study designs used and the different outcomes reported. There is no published evidence linking measures of progression through the diagnostic pathway with long-term outcomes such as stage at diagnosis and survival. There is currently no applicable evidence to show the impact of adding AI software to CXR review on the diagnosis of lung cancer.

There may be multiple ways in which AI software could change measures along this pathway. These could include improved accuracy of lung cancer detection, directing patients along the quickest pathway to diagnosis, quicker report turnaround to achieve earlier confirmatory testing, and prioritisation of cases for review, including those without lung cancer who could be discharged more quickly, freeing up staff time and resources. AI software may also have a negative impact on pathways by increasing the number of benign lung nodules detected and the number of patients who undergo a computed tomography scan that they might not have needed. This would be detrimental to patients, with increased exposure to radiation and anxiety due to a positive CXR result, and would have cost and resource use implications for the department. With a lack of evidence on AI software, the impact on service provision is unknown and may have significant implications for progression through diagnostic pathways, resource use, costs and patient outcomes.

Study registration
This study is registered as PROSPERO CRD42023384164.

Funding
This award was funded by the National Institute for Health and Care Research (NIHR) Evidence Synthesis programme (NIHR award ref: NIHR135755) and is published in full in Health Technology Assessment; Vol. 28, No. 50. See the NIHR Funding and Awards website for further award information.