Background Magnetic resonance imaging-based technologies are non-invasive diagnostic tests that can be used to assess non-alcoholic fatty liver disease. Objectives The study objectives were to assess the diagnostic test accuracy, clinical impact and cost-effectiveness of two magnetic resonance imaging-based technologies (LiverMultiScan and magnetic resonance elastography) for patients with non-alcoholic fatty liver disease for whom advanced fibrosis or cirrhosis had not been diagnosed and who had indeterminate results from fibrosis testing, or for whom transient elastography or acoustic radiation force impulse was unsuitable, or who had discordant results from fibrosis testing. Data sources The data sources searched were MEDLINE, MEDLINE Epub Ahead of Print, In-Process & Other Non-Indexed Citations, Embase, Cochrane Database of Systematic Reviews, Cochrane Central Database of Controlled Trials, Database of Abstracts of Reviews of Effects and the Health Technology Assessment Database. Methods A systematic review was conducted using established methods. Diagnostic test accuracy estimates were calculated using bivariate models and a summary receiver operating characteristic curve was calculated using a hierarchical model. A simple decision-tree model was developed to generate cost-effectiveness results. Results The diagnostic test accuracy review (13 studies) and the clinical impact review (11 studies) only included one study that provided evidence for patients who had indeterminate or discordant results from fibrosis testing. No studies of patients for whom transient elastography or acoustic radiation force impulse were unsuitable were identified. Depending on fibrosis level, relevant published LiverMultiScan diagnostic test accuracy results ranged from 50% to 88% (sensitivity) and from 42% to 75% (specificity). No magnetic resonance elastography diagnostic test accuracy data were available for the specific population of interest. 
Results from the clinical impact review suggested that acceptability of LiverMultiScan was generally positive. To explore how the decision to proceed to biopsy is influenced by magnetic resonance imaging-based technologies, the External Assessment Group presented cost-effectiveness analyses for LiverMultiScan plus biopsy versus biopsy only. Base-case incremental cost-effectiveness ratio per quality-adjusted life year gained results for seven of the eight diagnostic test strategies considered showed that LiverMultiScan plus biopsy was dominated by biopsy only; for the remaining strategy (Brunt grade ≥2), the incremental cost-effectiveness ratio per quality-adjusted life year gained was £1,266,511. Results from threshold and scenario analyses demonstrated that External Assessment Group base-case results were robust to plausible variations in the magnitude of key parameters. Limitations Diagnostic test accuracy, clinical impact and cost-effectiveness data for magnetic resonance imaging-based technologies for the population that is the focus of this assessment were limited. Conclusions Magnetic resonance imaging-based technologies may be useful to identify patients who may benefit from additional testing in the form of liver biopsy and those for whom this additional testing may not be necessary. However, there is a paucity of diagnostic test accuracy and clinical impact data for patients who have indeterminate results from fibrosis testing, for whom transient elastography or acoustic radiation force impulse are unsuitable or who had discordant results from fibrosis testing. Given the External Assessment Group cost-effectiveness analyses assumptions, the use of LiverMultiScan and magnetic resonance elastography for assessing non-alcoholic fatty liver disease for patients with inconclusive results from previous fibrosis testing is unlikely to be a cost-effective use of National Health Service resources compared with liver biopsy only. 
Study registration This study is registered as PROSPERO CRD42021286891. Funding Funding for this study was provided by the Evidence Synthesis Programme of the National Institute for Health and Care Research (NIHR) Health Technology Assessment programme and will be published in full in Health Technology Assessment; Vol. 27, No. 10. See the NIHR Journals Library website for further project information. Plain language summary Non-alcoholic fatty liver disease includes a range of conditions that are caused by a build-up of fat in the liver, and not by alcohol consumption. This build-up of fat can cause inflammation. Persistent inflammation can cause scar tissue (fibrosis) to develop. It is important to identify patients with fibrosis because severe fibrosis can cause permanent liver damage (cirrhosis), which can lead to liver failure and liver cancer. In the National Health Service, patients with non-alcoholic fatty liver disease undergo tests to determine whether they have fibrosis. The test results are not always accurate and multiple tests can give conflicting results. Some of the tests may not be suitable for patients who have a very high body mass index. In the National Health Service, a liver biopsy may be offered to patients with inconclusive or conflicting test results or to those patients for whom other tests are unsuitable. However, liver biopsy is expensive, and is associated with side-effects such as pain and bleeding. Magnetic resonance imaging-based testing could be used as an extra test to help clinicians assess non-alcoholic fatty liver disease and identify patients who may need a liver biopsy. We assessed two magnetic resonance imaging-based diagnostic tests, LiverMultiScan and magnetic resonance elastography. LiverMultiScan is imaging software that is used alongside magnetic resonance imaging to measure markers of liver disease. 
Magnetic resonance elastography is used in some National Health Service centres to assess liver fibrosis; however, magnetic resonance elastography requires more equipment than just a magnetic resonance imaging scanner. We reviewed all studies examining how well LiverMultiScan and magnetic resonance elastography assess patients with non-alcoholic fatty liver disease. We also built an economic model to estimate the costs and benefits of using LiverMultiScan to identify patients who should be sent for a biopsy. Results from the model showed that LiverMultiScan may not provide good value for money to the National Health Service. Scientific summary Background Non-alcoholic fatty liver disease (NAFLD) is an umbrella term for a range of conditions caused by a build-up of fat in the liver that has not been caused by alcohol consumption. NAFLD covers a spectrum of histological lesions ranging from steatosis (simple fatty liver) to complex patterns of hepatocyte injury, inflammation and fibrosis. In the current National Health Service diagnostic pathway for staging fibrosis (based on guidelines and expert advice to NICE), patients with NAFLD (confirmed by ultrasound and liver aetiology screen) are referred for the fibrosis-4 (FIB-4), NAFLD fibrosis score (NFS) or enhanced liver fibrosis (ELF) test as first-line testing. Patients with an indeterminate result from first-line testing are referred for second-line testing using transient elastography (TE), acoustic radiation force impulse (ARFI) or the ELF test, if it had not already been used as a first-line test. Patients with indeterminate or discordant results from fibrosis testing and patients with high risk of advanced fibrosis are considered for liver biopsy. Magnetic resonance imaging (MRI)-based testing could be used as an additional, non-invasive, diagnostic test to help clinicians stage NAFLD and potentially identify which patients should be referred for liver biopsy. 
Liver biopsy is expensive and is an invasive procedure that is associated with complications. Objectives The objectives of this study were to assess the diagnostic test accuracy (DTA), the clinical impact and the cost-effectiveness of two non-invasive MRI-based technologies, namely LiverMultiScan and magnetic resonance elastography (MRE), for patients with NAFLD for whom advanced fibrosis or cirrhosis had not yet been diagnosed and who had indeterminate results from fibrosis testing, for whom TE or ARFI was unsuitable, or who had discordant results from fibrosis testing. To achieve the study objectives, the External Assessment Group (EAG) conducted a systematic literature review to evaluate (1) the DTA of MRI-based technologies for the assessment of fibrosis, inflammation, and steatosis for patients with NAFLD for whom advanced fibrosis or cirrhosis had not yet been diagnosed, using liver biopsy as the reference standard, and (2) the clinical impact of MRI-based technologies. The EAG also conducted a systematic literature review to explore the cost-effectiveness of MRI-based technologies as diagnostic tools and built a de novo economic model to assess the cost-effectiveness of two diagnostic pathways, namely MRI-based technologies plus biopsy and liver biopsy. Methods: assessment of diagnostic test accuracy and clinical impact Electronic databases (MEDLINE, MEDLINE Epub Ahead of Print, In-Process & Other Non-Indexed Citations, Embase, Cochrane Database of Systematic Reviews, Cochrane Central Database of Controlled Trials, Database of Abstracts of Reviews of Effects, Health Technology Assessment Database) were searched from inception to 4 October 2021. Eligible studies assessed the DTA or clinical impact of LiverMultiScan or MRE for patients with NAFLD for whom advanced fibrosis or cirrhosis had not yet been diagnosed (who have indeterminate results from fibrosis testing, for whom TE or ARFI is unsuitable, or who have discordant results from fibrosis testing). 
Two reviewers independently screened the titles and abstracts of all reports identified through electronic database searches and of all full-text articles subsequently obtained for assessment. Data extraction and quality assessment were conducted by one reviewer and checked for agreement by a second reviewer. The methodological quality of the included DTA studies was assessed using the QUality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool. The methodological quality of randomised controlled trials (RCTs) evaluating the clinical impact of MRI-based technologies was assessed using the Cochrane Risk of Bias 2.0 tool. The National Institutes of Health study quality-assessment tools for cohort studies, case-control studies and before–after (pre-post) studies with no control group were used to assess risk of bias of included non-randomised studies. Qualitative studies were assessed using the Critical Appraisal Skills Programme (CASP) qualitative studies checklist. The sensitivity and specificity of each index test were summarised in forest plots. Where at least three studies provided both sensitivity and specificity data for a specific combination of index test, diagnosis of interest, and cut-off value, a bivariate random-effects meta-analysis to provide pooled estimates of sensitivity and specificity was considered. We did not perform bivariate meta-analyses where statistical heterogeneity between the studies (assessed by visually examining forest plots) was so great that pooled estimates of sensitivity and specificity would have been meaningless. Where at least three studies provided both sensitivity and specificity data for a specific combination of index test and diagnosis of interest, but used different cut-off values for the index test, we used a hierarchical model to estimate a summary receiver operating characteristic (ROC) curve. 
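As an illustration of the per-study estimates that feed a forest plot, the following minimal sketch derives sensitivity and specificity from a single 2 × 2 table, with approximate 95% confidence intervals. It is not the EAG's analysis code: the function names, the choice of the Wilson score interval and the counts are all assumptions made for illustration, and the bivariate and hierarchical models described above are not reproduced here.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (approximately 95% for z = 1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity and specificity of an index test against a reference standard.

    tp/fn: reference-positive patients with a positive/negative index test.
    tn/fp: reference-negative patients with a negative/positive index test.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


# Hypothetical counts, NOT taken from any study included in the review.
result = accuracy_from_2x2(tp=30, fp=20, fn=10, tn=40)
print(result)
```

With these hypothetical counts, sensitivity is 30/40 and specificity is 40/60. Estimates of this kind are the study-level inputs that a pooled analysis would then combine.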
Methods: assessment of cost-effectiveness The EAG appended an economic evaluation-specific search filter to the clinical search strategies to identify published cost-effectiveness studies. In addition, two databases of economic publications [EconLit (EBSCO) and the cost-effectiveness analysis (CEA) registry] were searched from inception until 4 October 2021. The EAG developed a simple, flexible de novo model to estimate the cost-effectiveness of an MRI-based technologies plus biopsy pathway versus a liver biopsy only pathway. Results The EAG searches of the electronic databases and reference lists of relevant studies and systematic reviews identified 4489 records (3331 unique records). Although all the identified studies for inclusion in the DTA, clinical impact and cost-effectiveness reviews included patients with NAFLD for whom advanced fibrosis or cirrhosis had not yet been diagnosed, only one study provided results for patients with NAFLD who had indeterminate or discordant results from fibrosis testing. No studies were identified that considered patients for whom TE or ARFI was unsuitable. Diagnostic test accuracy The EAG identified 13 studies (15 publications). Two studies (four publications) were evaluations of LiverMultiScan, 10 studies (10 publications) were evaluations of MRE, and one study (one publication) was an evaluation of LiverMultiScan and MRE. MRI-based technology: LiverMultiScan For the LiverMultiScan proton density fat fraction (PDFF) and LiverMultiScan iron-corrected T1 (cT1) outputs, 2 × 2 data were available from three studies. The EAG considers that the Eddowes 2018 study is the most relevant study to this assessment. Eddowes 2018 recruited patients who were scheduled for non-targeted liver biopsy to stage fibrosis after inconclusive non-invasive assessment of fibrosis or to make a diagnosis after a range of non-invasive tests had not confirmed a diagnosis. 
For diagnosis of fibrosis, estimates from Eddowes 2018 ranged from 50% to 88% for sensitivity and from 42% to 75% for specificity. Sensitivity and specificity values for fibrosis testing in Eddowes 2018 were consistently higher for LiverMultiScan cT1 than for LiverMultiScan PDFF. Data from three studies were included in the meta-analyses for LiverMultiScan. For advanced fibrosis (≥F3), the pooled sensitivity and specificity values were higher for LiverMultiScan cT1 [sensitivity = 60.2%, 95% confidence interval (CI): 50.9% to 68.8%; specificity = 65.4%, 95% CI 55.8% to 73.9%] than for LiverMultiScan PDFF (sensitivity = 38.6%, 95% CI 23.8% to 56.0%; specificity = 43.6%, 95% CI 30.7% to 57.5%). MRI-based technology: magnetic resonance elastography For the MRE test, 2 × 2 data were available from four studies. Estimates of sensitivity and specificity for advanced fibrosis (≥F3) were high and ranged from 71% to 100% and 79% to 93%, respectively. However, the cut-off values used to indicate a positive result from the index test varied between studies; therefore, a summary ROC curve was estimated. The summary ROC curve indicates high DTA. However, observed study results do not all lie close to the summary ROC curve, which could be due to small sample sizes and/or clinical and methodological heterogeneity between the included studies. Clinical impact Eleven studies (14 publications) were included in the clinical impact review. Five studies (eight publications) were evaluations of LiverMultiScan and six studies (six publications) were evaluations of MRE. MRI-based technology: LiverMultiScan Two studies reported on the prognostic ability of LiverMultiScan cT1. However, neither study reported results specifically for the subpopulation of patients with NAFLD for whom advanced fibrosis or cirrhosis had not yet been diagnosed. 
One study reported that LiverMultiScan cT1 and LiverMultiScan PDFF could reduce the number of unnecessary biopsies for patients with non-NAFLD and NAFLD to diagnose non-alcoholic steatohepatitis (NASH) and fibrosis unrelated to NAFLD [EAG calculated odds ratio (OR) = 0.65, 95% CI 0.22 to 1.96] and for patients with no to mild fibrosis (F0 to F1) to diagnose significant fibrosis to cirrhosis (F2 to F4; EAG calculated OR = 0.59, 95% CI 0.18 to 1.89) when compared to standard of care. Three studies reported the test failure rate of LiverMultiScan for patients with all liver aetiologies. The test failure rate ranged from 5.3% to 7.6%. One study reported the test failure rate for LiverMultiScan for patients with NAFLD (5.6%). Acceptability of LiverMultiScan was reported in a qualitative study and was generally positive. MRI-based technology: magnetic resonance elastography Six studies reported the test failure rate of MRE for patients with all liver aetiologies. The test failure rate ranged from 0.0% to 7.6%. Three studies reported the test failure rate for MRE specifically for patients with NAFLD. The EAG performed a fixed-effects meta-analysis to obtain a pooled estimate of 4.2% (95% CI 2.5% to 6.2%) test failure rate for patients with NAFLD. Despite conducting additional targeted searches, the EAG did not identify any relevant studies that provided evidence of the clinical impact of MRI-based technologies for patients with NAFLD for whom advanced fibrosis or cirrhosis has not been diagnosed, for the remaining clinical impact outcomes listed in the final scope issued by NICE. Cost-effectiveness The EAG base-case incremental cost-effectiveness ratio (ICER) per quality-adjusted life year (QALY) gained results for seven of the eight diagnostic test strategies considered showed that the LiverMultiScan plus biopsy pathway was dominated by the biopsy only pathway. For Brunt grade ≥2, the ICER per QALY gained was £1,266,511. 
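The terms "dominated" and "ICER per QALY gained" follow standard cost-effectiveness conventions, which the short sketch below illustrates. It is not the EAG decision-tree model: the function name, the status labels and all cost and QALY values are hypothetical and chosen only to show how the comparison works.

```python
def compare_strategies(cost_new, qaly_new, cost_ref, qaly_ref):
    """Compare a new strategy with a reference strategy.

    Returns (status, icer). The ICER is only reported when there is a genuine
    trade-off between costs and QALYs; otherwise it is None.
    """
    d_cost = cost_new - cost_ref
    d_qaly = qaly_new - qaly_ref
    if d_cost == 0 and d_qaly == 0:
        return ("equivalent", None)
    if d_cost >= 0 and d_qaly <= 0:
        # No cheaper and no more effective than the reference.
        return ("dominated", None)
    if d_cost <= 0 and d_qaly >= 0:
        # No more costly and no less effective than the reference.
        return ("dominant", None)
    # Either more costly and more effective, or less costly and less effective.
    return ("trade-off", d_cost / d_qaly)


# Hypothetical per-patient costs (GBP) and QALYs, NOT taken from the EAG model.
status, icer = compare_strategies(cost_new=1500, qaly_new=10.25,
                                  cost_ref=1000, qaly_ref=10.0)
print(status, icer)  # trade-off 2000.0
```

Because the ICER divides the cost difference by the QALY difference, a very small QALY gain combined with a material extra cost produces a very large ratio.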
Results from the EAG threshold and scenario analyses demonstrated that these results were robust to plausible variations in the magnitude of key parameters. Conclusions The DTA, clinical impact and cost-effectiveness data for MRI-based technologies are limited for patients who have indeterminate results from fibrosis testing, for whom TE or ARFI is unsuitable or patients who have discordant results from fibrosis testing. Only one small LiverMultiScan study provided DTA and population prevalence data for patients described in the final scope issued by NICE. It is unclear whether sensitivity and specificity estimates reported by this small study will give clinicians sufficient confidence to use LiverMultiScan test results to triage patients with inconclusive results from previous fibrosis testing to biopsy. Cost-effectiveness results from the EAG model are only informative if clinicians have confidence in LiverMultiScan DTA data. Using the available DTA and population prevalence data, EAG cost-effectiveness results showed that LiverMultiScan is unlikely to be cost-effective at current prices when used to triage patients with inconclusive results from previous fibrosis testing to biopsy. LiverMultiScan data are not available for patients for whom TE or ARFI was unsuitable. Further, no MRE DTA data were available for the population described in the final scope issued by NICE. The EAG was unable to generate cost-effectiveness results for this technology; however, even if MRE was 100% accurate, due to high population prevalence estimates it is unlikely that MRE would be cost-effective at current prices.