
Can large language models fully automate or partially assist paper selection in systematic reviews?

Authors :: Chen H; Jiang Z; Liu X; Xue CC; Yew SME; Sheng B; Zheng YF; Wang X; Wu Y; Sivaprasad S; Wong TY; Chaudhary V; Tham YC
Source :: The British journal of ophthalmology [Br J Ophthalmol] 2025 Jan 15. Date of Electronic Publication: 2025 Jan 15.
Publication Year :: 2025
Publisher :: Ahead of Print
Abstract: Background/aims: Large language models (LLMs) have substantial potential to enhance the efficiency of academic research. The accuracy and performance of LLMs in a systematic review, a core part of evidence building, have yet to be studied in detail.

Methods: We introduced two LLM-based approaches to systematic review: an LLM-enabled fully automated approach (LLM-FA) utilising three different GPT-4 plugins (Consensus GPT, Scholar GPT and GPT web browsing modes) and an LLM-facilitated semi-automated approach (LLM-SA) using GPT-4's Application Programming Interface (API). We benchmarked these approaches using three published systematic reviews that reported the prevalence of diabetic retinopathy across different populations (general population, pregnant women and children).

Results: The three published reviews consisted of 98 papers in total. Across these three reviews, in the LLM-FA approach, Consensus GPT correctly identified 32.7% (32 out of 98) of papers, while Scholar GPT and GPT-4's web browsing modes identified only 19.4% (19 out of 98) and 6.1% (6 out of 98), respectively. The LLM-SA approach, by contrast, not only successfully included 82.7% (81 out of 98) of these papers but also correctly excluded 92.2% of 4497 irrelevant papers.

Conclusions: Our findings suggest LLMs are not yet capable of autonomously identifying and selecting relevant papers in systematic reviews. However, they hold promise as an assistive tool to improve the efficiency of the paper selection process in systematic reviews.

Competing interests: TYW declares consulting fees from Aldropika Therapeutics, Bayer, Boehringer Ingelheim, Genentech, Iveric Bio, Novartis, Plano, Oxurion, Roche, Sanofi and Shanghai Henlius; funding from the National Key R&D Program, China (grant number 2022YFC2502802); and being an inventor, patent holder and co-founder of the start-up companies EyRiS and Visre. All other authors declare no competing interests.
(© Author(s) (or their employer(s)) 2025. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ Group.)
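
The percentages in the Results can be rechecked from the counts given in the abstract. The following is a minimal sketch, not the authors' code: the variable and function names are illustrative, and the count of correctly excluded papers is only an approximation back-calculated from the reported 92.2%, since the abstract does not state it.

```python
# Counts taken from the abstract; all names here are illustrative.
TOTAL_RELEVANT = 98      # papers included across the three published reviews
TOTAL_IRRELEVANT = 4497  # irrelevant papers screened in the LLM-SA approach

# Number of the 98 papers each approach identified or included.
identified = {
    "Consensus GPT (LLM-FA)": 32,
    "Scholar GPT (LLM-FA)": 19,
    "GPT-4 web browsing (LLM-FA)": 6,
    "GPT-4 API (LLM-SA)": 81,
}


def percent_identified(found: int, total: int = TOTAL_RELEVANT) -> float:
    """Share of the relevant papers that were found, as a percentage (1 d.p.)."""
    return round(100 * found / total, 1)


percentages = {name: percent_identified(n) for name, n in identified.items()}

# ASSUMPTION: the abstract reports only the rate (92.2%), not the count,
# so this figure is an estimate rather than a reported result.
approx_correctly_excluded = round(0.922 * TOTAL_IRRELEVANT)

if __name__ == "__main__":
    for name, pct in percentages.items():
        print(f"{name}: {pct}% of {TOTAL_RELEVANT} papers")
    print(f"LLM-SA correctly excluded roughly {approx_correctly_excluded} "
          f"of {TOTAL_IRRELEVANT} irrelevant papers")
```

Running the script reproduces the four reported percentages (32.7%, 19.4%, 6.1% and 82.7%) from the stated counts.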

Details

Language :: English
ISSN :: 1468-2079
Database :: MEDLINE
Journal :: The British journal of ophthalmology
Publication Type :: Academic Journal
Accession number :: 39814458
Full Text :: https://doi.org/10.1136/bjo-2024-326254
