1. Three versions of an atopic dermatitis case report written by humans, artificial intelligence, or both: Identification of authorship and preferences
- Author
Mara Giavina Bianchi, MD, PhD, Andrew D’adario, MSc, Pedro Giavina Bianchi, MD, PhD, Birajara Soares Machado, PhD, Rosana Agondi, MD, PhD, Stephanie K.A. Almeida, MD, Wandilson Xavier Alves Junior, MD, Larissa M. Armelin, Marcelo Vivolo Aun, MD, PhD, Natália Bordignon, Karla Boufleur, MD, Felipe B. Brunheroto, Elisabeth A. Callegaro, MD, Paula Lazaretti M. Castro, MD, Herberto Jose Chong-Neto, MD, PhD, Mariana D. Dall’Osto, Julia Abou Dias, Viviane Heintze Ferreira, MD, André Luiz Oliveira Feodrippe, MD, Livia G. Fonseca, MD, Clydia M. Garcia, MD, Bruna H. Giavina-Bianchi, Ekaterini Goudouris, MD, PhD, Danilo Gois Gonçalves, MD, Debora D. Hernandes, MD, Malek Imad, MD, Larissa S. Izabel, Lucas Cauê Jacintho, Carolina Khouri-Panzarin, Fabio Kuschnir, MD, PhD, Maria Beatriz Pádua Lima, Amanda I. Lopes, MD, Larissa Nathalia Macêdo Nóbrega Lopes, MD, Alice Rocha de Magalhães, MD, Eli Mansour, MD, PhD, Ana Karolina B.B. Marinho, MD, PhD, Vivian S. Martimiano, Pedro H. Milori, Antonio Marcondes Mutarelli, Guilherme Paes Gonçalves Nogueira, Beatriz K.T. Oguido, Bruna S. Alarcon de Oliveira, MD, Emerson Costa de Oliveira, Georgia A. Padulla, MD, Letícia D’Ordaz Lhano Santos, Micaelly Samara Meneses Santos, MD, Emanuel Sarinho, MD, PhD, Marcela Schoen, MD, Brian Lucas A. Sousa, MD, PhD, Eduardo Magalhães de Souza-Lima, MD, Beatriz C. Todt, MD, and Najla Braz da Silva Vaz, MD
- Subjects
ChatGPT, Generative Pre-training Transformer (GPT), large language model (LLM), artificial intelligence, scientific writing, medical survey, Immunologic diseases. Allergy, RC581-607
- Abstract
Background: The use of artificial intelligence (AI) in scientific writing is rapidly increasing, raising concerns about authorship identification, content quality, and writing efficiency. Objectives: This study investigates the real-world impact of ChatGPT, a large language model, on those aspects in a simulated publication scenario. Methods: Forty-eight individuals representing 3 medical expertise levels (medical students, residents, and experts in allergy or dermatology) evaluated 3 blinded versions of an atopic dermatitis case report: one each human written (HUM), AI generated (AI), and combined written (COM). The survey assessed authorship, ranked their preference, and graded 13 quality criteria for each text. Time taken to generate each manuscript was also recorded. Results: Authorship identification accuracy mirrored the odds at 33%. Expert participants (50.9%) demonstrated significantly higher accuracy compared to residents (27.7%) and students (19.6%, P < .001). Participants favored AI-assisted versions (AI and COM) over HUM (P < .001), with COM receiving the highest quality scores. COM and AI achieved 83.8% and 84.3% reduction in writing time, respectively, compared to HUM, while showing 13.9% (P < .001) and 11.1% improvement in quality (P < .001), respectively. However, experts assigned the lowest score for the references of the AI manuscript, potentially hindering its publication. Conclusion: AI can deceptively mimic human writing, particularly for less experienced readers. Although AI-assisted writing is appealing and offers significant time savings, human oversight remains crucial to ensure accuracy, ethical considerations, and optimal quality. These findings underscore the need for transparency in AI use and highlight the potential of human-AI collaboration in the future of scientific writing.
- Published
- 2025