LLaVA-Docent: Instruction Tuning with Multimodal Large Language Model to Support Art Appreciation Education

Authors :
Lee, Unggi
Jeon, Minji
Lee, Yunseo
Byun, Gyuri
Son, Yoorim
Shin, Jaeyoon
Ko, Hongkyu
Kim, Hyeoncheol
Publication Year :
2024

Abstract

Despite the development of AI systems to support learning in many domains, AI assistance for art appreciation education has not been extensively explored. Art appreciation, often perceived as an unfamiliar and challenging endeavor for most students, can be made more accessible with a generative AI-enabled conversation partner that provides tailored questions and encourages the audience to appreciate artwork deeply. This study explores the application of multimodal large language models (MLLMs) in art appreciation education, with a focus on developing LLaVA-Docent, a model designed to serve as a personal tutor for art appreciation. Our approach followed design and development research, iteratively refining the application to produce a functional MLLM-enabled chatbot along with a data design framework for art appreciation education. To that end, we established a virtual dialogue dataset generated by GPT-4, which was instrumental in training our MLLM, LLaVA-Docent. The performance of LLaVA-Docent was evaluated by benchmarking it against alternative settings, which revealed its distinct strengths and weaknesses. Our findings highlight the efficacy of the MLLM-based personalized art appreciation chatbot and demonstrate its applicability to a novel approach in which art appreciation is taught and experienced.

Comment: 37 pages, 4 figures, 10 tables

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2402.06264
Document Type :
Working Paper
Full Text :
https://doi.org/10.1016/j.caeai.2024.100297