1. Comparative analysis of convolutional neural networks and vision transformers for dermatological image classification.
- Author
- Saputra, Verren Angelina; Devi, Marvella Shera; Diana; and Kurniawan, Afdhal
- Subjects
CONVOLUTIONAL neural networks, TRANSFORMER models, IMAGE recognition (Computer vision), OBJECT recognition (Computer vision), DEEP learning
- Abstract
Skin is the outermost part of the human body and is frequently exposed to disease. Healthcare professionals work to treat these conditions, but conventional methods often fall short in distinguishing between diseases that present with visually similar characteristics. Deep Learning, in particular the Convolutional Neural Network (CNN), has been applied to this image classification problem. Although CNNs have achieved high accuracy in image recognition, research into improved models continues. One result is the Vision Transformer (ViT), a more recent and widely adopted architecture that rivals CNNs in image recognition and is reported to offer greater accuracy and performance. This paper compares ResNet152, one of the strongest CNN models, with ViT in classifying skin diseases. The comparison uses the HAM10000 dataset, whose 10,015 images are divided into training, validation, and testing sets. On this dataset, ViT achieved 98.28% accuracy, while ResNet152 achieved 96.70%. The other reported metrics also favored ViT over ResNet152. However, ViT has major drawbacks in computation time and potential overfitting. [ABSTRACT FROM AUTHOR]
- Published
- 2024