Food Classification using Joint Representation of Visual and Textual Data

Authors :
Mittal, Prateek
Goyal, Puneet
Chauhan, Joohi
Publication Year :
2023

Abstract

Food classification is an important task in health care. In this work, we propose a multimodal classification framework that uses a modified version of EfficientNet with the Mish activation function for image classification and a standard BERT transformer-based network for text classification. The proposed network and other state-of-the-art methods are evaluated on a large open-source dataset, UPMC Food-101. The experimental results show that the proposed network outperforms the other methods: compared with the second-best performing method, accuracy differences of 11.57% and 6.34% are observed for image and text classification, respectively. We also compare performance in terms of accuracy, precision, and recall for text classification using both machine learning and deep learning-based models. The comparative analysis of the prediction results for both images and text demonstrates the efficiency and robustness of the proposed approach.

Comment: Updated results and discussions to be posted; some sections need to be expanded
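
Note: The Mish activation named in the abstract is commonly defined as mish(x) = x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x). The following is a minimal scalar sketch of that standard definition, included only for orientation; it is not the authors' code, and the function names `softplus` and `mish` are illustrative choices. How the paper modifies EfficientNet to use Mish is not specified in this record.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable form of ln(1 + exp(x)); avoids overflow for large |x|.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    # Standard Mish definition: x * tanh(softplus(x)).
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    for value in (-2.0, 0.0, 1.0, 5.0):
        print(value, mish(value))
```

Mish is smooth and non-monotonic: it approaches the identity for large positive inputs and approaches zero from below for large negative inputs.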

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2308.02562
Document Type :
Working Paper