Back to Search
Start Over
Combining semi-supervised model and optimized LSTM for image caption generation based on pseudo labels.
- Source :
- Multimedia Tools & Applications; Mar2024, Vol. 83 Issue 10, p29997-30017, 21p
- Publication Year :
- 2024
-
Abstract
- Artificial intelligence's crucial area of image captions. It's a very difficult situation until the advancement of DL is made. A lot of open challenges remain as robustness, generalization and accuracy, results are far from reasonable. As image captioning schemes are data avaricious, pre-training on larger scale datasets, even if not well-curated, is fetching a solid approach. In addition to precisely identifying the image includes the scene, object, connection, and qualities of the item in the image, the image caption generation method should produce natural, fluid, precise, and useful sentences. However, since not all visual information may be utilized, it might be difficult to effectively convey the image's content when writing image captions. Here, the image captioning is done under two models, i.e. NIC model and LSTM based model. At first, (Neural Image Caption) NIC process is done, where, CNN based caption generation is carried out for unlabelled and labeled dataset. Further, features namely, improved BOW and N-gram are derived that are used for training the CNN model. The final caption is generated by optimized LSTM, where the weights are optimally tuned by Harris Hawks with Sinusoidal Chaotic Map Assisted Exploitation (HH-SCME). Finally, BLEU score, rouge and CIDER scores are computed to prove the efficiency of HH-SCME. The proposed model of LSTM+HH-SCME achieves 0.84 BLEU score 1 value as compared to other existing methods like CNN, SSO, PRO, AOA, RNN, LSTM and LSTM+HH-SCME. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 13807501
- Volume :
- 83
- Issue :
- 10
- Database :
- Complementary Index
- Journal :
- Multimedia Tools & Applications
- Publication Type :
- Academic Journal
- Accession number :
- 175897006
- Full Text :
- https://doi.org/10.1007/s11042-023-16687-x