Enhanced Image Captioning Using Bahdanau Attention Mechanism and Heuristic Beam Search Algorithm
- Source :
- IEEE Access, Vol. 12, pp. 100991-101003 (2024)
- Publication Year :
- 2024
- Publisher :
- IEEE, 2024.
Abstract
- Captioning images is a challenging task at the intersection of Computer Vision (CV) and Natural Language Processing (NLP) that involves generating descriptive text to depict the content of an image. Existing methodologies typically employ Convolutional Neural Networks (CNNs) for feature extraction and Recurrent Neural Networks (RNNs) for caption generation. However, these approaches often suffer from a lack of contextual understanding, an inability to capture fine-grained details, and a tendency to generate generic captions. This study proposes VisualCaptionNet (VCN), a novel image captioning model that leverages ResNet50 for rich visual feature extraction and a Long Short-Term Memory (LSTM) network for sequential caption generation while retaining context. By incorporating the Bahdanau attention mechanism to focus on relevant image regions and integrating beam search for coherent and contextually relevant descriptions, VCN addresses the limitations of previous methodologies. Extensive experimentation on benchmark datasets such as Flickr30K and Flickr8K demonstrates VCN's notable improvements of 10% and 12%, respectively, over baseline models in terms of caption quality, coherence, and relevance. These enhancements emphasize VCN's effectiveness in advancing image captioning tasks, promising more accurate and contextually relevant descriptions for images.
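- The record does not include the authors' code, but the two components named in the title are standard techniques. The following is a minimal, framework-agnostic NumPy sketch of additive (Bahdanau) attention over image region features and a generic beam search decoder; all names (bahdanau_attention, beam_search, step_log_probs, the token ids) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def bahdanau_attention(features, hidden, W1, W2, v):
    """Additive (Bahdanau) attention over image region features.

    features: (num_regions, feat_dim)  -- e.g. spatial CNN features
    hidden:   (hidden_dim,)            -- current LSTM decoder state
    W1, W2, v: learned projections, passed in here for illustration.
    """
    # score_i = v^T tanh(W1 f_i + W2 h)
    scores = np.tanh(features @ W1 + hidden @ W2) @ v      # (num_regions,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                                # softmax over regions
    context = weights @ features                            # weighted sum -> (feat_dim,)
    return context, weights

def beam_search(step_log_probs, beam_width=3, max_len=20, start_id=1, eos_id=0):
    """Generic beam search decoder.

    step_log_probs(seq) must return a log-probability vector over the
    vocabulary for the next token given the partial caption `seq`.
    """
    beams = [([start_id], 0.0)]                             # (token ids, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == eos_id:                           # finished captions carry over unchanged
                candidates.append((seq, score))
                continue
            log_probs = step_log_probs(seq)
            for tok in np.argsort(log_probs)[-beam_width:]: # top-k next tokens
                candidates.append((seq + [int(tok)], score + float(log_probs[tok])))
        # keep only the beam_width highest-scoring partial captions
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0][0]
```

- A larger beam width explores more candidate captions at higher compute cost; greedy decoding is the special case beam_width=1.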
Details
- Language :
- English
- ISSN :
- 2169-3536
- Volume :
- 12
- Database :
- Directory of Open Access Journals
- Journal :
- IEEE Access
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.191fc72e6a5463b8fd9d691ddc53b1b
- Document Type :
- article
- Full Text :
- https://doi.org/10.1109/ACCESS.2024.3431091