Towards annotation-free evaluation of cross-lingual image captioning
- Source :
- Proceedings of the 2nd ACM International Conference on Multimedia in Asia.
- Publication Year :
- 2021
- Publisher :
- ACM, 2021.
Abstract
- Cross-lingual image captioning, which captions an unlabeled image in a target language other than English, is an emerging topic in the multimedia field. To spare annotators from re-writing reference sentences for each target language, this paper makes an attempt towards annotation-free evaluation of cross-lingual image captioning. Two scenarios are investigated, depending on whether English references are assumed to be available. For the first scenario, in which the references are available, we propose two metrics, WMDRel and CLinRel. WMDRel measures the semantic relevance between a model-generated caption and the machine translation of an English reference using their Word Mover's Distance. CLinRel is a visual-oriented cross-lingual relevance measure, computed by projecting both captions into a deep visual feature space. For the second scenario, which has zero references and is thus more challenging, we propose CMedRel, which computes a cross-media relevance between the generated caption and the image content in the same visual feature space as used by CLinRel. We have conducted a number of experiments to evaluate the effectiveness of the three proposed metrics. The combination of WMDRel, CLinRel and CMedRel has a Spearman's rank correlation of 0.952 with the sum of BLEU-4, METEOR, ROUGE-L and CIDEr, four standard metrics computed using references in the target language. CMedRel alone has a Spearman's rank correlation of 0.786 with the standard metrics. These results show the potential of the new metrics for evaluation without references in the target language.
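The abstract reports agreement between metrics as Spearman's rank correlation. As a rough illustration of what that figure measures, the sketch below computes the coefficient between two lists of per-model scores. It is not the authors' evaluation code: the function names `average_ranks` and `spearman` are hypothetical, and the score lists are invented toy values, not results from the paper.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over a run of equal values (a tie group).
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Toy values only (assumption): one score per captioning model.
proposed_scores = [0.41, 0.35, 0.52, 0.47, 0.30]  # e.g. a reference-free metric
standard_scores = [1.9, 1.7, 2.2, 2.4, 1.5]       # e.g. a sum of standard metrics

print(round(spearman(proposed_scores, standard_scores), 3))  # 0.9
```

Because the coefficient depends only on how the two metrics order the models, a value near 1 means the proposed metric ranks models almost the same way the reference-based metrics do, regardless of the scale of either score.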
- Subjects :
- FOS: Computer and information sciences
Closed captioning
Machine translation
business.industry
Computer science
Computer Vision and Pattern Recognition (cs.CV)
Feature vector
Computer Science - Computer Vision and Pattern Recognition
02 engineering and technology
computer.software_genre
Field (computer science)
Multimedia (cs.MM)
Image (mathematics)
Annotation
ComputingMethodologies_DOCUMENTANDTEXTPROCESSING
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Relevance (information retrieval)
Artificial intelligence
business
computer
Computer Science - Multimedia
Natural language processing
Word (computer architecture)
Details
- Database :
- OpenAIRE
- Journal :
- Proceedings of the 2nd ACM International Conference on Multimedia in Asia
- Accession number :
- edsair.doi.dedup.....ef687214cd5cb8afeda2d96c1500656c