Domain-Specific Semantics Guided Approach to Video Captioning
- Author
M. Hemalatha and C. Chandra Sekhar
- Subjects
Closed captioning, Semantic HTML, Computer science, business.industry, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Pattern recognition, 02 engineering and technology, Semantics, Domain (software engineering), Set (abstract data type), 03 medical and health sciences, 0302 clinical medicine, 030221 ophthalmology & optometry, 0202 electrical engineering, electronic engineering, information engineering, Feature (machine learning), Preprocessor, 020201 artificial intelligence & image processing, Artificial intelligence, business, Classifier (UML)
- Abstract
In video captioning, the description of a video usually depends on the domain to which the video belongs. Typically, videos belong to a wide range of domains such as sports, music, news, and cooking. In many cases, a video can be associated with more than one domain. In this paper, we propose an approach to video captioning that uses domain-specific decoders. We build a domain classifier to estimate the probabilities of a video belonging to different domains. For each video, we identify the top-k domains based on the estimated probabilities. Each video in the training set is then shared among the domain-specific decoders of its top-k domain labels, so that it contributes to training each of those decoders. The domain-specific decoders use domain-specific semantic tags for generating captions. The proposed approach uses Temporal VLAD to preprocess the features extracted from 2D-CNN and 3D-CNN models. The preprocessed features provide a better feature representation of the videos. The effectiveness of the proposed approach is demonstrated through experimental studies on the Microsoft Video Description (MSVD) corpus and the MSR-VTT dataset.
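The top-k sharing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `top_k_domains` and `route_to_decoders`, the four domain names, and the probability values are all hypothetical, and the classifier output is assumed to be one probability per domain for each video.

```python
def top_k_domains(probs, k):
    """Return the indices of the k most probable domains, most probable first."""
    ranked = sorted(range(len(probs)), key=lambda d: probs[d], reverse=True)
    return ranked[:k]


def route_to_decoders(all_probs, k, domain_names):
    """Map each domain name to the indices of the training videos shared with its decoder."""
    buckets = {name: [] for name in domain_names}
    for video_idx, probs in enumerate(all_probs):
        for d in top_k_domains(probs, k):
            buckets[domain_names[d]].append(video_idx)
    return buckets


# Hypothetical classifier outputs for three training videos (illustrative values only).
domain_names = ["sports", "music", "news", "cooking"]
all_probs = [
    [0.60, 0.10, 0.25, 0.05],  # video 0
    [0.05, 0.15, 0.10, 0.70],  # video 1
    [0.30, 0.40, 0.20, 0.10],  # video 2
]

buckets = route_to_decoders(all_probs, k=2, domain_names=domain_names)
print(buckets)
```

With k = 2, every video appears in exactly two buckets, so each domain-specific decoder would see every training video whose top-2 domains include that domain.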
- Published
- 2020