Author: "Wang, Huayan" / Language: undetermined - Searchworks@Jio Institute Digital Library Search Results

1. Effectively leveraging Multi-modal Features for Movie Genre Classification

Author: Zhang, Zhongping, Gu, Yiwen, Plummer, Bryan A., Miao, Xin, Liu, Jiayi, and Wang, Huayan
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
Abstract: Movie genre classification has been widely studied in recent years due to its various applications in video editing, summarization, and recommendation. Prior work has typically addressed this task by predicting genres based solely on the visual content. As a result, predictions from these methods often perform poorly for genres such as documentary or musical, since non-visual modalities like audio or language play an important role in correctly classifying these genres. In addition, the analysis of long videos at frame level is always associated with high computational cost and makes the prediction less efficient. To address these two issues, we propose a Multi-Modal approach leveraging shot information, MMShot, to classify video genres in an efficient and effective way. We evaluate our method on MovieNet and Condensed Movies for genre classification, achieving 17% ~ 21% improvement on mean Average Precision (mAP) over the state-of-the-art. Extensive experiments are conducted to demonstrate the ability of MMShot for long video analysis and uncover the correlations between genres and multiple movie elements. We also demonstrate our approach's ability to generalize by evaluating the scene boundary detection task, achieving 1.1% improvement on Average Precision (AP) over the state-of-the-art.
Published: 2022
Full Text: View/download PDF

2. Semantic Image Manipulation with Background-guided Internal Learning

Author: Zhang, Zhongping, He, Huiwen, Plummer, Bryan A., Liao, Zhenyu, and Wang, Huayan
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION
Abstract: Image manipulation has attracted a lot of interest due to its wide range of applications. Prior work modifies images either from low-level manipulation, such as image inpainting or through manual edits via paintbrushes and scribbles, or from high-level manipulation, employing deep generative networks to output an image conditioned on high-level semantic input. In this study, we propose Semantic Image Manipulation with Background-guided Internal Learning (SIMBIL), which combines high-level and low-level manipulation. Specifically, users can edit an image at the semantic level by applying changes on a scene graph. Then our model manipulates the image at the pixel level according to the modified scene graph. There are two major advantages of our approach. First, high-level manipulation of scene graphs requires less manual effort from the user compared to manipulating raw image pixels. Second, our low-level internal learning approach is scalable to images of various sizes without reliance on external visual datasets for training. We outperform the state-of-the-art in a quantitative and qualitative evaluation on the CLEVR and Visual Genome datasets. Experiments show 8 points improvement on FID scores (CLEVR) and 27% improvement on user evaluation (Visual Genome), demonstrating the effectiveness of our approach.
Published: 2022
Full Text: View/download PDF

3. ImageSubject: A Large-scale Dataset for Subject Detection

Author: Miao, Xin, Liu, Jiayi, Wang, Huayan, and Fu, Jun
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION
Abstract: Main subjects usually exist in the images or videos, as they are the objects that the photographer wants to highlight. Human viewers can easily identify them but algorithms often confuse them with other objects. Detecting the main subjects is an important technique to help machines understand the content of images and videos. We present a new dataset with the goal of training models to understand the layout of the objects and the context of the image then to find the main subjects among them. This is achieved in three aspects. By gathering images from movie shots created by directors with professional shooting skills, we collect the dataset with strong diversity, specifically, it contains 107,700 images from 21,540 movie shots. We labeled them with the bounding box labels for two classes: subject and non-subject foreground object. We present a detailed analysis of the dataset and compare the task with saliency detection and object detection. ImageSubject is the first dataset that tries to localize the subject in an image that the photographer wants to highlight. Moreover, we find the transformer-based detection model offers the best result among other popular model architectures. Finally, we discuss the potential applications and conclude with the importance of the dataset.
Published: 2022
Full Text: View/download PDF

4. Fine-Grained Control of Artistic Styles in Image Generation

Author: Miao, Xin, Wang, Huayan, Fu, Jun, Liu, Jiayi, Wang, Shen, and Liao, Zhenyu
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
Abstract: Recent advances in generative models and adversarial training have enabled artificially generating artworks in various artistic styles. It is highly desirable to gain more control over the generated style in practice. However, artistic styles are unlike object categories -- there are a continuous spectrum of styles distinguished by subtle differences. Few works have been explored to capture the continuous spectrum of styles and apply it to a style generation task. In this paper, we propose to achieve this by embedding original artwork examples into a continuous style space. The style vectors are fed to the generator and discriminator to achieve fine-grained control. Our method can be used with common generative adversarial networks (such as StyleGAN). Experiments show that our method not only precisely controls the fine-grained artistic style but also improves image quality over vanilla StyleGAN as measured by FID.
Published: 2021
Full Text: View/download PDF

5. EVOQUER: Enhancing Temporal Grounding with Video-Pivoted BackQuery Generation

Author: Gao, Yanjun, Liu, Lulu, Wang, Jason, Chen, Xin, Wang, Huayan, and Zhang, Rui
Subjects: FOS: Computer and information sciences, Computer Science - Computation and Language, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Computation and Language (cs.CL)
Abstract: Temporal grounding aims to predict a time interval of a video clip corresponding to a natural language query input. In this work, we present EVOQUER, a temporal grounding framework incorporating an existing text-to-video grounding model and a video-assisted query generation network. Given a query and an untrimmed video, the temporal grounding model predicts the target interval, and the predicted video clip is fed into a video translation task by generating a simplified version of the input query. EVOQUER forms closed-loop learning by incorporating loss functions from both temporal grounding and query generation serving as feedback. Our experiments on two widely used datasets, Charades-STA and ActivityNet, show that EVOQUER achieves promising improvements by 1.05 and 1.31 at R@0.7. We also discuss how the query generation task could facilitate error analysis by explaining temporal grounding model behavior.
Comment: Accepted by Visually Grounded Interaction and Language (ViGIL) Workshop at NAACL 2021
Published: 2021
Full Text: View/download PDF

6. Off-policy Reinforcement Learning with Optimistic Exploration and Distribution Correction

Author: Li, Jiachen, Cheng, Shuo, Liao, Zhenyu, Wang, Huayan, Wang, William Yang, and Bai, Qinxun
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Robotics, Robotics (cs.RO), Machine Learning (cs.LG)
Abstract: Improving the sample efficiency of reinforcement learning algorithms requires effective exploration. Following the principle of optimism in the face of uncertainty (OFU), we train a separate exploration policy to maximize the approximate upper confidence bound of the critics in an off-policy actor-critic framework. However, this introduces extra differences between the replay buffer and the target policy regarding their stationary state-action distributions. To mitigate the off-policy-ness, we adapt the recently introduced DICE framework to learn a distribution correction ratio for off-policy RL training. In particular, we correct the training distribution for both policies and critics. Empirically, we evaluate our proposed method in several challenging continuous control tasks and show superior performance compared to state-of-the-art methods. We also conduct extensive ablation studies to demonstrate the effectiveness and rationality of the proposed method.
Comment: Deep RL Workshop, NeurIPS 2022
Published: 2021
Full Text: View/download PDF

7. Histoires officielles des Song, des Liao et des Jin

Author: Wang Huayan
Published: 2020

8. Encyclopédies historiques et institutionnelles (zhengshu)

Author: Wang Huayan
Published: 2020

9. Histoires officielles des Cinq dynasties

Author: Wang Huayan
Published: 2020

10. Miroir général pour aider à gouverner (Le) (Zizhi tongjian)

Author: Wang Huayan
Published: 2020

11. Histoire des Yuan (L’) (Yuanshi)

Author: Wang Huayan
Published: 2020

12. Transfer Learning by Structural Analogy

Author: Wang, Huayan and Yang, Qiang
Subjects: General Medicine
Abstract: Transfer learning allows knowledge to be extracted from auxiliary domains and be used to enhance learning in a target domain. For transfer learning to be successful, it is critical to find the similarity between auxiliary and target domains, even when such mappings are not obvious. In this paper, we present a novel algorithm for finding the structural similarity between two domains, to enable transfer learning at a structured knowledge level. In particular, we address the problem of how to learn a non-trivial structural similarity mapping between two different domains when they are completely different on the representation level. This problem is challenging because we cannot directly compare features across domains. Our algorithm extracts the structural features within each domain and then maps the features into the Reproducing Kernel Hilbert Space (RKHS), such that the "structural dependencies" of features across domains can be estimated by kernel matrices of the features within each domain. By treating the analogues from both domains as equivalent, we can transfer knowledge to achieve a better understanding of the domains and improved performance for learning. We validate our approach on synthetic and real-world datasets.
Published: 2011

13. The Na+/H+ exchanger potentiates growth and retinoic acid induced differentiation of embryonal carcinoma cells

Author: Wang, Huayan, Singh, Dyal, and Fliegel, Larry
Published: 1997
Full Text: View/download PDF

14. Regulation of Na+/H+ Exchanger Gene Expression: Role of a poly T rich region in regulation of expression of the NHE1 promoter

Author: Yang, Weidong, Wang, Huayan, and Fliegel, Larry
Published: 1996
Full Text: View/download PDF

14 results on '"Wang, Huayan"'
