Author: "Yang, Sicheng" / Database: arXiv - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Yang, Sicheng"' showing total 16 results

Start Over Author "Yang, Sicheng" Database arXiv

16 results on '"Yang, Sicheng"'

1. Cross-conditioned Diffusion Model for Medical Image to Image Translation

Author: Xing, Zhaohu, Yang, Sicheng, Chen, Sixiang, Ye, Tian, Yang, Yijun, Qin, Jing, and Zhu, Lei
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Multi-modal magnetic resonance imaging (MRI) provides rich, complementary information for analyzing diseases. However, the practical challenges of acquiring multiple MRI modalities, such as cost, scan time, and safety considerations, often result in incomplete datasets. This affects both the quality of diagnosis and the performance of deep learning models trained on such data. Recent advancements in generative adversarial networks (GANs) and denoising diffusion models have shown promise in natural and medical image-to-image translation tasks. However, the complexity of training GANs and the computational expense associated with diffusion models hinder their development and application in this task. To address these issues, we introduce a Cross-conditioned Diffusion Model (CDM) for medical image-to-image translation. The core idea of CDM is to use the distribution of target modalities as guidance to improve synthesis quality while achieving higher generation efficiency compared to conventional diffusion models. First, we propose a Modality-specific Representation Model (MRM) to model the distribution of target modalities. Then, we design a Modality-decoupled Diffusion Network (MDN) to efficiently and effectively learn the distribution from MRM. Finally, a Cross-conditioned UNet (C-UNet) with a Condition Embedding module is designed to synthesize the target modalities with the source modalities as input and the target distribution for guidance. Extensive experiments conducted on the BraTS2023 and UPenn-GBM benchmark datasets demonstrate the superiority of our method., Comment: miccai24
Published: 2024

2. Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model

Author: He, Xu, Huang, Qiaochu, Zhang, Zhensong, Lin, Zhiwei, Wu, Zhiyong, Yang, Sicheng, Li, Minglei, Chen, Zhiyi, Xu, Songcen, and Wu, Xiaofei
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Human-Computer Interaction, Computer Science - Multimedia
Abstract: Co-speech gestures, if presented in the lively form of videos, can achieve superior visual effects in human-machine interaction. While previous works mostly generate structural human skeletons, resulting in the omission of appearance information, we focus on the direct generation of audio-driven co-speech gesture videos in this work. There are two main challenges: 1) A suitable motion feature is needed to describe complex human movements with crucial appearance information. 2) Gestures and speech exhibit inherent dependencies and should be temporally aligned even of arbitrary length. To solve these problems, we present a novel motion-decoupled framework to generate co-speech gesture videos. Specifically, we first introduce a well-designed nonlinear TPS transformation to obtain latent motion features preserving essential appearance information. Then a transformer-based diffusion model is proposed to learn the temporal correlation between gestures and speech, and performs generation in the latent motion space, followed by an optimal motion selection module to produce long-term coherent and consistent gesture videos. For better visual perception, we further design a refinement network focusing on missing details of certain areas. Extensive experimental results show that our proposed framework significantly outperforms existing approaches in both motion and video-related evaluations. Our code, demos, and more resources are available at https://github.com/thuhcsi/S2G-MDDiffusion., Comment: 22 pages, 8 figures, CVPR 2024
Published: 2024

3. MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models

Author: Xu, Zunnan, Lin, Yukang, Han, Haonan, Yang, Sicheng, Li, Ronghui, Zhang, Yachao, and Li, Xiu
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Human-Computer Interaction
Abstract: Gesture synthesis is a vital realm of human-computer interaction, with wide-ranging applications across various fields like film, robotics, and virtual reality. Recent advancements have utilized the diffusion model and attention mechanisms to improve gesture synthesis. However, due to the high computational complexity of these techniques, generating long and diverse sequences with low latency remains a challenge. We explore the potential of state space models (SSMs) to address the challenge, implementing a two-stage modeling strategy with discrete motion priors to enhance the quality of gestures. Leveraging the foundational Mamba block, we introduce MambaTalk, enhancing gesture diversity and rhythm through multimodal integration. Extensive experiments demonstrate that our method matches or exceeds the performance of state-of-the-art models., Comment: NeurlPS 2024
Published: 2024

4. Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness

Author: Yang, Sicheng, Xu, Zunnan, Xue, Haiwei, Cheng, Yongkang, Huang, Shaoli, Gong, Mingming, and Wu, Zhiyong
Subjects: Computer Science - Multimedia, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Current talking avatars mostly generate co-speech gestures based on audio and text of the utterance, without considering the non-speaking motion of the speaker. Furthermore, previous works on co-speech gesture generation have designed network structures based on individual gesture datasets, which results in limited data volume, compromised generalizability, and restricted speaker movements. To tackle these issues, we introduce FreeTalker, which, to the best of our knowledge, is the first framework for the generation of both spontaneous (e.g., co-speech gesture) and non-spontaneous (e.g., moving around the podium) speaker motions. Specifically, we train a diffusion-based model for speaker motion generation that employs unified representations of both speech-driven gestures and text-driven motions, utilizing heterogeneous data sourced from various motion datasets. During inference, we utilize classifier-free guidance to highly control the style in the clips. Additionally, to create smooth transitions between clips, we utilize DoubleTake, a method that leverages a generative prior and ensures seamless motion blending. Extensive experiments show that our method generates natural and controllable speaker movements. Our code, model, and demo are are available at \url{https://youngseng.github.io/FreeTalker/}., Comment: 6 pages, 3 figures, ICASSP 2024
Published: 2024

5. Chain of Generation: Multi-Modal Gesture Synthesis via Cascaded Conditional Control

Author: Xu, Zunnan, Zhang, Yachao, Yang, Sicheng, Li, Ronghui, and Li, Xiu
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: This study aims to improve the generation of 3D gestures by utilizing multimodal information from human speech. Previous studies have focused on incorporating additional modalities to enhance the quality of generated gestures. However, these methods perform poorly when certain modalities are missing during inference. To address this problem, we suggest using speech-derived multimodal priors to improve gesture generation. We introduce a novel method that separates priors from speech and employs multimodal priors as constraints for generating gestures. Our approach utilizes a chain-like modeling method to generate facial blendshapes, body movements, and hand gestures sequentially. Specifically, we incorporate rhythm cues derived from facial deformation and stylization prior based on speech emotions, into the process of generating gestures. By incorporating multimodal priors, our method improves the quality of generated gestures and eliminate the need for expensive setup preparation during inference. Extensive experiments and user studies confirm that our proposed approach achieves state-of-the-art performance., Comment: AAAI-2024
Published: 2023

6. Conversational Co-Speech Gesture Generation via Modeling Dialog Intention, Emotion, and Context with Diffusion Models

Author: Xue, Haiwei, Yang, Sicheng, Zhang, Zhensong, Wu, Zhiyong, Li, Minglei, Dai, Zonghong, and Meng, Helen
Subjects: Computer Science - Human-Computer Interaction
Abstract: Audio-driven co-speech human gesture generation has made remarkable advancements recently. However, most previous works only focus on single person audio-driven gesture generation. We aim at solving the problem of conversational co-speech gesture generation that considers multiple participants in a conversation, which is a novel and challenging task due to the difficulty of simultaneously incorporating semantic information and other relevant features from both the primary speaker and the interlocutor. To this end, we propose CoDiffuseGesture, a diffusion model-based approach for speech-driven interaction gesture generation via modeling bilateral conversational intention, emotion, and semantic context. Our method synthesizes appropriate interactive, speech-matched, high-quality gestures for conversational motions through the intention perception module and emotion reasoning module at the sentence level by a pretrained language model. Experimental results demonstrate the promising performance of the proposed method., Comment: 5 pages,2 figures, Accepted for publication at the 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024)
Published: 2023

7. UnifiedGesture: A Unified Gesture Synthesis Model for Multiple Skeletons

Author: Yang, Sicheng, Wang, Zilin, Wu, Zhiyong, Li, Minglei, Zhang, Zhensong, Huang, Qiaochu, Hao, Lei, Xu, Songcen, Wu, Xiaofei, yang, changpeng, and Dai, Zonghong
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence, Computer Science - Multimedia
Abstract: The automatic co-speech gesture generation draws much attention in computer animation. Previous works designed network structures on individual datasets, which resulted in a lack of data volume and generalizability across different motion capture standards. In addition, it is a challenging task due to the weak correlation between speech and gestures. To address these problems, we present UnifiedGesture, a novel diffusion model-based speech-driven gesture synthesis approach, trained on multiple gesture datasets with different skeletons. Specifically, we first present a retargeting network to learn latent homeomorphic graphs for different motion capture standards, unifying the representations of various gestures while extending the dataset. We then capture the correlation between speech and gestures based on a diffusion model architecture using cross-local attention and self-attention to generate better speech-matched and realistic gestures. To further align speech and gesture and increase diversity, we incorporate reinforcement learning on the discrete gesture units with a learned reward function. Extensive experiments show that UnifiedGesture outperforms recent approaches on speech-driven gesture generation in terms of CCA, FGD, and human-likeness. All code, pre-trained models, databases, and demos are available to the public at https://github.com/YoungSeng/UnifiedGesture., Comment: 16 pages, 11 figures, ACM MM 2023
Published: 2023
Full Text: View/download PDF

8. The DiffuseStyleGesture+ entry to the GENEA Challenge 2023

Author: Yang, Sicheng, Xue, Haiwei, Zhang, Zhensong, Li, Minglei, Wu, Zhiyong, Wu, Xiaofei, Xu, Songcen, and Dai, Zonghong
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence, Computer Science - Multimedia
Abstract: In this paper, we introduce the DiffuseStyleGesture+, our solution for the Generation and Evaluation of Non-verbal Behavior for Embodied Agents (GENEA) Challenge 2023, which aims to foster the development of realistic, automated systems for generating conversational gestures. Participants are provided with a pre-processed dataset and their systems are evaluated through crowdsourced scoring. Our proposed model, DiffuseStyleGesture+, leverages a diffusion model to generate gestures automatically. It incorporates a variety of modalities, including audio, text, speaker ID, and seed gestures. These diverse modalities are mapped to a hidden space and processed by a modified diffusion model to produce the corresponding gesture for a given speech input. Upon evaluation, the DiffuseStyleGesture+ demonstrated performance on par with the top-tier models in the challenge, showing no significant differences with those models in human-likeness, appropriateness for the interlocutor, and achieving competitive performance with the best model on appropriateness for agent speech. This indicates that our model is competitive and effective in generating realistic and appropriate gestures for given speech. The code, pre-trained models, and demos are available at https://github.com/YoungSeng/DiffuseStyleGesture/tree/DiffuseStyleGesturePlus/BEAT-TWH-main., Comment: 7 pages, 8 figures, ICMI 2023
Published: 2023
Full Text: View/download PDF

9. QPGesture: Quantization-Based and Phase-Guided Motion Matching for Natural Speech-Driven Gesture Generation

Author: Yang, Sicheng, Wu, Zhiyong, Li, Minglei, Zhang, Zhensong, Hao, Lei, Bao, Weihong, and Zhuang, Haolin
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Speech-driven gesture generation is highly challenging due to the random jitters of human motion. In addition, there is an inherent asynchronous relationship between human speech and gestures. To tackle these challenges, we introduce a novel quantization-based and phase-guided motion-matching framework. Specifically, we first present a gesture VQ-VAE module to learn a codebook to summarize meaningful gesture units. With each code representing a unique gesture, random jittering problems are alleviated effectively. We then use Levenshtein distance to align diverse gestures with different speech. Levenshtein distance based on audio quantization as a similarity metric of corresponding speech of gestures helps match more appropriate gestures with speech, and solves the alignment problem of speech and gestures well. Moreover, we introduce phase to guide the optimal gesture matching based on the semantics of context or rhythm of audio. Phase guides when text-based or speech-based gestures should be performed to make the generated gestures more natural. Extensive experiments show that our method outperforms recent approaches on speech-driven gesture generation. Our code, database, pre-trained models, and demos are available at https://github.com/YoungSeng/QPGesture., Comment: 15 pages, 12 figures, CVPR 2023 Highlight
Published: 2023

10. DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models

Author: Yang, Sicheng, Wu, Zhiyong, Li, Minglei, Zhang, Zhensong, Hao, Lei, Bao, Weihong, Cheng, Ming, and Xiao, Long
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Multimedia
Abstract: The art of communication beyond speech there are gestures. The automatic co-speech gesture generation draws much attention in computer animation. It is a challenging task due to the diversity of gestures and the difficulty of matching the rhythm and semantics of the gesture to the corresponding speech. To address these problems, we present DiffuseStyleGesture, a diffusion model based speech-driven gesture generation approach. It generates high-quality, speech-matched, stylized, and diverse co-speech gestures based on given speeches of arbitrary length. Specifically, we introduce cross-local attention and self-attention to the gesture diffusion pipeline to generate better speech matched and realistic gestures. We then train our model with classifier-free guidance to control the gesture style by interpolation or extrapolation. Additionally, we improve the diversity of generated gestures with different initial gestures and noise. Extensive experiments show that our method outperforms recent approaches on speech-driven gesture generation. Our code, pre-trained models, and demos are available at https://github.com/YoungSeng/DiffuseStyleGesture., Comment: 11 pages, 9 figures, IJCAI 2023
Published: 2023

11. GTN-Bailando: Genre Consistent Long-Term 3D Dance Generation based on Pre-trained Genre Token Network

Author: Zhuang, Haolin, Lei, Shun, Xiao, Long, Li, Weiqin, Chen, Liyang, Yang, Sicheng, Wu, Zhiyong, Kang, Shiyin, and Meng, Helen
Subjects: Computer Science - Sound, Computer Science - Multimedia, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Music-driven 3D dance generation has become an intensive research topic in recent years with great potential for real-world applications. Most existing methods lack the consideration of genre, which results in genre inconsistency in the generated dance movements. In addition, the correlation between the dance genre and the music has not been investigated. To address these issues, we propose a genre-consistent dance generation framework, GTN-Bailando. First, we propose the Genre Token Network (GTN), which infers the genre from music to enhance the genre consistency of long-term dance generation. Second, to improve the generalization capability of the model, the strategy of pre-training and fine-tuning is adopted.Experimental results on the AIST++ dataset show that the proposed dance generation framework outperforms state-of-the-art methods in terms of motion quality and genre consistency., Comment: Accepted by ICASSP2023.Demo page: https://im1eon.github.io/ICASSP23-GTNB-DG/
Published: 2023

12. Dexterous In-Hand Manipulation of Slender Cylindrical Objects through Deep Reinforcement Learning with Tactile Sensing

Author: Hu, Wenbin, Huang, Bidan, Lee, Wang Wei, Yang, Sicheng, Zheng, Yu, and Li, Zhibin
Subjects: Computer Science - Robotics
Abstract: Continuous in-hand manipulation is an important physical interaction skill, where tactile sensing provides indispensable contact information to enable dexterous manipulation of small objects. This work proposed a framework for end-to-end policy learning with tactile feedback and sim-to-real transfer, which achieved fine in-hand manipulation that controls the pose of a thin cylindrical object, such as a long stick, to track various continuous trajectories through multiple contacts of three fingertips of a dexterous robot hand with tactile sensor arrays. We estimated the central contact position between the stick and each fingertip from the high-dimensional tactile information and showed that the learned policies achieved effective manipulation performance with the processed tactile feedback. The policies were trained with deep reinforcement learning in simulation and successfully transferred to real-world experiments, using coordinated model calibration and domain randomization. We evaluated the effectiveness of tactile information via comparative studies and validated the sim-to-real performance through real-world experiments., Comment: 10 pages, 12 figures, submitted to Transaction on Mechatronics
Published: 2023

13. The ReprGesture entry to the GENEA Challenge 2022

Author: Yang, Sicheng, Wu, Zhiyong, Li, Minglei, Zhao, Mengchen, Lin, Jiuxin, Chen, Liyang, and Bao, Weihong
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence, Computer Science - Multimedia, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: This paper describes the ReprGesture entry to the Generation and Evaluation of Non-verbal Behaviour for Embodied Agents (GENEA) challenge 2022. The GENEA challenge provides the processed datasets and performs crowdsourced evaluations to compare the performance of different gesture generation systems. In this paper, we explore an automatic gesture generation system based on multimodal representation learning. We use WavLM features for audio, FastText features for text and position and rotation matrix features for gesture. Each modality is projected to two distinct subspaces: modality-invariant and modality-specific. To learn inter-modality-invariant commonalities and capture the characters of modality-specific representations, gradient reversal layer based adversarial classifier and modality reconstruction decoders are used during training. The gesture decoder generates proper gestures using all representations and features related to the rhythm in the audio. Our code, pre-trained models and demo are available at https://github.com/YoungSeng/ReprGesture., Comment: 8 pages, 4 figures, ICMI 2022
Published: 2022
Full Text: View/download PDF

14. Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion

Author: Yang, SiCheng, Tantrawenith, Methawee, Zhuang, Haolin, Wu, Zhiyong, Sun, Aolan, Wang, Jianzong, Cheng, Ning, Tang, Huaizhen, Zhao, Xintao, Wang, Jie, and Meng, Helen
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Machine Learning, Computer Science - Sound
Abstract: One-shot voice conversion (VC) with only a single target speaker's speech for reference has become a hot research topic. Existing works generally disentangle timbre, while information about pitch, rhythm and content is still mixed together. To perform one-shot VC effectively with further disentangling these speech components, we employ random resampling for pitch and content encoder and use the variational contrastive log-ratio upper bound of mutual information and gradient reversal layer based adversarial mutual information learning to ensure the different parts of the latent space containing only the desired disentangled representation during training. Experiments on the VCTK dataset show the model achieves state-of-the-art performance for one-shot VC in terms of naturalness and intellgibility. In addition, we can transfer characteristics of one-shot VC on timbre, pitch and rhythm separately by speech representation disentanglement. Our code, pre-trained models and demo are available at https://im1eon.github.io/IS2022-SRDVC/., Comment: 5 pages,5 figures,INTERSPEECH 2022
Published: 2022

15. Multilevel orthogonal Bochner function subspaces with applications to robust machine learning

Author: Castrillon-Candas, Julio Enrique, Liu, Dingning, Yang, Sicheng, and Kon, Mark
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning, 62R10, 60G35, 62-08, 60G60, 65F25, 46B09
Abstract: In our approach, we consider the data as instances of a random field within a relevant Bochner space. Our key observation is that the classes can predominantly reside in two distinct subspaces. To uncover the separation between these classes, we employ the Karhunen-Loeve expansion and construct the appropriate subspaces. This allows us to effectively reveal the distinction between the classes. The novel features forming the above bases are constructed by applying a coordinate transformation based on the recent Functional Data Analysis theory for anomaly detection. The associated signal decomposition is an exact hierarchical tensor product expansion with known optimality properties for approximating stochastic processes (random fields) with finite dimensional function spaces. Using a hierarchical finite dimensional expansion of the nominal class, a series of orthogonal nested subspaces is constructed for detecting anomalous signal components. Projection coefficients of input data in these subspaces are then used to train a Machine Learning (ML classifier. However, due to the split of the signal into nominal and anomalous projection components, clearer separation surfaces for the classes arise. In fact we show that with a sufficiently accurate estimation of the covariance structure of the nominal class, a sharp classification can be obtained. This is particularly advantageous for large unbalanced datasets. We demonstrate it on a number of high-dimensional datasets. This approach yields significant increases in accuracy of ML methods compared to using the same ML algorithm with the original feature data. Our tests on the Alzheimer's Disease ADNI dataset shows a dramatic increase in accuracy (from 48% to 89% accuracy). Furthermore, tests using unbalanced semi-synthetic datasets created from the benchmark GCM dataset confirm increased accuracy as the dataset becomes more unbalanced.
Published: 2021

16. KVL-BERT: Knowledge Enhanced Visual-and-Linguistic BERT for Visual Commonsense Reasoning

Author: Song, Dandan, Ma, Siyi, Sun, Zhanchen, Yang, Sicheng, and Liao, Lejian
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Reasoning is a critical ability towards complete visual understanding. To develop machine with cognition-level visual understanding and reasoning abilities, the visual commonsense reasoning (VCR) task has been introduced. In VCR, given a challenging question about an image, a machine must answer correctly and then provide a rationale justifying its answer. The methods adopting the powerful BERT model as the backbone for learning joint representation of image content and natural language have shown promising improvements on VCR. However, none of the existing methods have utilized commonsense knowledge in visual commonsense reasoning, which we believe will be greatly helpful in this task. With the support of commonsense knowledge, complex questions even if the required information is not depicted in the image can be answered with cognitive reasoning. Therefore, we incorporate commonsense knowledge into the cross-modal BERT, and propose a novel Knowledge Enhanced Visual-and-Linguistic BERT (KVL-BERT for short) model. Besides taking visual and linguistic contents as input, external commonsense knowledge extracted from ConceptNet is integrated into the multi-layer Transformer. In order to reserve the structural information and semantic representation of the original sentence, we propose using relative position embedding and mask-self-attention to weaken the effect between the injected commonsense knowledge and other unrelated components in the input sequence. Compared to other task-specific models and general task-agnostic pre-training models, our KVL-BERT outperforms them by a large margin.
Published: 2020

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

16 results on '"Yang, Sicheng"'

1. Cross-conditioned Diffusion Model for Medical Image to Image Translation

2. Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model

3. MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models

4. Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness

5. Chain of Generation: Multi-Modal Gesture Synthesis via Cascaded Conditional Control

6. Conversational Co-Speech Gesture Generation via Modeling Dialog Intention, Emotion, and Context with Diffusion Models

7. UnifiedGesture: A Unified Gesture Synthesis Model for Multiple Skeletons

8. The DiffuseStyleGesture+ entry to the GENEA Challenge 2023

9. QPGesture: Quantization-Based and Phase-Guided Motion Matching for Natural Speech-Driven Gesture Generation

10. DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models

11. GTN-Bailando: Genre Consistent Long-Term 3D Dance Generation based on Pre-trained Genre Token Network

12. Dexterous In-Hand Manipulation of Slender Cylindrical Objects through Deep Reinforcement Learning with Tactile Sensing

13. The ReprGesture entry to the GENEA Challenge 2022

14. Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion

15. Multilevel orthogonal Bochner function subspaces with applications to robust machine learning

16. KVL-BERT: Knowledge Enhanced Visual-and-Linguistic BERT for Visual Commonsense Reasoning

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Publication Type

Database

16 results on '"Yang, Sicheng"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources