Author: "Yu, Weijiang" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Yu, Weijiang"' showing total 250 results

1. Learning Fine-Grained Grounded Citations for Attributed Large Language Models

Author: Huang, Lei, Feng, Xiaocheng, Ma, Weitao, Gu, Yuxuan, Zhong, Weihong, Feng, Xiachong, Yu, Weijiang, Peng, Weihua, Tang, Duyu, Tu, Dandan, and Qin, Bing
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Despite the impressive performance on information-seeking tasks, large language models (LLMs) still struggle with hallucinations. Attributed LLMs, which augment generated text with in-line citations, have shown potential in mitigating hallucinations and improving verifiability. However, current approaches suffer from suboptimal citation quality due to their reliance on in-context learning. Furthermore, the practice of citing only coarse document identifiers makes it challenging for users to perform fine-grained verification. In this work, we introduce FRONT, a training framework designed to teach LLMs to generate Fine-Grained Grounded Citations. By grounding model outputs in fine-grained supporting quotes, these quotes guide the generation of grounded and consistent responses, not only improving citation quality but also facilitating fine-grained verification. Experiments on the ALCE benchmark demonstrate the efficacy of FRONT in generating superior grounded responses and highly supportive citations. With LLaMA-2-7B, the framework significantly outperforms all the baselines, achieving an average of 14.21% improvement in citation quality across all datasets, even surpassing ChatGPT., Comment: Accepted by ACL 2024 Findings
Published: 2024

2. BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop Question Answering

Author: Chu, Zheng, Chen, Jingchang, Chen, Qianglong, Wang, Haotian, Zhu, Kun, Du, Xiyuan, Yu, Weijiang, Liu, Ming, and Qin, Bing
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Large language models (LLMs) have demonstrated strong reasoning capabilities. Nevertheless, they still suffer from factual errors when tackling knowledge-intensive tasks. Retrieval-augmented reasoning represents a promising approach. However, significant challenges still persist, including inaccurate and insufficient retrieval for complex questions, as well as difficulty in integrating multi-source knowledge. To address this, we propose Beam Aggregation Reasoning, BeamAggR, a reasoning framework for knowledge-intensive multi-hop QA. BeamAggR explores and prioritizes promising answers at each hop of question. Concretely, we parse the complex questions into trees, which include atom and composite questions, followed by bottom-up reasoning. For atomic questions, the LLM conducts reasoning on multi-source knowledge to get answer candidates. For composite questions, the LLM combines beam candidates, explores multiple reasoning paths through probabilistic aggregation, and prioritizes the most promising trajectory. Extensive experiments on four open-domain multi-hop reasoning datasets show that our method significantly outperforms SOTA methods by 8.5%. Furthermore, our analysis reveals that BeamAggR elicits better knowledge collaboration and answer aggregation., Comment: Accepted to ACL 2024
Published: 2024

3. Stable Diffusion Segmentation for Biomedical Images with Single-step Reverse Process

Author: Lin, Tianyu, Chen, Zhiguang, Yan, Zhonghao, Yu, Weijiang, and Zheng, Fudan
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Diffusion models have demonstrated their effectiveness across various generative tasks. However, when applied to medical image segmentation, these models encounter several challenges, including significant resource and time requirements. They also necessitate a multi-step reverse process and multiple samples to produce reliable predictions. To address these challenges, we introduce the first latent diffusion segmentation model, named SDSeg, built upon stable diffusion (SD). SDSeg incorporates a straightforward latent estimation strategy to facilitate a single-step reverse process and utilizes latent fusion concatenation to remove the necessity for multiple samples. Extensive experiments indicate that SDSeg surpasses existing state-of-the-art methods on five benchmark datasets featuring diverse imaging modalities. Remarkably, SDSeg is capable of generating stable predictions with a solitary reverse step and sample, epitomizing the model's stability as implied by its name. The code is available at https://github.com/lin-tianyu/Stable-Diffusion-Seg, Comment: Accepted at MICCAI 2024. Code and citation info see https://github.com/lin-tianyu/Stable-Diffusion-Seg
Published: 2024

4. An Information Bottleneck Perspective for Effective Noise Filtering on Retrieval-Augmented Generation

Author: Zhu, Kun, Feng, Xiaocheng, Du, Xiyuan, Gu, Yuxuan, Yu, Weijiang, Wang, Haotian, Chen, Qianglong, Chu, Zheng, Chen, Jingchang, and Qin, Bing
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Retrieval-augmented generation integrates the capabilities of large language models with relevant information retrieved from an extensive corpus, yet encounters challenges when confronted with real-world noisy data. One recent solution is to train a filter module to find relevant content but only achieve suboptimal noise compression. In this paper, we propose to introduce the information bottleneck theory into retrieval-augmented generation. Our approach involves the filtration of noise by simultaneously maximizing the mutual information between compression and ground output, while minimizing the mutual information between compression and retrieved passage. In addition, we derive the formula of information bottleneck to facilitate its application in novel comprehensive evaluations, the selection of supervised fine-tuning data, and the construction of reinforcement learning rewards. Experimental results demonstrate that our approach achieves significant improvements across various question answering datasets, not only in terms of the correctness of answer generation but also in the conciseness with 2.5% compression rate., Comment: Accepted to ACL 2024
Published: 2024

5. Exploring Low-Resource Medical Image Classification with Weakly Supervised Prompt Learning

Author: Zheng, Fudan, Cao, Jindong, Yu, Weijiang, Chen, Zhiguang, Xiao, Nong, and Lu, Yutong
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Most advances in medical image recognition supporting clinical auxiliary diagnosis meet challenges due to the low-resource situation in the medical field, where annotations are highly expensive and professional. This low-resource problem can be alleviated by leveraging the transferable representations of large-scale pre-trained vision-language models via relevant medical text prompts. However, existing pre-trained vision-language models require domain experts to carefully design the medical prompts, which greatly increases the burden on clinicians. To address this problem, we propose a weakly supervised prompt learning method MedPrompt to automatically generate medical prompts, which includes an unsupervised pre-trained vision-language model and a weakly supervised prompt learning model. The unsupervised pre-trained vision-language model utilizes the natural correlation between medical images and corresponding medical texts for pre-training, without any manual annotations. The weakly supervised prompt learning model only utilizes the classes of images in the dataset to guide the learning of the specific class vector in the prompt, while the learning of other context vectors in the prompt requires no manual annotations for guidance. To the best of our knowledge, this is the first model to automatically generate medical prompts. With these prompts, the pre-trained vision-language model can be freed from the strong expert dependency of manual annotation and manual prompt design. Experimental results show that the model using our automatically generated prompts outperforms its full-shot learning hand-crafted prompts counterparts with only a minimal number of labeled samples for few-shot learning, and reaches superior or comparable accuracy on zero-shot image classification. The proposed prompt generator is lightweight and therefore can be embedded into any network architecture., Comment: Accepted by Pattern Recognition
Published: 2024

6. Intensive Vision-guided Network for Radiology Report Generation

Author: Zheng, Fudan, Li, Mengfei, Wang, Ying, Yu, Weijiang, Wang, Ruixuan, Chen, Zhiguang, Xiao, Nong, and Lu, Yutong
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Automatic radiology report generation is booming due to its huge application potential for the healthcare industry. However, existing computer vision and natural language processing approaches to tackle this problem are limited in two aspects. First, when extracting image features, most of them neglect multi-view reasoning in vision and model single-view structure of medical images, such as space-view or channel-view. However, clinicians rely on multi-view imaging information for comprehensive judgment in daily clinical diagnosis. Second, when generating reports, they overlook context reasoning with multi-modal information and focus on pure textual optimization utilizing retrieval-based methods. We aim to address these two issues by proposing a model that better simulates clinicians' perspectives and generates more accurate reports. Given the above limitation in feature extraction, we propose a Globally-intensive Attention (GIA) module in the medical image encoder to simulate and integrate multi-view vision perception. GIA aims to learn three types of vision perception: depth view, space view, and pixel view. On the other hand, to address the above problem in report generation, we explore how to involve multi-modal signals to generate precisely matched reports, i.e., how to integrate previously predicted words with region-aware visual content in next word prediction. Specifically, we design a Visual Knowledge-guided Decoder (VKGD), which can adaptively consider how much the model needs to rely on visual information and previously predicted text to assist next word prediction. Hence, our final Intensive Vision-guided Network (IVGN) framework includes a GIA-guided Visual Encoder and the VKGD. Experiments on two commonly-used datasets IU X-Ray and MIMIC-CXR demonstrate the superior ability of our method compared with other state-of-the-art approaches., Comment: Accepted by Physics in Medicine & Biology
Published: 2024

7. AdaNAS: Adaptively Post-processing with Self-supervised Neural Architecture Search for Ensemble Rainfall Forecasts

Author: Wen, Yingpeng, Yu, Weijiang, Zheng, Fudan, Huang, Dan, and Xiao, Nong
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Physics - Atmospheric and Oceanic Physics
Abstract: Previous post-processing studies on rainfall forecasts using numerical weather prediction (NWP) mainly focus on statistics-based aspects, while learning-based aspects are rarely investigated. Although some manually-designed models are proposed to raise accuracy, they are customized networks, which need to be repeatedly tried and verified, at a huge cost in time and labor. Therefore, a self-supervised neural architecture search (NAS) method without significant manual efforts called AdaNAS is proposed in this study to perform rainfall forecast post-processing and predict rainfall with high accuracy. In addition, we design a rainfall-aware search space to significantly improve forecasts for high-rainfall areas. Furthermore, we propose a rainfall-level regularization function to eliminate the effect of noise data during the training. Validation experiments have been performed under the cases of None, Light, Moderate, Heavy and Violent on a large-scale precipitation benchmark named TIGGE. Finally, the average mean-absolute error (MAE) and average root-mean-square error (RMSE) of the proposed AdaNAS model are 0.98 and 2.04 mm/day, respectively. Additionally, the proposed AdaNAS model is compared with other neural architecture search methods and previous studies. Compared results reveal the satisfactory performance and superiority of the proposed AdaNAS model in terms of precipitation amount prediction and intensity classification. Concretely, the proposed AdaNAS model outperformed previous best-performing manual methods with MAE and RMSE improving by 80.5% and 80.3%, respectively.
Published: 2023

8. Learning to Break: Knowledge-Enhanced Reasoning in Multi-Agent Debate System

Author: Wang, Haotian, Du, Xiyuan, Yu, Weijiang, Chen, Qianglong, Zhu, Kun, Chu, Zheng, Yan, Lian, and Guan, Yi
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Multi-agent debate system (MAD) imitating the process of human discussion in pursuit of truth, aims to align the correct cognition of different agents for the optimal solution. It is challenging to make various agents perform right and highly consistent cognition due to their limited and different knowledge backgrounds (i.e., cognitive islands), which hinders the search for the optimal solution. To address the challenge, we propose a novel Multi-Agent Debate with Knowledge-Enhanced framework (MADKE) to promote the system to find the solution. First, we involve a shared retrieval knowledge pool in the debate process to solve the problem of limited and different knowledge backgrounds. Then, we propose an adaptive knowledge selection method to guarantee the accuracy and personalization of knowledge. This method allows agents to choose whether to use external knowledge in each conversation round according to their own needs. Our experimental results on six datasets show that our method achieves state-of-the-art results compared to existing single-agent and multi-agent methods. Further analysis reveals that the introduction of retrieval knowledge can help the agent to break cognitive islands in the debate process and effectively improve the consistency and correctness of the model. Moreover, MADKE using Qwen1.5-72B-Chat surpasses GPT-4 by +1.26% on average in six datasets, which validates that our method can help open-source LLMs achieve or even surpass the performance of GPT-4. Our code is available at https://github.com/FutureForMe/MADKE., Comment: 18 pages, 10 figures, work in progress
Published: 2023

9. TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models

Author: Chu, Zheng, Chen, Jingchang, Chen, Qianglong, Yu, Weijiang, Wang, Haotian, Liu, Ming, and Qin, Bing
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Grasping the concept of time is a fundamental facet of human cognition, indispensable for truly comprehending the intricacies of the world. Previous studies typically focus on specific aspects of time, lacking a comprehensive temporal reasoning benchmark. To address this, we propose TimeBench, a comprehensive hierarchical temporal reasoning benchmark that covers a broad spectrum of temporal reasoning phenomena. TimeBench provides a thorough evaluation for investigating the temporal reasoning capabilities of large language models. We conduct extensive experiments on GPT-4, LLaMA2, and other popular LLMs under various settings. Our experimental results indicate a significant performance gap between the state-of-the-art LLMs and humans, highlighting that there is still a considerable distance to cover in temporal reasoning. Besides, LLMs exhibit capability discrepancies across different reasoning categories. Furthermore, we thoroughly analyze the impact of multiple aspects on temporal reasoning and emphasize the associated challenges. We aspire for TimeBench to serve as a comprehensive benchmark, fostering research in temporal reasoning. Resources are available at: https://github.com/zchuz/TimeBench, Comment: Accepted to ACL 2024
Published: 2023

10. Trends in Integration of Knowledge and Large Language Models: A Survey and Taxonomy of Methods, Benchmarks, and Applications

Author: Feng, Zhangyin, Ma, Weitao, Yu, Weijiang, Huang, Lei, Wang, Haotian, Chen, Qianglong, Peng, Weihua, Feng, Xiaocheng, Qin, Bing, and Liu, Ting
Subjects: Computer Science - Computation and Language
Abstract: Large language models (LLMs) exhibit superior performance on various natural language tasks, but they are susceptible to issues stemming from outdated data and domain-specific limitations. In order to address these challenges, researchers have pursued two primary strategies, knowledge editing and retrieval augmentation, to enhance LLMs by incorporating external information from different aspects. Nevertheless, there is still a notable absence of a comprehensive survey. In this paper, we propose a review to discuss the trends in integration of knowledge and large language models, including taxonomy of methods, benchmarks, and applications. In addition, we conduct an in-depth analysis of different methods and point out potential research directions in the future. We hope this survey offers the community quick access and a comprehensive overview of this research area, with the intention of inspiring future research endeavors., Comment: Work in progress; 22 pages. This work has been submitted to the IEEE for possible publication
Published: 2023

11. A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions

Author: Huang, Lei, Yu, Weijiang, Ma, Weitao, Zhong, Weihong, Feng, Zhangyin, Wang, Haotian, Chen, Qianglong, Peng, Weihua, Feng, Xiaocheng, Qin, Bing, and Liu, Ting
Subjects: Computer Science - Computation and Language
Abstract: The emergence of large language models (LLMs) has marked a significant breakthrough in natural language processing (NLP), leading to remarkable advancements in text understanding and generation. Nevertheless, alongside these strides, LLMs exhibit a critical tendency to produce hallucinations, resulting in content that is inconsistent with real-world facts or user inputs. This phenomenon poses substantial challenges to their practical deployment and raises concerns over the reliability of LLMs in real-world scenarios, which attracts increasing attention to detect and mitigate these hallucinations. In this survey, we aim to provide a thorough and in-depth overview of recent advances in the field of LLM hallucinations. We begin with an innovative taxonomy of LLM hallucinations, then delve into the factors contributing to hallucinations. Subsequently, we present a comprehensive overview of hallucination detection methods and benchmarks. Additionally, representative approaches designed to mitigate hallucinations are introduced accordingly. Finally, we analyze the challenges that highlight the current limitations and formulate open questions, aiming to delineate pathways for future research on hallucinations in LLMs., Comment: Work in progress; 49 pages
Published: 2023

12. Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future

Author: Chu, Zheng, Chen, Jingchang, Chen, Qianglong, Yu, Weijiang, He, Tao, Wang, Haotian, Peng, Weihua, Liu, Ming, Qin, Bing, and Liu, Ting
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Reasoning, a fundamental cognitive process integral to human intelligence, has garnered substantial interest within artificial intelligence. Notably, recent studies have revealed that chain-of-thought prompting significantly enhances LLM's reasoning capabilities, which attracts widespread attention from both academics and industry. In this paper, we systematically investigate relevant research, summarizing advanced methods through a meticulous taxonomy that offers novel perspectives. Moreover, we delve into the current frontiers and delineate the challenges and future directions, thereby shedding light on future research. Furthermore, we engage in a discussion about open questions. We hope this paper serves as an introduction for beginners and fosters future research. Resources have been made publicly available at https://github.com/zchuz/CoT-Reasoning-Survey, Comment: Accepted to ACL 2024
Published: 2023

13. Imputing spatial transcriptomics through gene network constructed from protein language model

Author: Zeng, Yuansong, Song, Yujie, Zhang, Chengyang, Li, Haoxuan, Zhao, Yongkang, Yu, Weijiang, Zhang, Shiqi, Zhang, Hongyu, Dai, Zhiming, and Yang, Yuedong
Published: 2024

14. Deciphering cell types by integrating scATAC-seq data with genome sequences

Author: Zeng, Yuansong, Luo, Mai, Shangguan, Ningyuan, Shi, Peiyu, Feng, Junxi, Xu, Jin, Chen, Ken, Lu, Yutong, Yu, Weijiang, and Yang, Yuedong
Published: 2024

15. Domain-Adaptive Text Classification with Structured Knowledge from Unlabeled Data

Author: Li, Tian, Chen, Xiang, Dong, Zhen, Yu, Weijiang, Yan, Yijun, Keutzer, Kurt, and Zhang, Shanghang
Subjects: Computer Science - Computation and Language
Abstract: Domain adaptive text classification is a challenging problem for the large-scale pretrained language models because they often require expensive additional labeled data to adapt to new domains. Existing works usually fail to leverage the implicit relationships among words across domains. In this paper, we propose a novel method, called Domain Adaptation with Structured Knowledge (DASK), to enhance domain adaptation by exploiting word-level semantic relationships. DASK first builds a knowledge graph to capture the relationship between pivot terms (domain-independent words) and non-pivot terms in the target domain. Then during training, DASK injects pivot-related knowledge graph information into source domain texts. For the downstream task, these knowledge-injected texts are fed into a BERT variant capable of processing knowledge-injected textual data. Thanks to the knowledge injection, our model learns domain-invariant features for non-pivots according to their relationships with pivots. DASK ensures the pivots to have domain-invariant behaviors by dynamically inferring via the polarity scores of candidate pivots during training with pseudo-labels. We validate DASK on a wide range of cross-domain sentiment classification tasks and observe up to 2.9% absolute performance improvement over baselines for 20 different domain pairs. Code will be made available at https://github.com/hikaru-nara/DASK.
Published: 2022

16. Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory

Author: Siyao, Li, Yu, Weijiang, Gu, Tianpei, Lin, Chunze, Wang, Quan, Qian, Chen, Loy, Chen Change, and Liu, Ziwei
Subjects: Computer Science - Sound, Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Driving 3D characters to dance following a piece of music is highly challenging due to the spatial constraints applied to poses by choreography norms. In addition, the generated dance sequence also needs to maintain temporal coherency with different music genres. To tackle these challenges, we propose a novel music-to-dance framework, Bailando, with two powerful components: 1) a choreographic memory that learns to summarize meaningful dancing units from 3D pose sequence to a quantized codebook, 2) an actor-critic Generative Pre-trained Transformer (GPT) that composes these units to a fluent dance coherent to the music. With the learned choreographic memory, dance generation is realized on the quantized units that meet high choreography standards, such that the generated dancing sequences are confined within the spatial constraints. To achieve synchronized alignment between diverse motion tempos and music beats, we introduce an actor-critic-based reinforcement learning scheme to the GPT with a newly-designed beat-align reward function. Extensive experiments on the standard benchmark demonstrate that our proposed framework achieves state-of-the-art performance both qualitatively and quantitatively. Notably, the learned choreographic memory is shown to discover human-interpretable dancing-style poses in an unsupervised manner., Comment: Accepted by CVPR 2022. Code and video link: https://github.com/lisiyao21/Bailando/
Published: 2022

17. Hybrid Reasoning Network for Video-based Commonsense Captioning

Author: Yu, Weijiang, Liang, Jian, Ji, Lei, Li, Lu, Fang, Yuejian, Xiao, Nong, and Duan, Nan
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language, 68T07
Abstract: The task of video-based commonsense captioning aims to generate event-wise captions and meanwhile provide multiple commonsense descriptions (e.g., attribute, effect and intention) about the underlying event in the video. Prior works explore the commonsense captions by using separate networks for different commonsense types, which is time-consuming and lacks mining the interaction of different commonsense. In this paper, we propose a Hybrid Reasoning Network (HybridNet) to endow the neural networks with the capability of semantic-level reasoning and word-level reasoning. Firstly, we develop multi-commonsense learning for semantic-level reasoning by jointly training different commonsense types in a unified network, which encourages the interaction between the clues of multiple commonsense descriptions, event-wise captions and videos. Then, there are two steps to achieve the word-level reasoning: (1) a memory module records the history predicted sequence from the previous generation processes; (2) a memory-routed multi-head attention (MMHA) module updates the word-level attention maps by incorporating the history information from the memory module into the transformer decoder for word-level reasoning. Moreover, the multimodal features are used to make full use of diverse knowledge for commonsense reasoning. Experiments and abundant analysis on the large-scale Video-to-Commonsense benchmark show that our HybridNet achieves state-of-the-art performance compared with other methods., Comment: 11 pages, 6 figures
Published: 2021

18. Deep Animation Video Interpolation in the Wild

Author: Siyao, Li, Zhao, Shiyu, Yu, Weijiang, Sun, Wenxiu, Metaxas, Dimitris N., Loy, Chen Change, and Liu, Ziwei
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In the animation industry, cartoon videos are usually produced at low frame rate since hand drawing of such frames is costly and time-consuming. Therefore, it is desirable to develop computational models that can automatically interpolate the in-between animation frames. However, existing video interpolation methods fail to produce satisfying results on animation data. Compared to natural videos, animation videos possess two unique characteristics that make frame interpolation difficult: 1) cartoons comprise lines and smooth color pieces. The smooth areas lack textures and make it difficult to estimate accurate motions on animation videos. 2) cartoons express stories via exaggeration. Some of the motions are non-linear and extremely large. In this work, we formally define and study the animation video interpolation problem for the first time. To address the aforementioned challenges, we propose an effective framework, AnimeInterp, with two dedicated modules in a coarse-to-fine manner. Specifically, 1) Segment-Guided Matching resolves the "lack of textures" challenge by exploiting global matching among color pieces that are piece-wise coherent. 2) Recurrent Flow Refinement resolves the "non-linear and extremely large motion" challenge by recurrent predictions using a transformer-like architecture. To facilitate comprehensive training and evaluations, we build a large-scale animation triplet dataset, ATD-12K, which comprises 12,000 triplets with rich annotations. Extensive experiments demonstrate that our approach outperforms existing state-of-the-art interpolation methods for animation videos. Notably, AnimeInterp shows favorable perceptual quality and robustness for animation scenarios in the wild. The proposed dataset and code are available at https://github.com/lisiyao21/AnimeInterp/., Comment: Accepted by CVPR21
Published: 2021

19. Heterogeneous Graph Learning for Visual Commonsense Reasoning

Author: Yu, Weijiang, Zhou, Jingwen, Yu, Weihao, Liang, Xiaodan, and Xiao, Nong
Subjects: Computer Science - Computer Vision and Pattern Recognition, 68T01
Abstract: Visual commonsense reasoning task aims at leading the research field into solving cognition-level reasoning with the ability of predicting correct answers and meanwhile providing convincing reasoning paths, resulting in three sub-tasks i.e., Q->A, QA->R and Q->AR. It poses great challenges over the proper semantic alignment between vision and linguistic domains and knowledge reasoning to generate persuasive reasoning paths. Existing works either resort to a powerful end-to-end network that cannot produce interpretable reasoning paths or solely explore intra-relationship of visual objects (homogeneous graph) while ignoring the cross-domain semantic alignment among visual concepts and linguistic words. In this paper, we propose a new Heterogeneous Graph Learning (HGL) framework for seamlessly integrating the intra-graph and inter-graph reasoning in order to bridge vision and language domain. Our HGL consists of a primal vision-to-answer heterogeneous graph (VAHG) module and a dual question-to-answer heterogeneous graph (QAHG) module to interactively refine reasoning paths for semantic agreement. Moreover, our HGL integrates a contextual voting module to exploit a long-range visual context for better global reasoning. Experiments on the large-scale Visual Commonsense Reasoning benchmark demonstrate the superior performance of our proposed modules on three tasks (improving 5% accuracy on Q->A, 3.5% on QA->R, 5.8% on Q->AR), Comment: 11 pages, 5 figures
Published: 2019

20. Layout-Graph Reasoning for Fashion Landmark Detection

Author: Yu, Weijiang, Liang, Xiaodan, Gong, Ke, Jiang, Chenhan, Xiao, Nong, and Lin, Liang
Subjects: Computer Science - Computer Vision and Pattern Recognition, I.4.9
Abstract: Detecting dense landmarks for diverse clothes, as a fundamental technique for clothes analysis, has attracted increasing research attention due to its huge application potential. However, due to the lack of modeling underlying semantic layout constraints among landmarks, prior works often detect ambiguous and structure-inconsistent landmarks of multiple overlapped clothes in one person. In this paper, we propose to seamlessly enforce structural layout relationships among landmarks on the intermediate representations via multiple stacked layout-graph reasoning layers. We define the layout-graph as a hierarchical structure including a root node, body-part nodes (e.g. upper body, lower body), coarse clothes-part nodes (e.g. collar, sleeve) and leaf landmark nodes (e.g. left-collar, right-collar). Each Layout-Graph Reasoning(LGR) layer aims to map feature representations into structural graph nodes via a Map-to-Node module, performs reasoning over structural graph nodes to achieve global layout coherency via a layout-graph reasoning module, and then maps graph nodes back to enhance feature representations via a Node-to-Map module. The layout-graph reasoning module integrates a graph clustering operation to generate representations of intermediate nodes (bottom-up inference) and then a graph deconvolution operation (top-down inference) over the whole graph. Extensive experiments on two public fashion landmark datasets demonstrate the superiority of our model. Furthermore, to advance the fine-grained fashion landmark research for supporting more comprehensive clothes generation and attribute recognition, we contribute the first Fine-grained Fashion Landmark Dataset (FFLD) containing 200k images annotated with at most 32 key-points for 13 clothes types., Comment: 9 pages, 5 figures, CVPR2019
Published: 2019

21. Gradual Network for Single Image De-raining

Author: Huang, Zhe, Yu, Weijiang, Zhang, Wayne, Feng, Litong, and Xiao, Nong
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Multimedia
Abstract: Most advances in single image de-raining meet a key challenge, which is removing rain streaks with different scales and shapes while preserving image details. Existing single image de-raining approaches treat rain-streak removal as a process of pixel-wise regression directly. However, they are lacking in mining the balance between over-de-raining (e.g. removing texture details in rain-free regions) and under-de-raining (e.g. leaving rain streaks). In this paper, we firstly propose a coarse-to-fine network called Gradual Network (GraNet) consisting of coarse stage and fine stage for delving into single image de-raining with different granularities. Specifically, to reveal coarse-grained rain-streak characteristics (e.g. long and thick rain streaks/raindrops), we propose a coarse stage by utilizing local-global spatial dependencies via a local-global subnetwork composed of region-aware blocks. Taking the residual result (the coarse de-rained result) between the rainy image sample (i.e. the input data) and the output of coarse stage (i.e. the learnt rain mask) as input, the fine stage continues to de-rain by removing the fine-grained rain streaks (e.g. light rain streaks and water mist) to get a rain-free and well-reconstructed output image via a unified contextual merging sub-network with dense blocks and a merging block. Solid and comprehensive experiments on synthetic and real data demonstrate that our GraNet can significantly outperform the state-of-the-art methods by removing rain streaks with various densities, scales and shapes while keeping the image details of rain-free regions well-preserved.
Comment: In Proceedings of the 27th ACM International Conference on Multimedia (MM 2019)
Published: 2019
Full Text: View/download PDF

22. CosNAS: Enhancing estimation on cosmological parameters via neural architecture search

Author: Wen, Yingpeng, Yu, Weijiang, Li, Dongsheng, Du, Jiangsu, Huang, Dan, and Xiao, Nong
Published: 2023
Full Text: View/download PDF

23. CellFM: a large-scale foundation model pre-trained on transcriptomics of 100 million human cells

Author: Zeng, Yuansong, primary, Xie, Jiancong, additional, Wei, Zhuoyi, additional, Su, Yun, additional, Shangguan, Ningyuan, additional, Yang, Shuangyu, additional, Zhang, Chengyang, additional, Li, Wenbing, additional, Zhang, Jinbo, additional, Fang, Nan, additional, Zhang, Hongyu, additional, Zhao, Huiying, additional, Lu, Yutong, additional, Fan, Jue, additional, Yu, Weijiang, additional, and Yang, Yuedong, additional
Published: 2024
Full Text: View/download PDF

24. Antibiotics-associated pseudomembranous colitis: a disproportionality analysis of the US food and drug administration adverse event reporting system (FAERS) database

Author: Chen, Jinhua, primary, Yu, Weijiang, additional, Zhang, Wancun, additional, Sun, Cuicui, additional, and Zhang, Wenzhou, additional
Published: 2024
Full Text: View/download PDF

25. Deciphering Cell Types by Integrating scATAC-seq Data with Genome Sequences

Author: Yang, Yuedong, primary, Zeng, Yuansong, additional, Luo, Mai, additional, Shangguan, Ningyuan, additional, Shi, Peiyu, additional, Feng, Junxi, additional, Xu, Jin, additional, Chen, Ken, additional, Lu, Yutong, additional, and Yu, Weijiang, additional
Published: 2024
Full Text: View/download PDF

26. Chlorin e6 (Ce6)-loaded supramolecular polypeptide micelles with enhanced photodynamic therapy effect against Pseudomonas aeruginosa

Author: Gao, Qiang, Huang, Danni, Deng, Yongyan, Yu, Weijiang, Jin, Qiao, Ji, Jian, and Fu, Guosheng
Published: 2021
Full Text: View/download PDF

27. Removable Photocatalysis Microneedle Reactor for Carbon Monoxide Delivery to Enhance Chemosensitization

Author: Yu, Weijiang, Fu, Junzhe, Jia, Fan, Jin, Qiao, Wang, Youxiang, and Ji, Jian
Published: 2024
Full Text: View/download PDF

28. Exploring low-resource medical image classification with weakly supervised prompt learning

Author: Zheng, Fudan, primary, Cao, Jindong, additional, Yu, Weijiang, additional, Chen, Zhiguang, additional, Xiao, Nong, additional, and Lu, Yutong, additional
Published: 2024
Full Text: View/download PDF

29. AdaNAS: Adaptively Post-processing with Self-supervised Neural Architecture Search for Ensemble Rainfall Forecasts

Author: Wen, Yingpeng, primary, Yu, Weijiang, additional, Zheng, Fudan, additional, Huang, Dan, additional, and Xiao, Nong, additional
Published: 2024
Full Text: View/download PDF

30. Investigating the impact of seawater intrusion on the operation cost of groundwater supply in island aquifers

Author: Yu, Weijiang, primary, Baù, Domenico, additional, Mayer, Alex S., additional, Mancewicz, Lauren, additional, and Geranmehr, Mohammadali, additional
Published: 2023
Full Text: View/download PDF

31. Fabrication of composite microneedles integrated with insulin-loaded CaCO3 microparticles and PVP for transdermal delivery in diabetic rats

Author: Liu, Depeng, Yu, Bo, Jiang, Guohua, Yu, Weijiang, Zhang, Yang, and Xu, Bin
Published: 2018
Full Text: View/download PDF

32. Microneedles fabricated from alginate and maltose for transdermal delivery of insulin on diabetic rats

Author: Zhang, Yang, Jiang, Guohua, Yu, Weijiang, Liu, Depeng, and Xu, Bin
Published: 2018
Full Text: View/download PDF

33. Enhanced Transcutaneous Chemodynamic Therapy for Melanoma Treatment through Cascaded Fenton-like Reactions and Nitric Oxide Delivery

Author: Yu, Weijiang, primary, Jia, Fan, additional, Fu, Junzhe, additional, Chen, Yonghang, additional, Huang, Yan, additional, Jin, Qiao, additional, Wang, Youxiang, additional, and Ji, Jian, additional
Published: 2023
Full Text: View/download PDF

34. Electrospun carbon nanofiberic coated with rambutan-like NiCo2O4 microspheres as electrode materials

Author: Chen, Hua, Jiang, Guohua, Yu, Weijiang, Liu, Depeng, Liu, Yongkun, Li, Lei, and Huang, Qin
Published: 2024
Full Text: View/download PDF

35. Polymer microneedles fabricated from alginate and hyaluronate for transdermal delivery of insulin

Author: Yu, Weijiang, Jiang, Guohua, Zhang, Yang, Liu, Depeng, Xu, Bin, and Zhou, Junyi
Published: 2017
Full Text: View/download PDF

36. Preparation of poly(lactic-co-glycolic acid) and chitosan composite nanocarriers via electrostatic self assembly for oral delivery of insulin

Author: Xu, Bin, Jiang, Guohua, Yu, Weijiang, Liu, Depeng, Liu, Yongkun, Kong, Xiangdong, and Yao, Juming
Published: 2017
Full Text: View/download PDF

37. Fabrication of biodegradable composite microneedles based on calcium sulfate and gelatin for transdermal delivery of insulin

Author: Yu, Weijiang, Jiang, Guohua, Liu, Depeng, Li, Lei, Chen, Hua, Liu, Yongkun, Huang, Qin, Tong, Zaizai, Yao, Juming, and Kong, Xiangdong
Published: 2017
Full Text: View/download PDF

38. Preparation of chitosan-based multifunctional nanocarriers overcoming multiple barriers for oral delivery of insulin

Author: Li, Lei, Jiang, Guohua, Yu, Weijiang, Liu, Depeng, Chen, Hua, Liu, Yongkun, Tong, Zaizai, Kong, Xiangdong, and Yao, Juming
Published: 2017
Full Text: View/download PDF

39. An Efficient Surrogate-based Multi-objective Optimisation Framework with Novel Sampling Strategy for Sustainable Island Groundwater Management

Author: Yu, Weijiang, Baù, Domenico, Mayer, Alex S., and Geranmehr, Mohammadali
Subjects: Monte Carlo method, Saltwater encroachment, Groundwater management, Gaussian processes, Pareto optimum
Abstract: In groundwater pumping optimization (GPO), offline-trained data-driven surrogates can be used to replace numerical-intensive simulators in order to save computing time. The traditional offline training approach involves building surrogates prior to optimization, fitting training datasets that cover the input space uniformly or randomly, which can prove inefficient due to the potential oversampling of low-gradient areas and under-sampling of high-gradient areas. This study proposes an offline machine-learning (ML) algorithm that ranks candidate training points by scoring them based on their distance to the closest training point and on the local gradient of the surrogate estimate and then choosing the highest-rank point. This method is applied to develop surrogates for solving a two-objective GPO problem formulated on a three-dimensional (3D) island aquifer, using hydrogeological conditions representative of San Salvador Island, Bahamas. The objectives are to minimise the supply cost (fOC) resulting from groundwater pumping and desalination and maximise fresh groundwater supply (Qp), subject to constraints on seawater intrusion (SWI) control expressed in terms of aquifer drawdown Δs at pumping locations and aquifer salt mass increase ΔSM. Gaussian Process (GP) is the technique applied to construct surrogates of objectives and constraints, alongside the estimation of uncertainties. Using GP models, it is possible to estimate the probability of "Pareto optimality" for each pumping scheme by Monte Carlo simulation. Pareto optimal pumping schemes (POPS) are then characterized by a probability of occurrence, which can be verified by numerical simulation. The GP training strategy's effectiveness in generating POPS is compared to traditional training approaches, showing that such a strategy can efficiently identify reliable POPS.
Published: 2024
Full Text: View/download PDF

40. A composite hydrogel system containing glucose-responsive nanocarriers for oral delivery of insulin

Author: Li, Lei, Jiang, Guohua, Yu, Weijiang, Liu, Depeng, Chen, Hua, Liu, Yongkun, Huang, Qin, Tong, Zaizai, Yao, Juming, and Kong, Xiangdong
Published: 2016
Full Text: View/download PDF

41. Identifying B-cell epitopes using AlphaFold2 predicted structures and pretrained language model

Author: Zeng, Yuansong, primary, Wei, Zhuoyi, additional, Yuan, Qianmu, additional, Chen, Sheng, additional, Yu, Weijiang, additional, Lu, Yutong, additional, Gao, Jianzhao, additional, and Yang, Yuedong, additional
Published: 2023
Full Text: View/download PDF

42. Apollo's Oracle: Retrieval-Augmented Reasoning in Multi-Agent Debates

Author: Wang, Haotian, Du, Xiyuan, Yu, Weijiang, Chen, Qianglong, Zhu, Kun, Chu, Zheng, Yan, Lian, and Guan, Yi
Abstract: Multi-agent debate systems are designed to derive accurate and consistent conclusions through adversarial interactions among agents. However, these systems often encounter challenges due to cognitive constraints, manifesting as (1) agents' obstinate adherence to incorrect viewpoints and (2) their propensity to abandon correct viewpoints. These issues are primarily responsible for the ineffectiveness of such debates. Addressing the challenge of cognitive constraints, we introduce a novel framework, the Multi-Agent Debate with Retrieval Augmented (MADRA). MADRA incorporates retrieval of prior knowledge into the debate process, effectively breaking cognitive constraints and enhancing the agents' reasoning capabilities. Furthermore, we have developed a self-selection module within this framework, enabling agents to autonomously select pertinent evidence, thereby minimizing the impact of irrelevant or noisy data. We have comprehensively tested and analyzed MADRA across six diverse datasets. The experimental results demonstrate that our approach significantly enhances performance across various tasks, proving the effectiveness of our proposed method.
Comment: 16 pages, 7 figures
Published: 2023

43. AdaNAS: Adaptively Postprocessing With Self-Supervised Neural Architecture Search for Ensemble Rainfall Forecasts

Author: Wen, Yingpeng, Yu, Weijiang, Zheng, Fudan, Huang, Dan, and Xiao, Nong
Abstract: Previous postprocessing studies on rainfall forecasts using numerical weather prediction (NWP) mainly focus on statistics-based aspects, while learning-based aspects are rarely investigated. Although some manually designed models are proposed to raise accuracy, they are customized networks, which need to be repeatedly tried and verified, at a huge cost in time and labor. Therefore, a self-supervised neural architecture search (NAS) method without significant manual efforts called AdaNAS is proposed in this study to perform rainfall forecast postprocessing and predict rainfall with high accuracy. In addition, we design a rainfall-aware search space to significantly improve forecasts for high-rainfall areas. Furthermore, we propose a rainfall-level regularization function to eliminate the effect of noise data during the training. Validation experiments have been performed under the cases of None, Light, Moderate, Heavy, and Violent on a large-scale precipitation benchmark named TIGGE. Finally, the average mean-absolute error (MAE) and average root-mean-square error (RMSE) of the proposed AdaNAS model are 0.98 and 2.04 mm/day, respectively. Additionally, the proposed AdaNAS model is compared with other NAS methods and previous studies. Compared results reveal the satisfactory performance and superiority of the proposed AdaNAS model in terms of precipitation amount prediction and intensity classification. Concretely, the proposed AdaNAS model outperformed previous best-performing manual methods with MAE and RMSE improving by 80.5% and 80.3%, respectively.
Published: 2024
Full Text: View/download PDF

44. Distribution of aquaporins and sodium transporters in the gastrointestinal tract of a desert hare, Lepus yarkandensis

Author: Zhang, Jianping, Li, Shuwei, Deng, Fang, Baikeli, Buheliqihan, Yu, Weijiang, and Liu, Guoquan
Published: 2019
Full Text: View/download PDF

45. Bailando++: 3D Dance GPT With Choreographic Memory

Author: Siyao, Li, Yu, Weijiang, Gu, Tianpei, Lin, Chunze, Wang, Quan, Qian, Chen, Loy, Chen Change, and Liu, Ziwei
Abstract: Our proposed music-to-dance framework, Bailando++, addresses the challenges of driving 3D characters to dance in a way that follows the constraints of choreography norms and maintains temporal coherency with different music genres. Bailando++ consists of two components: a choreographic memory that learns to summarize meaningful dancing units from 3D pose sequences, and an actor-critic Generative Pre-trained Transformer (GPT) that composes these units into a fluent dance coherent to the music. In particular, to synchronize the diverse motion tempos and music beats, we introduce an actor-critic-based reinforcement learning scheme to the GPT with a novel beat-align reward function. Additionally, we consider learning human dance poses in the rotation domain to avoid body distortions incompatible with human morphology, and introduce a musical contextual encoding to allow the motion GPT to grasp longer-term patterns of music. Our experiments on the standard benchmark show that Bailando++ achieves state-of-the-art performance both qualitatively and quantitatively, with the added benefit of the unsupervised discovery of human-interpretable dancing-style poses in the choreographic memory.
Published: 2023
Full Text: View/download PDF

46. Sustainable Management of Coastal Aquifers subject to Seawater Intrusion using Reduced-Order Groundwater Flow Models

Author: Geranmehr, Mohammadali, primary, Baù, Domenico, additional, Mayer, Alex S., additional, Mancewicz, Lauren, additional, and Yu, Weijiang, additional
Published: 2023
Full Text: View/download PDF

47. Identifying spatial domain by adapting transcriptomics with histology through contrastive learning

Author: Zeng, Yuansong, primary, Yin, Rui, additional, Luo, Mai, additional, Chen, Jianing, additional, Pan, Zixiang, additional, Lu, Yutong, additional, Yu, Weijiang, additional, and Yang, Yuedong, additional
Published: 2023
Full Text: View/download PDF

48. Exploring Low-Resource Medical Image Classification with Weakly Supervised Prompt Learning

Author: Zheng, Fudan, primary, Cao, Jindong, additional, Yu, Weijiang, additional, Chen, Zhiguang, additional, Xiao, Nong, additional, and Lu, Yutong, additional
Published: 2023
Full Text: View/download PDF

49. A photocatalytic carbon monoxide-generating effervescent microneedle patch for improved transdermal chemotherapy

Author: Fu, Junzhe, primary, Yu, Weijiang, additional, Qian, Xuedan, additional, Wang, Youxiang, additional, and Ji, Jian, additional
Published: 2023
Full Text: View/download PDF

50. Knowledge-aware Global Reasoning for Situation Recognition

Author: Yu, Weijiang, primary, Wang, Haofan, additional, Li, Guohao, additional, Xiao, Nong, additional, and Ghanem, Bernard, additional
Published: 2023
Full Text: View/download PDF
