Author: "Liu, Zheyuan" - Searchworks@Jio Institute Digital Library Search Results

Search keyword "Liu, Zheyuan": 414 results

1. Can Large Language Models Understand Preferences in Personalized Recommendation?

Author: Tan, Zhaoxuan, Zeng, Zinan, Zeng, Qingkai, Wu, Zhenyu, Liu, Zheyuan, Mo, Fengran, and Jiang, Meng
Subjects: Computer Science - Computation and Language
Abstract: Large Language Models (LLMs) excel in various tasks, including personalized recommendations. Existing evaluation methods often focus on rating prediction, relying on regression errors between actual and predicted ratings. However, user rating bias and item quality, two influential factors behind rating scores, can obscure personal preferences in user-item pair data. To address this, we introduce PerRecBench, disassociating the evaluation from these two factors and assessing recommendation techniques on capturing the personal preferences in a grouped ranking manner. We find that the LLM-based recommendation techniques that are generally good at rating prediction fail to identify users' favored and disfavored items when the user rating bias and item quality are eliminated by grouping users. With PerRecBench and 19 LLMs, we find that while larger models generally outperform smaller ones, they still struggle with personalized recommendation. Our findings reveal the superiority of pairwise and listwise ranking approaches over pointwise ranking, PerRecBench's low correlation with traditional regression metrics, the importance of user profiles, and the role of pretraining data distributions. We further explore three supervised fine-tuning strategies, finding that merging weights from single-format training is promising but improving LLMs' understanding of user preferences remains an open research problem. Code and data are available at https://github.com/TamSiuhin/PerRecBench
Published: 2025

2. CLIPErase: Efficient Unlearning of Visual-Textual Associations in CLIP

Author: Yang, Tianyu, Dai, Lisen, Liu, Zheyuan, Wang, Xiangqi, Jiang, Meng, Tian, Yapeng, and Zhang, Xiangliang
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Machine unlearning (MU) has gained significant attention as a means to remove specific data from trained models without requiring a full retraining process. While progress has been made in unimodal domains like text and image classification, unlearning in multimodal models remains relatively underexplored. In this work, we address the unique challenges of unlearning in CLIP, a prominent multimodal model that aligns visual and textual representations. We introduce CLIPErase, a novel approach that disentangles and selectively forgets both visual and textual associations, ensuring that unlearning does not compromise model performance. CLIPErase consists of three key modules: a Forgetting Module that disrupts the associations in the forget set, a Retention Module that preserves performance on the retain set, and a Consistency Module that maintains consistency with the original model. Extensive experiments on the CIFAR-100 and Flickr30K datasets across four CLIP downstream tasks demonstrate that CLIPErase effectively forgets designated associations in zero-shot tasks for multimodal samples, while preserving the model's performance on the retain set after unlearning.
Published: 2024

3. Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench

Author: Liu, Zheyuan, Dou, Guangyao, Jia, Mengzhao, Tan, Zhaoxuan, Zeng, Qingkai, Yuan, Yongle, and Jiang, Meng
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Generative models such as Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) trained on massive web corpora can memorize and disclose individuals' confidential and private data, raising legal and ethical concerns. While many previous works have addressed this issue in LLMs via machine unlearning, it remains largely unexplored for MLLMs. To tackle this challenge, we introduce Multimodal Large Language Model Unlearning Benchmark (MLLMU-Bench), a novel benchmark aimed at advancing the understanding of multimodal machine unlearning. MLLMU-Bench consists of 500 fictitious profiles and 153 profiles for public celebrities, with each profile featuring over 14 customized question-answer pairs, evaluated from both multimodal (image+text) and unimodal (text) perspectives. The benchmark is divided into four sets to assess unlearning algorithms in terms of efficacy, generalizability, and model utility. Finally, we provide baseline results using existing generative model unlearning algorithms. Surprisingly, our experiments show that unimodal unlearning algorithms excel in generation and cloze tasks, while multimodal unlearning approaches perform better in classification tasks with multimodal inputs.
Comment: 30 pages
Published: 2024

4. OpenKD: Opening Prompt Diversity for Zero- and Few-shot Keypoint Detection

Author: Lu, Changsheng, Liu, Zheyuan, and Koniusz, Piotr
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Exploiting foundation models (e.g., CLIP) to build a versatile keypoint detector has gained increasing attention. Most existing models accept either a text prompt (e.g., "the nose of a cat") or a visual prompt (e.g., a support image with keypoint annotations) to detect the corresponding keypoints in a query image, thereby exhibiting either zero-shot or few-shot detection ability. However, research on multimodal prompts is still underexplored, and prompt diversity in semantics and language is far from opened. For example, how should a model handle unseen text prompts for novel keypoint detection, or diverse text prompts like "Can you detect the nose and ears of a cat?" In this work, we open the prompt diversity from three aspects: modality, semantics (seen vs. unseen), and language, to enable a more generalized zero- and few-shot keypoint detection (Z-FSKD). We propose a novel OpenKD model which leverages a multimodal prototype set to support both visual and textual prompting. Further, to infer the keypoint location of unseen texts, we add auxiliary keypoints and texts interpolated from the visual and textual domains into training, which improves the spatial reasoning of our model and significantly enhances zero-shot novel keypoint detection. We also found that a large language model (LLM) is a good parser, achieving over 96% accuracy in parsing keypoints from texts. With an LLM, OpenKD can handle diverse text prompts. Experimental results show that our method achieves state-of-the-art performance on Z-FSKD and initiates new ways to deal with unseen text and diverse texts. The source code and data are available at https://github.com/AlanLuSun/OpenKD
Comment: Accepted by ECCV 2024
Published: 2024

5. Chiral-Split Magnon in Altermagnetic MnTe

Author: Liu, Zheyuan, Ozeki, Makoto, Asai, Shinichiro, Itoh, Shinichi, and Masuda, Takatsugu
Subjects: Condensed Matter - Strongly Correlated Electrons
Abstract: Altermagnetism is a newly discovered magnetic class named after the alternating spin polarizations in both real and reciprocal spaces. Like the spin-splitting of electronic bands, the magnon bands in altermagnets are predicted to exhibit alternating chiral splitting. In this work, by performing inelastic neutron scattering on α-MnTe, we directly verified the chiral splitting in altermagnetic magnon dispersions. The lifted degeneracy of chirality is further explained by a symmetric-exchange origin. In addition, the g-wave magnetism was identified in MnTe.
Comment: 6 pages, 3 figures. To appear in Phys. Rev. Lett. This is the initial manuscript that was submitted to Phys. Rev. Lett.
Published: 2024

6. Machine Unlearning in Generative AI: A Survey

Author: Liu, Zheyuan, Dou, Guangyao, Tan, Zhaoxuan, Tian, Yijun, and Jiang, Meng
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Generative AI technologies have been deployed in many places, such as (multimodal) large language models and vision generative models. Their remarkable performance should be attributed to massive training data and emergent reasoning abilities. However, the models can memorize and generate sensitive, biased, or dangerous information originating from the training data, especially data from web crawls. New machine unlearning (MU) techniques are being developed to reduce or eliminate undesirable knowledge and its effects from the models, because techniques designed for traditional classification tasks cannot be applied to Generative AI. We offer a comprehensive survey of MU in Generative AI, covering a new problem formulation, evaluation methods, and a structured discussion of the advantages and limitations of different kinds of MU techniques. The survey also presents several critical challenges and promising directions in MU research. A curated list of readings can be found at https://github.com/franciscoliu/GenAI-MU-Reading.
Published: 2024

7. DiffStega: Towards Universal Training-Free Coverless Image Steganography with Diffusion Models

Author: Yang, Yiwei, Liu, Zheyuan, Jia, Jun, Gao, Zhongpai, Li, Yunhao, Sun, Wei, Liu, Xiaohong, and Zhai, Guangtao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Traditional image steganography focuses on concealing one image within another, aiming to avoid steganalysis by unauthorized entities. Coverless image steganography (CIS) enhances imperceptibility by not using any cover image. Recent works have utilized text prompts as keys in CIS through diffusion models. However, this approach faces three challenges: invalidation when the private prompt is guessed, the difficulty of crafting public prompts for semantic diversity, and the risk of prompt leakage during frequent transmission. To address these issues, we propose DiffStega, an innovative training-free diffusion-based CIS strategy for universal application. DiffStega uses a password-dependent reference image as an image prompt alongside the text, ensuring that only authorized parties can retrieve the hidden information. Furthermore, we develop the Noise Flip technique to further secure the steganography against unauthorized decryption. To comprehensively assess our method across general CIS tasks, we create a dataset comprising various image steganography instances. Experiments indicate substantial improvements in our method over existing ones, particularly in aspects of versatility, password sensitivity, and recovery quality. Code is available at https://github.com/evtricks/DiffStega
Comment: 9 pages, 7 figures; reference added; accepted at IJCAI2024 main track
Published: 2024

8. Avoiding Copyright Infringement via Large Language Model Unlearning

Author: Dou, Guangyao, Liu, Zheyuan, Lyu, Qing, Ding, Kaize, and Wong, Eric
Subjects: Computer Science - Computation and Language
Abstract: Pre-trained Large Language Models (LLMs) have demonstrated remarkable capabilities but also pose risks by learning and generating copyrighted material, leading to significant legal and ethical concerns. In real-world scenarios, model owners need to continuously address copyright infringement as new requests for content removal emerge at different time points. This leads to the need for sequential unlearning, where copyrighted content is removed sequentially as new requests arise. Despite its practical relevance, sequential unlearning in the context of copyright infringement has not been rigorously explored in existing literature. To address this gap, we propose Stable Sequential Unlearning (SSU), a novel framework designed to unlearn copyrighted content from LLMs over multiple time steps. Our approach works by identifying and removing specific weight updates in the model's parameters that correspond to copyrighted content. We improve unlearning efficacy by introducing random labeling loss and ensuring the model retains its general-purpose knowledge by adjusting targeted parameters. Experimental results show that SSU achieves an effective trade-off between unlearning efficacy and general-purpose language abilities, outperforming existing baselines.
Published: 2024

9. Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts

Author: Tan, Zhaoxuan, Liu, Zheyuan, and Jiang, Meng
Subjects: Computer Science - Computation and Language
Abstract: Personalized large language models (LLMs) aim to tailor interactions, content, and recommendations to individual user preferences. While parameter-efficient fine-tuning (PEFT) methods excel in performance and generalization, they are costly and limit communal benefits when used individually. To this end, we introduce Personalized Pieces (Per-Pcs), a framework that allows users to safely share and assemble personalized PEFT efficiently with collaborative efforts. Per-Pcs involves selecting sharers, breaking their PEFT into pieces, and training gates for each piece. These pieces are added to a pool, from which target users can select and assemble personalized PEFT using their history data. This approach preserves privacy and enables fine-grained user modeling without excessive storage and computation demands. Experimental results show Per-Pcs outperforms non-personalized and PEFT retrieval baselines, offering performance comparable to OPPU with significantly lower resource use across six tasks. Further analysis highlights Per-Pcs's robustness concerning sharer count and selection strategy, pieces sharing ratio, and scalability in computation time and storage space. Per-Pcs's modularity promotes safe sharing, making LLM personalization more efficient, effective, and widely accessible through collaborative efforts.
Comment: EMNLP 2024 Main
Published: 2024

10. A halide-oxide composite solid-state electrolyte for enhancing ionic conductivity by promoting interfacial healing through low-temperature heat treatment

Author: Xu, Chenyuan, Chao, Yu, Yang, Sisheng, Li, Borong, Yu, Yan, Xu, Xiaoming, Sun, Yulong, Liu, Zheyuan, Wang, Qian, and Yang, Chengkai
Published: 2025

11. OpenKD: Opening Prompt Diversity for Zero- and Few-Shot Keypoint Detection

Author: Lu, Changsheng, Liu, Zheyuan, and Koniusz, Piotr
Series Editor: Goos, Gerhard
Founding Editor: Hartmanis, Juris
Editorial Board Members: Bertino, Elisa, Gao, Wen, Steffen, Bernhard, and Yung, Moti
Editors: Leonardis, Aleš, Ricci, Elisa, Roth, Stefan, Russakovsky, Olga, Sattler, Torsten, and Varol, Gül
Published: 2025

12. Graph Learning for Parameter Prediction of Quantum Approximate Optimization Algorithm

Author: Liang, Zhiding, Liu, Gang, Liu, Zheyuan, Cheng, Jinglei, Hao, Tianyi, Liu, Kecheng, Ren, Hang, Song, Zhixin, Liu, Ji, Ye, Fanny, and Shi, Yiyu
Subjects: Quantum Physics, Computer Science - Machine Learning
Abstract: In recent years, quantum computing has emerged as a transformative force in the field of combinatorial optimization, offering novel approaches to tackling complex problems that have long challenged classical computational methods. Among these, the Quantum Approximate Optimization Algorithm (QAOA) stands out for its potential to efficiently solve the Max-Cut problem, a quintessential example of combinatorial optimization. However, practical application faces challenges due to current limitations on quantum computational resources. Our work optimizes QAOA initialization, using Graph Neural Networks (GNN) as a warm-start technique. This trades affordable computational resources on classical computers for reduced quantum computational resource overhead, enhancing QAOA's effectiveness. Experiments with various GNN architectures demonstrate the adaptability and stability of our framework, highlighting the synergy between quantum algorithms and machine learning. Our findings show GNN's potential in improving QAOA performance, opening new avenues for hybrid quantum-classical approaches in quantum computing and contributing to practical applications.
Published: 2024

13. Underwater image enhancement via color conversion and white balance-based fusion

Author: Xu, Hanning, Mu, Pan, Liu, Zheyuan, and Cheng, Shichao
Published: 2024

14. Can we Soft Prompt LLMs for Graph Learning Tasks?

Author: Liu, Zheyuan, He, Xiaoxin, Tian, Yijun, and Chawla, Nitesh V.
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language
Abstract: Graphs play an important role in representing complex relationships in real-world applications such as social networks, biological data and citation networks. In recent years, Large Language Models (LLMs) have achieved tremendous success in various domains, which makes applying LLMs to graphs particularly appealing. However, directly applying LLMs to graph modalities presents unique challenges due to the discrepancy and mismatch between the graph and text modalities. Hence, to further investigate LLMs' potential for comprehending graph information, we introduce GraphPrompter, a novel framework designed to align graph information with LLMs via soft prompts. Specifically, GraphPrompter consists of two main components: a graph neural network to encode complex graph information and an LLM that effectively processes textual information. Comprehensive experiments on various benchmark datasets under node classification and link prediction tasks demonstrate the effectiveness of our proposed method. The GraphPrompter framework unveils the substantial capabilities of LLMs as predictors in graph-related tasks, enabling researchers to utilize LLMs across a spectrum of real-world graph scenarios more effectively.
Comment: Accepted by The Web Conference (WWW) 2024 Short Paper Track
Published: 2024

15. Towards Safer Large Language Models through Machine Unlearning

Author: Liu, Zheyuan, Dou, Guangyao, Tan, Zhaoxuan, Tian, Yijun, and Jiang, Meng
Subjects: Computer Science - Computation and Language
Abstract: The rapid advancement of Large Language Models (LLMs) has demonstrated their vast potential across various domains, attributed to their extensive pretraining knowledge and exceptional generalizability. However, LLMs often encounter challenges in generating harmful content when faced with problematic prompts. To address this problem, existing work attempted to implement a gradient ascent based approach to prevent LLMs from producing harmful output. While these methods can be effective, they frequently impact the model utility in responding to normal prompts. To address this gap, we introduce Selective Knowledge negation Unlearning (SKU), a novel unlearning framework for LLMs, designed to eliminate harmful knowledge while preserving utility on normal prompts. Specifically, SKU consists of two stages: a harmful knowledge acquisition stage and a knowledge negation stage. The first stage aims to identify and acquire harmful knowledge within the model, whereas the second is dedicated to removing this knowledge. SKU selectively isolates and removes harmful knowledge in model parameters, ensuring the model's performance remains robust on normal prompts. Our experiments conducted across various LLM architectures demonstrate that SKU identifies a good balance point between removing harmful information and preserving utility.
Comment: Accepted by ACL 2024 Findings
Published: 2024

16. UGMAE: A Unified Framework for Graph Masked Autoencoders

Author: Tian, Yijun, Zhang, Chuxu, Kou, Ziyi, Liu, Zheyuan, Zhang, Xiangliang, and Chawla, Nitesh V.
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Generative self-supervised learning on graphs, particularly graph masked autoencoders, has emerged as a popular learning paradigm and demonstrated its efficacy in handling non-Euclidean data. However, several remaining issues limit the capability of existing methods: 1) the disregard of uneven node significance in masking, 2) the underutilization of holistic graph information, 3) the ignorance of semantic knowledge in the representation space due to the exclusive use of reconstruction loss in the output space, and 4) the unstable reconstructions caused by the large volume of masked contents. In light of this, we propose UGMAE, a unified framework for graph masked autoencoders to address these issues from the perspectives of adaptivity, integrity, complementarity, and consistency. Specifically, we first develop an adaptive feature mask generator to account for the unique significance of nodes and sample informative masks (adaptivity). We then design a ranking-based structure reconstruction objective joint with feature reconstruction to capture holistic graph information and emphasize the topological proximity between neighbors (integrity). After that, we present a bootstrapping-based similarity module to encode the high-level semantic knowledge in the representation space, complementary to the low-level reconstruction in the output space (complementarity). Finally, we build a consistency assurance module to provide reconstruction objectives with extra stabilized consistency targets (consistency). Extensive experiments demonstrate that UGMAE outperforms both contrastive and generative state-of-the-art baselines on several tasks across multiple datasets.
Published: 2024

17. Democratizing Large Language Models via Personalized Parameter-Efficient Fine-tuning

Author: Tan, Zhaoxuan, Zeng, Qingkai, Tian, Yijun, Liu, Zheyuan, Yin, Bing, and Jiang, Meng
Subjects: Computer Science - Computation and Language
Abstract: Personalization in large language models (LLMs) is increasingly important, aiming to align the LLMs' interactions, content, and recommendations with individual user preferences. Recent advances have highlighted effective prompt design by enriching user queries with non-parametric knowledge through behavior history retrieval and textual profiles. However, these methods faced limitations due to a lack of model ownership, resulting in constrained customization and privacy issues, and often failed to capture complex, dynamic user behavior patterns. To address these shortcomings, we introduce One PEFT Per User (OPPU), employing personalized parameter-efficient fine-tuning (PEFT) modules to store user-specific behavior patterns and preferences. By plugging in personal PEFT parameters, users can own and use their LLMs individually. OPPU integrates parametric user knowledge in the personal PEFT parameters with non-parametric knowledge from retrieval and profiles, adapting LLMs to user behavior shifts. Experimental results demonstrate that OPPU significantly outperforms existing prompt-based methods across seven diverse tasks in the LaMP benchmark. Further studies reveal OPPU's enhanced capabilities in handling user behavior shifts, modeling users at different activity levels, maintaining robustness across various user history formats, and displaying versatility with different PEFT methods.
Comment: EMNLP 2024 Main
Published: 2024

18. Breaking the Trilemma of Privacy, Utility, Efficiency via Controllable Machine Unlearning

Author: Liu, Zheyuan, Dou, Guangyao, Tian, Yijun, Zhang, Chunhui, Chien, Eli, and Zhu, Ziwei
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Machine Unlearning (MU) algorithms have become increasingly critical due to the imperative adherence to data privacy regulations. The primary objective of MU is to erase the influence of specific data samples on a given model without the need to retrain it from scratch. Accordingly, existing methods focus on maximizing user privacy protection. However, there are different degrees of privacy regulations for each real-world web-based application. Exploring the full spectrum of trade-offs between privacy, model utility, and runtime efficiency is critical for practical unlearning scenarios. Furthermore, designing the MU algorithm with simple control of the aforementioned trade-off is desirable but challenging due to the inherent complex interaction. To address the challenges, we present Controllable Machine Unlearning (ConMU), a novel framework designed to facilitate the calibration of MU. The ConMU framework contains three integral modules: an important data selection module that reconciles the runtime efficiency and model generalization, a progressive Gaussian mechanism module that balances privacy and model generalization, and an unlearning proxy that controls the trade-offs between privacy and runtime efficiency. Comprehensive experiments on various benchmark datasets have demonstrated the robust adaptability of our control mechanism and its superiority over established unlearning methods. ConMU explores the full spectrum of the Privacy-Utility-Efficiency trade-off and allows practitioners to account for different real-world regulations. Source code available at: https://github.com/guangyaodou/ConMU
Comment: Accepted by The Web Conference (WWW) 2024
Published: 2023

19. A Generalized Physical-knowledge-guided Dynamic Model for Underwater Image Enhancement

Author: Mu, Pan, Xu, Hanning, Liu, Zheyuan, Wang, Zheng, Chan, Sixian, and Bai, Cong
Subjects: Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Underwater images often suffer from color distortion and low contrast, resulting in various image types, due to the scattering and absorption of light by water. Moreover, it is difficult to obtain high-quality paired training samples with a generalized model. To tackle these challenges, we design a Generalized Underwater image enhancement method via a Physical-knowledge-guided Dynamic Model (short for GUPDM), consisting of three parts: Atmosphere-based Dynamic Structure (ADS), Transmission-guided Dynamic Structure (TDS), and Prior-based Multi-scale Structure (PMS). In particular, to cover complex underwater scenes, this study changes the global atmosphere light and the transmission to simulate various underwater image types (e.g., the underwater image color ranging from yellow to blue) through the formation model. We then design ADS and TDS that use dynamic convolutions to adaptively extract prior information from underwater images and generate parameters for PMS. These two modules enable the network to select appropriate parameters for various water types adaptively. Besides, the multi-scale feature extraction module in PMS uses convolution blocks with different kernel sizes, obtains weights for each feature map via a channel attention block, and fuses them to boost the receptive field of the network. The source code will be available at https://github.com/shiningZZ/GUPDM
Comment: Accepted by ACMMM 2023
Published: 2023

20. Histogram-guided Video Colorization Structure with Spatial-Temporal Connection

Author: Liu, Zheyuan, Mu, Pan, Xu, Hanning, and Bai, Cong
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Video colorization, aiming at obtaining colorful and plausible results from grayish frames, has aroused a lot of interest recently. Nevertheless, how to maintain temporal consistency while keeping the quality of colorized results remains challenging. To tackle the above problems, we present a Histogram-guided Video Colorization with Spatial-Temporal connection structure (named ST-HVC). To fully exploit the chroma and motion information, the joint flow and histogram module is tailored to integrate the histogram and flow features. To manage blur and artifacts, we design a combination scheme attending to temporal detail and flow feature combination. We further recombine the histogram, flow and sharpness features via a U-shape network. Extensive comparisons are conducted with several state-of-the-art image and video-based methods, demonstrating that the developed method achieves excellent performance both quantitatively and qualitatively in two video datasets.
Comment: 6 pages; Accepted at IEEE ICME
Published: 2023

21. All-pairs Consistency Learning for Weakly Supervised Semantic Segmentation

Author: Sun, Weixuan, Zhang, Yanhao, Qin, Zhen, Liu, Zheyuan, Cheng, Lin, Wang, Fanyi, Zhong, Yiran, and Barnes, Nick
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this work, we propose a new transformer-based regularization to better localize objects for Weakly supervised semantic segmentation (WSSS). In image-level WSSS, Class Activation Map (CAM) is adopted to generate object localization as pseudo segmentation labels. To address the partial activation issue of the CAMs, consistency regularization is employed to maintain activation intensity invariance across various image augmentations. However, such methods ignore pair-wise relations among regions within each CAM, which capture context and should also be invariant across image views. To this end, we propose a new all-pairs consistency regularization (ACR). Given a pair of augmented views, our approach regularizes the activation intensities between a pair of augmented views, while also ensuring that the affinity across regions within each view remains consistent. We adopt vision transformers as the self-attention mechanism naturally embeds pair-wise affinity. This enables us to simply regularize the distance between the attention matrices of augmented image pairs. Additionally, we introduce a novel class-wise localization method that leverages the gradients of the class token. Our method can be seamlessly integrated into existing WSSS methods using transformers without modifying the architectures. We evaluate our method on PASCAL VOC and MS COCO datasets. Our method produces noticeably better class localization maps (67.3% mIoU on PASCAL VOC train), resulting in superior WSSS performances.
Comment: ICCV 2023 workshop, code released at: https://github.com/OpenNLPLab/ACR_WSSS
Published: 2023

22. Candidate Set Re-ranking for Composed Image Retrieval with Dual Multi-modal Encoder

Author: Liu, Zheyuan, Sun, Weixuan, Teney, Damien, and Gould, Stephen
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Information Retrieval, Computer Science - Machine Learning
Abstract: Composed image retrieval aims to find an image that best matches a given multi-modal user query consisting of a reference image and text pair. Existing methods commonly pre-compute image embeddings over the entire corpus and compare these to a reference image embedding modified by the query text at test time. Such a pipeline is very efficient at test time since fast vector distances can be used to evaluate candidates, but modifying the reference image embedding guided only by a short textual description can be difficult, especially independent of potential candidates. An alternative approach is to allow interactions between the query and every possible candidate, i.e., reference-text-candidate triplets, and pick the best from the entire set. Though this approach is more discriminative, for large-scale datasets the computational cost is prohibitive since pre-computation of candidate embeddings is no longer possible. We propose to combine the merits of both schemes using a two-stage model. Our first stage adopts the conventional vector distancing metric and performs a fast pruning among candidates. Meanwhile, our second stage employs a dual-encoder architecture, which effectively attends to the input triplet of reference-text-candidate and re-ranks the candidates. Both stages utilize a vision-and-language pre-trained network, which has proven beneficial for various downstream tasks. Our method consistently outperforms state-of-the-art approaches on standard benchmarks for the task. Our implementation is available at https://github.com/Cuberick-Orion/Candidate-Reranking-CIR
Comment: Accepted at TMLR, 19 pages, 8 figures
Published: 2023

23. An Alternative to WSSS? An Empirical Study of the Segment Anything Model (SAM) on Weakly-Supervised Semantic Segmentation Problems

Author: Sun, Weixuan, Liu, Zheyuan, Zhang, Yanhao, Zhong, Yiran, and Barnes, Nick
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The Segment Anything Model (SAM) has demonstrated exceptional performance and versatility, making it a promising tool for various related tasks. In this report, we explore the application of SAM in Weakly-Supervised Semantic Segmentation (WSSS). Particularly, we adapt SAM as the pseudo-label generation pipeline given only the image-level class labels. While we observed impressive results in most cases, we also identify certain limitations. Our study includes performance evaluations on PASCAL VOC and MS-COCO, where we achieved remarkable improvements over the latest state-of-the-art methods on both datasets. We anticipate that this report encourages further explorations of adopting SAM in WSSS, as well as wider real-world applications., Comment: Technique report
Published: 2023

24. Bi-directional Training for Composed Image Retrieval via Text Prompt Learning

Author: Liu, Zheyuan, Sun, Weixuan, Hong, Yicong, Teney, Damien, and Gould, Stephen
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Information Retrieval, Computer Science - Machine Learning
Abstract: Composed image retrieval searches for a target image based on a multi-modal user query comprised of a reference image and modification text describing the desired changes. Existing approaches to solving this challenging task learn a mapping from the (reference image, modification text)-pair to an image embedding that is then matched against a large image corpus. One area that has not yet been explored is the reverse direction, which asks the question, what reference image when modified as described by the text would produce the given target image? In this work we propose a bi-directional training scheme that leverages such reversed queries and can be applied to existing composed image retrieval architectures with minimum changes, which improves the performance of the model. To encode the bi-directional query we prepend a learnable token to the modification text that designates the direction of the query and then finetune the parameters of the text embedding module. We make no other changes to the network architecture. Experiments on two standard datasets show that our novel approach achieves improved performance over a baseline BLIP-based model that itself already achieves competitive performance. Our code is released at https://github.com/Cuberick-Orion/Bi-Blip4CIR., Comment: WACV 2024 accepted. 12 pages, 7 figures
Published: 2023

25. Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning

Author: Sun, Weixuan, Zhang, Jiayi, Wang, Jianyuan, Liu, Zheyuan, Zhong, Yiran, Feng, Tianpeng, Guo, Yandong, Zhang, Yanhao, and Barnes, Nick
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Self-supervised audio-visual source localization aims to locate sound-source objects in video frames without extra annotations. Recent methods often approach this goal with the help of contrastive learning, which assumes only the audio and visual contents from the same video are positive samples for each other. However, this assumption would suffer from false negative samples in real-world training. For example, for an audio sample, treating the frames from the same audio class as negative samples may mislead the model and therefore harm the learned representations (e.g., the audio of a siren wailing may reasonably correspond to the ambulances in multiple images). Based on this observation, we propose a new learning strategy named False Negative Aware Contrastive (FNAC) to mitigate the problem of misleading the training with such false negative samples. Specifically, we utilize the intra-modal similarities to identify potentially similar samples and construct corresponding adjacency matrices to guide contrastive learning. Further, we propose to strengthen the role of true negative samples by explicitly leveraging the visual features of sound sources to facilitate the differentiation of authentic sounding source regions. FNAC achieves state-of-the-art performances on Flickr-SoundNet, VGG-Sound, and AVSBench, which demonstrates the effectiveness of our method in mitigating the false negative issue. The code is available at https://github.com/OpenNLPLab/FNAC_AVL., Comment: CVPR2023
Published: 2023

26. Spin Excitation in Coupled Honeycomb Lattice Ni$_2$InSbO$_6$

Author: Liu, Zheyuan, Arima, Taka-hisa, Itoh, Shinichi, Asai, Shinichiro, and Masuda, Takatsugu
Subjects: Condensed Matter - Strongly Correlated Electrons
Abstract: We performed an inelastic neutron scattering experiment on a polycrystalline sample of a helimagnet Ni$_2$InSbO$_6$ to construct the spin Hamiltonian. Well-defined spin-wave excitation with a band energy of 20 meV was observed below $T_{N} = 76$ K. Using the linear spin-wave theory, the spectrum was reasonably reproduced with honeycomb spin layers coupled along the stacking axis (the $c$ axis). The proposed spin model reproduces the soliton lattice induced by a magnetic field applied perpendicular to the $c$ axis., Comment: 8 pages, 5 figures
Published: 2022
Full Text: View/download PDF

27. Plasma Surface Modification to Reduce Interfacial Defects in Aramid Fiber/Epoxy Composites

Author: Du, Yijun, Quan, Xiaoxi, Liu, Zheyuan, Deng, Yu, and Chen, Shuo
Series Editor: Angrisani, Leopoldo, Arteaga, Marco, Chakraborty, Samarjit, Chen, Jiming, Chen, Shanben, Chen, Tan Kay, Dillmann, Rüdiger, Duan, Haibin, Ferrari, Gianluigi, Ferre, Manuel, Jabbari, Faryar, Jia, Limin, Kacprzyk, Janusz, Khamis, Alaa, Kroeger, Torsten, Li, Yong, Liang, Qilian, Martín, Ferran, Ming, Tan Cher, Minker, Wolfgang, Misra, Pradeep, Mukhopadhyay, Subhas, Ning, Cun-Zheng, Nishida, Toyoaki, Oneto, Luca, Panigrahi, Bijaya Ketan, Pascucci, Federica, Qin, Yong, Seng, Gan Woon, Speidel, Joachim, Veiga, Germano, Wu, Haitao, Zamboni, Walter, and Tan, Kay Chen
Editor: Yang, Qingxin, Li, Zewen, and Luo, An
Published: 2024
Full Text: View/download PDF

28. p-d orbital hybridization induced by CuGa2 promotes selective N2 electroreduction

Author: Chen, Bin, Zheng, Chaoyang, Shi, Dehuan, Huang, Yi, Deng, Renxia, Wei, Yang, Liu, Zheyuan, Yu, Yan, and Zhong, Shenghong
Published: 2025
Full Text: View/download PDF

29. Study on the simulation of electric power production in the integrated base of hydro-wind-photovoltaic-storage

Author: Wu, Di, Xiang, Huawei, Li, Dacheng, Yang, Jianzan, and Liu, Zheyuan
Published: 2024
Full Text: View/download PDF

30. Thermodynamic and kinetic properties of gas hydrate phase transition from formation to decomposition with applications: A review

Author: Liu, Zheyuan, Liu, Xiaoyang, Yang, Mingjun, Pang, Weixin, Dou, Binlin, and Song, Yongchen
Published: 2024
Full Text: View/download PDF

31. Alkyl-linked TiO2@COF heterostructure facilitating photocatalytic CO2 reduction by targeted electron transport

Author: Ning, Jiangqi, Huang, Junhan, Liu, Yuhang, Chen, Yanlei, Niu, Qing, Lin, Qingqing, He, Yajun, Liu, Zheyuan, Yu, Yan, and Li, Liuyi
Published: 2024
Full Text: View/download PDF

32. Visual study of hydrate particles agglomeration and rolling characteristic in gas-oil-water phase using a flow loop

Author: Liu, Zheyuan, Shen, Shichen, Dou, Binlin, Liu, Ni, Yang, Liang, Yang, Mingjun, and Song, Yongchen
Published: 2024
Full Text: View/download PDF

33. Analytical method for optimizing capacity expansion of existing hydropower plants in hydro-wind-photovoltaic hybrid system: A case study in the Yalong River basin

Author: Wu, Chen, Liu, Pan, Cheng, Qian, Yang, Zhikai, Huang, Kangdi, Liu, Zheyuan, Zheng, Yalian, Li, Xiao, Zhou, Yong, Jiang, Dingguo, and Yu, Yi
Published: 2025
Full Text: View/download PDF

34. Progress in diagnosis and treatment of strabismus based on artificial intelligence technology

Author: GUO Yonglin, CHEN Moxin, LIU Zheyuan, LI Yifei, WANG Ziqi, SHU Qin, and LI Lin
Subjects: strabismus, artificial intelligence, diagnosis, treatment, algorithm, Medicine
Abstract: Strabismus, a misalignment of the eyes arising from central nervous system dysregulation and extraocular muscle imbalance, commonly manifests in childhood, leading to amblyopia, binocular vision dysfunction, torticollis and other developmental and psychological disorders. This exerts a negative impact on individuals, families and society. Timely diagnosis and intervention are crucial to prevent permanent damage to vision and stereopsis. Presently, strabismus diagnosis relies on ophthalmologists' evaluations, which results in a lack of efficiency and coverage. Routine school screening also proves inadequate, assessing the degree of strabismus with low accuracy. Therefore, how to improve the efficiency of strabismus screening is an issue of great importance. This paper delves into the present landscape of strabismus diagnosis and treatment, considering both local and global research advancements. It focuses on the evolution of artificial intelligence technology, illuminating the utilization of artificial intelligence models and algorithms in strabismus. By pinpointing and exploring their strengths and limitations, it offers valuable insights, paving the way for future investigations into artificial intelligence-assisted strabismus diagnosis and treatment.
Published: 2024
Full Text: View/download PDF

35. GETAM: Gradient-weighted Element-wise Transformer Attention Map for Weakly-supervised Semantic segmentation

Author: Sun, Weixuan, Zhang, Jing, Liu, Zheyuan, Zhong, Yiran, and Barnes, Nick
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Weakly Supervised Semantic Segmentation (WSSS) is challenging, particularly when image-level labels are used to supervise pixel-level prediction. To bridge this gap, a Class Activation Map (CAM) is usually generated to provide pixel-level pseudo labels. CAMs in Convolutional Neural Networks suffer from partial activation, i.e., only the most discriminative regions are activated. Transformer-based methods, on the other hand, are highly effective at exploring global context with long-range dependency modeling, potentially alleviating the "partial activation" issue. In this paper, we propose the first transformer-based WSSS approach, and introduce the Gradient-weighted Element-wise Transformer Attention Map (GETAM). GETAM shows fine-scale activation for all feature map elements, revealing different parts of the object across transformer layers. Further, we propose an activation-aware label completion module to generate high-quality pseudo labels. Finally, we incorporate our methods into an end-to-end framework for WSSS using double backward propagation. Extensive experiments on PASCAL VOC and COCO demonstrate that our results beat the state-of-the-art end-to-end approaches by a significant margin, and outperform most multi-stage methods.
Published: 2021

36. Accelerated photodegradation of T-2 toxin over magnetic recyclable ZnO/CaFe2O4 nanocomposite with a p-n based Z-scheme heterojunction architecture

Author: Huang, Qingwen, Lou, Xiuping, Nie, Dongxia, Zhao, Zhihui, Fan, Kai, Guo, Wenbo, Meng, Jiajia, Liu, Zheyuan, and Han, Zheng
Published: 2024
Full Text: View/download PDF

37. Investigation of hydrate formation and slurry flow visualization in Oil-gas–water multiphase systems

Author: Liu, Zaixing, Ma, Shihui, Wu, Zhaoran, Wang, Lei, Liu, Zheyuan, Wang, Jiguang, Lang, Chen, Luo, Tingting, and Li, Yanghui
Published: 2024
Full Text: View/download PDF

38. Enhancing osseointegration and angiogenesis of Titanium implants through KMnO4-Modified Montmorillonite nano-clay coating

Author: Xiong, Lifeng, Dai, Binwei, Yin, Baodi, Hii Ru Yie, Kendrick, Sun, Haobo, Liu, Yang, Liu, Zheyuan, Mahany, Ahmed S., Cheng, Huan, Xu, Lihua, Gao, Peng, Lu, Lei, and Liu, Jinsong
Published: 2024
Full Text: View/download PDF

39. Dual modification of current collector for high-performance lithium metal batteries by laser etching

Author: Zhang, Xin, Huang, Lujun, Yang, Guobo, Song, Jinpeng, Cong, Guanghui, Liu, Shaoshuai, Huang, Yating, Liu, Zheyuan, and Geng, Lin
Published: 2024
Full Text: View/download PDF

40. Enhancing the Cathode/Electrolyte interface in Ni-Rich Lithium-Ion batteries through homogeneous oxynitridation enabled by NO3− dominated clusters

Author: Xiao, Yuanbin, Zhang, Weicheng, Dong, Weikang, Yang, Kang, Chao, Yu, Xi, Chenpeng, Li, Mengchao, Zhang, Qiaoli, Liu, Zheyuan, Du, Peng, Liu, Huan, Zhang, Weidong, Shao, Ruiwen, Wang, Qian, Yu, Yan, and Yang, Chengkai
Published: 2024
Full Text: View/download PDF

41. Three-dimensional transistor arrays for intra- and inter-cellular recording

Author: Gu, Yue, Wang, Chunfeng, Kim, Namheon, Zhang, Jingxin, Wang, Tsui Min, Stowe, Jennifer, Nasiri, Rohollah, Li, Jinfeng, Zhang, Daibo, Yang, Albert, Hsu, Leo Huan-Hsuan, Dai, Xiaochuan, Mu, Jing, Liu, Zheyuan, Lin, Muyang, Li, Weixin, Wang, Chonghe, Gong, Hua, Chen, Yimu, Lei, Yusheng, Hu, Hongjie, Li, Yang, Zhang, Lin, Huang, Zhenlong, Zhang, Xingcai, Ahadian, Samad, Banik, Pooja, Zhang, Liangfang, Jiang, Xiaocheng, Burke, Peter J, Khademhosseini, Ali, McCulloch, Andrew D, and Xu, Sheng
Subjects: Cardiovascular, Bioengineering, Underpinning research, 1.1 Normal biological development and functioning, Action Potentials, Cell Communication, Electrophysiological Phenomena, Myocytes, Cardiac, Nanoscience & Nanotechnology
Abstract: Electrical impulse generation and its conduction within cells or cellular networks are the cornerstone of electrophysiology. However, the advancement of the field is limited by sensing accuracy and the scalability of current recording technologies. Here we describe a scalable platform that enables accurate recording of transmembrane potentials in electrogenic cells. The platform employs a three-dimensional high-performance field-effect transistor array for minimally invasive cellular interfacing that produces faithful recordings, as validated by the gold standard patch clamp. Leveraging the high spatial and temporal resolutions of the field-effect transistors, we measured the intracellular signal conduction velocity of a cardiomyocyte to be 0.182 m s$^{-1}$, which is about five times the intercellular velocity. We also demonstrate intracellular recordings in cardiac muscle tissue constructs and reveal the signal conduction paths. This platform could provide new capabilities in probing the electrical behaviours of single cells and cellular networks, which carries broad implications for understanding cellular physiology, pathology and cell-cell interactions.
Published: 2022

42. Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models

Author: Liu, Zheyuan, Rodriguez-Opazo, Cristian, Teney, Damien, and Gould, Stephen
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language, Computer Science - Information Retrieval
Abstract: We extend the task of composed image retrieval, where an input query consists of an image and short textual description of how to modify the image. Existing methods have only been applied to non-complex images within narrow domains, such as fashion products, thereby limiting the scope of study on in-depth visual reasoning in rich image and language contexts. To address this issue, we collect the Compose Image Retrieval on Real-life images (CIRR) dataset, which consists of over 36,000 pairs of crowd-sourced, open-domain images with human-generated modifying text. To extend current methods to the open-domain, we propose CIRPLANT, a transformer based model that leverages rich pre-trained vision-and-language (V&L) knowledge for modifying visual features conditioned on natural language. Retrieval is then done by nearest neighbor lookup on the modified features. We demonstrate that with a relatively simple architecture, CIRPLANT outperforms existing methods on open-domain images, while matching state-of-the-art accuracy on the existing narrow datasets, such as fashion. Together with the release of CIRR, we believe this work will inspire further research on composed image retrieval., Comment: ICCV 2021. Dataset, code, and pre-trained models are released at https://cuberick-orion.github.io/CIRR/
Published: 2021

43. Esculin suppresses the PERK-eIF2α-CHOP pathway by enhancing SIRT1 expression in oxidative stress-induced rat chondrocytes, mitigating osteoarthritis progression in a rat model

Author: Cheng, Zhihua, Liu, Zheyuan, Liu, Chao, Yang, Aoxiang, Miao, Haichuan, and Bai, Xizhuang
Published: 2024
Full Text: View/download PDF

44. Atomic scale formation mechanism of the Amorphous-Nanocrystalline biphase structure in TiSiN Coating: Phase field crystal simulation and experimental characterization

Author: Liu, Zheyuan, Zapolsky, Helena, Tang, Sai, Patte, Renaud, Mao, Hong, Du, Yong, Qiu, Lianchang, and Zhang, Li
Published: 2024
Full Text: View/download PDF

45. Mussel byssus-inspired dual-functionalization of zirconia dental implants for improved bone integration

Author: Zhang, Qihong, Wu, Shuyi, Sun, Yingyue, Ru Yie, Kendrick Hii, Zhuang, Jiatong, Liu, Tingting, Si, Wen, Zhang, Yinyan, Liu, Zheyuan, Xiong, Lifeng, Lu, Lei, Gao, Peng, and Liu, Jinsong
Published: 2024
Full Text: View/download PDF

46. Mechanical properties and microstructure of cement-fly ash-dacite powder composite cementitious system

Author: Liu, Zheyuan, Han, Juhong, and Hu, Liangjian
Published: 2024
Full Text: View/download PDF

47. Establishing the carrier scattering phase diagram for ZrNiSn-based half-Heusler thermoelectric materials

Author: Ren, Qingyong, Fu, Chenguang, Qiu, Qinyi, Dai, Shengnan, Liu, Zheyuan, Masuda, Takatsugu, Asai, Shinichiro, Hagihala, Masato, Lee, Sanghyun, Torri, Shuki, Kamiyama, Takashi, He, Lunhua, Tong, Xin, Felser, Claudia, Singh, David J., Zhu, Tiejun, Yang, Jiong, and Ma, Jie
Subjects: Condensed Matter - Materials Science, Physics - Applied Physics
Abstract: Chemical doping is one of the most important strategies for tuning electrical properties of semiconductors, particularly thermoelectric materials. Generally, the main role of chemical doping lies in optimizing the carrier concentration, but there can potentially be other important effects. Here, we show that chemical doping plays multiple roles for both electron and phonon transport properties in half-Heusler thermoelectric materials. With ZrNiSn-based half-Heusler materials as an example, we use high-quality single and polycrystalline crystals, various probes, including electrical transport measurements, inelastic neutron scattering measurement, and first-principles calculations, to investigate the underlying electron-phonon interaction. We find that chemical doping brings strong screening effects to ionized impurities, grain boundary, and polar optical phonon scattering, but has negligible influence on lattice thermal conductivity. Furthermore, it is possible to establish a carrier scattering phase diagram, which can be used to select reasonable strategies for optimization of the thermoelectric performance., Comment: 21 pages, 5 figures
Published: 2020
Full Text: View/download PDF

48. Deriving long-term operating rules of the hydro-wind-PV hybrid energy system considering electricity price

Author: Xu, Shitian, Liu, Pan, Li, Xiao, Cheng, Qian, and Liu, Zheyuan
Published: 2023
Full Text: View/download PDF

49. Low-Carbon Economic Dispatch of an Integrated Energy System Based on Carbon Emission Flow Theory

Author: Liu, Zheyuan, Xing, Haijun, Luo, Yangfan, Ye, Yujing, and Shi, Yusong
Published: 2023
Full Text: View/download PDF

50. An investigation on the variation of induction process in natural gas hydrate formation influenced by multiphase flow in a visual flow loop

Author: Liu, Ni, Sun, Yu, Wang, Cheng, Yang, Liang, and Liu, Zheyuan
Published: 2023
Full Text: View/download PDF
