Author: "Yuan, Jiayi" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Yuan, Jiayi"' showing total 312 results

Start Over Author "Yuan, Jiayi"

312 results on '"Yuan, Jiayi"'

1. Interpreting and Steering LLMs with Mutual Information-based Explanations on Sparse Autoencoders

Author: Wu, Xuansheng, Yuan, Jiayi, Yao, Wenlin, Zhai, Xiaoming, and Liu, Ninghao
Subjects: Computer Science - Computation and Language
Abstract: Large language models (LLMs) excel at handling human queries, but they can occasionally generate flawed or unexpected responses. Understanding their internal states is crucial for understanding their successes, diagnosing their failures, and refining their capabilities. Although sparse autoencoders (SAEs) have shown promise for interpreting LLM internal representations, limited research has explored how to better explain SAE features, i.e., understanding the semantic meaning of features learned by SAE. Our theoretical analysis reveals that existing explanation methods suffer from the frequency bias issue, where they emphasize linguistic patterns over semantic concepts, while the latter is more critical to steer LLM behaviors. To address this, we propose using a fixed vocabulary set for feature interpretations and designing a mutual information-based objective, aiming to better capture the semantic meaning behind these features. We further propose two runtime steering strategies that adjust the learned feature activations based on their corresponding explanations. Empirical results show that, compared to baselines, our method provides more discourse-level explanations and effectively steers LLM behaviors to defend against jailbreak attacks. These findings highlight the value of explanations for steering LLM behaviors in downstream applications. We will release our code and data once accepted., Comment: Pre-print. 20 pages, 5 figures
Published: 2025

2. The Science of Evaluating Foundation Models

Author: Yuan, Jiayi, Zhang, Jiamu, Wen, Andrew, and Hu, Xia
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: The emergent phenomena of large foundation models have revolutionized natural language processing. However, evaluating these models presents significant challenges due to their size, capabilities, and deployment across diverse applications. Existing literature often focuses on individual aspects, such as benchmark performance or specific tasks, but fails to provide a cohesive process that integrates the nuances of diverse use cases with broader ethical and operational considerations. This work focuses on three key aspects: (1) Formalizing the Evaluation Process by providing a structured framework tailored to specific use-case contexts, (2) Offering Actionable Tools and Frameworks such as checklists and templates to ensure thorough, reproducible, and practical evaluations, and (3) Surveying Recent Work with a targeted review of advancements in LLM evaluation, emphasizing real-world applications.
Published: 2025

3. Robot Learning with Super-Linear Scaling

Author: Torne, Marcel, Jain, Arhan, Yuan, Jiayi, Macha, Vidaaranya, Ankile, Lars, Simeonov, Anthony, Agrawal, Pulkit, and Gupta, Abhishek
Subjects: Computer Science - Robotics, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Scaling robot learning requires data collection pipelines that scale favorably with human effort. In this work, we propose Crowdsourcing and Amortizing Human Effort for Real-to-Sim-to-Real(CASHER), a pipeline for scaling up data collection and learning in simulation where the performance scales superlinearly with human effort. The key idea is to crowdsource digital twins of real-world scenes using 3D reconstruction and collect large-scale data in simulation, rather than the real-world. Data collection in simulation is initially driven by RL, bootstrapped with human demonstrations. As the training of a generalist policy progresses across environments, its generalization capabilities can be used to replace human effort with model generated demonstrations. This results in a pipeline where behavioral data is collected in simulation with continually reducing human effort. We show that CASHER demonstrates zero-shot and few-shot scaling laws on three real-world tasks across diverse scenarios. We show that CASHER enables fine-tuning of pre-trained policies to a target scenario using a video scan without any additional human effort. See our project website: https://casher-robot-learning.github.io/CASHER/
Published: 2024

4. InvestESG: A multi-agent reinforcement learning benchmark for studying climate investment as a social dilemma

Author: Hou, Xiaoxuan, Yuan, Jiayi, Leibo, Joel Z., and Jaques, Natasha
Subjects: Computer Science - Machine Learning, Computer Science - Computers and Society, Computer Science - Multiagent Systems, Economics - General Economics
Abstract: InvestESG is a novel multi-agent reinforcement learning (MARL) benchmark designed to study the impact of Environmental, Social, and Governance (ESG) disclosure mandates on corporate climate investments. The benchmark models an intertemporal social dilemma where companies balance short-term profit losses from climate mitigation efforts and long-term benefits from reducing climate risk, while ESG-conscious investors attempt to influence corporate behavior through their investment decisions. Companies allocate capital across mitigation, greenwashing, and resilience, with varying strategies influencing climate outcomes and investor preferences. We are releasing open-source versions of InvestESG in both PyTorch and JAX, which enable scalable and hardware-accelerated simulations for investigating competing incentives in mitigate climate change. Our experiments show that without ESG-conscious investors with sufficient capital, corporate mitigation efforts remain limited under the disclosure mandate. However, when a critical mass of investors prioritizes ESG, corporate cooperation increases, which in turn reduces climate risks and enhances long-term financial stability. Additionally, providing more information about global climate risks encourages companies to invest more in mitigation, even without investor involvement. Our findings align with empirical research using real-world data, highlighting MARL's potential to inform policy by providing insights into large-scale socio-economic challenges through efficient testing of alternative policy and market designs.
Published: 2024

5. Taylor Unswift: Secured Weight Release for Large Language Models via Taylor Expansion

Author: Wang, Guanchu, Chuang, Yu-Neng, Tang, Ruixiang, Zhong, Shaochen, Yuan, Jiayi, Jin, Hongye, Liu, Zirui, Chaudhary, Vipin, Xu, Shuai, Caverlee, James, and Hu, Xia
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Ensuring the security of released large language models (LLMs) poses a significant dilemma, as existing mechanisms either compromise ownership rights or raise data privacy concerns. To address this dilemma, we introduce TaylorMLP to protect the ownership of released LLMs and prevent their abuse. Specifically, TaylorMLP preserves the ownership of LLMs by transforming the weights of LLMs into parameters of Taylor-series. Instead of releasing the original weights, developers can release the Taylor-series parameters with users, thereby ensuring the security of LLMs. Moreover, TaylorMLP can prevent abuse of LLMs by adjusting the generation speed. It can induce low-speed token generation for the protected LLMs by increasing the terms in the Taylor-series. This intentional delay helps LLM developers prevent potential large-scale unauthorized uses of their models. Empirical experiments across five datasets and three LLM architectures demonstrate that TaylorMLP induces over 4x increase in latency, producing the tokens precisely matched with original LLMs. Subsequent defensive experiments further confirm that TaylorMLP effectively prevents users from reconstructing the weight values based on downstream datasets.
Published: 2024

6. DHP Benchmark: Are LLMs Good NLG Evaluators?

Author: Wang, Yicheng, Yuan, Jiayi, Chuang, Yu-Neng, Wang, Zhuoer, Liu, Yingchi, Cusick, Mark, Kulkarni, Param, Ji, Zhengping, Ibrahim, Yasser, and Hu, Xia
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Large Language Models (LLMs) are increasingly serving as evaluators in Natural Language Generation (NLG) tasks; this is often referred to as ``LLM-as-a-judge'' paradigm. However, the capabilities of LLMs in evaluating NLG quality remain underexplored. Current studies depend on human assessments and simple metrics that fail to capture the discernment of LLMs across diverse NLG tasks. To address this gap, we propose the Discernment of Hierarchical Perturbation (DHP) benchmarking framework, which provides quantitative discernment scores for LLMs. This framework leverages hierarchically perturbed text data and statistical tests to systematically measure the NLG evaluation capabilities of LLMs. We re-established six evaluation datasets for this benchmark, covering four NLG tasks: Summarization, Story Completion, Question Answering, and Translation. Our comprehensive benchmarking of five major LLM families provides critical insight into their strengths and limitations as NLG evaluators. Our dataset is available at https://huggingface.co/datasets/YCWANGVINCE/DHP_Benchmark.
Published: 2024

7. KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches

Author: Yuan, Jiayi, Liu, Hongyi, Zhong, Shaochen, Chuang, Yu-Neng, Li, Songchen, Wang, Guanchu, Le, Duy, Jin, Hongye, Chaudhary, Vipin, Xu, Zhaozhuo, Liu, Zirui, and Hu, Xia
Subjects: Computer Science - Computation and Language
Abstract: Long context capability is a crucial competency for large language models (LLMs) as it mitigates the human struggle to digest long-form texts. This capability enables complex task-solving scenarios such as book summarization, code assistance, and many more tasks that are traditionally manpower-intensive. However, transformer-based LLMs face significant challenges with long context input due to the growing size of the KV cache and the intrinsic complexity of attending to extended inputs; where multiple schools of efficiency-driven approaches - such as KV cache quantization, token dropping, prompt compression, linear-time sequence models, and hybrid architectures - have been proposed to produce efficient yet long context-capable models. Despite these advancements, no existing work has comprehensively benchmarked these methods in a reasonably aligned environment. In this work, we fill this gap by providing a taxonomy of current methods and evaluating 10+ state-of-the-art approaches across seven categories of long context tasks. Our work reveals numerous previously unknown phenomena and offers insights - as well as a friendly workbench - for the future development of long context-capable LLMs. The source code is available at https://github.com/henryzhongsc/longctx_bench.
Published: 2024

8. LTSM-Bundle: A Toolbox and Benchmark on Large Language Models for Time Series Forecasting

Author: Chuang, Yu-Neng, Li, Songchen, Yuan, Jiayi, Wang, Guanchu, Lai, Kwei-Herng, Sui, Songyuan, Yu, Leisheng, Ding, Sirui, Chang, Chia-Yuan, Tan, Qiaoyu, Zha, Daochen, and Hu, Xia
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Time Series Forecasting (TSF) has long been a challenge in time series analysis. Inspired by the success of Large Language Models (LLMs), researchers are now developing Large Time Series Models (LTSMs)-universal transformer-based models that use autoregressive prediction-to improve TSF. However, training LTSMs on heterogeneous time series data poses unique challenges, including diverse frequencies, dimensions, and patterns across datasets. Recent endeavors have studied and evaluated various design choices aimed at enhancing LTSM training and generalization capabilities. However, these design choices are typically studied and evaluated in isolation and are not benchmarked collectively. In this work, we introduce LTSM-Bundle, a comprehensive toolbox, and benchmark for training LTSMs, spanning pre-processing techniques, model configurations, and dataset configuration. It modularized and benchmarked LTSMs from multiple dimensions, encompassing prompting strategies, tokenization approaches, training paradigms, base model selection, data quantity, and dataset diversity. Furthermore, we combine the most effective design choices identified in our study. Empirical results demonstrate that this combination achieves superior zero-shot and few-shot performances compared to state-of-the-art LTSMs and traditional TSF methods on benchmark datasets.
Published: 2024

9. SARS-CoV-2 brainstem encephalitis in human inherited DBR1 deficiency.

Author: Chan, Yi-Hao, Lundberg, Vanja, Le Pen, Jérémie, Yuan, Jiayi, Lee, Danyel, Pinci, Francesca, Volpi, Stefano, Nakajima, Koji, Bondet, Vincent, Åkesson, Sanna, Khobrekar, Noopur, Bodansky, Aaron, Du, Likun, Melander, Tina, Mariaggi, Alice-Andrée, Seeleuthner, Yoann, Saleh, Tariq, Chakravarty, Debanjana, Marits, Per, Dobbs, Kerry, Vonlanthen, Sofie, Hennings, Viktoria, Thörn, Karolina, Rinchai, Darawan, Bizien, Lucy, Chaldebas, Matthieu, Sobh, Ali, Özçelik, Tayfun, Keles, Sevgi, AlKhater, Suzan, Prando, Carolina, Meyts, Isabelle, Wilson, Michael, Rosain, Jérémie, Jouanguy, Emmanuelle, Aubart, Mélodie, Abel, Laurent, Mogensen, Trine, Pan-Hammarström, Qiang, Gao, Daxing, Duffy, Darragh, Cobat, Aurélie, Berg, Stefan, Notarangelo, Luigi, Harschnitz, Oliver, Rice, Charles, Studer, Lorenz, Casanova, Jean-Laurent, Ekwall, Olov, and Zhang, Shen-Ying
Subjects: Humans, Male, SARS-CoV-2, COVID-19, Brain Stem, Adolescent, Neurons, Encephalitis, Viral, Fibroblasts, Rhombencephalon
Abstract: Inherited deficiency of the RNA lariat-debranching enzyme 1 (DBR1) is a rare etiology of brainstem viral encephalitis. The cellular basis of disease and the range of viral predisposition are unclear. We report inherited DBR1 deficiency in a 14-year-old boy who suffered from isolated SARS-CoV-2 brainstem encephalitis. The patient is homozygous for a previously reported hypomorphic and pathogenic DBR1 variant (I120T). Consistently, DBR1 I120T/I120T fibroblasts from affected individuals from this and another unrelated kindred have similarly low levels of DBR1 protein and high levels of RNA lariats. DBR1 I120T/I120T human pluripotent stem cell (hPSC)-derived hindbrain neurons are highly susceptible to SARS-CoV-2 infection. Exogenous WT DBR1 expression in DBR1 I120T/I120T fibroblasts and hindbrain neurons rescued the RNA lariat accumulation phenotype. Moreover, expression of exogenous RNA lariats, mimicking DBR1 deficiency, increased the susceptibility of WT hindbrain neurons to SARS-CoV-2 infection. Inborn errors of DBR1 impair hindbrain neuron-intrinsic antiviral immunity, predisposing to viral infections of the brainstem, including that by SARS-CoV-2.
Published: 2024

10. LoRA-as-an-Attack! Piercing LLM Safety Under The Share-and-Play Scenario

Author: Liu, Hongyi, Liu, Zirui, Tang, Ruixiang, Yuan, Jiayi, Zhong, Shaochen, Chuang, Yu-Neng, Li, Li, Chen, Rui, and Hu, Xia
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Fine-tuning LLMs is crucial to enhancing their task-specific performance and ensuring model behaviors are aligned with human preferences. Among various fine-tuning methods, LoRA is popular for its efficiency and ease to use, allowing end-users to easily post and adopt lightweight LoRA modules on open-source platforms to tailor their model for different customization. However, such a handy share-and-play setting opens up new attack surfaces, that the attacker can render LoRA as an attacker, such as backdoor injection, and widely distribute the adversarial LoRA to the community easily. This can result in detrimental outcomes. Despite the huge potential risks of sharing LoRA modules, this aspect however has not been fully explored. To fill the gap, in this study we thoroughly investigate the attack opportunities enabled in the growing share-and-play scenario. Specifically, we study how to inject backdoor into the LoRA module and dive deeper into LoRA's infection mechanisms. We found that training-free mechanism is possible in LoRA backdoor injection. We also discover the impact of backdoor attacks with the presence of multiple LoRA adaptions concurrently as well as LoRA based backdoor transferability. Our aim is to raise awareness of the potential risks under the emerging share-and-play scenario, so as to proactively prevent potential consequences caused by LoRA-as-an-Attack. Warning: the paper contains potential offensive content generated by models.
Published: 2024

11. KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

Author: Liu, Zirui, Yuan, Jiayi, Jin, Hongye, Zhong, Shaochen, Xu, Zhaozhuo, Braverman, Vladimir, Chen, Beidi, and Hu, Xia
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning, Computer Science - Performance
Abstract: Efficiently serving large language models (LLMs) requires batching of many requests to reduce the cost per request. Yet, with larger batch sizes and longer context lengths, the key-value (KV) cache, which stores attention keys and values to avoid re-computations, significantly increases memory demands and becomes the new bottleneck in speed and memory usage. Additionally, the loading of the KV cache causes the computational core to be idle, which limits the inference speed. A straightforward and effective solution to reduce KV cache size is quantization, which decreases the total bytes taken by KV cache. However, there is a lack of in-depth studies that explore the element distribution of KV cache to understand the hardness and limitation of KV cache quantization. To fill the gap, we conducted a comprehensive study on the element distribution in KV cache of popular LLMs. Our findings indicate that the key cache should be quantized per-channel, i.e., group elements along the channel dimension and quantize them together. In contrast, the value cache should be quantized per-token. From this analysis, we developed a tuning-free 2bit KV cache quantization algorithm named KIVI. With hardware-friendly implementation, KIVI can enable Llama, Falcon, and Mistral models to maintain almost the same quality while using $\mathbf{2.6\times}$ less peak memory (including model weight). This reduction in memory usage enables up to $\mathbf{4\times}$ larger batch size, bringing $\mathbf{2.35\times \sim 3.47\times}$ throughput on real LLM inference workload. The source code is available at https://github.com/jy-yuan/KIVI., Comment: ICML2024
Published: 2024
Full Text: View/download PDF

12. Setting the Trap: Capturing and Defeating Backdoors in Pretrained Language Models through Honeypots

Author: Tang, Ruixiang, Yuan, Jiayi, Li, Yiming, Liu, Zirui, Chen, Rui, and Hu, Xia
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: In the field of natural language processing, the prevalent approach involves fine-tuning pretrained language models (PLMs) using local samples. Recent research has exposed the susceptibility of PLMs to backdoor attacks, wherein the adversaries can embed malicious prediction behaviors by manipulating a few training samples. In this study, our objective is to develop a backdoor-resistant tuning procedure that yields a backdoor-free model, no matter whether the fine-tuning dataset contains poisoned samples. To this end, we propose and integrate a honeypot module into the original PLM, specifically designed to absorb backdoor information exclusively. Our design is motivated by the observation that lower-layer representations in PLMs carry sufficient backdoor features while carrying minimal information about the original tasks. Consequently, we can impose penalties on the information acquired by the honeypot module to inhibit backdoor creation during the fine-tuning process of the stem network. Comprehensive experiments conducted on benchmark datasets substantiate the effectiveness and robustness of our defensive strategy. Notably, these results indicate a substantial reduction in the attack success rate ranging from 10\% to 40\% when compared to prior state-of-the-art methods.
Published: 2023

13. Two-dimensional ultrathin vanadium oxide nanosheets as catalytic bactericide

Author: Zhang, Zhimin, Guo, Zhao, Ruan, Zesong, Ge, Min, Cao, Shibo, Yuan, Jiayi, Xu, Zhen, Fan, Lieying, Zong, Ming, Lin, Han, and Shi, Jianlin
Published: 2024
Full Text: View/download PDF

14. Can Attention Be Used to Explain EHR-Based Mortality Prediction Tasks: A Case Study on Hemorrhagic Stroke

Author: Feng, Qizhang, Yuan, Jiayi, Emdad, Forhan Bin, Hanna, Karim, Hu, Xia, and He, Zhe
Subjects: Computer Science - Machine Learning
Abstract: Stroke is a significant cause of mortality and morbidity, necessitating early predictive strategies to minimize risks. Traditional methods for evaluating patients, such as Acute Physiology and Chronic Health Evaluation (APACHE II, IV) and Simplified Acute Physiology Score III (SAPS III), have limited accuracy and interpretability. This paper proposes a novel approach: an interpretable, attention-based transformer model for early stroke mortality prediction. This model seeks to address the limitations of previous predictive models, providing both interpretability (providing clear, understandable explanations of the model) and fidelity (giving a truthful explanation of the model's dynamics from input to output). Furthermore, the study explores and compares fidelity and interpretability scores using Shapley values and attention-based scores to improve model explainability. The research objectives include designing an interpretable attention-based transformer model, evaluating its performance compared to existing models, and providing feature importance derived from the model.
Published: 2023

15. TCAF1 promotes TRPV2-mediated Ca2+ release in response to cytosolic DNA to protect stressed replication forks

Author: Kong, Lingzhen, Cheng, Chen, Cheruiyot, Abigael, Yuan, Jiayi, Yang, Yichan, Hwang, Sydney, Foust, Daniel, Tsao, Ning, Wilkerson, Emily, Mosammaparast, Nima, Major, Michael B., Piston, David W., Li, Shan, and You, Zhongsheng
Published: 2024
Full Text: View/download PDF

16. NetBooster: Empowering Tiny Deep Learning By Standing on the Shoulders of Deep Giants

Author: Yu, Zhongzhi, Fu, Yonggan, Yuan, Jiayi, You, Haoran, and Lin, Yingyan
Subjects: Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Tiny deep learning has attracted increasing attention driven by the substantial demand for deploying deep learning on numerous intelligent Internet-of-Things devices. However, it is still challenging to unleash tiny deep learning's full potential on both large-scale datasets and downstream tasks due to the under-fitting issues caused by the limited model capacity of tiny neural networks (TNNs). To this end, we propose a framework called NetBooster to empower tiny deep learning by augmenting the architectures of TNNs via an expansion-then-contraction strategy. Extensive experiments show that NetBooster consistently outperforms state-of-the-art tiny deep learning solutions.
Published: 2023

17. Gen-NeRF: Efficient and Generalizable Neural Radiance Fields via Algorithm-Hardware Co-Design

Author: Fu, Yonggan, Ye, Zhifan, Yuan, Jiayi, Zhang, Shunyao, Li, Sixu, You, Haoran, and Lin, Yingyan Celine
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Hardware Architecture
Abstract: Novel view synthesis is an essential functionality for enabling immersive experiences in various Augmented- and Virtual-Reality (AR/VR) applications, for which generalizable Neural Radiance Fields (NeRFs) have gained increasing popularity thanks to their cross-scene generalization capability. Despite their promise, the real-device deployment of generalizable NeRFs is bottlenecked by their prohibitive complexity due to the required massive memory accesses to acquire scene features, causing their ray marching process to be memory-bounded. To this end, we propose Gen-NeRF, an algorithm-hardware co-design framework dedicated to generalizable NeRF acceleration, which for the first time enables real-time generalizable NeRFs. On the algorithm side, Gen-NeRF integrates a coarse-then-focus sampling strategy, leveraging the fact that different regions of a 3D scene contribute differently to the rendered pixel, to enable sparse yet effective sampling. On the hardware side, Gen-NeRF highlights an accelerator micro-architecture to maximize the data reuse opportunities among different rays by making use of their epipolar geometric relationship. Furthermore, our Gen-NeRF accelerator features a customized dataflow to enhance data locality during point-to-hardware mapping and an optimized scene feature storage strategy to minimize memory bank conflicts. Extensive experiments validate the effectiveness of our proposed Gen-NeRF framework in enabling real-time and generalizable novel view synthesis., Comment: Accepted by ISCA 2023
Published: 2023

18. Robust Tickets Can Transfer Better: Drawing More Transferable Subnetworks in Transfer Learning

Author: Fu, Yonggan, Yuan, Ye, Wu, Shang, Yuan, Jiayi, and Lin, Yingyan Celine
Subjects: Computer Science - Machine Learning
Abstract: Transfer learning leverages feature representations of deep neural networks (DNNs) pretrained on source tasks with rich data to empower effective finetuning on downstream tasks. However, the pretrained models are often prohibitively large for delivering generalizable representations, which limits their deployment on edge devices with constrained resources. To close this gap, we propose a new transfer learning pipeline, which leverages our finding that robust tickets can transfer better, i.e., subnetworks drawn with properly induced adversarial robustness can win better transferability over vanilla lottery ticket subnetworks. Extensive experiments and ablation studies validate that our proposed transfer learning pipeline can achieve enhanced accuracy-sparsity trade-offs across both diverse downstream tasks and sparsity patterns, further enriching the lottery ticket hypothesis., Comment: Accepted by DAC 2023
Published: 2023

19. Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching

Author: Yuan, Jiayi, Tang, Ruixiang, Jiang, Xiaoqian, and Hu, Xia
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: The process of matching patients with suitable clinical trials is essential for advancing medical research and providing optimal care. However, current approaches face challenges such as data standardization, ethical considerations, and a lack of interoperability between Electronic Health Records (EHRs) and clinical trial criteria. In this paper, we explore the potential of large language models (LLMs) to address these challenges by leveraging their advanced natural language generation capabilities to improve compatibility between EHRs and clinical trial descriptions. We propose an innovative privacy-aware data augmentation approach for LLM-based patient-trial matching (LLM-PTM), which balances the benefits of LLMs while ensuring the security and confidentiality of sensitive patient data. Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%. Additionally, we present case studies to further illustrate the effectiveness of our approach and provide a deeper understanding of its underlying principles.
Published: 2023

20. Towards Fair Patient-Trial Matching via Patient-Criterion Level Fairness Constraint

Author: Chang, Chia-Yuan, Yuan, Jiayi, Ding, Sirui, Tan, Qiaoyu, Zhang, Kai, Jiang, Xiaoqian, Hu, Xia, and Zou, Na
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computers and Society
Abstract: Clinical trials are indispensable in developing new treatments, but they face obstacles in patient recruitment and retention, hindering the enrollment of necessary participants. To tackle these challenges, deep learning frameworks have been created to match patients to trials. These frameworks calculate the similarity between patients and clinical trial eligibility criteria, considering the discrepancy between inclusion and exclusion criteria. Recent studies have shown that these frameworks outperform earlier approaches. However, deep learning models may raise fairness issues in patient-trial matching when certain sensitive groups of individuals are underrepresented in clinical trials, leading to incomplete or inaccurate data and potential harm. To tackle the issue of fairness, this work proposes a fair patient-trial matching framework by generating a patient-criterion level fairness constraint. The proposed framework considers the inconsistency between the embedding of inclusion and exclusion criteria among patients of different sensitive groups. The experimental results on real-world patient-trial and patient-criterion matching tasks demonstrate that the proposed framework can successfully alleviate the predictions that tend to be biased.
Published: 2023

21. ERSAM: Neural Architecture Search For Energy-Efficient and Real-Time Social Ambiance Measurement

Author: Li, Chaojian, Chen, Wenwan, Yuan, Jiayi, Lin, Yingyan, and Sabharwal, Ashutosh
Subjects: Computer Science - Machine Learning, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Social ambiance describes the context in which social interactions happen, and can be measured using speech audio by counting the number of concurrent speakers. This measurement has enabled various mental health tracking and human-centric IoT applications. While on-device Socal Ambiance Measure (SAM) is highly desirable to ensure user privacy and thus facilitate wide adoption of the aforementioned applications, the required computational complexity of state-of-the-art deep neural networks (DNNs) powered SAM solutions stands at odds with the often constrained resources on mobile devices. Furthermore, only limited labeled data is available or practical when it comes to SAM under clinical settings due to various privacy constraints and the required human effort, further challenging the achievable accuracy of on-device SAM solutions. To this end, we propose a dedicated neural architecture search framework for Energy-efficient and Real-time SAM (ERSAM). Specifically, our ERSAM framework can automatically search for DNNs that push forward the achievable accuracy vs. hardware efficiency frontier of mobile SAM solutions. For example, ERSAM-delivered DNNs only consume 40 mW x 12 h energy and 0.05 seconds processing latency for a 5 seconds audio segment on a Pixel 3 phone, while only achieving an error rate of 14.3% on a social ambiance dataset generated by LibriSpeech. We can expect that our ERSAM framework can pave the way for ubiquitous on-device SAM solutions which are in growing demand., Comment: Accepted by ICASSP'23
Published: 2023

22. Recurrent Structure Attention Guidance for Depth Super-Resolution

Author: Yuan, Jiayi, Jiang, Haobo, Li, Xiang, Qian, Jianjun, Li, Jun, and Yang, Jian
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Image guidance is an effective strategy for depth super-resolution. Generally, most existing methods employ hand-crafted operators to decompose the high-frequency (HF) and low-frequency (LF) ingredients from low-resolution depth maps and guide the HF ingredients by directly concatenating them with image features. However, the hand-designed operators usually cause inferior HF maps (e.g., distorted or structurally missing) due to the diverse appearance of complex depth maps. Moreover, the direct concatenation often results in weak guidance because not all image features have a positive effect on the HF maps. In this paper, we develop a recurrent structure attention guided (RSAG) framework, consisting of two important parts. First, we introduce a deep contrastive network with multi-scale filters for adaptive frequency-domain separation, which adopts contrastive networks from large filters to small ones to calculate the pixel contrasts for adaptive high-quality HF predictions. Second, instead of the coarse concatenation guidance, we propose a recurrent structure attention block, which iteratively utilizes the latest depth estimation and the image features to jointly select clear patterns and boundaries, aiming at providing refined guidance for accurate depth recovery. In addition, we fuse the features of HF maps to enhance the edge structures in the decomposed LF maps. Extensive experiments show that our approach obtains superior performance compared with state-of-the-art depth super-resolution methods., Comment: Accepted by AAAI-2023
Published: 2023

23. Structure Flow-Guided Network for Real Depth Super-Resolution

Author: Yuan, Jiayi, Jiang, Haobo, Li, Xiang, Qian, Jianjun, Li, Jun, and Yang, Jian
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Real depth super-resolution (DSR), unlike synthetic settings, is a challenging task due to the structural distortion and the edge noise caused by the natural degradation in real-world low-resolution (LR) depth maps. These defeats result in significant structure inconsistency between the depth map and the RGB guidance, which potentially confuses the RGB-structure guidance and thereby degrades the DSR quality. In this paper, we propose a novel structure flow-guided DSR framework, where a cross-modality flow map is learned to guide the RGB-structure information transferring for precise depth upsampling. Specifically, our framework consists of a cross-modality flow-guided upsampling network (CFUNet) and a flow-enhanced pyramid edge attention network (PEANet). CFUNet contains a trilateral self-attention module combining both the geometric and semantic correlations for reliable cross-modality flow learning. Then, the learned flow maps are combined with the grid-sampling mechanism for coarse high-resolution (HR) depth prediction. PEANet targets at integrating the learned flow map as the edge attention into a pyramid network to hierarchically learn the edge-focused guidance feature for depth edge refinement. Extensive experiments on real and synthetic DSR datasets verify that our approach achieves excellent performance compared to state-of-the-art methods., Comment: Accepted by AAAI-2023
Published: 2023

24. One-pot direct conversion of raw straw to furan chemicals simultaneously in a choline chloride-lactic acid/methyl isobutyl ketone biphasic system

Author: Yuan, Jiayi, Chen, Anwei, Chai, Youzheng, Bai, Ma, Zhu, Shiye, Peng, Liang, and Zhang, Jiachao
Published: 2024
Full Text: View/download PDF

25. Three essays on mergers and acquisitions

Author: Yuan, Jiayi
Subjects: HG Finance
Abstract: This thesis explores the motives and effects of mergers and acquisitions (M&As) in different institutional contexts. The first paper shows that M&As help firms navigate the challenges arising from green policies. It exploits the adoption of the NOx Budget Program (NBP) across U.S. states as an exogenous shock to investigate the impact of climate policy risk on M&As. The results suggest that manufacturers headquartered in NBP-participating states are more likely to initiate M&A deals to reduce future cash flow volatilities and financial distress risks. The M&A involvement is more pronounced for firms with a higher degree of financial distress, firms with greater product market competition, and firms facing more firm-level uncertainty shocks. Further, treated firms are less likely to finance M&A transactions with cash and tend to pay higher takeover premiums to induce target shareholders to accept stock payments. Collectively, this paper revisits the motives and effects of M&As in the context of green policies and supports the neoclassical view of M&As as being an effective tool for diversifying risks and regaining financial strength. The second paper provides evidence that the presence of top-tier advisors increases managers' propensity for withdrawing cross-border mergers and acquisitions (CBAs) with poor market returns around the announcement. This effect is stronger for private target acquisitions where information asymmetry is expected to be more pronounced, and for smaller bidders who are likely to lack the expertise to process information on themselves. This suggests that managers assisted by reputable investment banks consider the negative market feedback in informationally challenging deals. This paper provides novel inferences about the informative role of stock markets in shaping advisory roles for cross-border M&As. The third paper finds that managers of Chinese state-sponsored bidders broaden their acquisition targets from the strict patent-intensive sectors ("high-tech" industries only) to the loose patent-intensive sectors (a wider range of technology sectors). In doing so, they satisfy the Chinese government's call to acquire innovative targets, while at the same time they mitigate the resistance of host governments to the growing Chinese influence on the corporate scene. We show that Chinese state-sponsored bidders face fewer setbacks, improve the innovation growth rate, and protect shareholder wealth when they target firms from the loose innovation-intensive sectors.
Published: 2023

26. 3,5-Dicaffeoylquinic acid promotes intestinal urate excretion via the MAPK signaling pathway based on Caco-2 Cell model

Author: Liao, Jiajing, Qian, Jin, Rao, Lijuan, Lin, Suqin, Wang, Chen, Xu, Linqian, Yuan, Bing, Yuan, Jiayi, Wan, Yin, and Fu, Guiming
Published: 2025
Full Text: View/download PDF

27. Functional triboelectric nanogenerator based on shear stiffening gel for impact protection and self-powered motion monitoring

Author: Fang, Yingling, Shen, Zhilin, Gong, Jixin, Yuan, Jiayi, Zhu, Chengtao, and Yuan, Bihe
Published: 2025
Full Text: View/download PDF

28. Carbon footprints of incineration, pyrolysis, and gasification for sewage sludge treatment

Author: Chang, Huimin, Yuan, Jiayi, Zhao, Yan, Bisinella, Valentina, Damgaard, Anders, and Christensen, Thomas H.
Published: 2025
Full Text: View/download PDF

29. Multiple strategies enhance the efficacy of MSC-Exos transplantation for spinal cord injury

Author: Xu, Yan, Wang, Xuesong, Zhou, Xiaolei, Zeng, Wenhui, Yuan, Jiayi, and Ye, Junsong
Published: 2025
Full Text: View/download PDF

30. Impurity tolerance and effects in catalytic cracking of household mixed plastic waste

Author: Wang, Bowen, Yuan, Jiayi, Zhao, Silan, Wang, Shengwei, and Zhao, Yan
Published: 2025
Full Text: View/download PDF

31. An unsupervised water quality anomaly detection method based on a combination of time-frequency analysis and clustering

Author: Ni, Qingjian, Cao, Xuehan, Zhao, Ziqi, Yuan, Jiayi, and Tan, Chaoqun
Published: 2024
Full Text: View/download PDF

32. EyeCoD: Eye Tracking System Acceleration via FlatCam-based Algorithm & Accelerator Co-Design

Author: You, Haoran, Wan, Cheng, Zhao, Yang, Yu, Zhongzhi, Fu, Yonggan, Yuan, Jiayi, Wu, Shang, Zhang, Shunyao, Zhang, Yongan, Li, Chaojian, Boominathan, Vivek, Veeraraghavan, Ashok, Li, Ziyun, and Lin, Yingyan Celine
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Hardware Architecture
Abstract: Eye tracking has become an essential human-machine interaction modality for providing immersive experience in numerous virtual and augmented reality (VR/AR) applications desiring high throughput (e.g., 240 FPS), small-form, and enhanced visual privacy. However, existing eye tracking systems are still limited by their: (1) large form-factor largely due to the adopted bulky lens-based cameras; and (2) high communication cost required between the camera and backend processor, thus prohibiting their more extensive applications. To this end, we propose a lensless FlatCam-based eye tracking algorithm and accelerator co-design framework dubbed EyeCoD to enable eye tracking systems with a much reduced form-factor and boosted system efficiency without sacrificing the tracking accuracy, paving the way for next-generation eye tracking solutions. On the system level, we advocate the use of lensless FlatCams to facilitate the small form-factor need in mobile eye tracking systems. On the algorithm level, EyeCoD integrates a predict-then-focus pipeline that first predicts the region-of-interest (ROI) via segmentation and then only focuses on the ROI parts to estimate gaze directions, greatly reducing redundant computations and data movements. On the hardware level, we further develop a dedicated accelerator that (1) integrates a novel workload orchestration between the aforementioned segmentation and gaze estimation models, (2) leverages intra-channel reuse opportunities for depth-wise layers, and (3) utilizes input feature-wise partition to save activation memory size. On-silicon measurement validates that our EyeCoD consistently reduces both the communication and computation costs, leading to an overall system speedup of 10.95x, 3.21x, and 12.85x over CPUs, GPUs, and a prior-art eye tracking processor called CIS-GEP, respectively, while maintaining the tracking accuracy., Comment: Accepted by ISCA 2022; Also selected as an IEEE Micro's Top Pick of 2023
Published: 2022
Full Text: View/download PDF

33. DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks

Author: Fu, Yonggan, Yang, Haichuan, Yuan, Jiayi, Li, Meng, Wan, Cheng, Krishnamoorthi, Raghuraman, Chandra, Vikas, and Lin, Yingyan Celine
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: Efficient deep neural network (DNN) models equipped with compact operators (e.g., depthwise convolutions) have shown great potential in reducing DNNs' theoretical complexity (e.g., the total number of weights/operations) while maintaining a decent model accuracy. However, existing efficient DNNs are still limited in fulfilling their promise in boosting real-hardware efficiency, due to their commonly adopted compact operators' low hardware utilization. In this work, we open up a new compression paradigm for developing real-hardware efficient DNNs, leading to boosted hardware efficiency while maintaining model accuracy. Interestingly, we observe that while some DNN layers' activation functions help DNNs' training optimization and achievable accuracy, they can be properly removed after training without compromising the model accuracy. Inspired by this observation, we propose a framework dubbed DepthShrinker, which develops hardware-friendly compact networks via shrinking the basic building blocks of existing efficient DNNs that feature irregular computation patterns into dense ones with much improved hardware utilization and thus real-hardware efficiency. Excitingly, our DepthShrinker framework delivers hardware-friendly compact networks that outperform both state-of-the-art efficient DNNs and compression techniques, e.g., a 3.06% higher accuracy and 1.53$\times$ throughput on Tesla V100 over SOTA channel-wise pruning method MetaPruning. Our codes are available at: https://github.com/facebookresearch/DepthShrinker., Comment: Accepted at ICML 2022
Published: 2022

34. Multi-crosslinked gelatin-based composite hydrogel featuring high thermoelectric performance and excellent flame retardancy for intelligent fire-warning system

Author: Yang, Lujia, Zhou, Yichen, Xu, Jiaojiao, Ma, Xinyi, Yuan, Jiayi, and Yuan, Bihe
Published: 2024
Full Text: View/download PDF

35. Chief executive officer marital status and corporate credit ratings

Author: Cai, Xiangshang, Gao, Yang, Wu, Zhiting, and Yuan, Jiayi
Published: 2024
Full Text: View/download PDF

36. Knowledge Graph of Principles and Applications of Soil Control in the Loess Plateau

Author: CUI Caixian, SHEN Lin, YUAN Jiayi, YI Ya’nan, WANG Zhonggang, and LI Lingling
Subjects: the loess plateau, principles and applications of soil control, citespace, visualization, collaborative capacity improvement, Environmental sciences, GE1-350, Agriculture
Abstract: [Objective] The Loess Plateau is a great geographical unit of China, which has important strategic significance for national food security and ecological security. Through long-term slope management, comprehensive watershed management, afforestation and other measures, the ecological environment of the Loess Plateau has been significantly improved, but it still faces challenges in such aspects as insufficient rationality of vegetation restoration in local areas, uncoordinated overall planning of the whole basin, and balanced economic development and ecological protection. Under the guidance of the national dual-carbon goal, the future direction and goal of the Loess Plateau ecosystem and agricultural sustainable development should be to support the realization of carbon neutrality, establish an environmentally suitable vegetation community, promote the development of the ecosystem as a whole, and realize the green transformation of the development mode. [Methods] Based on the bibliometrics tool CiteSpace, this paper visually analyzes the relevant literatures on the principle and application of soil regulation in the Loess Plateau published from 1992 to 2023, focusing on the core research strength, research hotspot and research trend in this research field. [Results] (1) From the perspective of the evolution of the number of published papers, the overall trend is rising, and the growth rate is explosive after 2001; (2) From the perspective of publishing institutions, Northwest A&F University has the highest number of publishing institutions in this field, with 2 157; (3) From the perspective of the cooperative relationship between authors, Academician SHAO Mingan and Professor LIU Guobin are the core leaders in this field, and the number of published papers is 141 and 117 respectively. The cooperative network formed by them has been very mature; (4) From the perspective of research hotspots and development trends in this field, the main research hotspots focus on soil moisture, yield, dryland, soil erosion, soil nutrients and vegetation restoration, etc. The future micro theme trend will focus on yield, slope position and nitrogen application amount, while the future macro trend will explore a new model of the coupling and coordinated development of eco-economic-social system. [Conclusion] Based on the results of visual analysis, three perspectives on the principles and application of soil regulation in the Loess Plateau were proposed, including cooperative research, research hotspots and future key development directions.
Published: 2024
Full Text: View/download PDF

37. Life cycle carbon footprints of alternative sludge pretreatment technologies to recover carbon sources for denitrification

Author: Zhao, Silan, Xu, Yingjie, Yuan, Jiayi, Chang, Huimin, Wang, Shengwei, and Zhao, Yan
Published: 2024
Full Text: View/download PDF

38. Orthogonally woven 3D nanofiber scaffolds promote rapid soft tissue regeneration by enhancing bidirectional cell migration

Author: Yuan, Jiayi, Sun, Bingbing, Ma, Weixing, Cai, Chao, Huang, Zhenzhen, Zhou, Peiyi, Yi, Lei, Liu, Lubin, and Chen, Shixuan
Published: 2024
Full Text: View/download PDF

39. Deep integration of metabolome and transcriptome characterizes alkaloid metabolism in Houttuynia cordata

Author: Jiang, Xue, Wang, Qian, Yang, Jingtian, Du, Baoguo, Yuan, Zhaodi, Liu, Hongyi, Yuan, Jiayi, Zhang, Yang, Chen, Liao, and Liu, Lei
Published: 2024
Full Text: View/download PDF

40. Single-cell massively-parallel multiplexed microbial sequencing (M3-seq) identifies rare bacterial populations and profiles phage infection

Author: Wang, Bruce, Lin, Aaron E., Yuan, Jiayi, Novak, Katherine E., Koch, Matthias D., Wingreen, Ned S., Adamson, Britt, and Gitai, Zemer
Published: 2023
Full Text: View/download PDF

41. Strategic choice of the management of disposable meal boxes from the perspectives of life cycle impact assessment: Recycling fossil plastics or promoting biogenic plastics

Author: Lin, Guannv, Yuan, Jiayi, Li, Xiang, Zhao, Silan, Wang, Shengwei, Chang, Huimin, and Zhao, Yan
Published: 2024
Full Text: View/download PDF

42. Research on the Performance Evaluation of Digital Trade Based on the 'VHSD-EM' Model Under Dual Circulation

Author: Xiang, Yijun, Yuan, Jiayi, Howlett, Robert J., Series Editor, Jain, Lakhmi C., Series Editor, Hassanien, Aboul Ella, editor, Zheng, Dequan, editor, Zhao, Zhijie, editor, and Fan, Zhipeng, editor
Published: 2023
Full Text: View/download PDF

43. Soluble signal inhibitory receptor on leukocytes-1 reflects disease activity and assists diagnosis of patients with rheumatoid arthritis

Author: Xv, Zhen, Xv, Xuejing, Chen, Nianzhen, Yuan, Jiayi, Li, Jing, Wang, Lan, Yu, Shanshan, Li, Gen, Ding, Menglei, Zong, Ming, and Fan, Lieying
Published: 2024
Full Text: View/download PDF

44. The impact of sentiment consistency on bank risk: A dual perspective from bank managers and credit rating experts

Author: Yuan, Jiayi and Wei, Lu
Published: 2024
Full Text: View/download PDF

45. Adaptive scalable spatio-temporal graph convolutional network for PM2.5 prediction

Author: Ni, Qingjian, Wang, Yuhui, and Yuan, Jiayi
Published: 2023
Full Text: View/download PDF

46. Material and C/N flow analyses of alternative sludge pretreatment technologies for recovering carbon sources for denitrification

Author: Yuan, Jiayi, Xu, Yingjie, Zhao, Silan, Li, Xiang, Chang, Huimin, and Zhao, Yan
Published: 2023
Full Text: View/download PDF

47. Co-Mn-Fe spinel-carbon composite catalysts enhanced persulfate activation for degradation of neonicotinoid insecticides: (Non) radical path identification, degradation pathway and toxicity analysis

Author: Bai, Ma, Chai, Youzheng, Chen, Anwei, Shao, Jihai, Zhu, Shiye, Yuan, Jiayi, Yang, Zhenghang, Xiong, Jiahao, Jin, Doudou, Zhao, Keqi, and Chen, Yanziyun
Published: 2023
Full Text: View/download PDF

48. Bi-functional biochar-g-C3N4-MgO composites for simultaneously minimizing pollution：Photocatalytic degradation of pesticide and phosphorus recovery as slow-release fertilizer

Author: An, Xiongfang, Xu, Xiaolin, Guo, Weijie, Chen, Zepu, Miao, Zhiyin, Yuan, Jiayi, and Wu, Zhansheng
Published: 2023
Full Text: View/download PDF

49. An olefin-linked pyridinium covalent organic frameworks bearing donor–acceptor structure for highly efficient photocatalytic organic transformations

Author: Ma, Baiwei, Yang, Xin, Yuan, Jiayi, Yang, Xiubei, Han, Diandian, Zhao, Kun, Lin, Chunlei, Wang, Lei, Liu, Guoqun, and Mi, Liwei
Published: 2023
Full Text: View/download PDF

50. Does exposure to air pollution during different time windows affect pregnancy outcomes of in vitro fertilization treatment? A systematic review and meta-analysis

Author: Liu, Junjie, Dai, Yanpeng, Yuan, Jiayi, Li, Runqing, Hu, Yaolong, and Su, Yanhua
Published: 2023
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

312 results on '"Yuan, Jiayi"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources