Author: "Tian, Yuan" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Tian, Yuan"' showing total 15,235 results

1. InterTrans: Leveraging Transitive Intermediate Translations to Enhance LLM-based Code Translation

Author: Macedo, Marcos, Tian, Yuan, Nie, Pengyu, Cogo, Filipe R., and Adams, Bram
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence
Abstract: Code translation aims to convert a program from one programming language (PL) to another. This long-standing software engineering task is crucial for modernizing legacy systems, ensuring cross-platform compatibility, enhancing performance, and more. However, automating this process remains challenging due to many syntactic and semantic differences between PLs. Recent studies show that even advanced techniques such as large language models (LLMs), especially open-source LLMs, still struggle with the task. Currently, code LLMs are trained with source code from multiple programming languages, thus presenting multilingual capabilities. In this paper, we investigate whether such multilingual capabilities can be harnessed to enhance code translation. To achieve this goal, we introduce InterTrans, an LLM-based automated code translation approach that, in contrast to existing approaches, leverages intermediate translations across PLs to bridge the syntactic and semantic gaps between source and target PLs. InterTrans contains two stages. It first utilizes a novel Tree of Code Translation (ToCT) algorithm to plan transitive intermediate translation sequences between a given source and target PL, then validates them in a specific order. We evaluate InterTrans with three open LLMs on three benchmarks (i.e., CodeNet, HumanEval-X, and TransCoder) involving six PLs. Results show an absolute improvement of 18.3% to 43.3% in Computation Accuracy (CA) for InterTrans over Direct Translation with 10 attempts. The best-performing variant of InterTrans (with Magicoder LLM) achieved an average CA of 87.3%-95.4% on three benchmarks.
Published: 2024
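
The planning stage described in the abstract above can be illustrated with a minimal sketch. It only enumerates candidate source-to-target language sequences through optional intermediate PLs; it is not the authors' ToCT algorithm. The function name `plan_translation_paths`, the `max_intermediates` parameter, and the shortest-first ordering are assumptions made for illustration.

```python
from itertools import permutations


def plan_translation_paths(source, target, languages, max_intermediates=2):
    """Enumerate candidate translation sequences from source to target.

    Hypothetical helper: each path is a list of languages starting at
    `source` and ending at `target`, with up to `max_intermediates`
    distinct intermediate languages. Shorter paths come first (an
    assumed ordering, not one stated in the abstract).
    """
    intermediates = [pl for pl in languages if pl not in (source, target)]
    paths = []
    for k in range(max_intermediates + 1):
        # permutations() yields ordered selections without repetition,
        # so no language appears twice in a path.
        for middle in permutations(intermediates, k):
            paths.append([source, *middle, target])
    return paths


paths = plan_translation_paths(
    "Python", "Java", ["Python", "Java", "Go", "Rust"], max_intermediates=1
)
for path in paths:
    print(" -> ".join(path))
```

In a full pipeline, each path would then be translated hop by hop and checked against tests, stopping at the first path that passes.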

2. FirmRCA: Towards Post-Fuzzing Analysis on ARM Embedded Firmware with Efficient Event-based Fault Localization

Author: Chang, Boyu, Zhao, Binbin, Zhang, Qiao, Liu, Peiyu, Tian, Yuan, Beyah, Raheem, and Ji, Shouling
Subjects: Computer Science - Cryptography and Security
Abstract: While fuzzing has demonstrated its effectiveness in exposing vulnerabilities within embedded firmware, the discovery of crashing test cases is only the first step in improving the security of these critical systems. The subsequent fault localization process, which aims to precisely identify the root causes of observed crashes, is a crucial yet time-consuming post-fuzzing work. Unfortunately, the automated root cause analysis on embedded firmware crashes remains an underexplored area, which is challenging from several perspectives: (1) the fuzzing campaign towards the embedded firmware lacks adequate debugging mechanisms, making it hard to automatically extract essential runtime information for analysis; (2) the inherent raw binary nature of embedded firmware often leads to over-tainted and noisy suspicious instructions, which provides limited guidance for analysts in manually investigating the root cause and remediating the underlying vulnerability. To address these challenges, we design and implement FirmRCA, a practical fault localization framework tailored specifically for embedded firmware. FirmRCA introduces an event-based footprint collection approach to aid and significantly expedite reverse execution. Next, to solve the complicated memory alias problem, FirmRCA proposes a history-driven method by tracking data propagation through the execution trace, enabling precise identification of deep crash origins. Finally, FirmRCA proposes a novel strategy to highlight key instructions related to the root cause, providing practical guidance in the final investigation. We evaluate FirmRCA with both synthetic and real-world targets, including 41 crashing test cases across 17 firmware images. The results show that FirmRCA can effectively (92.7% success rate) identify the root cause of crashing test cases within the top 10 instructions.
Comment: To appear in the IEEE Symposium on Security and Privacy (IEEE S&P) 2025, San Francisco, CA, USA
Published: 2024

3. Rethinking Data Selection at Scale: Random Selection is Almost All You Need

Author: Xia, Tingyu, Yu, Bowen, Dang, Kai, Yang, An, Wu, Yuan, Tian, Yuan, Chang, Yi, and Lin, Junyang
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Supervised fine-tuning (SFT) is crucial for aligning Large Language Models (LLMs) with human instructions. The primary goal during SFT is to select a small yet representative subset of training data from the larger pool, such that fine-tuning with this subset achieves results comparable to or even exceeding those obtained using the entire dataset. However, most existing data selection techniques are designed for small-scale data pools, which fail to meet the demands of real-world SFT scenarios. In this paper, we replicated several self-scoring methods (those that do not rely on external model assistance) on two-million-scale datasets, and found that nearly all methods struggled to significantly outperform random selection when dealing with such large-scale data pools. Moreover, our comparisons suggest that, during SFT, diversity in data selection is more critical than simply focusing on high-quality data. We also analyzed the limitations of several current approaches, explaining why they perform poorly on large-scale datasets and why they are unsuitable for such contexts. Finally, we found that filtering data by token length offers a stable and efficient method for improving results. This approach, particularly when training on long text data, proves highly beneficial for relatively weaker base models, such as Llama3.
Published: 2024

4. R-Bench: Are your Large Multimodal Model Robust to Real-world Corruptions?

Author: Li, Chunyi, Zhang, Jianbo, Zhang, Zicheng, Wu, Haoning, Tian, Yuan, Sun, Wei, Lu, Guo, Liu, Xiaohong, Min, Xiongkuo, Lin, Weisi, and Zhai, Guangtao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: The outstanding performance of Large Multimodal Models (LMMs) has made them widely applied in vision-related tasks. However, various corruptions in the real world mean that images will not be as ideal as in simulations, presenting significant challenges for the practical application of LMMs. To address this issue, we introduce R-Bench, a benchmark focused on the Real-world Robustness of LMMs. Specifically, we: (a) model the complete link from user capture to LMM reception, comprising 33 corruption dimensions, including 7 steps according to the corruption sequence, and 7 groups based on low-level attributes; (b) collect a reference/distorted image dataset before/after corruption, including 2,970 question-answer pairs with human labeling; (c) propose a comprehensive evaluation of absolute/relative robustness and benchmark 20 mainstream LMMs. Results show that while LMMs can correctly handle the original reference images, their performance is not stable when faced with distorted images, and there is a significant gap in robustness compared to the human visual system. We hope that R-Bench will inspire improvements in the robustness of LMMs, extending them from experimental simulations to real-world applications. Check https://q-future.github.io/R-Bench for details.
Published: 2024

5. A Comprehensive Framework for Evaluating API-oriented Code Generation in Large Language Models

Author: Wu, Yixi, He, Pengfei, Wang, Zehao, Wang, Shaowei, Tian, Yuan, and Chen, Tse-Hsun
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Large language models (LLMs) like GitHub Copilot and ChatGPT have emerged as powerful tools for code generation, significantly enhancing productivity and accelerating software development. However, existing benchmarks primarily focus on general code generation without considering API-oriented code generation, i.e., generating code that invokes APIs from specific libraries. Given the growing demand for API-oriented code generation, there is a pressing need for a systematic and automated approach to evaluate LLMs on API-oriented code generation. To address this gap, we propose AutoAPIEval, a lightweight and automated framework designed to evaluate the capabilities of LLMs in API-oriented code generation. Our framework works with any library that provides API documentation and focuses on two unit tasks: API recommendation and code example generation, along with four metrics to evaluate the generated APIs and code examples, such as the proportion of incorrect API recommendations for Task 1, and the proportion of code examples where no specific API is invoked and uncompilable/unexecutable code examples for Task 2. In addition, we conducted a case study on three LLMs (ChatGPT, MagiCoder, and DeepSeek Coder) and Java Runtime Environment 8 to demonstrate the framework's effectiveness. Our findings reveal substantial variability in LLM performance across tasks, with ChatGPT adhering better to instructions, while sharing similar effectiveness in code example generation with its counterparts (i.e., MagiCoder and DeepSeek Coder). We also identify key factors associated with code quality, such as API popularity and model confidence, and build classifiers that achieve high accuracy in detecting incorrect API recommendations and erroneous code examples. Retrieval-augmented generation enhances the quality of code generated by LLMs, though its effectiveness varies across different LLMs.
Published: 2024

6. Free-VSC: Free Semantics from Visual Foundation Models for Unsupervised Video Semantic Compression

Author: Tian, Yuan, Lu, Guo, and Zhai, Guangtao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Unsupervised video semantic compression (UVSC), i.e., compressing videos to better support various analysis tasks, has recently garnered attention. However, the semantic richness of previous methods remains limited, due to the single semantic learning objective, limited training data, etc. To address this, we propose to boost the UVSC task by absorbing the off-the-shelf rich semantics from visual foundation models (VFMs). Specifically, we introduce a VFMs-shared semantic alignment layer, complemented by VFM-specific prompts, to flexibly align semantics between the compressed video and various VFMs. This allows different VFMs to collaboratively build a mutually-enhanced semantic space, guiding the learning of the compression model. Moreover, we introduce a dynamic trajectory-based inter-frame compression scheme, which first estimates the semantic trajectory based on the historical content, and then traverses along the trajectory to predict the future semantics as the coding context. This reduces the overall bitcost of the system, further improving the compression efficiency. Our approach outperforms previous coding methods on three mainstream tasks and six datasets.
Comment: ECCV2024
Published: 2024

7. EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage

Author: Liao, Zeyi, Mo, Lingbo, Xu, Chejian, Kang, Mintong, Zhang, Jiawei, Xiao, Chaowei, Tian, Yuan, Li, Bo, and Sun, Huan
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Generalist web agents have demonstrated remarkable potential in autonomously completing a wide range of tasks on real websites, significantly boosting human productivity. However, web tasks, such as booking flights, usually involve users' PII, which may be exposed to potential privacy risks if web agents accidentally interact with compromised websites, a scenario that remains largely unexplored in the literature. In this work, we narrow this gap by conducting the first study on the privacy risks of generalist web agents in adversarial environments. First, we present a realistic threat model for attacks on the website, where we consider two adversarial targets: stealing users' specific PII or the entire user request. Then, we propose a novel attack method, termed Environmental Injection Attack (EIA). EIA injects malicious content designed to adapt well to environments where the agents operate, and our work instantiates EIA specifically for privacy scenarios in web environments. We collect 177 action steps that involve diverse PII categories on realistic websites from Mind2Web, and conduct experiments using one of the most capable generalist web agent frameworks to date. The results demonstrate that EIA achieves up to 70% attack success rate (ASR) in stealing specific PII and 16% ASR for the full user request. Additionally, by assessing the stealthiness and experimenting with a defensive system prompt, we indicate that EIA is hard to detect and mitigate. Notably, attacks that are not well adapted for a webpage can be detected via human inspection, leading to our discussion about the trade-off between security and autonomy. However, extra attackers' efforts can make EIA seamlessly adapted, rendering such supervision ineffective. Thus, we further discuss the defenses at the pre- and post-deployment stages of the websites without relying on human supervision and call for more advanced defense strategies.
Comment: 29 pages
Published: 2024

8. SQLucid: Grounding Natural Language Database Queries with Interactive Explanations

Author: Tian, Yuan, Kummerfeld, Jonathan K., Li, Toby Jia-Jun, and Zhang, Tianyi
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Computation and Language
Abstract: Though recent advances in machine learning have led to significant improvements in natural language interfaces for databases, the accuracy and reliability of these systems remain limited, especially in high-stakes domains. This paper introduces SQLucid, a novel user interface that bridges the gap between non-expert users and complex database querying processes. SQLucid addresses existing limitations by integrating visual correspondence, intermediate query results, and editable step-by-step SQL explanations in natural language to facilitate user understanding and engagement. This unique blend of features empowers users to understand and refine SQL queries easily and precisely. Two user studies and one quantitative experiment were conducted to validate SQLucid's effectiveness, showing significant improvement in task completion accuracy and user confidence compared to existing interfaces. Our code is available at https://github.com/magic-YuanTian/SQLucid.
Comment: Accepted to UIST'24
Published: 2024

9. Chain-of-Experts (CoE): Reverse Engineering Software Bills of Materials for JavaScript Application Bundles through Code Clone Search

Author: Song, Leo, Ding, Steven H. H., Tian, Yuan, Li, Li Tao, Charland, Philippe, and Walenstein, Andrew
Subjects: Computer Science - Software Engineering
Abstract: A Software Bill of Materials (SBoM) is a detailed inventory of all components, libraries, and modules in a software artifact, providing traceability throughout the software supply chain. With the increasing popularity of JavaScript in software engineering due to its dynamic syntax and seamless supply chain integration, the exposure to vulnerabilities and attacks has risen significantly. A JavaScript application bundle is a consolidated, symbol-stripped, and optimized assembly of code for deployment. Generating an SBoM from a JavaScript application bundle through a reverse-engineering process ensures the integrity, security, and compliance of the supplier's software release, even without access to the original dependency graphs. This paper presents the first study on SBoM generation for JavaScript application bundles. We identify three key challenges for this task, i.e., nested code scopes, extremely long sequences, and large retrieval spaces. To address these challenges, we introduce Chain-of-Experts (CoE), a multi-task deep learning model designed to generate SBoMs through three tasks: code segmentation, code classification, and code clone retrieval. We evaluate CoE against individual task-specific solutions on 500 web application bundles with over 66,000 dependencies. Our experimental results demonstrate that CoE offers competitive outcomes with less training and inference time when compared with combined individual task-specific solutions. Consequently, CoE provides the first scalable, efficient, and end-to-end solution for the SBoM generation of real-world JavaScript application bundles.
Published: 2024

10. Optimal Position Detection of an Optically Levitated Mie Particle

Author: Wang, Long, Zhou, Lei-Ming, Tian, Yuan, Liu, Lyu-Hang, Guo, Guang-Can, Zheng, Yu, and Sun, Fang-Wen
Subjects: Physics - Optics
Abstract: We theoretically investigate the problem of position detection of an optically levitated Mie particle. The information radiation field (IRF) is proposed and defined to characterize the scattered light carrying complete information about the center-of-mass (c.m.) motion of the particle. Based on the IRF, we suggest an optimal detection scheme for the position of arbitrary particles. We calculate both the information losses of objective collection and mode-matching in levitated optomechanical experiments. Our results conclude that the backward detection scheme, using an incident Gaussian beam focused by a high numerical aperture lens, provides sufficient information to achieve the quantum ground state through cooling of the three-dimensional c.m. motion of the Mie particle.
Published: 2024

11. Selective Prompt Anchoring for Code Generation

Author: Tian, Yuan and Zhang, Tianyi
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Software Engineering
Abstract: Recent advances in large language models (LLMs) such as Copilot and ChatGPT have transformed software development by automating coding tasks. Despite these advancements, challenges remain in reducing error rates and fully meeting user expectations. Our empirical study reveals LLMs tend to dilute their self-attention on the initial prompt as more code tokens are generated. We hypothesize this self-attention dilution issue is one of the root causes of inaccuracies in LLM-generated code. To mitigate this issue, we propose Selective Prompt Anchoring (SPA). SPA amplifies the influence of the selected parts in the initial prompt, which we refer to as "anchored text", during code generation. Specifically, SPA calculates the logit distribution difference with and without the anchored text. We prove this difference approximates the anchored text's contextual contribution to the output logits. SPA creates an augmented logit distribution by linearly combining the original logit distribution and the logit difference. We evaluate SPA with five LLMs on four benchmarks. Our results demonstrate that using SPA can consistently improve Pass@1 rates by up to 9.7% in all settings. Notably, with selective text anchoring, a small version of DeepSeek-Coder (6.7B) can achieve better performance than an original much larger version (33B). Our code is available at https://github.com/magic-YuanTian/Selective-Prompt-Anchoring.
Published: 2024
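
The logit combination described in the abstract above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the function name `anchor_logits`, the `weight` parameter, and the exact form original + weight * (original - masked) are one plausible reading of "linearly combining the original logit distribution and the logit difference".

```python
def anchor_logits(logits_with_anchor, logits_without_anchor, weight=1.0):
    """Return augmented logits for one decoding step.

    Hypothetical helper. `logits_with_anchor` are the logits from the full
    prompt; `logits_without_anchor` are the logits from the same prompt with
    the anchored text masked out. Their difference is treated as the anchored
    text's contribution and is added back, scaled by `weight` (an assumed
    hyperparameter).
    """
    if len(logits_with_anchor) != len(logits_without_anchor):
        raise ValueError("logit vectors must have the same vocabulary size")
    return [
        original + weight * (original - masked)
        for original, masked in zip(logits_with_anchor, logits_without_anchor)
    ]


# Toy three-token vocabulary: the first token is favored more strongly
# once the anchored text's contribution is amplified.
augmented = anchor_logits([2.0, 1.0, 0.5], [1.5, 1.0, 1.0], weight=2.0)
print(augmented)
```

With `weight=0.0` the function returns the original logits unchanged, so the weight controls how strongly the anchored text is emphasized.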

12. An Exploratory Study on Fine-Tuning Large Language Models for Secure Code Generation

Author: Li, Junjie, Rabbi, Fazle, Cheng, Cheng, Sangalay, Aseem, Tian, Yuan, and Yang, Jinqiu
Subjects: Computer Science - Software Engineering, D.2.0
Abstract: AI-powered coding assistants such as GitHub Copilot and OpenAI ChatGPT have achieved notable success in automating code generation. However, these tools rely on pre-trained Large Language Models (LLMs) that are typically trained on human-written code sourced from open-source project hosting sites like GitHub, which often contains inherent security vulnerabilities. These vulnerabilities may then be mirrored in the code generated by these LLMs, a critical risk revealed and highlighted by recent empirical studies. In this work, we present an exploratory study on whether fine-tuning pre-trained LLMs on datasets of vulnerability-fixing commits can promote secure code generation. We explored two parameter-efficient fine-tuning techniques (LoRA and IA3) on two pre-trained LLMs for code generation. We crawled a fine-tuning dataset (14,622 C and C++ files) for secure code generation by collecting code fixes of confirmed vulnerabilities from open-source repositories. Our evaluation dataset comprises 52 vulnerability scenarios designed to cover the top most dangerous C and C++ Common Weakness Enumerations (CWEs). Each scenario is a prompt that may induce LLMs to generate vulnerable code. Our exploration reveals that fine-tuning LLMs can improve secure code generation by 6.4% in C and 5.4% in C++. We further experimented with fine-tuning LLMs using different versions of the collected secure code dataset (block, function, and line). We found that fine-tuning with function-level and block-level datasets achieves the best secure code generation performance, compared to the alternatives (file-level and line-level).
Comment: 24 pages, 6 figures
Published: 2024

13. BadMerging: Backdoor Attacks Against Model Merging

Author: Zhang, Jinghuai, Chi, Jianfeng, Li, Zheng, Cai, Kunlin, Zhang, Yang, and Tian, Yuan
Subjects: Computer Science - Cryptography and Security, Computer Science - Machine Learning
Abstract: Fine-tuning pre-trained models for downstream tasks has led to a proliferation of open-sourced task-specific models. Recently, Model Merging (MM) has emerged as an effective approach to facilitate knowledge transfer among these independently fine-tuned models. MM directly combines multiple fine-tuned task-specific models into a merged model without additional training, and the resulting model shows enhanced capabilities in multiple tasks. Although MM provides great utility, it may come with security risks because an adversary can exploit MM to affect multiple downstream tasks. However, the security risks of MM have barely been studied. In this paper, we first find that MM, as a new learning paradigm, introduces unique challenges for existing backdoor attacks due to the merging process. To address these challenges, we introduce BadMerging, the first backdoor attack specifically designed for MM. Notably, BadMerging allows an adversary to compromise the entire merged model by contributing as few as one backdoored task-specific model. BadMerging comprises a two-stage attack mechanism and a novel feature-interpolation-based loss to enhance the robustness of embedded backdoors against the changes of different merging parameters. Considering that a merged model may incorporate tasks from different domains, BadMerging can jointly compromise the tasks provided by the adversary (on-task attack) and other contributors (off-task attack) and solve the corresponding unique challenges with novel attack designs. Extensive experiments show that BadMerging achieves remarkable attacks against various MM algorithms. Our ablation study demonstrates that the proposed attack designs can progressively contribute to the attack performance. Finally, we show that prior defense mechanisms fail to defend against our attacks, highlighting the need for more advanced defense.
Comment: To appear in ACM Conference on Computer and Communications Security (CCS), 2024
Published: 2024

14. Lazarus: Resilient and Elastic Training of Mixture-of-Experts Models with Adaptive Expert Placement

Author: Wu, Yongji, Qu, Wenjie, Tao, Tianyang, Wang, Zhuang, Bai, Wei, Li, Zhuohao, Tian, Yuan, Zhang, Jiaheng, Lentz, Matthew, and Zhuo, Danyang
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Machine Learning
Abstract: Sparsely-activated Mixture-of-Experts (MoE) architecture has increasingly been adopted to further scale large language models (LLMs) due to its sub-linear scaling for computation costs. However, frequent failures still pose significant challenges as training scales. The cost of even a single failure is significant, as all GPUs need to wait idle until the failure is resolved, potentially losing considerable training progress as training has to restart from checkpoints. Existing solutions for efficient fault-tolerant training either lack elasticity or rely on building resiliency into pipeline parallelism, which cannot be applied to MoE models due to the expert parallelism strategy adopted by the MoE architecture. We present Lazarus, a system for resilient and elastic training of MoE models. Lazarus adaptively allocates expert replicas to address the inherent imbalance in expert workload and speed up training, while a provably optimal expert placement algorithm is developed to maximize the probability of recovery upon failures. Through adaptive expert placement and a flexible token dispatcher, Lazarus can also fully utilize all available nodes after failures, leaving no GPU idle. Our evaluation shows that Lazarus outperforms existing MoE training systems by up to 5.7x under frequent node failures and 3.4x on a real spot instance trace.
Published: 2024

15. Solutions to Deepfakes: Can Camera Hardware, Cryptography, and Deep Learning Verify Real Images?

Author: Vilesov, Alexander, Tian, Yuan, Sehatbakhsh, Nader, and Kadambi, Achuta
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Cryptography and Security
Abstract: The exponential progress in generative AI poses serious implications for the credibility of all real images and videos. There will exist a point in the future where 1) digital content produced by generative AI will be indistinguishable from that created by cameras, 2) high-quality generative algorithms will be accessible to anyone, and 3) the ratio of all synthetic to real images will be large. It is imperative to establish methods that can separate real data from synthetic data with high confidence. We define real images as those that were produced by the camera hardware, capturing a real-world scene. Any synthetic generation of an image or alteration of a real image through generative AI or computer graphics techniques is labeled as a synthetic image. To this end, this document aims to: present known strategies in detection and cryptography that can be employed to verify which images are real, weigh the strengths and weaknesses of these strategies, and suggest additional improvements to alleviate shortcomings.
Published: 2024

16. GAIA: Rethinking Action Quality Assessment for AI-Generated Videos

Author: Chen, Zijian, Sun, Wei, Tian, Yuan, Jia, Jun, Zhang, Zicheng, Wang, Jiarui, Huang, Ru, Min, Xiongkuo, Zhai, Guangtao, and Zhang, Wenjun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Assessing action quality is both imperative and challenging due to its significant impact on the quality of AI-generated videos, further complicated by the inherently ambiguous nature of actions within AI-generated video (AIGV). Current action quality assessment (AQA) algorithms predominantly focus on actions from real specific scenarios and are pre-trained with normative action features, thus rendering them inapplicable in AIGVs. To address these problems, we construct GAIA, a Generic AI-generated Action dataset, by conducting a large-scale subjective evaluation from a novel causal reasoning-based perspective, resulting in 971,244 ratings among 9,180 video-action pairs. Based on GAIA, we evaluate a suite of popular text-to-video (T2V) models on their ability to generate visually rational actions, revealing their pros and cons on different categories of actions. We also extend GAIA as a testbed to benchmark the AQA capacity of existing automatic evaluation methods. Results show that traditional AQA methods, action-related metrics in recent T2V benchmarks, and mainstream video quality methods perform poorly with an average SRCC of 0.454, 0.191, and 0.519, respectively, indicating a sizable gap between current models and human action perception patterns in AIGVs. Our findings underscore the significance of action quality as a unique perspective for studying AIGVs and can catalyze progress towards methods with enhanced capacities for AQA in AIGVs.
Comment: Accepted by NeurIPS2024 Dataset and Benchmark Track as Spotlight. 33 pages, 15 figures
Published: 2024
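
The SRCC figures quoted in the abstract above are Spearman rank correlation coefficients between a method's predicted scores and human ratings. A minimal standard-library sketch of that metric follows. The function names and the toy scores are illustrative assumptions, not data from the paper.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def srcc(predicted, human):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(predicted) != len(human) or len(predicted) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    rx, ry = average_ranks(predicted), average_ranks(human)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Toy example (made-up numbers): metric scores vs. mean human ratings.
metric_scores = [0.70, 0.60, 0.65, 0.90]
human_ratings = [4.5, 3.0, 2.0, 5.0]
print(round(srcc(metric_scores, human_ratings), 3))
```

A value near 1 means the metric orders videos the way human raters do; values near 0, like several of those reported above, mean the ordering is close to unrelated.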

17. SMC++: Masked Learning of Unsupervised Video Semantic Compression

Author: Tian, Yuan, Lu, Guo, and Zhai, Guangtao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: Most video compression methods focus on human visual perception, neglecting semantic preservation. This leads to severe semantic loss during the compression, hampering downstream video analysis tasks. In this paper, we propose a Masked Video Modeling (MVM)-powered compression framework that particularly preserves video semantics, by jointly mining and compressing the semantics in a self-supervised manner. While MVM is proficient at learning generalizable semantics through the masked patch prediction task, it may also encode non-semantic information like trivial textural details, wasting bitcost and bringing semantic noises. To suppress this, we explicitly regularize the non-semantic entropy of the compressed video in the MVM token space. The proposed framework is instantiated as a simple Semantic-Mining-then-Compression (SMC) model. Furthermore, we extend SMC as an advanced SMC++ model from several aspects. First, we equip it with a masked motion prediction objective, leading to better temporal semantic learning ability. Second, we introduce a Transformer-based compression module, to improve the semantic compression efficacy. Considering that directly mining the complex redundancy among heterogeneous features in different coding stages is non-trivial, we introduce a compact blueprint semantic representation to align these features into a similar form, fully unleashing the power of the Transformer-based compression module. Extensive results demonstrate the proposed SMC and SMC++ models show remarkable superiority over previous traditional, learnable, and perceptual quality-oriented video codecs, on three video analysis tasks and seven datasets. Code and model are available at https://github.com/tianyuan168326/VideoSemanticCompression-Pytorch.
Published: 2024

18. An Empirical Study of Developers' Challenges in Implementing Workflows as Code: A Case Study on Apache Airflow

Author: Yasmin, Jerin, Wang, Jiale, Tian, Yuan, and Adams, Bram
Subjects: Computer Science - Software Engineering
Abstract: The Workflows as Code paradigm is becoming increasingly essential to streamline the design and management of complex processes within data-intensive software systems. These systems require robust capabilities to process, analyze, and extract insights from large datasets. Workflow orchestration platforms such as Apache Airflow are pivotal in meeting these needs, as they effectively support the implementation of the Workflows as Code paradigm. Nevertheless, despite its considerable advantages, developers still face challenges due to the specialized demands of workflow orchestration and the complexities of distributed execution environments. In this paper, we manually study 1,000 sampled Stack Overflow posts derived from 9,591 Airflow-related questions to understand developers' challenges and root causes while implementing Workflows as Code. Our analysis results in a hierarchical taxonomy of Airflow-related challenges that contains 7 high-level categories and 14 sub-categories. We find that the most significant obstacles for developers arise when defining and executing their workflow. Our in-depth analysis identifies 10 root causes behind the challenges, including incorrect workflow configuration, complex environmental setup, and a lack of basic knowledge about Airflow and the external systems that it interacts with. Additionally, our analysis of references shared within the collected posts reveals that beyond the frequently cited Airflow documentation, documentation from external systems and third-party providers is also commonly referenced to address Airflow-related challenges.
Comment: This is the preprint version of a paper that has been submitted to the Journal of Systems and Software
Published: 2024

19. RCInvestigator: Towards Better Investigation of Anomaly Root Causes in Cloud Computing Systems

Author: Liu, Shuhan, Zhou, Yunfan, Ying, Lu, Tian, Yuan, Zhang, Jue, Zhou, Shandan, Cui, Weiwei, Lin, Qingwei, Moscibroda, Thomas, Zhang, Haidong, Weng, Di, and Wu, Yingcai
Subjects: Computer Science - Human-Computer Interaction
Abstract: Finding the root causes of anomalies in cloud computing systems quickly is crucial to ensure availability and efficiency since accurate root causes can guide engineers to take appropriate actions to address the anomalies and maintain customer satisfaction. However, it is difficult to investigate and identify the root causes based on large-scale and high-dimension monitoring data collected from complex cloud computing environments. Due to the inherently dynamic characteristics of cloud computing systems, the existing approaches in practice largely rely on manual analyses for flexibility and reliability, but massive unpredictable factors and high data complexity make the process time-consuming. Despite recent advances in automated detection and investigation approaches, the speed and quality of root cause analyses remain limited by the lack of expert involvement in these approaches. The limitations found in the current solutions motivate us to propose a visual analytics approach that facilitates the interactive investigation of the anomaly root causes in cloud computing systems. We identified three challenges, namely, a) modeling databases for the root cause investigation, b) inferring root causes from large-scale time series, and c) building comprehensible investigation results. In collaboration with domain experts, we addressed these challenges with RCInvestigator, a novel visual analytics system that establishes a tight collaboration between human and machine and assists experts in investigating the root causes of cloud computing system anomalies. We evaluated the effectiveness of RCInvestigator through two use cases based on real-world data and received positive feedback from experts.
Published: 2024

20. Remote Keylogging Attacks in Multi-user VR Applications

Author: Su, Zihao, Cai, Kunlin, Beeler, Reuben, Dresel, Lukas, Garcia, Allan, Grishchenko, Ilya, Tian, Yuan, Kruegel, Christopher, and Vigna, Giovanni
Subjects: Computer Science - Cryptography and Security
Abstract: As Virtual Reality (VR) applications grow in popularity, they have bridged distances and brought users closer together. However, with this growth, there have been increasing concerns about security and privacy, especially related to the motion data used to create immersive experiences. In this study, we highlight a significant security threat in multi-user VR applications, which are applications that allow multiple users to interact with each other in the same virtual space. Specifically, we propose a remote attack that utilizes the avatar rendering information collected from an adversary's game clients to extract user-typed secrets like credit card information, passwords, or private conversations. We do this by (1) extracting motion data from network packets, and (2) mapping motion data to keystroke entries. We conducted a user study to verify the attack's effectiveness, in which our attack successfully inferred 97.62% of the keystrokes. Besides, we performed an additional experiment to underline that our attack is practical, confirming its effectiveness even when (1) there are multiple users in a room, and (2) the attacker cannot see the victims. Moreover, we replicated our proposed attack on four applications to demonstrate the generalizability of the attack. Lastly, we proposed a defense against the attack, which has been implemented by major players in the VR industry. These results underscore the severity of the vulnerability and its potential impact on millions of VR social platform users.
Comment: Accepted for Usenix 2024
Published: 2024

21. An Empirical Study on the Effectiveness of Large Language Models for SATD Identification and Classification

Author: Sheikhaei, Mohammad Sadegh, Tian, Yuan, Wang, Shaowei, and Xu, Bowen
Subjects: Computer Science - Software Engineering, D.2, I.2
Abstract: Self-Admitted Technical Debt (SATD), a concept highlighting sub-optimal choices in software development documented in code comments or other project resources, poses challenges in the maintainability and evolution of software systems. Large language models (LLMs) have demonstrated significant effectiveness across a broad range of software tasks, especially in software text generation tasks. Nonetheless, their effectiveness in tasks related to SATD is still under-researched. In this paper, we investigate the efficacy of LLMs in both identification and classification of SATD. For both tasks, we investigate the performance gain from using more recent LLMs, specifically the Flan-T5 family, across different common usage settings. Our results demonstrate that for SATD identification, all fine-tuned LLMs outperform the best existing non-LLM baseline, i.e., the CNN model, with a 4.4% to 7.2% improvement in F1 score. In the SATD classification task, while our largest fine-tuned model, Flan-T5-XL, still led in performance, the CNN model exhibited competitive results, even surpassing four of six LLMs. We also found that the largest Flan-T5 model, i.e., Flan-T5-XXL, when used with a zero-shot in-context learning (ICL) approach for SATD identification, provides competitive results with traditional approaches but performs 6.4% to 9.2% worse than fine-tuned LLMs. For SATD classification, the few-shot ICL approach, incorporating examples and category descriptions in prompts, outperforms the zero-shot approach and even surpasses the fine-tuned smaller Flan-T5 models. Moreover, our experiments demonstrate that incorporating contextual information, such as surrounding code, into the SATD classification task enables larger fine-tuned LLMs to improve their performance.
Comment: This is the preprint version of a paper that has been submitted to Empirical Software Engineering
Published: 2024

22. Deep Space Separable Distillation for Lightweight Acoustic Scene Classification

Author: Ye, ShuQi and Tian, Yuan
Subjects: Computer Science - Sound, Computer Science - Artificial Intelligence, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Acoustic scene classification (ASC) is highly important in the real world. Recently, deep learning-based methods have been widely employed for acoustic scene classification. However, these methods are currently neither lightweight enough nor satisfactory in performance. To solve these problems, we propose a deep space separable distillation network. First, the network performs high-low frequency decomposition on the log-mel spectrogram, significantly reducing computational complexity while maintaining model performance. Second, we design three lightweight operators specifically for ASC: Separable Convolution (SC), Orthonormal Separable Convolution (OSC), and Separable Partial Convolution (SPC). These operators exhibit highly efficient feature extraction capabilities in acoustic scene classification tasks. The experimental results demonstrate that the proposed method achieves a performance gain of 9.8% compared to the currently popular deep learning methods, while also having a smaller parameter count and lower computational complexity.
Published: 2024

23. Explainable Fake News Detection With Large Language Model via Defense Among Competing Wisdom

Author: Wang, Bo, Ma, Jing, Lin, Hongzhan, Yang, Zhiwei, Yang, Ruichao, Tian, Yuan, and Chang, Yi
Subjects: Computer Science - Computation and Language
Abstract: Most fake news detection methods learn latent feature representations based on neural networks, which makes them black boxes that classify a piece of news without giving any justification. Existing explainable systems generate veracity justifications from investigative journalism, which suffer from delayed debunking and low efficiency. Recent studies simply assume that the justification is equivalent to the majority opinions expressed in the wisdom of crowds. However, the opinions typically contain some inaccurate or biased information since the wisdom of crowds is uncensored. To detect fake news from a sea of diverse, crowded and even competing narratives, in this paper, we propose a novel defense-based explainable fake news detection framework. Specifically, we first propose an evidence extraction module to split the wisdom of crowds into two competing parties and detect salient evidence for each. To gain concise insights from the evidence, we then design a prompt-based module that utilizes a large language model to generate justifications by inferring reasons towards two possible veracities. Finally, we propose a defense-based inference module to determine veracity via modeling the defense among these justifications. Extensive experiments conducted on two real-world benchmarks demonstrate that our proposed method outperforms state-of-the-art baselines in terms of fake news detection and provides high-quality justifications.
Comment: 12 pages, WWW'2024
Published: 2024

24. Calculated Unconventional Superconductivity via Charge Fluctuations in Kagome Metal CsV3Sb5

Author: Tian, Yuan and Savrasov, Sergey Y.
Subjects: Condensed Matter - Superconductivity, Condensed Matter - Materials Science, Condensed Matter - Strongly Correlated Electrons
Abstract: Electrons on a Kagome lattice exhibit a wealth of features including Dirac points, van Hove singularities and flat bands. When the Fermi level is placed at the van Hove saddle point, the Fermi surface is perfectly nested and a rich variety of electronic instabilities is known to occur. A material realization of this scenario is the recently discovered Kagome system CsV3Sb5, whose superconductivity near a charge-density wave instability at low temperatures points to an unconventional, non-electron-phonon pairing mechanism. Here we use a recently developed combination of density functional theory with momentum- and frequency-resolved self-energies deduced from the so-called fluctuational-exchange-type random phase approximation to study charge-fluctuation-mediated pairing tendencies in CsV3Sb5. Based on our numerical diagonalization of the BCS gap equation, two competing solutions emerge from these calculations with A_{1g} (anisotropic s-wave-like) and B_{2g} (d_{x2-y2},d_{xy}-like) symmetries of the superconducting order parameter. Our evaluated Eliashberg spectral functions α²F(ω) are purely due to electronic correlations; they were found to be strongly peaked in the vicinity of a frequency of 7 meV, which sets the scale of charge fluctuations. The superconducting coupling constants for the leading pairing channels are estimated as a function of the nearest-neighbor Coulomb interaction V, a well-known prime parameter of the extended Hubbard model. They were found to lie in the range 0.2-0.4 depending on V. The superconducting T_{c} that we evaluate is close to the values observed experimentally, which suggests that charge fluctuations provide a substantial contribution to the pairing mechanism in CsV3Sb5.
Comment: 6 pages, 4 figures
Published: 2024

25. A nano vacuum gauge based on second-order coherence in optical levitation

Author: Liu, Lyu-Hang, Zheng, Yu, Tian, Yuan, Wang, Long, Guo, Guang-Can, and Sun, Fang-Wen
Subjects: Physics - Optics
Abstract: Accurate measurement of pressure over a wide dynamic range is important for various applications. This can be achieved with a mechanical nano-oscillator, in which pressure-related collisions with surrounding molecules induce energy dissipation. However, this energy dissipation of the nano-oscillator may be overshadowed by other processes. Here, we apply second-order coherence analysis to accurately characterize those distinct dissipation processes. Based on an optically levitated nano-oscillator, we successfully obtain precise measurements of the air pressure surrounding the particles from atmosphere to 7E-6 mbar, over 8 orders of magnitude. This shows that the mechanical nano-oscillator is an extremely promising candidate for precision pressure sensing applications. Moreover, the second-order coherence analysis method on a classical system can pave the way to characterizing the dynamic properties of an oscillator, which will benefit microscopic thermodynamics, precision measurement, and macroscopic quantum research.
Comment: 6 pages, 4 figures
Published: 2024

26. Exploring the Impact of the Output Format on the Evaluation of Large Language Models for Code Translation

Author: Macedo, Marcos, Tian, Yuan, Cogo, Filipe R., and Adams, Bram
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence
Abstract: Code translation between programming languages is a long-existing and critical task in software engineering, facilitating the modernization of legacy systems, ensuring cross-platform compatibility, and enhancing software performance. With the recent advances in large language models (LLMs) and their applications to code translation, there is an increasing need for comprehensive evaluation of these models. In this study, we empirically analyze the generated outputs of eleven popular instruct-tuned LLMs with parameters ranging from 1B up to 46.7B on 3,820 translation pairs across five languages, including C, C++, Go, Java, and Python. Our analysis found that between 26.4% and 73.7% of code translations produced by our evaluated LLMs necessitate post-processing, as these translations often include a mix of code, quotes, and text rather than being purely source code. Overlooking the output format of these models can inadvertently lead to underestimation of their actual performance. This is particularly evident when evaluating them with execution-based metrics such as Computational Accuracy (CA). Our results demonstrate that a strategic combination of prompt engineering and regular expression can effectively extract the source code from the model generation output. In particular, our method can help eleven selected models achieve an average Code Extraction Success Rate (CSR) of 92.73%. Our findings shed light on and motivate future research to conduct more reliable benchmarks of LLMs for code translation.
Comment: Accepted into 2024 IEEE/ACM First International Conference on AI Foundation Models and Software Engineering (Forge)
Published: 2024

27. An Empirical Study on Developers Shared Conversations with ChatGPT in GitHub Pull Requests and Issues

Author: Hao, Huizi, Hasan, Kazi Amit, Qin, Hong, Macedo, Marcos, Tian, Yuan, Ding, Steven H. H., and Hassan, Ahmed E.
Subjects: Computer Science - Software Engineering
Abstract: ChatGPT has significantly impacted software development practices, providing substantial assistance to developers in a variety of tasks, including coding, testing, and debugging. Despite its widespread adoption, the impact of ChatGPT as an assistant in collaborative coding remains largely unexplored. In this paper, we analyze a dataset of 210 and 370 developers' shared conversations with ChatGPT in GitHub pull requests (PRs) and issues, respectively. We manually examined the content of the conversations and characterized the dynamics of the sharing behavior, i.e., understanding the rationale behind the sharing, identifying the locations where the conversations were shared, and determining the roles of the developers who shared them. Our main observations are: (1) Developers seek ChatGPT assistance across 16 types of software engineering inquiries. In conversations shared in both PRs and issues, the most frequently encountered inquiry categories include code generation, conceptual questions, how-to guides, issue resolution, and code review. (2) Developers frequently engage with ChatGPT via multi-turn conversations where each prompt can fulfill various roles, such as unveiling initial or new tasks, iterative follow-up, and prompt refinement. Multi-turn conversations account for 33.2% of the conversations shared in PRs and 36.9% in issues. (3) In collaborative coding, developers leverage shared conversations with ChatGPT to facilitate their role-specific contributions, whether as authors of PRs or issues, code reviewers, or collaborators on issues. Our work serves as the first step towards understanding the dynamics between developers and ChatGPT in collaborative software development and opens up new directions for future research on the topic.
Published: 2024

28. Synergy between Spin and Orbital Angular Momenta on a Möbius Strip

Author: Liu, Lei, Sun, Xiao-Chen, Tian, Yuan, Zhang, Xiujuan, Lu, Ming-Hui, and Chen, Yan-Feng
Subjects: Physics - Applied Physics, Condensed Matter - Materials Science
Abstract: Spin and orbital angular momenta are fundamental physical characteristics described by polarization and spatial degrees of freedom, respectively. Polarization is a feature of vector fields while spatial phase gradient determines the orbital angular momentum ubiquitous to any scalar field. Common wisdom treats these two degrees of freedom as distinct and independent principles to manipulate wave propagations. Here, we demonstrate their synergy. This is achieved by introducing two orthogonal p-orbitals as eigenbases, whose spatial modal features are exploited to generate orbital angular momenta and the associated orbital orientations provide means to simultaneously manipulate polarizations. Through periodic modulation and directional coupling, we realize a full cyclic evolution of the synchronized and synergized spin-orbital angular momenta. Remarkably, this evolution acquires a nontrivial geometric phase, leading to its representation on a Möbius strip. Experimentally, an acoustic cavity array is designed, whose dipole resonances precisely mimic the p-orbitals. The acoustic waves, uniquely, see the pressure (scalar) field as a spatial feature and carry an intrinsic polarization defined by the velocity (vector) field, serving as an ideal platform to observe the synergy of spin and orbital angular momenta. Based on such a property, we further showcase a spin-orbital-Hall effect, highlighting the intricate locking of handedness, directionality, spin density and spatial mode profile. Our study unveils a fundamental connection between spin and orbital angular momenta, promising avenues for novel applications in information coding and high-capacity communications.
Published: 2024

29. CarbonNet: How Computer Vision Plays a Role in Climate Change? Application: Learning Geomechanics from Subsurface Geometry of CCS to Mitigate Global Warming

Author: Chen, Wei, Li, Yunan, and Tian, Yuan
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: We introduce a new approach using computer vision to predict the land surface displacement from subsurface geometry images for Carbon Capture and Sequestration (CCS). CCS has been shown to be a key component of a carbon-neutral society. However, challenges remain, including the high computational cost due to the large model scale and the limited ability to generalize a pre-trained model with complex physics. We tackle those challenges by training models directly from the subsurface geometry images. The goal is to understand the response of land surface displacement to carbon injection and to use our trained models to inform decision making in CCS projects. We implement multiple models (CNN, ResNet, and ResNetUNet) for the static mechanics problem, which is an image prediction problem. Next, we use an LSTM and a transformer for the transient mechanics scenario, which is a video prediction problem. The results show that ResNetUNet outperforms the others in the static mechanics problem thanks to its architecture, and that the LSTM performs comparably to the transformer in the transient problem. This report proceeds by outlining our dataset in detail, followed by model descriptions in the method section. The results and discussion sections state the key learnings and observations, and a conclusion with future work rounds out the paper.
Published: 2024

30. Layer-dependent evolution of electronic structures and correlations in rhombohedral multilayer graphene

Author: Zhang, Yang, Zhou, Yue-Ying, Zhang, Shihao, Cai, Hao, Tong, Ling-Hui, Liao, Wei-Yu, Zou, Ruo-Jue, Xue, Si-Min, Tian, Yuan, Chen, Tongtong, Tian, Qiwei, Zhang, Chen, Wang, Yiliu, Zou, Xuming, Liu, Xingqiang, Hu, Yuanyuan, Ren, Ya-Ning, Zhang, Li, Zhang, Lijie, Wang, Wen-Xiao, He, Lin, Liao, Lei, Qin, Zhihui, and Yin, Long-Jing
Published: 2024

31. Physiological changes in shrub species due to different sources of dust pollution in an urban environment

Author: Tian, Yuan, Li, Haimei, Li, Mingyan, Li, Shimei, and Guo, Xiao
Published: 2024

32. hnRNPA2B1 deacetylation by SIRT6 restrains local transcription and safeguards genome stability

Author: Chen, Feng, Xu, Wenchao, Tang, Ming, Tian, Yuan, Shu, Yuxin, He, Xingkai, Zhou, Linmin, Liu, Qi, Zhu, Qian, Lu, Xiaopeng, Zhang, Jun, and Zhu, Wei-Guo
Published: 2024

33. Quantitative analysis of pressure levels in manual lymphatic drainage across stages of breast cancer-related lymphedema: implications for optimized treatment protocols

Author: Xing, Naifang, Liu, Daiqing, Chen, Lufeng, Wang, Guorong, Tian, Yuan, Yang, Chen, Leng, Yingjie, Jiang, Xin, Li, Chengxiang, Xie, Ruonan, Nie, Zhuomiao, and Zhang, Tian
Published: 2024

34. Single cell, Label free Characterisation of Human Mesenchymal Stromal cell Stemness and Future Growth Potential by Autofluorescence Multispectral Imaging

Author: Campbell, Jared M., Habibalahi, Abbas, Agha, Adnan, Handley, Shannon, Knab, Aline, Xu, Xiaohu, Bhargava, Akanksha, Lei, Zhilin, Mackevicius, Max, Tian, Yuan, Mahbub, Saabah B., Anwer, Ayad G., Gronthos, Stan, Paton, Sharon, Grey, Shane T., Wu, Lindsay, Gilchrist, Robert B., and Goldys, Ewa M.
Published: 2024

35. USP33 facilitates the ovarian cancer progression via deubiquitinating and stabilizing CBX2

Author: Chen, Jiming, Shan, Wulin, Jia, Qiucheng, Chen, Yao, Jiang, Wenjing, Tian, Yuan, Huang, Xu, Li, Xiaoyu, Wang, Zengying, and Xia, Bairong
Published: 2024

36. Tailoring the π-conjugation in self-assembled hole-selective molecules for perovskite photovoltaics

Author: Zhao, Ke, Yao, Libing, Liu, Chen, Yavuz, Ilhan, Shen, Jiahui, Shi, Pengju, Zhang, Xu, Luo, Yixin, Jin, Donger, Tian, Yuan, Wang, Sisi, Fan, Wei, Xu, Jiazhe, Liu, Qingqing, Wang, Xiaonan, Tian, Liuwen, Liu, Ruzhang, Değer, Caner, Wang, Rui, and Xue, Jingjing
Published: 2024

37. E-Learning Course Design Based on Cloud-Based Fuzzy Logic Approach for Foreign Language Teaching and Learning

Author: Tian, Yuan and Gao, Bei
Published: 2024

38. Cryo-EM structures of Smc5/6 in multiple states reveal its assembly and functional mechanisms

Author: Li, Qian, Zhang, Jun, Haluska, Cory, Zhang, Xiang, Wang, Lei, Liu, Guangfeng, Wang, Zhaoning, Jin, Duo, Cheng, Tong, Wang, Hongxia, Tian, Yuan, Wang, Xiangxi, Sun, Lei, Zhao, Xiaolan, Chen, Zhenguo, and Wang, Lanfeng
Published: 2024

39. The emerging role of CARM1 in cancer

Author: Xie, Zizhuo, Tian, Yuan, Guo, Xiaohan, and Xie, Na
Published: 2024

40. Tailoring supramolecular antimicrobial peptides: from self-assembled nanoarchitectures to activities

Author: Wang, Saisai, Wu, Jian, Tian, Yuan, and Zhou, Shaobing
Published: 2024

41. High-entropy hybrid perovskites with disordered organic moieties for perovskite solar cells

Author: Tian, Yuan, Zhang, Xu, Zhao, Ke, Miao, Xiaohe, Deng, Tianqi, Fan, Wei, Jin, Donger, Jiang, Xuanyu, Zhong, Shulin, Wang, Xiaonan, Wang, Sisi, Shi, Pengju, Tian, Liuwen, Yao, Libing, Gong, Shaokuan, Yu, Xuemeng, Gao, Xingyu, Chen, Zhong, Chen, Xihan, Lu, Yunhao, Shrote, Vinayak, Yang, Yang, Yang, Deren, Wang, Rui, and Xue, Jingjing
Published: 2024

42. Staged thermal runaway behaviours of three typical lithium-ion batteries for hazard prevention

Author: Xiao, Yang, Zhao, Jia-Rong, Yin, Lan, Li, Bei, and Tian, Yuan
Published: 2024

43. PE-MVCNet: Multi-view and Cross-modal Fusion Network for Pulmonary Embolism Prediction

Author: Guo, Zhaoxin, Wang, Zhipeng, Ge, Ruiquan, Yu, Jianxun, Qin, Feiwei, Tian, Yuan, Peng, Yuqing, Li, Yonghong, and Wang, Changmiao
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: The early detection of a pulmonary embolism (PE) is critical for enhancing patient survival rates. Both image-based and non-image-based features are of utmost importance in medical classification tasks. In a clinical setting, physicians tend to rely on the contextual information provided by Electronic Medical Records (EMR) to interpret medical imaging. However, very few models effectively integrate clinical information with imaging data. To address this shortcoming, we suggest a multimodal fusion methodology, termed PE-MVCNet, which capitalizes on Computed Tomography Pulmonary Angiography imaging and EMR data. This method comprises the Image-only module with an integrated multi-view block, the EMR-only module, and the Cross-modal Attention Fusion (CMAF) module. These modules cooperate to extract comprehensive features that subsequently generate predictions for PE. We conducted experiments using the publicly accessible Stanford University Medical Center dataset, achieving an AUROC of 94.1%, an accuracy rate of 90.2%, and an F1 score of 90.6%. Our proposed model outperforms existing methodologies, corroborating that our multimodal fusion model excels compared to models that use a single data modality. Our source code is available at https://github.com/LeavingStarW/PE-MVCNET.
Published: 2024

44. Insights into Natural Language Database Query Errors: From Attention Misalignment to User Handling Strategies

Author: Ning, Zheng, Tian, Yuan, Zhang, Zheng, Zhang, Tianyi, and Li, Toby
Subjects: Computer Science - Human-Computer Interaction
Abstract: Querying structured databases with natural language (NL2SQL) has remained a difficult problem for years. Recently, advances in machine learning (ML), natural language processing (NLP), and large language models (LLMs) have led to significant improvements in performance, with the best model achieving ~85% accuracy on the benchmark Spider dataset. However, there is still no systematic understanding of the types and causes of errors in erroneous queries, or of the effectiveness of error-handling mechanisms. To bridge the gap, a taxonomy of errors made by four representative NL2SQL models was built in this work, along with an in-depth analysis of the errors. Second, the causes of model errors were explored by analyzing the model-human attention alignment to the natural language query. Last, a within-subjects user study with 26 participants was conducted to investigate the effectiveness of three interactive error-handling mechanisms in NL2SQL. Findings from this paper shed light on the design of model structure and error discovery and repair strategies for natural language data query interfaces in the future.
Published: 2024

45. Sym-Q: Adaptive Symbolic Regression via Sequential Decision-Making

Author: Tian, Yuan, Zhou, Wenqi, Dong, Hao, Kammer, David S., and Fink, Olga
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Symbolic regression holds great potential for uncovering underlying mathematical and physical relationships from empirical data. While existing transformer-based models have recently achieved significant success in this domain, they face challenges in terms of generalizability and adaptability. Typically, in cases where the output expressions do not adequately fit experimental data, the models lack efficient mechanisms to adapt or modify the expression. This inflexibility hinders their application in real-world scenarios, particularly in discovering unknown physical or biological relationships. Inspired by how human experts refine and adapt expressions, we introduce Symbolic Q-network (Sym-Q), a novel reinforcement learning-based model that redefines symbolic regression as a sequential decision-making task. Sym-Q leverages supervised demonstrations and refines expressions based on reward signals indicating the quality of fitting precision. Its distinctive ability to manage the complexity of expression trees and perform precise step-wise updates significantly enhances flexibility and efficiency. Our results demonstrate that Sym-Q excels not only in recovering underlying mathematical structures but also uniquely learns to efficiently refine the output expression based on reward signals, thereby discovering underlying expressions. Sym-Q paves the way for more intuitive and impactful discoveries in physical science, marking a substantial advancement in the field of symbolic regression.
Published: 2024

46. PeaTMOSS: A Dataset and Initial Analysis of Pre-Trained Models in Open-Source Software

Author: Jiang, Wenxin, Yasmin, Jerin, Jones, Jason, Synovic, Nicholas, Kuo, Jiashen, Bielanski, Nathaniel, Tian, Yuan, Thiruvathukal, George K., and Davis, James C.
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence, Computer Science - Databases, Computer Science - Machine Learning
Abstract: The development and training of deep learning models have become increasingly costly and complex. Consequently, software engineers are adopting pre-trained models (PTMs) for their downstream applications. The dynamics of the PTM supply chain remain largely unexplored, signaling a clear need for structured datasets that document not only the metadata but also the subsequent applications of these models. Without such data, the MSR community cannot comprehensively understand the impact of PTM adoption and reuse. This paper presents the PeaTMOSS dataset, which comprises metadata for 281,638 PTMs and detailed snapshots for all PTMs with over 50 monthly downloads (14,296 PTMs), along with 28,575 open-source software repositories from GitHub that utilize these models. Additionally, the dataset includes 44,337 mappings from 15,129 downstream GitHub repositories to the 2,530 PTMs they use. To enhance the dataset's comprehensiveness, we developed prompts for a large language model to automatically extract model metadata, including the model's training datasets, parameters, and evaluation metrics. Our analysis of this dataset provides the first summary statistics for the PTM supply chain, showing the trend of PTM development and common shortcomings of PTM package documentation. Our example application reveals inconsistencies in software licenses across PTMs and their dependent projects. PeaTMOSS lays the foundation for future research, offering rich opportunities to investigate the PTM supply chain. We outline mining opportunities on PTMs, their downstream usage, and cross-cutting questions.
Comment: Accepted at MSR'24
Published: 2024

47. ZS4C: Zero-Shot Synthesis of Compilable Code for Incomplete Code Snippets using LLMs

Author: Kabir, Azmain, Wang, Shaowei, Tian, Yuan, Chen, Tse-Hsun, Asaduzzaman, Muhammad, and Zhang, Wenbin
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence
Abstract: Technical Q&A sites are valuable for software developers seeking knowledge, but the code snippets they provide are often uncompilable and incomplete due to unresolved types and missing libraries. This poses a challenge for users who wish to reuse or analyze these snippets. Existing methods either do not focus on creating compilable code or have low success rates. To address this, we propose ZS4C, a lightweight approach for zero-shot synthesis of compilable code from incomplete snippets using Large Language Models (LLMs). ZS4C operates in two stages: first, it uses an LLM, like GPT-3.5, to identify missing import statements in a snippet; second, it collaborates with a validator (e.g., compiler) to fix compilation errors caused by incorrect imports and syntax issues. We evaluated ZS4C on the StatType-SO benchmark and a new dataset, Python-SO, which includes 539 Python snippets from Stack Overflow across the 20 most popular Python libraries. ZS4C significantly outperforms existing methods, improving the compilation rate from 63% to 95.1% compared to the state-of-the-art SnR, marking a 50.1% improvement. On average, ZS4C can infer more accurate import statements (with an F1 score of 0.98) than SnR, with an improvement of 8.5% in the F1.
Published: 2024

48. Studying and Recommending Information Highlighting in Stack Overflow Answers

Author: Ahmed, Shahla Shaan, Wang, Shaowei, Tian, Yuan, Chen, Tse-Hsun, and Zhang, Haoxiang
Subjects: Computer Science - Computation and Language, Computer Science - Information Retrieval, Computer Science - Machine Learning, Computer Science - Software Engineering
Abstract: Context: Navigating the knowledge of Stack Overflow (SO) remains challenging. To make posts vivid to users, SO allows users to write and edit posts with Markdown or HTML so that they can use various formatting styles (e.g., bold, italic, and code) to highlight important information. Nonetheless, there have been limited studies on the highlighted information. Objective: In a recent study, we carried out the first large-scale exploratory analysis of the information highlighted in SO answers. To extend that work, we develop approaches to automatically recommend highlighted content with formatting styles, using neural network architectures initially designed for the Named Entity Recognition task. Method: In this paper, we studied 31,169,429 answers of Stack Overflow. For training recommendation models, we chose CNN-based and BERT-based models for each type of formatting (i.e., Bold, Italic, Code, and Heading), using the information highlighting dataset we collected from SO answers. Results: Our models achieve a precision ranging from 0.50 to 0.72 for different formatting types. It is easier to build a model to recommend Code than other types. Models for text formatting types (i.e., Heading, Bold, and Italic) suffer from low recall. Our analysis of failure cases indicates that the majority are due to missing identification. One explanation is that the models easily learn frequently highlighted words while struggling to learn less frequent ones (i.e., long-tail knowledge). Conclusion: Our findings suggest that it is possible to develop recommendation models for highlighting information in answers with different formatting styles on Stack Overflow.
Comment: This work is submitted to Information and Software Technology Journal
Published: 2024

49. Strain regulates the photovoltaic performance of thick-film perovskites.

Author: Shi, Pengju, Xu, Jiazhe, Yavuz, Ilhan, Huang, Tianyi, Tan, Shaun, Zhao, Ke, Zhang, Xu, Tian, Yuan, Wang, Sisi, Fan, Wei, Li, Yahui, Jin, Donger, Yu, Xuemeng, Wang, Chenyue, Gao, Xingyu, Chen, Zhong, Shi, Enzheng, Chen, Xihan, Yang, Deren, Xue, Jingjing, Wang, Rui, and Yang, Yang
Abstract: Perovskite photovoltaics, typically based on a solution-processed perovskite layer with a film thickness of a few hundred nanometres, have emerged as a leading thin-film photovoltaic technology. Nevertheless, many critical issues pose challenges to its commercialization progress, including industrial compatibility, stability, scalability and reliability. A thicker perovskite film on a scale of micrometres could mitigate these issues. However, the efficiencies of thick-film perovskite cells lag behind those with nanometre film thickness. With the mechanism remaining elusive, the community has long been under the impression that the limiting factor lies in the short carrier lifetime as a result of defects. Here, by constructing a perovskite system with extraordinarily long carrier lifetime, we rule out the restrictions of carrier lifetime on the device performance. Through this, we unveil the critical role of the ignored lattice strain in thick films. Our results provide insights into the factors limiting the performance of thick-film perovskite devices.
Published: 2024

50. Layer-dependent evolution of electronic structures and correlations in rhombohedral multilayer graphene

Author: Zhang, Yang, Zhou, Yue-Ying, Zhang, Shihao, Cai, Hao, Tong, Ling-Hui, Tian, Yuan, Chen, Tongtong, Tian, Qiwei, Zhang, Chen, Wang, Yiliu, Zou, Xuming, Liu, Xingqiang, Hu, Yuanyuan, Ren, Ya-Ning, Zhang, Li, Zhang, Lijie, Wang, Wen-Xiao, He, Lin, Liao, Lei, Qin, Zhihui, and Yin, Long-Jing
Subjects: Condensed Matter - Mesoscale and Nanoscale Physics, Condensed Matter - Strongly Correlated Electrons
Abstract: The recent discovery of superconductivity and magnetism in trilayer rhombohedral graphene (RG) establishes an ideal, untwisted platform to study strongly correlated electronic phenomena. However, the correlated effects in multilayer RG have received limited attention, and, in particular, the evolution of the correlations with increasing layer number remains an unresolved question. Here, we show the observation of layer-dependent electronic structures and correlations, surprisingly at liquid nitrogen temperature, in RG multilayers from 3 to 9 layers by using scanning tunneling microscopy and spectroscopy. We explicitly determine layer-enhanced low-energy flat bands and interlayer coupling strengths. The former directly demonstrates the further flattening of low-energy bands in thicker RG, and the latter indicates the presence of varying interlayer interactions in RG multilayers. Moreover, we find significant splittings of the flat bands, of ~50-80 meV, at 77 K when they are partially filled, indicating the emergence of interaction-induced strongly correlated states. In particular, the strength of the correlated states is notably enhanced in thicker RG and reaches its maximum in six-layer RG, directly validating theoretical predictions and establishing abundant new candidates for strongly correlated systems. Our results provide valuable insights into the layer dependence of the electronic properties in RG and demonstrate it as a suitable system for investigating robust and highly accessible correlated phases.
Comment: 20 pages, 4 figures
Published: 2023
