Author: "Jiang, Lingxiao" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Jiang, Lingxiao"' showing total 351 results

1. Evaluating Software Development Agents: Patch Patterns, Code Quality, and Issue Complexity in Real-World GitHub Scenarios

Author: Chen, Zhi and Jiang, Lingxiao
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence
Abstract: In recent years, AI-based software engineering has progressed from pre-trained models to advanced agentic workflows, with Software Development Agents representing the next major leap. These agents, capable of reasoning, planning, and interacting with external environments, offer promising solutions to complex software engineering tasks. However, while much research has evaluated code generated by large language models (LLMs), comprehensive studies on agent-generated patches, particularly in real-world settings, are lacking. This study addresses that gap by evaluating 4,892 patches from 10 top-ranked agents on 500 real-world GitHub issues from SWE-Bench Verified, focusing on their impact on code quality. Our analysis shows no single agent dominated, with 170 issues unresolved, indicating room for improvement. Even for patches that passed unit tests and resolved issues, agents made different file and function modifications compared to the gold patches from repository developers, revealing limitations in the benchmark's test case coverage. Most agents maintained code reliability and security, avoiding new bugs or vulnerabilities; while some agents increased code complexity, many reduced code duplication and minimized code smells. Finally, agents performed better on simpler codebases, suggesting that breaking complex tasks into smaller sub-tasks could improve effectiveness. This study provides the first comprehensive evaluation of agent-generated patches on real-world GitHub issues, offering insights to advance AI-driven software development.
Comment: 10 pages of main content and 2 pages of references
Published: 2024

2. Promise and Peril of Collaborative Code Generation Models: Balancing Effectiveness and Memorization

Author: Chen, Zhi and Jiang, Lingxiao
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: In the rapidly evolving field of machine learning, training models with datasets from various locations and organizations presents significant challenges due to privacy and legal concerns. The exploration of effective collaborative training settings capable of leveraging valuable knowledge from distributed and isolated datasets is increasingly crucial. This study investigates key factors that impact the effectiveness of collaborative training methods in code next-token prediction, as well as the correctness and utility of the generated code, demonstrating the promise of such methods. Additionally, we evaluate the memorization of different participant training data across various collaborative training settings, including centralized, federated, and incremental training, highlighting their potential risks in leaking data. Our findings indicate that the size and diversity of code datasets are pivotal factors influencing the success of collaboratively trained code models. We show that federated learning achieves competitive performance compared to centralized training while offering better data protection, as evidenced by lower memorization ratios in the generated code. However, federated learning can still produce verbatim code snippets from hidden training data, potentially violating privacy or copyright. Our study further explores effectiveness and memorization patterns in incremental learning, emphasizing the sequence in which individual participant datasets are introduced. We also identify cross-organizational clones as a prevalent challenge in both centralized and federated learning scenarios. Our findings highlight the persistent risk of data leakage during inference, even when training data remains unseen. We conclude with recommendations for practitioners and researchers to optimize multisource datasets, propelling cross-organizational collaboration forward.
Comment: Paper accepted to the ASE 2024 Conference Research Track
Published: 2024

3. Vital: Vulnerability-Oriented Symbolic Execution via Type-Unsafe Pointer-Guided Monte Carlo Tree Search

Author: Tu, Haoxin, Jiang, Lingxiao, and Böhme, Marcel
Subjects: Computer Science - Software Engineering, Computer Science - Cryptography and Security
Abstract: How to find memory safety bugs efficiently when navigating a symbolic execution tree that suffers from path explosion? Existing solutions either adopt path search heuristics to maximize coverage rate or chopped symbolic execution to skip uninteresting code (i.e., manually labeled as vulnerability-unrelated) during path exploration. However, most existing search heuristics are not vulnerability-oriented, and manual labeling of irrelevant code-to-be-skipped relies heavily on prior expert knowledge, making it hard to detect vulnerabilities effectively in practice. This paper proposes Vital, a new vulnerability-oriented symbolic execution via type-unsafe pointer-guided Monte Carlo Tree Search (MCTS). A pointer that is type unsafe cannot be statically proven to be safely dereferenced without memory corruption. Our key hypothesis is that a path with more type unsafe pointers is more likely to contain vulnerabilities. Vital drives a guided MCTS to prioritize paths in the symbolic execution tree that contain a larger number of unsafe pointers and to effectively navigate the exploration-exploitation trade-off. We built Vital on top of KLEE and compared it with existing search strategies and chopped symbolic execution. In the former, the results demonstrate that Vital could cover up to 90.03% more unsafe pointers and detect up to 37.50% more unique memory errors. In the latter, the results show that Vital could achieve a speedup of up to 30x execution time and a reduction of up to 20x memory consumption on automatically detecting known vulnerabilities without prior expert knowledge.
Comment: 12 pages
Published: 2024

4. Navigating Governance Paradigms: A Cross-Regional Comparative Study of Generative AI Governance Processes & Principles

Author: Luna, Jose, Tan, Ivan, Xie, Xiaofei, and Jiang, Lingxiao
Subjects: Computer Science - Computers and Society, K.5.2, K.4.1, H.1.2
Abstract: As Generative Artificial Intelligence (GenAI) technologies evolve at an unprecedented rate, global governance approaches struggle to keep pace with the technology, highlighting a critical issue in the governance adaptation of significant challenges. Depicting the nuances of nascent and diverse governance approaches based on risks, rules, outcomes, principles, or a mix across different regions around the globe is fundamental to discern discrepancies and convergences and to shed light on specific limitations that need to be addressed, thereby facilitating the safe and trustworthy adoption of GenAI. In response to the need and the evolving nature of GenAI, this paper seeks to provide a collective view of different governance approaches around the world. Our research introduces a Harmonized GenAI Framework, "H-GenAIGF," based on the current governance approaches of six regions: European Union (EU), United States (US), China (CN), Canada (CA), United Kingdom (UK), and Singapore (SG). We have identified four constituents, fifteen processes, twenty-five sub-processes, and nine principles that aid the governance of GenAI, thus providing a comprehensive perspective on the current state of GenAI governance. In addition, we present a comparative analysis to facilitate the identification of common ground and distinctions based on the coverage of the processes by each region. The results show that risk-based approaches allow for better coverage of the processes, followed by mixed approaches. Other approaches lag behind, covering less than 50% of the processes. Most prominently, the analysis demonstrates that among the regions, only one process aligns across all approaches, highlighting the lack of consistent and executable provisions. Moreover, our case study on ChatGPT reveals process coverage deficiency, showing that harmonization of approaches is necessary to find alignment for GenAI governance.
Comment: To appear at AIES 2024
Published: 2024

5. Study on the surface integrity of ultrasonic cavitation-assisted WEDM-LS under Rayleigh-Plesset model

Author: Wang, Yan, Jin, Ruiqi, Chen, Yizhang, Xiong, Wei, Li, Bingchu, and Jiang, Lingxiao
Published: 2024
Full Text: View/download PDF

6. Evaluating Pre-trained Language Models for Repairing API Misuses

Author: Zhang, Ting, Irsan, Ivana Clairine, Thung, Ferdian, Lo, David, Sharma, Asankhaya, and Jiang, Lingxiao
Subjects: Computer Science - Software Engineering
Abstract: API misuses often lead to software bugs, crashes, and vulnerabilities. While several API misuse detectors have been proposed, there are no automatic repair tools specifically designed for this purpose. In a recent study, test-suite-based automatic program repair (APR) tools were found to be ineffective in repairing API misuses. Still, since the study focused on non-learning-aided APR tools, it remains unknown whether learning-aided APR tools are capable of fixing API misuses. In recent years, pre-trained language models (PLMs) have succeeded greatly in many natural language processing tasks. There is a rising interest in applying PLMs to APR. However, there has not been any study that investigates the effectiveness of PLMs in repairing API misuse. To fill this gap, we conduct a comprehensive empirical study on 11 learning-aided APR tools, which include 9 of the state-of-the-art general-purpose PLMs and two APR tools. We evaluate these models with an API-misuse repair dataset, consisting of two variants. Our results show that PLMs perform better than the studied APR tools in repairing API misuses. Among the 9 pre-trained models tested, CodeT5 is the best performer in the exact match. We also offer insights and potential exploration directions for future research.
Comment: Under review by TOSEM
Published: 2023

7. Unleashing the Power of Clippy in Real-World Rust Projects

Author: Li, Chunmiao, Yu, Yijun, Wu, Haitao, Carlig, Luca, Nie, Shijie, and Jiang, Lingxiao
Subjects: Computer Science - Software Engineering
Abstract: Clippy lints are considered as essential tools for Rust developers, as they can be configured as gate-keeping rules for a Rust project during continuous integration. Despite their availability, little was known about practical application and cost-effectiveness of the lints in reducing code quality issues. In this study, we embark on a comprehensive analysis to unveil the true impact of Clippy lints in the Rust development landscape. The study is structured around three interrelated components, each contributing to the overall effectiveness of Clippy. Firstly, we conduct a comprehensive analysis of Clippy lints in all idiomatic crates-io Rust projects with an average warning density of 21/KLOC. The analysis identifies the most cost-effective lint fixes, offering valuable opportunities for optimizing code quality. Secondly, we actively engage Rust developers through a user survey to garner invaluable feedback on their experiences with Clippy. User insights shed light on two crucial concerns: the prevalence of false positives in warnings and the need for auto-fix support for most warnings. Thirdly, building upon these findings, we engineer three innovative automated refactoring techniques to effectively fix the four most frequent Clippy lints. As a result, the warning density in Rosetta benchmarks has significantly decreased from 195/KLOC to an impressive 18/KLOC, already lower than the average density of the crates-io Rust projects. These results demonstrate tangible benefit and impact of our efforts in enhancing the overall code quality and maintainability for Rust developers.
Published: 2023

8. Isolating Compiler Bugs by Generating Effective Witness Programs with Large Language Models

Author: Tu, Haoxin, Zhou, Zhide, Jiang, He, Yusuf, Imam Nur Bani, Li, Yuxian, and Jiang, Lingxiao
Subjects: Computer Science - Software Engineering
Abstract: Compiler bugs pose a significant threat to safety-critical applications, and promptly as well as effectively isolating these bugs is crucial for assuring the quality of compilers. However, the limited availability of debugging information on reported bugs complicates the compiler bug isolation task. Existing compiler bug isolation approaches convert the problem into a test program mutation problem, but they are still limited by ineffective mutation strategies or high human effort requirements. Drawing inspiration from the recent progress of pre-trained Large Language Models (LLMs), such as ChatGPT, in code generation, we propose a new approach named LLM4CBI to utilize LLMs to generate effective test programs for compiler bug isolation. However, using LLMs directly for test program mutation may not yield the desired results due to the challenges associated with formulating precise prompts and selecting specialized prompts. To overcome the challenges, three new components are designed in LLM4CBI. First, LLM4CBI utilizes a program complexity-guided prompt production component, which leverages data and control flow analysis to identify the most valuable variables and locations in programs for mutation. Second, LLM4CBI employs a memorized prompt selection component, which adopts reinforcement learning to select specialized prompts for mutating test programs continuously. Third, a test program validation component is proposed to select specialized feedback prompts to avoid repeating the same mistakes during the mutation process. Compared with state-of-the-art approaches over 120 real bugs from GCC and LLVM, our evaluation demonstrates the advantages of LLM4CBI: It can isolate 69.70%/21.74% and 24.44%/8.92% more bugs than DiWi and RecBi within Top-1/Top-5 ranked results. We also demonstrate that the LLMs component used in LLM4CBI can be easily replaced while still achieving reasonable results.
Comment: Accepted by IEEE Transactions on Software Engineering
Published: 2023
Full Text: View/download PDF

9. Duplicate Bug Report Detection: How Far Are We?

Author: Zhang, Ting, Han, DongGyun, Vinayakarao, Venkatesh, Irsan, Ivana Clairine, Xu, Bowen, Thung, Ferdian, Lo, David, and Jiang, Lingxiao
Subjects: Computer Science - Software Engineering
Abstract: Many Duplicate Bug Report Detection (DBRD) techniques have been proposed in the research literature. The industry uses some other techniques. Unfortunately, there is insufficient comparison among them, and it is unclear how far we have been. This work fills this gap by comparing the aforementioned techniques. To compare them, we first need a benchmark that can estimate how a tool would perform if applied in a realistic setting today. Thus, we first investigated potential biases that affect the fair comparison of the accuracy of DBRD techniques. Our experiments suggest that data age and issue tracking system choice cause a significant difference. Based on these findings, we prepared a new benchmark. We then used it to evaluate DBRD techniques to estimate better how far we have been. Surprisingly, a simpler technique outperforms recently proposed sophisticated techniques on most projects in our benchmark. In addition, we compared the DBRD techniques proposed in research with those used in Mozilla and VSCode. Surprisingly, we observe that a simple technique already adopted in practice can achieve comparable results as a recently proposed research tool. Our study gives reflections on the current state of DBRD, and we share our insights to benefit future DBRD research.
Comment: Accepted by ACM Transactions on Software Engineering and Methodology
Published: 2022

10. Remote teaching system for robotic surgery and its validation: results of a randomized controlled study

Author: Jiang, Lingxiao, Chen, Gaojie, Li, Lu, Chen, Ziyan, Yang, Kun, and Wang, Xinghuan
Published: 2023
Full Text: View/download PDF

11. MANDO: Multi-Level Heterogeneous Graph Embeddings for Fine-Grained Detection of Smart Contract Vulnerabilities

Author: Nguyen, Hoang H., Nguyen, Nhat-Minh, Xie, Chunyao, Ahmadi, Zahra, Kudendo, Daniel, Doan, Thanh-Nam, and Jiang, Lingxiao
Subjects: Computer Science - Software Engineering, Computer Science - Machine Learning, I.2.5, D.2.4
Abstract: Learning heterogeneous graphs consisting of different types of nodes and edges enhances the results of homogeneous graph techniques. An interesting example of such graphs is control-flow graphs representing possible software code execution flows. As such graphs represent more semantic information of code, developing techniques and tools for such graphs can be highly beneficial for detecting vulnerabilities in software for its reliability. However, existing heterogeneous graph techniques are still insufficient in handling complex graphs where the number of different types of nodes and edges is large and variable. This paper concentrates on the Ethereum smart contracts as a sample of software codes represented by heterogeneous contract graphs built upon both control-flow graphs and call graphs containing different types of nodes and links. We propose MANDO, a new heterogeneous graph representation to learn such heterogeneous contract graphs' structures. MANDO extracts customized metapaths, which compose relational connections between different types of nodes and their neighbors. Moreover, it develops a multi-metapath heterogeneous graph attention network to learn multi-level embeddings of different types of nodes and their metapaths in the heterogeneous contract graphs, which can capture the code semantics of smart contracts more accurately and facilitate both fine-grained line-level and coarse-grained contract-level vulnerability detection. Our extensive evaluation of large smart contract datasets shows that MANDO improves the vulnerability detection results of other techniques at the coarse-grained contract level. More importantly, it is the first learning-based approach capable of identifying vulnerabilities at the fine-grained line-level, and significantly improves the traditional code analysis-based vulnerability detection approaches by 11.35% to 70.81% in terms of F1-score.
Comment: Accepted at the 9th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2022 - Research Track)
Published: 2022

12. BlockScope: Detecting and Investigating Propagated Vulnerabilities in Forked Blockchain Projects

Author: Yi, Xiao, Fang, Yuzhou, Wu, Daoyuan, and Jiang, Lingxiao
Subjects: Computer Science - Cryptography and Security
Abstract: Due to the open-source nature of the blockchain ecosystem, it is common for new blockchains to fork or partially reuse the code of classic blockchains. For example, the popular Dogecoin, Litecoin, Binance BSC, and Polygon are all variants of Bitcoin/Ethereum. These "forked" blockchains thus could encounter similar vulnerabilities that are propagated from Bitcoin/Ethereum during forking or subsequently commit fetching. In this paper, we conduct a systematic study of detecting and investigating the propagated vulnerabilities in forked blockchain projects. To facilitate this study, we propose BlockScope, a novel tool that can effectively and efficiently detect multiple types of cloned vulnerabilities given an input of existing Bitcoin/Ethereum security patches. Specifically, BlockScope adopts similarity-based code match and designs a new way of calculating code similarity to cover all the syntax-wide variant (i.e., Type-1, Type-2, and Type-3) clones. Moreover, BlockScope automatically extracts and leverages the contexts of patch code to narrow down the search scope and locate only potentially relevant code for comparison. Our evaluation shows that BlockScope achieves good precision and high recall both at 91.8% (1.8 times higher recall than that in ReDeBug). BlockScope allows us to discover 101 previously unknown vulnerabilities in 13 out of the 16 forked projects of Bitcoin and Ethereum, including 16 from Dogecoin, 6 from Litecoin, 1 from Binance, and 4 from Optimism. We have reported all the vulnerabilities to their developers; 40 of them have been patched or accepted, 66 were acknowledged or under pending, and only 4 were rejected. We further investigate the propagation and patching processes of discovered vulnerabilities, and reveal three types of vulnerability propagation from source to forked projects, as well as the long delay (over 200 days) for releasing patches in Bitcoin forks.
Comment: The paper was accepted by ISOC NDSS 2023
Published: 2022
Full Text: View/download PDF

13. AutoPRTitle: A Tool for Automatic Pull Request Title Generation

Author: Irsan, Ivana Clairine, Zhang, Ting, Thung, Ferdian, Lo, David, and Jiang, Lingxiao
Subjects: Computer Science - Software Engineering
Abstract: With the rise of the pull request mechanism in software development, the quality of pull requests has gained more attention. Prior works focus on improving the quality of pull request descriptions and several approaches have been proposed to automatically generate pull request descriptions. As an essential component of a pull request, pull request titles have not received a similar level of attention. To further facilitate automation in software development and to help developers in drafting high-quality pull request titles, we introduce AutoPRTitle. AutoPRTitle is specifically designed to automatically generate pull request titles. AutoPRTitle can generate a precise and succinct pull request title based on the pull request description, commit messages, and the associated issue titles. AutoPRTitle is built upon a state-of-the-art text summarization model, BART, which has been pre-trained on large-scale English corpora. We further fine-tuned BART in a pull request dataset containing high-quality pull request titles. We implemented AutoPRTitle as a stand-alone web application. We conducted two sets of evaluations: one concerning the model accuracy and the other concerning the tool usability. For model accuracy, BART outperforms the best baseline by 24.6%, 40.5%, and 23.3%, respectively. For tool usability, the evaluators consider our tool as easy-to-use and useful when creating a pull request title of good quality. Source code: https://github.com/soarsmu/Auto-PR-Title Video demo: https://tinyurl.com/AutoPRTitle
Comment: Accepted by the ICSME'22 Tool Demonstration Track
Published: 2022

14. Automatic Pull Request Title Generation

Author: Zhang, Ting, Irsan, Ivana Clairine, Thung, Ferdian, Han, DongGyun, Lo, David, and Jiang, Lingxiao
Subjects: Computer Science - Software Engineering
Abstract: Pull Requests (PRs) are a mechanism on modern collaborative coding platforms, such as GitHub. PRs allow developers to tell others that their code changes are available for merging into another branch in a repository. A PR needs to be reviewed and approved by the core team of the repository before the changes are merged into the branch. Usually, reviewers need to identify a PR that is in line with their interests before providing a review. By default, PRs are arranged in a list view that shows the titles of PRs. Therefore, it is desirable to have a precise and concise title, which is beneficial for both reviewers and other developers. However, it is often the case that developers do not provide good titles; we find that many existing PR titles are either inappropriate in length (i.e., too short or too long) or fail to convey useful information, which may result in PR being ignored or rejected. Therefore, there is a need for automatic techniques to help developers draft high-quality titles. In this paper, we introduce the task of automatic generation of PR titles. We formulate the task as a one-sentence summarization task. To facilitate the research on this task, we construct a dataset that consists of 43,816 PRs from 495 GitHub repositories. We evaluated the state-of-the-art summarization approaches for the automatic PR title generation task. We leverage ROUGE metrics to automatically evaluate the summarization approaches and conduct a manual evaluation. The experimental results indicate that BART is the best technique for generating satisfactory PR titles with ROUGE-1, ROUGE-2, and ROUGE-L F1-scores of 47.22, 25.27, and 43.12, respectively. The manual evaluation also shows that the titles generated by BART are preferred.
Comment: Accepted by the ICSME'22 research track
Published: 2022

15. iTiger: An Automatic Issue Title Generation Tool

Author: Zhang, Ting, Irsan, Ivana Clairine, Thung, Ferdian, Han, DongGyun, Lo, David, and Jiang, Lingxiao
Subjects: Computer Science - Software Engineering
Abstract: In both commercial and open-source software, bug reports or issues are used to track bugs or feature requests. However, the quality of issues can differ a lot. Prior research has found that bug reports with good quality tend to gain more attention than the ones with poor quality. As an essential component of an issue, title quality is an important aspect of issue quality. Moreover, issues are usually presented in a list view, where only the issue title and some metadata are present. In this case, a concise and accurate title is crucial for readers to grasp the general concept of the issue and facilitate the issue triaging. Previous work formulated the issue title generation task as a one-sentence summarization task. A sequence-to-sequence model was employed to solve this task. However, it requires a large amount of domain-specific training data to attain good performance in issue title generation. Recently, pre-trained models, which learned knowledge from large-scale general corpora, have shown much success in software engineering tasks. In this work, we make the first attempt to fine-tune BART, which has been pre-trained using English corpora, to generate issue titles. We implemented the fine-tuned BART as a web tool named iTiger, which can suggest an issue title based on the issue description. iTiger is fine-tuned on 267,094 GitHub issues. We compared iTiger with the state-of-the-art method, i.e., iTAPE, on 33,438 issues. The automatic evaluation shows that iTiger outperforms iTAPE by 29.7%, 50.8%, and 34.1%, in terms of ROUGE-1, ROUGE-2, ROUGE-L F1-scores. The manual evaluation also demonstrates the titles generated by BART are preferred by evaluators over the titles generated by iTAPE in 72.7% of cases. Besides, the evaluators deem our tool as useful and easy-to-use. They are also interested to use our tool in the future.
Comment: Accepted by the ESEC/FSE 2022 Demonstrations Track
Published: 2022
Full Text: View/download PDF

16. Fuzzing drones for anomaly detection: A systematic literature review

Author: Malviya, Vikas K., Minn, Wei, Shar, Lwin Khin, and Jiang, Lingxiao
Published: 2025
Full Text: View/download PDF

17. An Empirical Study of Blockchain System Vulnerabilities: Modules, Types, and Patterns

Author: Yi, Xiao, Wu, Daoyuan, Jiang, Lingxiao, Fang, Yuzhou, Zhang, Kehuan, and Zhang, Wei
Subjects: Computer Science - Cryptography and Security, Computer Science - Software Engineering
Abstract: Blockchain, as a distributed ledger technology, becomes increasingly popular, especially for enabling valuable cryptocurrencies and smart contracts. However, the blockchain software systems inevitably have many bugs. Although bugs in smart contracts have been extensively investigated, security bugs of the underlying blockchain systems are much less explored. In this paper, we conduct an empirical study on blockchain's system vulnerabilities from four representative blockchains, Bitcoin, Ethereum, Monero, and Stellar. Specifically, we first design a systematic filtering process to effectively identify 1,037 vulnerabilities and their 2,317 patches from 34,245 issues/PRs (pull requests) and 85,164 commits on GitHub. We thus build the first blockchain vulnerability dataset. We then perform unique analyses of this dataset at three levels, including (i) file-level vulnerable module categorization by identifying and correlating module paths across projects, (ii) text-level vulnerability type clustering by natural language processing and similarity-based sentence clustering, and (iii) code-level vulnerability pattern analysis by generating and clustering code change signatures that capture both syntactic and semantic information of patch code fragments. Our analyses reveal three key findings: (i) some blockchain modules are more susceptible than the others; notably, each of the modules related to consensus, wallet, and networking has over 200 issues; (ii) about 70% of blockchain vulnerabilities are of traditional types, but we also identify four new types specific to blockchains; and (iii) we obtain 21 blockchain-specific vulnerability patterns that capture unique blockchain attributes and statuses, and demonstrate that they can be used to detect similar vulnerabilities in other popular blockchains, such as Dogecoin, Bitcoin SV, and Zcash.
Comment: The paper was accepted by ACM FSE 2022
Published: 2021
Full Text: View/download PDF

18. Autophagy activated by GR/miR-421–3p/mTOR pathway as a compensatory mechanism participates in chondrodysplasia induced by prenatal caffeine exposure in male fetal rats

Author: Han, Hui, Shi, Huasong, Jiang, Lingxiao, Zhang, Dingmei, Wang, Hui, Li, Jing, and Chen, Liaobin
Published: 2024
Full Text: View/download PDF

19. Genotypic and phenotypic characterization of glucose-6-phosphate dehydrogenase (G6PD) deficiency in Guangzhou, China

Author: Li, Ziyan, Huang, Zhenyi, Liu, Yanxia, Cao, Yunshan, Li, Yating, Fang, Yanping, Huang, Meiying, Liu, Zixi, Lin, Lijuan, and Jiang, Lingxiao
Published: 2023
Full Text: View/download PDF

20. Exploring the use of driving simulation to improve robotic surgery simulator training: an observational case–control study

Author: Chen, Ziyan, Zheng, Yu Xuan, Hubert, Jacques, Jiang, Lingxiao, Yang, Kun, and Wang, XingHuan
Published: 2023
Full Text: View/download PDF

21. Investigating Math Word Problems using Pretrained Multilingual Language Models

Author: Tan, Minghuan, Wang, Lei, Jiang, Lingxiao, and Jiang, Jing
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: In this paper, we revisit math word problems (MWPs) from the cross-lingual and multilingual perspective. We construct our MWP solvers over pretrained multilingual language models using sequence-to-sequence model with copy mechanism. We compare how the MWP solvers perform in cross-lingual and multilingual scenarios. To facilitate the comparison of cross-lingual performance, we first adapt the large-scale English dataset MathQA as a counterpart of the Chinese dataset Math23K. Then we extend several English datasets to bilingual datasets through machine translation plus human annotation. Our experiments show that the MWP solvers may not be transferred to a different language even if the target expressions have the same operator set and constants. But for both cross-lingual and multilingual cases, it can be better generalized if problem types exist on both source language and target language.
Comment: To appear in MathNLP (The 1st Workshop on Mathematical Natural Language Processing)
Published: 2021

22. AndroEvolve: Automated Update for Android Deprecated-API Usages

Author: Haryono, Stefanus Agus, Thung, Ferdian, Lo, David, Jiang, Lingxiao, Lawall, Julia, Kang, Hong Jin, Serrano, Lucas, and Muller, Gilles
Subjects: Computer Science - Software Engineering
Abstract: Android operating system (OS) is often updated, where each new version may involve API deprecation. Usages of deprecated APIs in Android apps need to be updated to ensure the apps' compatibility with the old and new versions of Android OS. In this work, we propose AndroEvolve, an automated tool to update usages of deprecated Android APIs, that addresses the limitations of the state-of-the-art tool, CocciEvolve. AndroEvolve utilizes data flow analysis to solve the problem of out-of-method-boundary variables, and variable denormalization to remove the temporary variables introduced by CocciEvolve. We evaluated the accuracy of AndroEvolve using a dataset of 360 target files and 20 deprecated Android APIs, where AndroEvolve is able to produce 319 correct updates, compared to CocciEvolve which only produces 249 correct updates. We also evaluated the readability of AndroEvolve's update results using a manual and an automatic evaluation. Both evaluations demonstrated that the code produced by AndroEvolve has higher readability than CocciEvolve's. A video demonstration of AndroEvolve is available at https://youtu.be/siU0tuMITXI.
Published: 2020

23. InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees

Author: Bui, Nghi D. Q., Yu, Yijun, and Jiang, Lingxiao
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Programming Languages
Abstract: Building deep learning models on source code has found many successful software engineering applications, such as code search, code comment generation, bug detection, code migration, and so on. Current learning techniques, however, have a major drawback: the models are mostly trained on datasets labeled for particular downstream tasks, and the code representations may not be suitable for other tasks. While some techniques produce representations from unlabeled code, they are far from satisfactory when applied to downstream tasks. This paper proposes InferCode to overcome the limitation by adapting the self-supervised learning mechanism to build a source code model. The key novelty lies in training code representations by predicting automatically identified subtrees from the context of the ASTs. Subtrees in ASTs are treated with InferCode as the labels for training code representations without any human labeling effort or the overhead of expensive graph construction, and the trained representations are no longer tied to any specific downstream tasks or code units. We trained an InferCode model instance using the Tree-based CNN as the encoder of a large set of Java code and applied it to downstream unsupervised tasks such as code clustering, code clone detection, and cross-language code search, or reused it under a transfer learning scheme to continue training the model weights for supervised tasks such as code classification and method name prediction. Compared to previous code learning techniques applied to the same downstream tasks, such as Code2Vec, Code2Seq, and ASTNN, our pre-trained InferCode model achieves higher performance by a significant margin on most tasks, including those involving different programming languages., Comment: Accepted at ICSE 2021
Published: 2020

24. AndroEvolve: Automated Android API Update with Data Flow Analysis and Variable Denormalization

Author: Haryono, Stefanus A., Thung, Ferdian, Lo, David, Jiang, Lingxiao, Lawall, Julia, Kang, Hong Jin, Serrano, Lucas, and Muller, Gilles
Subjects: Computer Science - Software Engineering
Abstract: The Android operating system is frequently updated, with each version bringing a new set of APIs. New versions may involve API deprecation; Android apps using deprecated APIs need to be updated to ensure the apps' compatibility with old and new versions of Android. Updating deprecated APIs is a time-consuming endeavor. Hence, automating the updates of Android APIs can be beneficial for developers. CocciEvolve is the state-of-the-art approach for this automation. However, it has several limitations, including its inability to resolve out-of-method-boundary variables and the low code readability of its update due to the addition of temporary variables. In an attempt to further improve the performance of automated Android API update, we propose an approach named AndroEvolve, which addresses the limitations of CocciEvolve through the addition of data flow analysis and variable name denormalization. Data flow analysis enables AndroEvolve to resolve the value of any variable within the file scope. Variable name denormalization replaces temporary variables that may be present in the CocciEvolve update with appropriate values in the target file. We have evaluated the performance of AndroEvolve and the readability of its updates on 360 target files. AndroEvolve produces 26.90% more instances of correct updates compared to CocciEvolve. Moreover, our manual and automated evaluation shows that AndroEvolve updates are more readable than CocciEvolve updates.
Published: 2020

25. Characterization and Automatic Update of Deprecated Machine-Learning API Usages

Author: Haryono, Stefanus Agus, Thung, Ferdian, Lo, David, Lawall, Julia, and Jiang, Lingxiao
Subjects: Computer Science - Software Engineering
Abstract: Due to the rise of AI applications, machine learning libraries have become far more accessible, with Python being the most common programming language to write them. Machine learning libraries tend to be updated periodically, which may deprecate existing APIs, making it necessary for developers to update their usages. However, updating usages of deprecated APIs is typically not a priority for developers, leading to widespread usages of deprecated APIs, which expose library users to vulnerability issues. In this paper, we built a tool to automate these updates. We first conducted an empirical study to seek a better understanding of how updates of deprecated machine-learning API usages in Python can be done. The study involved a dataset of 112 deprecated APIs from Scikit-Learn, TensorFlow, and PyTorch. We found dimensions of deprecated API migration related to its update operation (i.e., the required operation to perform the migration), API mapping (i.e., the number of deprecated and its corresponding updated APIs), and context dependency (i.e., whether we need to consider surrounding contexts when performing the migration). Guided by the findings of our empirical study, we created MLCatchUp, a tool to automate the update of Python deprecated API usage that automatically infers the API migration transformation through comparison of the deprecated and updated API signatures. These transformations are expressed in a Domain Specific Language (DSL). We evaluated MLCatchUp using a test dataset containing 258 files with 514 API usages that we collected from public GitHub repositories. In this evaluation, MLCatchUp achieves a precision of 86.19%. We further improve the precision of MLCatchUp by adding a feature that allows it to accept additional user input to specify the transformation constraints in the DSL for context-dependent API migration, where MLCatchUp achieves a precision of 93.58%.
Published: 2020

26. Self-Supervised Contrastive Learning for Code Retrieval and Summarization via Semantic-Preserving Transformations

Author: Bui, Nghi D. Q., Yu, Yijun, and Jiang, Lingxiao
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Programming Languages
Abstract: We propose Corder, a self-supervised contrastive learning framework for source code models. Corder is designed to alleviate the need for labeled data in code retrieval and code summarization tasks. The pre-trained model of Corder can be used in two ways: (1) it can produce vector representations of code, which can be applied to code retrieval tasks that do not have labeled data; (2) it can be used in a fine-tuning process for tasks that might still require labeled data, such as code summarization. The key innovation is that we train the source code model by asking it to recognize similar and dissimilar code snippets through a contrastive learning objective. To do so, we use a set of semantic-preserving transformation operators to generate code snippets that are syntactically diverse but semantically equivalent. Through extensive experiments, we have shown that the code models pretrained by Corder substantially outperform the other baselines for code-to-code retrieval, text-to-code retrieval, and code-to-text summarization tasks., Comment: Accepted at SIGIR 2021
Published: 2020
Full Text: View/download PDF

27. TreeCaps: Tree-Based Capsule Networks for Source Code Processing

Author: Bui, Nghi D. Q., Yu, Yijun, and Jiang, Lingxiao
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence, Computer Science - Programming Languages
Abstract: Recently, program learning techniques have been proposed to process source code based on syntactical structures (e.g., Abstract Syntax Trees) and/or semantic information (e.g., Dependency Graphs). Although graphs may be better at capturing various viewpoints of code semantics than trees, constructing graph inputs from code needs static code semantic analysis that may not be accurate and introduces noise during learning. Although syntax trees are precisely defined according to the language grammar and easier to construct and process than graphs, previous tree-based learning techniques have not been able to learn semantic information from trees to achieve better accuracy than graph-based techniques. We propose a new learning technique, named TreeCaps, by fusing together capsule networks with tree-based convolutional neural networks, to achieve learning accuracy higher than existing graph-based techniques while it is based only on trees. TreeCaps introduces novel variable-to-static routing algorithms into the capsule networks to compensate for the loss of previous routing algorithms. Aside from accuracy, we also find that TreeCaps is the most robust to withstand those semantic-preserving program transformations that change code syntax without modifying the semantics. Evaluated on a large number of Java and C/C++ programs, TreeCaps models outperform prior deep learning models of program source code, in terms of both accuracy and robustness for program comprehension tasks such as code functionality classification and function name prediction., Comment: Accepted at AAAI 2021
Published: 2020

28. On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformations

Author: Rabin, Md Rafiqul Islam, Bui, Nghi D. Q., Wang, Ke, Yu, Yijun, Jiang, Lingxiao, and Alipour, Mohammad Amin
Subjects: Computer Science - Software Engineering, Computer Science - Machine Learning, Computer Science - Programming Languages
Abstract: With the prevalence of publicly available source code repositories to train deep neural network models, neural program models can do well in source code analysis tasks such as predicting method names in given programs that cannot be easily done by traditional program analysis techniques. Although such neural program models have been tested on various existing datasets, the extent to which they generalize to unforeseen source code is largely unknown. Since it is very challenging to test neural program models on all unforeseen programs, in this paper, we propose to evaluate the generalizability of neural program models with respect to semantic-preserving transformations: a generalizable neural program model should perform equally well on programs that are of the same semantics but of different lexical appearances and syntactical structures. We compare the results of various neural program models for the method name prediction task on programs before and after automated semantic-preserving transformations. We use three Java datasets of different sizes and three state-of-the-art neural network models for code, namely code2vec, code2seq, and GGNN, to build nine such neural program models for evaluation. Our results show that even with small semantically preserving changes to the programs, these neural program models often fail to generalize their performance. Our results also suggest that neural program models based on data and control dependencies in programs generalize better than neural program models based only on abstract syntax trees. On the positive side, we observe that as the size of the training dataset grows and diversifies, the generalizability of correct predictions produced by the neural program models can be improved too. Our results on the generalizability of neural program models provide insights to measure their limitations and provide a stepping stone for their improvement., Comment: Information and Software Technology, IST Journal 2021, Elsevier. Related to arXiv:2004.07313
Published: 2020
Full Text: View/download PDF

29. Automatic Android Deprecated-API Usage Update by Learning from Single Updated Example

Author: Haryono, Stefanus Agus, Thung, Ferdian, Kang, Hong Jin, Serrano, Lucas, Muller, Gilles, Lawall, Julia, Lo, David, and Jiang, Lingxiao
Subjects: Computer Science - Software Engineering, I.2.2
Abstract: Due to the deprecation of APIs in the Android operating system, developers have to update usages of the APIs to ensure that their applications work for both the past and current versions of Android. Such updates may be widespread, non-trivial, and time-consuming. Therefore, automation of such updates will be of great benefit to developers. AppEvolve, which is the state-of-the-art tool for automating such updates, relies on having before- and after-update examples to learn from. In this work, we propose an approach named CocciEvolve that performs such updates using only a single after-update example. CocciEvolve learns edits by extracting the relevant update to a block of code from an after-update example. From preliminary experiments, we find that CocciEvolve can successfully perform 96 out of 112 updates, with a success rate of 85%., Comment: 5 pages, 8 figures. Accepted in The International Conference on Program Comprehension (ICPC) 2020, ERA Track
Published: 2020

30. Checking Smart Contracts with Structural Code Embedding

Author: Gao, Zhipeng, Jiang, Lingxiao, Xia, Xin, Lo, David, and Grundy, John
Subjects: Computer Science - Software Engineering
Abstract: Smart contracts have been increasingly used together with blockchains to automate financial and business transactions. However, many bugs and vulnerabilities have been identified in many contracts, which raises serious concerns about smart contract security, not to mention that the blockchain systems on which the smart contracts are built can be buggy. Thus, there is a significant need to better maintain smart contract code and ensure its high reliability. In this paper, we propose an automated approach to learn characteristics of smart contracts in Solidity, which is useful for clone detection, bug detection and contract validation on smart contracts. Our new approach is based on word embeddings and vector space comparison. We parse smart contract code into word streams with code structural information, convert code elements (e.g., statements, functions) into numerical vectors that are supposed to encode the code syntax and semantics, and compare the similarities among the vectors encoding code and known bugs, to identify potential issues. We have implemented the approach in a prototype, named SmartEmbed, and evaluated it with more than 22,000 smart contracts collected from the Ethereum blockchain. Results show that our tool can effectively identify many repetitive instances of Solidity code, where the clone ratio is around 90%. Code clones such as type-III or even type-IV semantic clones can also be detected accurately. Our tool can identify more than 1000 clone related bugs based on our bug databases efficiently and accurately. Our tool can also help to efficiently validate any given smart contract against a known set of bugs, which can help to improve the users' confidence in the reliability of the contract. The anonymous replication packages can be accessed at: https://drive.google.com/file/d/1kauLT3y2IiHPkUlVx4FSTda-dVAyL4za/view?usp=sharing.
Published: 2020
Full Text: View/download PDF

31. Experimental comparison of features, analyses, and classifiers for Android malware detection

Author: Shar, Lwin Khin, Demissie, Biniam Fisseha, Ceccato, Mariano, Tun, Yan Naing, Lo, David, Jiang, Lingxiao, and Bienert, Christoph
Published: 2023
Full Text: View/download PDF

32. TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processing

Author: Jayasundara, Vinoj, Bui, Nghi Duy Quoc, Jiang, Lingxiao, and Lo, David
Subjects: Computer Science - Machine Learning, Computer Science - Software Engineering, Statistics - Machine Learning
Abstract: Program comprehension is a fundamental task in software development and maintenance processes. Software developers often need to understand a large amount of existing code before they can develop new features or fix bugs in existing programs. Being able to process programming language code automatically and provide summaries of code functionality accurately can significantly help developers to reduce time spent in code navigation and understanding, and thus increase productivity. Different from natural language articles, source code in programming languages often follows rigid syntactical structures and there can exist dependencies among code elements that are located far away from each other through complex control flows and data flows. Existing studies on tree-based convolutional neural networks (TBCNN) and gated graph neural networks (GGNN) are not able to capture essential semantic dependencies among code elements accurately. In this paper, we propose novel tree-based capsule networks (TreeCaps) and relevant techniques for processing program code in an automated way that encodes code syntactical structures and captures code dependencies more accurately. Based on evaluation on programs written in different programming languages, we show that our TreeCaps-based approach can outperform other approaches in classifying the functionalities of many programs., Comment: in NeurIPS Workshop on ML for Systems, 2019
Published: 2019

33. SmartEmbed: A Tool for Clone and Bug Detection in Smart Contracts through Structural Code Embedding

Author: Gao, Zhipeng, Jayasundara, Vinoj, Jiang, Lingxiao, Xia, Xin, Lo, David, and Grundy, John
Subjects: Computer Science - Software Engineering
Abstract: Ethereum has become a widely used platform to enable secure, Blockchain-based financial and business transactions. However, a major concern in Ethereum is the security of its smart contracts. Many identified bugs and vulnerabilities in smart contracts not only present challenges to maintenance of blockchain, but also lead to serious financial losses. There is a significant need to better assist developers in checking smart contracts and ensuring their reliability. In this paper, we propose a web service tool, named SmartEmbed, which can help Solidity developers to find repetitive contract code and clone-related bugs in smart contracts. Our tool is based on code embeddings and similarity checking techniques. By comparing the similarities among the code embedding vectors for existing Solidity code in the Ethereum blockchain and known bugs, we are able to efficiently identify code clones and clone-related bugs for any Solidity code given by users, which can help to improve the users' confidence in the reliability of their code. In addition to the uses by individual developers, SmartEmbed can also be applied to studies of smart contracts in a large scale. When applied to more than 22K Solidity contracts collected from the Ethereum blockchain, we found that the clone ratio of Solidity code is close to 90%, much higher than traditional software, and 194 clone-related bugs can be identified efficiently and accurately based on our small bug database with a precision of 96%. SmartEmbed can be accessed at http://www.smartembed.net. A demo video of SmartEmbed is at https://youtu.be/o9ylyOpYFq8.
Published: 2019
Full Text: View/download PDF

34. SAR: Learning Cross-Language API Mappings with Little Knowledge

Author: Bui, Nghi D. Q., Yu, Yijun, and Jiang, Lingxiao
Subjects: Computer Science - Machine Learning, Computer Science - Software Engineering, Statistics - Machine Learning
Abstract: To save manual effort, developers often translate programs from one programming language to another, instead of implementing them from scratch. Translating application program interfaces (APIs) used in one language to functionally equivalent ones available in another language is an important aspect of program translation. Existing approaches facilitate the translation by automatically identifying the API mappings across programming languages. However, all these approaches still require a large amount of manual effort in preparing parallel program corpora, ranging from pairs of APIs to manually identified code in different languages that is considered functionally equivalent. To minimize the manual effort in identifying parallel program corpora and API mappings, this paper aims at an automated approach to map APIs across languages with much less a priori knowledge than other existing approaches need. The approach is based on a realization of the notion of domain adaptation combined with code embedding, which can better align two vector spaces: taking as input large sets of programs, our approach first generates numeric vector representations of the programs, especially the APIs used in each language, and it adapts generative adversarial networks (GAN) to align the vectors from the spaces of two languages. For a better alignment, we initialize the GAN with parameters derived from optional API mapping seeds that can be identified accurately with a simple automatic signature-based matching heuristic. Then the cross-language API mappings can be identified via nearest-neighbors queries in the aligned vector spaces., Comment: Accepted at the 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE)
Published: 2019

35. Hierarchical Learning of Cross-Language Mappings through Distributed Vector Representations for Code

Author: Bui, Nghi D. Q. and Jiang, Lingxiao
Subjects: Computer Science - Learning, Computer Science - Computation and Language, Computer Science - Software Engineering
Abstract: Translating a program written in one programming language to another can be useful for software development tasks that need functionality implementations in different languages. Although past studies have considered this problem, they may be either specific to the language grammars, or specific to certain kinds of code elements (e.g., tokens, phrases, API uses). This paper proposes a new approach to automatically learn cross-language representations for various kinds of structural code elements that may be used for program translation. Our key idea is twofold: first, we normalize and enrich code token streams with additional structural and semantic information, and train cross-language vector representations for the tokens (a.k.a. shared embeddings) based on word2vec, a neural-network-based technique for producing word embeddings; second, hierarchically from bottom up, we construct shared embeddings for code elements of higher levels of granularity (e.g., expressions, statements, methods) from the embeddings for their constituents, and then build mappings among code elements across languages based on similarities among embeddings. Our preliminary evaluations on about 40,000 Java and C# source files from 9 software projects show that our approach can automatically learn shared embeddings for various code elements in different languages and identify their cross-language mappings with reasonable Mean Average Precision scores. When compared with an existing tool for mapping library API methods, our approach identifies many more mappings accurately. The mapping results and code can be accessed at https://github.com/bdqnghi/hierarchical-programming-language-mapping. We believe that our idea for learning cross-language vector representations with code structural information can be a useful step towards automated program translation., Comment: Accepted at ICSE'18
Published: 2018
Full Text: View/download PDF

36. Cross-Language Learning for Program Classification using Bilateral Tree-Based Convolutional Neural Networks

Author: Bui, Nghi D. Q., Jiang, Lingxiao, and Yu, Yijun
Subjects: Computer Science - Learning
Abstract: Towards the vision of translating code that implements an algorithm from one programming language into another, this paper proposes an approach for automated program classification using bilateral tree-based convolutional neural networks (BiTBCNNs). It is layered on top of two tree-based convolutional neural networks (TBCNNs), each of which recognizes the algorithm of code written in an individual programming language. The combination layer of the networks recognizes the similarities and differences among code in different programming languages. The BiTBCNNs are trained using the source code in different languages but known to implement the same algorithms and/or functionalities. For a preliminary evaluation, we use 3591 Java and 3534 C++ code snippets from 6 algorithms we crawled systematically from GitHub. We obtained over 90% accuracy in the cross-language binary classification task to tell whether any given two code snippets implement the same algorithm. Also, for the algorithm classification task, i.e., to predict which one of the six algorithm labels is implemented by an arbitrary C++ code snippet, we achieved over 80% precision., Comment: Accepted at NL4SE Workshop, AAAI'18
Published: 2017

37. Epidemiological Characteristics of Upper Respiratory Tract Pathogens in Children in Guangdong, China.

Author: Zhao, Qianwen, Ke, Peifeng, Hu, Liangshan, Jiang, Changhong, Su, Rong, Lv, Weifeng, Li, Qixin, Jiang, Lingxiao, and Cao, Donglin
Subjects: RESPIRATORY infections in children, RESPIRATORY infections, CHILD patients, VIRAL variation, RESPIRATORY syncytial virus
Abstract: Objective: Research on the epidemiology of various respiratory pathogens at multiple testing points in the pediatric population is limited, yet it is crucial for the prevention of respiratory tract infections in children. Methods: We obtained 1788 upper respiratory tract swabs from children exhibiting symptoms of respiratory infection (notably fever with a body temperature exceeding 38.5°C) across five hospitals in Guangdong between November 2020 and June 2022. We used multiplex probe amplification (MPA) PCR testing to identify 11 respiratory viruses and subsequently analyzed the prevalence characteristics of these pathogens among febrile children in hospitals. Results: The overall detection rate of the pathogens was 58.1% (1039/1788). Human rhinovirus (HRV) exhibited the highest detection rate at 19.0% (339/1788), followed by human parainfluenza virus (HPIV), human adenovirus (HAdV), and respiratory syncytial virus (RSV). The positivity and coinfection rates were higher in children aged 5 years and below compared to those above 5 years. Moreover, a distinct pathogen spectrum was observed across different age groups. Hospitalized patients demonstrated significantly higher positivity and coinfection rates compared to outpatients. During COVID-19, RSV showed a counter-seasonal trend. Conclusion: Respiratory viral infections in children display distinct characteristics concerning age, hospitalization status, and seasonality. Children under the age of 5 and minor patients admitted to hospitals should at least be tested for RSV, HRV, HPIV, and HAdV. The epidemiological patterns of RSV in the post-epidemic period require ongoing surveillance. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

38. Spatial Differentiation and Influencing Factors of Poverty Alleviation Performance Under the Background of Sustainable Development: A Case Study of Contiguous Destitute Areas in Hunan Province, China

Author: Tan, Xuelan, Yu, Hangling, An, Yue, Wang, Zhenkai, Jiang, Lingxiao, and Ren, Hui
Published: 2021
Full Text: View/download PDF

39. Teaching Software Development for Real-World Problems using a Microservice-Based Collaborative Problem-Solving Approach

Author: Lau, Yi Meng, Koh, Christian Michael, and Jiang, Lingxiao
Published: 2024
Full Text: View/download PDF

40. Unleashing the Power of Clippy in Real-World Rust Projects

Author: Li, Chunmiao, Yu, Yijun, Wu, Haitao, Carlig, Luca, Nie, Shijie, and Jiang, Lingxiao
Published: 2024
Full Text: View/download PDF

41. DronLomaly: Runtime Log-based Anomaly Detector for DJI Drones

Author: Minn, Wei, Tun, Yan Naing, Shar, Lwin Khin, and Jiang, Lingxiao
Published: 2024
Full Text: View/download PDF

42. Beyond a Joke: Dead Code Elimination Can Delete Live Code

Author: Tu, Haoxin, Jiang, Lingxiao, Gao, Debin, and Jiang, He
Published: 2024
Full Text: View/download PDF

43. Development and Clinical Evaluation of a CRISPR-Based Diagnostic for Rapid Group B Streptococcus Screening

Author: Jiang, Lingxiao, Zeng, Weiqi, Wu, Wanting, Deng, Yingying, He, Fusheng, Liang, Wenli, Huang, Mingyao, Huang, Hong, Li, Yongjun, Wang, Xiaorui, Su, Hang, Pan, Shilei, and Xu, Teng
Subjects: Perinatal infection -- Risk factors, Molecular diagnostic techniques -- Methods, Streptococcal infections -- Diagnosis -- Risk factors, Streptococcus agalactiae -- Identification and classification, Pregnant women -- Medical examination, Health
Abstract: Group B Streptococcus (GBS) is a common commensal bacterium of the vaginal flora with reported carriage rates of 4%-40% (1-3). Vertical transmission of GBS through fetal aspiration of infected amniotic fluid [...]
Published: 2021
Full Text: View/download PDF

44. On the generalizability of Neural Program Models with respect to semantic-preserving program transformations

Author: Rabin, Md Rafiqul Islam, Bui, Nghi D.Q., Wang, Ke, Yu, Yijun, Jiang, Lingxiao, and Alipour, Mohammad Amin
Published: 2021
Full Text: View/download PDF

45. Impedance Control of an Anthropomorphic Hands Without Finger Force Sensors

Author: Jiang, Lingxiao, Tian, Xinyang, Zhan, Qiang, Xu, Qinhuan, and Zhang, Yin
Abstract: With multiple fingers and multi-DOFs, an anthropomorphic hand can compliantly grasp objects with complex shape and low stiffness. In order to realize compliant grasp, contact force control between the anthropomorphic hand and the object is necessary. However, due to limited finger space for sensors and wiring, many anthropomorphic hands do not possess finger force sensors, so how to realize their compliant grasp control becomes a key issue. To solve this problem, this paper presents an impedance control method based on contact force observers and joint friction compensation for anthropomorphic hands without finger force sensors. A generalized momentum observer is used to estimate contact force, and an improved “static friction + Coulomb + viscous” model is adopted to realize joint friction compensation. The proposed impedance control method is verified both by simulations in Simulink and grasp experiments of an anthropomorphic hand. All the results show the method can not only estimate contact forces accurately and compensate joint friction, but also compliantly grasp objects with low stiffness and complex shape. Note to Practitioners: This paper is motivated by the demand for grasp control in anthropomorphic hands without finger force sensors. Force feedback is important in grasp control of anthropomorphic hands, but installing force sensors is highly challenging due to limited finger space for sensors and wiring, and the attendant consequences of increasing weight, volume and expenses. In order to realize such sensor-less grasp control, we present an impedance control method by calculating contact forces with a generalized momentum observer and estimating finger joint friction by an improved speed-friction model. Simulation studies and prototype experiments show the proposed method can stably grasp objects with varying stiffness and complex shapes while realizing expected dynamic characteristics. The proposed method can be used for other anthropomorphic hands without finger force sensors to improve their grasping capability. In future research, we will test the proposed method on more anthropomorphic hands without finger force sensors.
Published: 2024
Full Text: View/download PDF

46. MtdScout: Complementing the Identification of Insecure Methods in Android Apps via Source-to-Bytecode Signature Generation and Tree-based Layered Search

Author: Zhang, Zicheng, Ma, Haoyu, Wu, Daoyuan, Gao, Debin, Yi, Xiao, Chen, Yufan, Wu, Yan, and Jiang, Lingxiao
Published: 2024

47. Your Instructions Are Not Always Helpful: Assessing the Efficacy of Instruction Fine-tuning for Software Vulnerability Detection

Author: Yusuf, Imam Nur Bani and Jiang, Lingxiao
Abstract: Software, while beneficial, poses potential cybersecurity risks due to inherent vulnerabilities. Detecting these vulnerabilities is crucial, and deep learning has shown promise as an effective tool for this task due to its ability to perform well without extensive feature engineering. However, a challenge in deploying deep learning for vulnerability detection is the limited availability of training data. Recent research highlights the efficacy of deep learning in diverse tasks. This success is attributed to instruction fine-tuning, a technique that remains under-explored in the context of vulnerability detection. This paper investigates the capability of models, specifically a recent language model, to generalize beyond the programming languages used in their training data. It also examines the role of natural language instructions in enhancing this generalization. Our study evaluates the model's performance on a real-world dataset to predict vulnerable code. We present key insights and lessons learned, contributing to the understanding of how deep learning applies to software vulnerability detection.
Published: 2024

48. AndroEvolve: automated Android API update with data flow analysis and variable denormalization

Author: Haryono, Stefanus A., Thung, Ferdian, Lo, David, Jiang, Lingxiao, Lawall, Julia, Kang, Hong Jin, Serrano, Lucas, and Muller, Gilles
Published: 2022
Full Text: View/download PDF

49. Fuzzing Drones for Anomaly Detection: A Systematic Literature Review

Author: Malviya, Vikas Kumar, Minn, Wei, Shar, Lwin Khin, and Jiang, Lingxiao
Published: 2024
Full Text: View/download PDF

50. Isolating Compiler Bugs by Generating Effective Witness Programs with Large Language Models

Author: Tu, Haoxin, Zhou, Zhide, Jiang, He, Yusuf, Imam Nur Bani, Li, Yuxian, and Jiang, Lingxiao
Published: 2024
Full Text: View/download PDF
