1,101 results for "Wang, Yidong"
Search Results
Results 2-21 share the same contributor list and publication year (2024): Gross, Larry, Beets, Becca, Newman, Todd P, Brown, Danielle K, McGregor, Shannon C, Kreiss, Daniel, Mancino, Susan, Connaughton, Stacey L, Christian, Aymar Jean, Almeida, Elaine, Wang, Yidong, Robinson, Sue, Ramasubramanian, Srividya, Wilkin, Holley, Gardner, Paula, Raphael, Chad, Napoli, Philip M, Kuo, Rachel, Jordan, Amy, Waisbord, Silvio, and Billard, Thomas J.
2. Index
3. Back cover
4. Contributors
5. Epilogue: Reflections on a Half Century of Public Scholarship
6. 14. Science Communication
7. 12. Political Communication
8. 11. Philosophy and Critique
9. 10. Organizational Communication
10. 13. Race and Ethnic Studies
11. 9. Queer Studies
12. Chapter 7. Intercultural Communication
13. 4. Environmental Communication
14. 6. Health Communication
15. 2. Communication and Digital Technology
16. 8. Journalism Studies
17. 5. Feminist Studies
18. Introduction. The Promethean Imperative: An Introduction to Public Scholarship in Communication Studies
19. 3. Media and Technology Policy
20. Half-Title Page, Title Page, Copyright
21. 1. Children and Media
22. On the Diversity of Synthetic Data and its Impact on Training Large Language Models
- Author: Chen, Hao, Waheed, Abdul, Li, Xiang, Wang, Yidong, Wang, Jindong, Raj, Bhiksha, and Abdin, Marah I.
- Subjects: Computer Science - Computation and Language
- Abstract:
The rise of Large Language Models (LLMs) has accentuated the need for diverse, high-quality pre-training data. Synthetic data emerges as a viable solution to the challenges of data scarcity and inaccessibility. While previous literature has focused predominantly on the quality and quantity of real data, our work enables the measurement of diversity in synthetic data and explores its impact on LLM performance. We study the downstream effects of synthetic data diversity during both the pre-training and fine-tuning stages by introducing a new diversity metric, LLM cluster-agent, designed to evaluate the diversity of synthetic datasets. Through a series of controlled experiments with models of 350M and 1.4B parameters, we demonstrate that the proposed cluster-based LLM scoring of diversity correlates positively with both pre-training and supervised fine-tuning performance. Our findings also reveal that synthetic data diversity in pre-training affects supervised fine-tuning more significantly than pre-training itself, even for smaller models. We hope this study advances our understanding of the optimal use of synthetic data in LLM training and opens new avenues for efficient data generation processes.
- Published: 2024
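
The "LLM cluster-agent" metric above is described only at a high level in the abstract. As a loose illustration of the underlying idea of cluster-based diversity scoring, the sketch below clusters TF-IDF representations of synthetic texts with scikit-learn and reports the normalized entropy of the cluster assignments; this is a hypothetical classical stand-in, not the paper's metric, and all names in it are illustrative.

    # Rough diversity proxy: cluster synthetic texts and measure how evenly they
    # spread across clusters (higher normalized entropy = more diverse).
    # Illustrative stand-in only; NOT the LLM cluster-agent metric from result 22.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.feature_extraction.text import TfidfVectorizer

    def cluster_diversity(texts, n_clusters=10, seed=0):
        vecs = TfidfVectorizer(max_features=5000).fit_transform(texts)
        labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(vecs)
        probs = np.bincount(labels, minlength=n_clusters) / len(labels)
        probs = probs[probs > 0]
        return float(-np.sum(probs * np.log(probs)) / np.log(n_clusters))  # in [0, 1]

    # A near-duplicate corpus scores low; a mixed corpus scores higher.
    print(cluster_diversity(["the cat sat"] * 50 + ["quantum error correction"] * 50, n_clusters=2))
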
23. ISC4DGF: Enhancing Directed Grey-box Fuzzing with LLM-Driven Initial Seed Corpus Generation
- Author: Xu, Yijiang, Jia, Hongrui, Chen, Liguo, Wang, Xin, Zeng, Zhengran, Wang, Yidong, Gao, Qing, Wang, Jindong, Ye, Wei, Zhang, Shikun, and Wu, Zhonghai
- Subjects: Computer Science - Software Engineering
- Abstract:
Fuzz testing is crucial for identifying software vulnerabilities, with coverage-guided grey-box fuzzers like AFL and Angora excelling in broad detection. However, as the need for targeted detection grows, directed grey-box fuzzing (DGF) has become essential, focusing on specific vulnerabilities. The initial seed corpus, which consists of carefully selected input samples that the fuzzer uses as a starting point, is fundamental in determining the paths that the fuzzer explores. A well-designed seed corpus can guide the fuzzer more effectively towards critical areas of the code, improving the efficiency and success of the fuzzing process. Even with its importance, many works concentrate on refining guidance mechanisms while paying less attention to optimizing the initial seed corpus. In this paper, we introduce ISC4DGF, a novel approach to generating optimized initial seed corpus for DGF using Large Language Models (LLMs). By leveraging LLMs' deep software understanding and refined user inputs, ISC4DGF creates precise seed corpus that efficiently trigger specific vulnerabilities. Implemented on AFL and tested against state-of-the-art fuzzers like AFLGo, FairFuzz, and Entropic using the Magma benchmark, ISC4DGF achieved a 35.63x speedup and 616.10x fewer target reaches. Moreover, ISC4DGF focused on more effectively detecting target vulnerabilities, enhancing efficiency while operating with reduced code coverage., Comment: 15 pages, 2 figures
- Published: 2024
24. A Survey on Evaluating Large Language Models in Code Generation Tasks
- Author: Chen, Liguo, Guo, Qi, Jia, Hongrui, Zeng, Zhengran, Wang, Xin, Xu, Yijiang, Wu, Jian, Wang, Yidong, Gao, Qing, Wang, Jindong, Ye, Wei, and Zhang, Shikun
- Subjects: Computer Science - Software Engineering
- Abstract:
This paper provides a comprehensive review of the current methods and metrics used to evaluate the performance of Large Language Models (LLMs) in code generation tasks. With the rapid growth in demand for automated software development, LLMs have demonstrated significant potential in the field of code generation. The paper begins by reviewing the historical development of LLMs and their applications in code generation. Next, it details various methods and metrics for assessing the code generation capabilities of LLMs, including code correctness, efficiency, readability, and evaluation methods based on expert review and user experience. The paper also evaluates the widely used benchmark datasets, identifying their limitations and proposing directions for future improvements. Specifically, the paper analyzes the performance of code generation models across different tasks by combining multiple evaluation metrics, such as code compilation/interpretation success rates, unit test pass rates, and performance and efficiency metrics, to comprehensively assess the practical application of LLMs in code generation. Finally, the paper discusses the challenges faced in evaluating LLMs in code generation, particularly how to ensure the comprehensiveness and accuracy of evaluation methods and how to adapt to the evolving practices of software development. These analyses and discussions provide valuable insights for further optimizing and improving the application of LLMs in code generation tasks.
- Published: 2024
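
Result 24 above lists unit-test pass rates among the metrics for code generation. For concreteness, the snippet below computes the standard unbiased pass@k estimator commonly used with such benchmarks (general background, not a formula taken from this survey): given n sampled completions per problem of which c pass the tests, pass@k = 1 - C(n-c, k)/C(n, k).

    # Standard unbiased pass@k estimator commonly used in code-generation
    # evaluation; n = completions sampled per problem, c = completions that pass.
    from math import comb

    def pass_at_k(n: int, c: int, k: int) -> float:
        if n - c < k:
            return 1.0  # any k-subset must contain at least one passing completion
        return 1.0 - comb(n - c, k) / comb(n, k)

    # Example: 200 samples per problem, 30 pass the unit tests, estimate pass@10.
    print(round(pass_at_k(200, 30, 10), 4))
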
25. RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation
- Author: Zhang, Xuanwang, Song, Yunze, Wang, Yidong, Tang, Shuyun, Li, Xinfeng, Zeng, Zhengran, Wu, Zhen, Ye, Wei, Xu, Wenyuan, Zhang, Yue, Dai, Xinyu, Zhang, Shikun, and Wen, Qingsong
- Subjects: Computer Science - Computation and Language
- Abstract:
Large Language Models (LLMs) demonstrate human-level capabilities in dialogue, reasoning, and knowledge retention. However, even the most advanced LLMs face challenges such as hallucinations and real-time updating of their knowledge. Current research addresses this bottleneck by equipping LLMs with external knowledge, a technique known as Retrieval Augmented Generation (RAG). However, two key issues constrained the development of RAG. First, there is a growing lack of comprehensive and fair comparisons between novel RAG algorithms. Second, open-source tools such as LlamaIndex and LangChain employ high-level abstractions, which results in a lack of transparency and limits the ability to develop novel algorithms and evaluation metrics. To close this gap, we introduce RAGLAB, a modular and research-oriented open-source library. RAGLAB reproduces 6 existing algorithms and provides a comprehensive ecosystem for investigating RAG algorithms. Leveraging RAGLAB, we conduct a fair comparison of 6 RAG algorithms across 10 benchmarks. With RAGLAB, researchers can efficiently compare the performance of various algorithms and develop novel algorithms., Comment: 6 pages, 3 figures
- Published: 2024
26. Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application
- Author: Yang, Chuanpeng, Lu, Wang, Zhu, Yao, Wang, Yidong, Chen, Qian, Gao, Chenlong, Yan, Bingjie, and Chen, Yiqiang
- Subjects: Computer Science - Computation and Language; Computer Science - Artificial Intelligence
- Abstract:
Large Language Models (LLMs) have showcased exceptional capabilities in various domains, attracting significant interest from both academia and industry. Despite their impressive performance, the substantial size and computational demands of LLMs pose considerable challenges for practical deployment, particularly in environments with limited resources. The endeavor to compress language models while maintaining their accuracy has become a focal point of research. Among the various methods, knowledge distillation has emerged as an effective technique to enhance inference speed without greatly compromising performance. This paper presents a thorough survey from three aspects: method, evaluation, and application, exploring knowledge distillation techniques tailored specifically for LLMs. Specifically, we divide the methods into white-box KD and black-box KD to better illustrate their differences. Furthermore, we also explored the evaluation tasks and distillation effects between different distillation methods, and proposed directions for future research. Through in-depth understanding of the latest advancements and practical applications, this survey provides valuable resources for researchers, paving the way for sustained progress in this field., Comment: 28 pages
- Published: 2024
27. Enhancing In-Context Learning via Implicit Demonstration Augmentation
- Author: Zhou, Xiaoling, Ye, Wei, Wang, Yidong, Jiang, Chaoya, Lee, Zhemg, Xie, Rui, and Zhang, Shikun
- Subjects: Computer Science - Machine Learning; Computer Science - Artificial Intelligence; Computer Science - Computation and Language; I.2.7
- Abstract:
The emergence of in-context learning (ICL) enables large pre-trained language models (PLMs) to make predictions for unseen inputs without updating parameters. Despite its potential, ICL's effectiveness heavily relies on the quality, quantity, and permutation of demonstrations, commonly leading to suboptimal and unstable performance. In this paper, we tackle this challenge for the first time from the perspective of demonstration augmentation. Specifically, we start with enriching representations of demonstrations by leveraging their deep feature distribution. We then theoretically reveal that when the number of augmented copies approaches infinity, the augmentation is approximately equal to a novel logit calibration mechanism integrated with specific statistical properties. This insight results in a simple yet highly efficient method that significantly improves the average and worst-case accuracy across diverse PLMs and tasks. Moreover, our method effectively reduces performance variance among varying demonstrations, permutations, and templates, and displays the capability to address imbalanced class distributions., Comment: Accepted by ACL 2024 Main 19 pages,10 figures
- Published: 2024
28. AutoSurvey: Large Language Models Can Automatically Write Surveys
- Author: Wang, Yidong, Guo, Qi, Yao, Wenjin, Zhang, Hongbo, Zhang, Xin, Wu, Zhen, Zhang, Meishan, Dai, Xinyu, Zhang, Min, Wen, Qingsong, Ye, Wei, Zhang, Shikun, and Zhang, Yue
- Subjects: Computer Science - Information Retrieval; Computer Science - Artificial Intelligence; Computer Science - Computation and Language
- Abstract:
This paper introduces AutoSurvey, a speedy and well-organized methodology for automating the creation of comprehensive literature surveys in rapidly evolving fields like artificial intelligence. Traditional survey paper creation faces challenges due to the vast volume and complexity of information, prompting the need for efficient survey methods. While large language models (LLMs) offer promise in automating this process, challenges such as context window limitations, parametric knowledge constraints, and the lack of evaluation benchmarks remain. AutoSurvey addresses these challenges through a systematic approach that involves initial retrieval and outline generation, subsection drafting by specialized LLMs, integration and refinement, and rigorous evaluation and iteration. Our contributions include a comprehensive solution to the survey problem, a reliable evaluation method, and experimental validation demonstrating AutoSurvey's effectiveness. We open our resources at https://github.com/AutoSurveys/AutoSurvey.
- Published: 2024
29. FreeEval: A Modular Framework for Trustworthy and Efficient Evaluation of Large Language Models
- Author: Yu, Zhuohao, Gao, Chang, Yao, Wenjin, Wang, Yidong, Zeng, Zhengran, Ye, Wei, Wang, Jindong, Zhang, Yue, and Zhang, Shikun
- Subjects: Computer Science - Computation and Language; Computer Science - Artificial Intelligence
- Abstract:
The rapid development of large language model (LLM) evaluation methodologies and datasets has led to a profound challenge: integrating state-of-the-art evaluation techniques cost-effectively while ensuring reliability, reproducibility, and efficiency. Currently, there is a notable absence of a unified and adaptable framework that seamlessly integrates various evaluation approaches. Moreover, the reliability of evaluation findings is often questionable due to potential data contamination, with the evaluation efficiency commonly overlooked when facing the substantial costs associated with LLM inference. In response to these challenges, we introduce FreeEval, a modular and scalable framework crafted to enable trustworthy and efficient automatic evaluations of LLMs. Firstly, FreeEval's unified abstractions simplify the integration and improve the transparency of diverse evaluation methodologies, encompassing dynamic evaluation that demand sophisticated LLM interactions. Secondly, the framework integrates meta-evaluation techniques like human evaluation and data contamination detection, which, along with dynamic evaluation modules in the platform, enhance the fairness of the evaluation outcomes. Lastly, FreeEval is designed with a high-performance infrastructure, including distributed computation and caching strategies, enabling extensive evaluations across multi-node, multi-GPU clusters for open-source and proprietary LLMs., Comment: We open-source all our code at: https://github.com/WisdomShell/FreeEval
- Published: 2024
30. CoderUJB: An Executable and Unified Java Benchmark for Practical Programming Scenarios
- Author: Zeng, Zhengran, Wang, Yidong, Xie, Rui, Ye, Wei, and Zhang, Shikun
- Subjects: Computer Science - Software Engineering; 68N30 (Primary) 68T20 (Secondary); D.2.0
- Abstract:
In the evolving landscape of large language models (LLMs) tailored for software engineering, the need for benchmarks that accurately reflect real-world development scenarios is paramount. Current benchmarks are either too simplistic or fail to capture the multi-tasking nature of software development. To address this, we introduce CoderUJB, a new benchmark designed to evaluate LLMs across diverse Java programming tasks that are executable and reflective of actual development scenarios, acknowledging Java's prevalence in real-world software production. CoderUJB comprises 2,239 programming questions derived from 17 real open-source Java projects and spans five practical programming tasks. Our empirical study on this benchmark investigates the coding abilities of various open-source and closed-source LLMs, examining the effects of continued pre-training in specific programming languages code and instruction fine-tuning on their performance. The findings indicate that while LLMs exhibit strong potential, challenges remain, particularly in non-functional code generation (e.g., test generation and defect detection). Importantly, our results advise caution in the specific programming languages continued pre-training and instruction fine-tuning, as these techniques could hinder model performance on certain tasks, suggesting the need for more nuanced strategies. CoderUJB thus marks a significant step towards more realistic evaluations of programming capabilities in LLMs, and our study provides valuable insights for the future development of these models in software engineering., Comment: 11 pages, 4 figures, issta2024 accepted
- Published: 2024
31. Apollo: A Lightweight Multilingual Medical LLM towards Democratizing Medical AI to 6B People
- Author: Wang, Xidong, Chen, Nuo, Chen, Junyin, Wang, Yidong, Zhen, Guorui, Zhang, Chunxian, Wu, Xiangbo, Hu, Yan, Gao, Anningzhe, Wan, Xiang, Li, Haizhou, and Wang, Benyou
- Subjects: Computer Science - Computation and Language; Computer Science - Artificial Intelligence
- Abstract:
Despite the vast repository of global medical knowledge predominantly being in English, local languages are crucial for delivering tailored healthcare services, particularly in areas with limited medical resources. To extend the reach of medical AI advancements to a broader population, we aim to develop medical LLMs across the six most widely spoken languages, encompassing a global population of 6.1 billion. This effort culminates in the creation of the ApolloCorpora multilingual medical dataset and the XMedBench benchmark. In the multilingual medical benchmark, the released Apollo models, at various relatively small sizes (i.e., 0.5B, 1.8B, 2B, 6B, and 7B), achieve the best performance among models of equivalent size. In particular, Apollo-7B is the state-of-the-art multilingual medical LLM among models of up to 70B parameters. Additionally, these lite models could be used to improve the multilingual medical capabilities of larger models without fine-tuning, in a proxy-tuning fashion. We will open-source training corpora, code, model weights and evaluation benchmark., Comment: Preprint
- Published: 2024
32. KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models
- Author: Yu, Zhuohao, Gao, Chang, Yao, Wenjin, Wang, Yidong, Ye, Wei, Wang, Jindong, Xie, Xing, Zhang, Yue, and Zhang, Shikun
- Subjects: Computer Science - Computation and Language; Computer Science - Artificial Intelligence; Computer Science - Machine Learning
- Abstract:
Automatic evaluation methods for large language models (LLMs) are hindered by data contamination, leading to inflated assessments of their effectiveness. Existing strategies, which aim to detect contaminated texts, focus on quantifying contamination status instead of accurately gauging model performance. In this paper, we introduce KIEval, a Knowledge-grounded Interactive Evaluation framework, which incorporates an LLM-powered "interactor" role for the first time to accomplish a dynamic contamination-resilient evaluation. Starting with a question in a conventional LLM benchmark involving domain-specific knowledge, KIEval utilizes dynamically generated, multi-round, and knowledge-focused dialogues to determine whether a model's response is merely a recall of benchmark answers or demonstrates a deep comprehension to apply knowledge in more complex conversations. Extensive experiments on seven leading LLMs across five datasets validate KIEval's effectiveness and generalization. We also reveal that data contamination brings no contribution or even negative effect to models' real-world applicability and understanding, and existing contamination detection methods for LLMs can only identify contamination in pre-training but not during supervised fine-tuning., Comment: Accepted to ACL 2024 (main conference); 19 pages, 5 figures, 19 tables, code is available at: https://github.com/zhuohaoyu/KIEval
- Published: 2024
33. A General Framework for Learning from Weak Supervision
- Author: Chen, Hao, Wang, Jindong, Feng, Lei, Li, Xiang, Wang, Yidong, Xie, Xing, Sugiyama, Masashi, Singh, Rita, and Raj, Bhiksha
- Subjects: Computer Science - Machine Learning; Computer Science - Artificial Intelligence
- Abstract:
Weakly supervised learning generally faces challenges in applicability to various scenarios with diverse weak supervision and in scalability due to the complexity of existing algorithms, thereby hindering the practical deployment. This paper introduces a general framework for learning from weak supervision (GLWS) with a novel algorithm. Central to GLWS is an Expectation-Maximization (EM) formulation, adeptly accommodating various weak supervision sources, including instance partial labels, aggregate statistics, pairwise observations, and unlabeled data. We further present an advanced algorithm that significantly simplifies the EM computational demands using a Non-deterministic Finite Automaton (NFA) along with a forward-backward algorithm, which effectively reduces time complexity from quadratic or factorial often required in existing solutions to linear scale. The problem of learning from arbitrary weak supervision is therefore converted to the NFA modeling of them. GLWS not only enhances the scalability of machine learning models but also demonstrates superior performance and versatility across 11 weak supervision scenarios. We hope our work paves the way for further advancements and practical deployment in this field., Comment: 24 pages, 20 tables, 9 figures
- Published: 2024
34. PD-L2 drives resistance to EGFR-TKIs: dynamic changes of the tumor immune environment and targeted therapy
- Author: Wang, Simeng, Su, Dongliang, Chen, Han, Lai, Jia-Cheng, Tang, Chengfang, Li, Yu, Wang, Yidong, Yang, Yuan, Qin, Mingze, Jia, Lina, Cui, Wei, Yang, Jingyu, Wang, Lihui, and Wu, Chunfu
- Published: 2024
35. Supervised Knowledge Makes Large Language Models Better In-context Learners
- Author: Yang, Linyi, Zhang, Shuibai, Yu, Zhuohao, Bao, Guangsheng, Wang, Yidong, Wang, Jindong, Xu, Ruochen, Ye, Wei, Xie, Xing, Chen, Weizhu, and Zhang, Yue
- Subjects: Computer Science - Computation and Language; Computer Science - Artificial Intelligence
- Abstract:
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering. The recent progress in large-scale generative models has further expanded their use in real-world language applications. However, the critical challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored. While previous in-context learning research has focused on enhancing models to adhere to users' specific instructions and quality expectations, and to avoid undesired outputs, little to no work has explored the use of task-Specific fine-tuned Language Models (SLMs) to improve LLMs' in-context learning during the inference stage. Our primary contribution is the establishment of a simple yet effective framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks. Using our proposed plug-in method, enhanced versions of Llama 2 and ChatGPT surpass their original versions regarding generalizability and factuality. We offer a comprehensive suite of resources, including 16 curated datasets, prompts, model checkpoints, and LLM outputs across 9 distinct tasks. The code and data are released at: https://github.com/YangLinyi/Supervised-Knowledge-Makes-Large-Language-Models-Better-In-context-Learners. Our empirical analysis sheds light on the advantages of incorporating discriminative models into LLMs and highlights the potential of our methodology in fostering more reliable LLMs., Comment: Accepted to ICLR 2024
- Published: 2023
36. Different biochemical composition and oxidation state of soil organic matter between upland and paddy fields
- Author: Feng, Miao, Liu, Kailou, Lou, Yilai, Shang, Yuntao, Guo, Changcheng, Song, Zhaoliang, Gunina, Anna, and Wang, Yidong
- Published: 2024
37. Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity
- Author: Wang, Cunxiang, Liu, Xiaoze, Yue, Yuanhao, Tang, Xiangru, Zhang, Tianhang, Jiayang, Cheng, Yao, Yunzhi, Gao, Wenyang, Hu, Xuming, Qi, Zehan, Wang, Yidong, Yang, Linyi, Wang, Jindong, Xie, Xing, Zhang, Zheng, and Zhang, Yue
- Subjects: Computer Science - Computation and Language
- Abstract:
This survey addresses the crucial issue of factuality in Large Language Models (LLMs). As LLMs find applications across diverse domains, the reliability and accuracy of their outputs become vital. We define the Factuality Issue as the probability that LLMs produce content inconsistent with established facts. We first delve into the implications of these inaccuracies, highlighting the potential consequences and challenges posed by factual errors in LLM outputs. Subsequently, we analyze the mechanisms through which LLMs store and process facts, seeking the primary causes of factual errors. Our discussion then transitions to methodologies for evaluating LLM factuality, emphasizing key metrics, benchmarks, and studies. We further explore strategies for enhancing LLM factuality, including approaches tailored for specific domains. We focus on two primary LLM configurations, standalone LLMs and Retrieval-Augmented LLMs that utilize external data, and detail their unique challenges and potential enhancements. Our survey offers a structured guide for researchers aiming to fortify the factual reliability of LLMs., Comment: 62 pages; 300+ references
- Published: 2023
38. A Survey on Evaluation of Large Language Models
- Author: Chang, Yupeng, Wang, Xu, Wang, Jindong, Wu, Yuan, Yang, Linyi, Zhu, Kaijie, Chen, Hao, Yi, Xiaoyuan, Wang, Cunxiang, Wang, Yidong, Ye, Wei, Zhang, Yue, Chang, Yi, Yu, Philip S., Yang, Qiang, and Xie, Xing
- Subjects: Computer Science - Computation and Language; Computer Science - Artificial Intelligence
- Abstract:
Large language models (LLMs) are gaining increasing popularity in both academia and industry, owing to their unprecedented performance in various applications. As LLMs continue to play a vital role in both research and daily use, their evaluation becomes increasingly critical, not only at the task level, but also at the society level for better understanding of their potential risks. Over the past years, significant efforts have been made to examine LLMs from various perspectives. This paper presents a comprehensive review of these evaluation methods for LLMs, focusing on three key dimensions: what to evaluate, where to evaluate, and how to evaluate. Firstly, we provide an overview from the perspective of evaluation tasks, encompassing general natural language processing tasks, reasoning, medical usage, ethics, education, natural and social sciences, agent applications, and other areas. Secondly, we answer the 'where' and 'how' questions by diving into the evaluation methods and benchmarks, which serve as crucial components in assessing the performance of LLMs. Then, we summarize the success and failure cases of LLMs in different tasks. Finally, we shed light on several future challenges that lie ahead in LLMs evaluation. Our aim is to offer invaluable insights to researchers in the realm of LLMs evaluation, thereby aiding the development of more proficient LLMs. Our key point is that evaluation should be treated as an essential discipline to better assist the development of LLMs. We consistently maintain the related open-source materials at: https://github.com/MLGroupJLU/LLM-eval-survey., Comment: Accepted by ACM Transactions on Intelligent Systems and Technology (TIST); 45 pages; More recent works; https://llm-eval.github.io/
- Published: 2023
39. PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization
- Author: Wang, Yidong, Yu, Zhuohao, Zeng, Zhengran, Yang, Linyi, Wang, Cunxiang, Chen, Hao, Jiang, Chaoya, Xie, Rui, Wang, Jindong, Xie, Xing, Ye, Wei, Zhang, Shikun, and Zhang, Yue
- Subjects: Computer Science - Computation and Language; Computer Science - Artificial Intelligence
- Abstract:
Instruction tuning large language models (LLMs) remains a challenging task, owing to the complexity of hyperparameter selection and the difficulty involved in evaluating the tuned models. To determine the optimal hyperparameters, an automatic, robust, and reliable evaluation benchmark is essential. However, establishing such a benchmark is not a trivial task due to the challenges associated with evaluation accuracy and privacy protection. In response to these challenges, we introduce a judge large language model, named PandaLM, which is trained to distinguish the superior model given several LLMs. PandaLM's focus extends beyond just the objective correctness of responses, which is the main focus of traditional evaluation datasets. It addresses vital subjective factors such as relative conciseness, clarity, adherence to instructions, comprehensiveness, and formality. To ensure the reliability of PandaLM, we collect a diverse human-annotated test dataset, where all contexts are generated by humans and labels are aligned with human preferences. Our results indicate that PandaLM-7B achieves 93.75% of GPT-3.5's evaluation ability and 88.28% of GPT-4's in terms of F1-score on our test dataset. PandaLM enables the evaluation of LLM to be fairer but with less cost, evidenced by significant improvements achieved by models tuned through PandaLM compared to their counterparts trained with default Alpaca's hyperparameters. In addition, PandaLM does not depend on API-based evaluations, thus avoiding potential data leakage. All resources of PandaLM are released at https://github.com/WeOpenML/PandaLM., Comment: Accepted by ICLR 2024
- Published: 2023
40. PromptRobust: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts
- Author: Zhu, Kaijie, Wang, Jindong, Zhou, Jiaheng, Wang, Zichen, Chen, Hao, Wang, Yidong, Yang, Linyi, Ye, Wei, Zhang, Yue, Gong, Neil Zhenqiang, and Xie, Xing
- Subjects: Computer Science - Computation and Language; Computer Science - Cryptography and Security; Computer Science - Machine Learning
- Abstract:
The increasing reliance on Large Language Models (LLMs) across academia and industry necessitates a comprehensive understanding of their robustness to prompts. In response to this vital need, we introduce PromptRobust, a robustness benchmark designed to measure LLMs' resilience to adversarial prompts. This study uses a plethora of adversarial textual attacks targeting prompts across multiple levels: character, word, sentence, and semantic. The adversarial prompts, crafted to mimic plausible user errors like typos or synonyms, aim to evaluate how slight deviations can affect LLM outcomes while maintaining semantic integrity. These prompts are then employed in diverse tasks including sentiment analysis, natural language inference, reading comprehension, machine translation, and math problem-solving. Our study generates 4,788 adversarial prompts, meticulously evaluated over 8 tasks and 13 datasets. Our findings demonstrate that contemporary LLMs are not robust to adversarial prompts. Furthermore, we present a comprehensive analysis to understand the mystery behind prompt robustness and its transferability. We then offer insightful robustness analysis and pragmatic recommendations for prompt composition, beneficial to both researchers and everyday users., Comment: Technical report; code is at: https://github.com/microsoft/promptbench
- Published: 2023
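
Result 40 above probes robustness with character-, word-, sentence-, and semantic-level perturbations of prompts. Purely as an illustration of the character level (a toy sketch, not the PromptRobust attack suite), the function below injects typo-like edits into a prompt so the perturbed and original versions can be compared on the same task.

    # Toy character-level perturbation mimicking plausible user typos.
    # Illustrative only; not the attack generation used by the PromptRobust benchmark.
    import random

    def typo_perturb(prompt: str, n_edits: int = 2, seed: int = 0) -> str:
        rng = random.Random(seed)
        chars = list(prompt)
        for _ in range(n_edits):
            i = rng.randrange(len(chars))
            op = rng.choice(["swap", "drop", "repeat"])
            if op == "swap" and i + 1 < len(chars):
                chars[i], chars[i + 1] = chars[i + 1], chars[i]  # transpose neighbors
            elif op == "drop":
                chars.pop(i)                                     # delete a character
            else:
                chars.insert(i, chars[i])                        # double a character
        return "".join(chars)

    print(typo_perturb("Classify the sentiment of the following review:"))
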
41. Out-of-Distribution Generalization in Text Classification: Past, Present, and Future
- Author: Yang, Linyi, Song, Yaoxiao, Ren, Xuan, Lyu, Chenyang, Wang, Yidong, Liu, Lingqiao, Wang, Jindong, Foster, Jennifer, and Zhang, Yue
- Subjects: Computer Science - Computation and Language; Computer Science - Artificial Intelligence
- Abstract:
Machine learning (ML) systems in natural language processing (NLP) face significant challenges in generalizing to out-of-distribution (OOD) data, where the test distribution differs from the training data distribution. This poses important questions about the robustness of NLP models and their high accuracy, which may be artificially inflated due to their underlying sensitivity to systematic biases. Despite these challenges, there is a lack of comprehensive surveys on the generalization challenge from an OOD perspective in text classification. Therefore, this paper aims to fill this gap by presenting the first comprehensive review of recent progress, methods, and evaluations on this topic. We further discuss the challenges involved and potential future research directions. By providing quick access to existing work, we hope this survey will encourage future research in this area., Comment: 25 pages, OOD Generalization, Survey
- Published: 2023
42. Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations
- Author: Chen, Hao, Shah, Ankit, Wang, Jindong, Tao, Ran, Wang, Yidong, Xie, Xing, Sugiyama, Masashi, Singh, Rita, and Raj, Bhiksha
- Subjects: Computer Science - Machine Learning; Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition
- Abstract:
Learning with reduced labeling standards, such as noisy labels, partial labels, and multiple label candidates, which we generically refer to as imprecise labels, is a commonplace challenge in machine learning tasks. Previous methods tend to propose specific designs for every emerging imprecise label configuration, which is usually unsustainable when multiple configurations of imprecision coexist. In this paper, we introduce imprecise label learning (ILL), a framework for the unification of learning with various imprecise label configurations. ILL leverages expectation-maximization (EM) for modeling the imprecise label information, treating the precise labels as latent variables. Instead of approximating the correct labels for training, it considers the entire distribution of all possible labelings entailed by the imprecise information. We demonstrate that ILL can seamlessly adapt to partial label learning, semi-supervised learning, noisy label learning, and, more importantly, a mixture of these settings. Notably, ILL surpasses the existing specified techniques for handling imprecise labels, marking the first unified framework with robust and effective performance across various challenging settings. We hope our work will inspire further research on this topic, unleashing the full potential of ILL in wider scenarios where precise labels are expensive and complicated to obtain., Comment: 29 pages, 3 figures, 16 tables, preprint
- Published: 2023
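
Result 42 above treats the unknown precise label as a latent variable and scores every labeling consistent with the weak annotation rather than guessing a single correct label. A minimal sketch of that idea for the partial-label special case, assuming a PyTorch classifier, is the negative log of the probability mass the model assigns to the candidate label set; the paper's EM/NFA machinery generalizes well beyond this.

    # Minimal partial-label objective: marginalize the latent true label over the
    # candidate set and maximize that total probability. A sketch of the idea only,
    # not the ILL/GLWS implementation.
    import torch

    def candidate_set_nll(logits: torch.Tensor, candidate_mask: torch.Tensor) -> torch.Tensor:
        # logits: (batch, num_classes); candidate_mask: (batch, num_classes) bool,
        # True where a class is still a possible label for that example.
        neg_inf = torch.finfo(logits.dtype).min
        masked = logits.masked_fill(~candidate_mask, neg_inf)
        log_p_set = torch.logsumexp(masked, dim=-1) - torch.logsumexp(logits, dim=-1)
        return -log_p_set.mean()

    logits = torch.randn(4, 5, requires_grad=True)
    mask = torch.zeros(4, 5, dtype=torch.bool)
    mask[:, :2] = True  # each true label is known only to be class 0 or class 1
    candidate_set_nll(logits, mask).backward()

When the candidate set shrinks to a single class this reduces to ordinary cross-entropy, which is the usual sanity check for this kind of marginalization.
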
43. Evaluating Open-QA Evaluation
- Author: Wang, Cunxiang, Cheng, Sirui, Guo, Qipeng, Yue, Yuanhao, Ding, Bowen, Xu, Zhikun, Wang, Yidong, Hu, Xiangkun, Zhang, Zheng, and Zhang, Yue
- Subjects: Computer Science - Computation and Language; Computer Science - Artificial Intelligence
- Abstract:
This study focuses on the evaluation of the Open Question Answering (Open-QA) task, which can directly estimate the factuality of large language models (LLMs). Current automatic evaluation methods have shown limitations, indicating that human evaluation still remains the most reliable approach. We introduce a new task, Evaluating QA Evaluation (QA-Eval) and the corresponding dataset EVOUNA, designed to assess the accuracy of AI-generated answers in relation to standard answers within Open-QA. Our evaluation of these methods utilizes human-annotated results to measure their performance. Specifically, the work investigates methods that show high correlation with human evaluations, deeming them more reliable. We also discuss the pitfalls of current methods and methods to improve LLM-based evaluators. We believe this new QA-Eval task and corresponding dataset EVOUNA will facilitate the development of more effective automatic evaluation tools and prove valuable for future research in this area. All resources are available at https://github.com/wangcunxiang/QA-Eval and it is under the Apache-2.0 License., Comment: Accepted by Neurips-2023 Datasets and Benchmarks track; 28 pages
- Published: 2023
44. Updated mechanisms of MASLD pathogenesis
- Author: Li, Yuxuan, Yang, Peipei, Ye, Jialu, Xu, Qiyuan, Wu, Jiaqi, and Wang, Yidong
- Published: 2024
45. Soil organic carbon pools under long-term mineral and organic amendments: a multisite study
- Author: Liu, Yiping, Zhang, Limin, Lou, Yilai, Hu, Ning, Li, Zhongfang, Zhang, Huimin, Zhu, Ping, Li, Dongchu, Gao, Hongjun, Zhang, Shuiqing, Lu, Shunbao, Bhattacharyya, Ranjan, Kuzyakov, Yakov, and Wang, Yidong
- Published: 2024
46. Retraction Note: REST regulates the cell cycle for cardiac development and regeneration
- Author: Zhang, Donghong, Wang, Yidong, Lu, Pengfei, Wang, Ping, Yuan, Xinchun, Yan, Jianyun, Cai, Chenleng, Chang, Ching-Pin, Zheng, Deyou, Wu, Bingruo, and Zhou, Bin
- Published: 2024
47. The regulatory relationship between NAMPT and PD-L1 in cancer and identification of a dual-targeting inhibitor
- Author: Yang, Yuan, Li, Zefei, Wang, Yidong, Gao, Jiwei, Meng, Yangyang, Wang, Simeng, Zhao, Xiaoyao, Tang, Chengfang, Yang, Weiming, Li, Yingjia, Bao, Jie, Fan, Xinyu, Tang, Jing, Yang, Jingyu, Wu, Chunfu, Qin, Mingze, and Wang, Lihui
- Published: 2024
48. Data mining and analysis of adverse event signals associated with teprotumumab using the Food and Drug Administration adverse event reporting system database
- Author: Zhang, Sha, Wang, Yidong, Qi, Zhan, Tong, Shanshan, and Zhu, Deqiu
- Published: 2024
49. Silicon promotes biomass accumulation in Phragmites australis under waterlogged conditions in coastal wetland
- Author: Wu, Yuntao, Zhang, Xiaodong, Lin, Jiayang, Wang, Xia, Sun, Shaobo, Hao, Qian, Wu, Lele, Zhou, Jingyun, Xia, Shaopan, Ran, Xiangbing, Wang, Yidong, Tang, Jiahuan, Yu, Changxun, Song, Zhaoliang, and Liu, Cong-Qiang
- Published: 2024
50. Exploring Vision-Language Models for Imbalanced Learning
- Author: Wang, Yidong, Yu, Zhuohao, Wang, Jindong, Heng, Qiang, Chen, Hao, Ye, Wei, Xie, Rui, Xie, Xing, and Zhang, Shikun
- Subjects: Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning
- Abstract:
Vision-Language models (VLMs) that use contrastive language-image pre-training have shown promising zero-shot classification performance. However, their performance on imbalanced datasets is relatively poor, where the distribution of classes in the training dataset is skewed, leading to poor performance in predicting minority classes. For instance, CLIP achieved only 5% accuracy on the iNaturalist18 dataset. We propose to add a lightweight decoder to VLMs to avoid the OOM (out of memory) problem caused by a large number of classes and to capture nuanced features for tail classes. Then, we explore improvements of VLMs using prompt tuning, fine-tuning, and incorporating imbalanced algorithms such as Focal Loss, Balanced SoftMax and Distribution Alignment. Experiments demonstrate that the performance of VLMs can be further boosted when used with decoder and imbalanced methods. Specifically, our improved VLMs significantly outperform zero-shot classification by an average accuracy of 6.58%, 69.82%, and 6.17%, on ImageNet-LT, iNaturalist18, and Places-LT, respectively. We further analyze the influence of pre-training data size, backbones, and training cost. Our study highlights the significance of imbalanced learning algorithms in the face of VLMs pre-trained on huge data. We release our code at https://github.com/Imbalance-VLM/Imbalance-VLM., Comment: IJCV minor revision; 16 pages; code: https://github.com/Imbalance-VLM/Imbalance-VLM
- Published: 2023
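
Result 50 above plugs standard imbalanced-learning objectives such as Focal Loss into VLM fine-tuning. For reference, a generic PyTorch formulation of multi-class Focal Loss is sketched below (a textbook version with the class-weighting alpha term omitted, not necessarily the exact variant used in the paper): it scales the cross-entropy of each example by (1 - p_t)^gamma, down-weighting well-classified examples so training focuses on hard, often minority-class, ones.

    # Generic multi-class Focal Loss; a reference sketch, not the paper's code.
    import torch
    import torch.nn.functional as F

    def focal_loss(logits: torch.Tensor, targets: torch.Tensor, gamma: float = 2.0) -> torch.Tensor:
        ce = F.cross_entropy(logits, targets, reduction="none")  # -log p_t per example
        p_t = torch.exp(-ce)                                     # probability of the true class
        return ((1.0 - p_t) ** gamma * ce).mean()                # down-weight easy examples

    logits = torch.randn(8, 1000)              # logits over many (e.g. long-tailed) classes
    targets = torch.randint(0, 1000, (8,))
    print(focal_loss(logits, targets))
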