Enhancing Software Code Vulnerability Detection Using GPT-4o and Claude-3.5 Sonnet: A Study on Prompt Engineering Techniques.
- Source :
- Electronics (2079-9292); Jul 2024, Vol. 13, Issue 13, p. 2657, 16 pp.
- Publication Year :
- 2024
Abstract
- This study investigates the efficacy of advanced large language models, specifically GPT-4o, Claude-3.5 Sonnet, and GPT-3.5 Turbo, in detecting software vulnerabilities. Our experiment used vulnerable and secure code samples from the NIST Software Assurance Reference Dataset (SARD), focusing on C++, Java, and Python. We employed three distinct prompting techniques: Concise, Tip Setting, and Step-by-Step. The results demonstrate that GPT-4o and Claude-3.5 Sonnet significantly outperform GPT-3.5 Turbo in vulnerability detection. GPT-4o showed the greatest improvement with the Step-by-Step prompt, achieving an F1 score of 0.9072. Claude-3.5 Sonnet exhibited consistently high performance across all prompt types, with its Step-by-Step prompt yielding the best overall results (F1 score: 0.8933, AUC: 0.74). In contrast, GPT-3.5 Turbo showed minimal performance changes across prompts; its Tip Setting prompt performed best (F1 score: 0.6772, AUC: 0.65) but remained significantly below the other models. Our findings highlight the potential of advanced models in enhancing software security and underscore the importance of prompt engineering in optimizing their performance. [ABSTRACT FROM AUTHOR]
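The F1 and AUC figures quoted in the abstract are standard binary-classification metrics. As a rough illustration only, the sketch below shows how such scores could be computed from a model's "vulnerable" / "secure" verdicts. The abstract does not describe the authors' evaluation code, so everything here is an assumption: the function names, the toy labels, and in particular the choice to compute AUC from a single operating point (hard verdicts rather than confidence scores), in which case it reduces to (1 + TPR - FPR) / 2.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion-matrix counts, treating 'vulnerable' as the positive class."""
    tp: int = 0
    fp: int = 0
    tn: int = 0
    fn: int = 0


def confusion(labels, verdicts):
    """Count outcomes given ground-truth labels and model verdicts (True = vulnerable)."""
    c = Confusion()
    for actual, predicted in zip(labels, verdicts):
        if actual and predicted:
            c.tp += 1
        elif actual and not predicted:
            c.fn += 1
        elif not actual and predicted:
            c.fp += 1
        else:
            c.tn += 1
    return c


def f1_score(c):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is zero."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


def auc_single_point(c):
    """Area under the ROC curve through (0,0), (FPR,TPR), (1,1).

    Assumption: the model emits hard verdicts only, so the ROC curve has a
    single operating point and the area equals (1 + TPR - FPR) / 2.
    """
    tpr = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    fpr = c.fp / (c.fp + c.tn) if (c.fp + c.tn) else 0.0
    return (1 + tpr - fpr) / 2


# Hypothetical toy data, not taken from the study.
labels = [True, True, True, True, False, False, False, False]
verdicts = [True, True, True, False, True, False, False, False]

counts = confusion(labels, verdicts)
print(f"F1 = {f1_score(counts):.4f}, AUC = {auc_single_point(counts):.4f}")
```

Because F1 ignores true negatives while AUC accounts for them, a model that flags most samples as vulnerable can score a high F1 alongside a more modest AUC, which is one reason to report both.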
Details
- Language :
- English
- ISSN :
- 2079-9292
- Volume :
- 13
- Issue :
- 13
- Database :
- Complementary Index
- Journal :
- Electronics (2079-9292)
- Publication Type :
- Academic Journal
- Accession number :
- 178412758
- Full Text :
- https://doi.org/10.3390/electronics13132657