Collecting Vulnerable Source Code from Open-Source Repositories for Dataset Generation.
- Source :
- Applied Sciences (2076-3417); Feb2020, Vol. 10 Issue 4, p1270, 14p
- Publication Year :
- 2020
Abstract
- Different Machine Learning techniques to detect software vulnerabilities have emerged in scientific and industrial scenarios. Different actors in these scenarios aim to develop algorithms for predicting security threats without requiring human intervention. However, these algorithms require data-driven engines based on the processing of huge amounts of data, known as datasets. This paper introduces the SonarCloud Vulnerable Code Prospector for C (SVCP4C). This tool aims to collect vulnerable source code from open-source repositories linked to SonarCloud, an online tool that performs static analysis and tags the potentially vulnerable code. The tool provides a set of tagged files suitable for extracting features and creating training datasets for Machine Learning algorithms. This study presents a descriptive analysis of these files and gives an overview of the current status of C vulnerabilities, specifically buffer overflow, in the reviewed public repositories. [ABSTRACT FROM AUTHOR]
- Subjects :
- INSTITUTIONAL repositories
- MACHINE learning
- SOURCE code
- GENERATIONS
Details
- Language :
- English
- ISSN :
- 2076-3417
- Volume :
- 10
- Issue :
- 4
- Database :
- Complementary Index
- Journal :
- Applied Sciences (2076-3417)
- Publication Type :
- Academic Journal
- Accession number :
- 142551685
- Full Text :
- https://doi.org/10.3390/app10041270