Collecting Vulnerable Source Code from Open-Source Repositories for Dataset Generation.

Authors :
Raducu, Razvan
Esteban, Gonzalo
Rodríguez Lera, Francisco J.
Fernández, Camino
Source :
Applied Sciences (2076-3417); Feb2020, Vol. 10 Issue 4, p1270, 14p
Publication Year :
2020

Abstract

Different Machine Learning techniques to detect software vulnerabilities have emerged in scientific and industrial scenarios. Different actors in these scenarios aim to develop algorithms for predicting security threats without requiring human intervention. However, these algorithms require data-driven engines based on the processing of huge amounts of data, known as datasets. This paper introduces the SonarCloud Vulnerable Code Prospector for C (SVCP4C). This tool aims to collect vulnerable source code from open-source repositories linked to SonarCloud, an online tool that performs static analysis and tags the potentially vulnerable code. The tool provides a set of tagged files suitable for extracting features and creating training datasets for Machine Learning algorithms. This study presents a descriptive analysis of these files and overviews the current status of C vulnerabilities, specifically buffer overflow, in the reviewed public repositories. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
2076-3417
Volume :
10
Issue :
4
Database :
Complementary Index
Journal :
Applied Sciences (2076-3417)
Publication Type :
Academic Journal
Accession number :
142551685
Full Text :
https://doi.org/10.3390/app10041270