The lack of well-labelled and coherent training data is the main reason why machine learning (ML) and data-driven interpretation are not established in the field of Ground-Penetrating Radar (GPR). Non-representative and limited datasets lead to unreliable ML schemes that overfit and cannot compete with traditional deterministic approaches. To that end, numerical data can potentially complement insufficient measured datasets and overcome this lack of data, even in the presence of large feature spaces.

Using synthetic data in ML is not new, and it has been applied extensively in computer vision. Applying numerical data in ML requires a numerical framework capable of generating synthetic but nonetheless realistic datasets. For GPR, such a framework is available in gprMax, an open-source electromagnetic solver tailored to GPR applications [1], [2], [3]. gprMax is fully parallelised and can be run on multiple CPUs and GPUs. In addition, its flexible, scriptable input format makes it straightforward to generate large datasets. Stochastic geometries, realistic soils, vegetation, targets [3] and models of commercial antennas [4], [5] are some of the features that can be easily incorporated in the training data.

The capability of gprMax to generate realistic numerical datasets is demonstrated in [6], [7]. The problem investigated there is estimating the depth and diameter of rebars in reinforced concrete. Estimating rebar diameter using GPR is particularly challenging, and no conclusive solution exists. Using a synthetic training set generated with gprMax, we managed to train ML schemes capable of estimating rebar diameter accurately and efficiently [6], [7].
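
As a rough illustration of how a scripted, randomised training set for the rebar problem might be produced, the sketch below writes the text of a 2D gprMax input file for a single rebar in concrete and returns the matching labels. It is not the setup used in [6], [7]: the function names, parameter ranges, domain size, source type and centre frequency are all assumptions chosen for illustration, and only standard gprMax hash commands are used.

```python
import random


def make_rebar_model(rng, dx=0.002):
    """Return (input_text, label) for one randomised 2D rebar-in-concrete model.

    All ranges and dimensions below are illustrative assumptions,
    not values taken from the cited studies.
    """
    diameter = rng.uniform(0.008, 0.032)  # rebar diameter (m)
    cover = rng.uniform(0.03, 0.15)       # concrete cover above the rebar (m)
    eps_r = rng.uniform(5.0, 10.0)        # relative permittivity of concrete
    sigma = rng.uniform(0.001, 0.05)      # conductivity of concrete (S/m)

    x_size, y_size = 0.40, 0.30           # model extent (m); one cell thick in z
    surface = 0.25                        # height of the concrete surface (m)
    radius = diameter / 2
    xc = x_size / 2                       # rebar is centred horizontally
    yc = surface - cover - radius         # rebar axis height (m)

    lines = [
        "#title: randomised rebar model",
        f"#domain: {x_size:.3f} {y_size:.3f} {dx:.3f}",
        f"#dx_dy_dz: {dx:.3f} {dx:.3f} {dx:.3f}",
        "#time_window: 6e-9",
        f"#material: {eps_r:.3f} {sigma:.4f} 1 0 concrete",
        "#waveform: ricker 1 1.5e9 src_pulse",
        # Idealised dipole source and receiver 1 cm above the surface
        f"#hertzian_dipole: z {xc - 0.02:.3f} {surface + 0.01:.3f} 0 src_pulse",
        f"#rx: {xc + 0.02:.3f} {surface + 0.01:.3f} 0",
        # Concrete half-space, then a perfectly conducting rebar inside it
        f"#box: 0 0 0 {x_size:.3f} {surface:.3f} {dx:.3f} concrete",
        f"#cylinder: {xc:.3f} {yc:.4f} 0 {xc:.3f} {yc:.4f} {dx:.3f} "
        f"{radius:.4f} pec",
    ]
    label = {"diameter_m": diameter, "cover_m": cover}
    return "\n".join(lines) + "\n", label


def make_dataset(n_models, seed=0):
    """Generate n_models (input_text, label) pairs reproducibly."""
    rng = random.Random(seed)
    return [make_rebar_model(rng) for _ in range(n_models)]
```

Each returned text can be saved to its own `.in` file and simulated with gprMax (for example `python -m gprMax model.in`); the resulting A-scans, paired with the stored labels, would then form the training set. A realistic study would replace the idealised dipole and homogeneous concrete with antenna models and heterogeneous media of the kind described in [3], [4], [5].
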
The aforementioned case studies support the premise that gprMax has the potential to provide realistic training data for applications where well-labelled data are not available, such as landmine detection, non-destructive testing and planetary sciences.

References

[1] Warren, C., Giannopoulos, A. & Giannakis, I. (2016). gprMax: Open source software to simulate electromagnetic wave propagation for Ground Penetrating Radar. Computer Physics Communications, 209, 163-170.
[2] Warren, C., Giannopoulos, A., Gray, A., Giannakis, I., Patterson, A., Wetter, L. & Hamrah, A. (2018). A CUDA-based GPU engine for gprMax: Open source FDTD electromagnetic simulation software. Computer Physics Communications, 237, 208-218.
[3] Giannakis, I., Giannopoulos, A. & Warren, C. (2016). A realistic FDTD numerical modeling framework of Ground Penetrating Radar for landmine detection. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 9(1), 37-51.
[4] Giannakis, I., Giannopoulos, A. & Warren, C. (2018). Realistic FDTD GPR antenna models optimized using a novel linear/non-linear full waveform inversion. IEEE Transactions on Geoscience and Remote Sensing, 57(3), 1768-1778.
[5] Warren, C. & Giannopoulos, A. (2011). Creating finite-difference time-domain models of commercial ground-penetrating radar antennas using Taguchi's optimization method. Geophysics, 76(2), G37-G47.
[6] Giannakis, I., Giannopoulos, A. & Warren, C. (2021). A machine learning scheme for estimating the diameter of reinforcing bars using Ground Penetrating Radar. IEEE Geoscience and Remote Sensing Letters.
[7] Giannakis, I., Giannopoulos, A. & Warren, C. (2019). A machine learning-based fast-forward solver for ground penetrating radar with application to full-waveform inversion. IEEE Transactions on Geoscience and Remote Sensing, 57(7), 4417-4426.