1. Machine learning data augmentation as a tool to enhance quantitative composition–activity relationships of complex mixtures. A new application to dissect the role of main chemical components in bioactive essential oils
- Author
Alessio Ragno, Raissa Buzzi, Lorenzo Antonini, Anna Baldisserotto, Erika Baldini, Filippo Sapienza, Manuela Sabatino, Silvia Vertuani, and Stefano Manfredini
- Subjects
Computer science, Pharmaceutical Science, Organic chemistry, Complex Mixtures, Machine learning, computer.software_genre, Field (computer science), Article, Analytical Chemistry, Structure-Activity Relationship, QD241-441, Anti-Infective Agents, Biological profile, Drug Discovery, Oils, Volatile, LS7_3, Microsporum, Cosmeceutics, Physical and Theoretical Chemistry, Phylogeny, Biological data, LS9_6, business.industry, Pharmaceutics, Deep learning, Arthrodermataceae, Data Collection, Ambientale, Matthews correlation coefficient, Inhibitory potency, Nutraceutics, Chemistry (miscellaneous), Essential oils, QCAR, Molecular Medicine, Artificial intelligence, F1 score, Raw data, business, computer
- Abstract
Scientific investigations of essential oil composition and the related biological profiles are continuously growing. Nevertheless, only a few studies have addressed the relationships between chemical composition and biological data. Herein, the investigation of 61 assayed essential oils is reported, focusing on their inhibitory activity against Microsporum spp. and including the development of machine learning models aimed at highlighting the chemical components most related to the inhibitory potency. Machine learning and deep learning techniques have been applied successfully to many fields for predictive and descriptive purposes. Quantitative composition–activity relationship (QCAR) machine learning-based models were developed for the 61 essential oils tested as Microsporum spp. growth modulators. The models were built with in-house Python scripts implementing data augmentation, with the purpose of obtaining a smoother flow between the essential oils’ chemical compositions and the biological data. High statistical coefficient values (accuracy, Matthews correlation coefficient and F1 score) were obtained, and model inspection made it possible to detect possible specific roles of some essential oil constituents. Robust machine learning models obtained with data augmentation proved far more useful tools than models derived from raw data. To the best of the authors’ knowledge, this is the first report using data augmentation to highlight the role of complex mixture components. A first application of these data will be the development of ingredients for the dermo-cosmetic field, investigating microbial species in view of the pressing need for natural preservatives and antimicrobial agents.
- Published
2021