Accelerating Formulation Design via Machine Learning: Generating a High-throughput Shampoo Formulations Dataset
- Source :
- Scientific Data, Vol 11, Iss 1, Pp 1-10 (2024)
- Publication Year :
- 2024
- Publisher :
- Nature Portfolio, 2024.
Abstract
- Liquid formulations are ubiquitous yet have lengthy product development cycles, because the complex physical interactions between ingredients make it difficult to tune formulations to customer-defined property targets. Interpolative ML models can accelerate liquid formulation design but are typically trained on limited sets of ingredients and without any structural information, which limits their out-of-training predictive capacity. To address this challenge, we selected eighteen formulation ingredients covering a diverse chemical space to prepare an open experimental dataset for training ML models for rinse-off formulation development. The resulting design space has more than 50 times the dimensionality of that in our previous work. Here, we present a dataset of 812 formulations, including 294 stable samples, which cover the entire design space, with phase stability, turbidity, and high-fidelity rheology measurements generated on our semi-automated, ML-driven liquid formulations workflow. Our dataset has the unique attribute of sample-specific uncertainty measurements to train predictive surrogate models.
- Subjects :
- Science
Details
- Language :
- English
- ISSN :
- 20524463
- Volume :
- 11
- Issue :
- 1
- Database :
- Directory of Open Access Journals
- Journal :
- Scientific Data
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.8fa8db7a70894f60b16d6db492fee624
- Document Type :
- article
- Full Text :
- https://doi.org/10.1038/s41597-024-03573-w