1. Evaluating Molecular Complexity with Open-Source Machine Learning Approaches to Predict Process Mass Intensity.
- Author
Tin N, Chauhan M, Agwamba K, Sun Y, Parsons A, Payne P, and Osan R
- Abstract
The application of green chemistry is critical for cultivating environmental responsibility and sustainable practices in pharmaceutical manufacturing. Process mass intensity (PMI) is a key metric that quantifies the resource efficiency of a manufacturing process, but determining what constitutes a successful PMI of a specific molecule is challenging. A recent approach correlated molecular features to a crowdsourced definition of molecular complexity to determine PMI targets. While recent machine learning tools show promise in predicting molecular complexity, a more extensive application could significantly optimize manufacturing processes. To this end, we refine and expand upon the SMART-PMI tool by Sheridan et al. to create an open-source model and application. Our solution emphasizes explainability and parsimony to facilitate a nuanced understanding of prediction and ensure informed decision-making. The resulting model uses four descriptors (the heteroatom count, stereocenter count, unique topological torsion, and connectivity index chi4n) to compute molecular complexity with a comparable 82.6% predictive accuracy and 0.349 RMSE. We develop a corresponding app that takes in structured data files (SDF) to rapidly quantify molecular complexity and provide a PMI target that can be used to drive process development activities. By integrating machine learning explainability and open-source accessibility, we provide flexible tools to advance the field of green chemistry and sustainable pharmaceutical manufacturing.
Competing Interests: The authors declare no competing financial interest.
© 2024 The Authors. Published by American Chemical Society.
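As a rough illustration of the PMI metric named in the abstract, the sketch below assumes the commonly used definition: total mass of all materials put into a process divided by the mass of isolated product. The names `MaterialInput`, `process_mass_intensity`, and `meets_target` are hypothetical, the masses are made-up example values rather than data from the paper, and the target PMI is simply passed in, since the abstract does not state how the model maps molecular complexity to a target.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class MaterialInput:
    """One material charged to the process (hypothetical helper type)."""
    name: str
    mass_kg: float


def process_mass_intensity(inputs: Iterable[MaterialInput],
                           product_mass_kg: float) -> float:
    """Return PMI = total input mass / product mass (kg per kg).

    Assumes the common definition in which every input (reactants,
    reagents, solvents, water) counts toward the total.
    """
    if product_mass_kg <= 0:
        raise ValueError("product_mass_kg must be positive")
    total_input_kg = sum(m.mass_kg for m in inputs)
    return total_input_kg / product_mass_kg


def meets_target(pmi: float, target_pmi: float) -> bool:
    """True when the measured PMI is at or below the target.

    The target is an input here; how it is derived from molecular
    complexity is not specified in the abstract.
    """
    return pmi <= target_pmi


if __name__ == "__main__":
    # Illustrative masses only, not taken from the paper.
    charges = [
        MaterialInput("starting material", 12.0),
        MaterialInput("reagent", 3.0),
        MaterialInput("solvent and water", 85.0),
    ]
    pmi = process_mass_intensity(charges, product_mass_kg=2.0)
    print(f"PMI = {pmi:.1f} kg input per kg product")
    print("Meets target of 60:", meets_target(pmi, target_pmi=60.0))
```

A lower PMI means less material consumed per kilogram of product, so comparing a measured PMI against a target is a simple less-than-or-equal check once the target is known.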
- Published
2024