The SAMPL5 challenge for embedded-cluster integral equation theory: solvation free energies, aqueous pKa, and cyclohexane-water log D.

Authors :: Tielker N
Tomazic D
Heil J
Kloss T
Ehrhart S
Güssregen S
Schmidt KF
Kast SM
Source :: Journal of computer-aided molecular design [J Comput Aided Mol Des] 2016 Nov; Vol. 30 (11), pp. 1035-1044. Date of Electronic Publication: 2016 Aug 23.
Publication Year :: 2016
Abstract: We predict cyclohexane-water distribution coefficients (log D_7.4) for drug-like molecules taken from the SAMPL5 blind prediction challenge by the "embedded cluster reference interaction site model" (EC-RISM) integral equation theory. This task involves the coupled problem of predicting both partition coefficients (log P) of neutral species between the solvents and aqueous acidity constants (pKa) in order to account for a change of protonation states. The first issue is addressed by calibrating an EC-RISM-based model for solvation free energies derived from the "Minnesota Solvation Database" (MNSOL) for both water and cyclohexane utilizing a correction based on the partial molar volume, yielding a root mean square error (RMSE) of 2.4 kcal mol^-1 for water and 0.8-0.9 kcal mol^-1 for cyclohexane depending on the parametrization. The second one is treated by employing on one hand an empirical pKa model (MoKa) and, on the other hand, an EC-RISM-derived regression of published acidity constants (RMSE of 1.5 for a single model covering acids and bases). In total, at most 8 adjustable parameters are necessary (2-3 for each solvent and two for the pKa) for training solvation and acidity models. Applying the final models to the log D_7.4 dataset corresponds to evaluating an independent test set comprising other, composite observables, yielding, for different cyclohexane parametrizations, 2.0-2.1 for the RMSE with the first and 2.2-2.8 with the combined first and second SAMPL5 data set batches. Notably, a pure log P model (assuming neutral species only) performs statistically similarly for these particular compounds. The nature of the approximations and possible perspectives for future developments are discussed.
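The coupling between log P, pKa, and log D described in the abstract can be illustrated with a minimal sketch. It is not the authors' EC-RISM workflow: it uses the standard textbook relations for a monoprotic solute, assumes that only the neutral species enters the cyclohexane phase, and takes the solvation free energies and pKa as given inputs. The function names `log_p` and `log_d` and all numerical inputs are hypothetical and chosen for illustration only.

```python
import math

# Gas constant in kcal mol^-1 K^-1, matching the energy unit used in the abstract.
R_KCAL = 1.98720425864083e-3


def log_p(dg_water, dg_cyclohexane, temperature=298.15):
    """Cyclohexane/water log P of the neutral species.

    Inputs are solvation free energies in kcal/mol. A more negative value
    in cyclohexane than in water gives a positive log P.
    """
    return (dg_water - dg_cyclohexane) / (R_KCAL * temperature * math.log(10))


def log_d(log_p_neutral, pka, ph=7.4, acid=True):
    """log D at a given pH for a monoprotic solute.

    Assumption: only the neutral form partitions into cyclohexane.
    For a base, `pka` is the pKa of the conjugate acid.
    """
    exponent = (ph - pka) if acid else (pka - ph)
    return log_p_neutral - math.log10(1.0 + 10.0 ** exponent)


if __name__ == "__main__":
    # Illustrative inputs only, not data from the paper.
    lp = log_p(dg_water=-6.0, dg_cyclohexane=-8.0)
    print(f"log P     = {lp:.2f}")
    print(f"log D_7.4 = {log_d(lp, pka=4.5):.2f}  (acid, pKa 4.5)")
```

When the pKa lies far on the neutral side of pH 7.4, the correction term vanishes and log D reduces to log P, which is the limit in which a pure log P model gives the same prediction.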

Subjects :: Models, Chemical
Molecular Structure
Quantum Theory
Solubility
Solvents/chemistry
Thermodynamics
Computer Simulation
Cyclohexanes/chemistry
Pharmaceutical Preparations/chemistry
Water/chemistry

Details

Language :: English
ISSN :: 1573-4951
Volume :: 30
Issue :: 11
Database :: MEDLINE
Journal :: Journal of computer-aided molecular design
Publication Type :: Academic Journal
Accession number :: 27554666
Full Text :: https://doi.org/10.1007/s10822-016-9939-7
