23 results on '"Novotarskyi, S"'
Search Results
2. Applicability domain for classification problems
- Author
-
Sushko Iurii, Novotarskyi S, Pandey AK, Körner R, and Tetko Igor
- Subjects
Information technology, T58.5-58.64, Chemistry, QD1-999
- Published
- 2010
3. Online chemical modeling environment
- Author
-
Tetko I, Sushko I, and Novotarskyi S
- Subjects
Chemistry, QD1-999
- Published
- 2009
4. Online chemical modeling environment
- Author
-
Novotarskyi, S, Sushko, I, and Tetko, I
- Published
- 2009
5. How accurately can we predict the melting points of drug-like compounds?
- Author
-
Tetko I., Sushko Y., Novotarskyi S., Patiny L., Kondratov I., Petrenko A., Charochkina L., and Asiri A.
- Abstract
© 2014 American Chemical Society. This article contributes a highly accurate model for predicting the melting points (MPs) of medicinal chemistry compounds. The model was developed using the largest published data set, comprising more than 47k compounds. The distributions of MPs in drug-like and drug lead sets showed that >90% of molecules melt within [50,250]°C. The final model calculated an RMSE of less than 33 °C for molecules from this temperature interval, which is the most important for medicinal chemistry users. This performance was achieved using a consensus model that performed calculations to a significantly higher accuracy than the individual models. We found that compounds with reactive and unstable groups were overrepresented among outlying compounds. These compounds could decompose during storage or measurement, thus introducing experimental errors. While filtering the data by removing outliers generally increased the accuracy of individual models, it did not significantly affect the results of the consensus models. Three analyzed distances to models did not allow us to flag molecules whose MP values fell outside the applicability domain of the model. We believe that this negative result and the public availability of data from this article will encourage future studies to develop better approaches to define the applicability domain of models. The final model, MP data, and identified reactive groups are available online at http://ochem.eu/article/55638.
- Published
- 2014
6. QSAR approaches to predict human cytochrome P450 inhibition
- Author
-
Novotarskyi, S.
- Abstract
This thesis focuses on several aspects of QSAR modeling of human cytochrome P450 inhibition and suggests a methodology to increase the quality of CYP inhibition models. It is shown that the addition of newly developed descriptors derived from docking simulations increases the predictive ability of the resulting models. The studies were performed on the OCHEM platform (http://ochem.eu), and all the descriptors, datasets and models are publicly available to the scientific community.
- Published
- 2013
7. Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information
- Author
-
Sushko, I, Novotarskyi, S, Körner, R, Pandey, A, Rupp, M, Teetz, W, Brandmaier, S, Abdelaziz, A, Prokopenko, V, Tanchuk, V, Todeschini, R, Varnek, A, Marcou, G, Ertl, P, Potemkin, V, Grishina, M, Gasteiger, J, Schwab, C, Baskin, I, Palyulin, V, Radchenko, E, Welsh, W, Kholodovych, V, Chekmarev, D, Cherkasov, A, Aires de Sousa, J, Zhang, Q, Bender, A, Nigsch, F, Patiny, L, and Williams, A
- Abstract
The Online Chemical Modeling Environment is a web-based platform that aims to automate and simplify the typical steps required for QSAR modeling. The platform consists of two major subsystems: the database of experimental measurements and the modeling framework. A user-contributed database contains a set of tools for easy input, search and modification of thousands of records. The OCHEM database is based on the wiki principle and focuses primarily on the quality and verifiability of the data. The database is tightly integrated with the modeling framework, which supports all the steps required to create a predictive model: data search, calculation and selection of a vast variety of molecular descriptors, application of machine learning methods, validation, analysis of the model and assessment of the applicability domain. As compared to other similar systems, OCHEM is not intended to re-implement the existing tools or models but rather to invite the original authors to contribute their results, make them publicly available, share them with other users and to become members of the growing research community. Our intention is to make OCHEM a widely used platform to perform the QSPR/QSAR studies online and share it with other users on the Web. The ultimate goal of OCHEM is collecting all possible chemoinformatics tools within one simple, reliable and user-friendly resource. The OCHEM is free for web users and it is available online at http://www.ochem.eu . © 2011 The Author(s).
- Published
- 2011
8. Applicability Domains for Classification Problems: Benchmarking of Distance to Models for Ames Mutagenicity Set
- Author
-
Sushko, I, Novotarskyi, S, Körner, R, Pandey, A, Cherkasov, A, Li, J, Gramatica, P, Hansen, K, Schroeter, T, Müller, K, Xi, L, Liu, H, Yao, X, Öberg, T, Hormozdiari, F, Dao, P, Sahinalp, C, Todeschini, R, Polishchuk, P, Artemenko, A, Kuz'min, V, Martin, T, Young, D, Fourches, D, Tropsha, A, Baskin, I, Horvath, D, Marcou, G, Varnek, A, Prokopenko, V, and Tetko, I
- Abstract
The estimation of accuracy and applicability of QSAR and QSPR models for biological and physicochemical properties represents a critical problem. The developed parameter of “distance to model” (DM) is defined as a metric of similarity between the training and test set compounds that have been subjected to QSAR/QSPR modeling. In our previous work, we demonstrated the utility and optimal performance of DM metrics that have been based on the standard deviation within an ensemble of QSAR models. The current study applies such analysis to 30 QSAR models for the Ames mutagenicity data set that were previously reported within the 2009 QSAR challenge. We demonstrate that the DMs based on an ensemble (consensus) model provide systematically better performance than other DMs. The presented approach identifies 30-60% of compounds having an accuracy of prediction similar to the interlaboratory accuracy of the Ames test, which is estimated to be 90%. Thus, the in silico predictions can be used to halve the cost of experimental measurements by providing a similar prediction accuracy. The developed model has been made publicly available at http://ochem.eu/models/1
- Published
- 2010
9. Prediction-driven matched molecular pairs to interpret QSARs and aid the molecular optimization process
- Author
-
Sushko Y., Novotarskyi S., Körner R., Vogt J., Abdelaziz A., and Tetko I.
- Abstract
© 2014 Sushko et al.; licensee Springer. Background: QSAR is an established and powerful method for cheap in silico assessment of physicochemical properties and biological activities of chemical compounds. However, QSAR models are rather complex mathematical constructs that cannot easily be interpreted. Medicinal chemists would benefit from practical guidance regarding which molecules to synthesize. Another possible approach is analysis of pairs of very similar molecules, so-called matched molecular pairs (MMPs). Such an approach allows identification of molecular transformations that affect particular activities (e.g. toxicity). In contrast to QSAR, chemical interpretation of these transformations is straightforward. Furthermore, such transformations can give medicinal chemists useful hints for the hit-to-lead optimization process. Results: The current study suggests a combination of QSAR and MMP approaches by finding MMP transformations based on QSAR predictions for large chemical datasets. The study shows that such an approach, referred to as prediction-driven MMP analysis, is a useful tool for medicinal chemists, allowing identification of large numbers of "interesting" transformations that can be used to drive the molecular optimization process. All the methodological developments have been implemented as software products available online as part of OCHEM (http://ochem.eu/). Conclusions: The prediction-driven MMPs methodology was exemplified by two use cases: modelling of aquatic toxicity and CYP3A4 inhibition. This approach helped us to interpret QSAR models and allowed identification of a number of "significant" molecular transformations that affect the desired properties. This can facilitate drug design as a part of molecular optimization process.
10. How accurately can we predict the melting points of drug-like compounds?
- Author
-
Tetko I., Sushko Y., Novotarskyi S., Patiny L., Kondratov I., Petrenko A., Charochkina L., and Asiri A.
- Abstract
© 2014 American Chemical Society. This article contributes a highly accurate model for predicting the melting points (MPs) of medicinal chemistry compounds. The model was developed using the largest published data set, comprising more than 47k compounds. The distributions of MPs in drug-like and drug lead sets showed that >90% of molecules melt within [50,250]°C. The final model calculated an RMSE of less than 33 °C for molecules from this temperature interval, which is the most important for medicinal chemistry users. This performance was achieved using a consensus model that performed calculations to a significantly higher accuracy than the individual models. We found that compounds with reactive and unstable groups were overrepresented among outlying compounds. These compounds could decompose during storage or measurement, thus introducing experimental errors. While filtering the data by removing outliers generally increased the accuracy of individual models, it did not significantly affect the results of the consensus models. Three analyzed distances to models did not allow us to flag molecules whose MP values fell outside the applicability domain of the model. We believe that this negative result and the public availability of data from this article will encourage future studies to develop better approaches to define the applicability domain of models. The final model, MP data, and identified reactive groups are available online at http://ochem.eu/article/55638.
11. Prediction-driven matched molecular pairs to interpret QSARs and aid the molecular optimization process
- Author
-
Sushko Y., Novotarskyi S., Körner R., Vogt J., Abdelaziz A., and Tetko I.
- Abstract
© 2014 Sushko et al.; licensee Springer. Background: QSAR is an established and powerful method for cheap in silico assessment of physicochemical properties and biological activities of chemical compounds. However, QSAR models are rather complex mathematical constructs that cannot easily be interpreted. Medicinal chemists would benefit from practical guidance regarding which molecules to synthesize. Another possible approach is analysis of pairs of very similar molecules, so-called matched molecular pairs (MMPs). Such an approach allows identification of molecular transformations that affect particular activities (e.g. toxicity). In contrast to QSAR, chemical interpretation of these transformations is straightforward. Furthermore, such transformations can give medicinal chemists useful hints for the hit-to-lead optimization process. Results: The current study suggests a combination of QSAR and MMP approaches by finding MMP transformations based on QSAR predictions for large chemical datasets. The study shows that such an approach, referred to as prediction-driven MMP analysis, is a useful tool for medicinal chemists, allowing identification of large numbers of "interesting" transformations that can be used to drive the molecular optimization process. All the methodological developments have been implemented as software products available online as part of OCHEM (http://ochem.eu/). Conclusions: The prediction-driven MMPs methodology was exemplified by two use cases: modelling of aquatic toxicity and CYP3A4 inhibition. This approach helped us to interpret QSAR models and allowed identification of a number of "significant" molecular transformations that affect the desired properties. This can facilitate drug design as a part of molecular optimization process.
12. How accurately can we predict the melting points of drug-like compounds?
- Author
-
Tetko I., Sushko Y., Novotarskyi S., Patiny L., Kondratov I., Petrenko A., Charochkina L., and Asiri A.
- Abstract
© 2014 American Chemical Society. This article contributes a highly accurate model for predicting the melting points (MPs) of medicinal chemistry compounds. The model was developed using the largest published data set, comprising more than 47k compounds. The distributions of MPs in drug-like and drug lead sets showed that >90% of molecules melt within [50,250]°C. The final model calculated an RMSE of less than 33 °C for molecules from this temperature interval, which is the most important for medicinal chemistry users. This performance was achieved using a consensus model that performed calculations to a significantly higher accuracy than the individual models. We found that compounds with reactive and unstable groups were overrepresented among outlying compounds. These compounds could decompose during storage or measurement, thus introducing experimental errors. While filtering the data by removing outliers generally increased the accuracy of individual models, it did not significantly affect the results of the consensus models. Three analyzed distances to models did not allow us to flag molecules whose MP values fell outside the applicability domain of the model. We believe that this negative result and the public availability of data from this article will encourage future studies to develop better approaches to define the applicability domain of models. The final model, MP data, and identified reactive groups are available online at http://ochem.eu/article/55638.
13. Applicability domains for classification problems: Benchmarking of distance to models for Ames mutagenicity set
- Author
-
Robert Körner, Gilles Marcou, Huanxiang Liu, Dragos Horvath, Roberto Todeschini, Phuong Dao, Xiaojun Yao, Douglas M. Young, Paola Gramatica, A. Varnek, A. Artemenko, Todd M. Martin, Anil Kumar Pandey, Farhad Hormozdiari, Eugene N. Muratov, Alexander Tropsha, Christophe Muller, Artem Cherkasov, Tomas Öberg, Katja Hansen, Lili Xi, Timon Schroeter, Pavel G. Polishchuk, Sergii Novotarskyi, Jiazhong Li, Volodymyr V. Prokopenko, Denis Fourches, Victor E. Kuz’min, Cenk Sahinalp, Igor I. Baskin, Klaus-Robert Müller, Igor V. Tetko, and Iurii Sushko
- Subjects
Quantitative structure–activity relationship, General Chemical Engineering, Library and Information Sciences, 01 natural sciences, Standard deviation, Set (abstract data type), 03 medical and health sciences, CHIM/01 - CHIMICA ANALITICA, Similarity (network science), 030304 developmental biology, Mathematics, 0303 health sciences, Principal Component Analysis, QSAR, Mutagenicity Tests, mutagenicity, General Chemistry, Classification, 0104 chemical sciences, Computer Science Applications, Ames test, Data set, 010404 medicinal & biomolecular chemistry, Benchmarking, Test set, Metric (mathematics), Data mining, Algorithm, [CHIM.CHEM]Chemical Sciences/Cheminformatics, Applicability domain
- Abstract
The estimation of accuracy and applicability of QSAR and QSPR models for biological and physicochemical properties represents a critical problem. The developed parameter of "distance to model" (DM) is defined as a metric of similarity between the training and test set compounds that have been subjected to QSAR/QSPR modeling. In our previous work, we demonstrated the utility and optimal performance of DM metrics that have been based on the standard deviation within an ensemble of QSAR models. The current study applies such analysis to 30 QSAR models for the Ames mutagenicity data set that were previously reported within the 2009 QSAR challenge. We demonstrate that the DMs based on an ensemble (consensus) model provide systematically better performance than other DMs. The presented approach identifies 30-60% of compounds having an accuracy of prediction similar to the interlaboratory accuracy of the Ames test, which is estimated to be 90%. Thus, the in silico predictions can be used to halve the cost of experimental measurements by providing a similar prediction accuracy. The developed model has been made publicly available at http://ochem.eu/models/1 .
- Published
- 2010
14. Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information
- Author
-
Gilles Marcou, Florian Nigsch, Ahmed Abdelaziz, Qingyou Zhang, Vladyslav Kholodovych, William J. Welsh, Matthias Rupp, Antony J. Williams, Vsevolod Yu. Tanchuk, Valery Tkachenko, Volodymyr V. Prokopenko, Sergii Novotarskyi, Alexandre Varnek, Igor I. Baskin, Christof H. Schwab, Peter Ertl, João Aires-de-Sousa, Eugene V. Radchenko, Johann Gasteiger, Robert Körner, Igor V. Tetko, Iurii Sushko, Andreas Bender, Maria Grishina, Vladimir A. Palyulin, Dmitriy Chekmarev, Luc Patiny, Wolfram Teetz, Artem Cherkasov, Stefan Brandmaier, Roberto Todeschini, Anil Kumar Pandey, and Vladimir Potemkin
- Subjects
Information management, Databases, Factual, Computer science, Estimation of accuracy of predictions, Molecular Similarity, Modeling workflow, Quantitative Structure-Activity Relationship, 01 natural sciences, Partition-Coefficients, Descriptors, Article, Set (abstract data type), World Wide Web, 03 medical and health sciences, User-Computer Interface, Resource (project management), CHIM/01 - CHIMICA ANALITICA, Applicability domain, Drug Discovery, Physical and Theoretical Chemistry, 030304 developmental biology, 0303 health sciences, Internet, On-line web platform, Information Dissemination, Open access, E-State Indexes, 0104 chemical sciences, Variety (cybernetics), Computer Science Applications, Data sharing, 010404 medicinal & biomolecular chemistry, In-Silico, Models, Chemical, Cheminformatics, Shape Signatures, Associative Neural Networks, The Internet, ddc:004, Prediction
- Abstract
The Online Chemical Modeling Environment is a web-based platform that aims to automate and simplify the typical steps required for QSAR modeling. The platform consists of two major subsystems: the database of experimental measurements and the modeling framework. A user-contributed database contains a set of tools for easy input, search and modification of thousands of records. The OCHEM database is based on the wiki principle and focuses primarily on the quality and verifiability of the data. The database is tightly integrated with the modeling framework, which supports all the steps required to create a predictive model: data search, calculation and selection of a vast variety of molecular descriptors, application of machine learning methods, validation, analysis of the model and assessment of the applicability domain. As compared to other similar systems, OCHEM is not intended to re-implement the existing tools or models but rather to invite the original authors to contribute their results, make them publicly available, share them with other users and to become members of the growing research community. Our intention is to make OCHEM a widely used platform to perform the QSPR/QSAR studies online and share it with other users on the Web. The ultimate goal of OCHEM is collecting all possible chemoinformatics tools within one simple, reliable and user-friendly resource. The OCHEM is free for web users and it is available online at http://www.ochem.eu . © 2011 The Author(s).
15. ToxCast EPA in Vitro to in Vivo Challenge: Insight into the Rank-I Model.
- Author
-
Novotarskyi S, Abdelaziz A, Sushko Y, Körner R, Vogt J, and Tetko IV
- Subjects
- Dose-Response Relationship, Drug, In Vitro Techniques, Machine Learning, Neural Networks, Computer, Models, Theoretical
- Abstract
The ToxCast EPA challenge was managed by TopCoder in Spring 2014. The goal of the challenge was to develop a model to predict the lowest effect level (LEL) concentration based on in vitro measurements and calculated in silico descriptors. This article summarizes the computational steps used to develop the Rank-I model, which calculated the lowest prediction error for the secret test data set of the challenge. The model was developed using the publicly available Online CHEmical database and Modeling environment (OCHEM), and it is freely available at http://ochem.eu/article/68104 . Surprisingly, this model does not use any in vitro measurements. The logic of the decision steps used to develop the model and the reason to skip inclusion of in vitro measurements is described. We also show that inclusion of in vitro assays would not improve the accuracy of the model.
- Published
- 2016
16. Using Online Tool (iPrior) for Modeling ToxCast™ Assays Towards Prioritization of Animal Toxicity Testing.
- Author
-
Abdelaziz A, Sushko Y, Novotarskyi S, Körner R, Brandmaier S, and Tetko IV
- Subjects
- Animals, Models, Molecular, Quantitative Structure-Activity Relationship, Internet, Toxicity Tests
- Abstract
The use of long-term animal studies for human and environmental toxicity estimation is more discouraged than ever before. Alternative models for toxicity prediction, including QSAR studies, are gaining more ground. A recent approach is to combine in vitro chemical profiling and in silico chemical descriptors with the knowledge about toxicity pathways to derive a unique signature for toxicity endpoints. In this study we investigate the ToxCast™ Phase I data regarding their ability to predict long-term animal toxicity. We investigated thousands of models constructed in an effort to predict 61 toxicity endpoints using multiple descriptor packages and hundreds of in vitro assays. We investigated the effect of in vitro assays and biochemical pathways on model performance. We identified 10 toxicity endpoints where biologically derived descriptors from in vitro assays or pathway perturbations improved the model prediction ability. In vivo toxicity endpoints proved generally challenging to model. Few endpoints could be readily modeled with a balanced accuracy (BA) above 0.7. We also constructed in silico models to predict the outcome of 144 in vitro assays. This showed better statistical metrics with 79 out of 144 assays having median balanced accuracy above 0.7. This suggests that the in vitro datasets have a better modelability than in vivo animal toxicities for the given datasets. Moreover, we published an online platform (http://iprior.ochem.eu) that automates large-scale model building and analysis.
- Published
- 2015
17. How accurately can we predict the melting points of drug-like compounds?
- Author
-
Tetko IV, Sushko Y, Novotarskyi S, Patiny L, Kondratov I, Petrenko AE, Charochkina L, and Asiri AM
- Subjects
- Artificial Intelligence, Models, Statistical, Statistics as Topic, Chemistry, Pharmaceutical, Informatics methods, Pharmaceutical Preparations chemistry, Transition Temperature
- Abstract
This article contributes a highly accurate model for predicting the melting points (MPs) of medicinal chemistry compounds. The model was developed using the largest published data set, comprising more than 47k compounds. The distributions of MPs in drug-like and drug lead sets showed that >90% of molecules melt within [50,250]°C. The final model calculated an RMSE of less than 33 °C for molecules from this temperature interval, which is the most important for medicinal chemistry users. This performance was achieved using a consensus model that performed calculations to a significantly higher accuracy than the individual models. We found that compounds with reactive and unstable groups were overrepresented among outlying compounds. These compounds could decompose during storage or measurement, thus introducing experimental errors. While filtering the data by removing outliers generally increased the accuracy of individual models, it did not significantly affect the results of the consensus models. Three analyzed distances to models did not allow us to flag molecules whose MP values fell outside the applicability domain of the model. We believe that this negative result and the public availability of data from this article will encourage future studies to develop better approaches to define the applicability domain of models. The final model, MP data, and identified reactive groups are available online at http://ochem.eu/article/55638.
- Published
- 2014
18. Prediction-driven matched molecular pairs to interpret QSARs and aid the molecular optimization process.
- Author
-
Sushko Y, Novotarskyi S, Körner R, Vogt J, Abdelaziz A, and Tetko IV
- Abstract
Background: QSAR is an established and powerful method for cheap in silico assessment of physicochemical properties and biological activities of chemical compounds. However, QSAR models are rather complex mathematical constructs that cannot easily be interpreted. Medicinal chemists would benefit from practical guidance regarding which molecules to synthesize. Another possible approach is analysis of pairs of very similar molecules, so-called matched molecular pairs (MMPs). Such an approach allows identification of molecular transformations that affect particular activities (e.g. toxicity). In contrast to QSAR, chemical interpretation of these transformations is straightforward. Furthermore, such transformations can give medicinal chemists useful hints for the hit-to-lead optimization process. Results: The current study suggests a combination of QSAR and MMP approaches by finding MMP transformations based on QSAR predictions for large chemical datasets. The study shows that such an approach, referred to as prediction-driven MMP analysis, is a useful tool for medicinal chemists, allowing identification of large numbers of "interesting" transformations that can be used to drive the molecular optimization process. All the methodological developments have been implemented as software products available online as part of OCHEM (http://ochem.eu/). Conclusions: The prediction-driven MMPs methodology was exemplified by two use cases: modelling of aquatic toxicity and CYP3A4 inhibition. This approach helped us to interpret QSAR models and allowed identification of a number of "significant" molecular transformations that affect the desired properties. This can facilitate drug design as a part of molecular optimization process. Graphical Abstract: Molecular matched pairs and transformation graphs facilitate interpretable molecular optimisation process.
- Published
- 2014
19. The QSPR-THESAURUS: the online platform of the CADASTER project.
- Author
-
Brandmaier S, Peijnenburg W, Durjava MK, Kolar B, Gramatica P, Papa E, Bhhatarai B, Kovarich S, Cassani S, Roy PP, Rahmberg M, Öberg T, Jeliazkova N, Golsteijn L, Comber M, Charochkina L, Novotarskyi S, Sushko I, Abdelaziz A, D'Onofrio E, Kunwar P, Ruggiu F, and Tetko IV
- Subjects
- Linear Models, Research Design, Vocabulary, Controlled, Hazardous Substances toxicity, Internet, Quantitative Structure-Activity Relationship, Risk Assessment
- Abstract
The aim of the CADASTER project (CAse Studies on the Development and Application of in Silico Techniques for Environmental Hazard and Risk Assessment) was to exemplify REACH-related hazard assessments for four classes of chemical compound, namely, polybrominated diphenylethers, per and polyfluorinated compounds, (benzo)triazoles, and musks and fragrances. The QSPR-THESAURUS website (http://qspr-thesaurus.eu) was established as the project's online platform to upload, store, apply, and also create, models within the project. We overview the main features of the website, such as model upload, experimental design and hazard assessment to support risk assessment, and integration with other web tools, all of which are essential parts of the QSPR-THESAURUS. (2014 FRAME.)
- Published
- 2014
20. Development of dimethyl sulfoxide solubility models using 163,000 molecules: using a domain applicability metric to select more reliable predictions.
- Author
-
Tetko IV, Novotarskyi S, Sushko I, Ivanov V, Petrenko AE, Dieden R, Lebon F, and Mathieu B
- Subjects
- Linear Models, Neural Networks, Computer, Reproducibility of Results, Solubility, Support Vector Machine, Artificial Intelligence, Databases, Pharmaceutical, Dimethyl Sulfoxide chemistry, Informatics methods
- Abstract
The dimethyl sulfoxide (DMSO) solubility data from Enamine and two UCB pharma compound collections were analyzed using 8 different machine learning methods and 12 descriptor sets. The analyzed data sets were highly imbalanced with 1.7-5.8% nonsoluble compounds. The libraries' enrichment by soluble molecules from the set of 10% of the most reliable predictions was used to compare prediction performances of the methods. The highest accuracies were calculated using a C4.5 decision classification tree, random forest, and associative neural networks. The performances of the methods developed were estimated on individual data sets and their combinations. The developed models provided on average a 2-fold decrease of the number of nonsoluble compounds amid all compounds predicted as soluble in DMSO. However, a 4-9-fold enrichment was observed if only 10% of the most reliable predictions were considered. The structural features influencing compounds to be soluble or nonsoluble in DMSO were also determined. The best models developed with the publicly available Enamine data set are freely available online at http://ochem.eu/article/33409 .
- Published
- 2013
- Full Text
- View/download PDF
21. From descriptors to predicted properties: experimental design by using applicability domain estimation.
- Author
-
Brandmaier S, Novotarskyi S, Sushko I, and Tetko IV
- Subjects
- Regression Analysis, Research Design, Risk Assessment methods, Hazardous Substances toxicity
- Abstract
Reliable methods for representative sub-sampling are crucial for experimental design and risk assessment within the European Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) system. We developed experimental design approaches, by utilising predicted properties and the 'distance to model' parameter, to estimate the benefits of certain compounds to the quality of a resulting model. A statistical evaluation of four regression data sets and one classification data set showed that the adaptive concept of iteratively refining the representation of the chemical space contributes to a more efficient and more reliable selection in comparison to traditional approaches. The evaluation of compounds with regard to the uncertainty and the correlation of prediction is beneficial, in particular for regression data sets of sufficient size, whereas the use of predicted properties to define the chemical space is beneficial for classification models. (2013 FRAME.)
- Published
- 2013
- Full Text
- View/download PDF
22. Modeling of non-additive mixture properties using the Online CHEmical database and Modeling environment (OCHEM).
- Author
-
Oprisiu I, Novotarskyi S, and Tetko IV
- Abstract
The Online Chemical Modeling Environment (OCHEM, http://ochem.eu) is a web-based platform that provides tools for automation of typical steps necessary to create a predictive QSAR/QSPR model. The platform consists of two major subsystems: a database of experimental measurements and a modeling framework. So far, OCHEM has been limited to the processing of individual compounds. In this work, we extended OCHEM with a new ability to store and model properties of binary non-additive mixtures. The developed system is publicly accessible, meaning that any user on the Web can store new data for binary mixtures and develop models to predict their non-additive properties. The database already contains almost 10,000 data points for the density, bubble point, and azeotropic behavior of binary mixtures. For these data, we developed models for both qualitative (azeotrope/zeotrope) and quantitative endpoints (density and bubble points) using different learning methods and specially developed descriptors for mixtures. The prediction performance of the models was similar to or more accurate than results reported in previous studies. Thus, we have developed and made publicly available a powerful system for modeling mixtures of chemical compounds on the Web.
- Published
- 2013
- Full Text
- View/download PDF
23. A comparison of different QSAR approaches to modeling CYP450 1A2 inhibition.
- Author
-
Novotarskyi S, Sushko I, Körner R, Pandey AK, and Tetko IV
- Subjects
- Enzyme Inhibitors chemistry, Humans, Molecular Conformation, Artificial Intelligence, Cytochrome P-450 CYP1A2 Inhibitors, Enzyme Inhibitors pharmacology, Quantitative Structure-Activity Relationship
- Abstract
Prediction of the CYP450 inhibition activity of small molecules is an important task because of the high risk of drug-drug interactions. CYP1A2 is an important member of the CYP450 superfamily and accounts for 15% of total CYP450 presence in human liver. This article compares 80 in-silico QSAR models that were created by following the same procedure with different combinations of descriptors and machine learning methods. The training and test sets consist of 3745 and 3741 inhibitors and noninhibitors from the PubChem BioAssay database. A heterogeneous external test set of 160 inhibitors was collected from literature. The studied descriptor sets involve E-state, Dragon and ISIDA SMF descriptors. Machine learning methods involve Associative Neural Networks (ASNN), K Nearest Neighbors (kNN), Random Tree (RT), C4.5 Tree (J48), and Support Vector Machines (SVM). The influence of descriptor selection on model accuracy was studied. The benefits of the "bagging" modeling approach were shown. The applicability domain approach was successfully applied in this study, ways of increasing model accuracy through use of applicability domain measures were demonstrated, and fragment-based model interpretation was performed. The most accurate models in this study achieved values of 83% and 68% correctly classified instances on the internal and external test sets, respectively. The applicability domain approach allowed increasing the prediction accuracy to 90% for 78% of the internal and 17% of the external test sets, respectively. The most accurate models are available online at http://ochem.eu/models/Q5747.
- Published
- 2011
- Full Text
- View/download PDF