4 results on '"Yang, Zhi-Jiang"'
Search Results
2. Structural Analysis and Identification of False Positive Hits in Luciferase-Based Assays.
- Author
- Yang ZY, Dong J, Yang ZJ, Lu AP, Hou TJ, and Cao DS
- Subjects
- Algorithms, Luciferases, Reproducibility of Results, Databases, Chemical, High-Throughput Screening Assays
- Abstract
Luciferase-based bioluminescence detection techniques are highly favored in high-throughput screening (HTS), in which the firefly luciferase (FLuc) is the most commonly used variant. However, FLuc inhibitors can interfere with the activity of luciferase, which may result in false positive signals in HTS assays. In order to reduce the unnecessary cost of time and money, an in silico prediction model for FLuc inhibitors is highly desirable. In this study, we built an extensive data set consisting of 20 888 FLuc inhibitors and 198 608 noninhibitors, and then developed a group of classification models based on the combination of three machine learning (ML) algorithms and four types of molecular representations. The best prediction model based on XGBoost and ECFP4 and MOE2d descriptors yielded a balanced accuracy (BA) of 0.878 and an area under the receiver operating characteristic curve (AUC) value of 0.958 for the validation set, and a BA of 0.886 and an AUC of 0.947 for the test set. Three external validation sets, including set 1 (3231 FLuc inhibitors and 69 783 noninhibitors), set 2 (695 FLuc inhibitors and 75 913 noninhibitors), and set 3 (1138 FLuc inhibitors and 8155 noninhibitors), were used to verify the predictive ability of our models. The BA values for the three external validation sets given by the best model are 0.864, 0.845, and 0.791, respectively. In addition, the important features or structural fragments related to FLuc inhibitors were recognized by the Shapley additive explanations (SHAP) method along with their influences on predictions, which may provide valuable clues to detecting undesirable luciferase inhibitors. Based on the important and explanatory features, 16 rules were proposed for detecting FLuc inhibitors, which can achieve a correction rate of 70% for FLuc inhibitors. 
Furthermore, a comparison with existing prediction rules and models for FLuc inhibitors used in virtual screening verified the high reliability of the models and rules proposed in this study. We also used the model to screen three curated chemical databases, and almost 10% of the molecules in the evaluated databases were predicted as inhibitors, highlighting the potential risk of false positives in luciferase-based assays. Finally, a public web server called ChemFLuc was developed (http://admet.scbdd.com/chemfluc/index/), and it offers a freely available service to predict potential FLuc inhibitors.
- Published
- 2020
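The abstract above reports balanced accuracy (BA) rather than plain accuracy, which matters because the data set is heavily imbalanced (20 888 inhibitors against 198 608 noninhibitors). The following is a minimal sketch of how BA is conventionally computed as the mean of sensitivity and specificity. The function name and the toy labels are illustrative assumptions, not taken from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = inhibitor)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # inhibitors found
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # inhibitors missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # noninhibitors kept
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy data (hypothetical): 2 inhibitors, 8 noninhibitors.
toy_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# A trivial classifier that labels everything as a noninhibitor.
toy_pred = [0] * 10

plain_accuracy = sum(t == p for t, p in zip(toy_true, toy_pred)) / len(toy_true)
toy_ba = balanced_accuracy(toy_true, toy_pred)
```

On this toy set the trivial classifier scores 0.8 on plain accuracy but only 0.5 on BA, which is why BA is the more informative figure when one class dominates.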
3. Systematic Modeling of log D7.4 Based on Ensemble Machine Learning, Group Contribution, and Matched Molecular Pair Analysis.
- Author
- Fu L, Liu L, Yang ZJ, Li P, Ding JJ, Yun YH, Lu AP, Hou TJ, and Cao DS
- Subjects
- Algorithms, Drug Discovery methods, Lipids chemistry, Quantitative Structure-Activity Relationship, Machine Learning, Models, Molecular
- Abstract
Lipophilicity, as evaluated by the n-octanol/buffer solution distribution coefficient at pH = 7.4 (log D7.4), is a major determinant of various absorption, distribution, metabolism, elimination, and toxicology (ADMET) parameters of drug candidates. In this study, we developed several quantitative structure-property relationship (QSPR) models to predict log D7.4 based on a large and structurally diverse data set. Eight popular machine learning algorithms were employed to build the prediction models with 43 molecular descriptors selected by a wrapper feature selection method. The results demonstrated that XGBoost yielded better prediction performance than any other single model (R_T^2 = 0.906 and RMSE_T = 0.395). Moreover, the consensus model from the top three models could continue to improve the prediction performance (R_T^2 = 0.922 and RMSE_T = 0.359). The robustness, reliability, and generalization ability of the models were strictly evaluated by the Y-randomization test and applicability domain analysis. Moreover, the group contribution model based on 110 atom types and the local models for different ionization states were also established and compared to the global models. The results demonstrated that the descriptor-based consensus model is superior to the group contribution method, and the local models have no advantage over the global models. Finally, matched molecular pair (MMP) analysis and descriptor importance analysis were performed to extract transformation rules and give some explanations related to log D7.4. In conclusion, we believe that the consensus model developed in this study can be used as a reliable and promising tool to evaluate log D7.4 in drug discovery.
- Published
- 2020
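The abstract above compares single models with a consensus of the top three models using R² and RMSE. The sketch below illustrates that comparison under a stated assumption: the consensus is taken as the plain arithmetic mean of the individual predictions, which the abstract does not confirm. All function names and numbers are hypothetical toy values, not results from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def consensus(predictions):
    """Assumed consensus rule: average each molecule's predictions across models."""
    return [sum(column) / len(column) for column in zip(*predictions)]


# Hypothetical observed log D7.4 values and three models' predictions.
observed = [1.0, 2.0, 3.0, 4.0]
model_a = [1.3, 1.9, 3.2, 3.8]
model_b = [0.8, 2.2, 2.9, 4.3]
model_c = [1.1, 2.1, 2.7, 4.1]

combined = consensus([model_a, model_b, model_c])
single_rmse = [rmse(observed, m) for m in (model_a, model_b, model_c)]
combined_rmse = rmse(observed, combined)
combined_r2 = r_squared(observed, combined)
```

In this toy case the averaged predictions have a lower RMSE than any single model because the individual errors partly cancel. Averaging does not guarantee an improvement in general; it helps most when the models' errors are weakly correlated.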
4. Structural Analysis and Identification of Colloidal Aggregators in Drug Discovery.
- Author
- Yang ZY, Yang ZJ, Dong J, Wang LL, Zhang LX, Ding JJ, Ding XQ, Lu AP, Hou TJ, and Cao DS
- Subjects
- Computer Simulation, Databases, Pharmaceutical, Drug Design, Humans, Molecular Structure, Software, Structure-Activity Relationship, Drug Discovery methods, Pharmaceutical Preparations chemistry
- Abstract
Aggregation has been posing a great challenge in drug discovery. Current computational approaches aiming to filter out aggregated molecules based on their similarity to known aggregators, such as Aggregator Advisor, have low prediction accuracy, and therefore development of reliable in silico models to detect aggregators is highly desirable. In this study, we built a data set consisting of 12 119 aggregators and 24 172 drugs or drug candidates and then developed a group of classification models based on the combination of two ensemble learning approaches and five types of molecular representations. The best model yielded an accuracy of 0.950 and an area under the curve (AUC) value of 0.987 for the training set, and an accuracy of 0.937 and an AUC of 0.976 for the test set. The best model also gave reliable predictions to the external validation set with 5681 aggregators since 80% of molecules were predicted to be aggregators with a prediction probability higher than 0.9. More importantly, we explored the relationship between colloidal aggregation and molecular features, and generalized a set of simple rules to detect aggregators. Molecular features, such as log D, the number of hydroxyl groups, the number of aromatic carbons attached to a hydrogen atom, and the number of sulfur atoms in aromatic heterocycles, would be helpful to distinguish aggregators from nonaggregators. A comparison with numerous existing druglikeness and aggregation filtering rules and models used in virtual screening verified the high reliability of the model and rules proposed in this study. We also used the model to screen several curated chemical databases, and almost 20% of molecules in the evaluated databases were predicted as aggregators, highlighting the potential high risk of aggregation in screening. Finally, we developed an online Web server of ChemAGG (http://admet.scbdd.com/ChemAGG/index), which offers a freely available tool to detect aggregators.
- Published
- 2019
Discovery Service for Jio Institute Digital Library