Identification of Diverse Database Subsets using Property-Based and Fragment-Based Molecular Descriptions

Authors :
Michael H. Charlton
Geoffrey M. Downs
Peter Willett
Roger Lahana
John M. Barnard
John D. Holliday
Florence Casset
Mark Ashton
Dominique Gorse
Source :
Quantitative Structure-Activity Relationships. 21:598-604
Publication Year :
2002
Publisher :
Wiley, 2002.

Abstract

This paper reports a comparison of calculated molecular properties and of 2D fragment bit-strings when used for the selection of structurally diverse subsets of a file of 44295 compounds. MaxMin dissimilarity-based selection and k-means cluster-based selection are used to select subsets containing between 1% and 20% of the file. Investigation of the numbers of bioactive molecules in the selected subsets suggests that the MaxMin subsets are noticeably superior to the k-means subsets, that the property-based descriptors are marginally superior to the fragment-based descriptors, and that both approaches are noticeably superior to random selection.
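
For readers unfamiliar with MaxMin dissimilarity-based selection, the general idea can be sketched as follows: start from one compound, then repeatedly add the compound whose nearest already-selected neighbour is furthest away. This is a minimal illustration only, not the authors' implementation. The record does not say how fingerprints were encoded, which similarity coefficient was used, or how the first compound was chosen, so the set-of-bits representation, the Tanimoto coefficient, the random starting compound, and the names `tanimoto` and `maxmin_select` are all assumptions made for this example.

```python
import random


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of 'on' bit positions."""
    if not a and not b:
        return 1.0  # convention assumed here: two empty fingerprints are identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def maxmin_select(fingerprints, n_select, seed=0):
    """Return indices of n_select compounds chosen by a MaxMin-style procedure.

    Assumption: the first compound is picked at random; other seeding rules
    (for example, the compound most dissimilar to all others) are possible.
    """
    n = len(fingerprints)
    if not 0 < n_select <= n:
        raise ValueError("n_select must be between 1 and the number of compounds")

    rng = random.Random(seed)
    first = rng.randrange(n)
    selected = [first]

    # min_dist[i] is the dissimilarity between compound i and its nearest
    # selected compound; selected compounds are marked with -1.0.
    min_dist = [1.0 - tanimoto(fp, fingerprints[first]) for fp in fingerprints]
    min_dist[first] = -1.0

    while len(selected) < n_select:
        # Pick the unselected compound that is furthest from the current subset.
        nxt = max(range(n), key=min_dist.__getitem__)
        selected.append(nxt)
        min_dist[nxt] = -1.0

        # Update nearest-selected distances with the newly added compound.
        for i in range(n):
            if min_dist[i] >= 0.0:
                d = 1.0 - tanimoto(fingerprints[i], fingerprints[nxt])
                if d < min_dist[i]:
                    min_dist[i] = d

    return selected


if __name__ == "__main__":
    # Toy data: two pairs of near-duplicates and one singleton.
    toy = [{1, 2, 3}, {1, 2, 3, 4}, {10, 11}, {10, 11, 12}, {20}]
    print(maxmin_select(toy, 3))
```

Each iteration costs one pass over the file, so selecting k compounds from n takes on the order of k × n similarity calculations. For property-based descriptors, the same loop would apply with a distance on the property vectors in place of `1 - tanimoto`.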

Details

ISSN :
1521-3838 and 0931-8771
Volume :
21
Database :
OpenAIRE
Journal :
Quantitative Structure-Activity Relationships
Accession number :
edsair.doi...........fe08ed50a94993e2e680ff3d5574a4f4
Full Text :
https://doi.org/10.1002/qsar.200290002