The Poisson Index: a new probabilistic model for protein–ligand binding site similarity
- Source :
- Bioinformatics. 23:3001-3008
- Publication Year :
- 2007
- Publisher :
- Oxford University Press (OUP), 2007.
Abstract
- Motivation: The large-scale comparison of protein–ligand binding sites is problematic, in that measures of structural similarity are difficult to quantify and are not easily understood in terms of statistical similarity that can ultimately be related to structure and function. We present a binding site matching score, the Poisson Index (PI), based upon a well-defined statistical model. PI requires only the number of matching atoms between two sites and the size of the two sites, the same information used by the Tanimoto Index (TI), a comparable and widely used measure for molecular similarity. We apply PI and TI to a set of binding sites previously extracted automatically to determine the robustness and usefulness of both scores. Results: We found that PI outperforms TI; moreover, site similarity is poorly defined for TI at values around the 99.5% confidence level for which PI is well defined. A difference map at this confidence level shows that PI gives much more meaningful information than TI. We show individual examples where TI fails to distinguish either a false or a true site pairing, in contrast to PI, which performs much better. TI cannot handle large or small sites very well, or the comparison of large and small sites, in contrast to PI, which is shown to be much more robust. Despite the difficulty of determining a biological ‘ground truth’ for binding site similarity, we conclude that PI is a suitable measure of binding site similarity and could form the basis for a binding site classification scheme comparable to existing protein domain classification schema. Availability: PI is implemented in SitesBase www.modelling.leeds.ac.uk/sb/ Contact: r.m.jackson@leeds.ac.uk
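The abstract notes that TI is computed from the number of matching atoms and the sizes of the two sites. A minimal sketch of that calculation follows, assuming the standard Tanimoto coefficient (matches divided by the size of the union) applied to atom counts. The function name and arguments are illustrative and not taken from the paper or from SitesBase. The PI formula is not given in this record, so it is not sketched here.

```python
def tanimoto_index(n_a: int, n_b: int, n_match: int) -> float:
    """Standard Tanimoto coefficient from two site sizes and a match count.

    n_a, n_b : number of atoms in site A and site B (assumed inputs)
    n_match  : number of atoms matched between the two sites
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("site sizes must be positive")
    if not 0 <= n_match <= min(n_a, n_b):
        raise ValueError("n_match must lie between 0 and the smaller site size")
    # Union size = |A| + |B| - |A and B|
    return n_match / (n_a + n_b - n_match)


# A 20-atom site fully matched inside a 100-atom site
small_in_large = tanimoto_index(20, 100, 20)

# Two 50-atom sites sharing half of their atoms
equal_half = tanimoto_index(50, 50, 25)

print(small_in_large, equal_half)
```

In this sketch the fully matched small site scores lower than the half-matched pair of equal-sized sites, because the score can never exceed the ratio of the smaller to the larger site size. This is one way to see the size sensitivity of TI that the abstract describes for comparisons of large and small sites.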
- Subjects :
- Statistics and Probability
Matching (graph theory)
Structural similarity
Molecular Sequence Data
Ligands
Poisson distribution
Biochemistry
Measure (mathematics)
Similarity (network science)
Sequence Analysis, Protein
Protein Interaction Mapping
Statistics
Computer Simulation
Amino Acid Sequence
Molecular Biology
Mathematics
Binding Sites
Models, Statistical
Sequence Homology, Amino Acid
Proteins
Contrast (statistics)
Pattern recognition
Statistical model
Similitude
Computer Science Applications
Computational Mathematics
Models, Chemical
Computational Theory and Mathematics
Artificial intelligence
Algorithms
Protein Binding
Details
- ISSN :
- 1367-4811 and 1367-4803
- Volume :
- 23
- Database :
- OpenAIRE
- Journal :
- Bioinformatics
- Accession number :
- edsair.doi.dedup.....4f9dbdea4871f432ea7799aa90159e9d
- Full Text :
- https://doi.org/10.1093/bioinformatics/btm470