Quantitative assessment of relationship between sequence similarity and function similarity
- Source :
- BMC Genomics, Vol 8, Iss 1, p 222 (2007)
- Publication Year :
- 2007
- Publisher :
- BMC, 2007.
Abstract
- Background: Comparative sequence analysis is considered the first step towards annotating new proteins in genome annotation. However, sequence comparison may lead to the creation and propagation of function assignment errors. Thus, it is important to analyze the quality of sequence-based function assignment thoroughly and systematically using large-scale data. Results: We present an analysis of the relationship between sequence similarity and function similarity for the proteins in four model organisms: Arabidopsis thaliana, Saccharomyces cerevisiae, Caenorhabditis elegans, and Drosophila melanogaster. Using a measure of functional similarity based on the three categories of Gene Ontology (GO) classifications (biological process, molecular function, and cellular component), we quantified the correlation between functional similarity and sequence similarity, measured by sequence identity or by the statistical significance of the alignment, and compared this correlation against that for randomly chosen protein pairs. Conclusion: Various sequence-function relationships were identified from BLAST versus PSI-BLAST, sequence identity versus Expectation Value, GO indices versus semantic similarity approaches, and within-genome versus between-genome comparisons, for the three GO categories. Our study provides a benchmark to estimate the confidence in assignment of functions purely based on sequence similarity.
- Subjects :
- lcsh:QH426-470
Sequence analysis
lcsh:Biotechnology
Molecular Sequence Data
Arabidopsis
Saccharomyces cerevisiae
Computational biology
Biology
Genome
Structure-Activity Relationship
Semantic similarity
Similarity (network science)
lcsh:TP248.13-248.65
Genetics
Animals
Caenorhabditis elegans
Databases, Protein
Alignment-free sequence analysis
Sequence (medicine)
Sequence Homology, Amino Acid
Computational Biology
Proteins
Sequence Analysis, DNA
Function (mathematics)
Genome project
lcsh:Genetics
Drosophila melanogaster
Structural Homology, Protein
Research Article
Biotechnology
Details
- Language :
- English
- ISSN :
- 1471-2164
- Volume :
- 8
- Issue :
- 1
- Database :
- OpenAIRE
- Journal :
- BMC Genomics
- Accession number :
- edsair.doi.dedup.....e6e35391c069d561790ae187c74be833