Evaluation of Clustering Algorithms for Protein Complex and Protein Interaction Network Assembly
- Source :
- Journal of Proteome Research. 8:2944-2952
- Publication Year :
- 2009
- Publisher :
- American Chemical Society (ACS), 2009.
Abstract
- Assembling protein complexes and protein interaction networks from affinity purification-based proteomics data sets remains a challenge. When little a priori knowledge of the complexes exists, it is difficult to place proteins in the proper locations and evaluate the results of clustering approaches. Here we have systematically compared multiple hierarchical and partitioning clustering approaches using a well-characterized but highly complex human protein interaction network data set centered around the conserved AAA+ ATPases Tip49a and Tip49b. This network provides a challenge to clustering algorithms because Tip49a and Tip49b are present in four distinct complexes, the network contains modules, and the network has multiple attachments. We compared the use of binary data, quantitative proteomics data in the form of normalized spectral abundance factors, and the Z-score normalization. In our analysis, a partitioning approach indicated the major modules in a network. Next, while Euclidean distance was sensitive to scaling, with data transformation, all the attachments in a data set were recovered in one branch of a dendrogram. Finally, when Pearson correlation and hierarchical clustering were used, complexes were well separated and their attachments were placed in the proper locations. Each of these three approaches provided distinct information useful for assembly of a network of multiple protein complexes.
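The contrast the abstract draws between Euclidean distance (sensitive to scaling) and Pearson correlation can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names and toy numbers are hypothetical, and the NSAF formula used (spectral count divided by protein length, normalized by the sum of that ratio over all proteins in a purification) is the commonly cited definition, assumed here because the record does not spell it out.

```python
import math
from statistics import mean, pstdev

def nsaf(spectral_counts, lengths):
    """Assumed NSAF definition: (SpC / L) divided by the sum of SpC / L
    over all proteins detected in one purification."""
    saf = [count / length for count, length in zip(spectral_counts, lengths)]
    total = sum(saf)
    return [s / total for s in saf]

def z_score(values):
    """Center a profile on its mean and scale by its population standard deviation.
    Assumes the profile is not constant (non-zero standard deviation)."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson_distance(a, b):
    """1 - Pearson correlation; 0 means the profiles have identical shape."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)

# Hypothetical abundance profiles of two proteins across four purifications:
# same shape, but the second protein is ten times more abundant.
profile_a = [0.02, 0.04, 0.01, 0.08]
profile_b = [10 * x for x in profile_a]

raw_euclidean = euclidean(profile_a, profile_b)                         # large: driven by scale
scaled_euclidean = euclidean(z_score(profile_a), z_score(profile_b))    # near zero after Z-scoring
shape_distance = pearson_distance(profile_a, profile_b)                 # near zero without any scaling
```

In this toy case the raw Euclidean distance separates two proteins that co-vary perfectly, while Z-score transformation or a correlation-based distance groups them. Either distance could then be passed to a hierarchical clustering routine of the reader's choice.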
- Subjects :
- Proteomics
Normalization (statistics)
Correlation clustering
Biology
computer.software_genre
Models, Biological
Biochemistry
Interaction network
Protein Interaction Mapping
Cluster Analysis
Humans
Databases, Protein
Cluster analysis
Dendrogram
DNA Helicases
Proteins
General Chemistry
Hierarchical clustering
Data set
ComputingMethodologies_PATTERNRECOGNITION
Multiprotein Complexes
Binary data
ATPases Associated with Diverse Cellular Activities
Data mining
Carrier Proteins
computer
Algorithms
Details
- ISSN :
- 1535-3907 and 1535-3893
- Volume :
- 8
- Database :
- OpenAIRE
- Journal :
- Journal of Proteome Research
- Accession number :
- edsair.doi.dedup.....99c002429713f2f56b67540a3234605b