Cell and tumor classification using gene expression data: construction of forests.

Authors :
Zhang H
Yu CY
Singer B
Source :
Proceedings of the National Academy of Sciences of the United States of America [Proc Natl Acad Sci U S A] 2003 Apr 01; Vol. 100 (7), pp. 4168-72. Date of Electronic Publication: 2003 Mar 17.
Publication Year :
2003

Abstract

The advent of gene chips has led to a promising technology for cell, tumor, and cancer classification. We exploit and expand the methodology of recursive partitioning trees for tumor and cell classification from microarray gene expression data. To improve classification and prediction accuracy, we introduce a deterministic procedure to form forests of classification trees and compare their performance with extant alternatives. When two published and commonly used data sets are used, we find that the deterministic forests perform similarly to the random forests in terms of the error rate obtained from the leave-one-out procedure, and all of the forests are far better than the single trees. In addition, we provide graphical presentations to facilitate interpretation of complex forests and compare our findings with the current biological literature. In addition to numerical improvement, the main advantage of deterministic forests is reproducibility and scientific interpretability of all steps in tree construction.
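The leave-one-out error rate mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' deterministic forest procedure, which the abstract does not specify. As a stand-in, it assumes a toy "forest" built by ranking one-split trees (stumps) by training error and keeping the best few, so that no randomness is involved. The function names (fit_stump, fit_forest, predict, loo_error) and the small expression matrix are hypothetical and invented for illustration only.

```python
from collections import Counter


def fit_stump(X, y, feature):
    """Find the best single threshold split on one feature.

    Returns (errors, feature, threshold, left_label, right_label),
    or None if the feature takes only one distinct value.
    """
    values = sorted(set(row[feature] for row in X))
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # midpoint between adjacent distinct values
        left = [lab for row, lab in zip(X, y) if row[feature] <= t]
        right = [lab for row, lab in zip(X, y) if row[feature] > t]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = sum(lab != left_label for lab in left)
        errors += sum(lab != right_label for lab in right)
        if best is None or errors < best[0]:
            best = (errors, feature, t, left_label, right_label)
    return best


def fit_forest(X, y, n_trees):
    """Keep the n_trees stumps with the lowest training error.

    Ties are broken by feature index, so the result is reproducible.
    This ranking rule is an assumption made for this sketch.
    """
    stumps = [fit_stump(X, y, f) for f in range(len(X[0]))]
    stumps = [s for s in stumps if s is not None]
    stumps.sort(key=lambda s: (s[0], s[1]))
    return stumps[:n_trees]


def predict(forest, row):
    """Classify one sample by majority vote over the stumps."""
    votes = Counter(
        left if row[f] <= t else right
        for _, f, t, left, right in forest
    )
    return votes.most_common(1)[0][0]


def loo_error(X, y, n_trees):
    """Leave-one-out error: refit without sample i, then test on sample i."""
    mistakes = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        forest = fit_forest(X_train, y_train, n_trees)
        mistakes += predict(forest, X[i]) != y[i]
    return mistakes / len(X)


# Hypothetical toy data: 6 samples, 3 "genes", 2 classes.
X = [
    [1.0, 5.0, 2.0],
    [1.2, 3.0, 2.5],
    [0.8, 4.0, 1.0],
    [3.0, 4.5, 2.2],
    [3.3, 3.5, 1.5],
    [2.9, 5.5, 2.8],
]
y = ["A", "A", "A", "B", "B", "B"]

single_tree_error = loo_error(X, y, n_trees=1)
forest_error = loo_error(X, y, n_trees=3)
print("single stump:", single_tree_error)
print("3-stump forest:", forest_error)
```

Because every step is deterministic, repeated runs give identical error estimates, which is the reproducibility property the abstract emphasizes. On real microarray data the trees would be deeper and the number of genes far larger than in this toy example.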

Details

Language :
English
ISSN :
0027-8424
Volume :
100
Issue :
7
Database :
MEDLINE
Journal :
Proceedings of the National Academy of Sciences of the United States of America
Publication Type :
Academic Journal
Accession number :
12642676
Full Text :
https://doi.org/10.1073/pnas.0230559100