Introduction: Evaluations from multiple perspectives are needed to determine whether syndromic classification of chief complaints is useful for outbreak detection. Objective: This study quantified the performance of a naïve Bayesian classifier, Complaint Classifier (CoCo), at syndromic classification from chief complaints by using a three-stage evaluation process. Methods: First, CoCo was evaluated for technical accuracy to answer the question, "Can we accurately classify a chief-complaint string into a syndromic category?" For example, the area under the receiver operating characteristic (ROC) curve was calculated for CoCo classifications of 28,990 chief complaints into eight syndromes; the complaints came from 30 hospitals in Utah during a 1-month period (Olszewski RT. Bayesian classification of triage diagnoses for the early detection of epidemics. In: Proceedings of the Florida Artificial Intelligence Research Society Conference; May 12-14, 2003; St. Augustine, FL. Menlo Park, CA: AAAI Press; 2003:412-6). Reference standard classifications were made by a physician reading only the chief complaints. Second, CoCo was evaluated for its performance at case classification to answer the question, "Does the syndromic classification from the chief complaint accurately represent the patient's clinical state?" For example, the sensitivity and specificity of CoCo classifications were measured for 527,228 patients seen over a 13-year period at a single hospital in Pittsburgh, Pennsylvania (Chapman WW, Dowling JN, Wagner MM. Classification of emergency department chief complaints into seven syndromes: a retrospective analysis of 527,228 patients. Ann Emerg Med. In press 2005). Reference standard classifications were assigned by syndromic groups of primary International Classification of Diseases, Ninth Revision (ICD-9) discharge diagnoses.
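As an illustration of the case-classification measures described above, the following minimal sketch computes sensitivity and specificity for one syndrome from paired classifier and reference-standard labels. It is not the study's code: the function name `sensitivity_specificity` and the toy labels are assumptions made for this example.

```python
from typing import Iterable, Tuple


def sensitivity_specificity(
    predicted: Iterable[bool], reference: Iterable[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for one syndrome.

    predicted: True where the classifier assigned the syndrome (hypothetical input).
    reference: True where the reference standard assigned the syndrome.
    """
    tp = fp = tn = fn = 0
    for p, r in zip(predicted, reference):
        if r and p:
            tp += 1  # syndrome present and detected
        elif r and not p:
            fn += 1  # syndrome present but missed
        elif p:
            fp += 1  # syndrome absent but flagged
        else:
            tn += 1  # syndrome absent and not flagged
    # Guard against division by zero when a class is empty.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels for six visits (illustrative only, not study data).
predicted = [True, False, True, False, False, True]
reference = [True, True, False, False, False, True]
sens, spec = sensitivity_specificity(predicted, reference)
```

In this toy example two of the three reference-positive visits are flagged and two of the three reference-negative visits are left unflagged, so both measures equal 2/3.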
Third, CoCo was evaluated for its performance at outbreak detection to answer the question, "How timely and accurately can we detect a public health outbreak by monitoring chief-complaint classifications?" For example, by using the Exponentially Weighted Moving Average (EWMA) detection algorithm, the timeliness, sensitivity, and specificity of CoCo-classified chief complaints were measured for detecting outbreaks of pediatric respiratory and gastrointestinal illness (Ivanov O, Gesteland P, Hogan W, Mundorff MB, Wagner MM. Detection of pediatric respiratory and gastrointestinal outbreaks from free-text chief complaints. In: Proceedings of the American Medical Informatics Association Annual Fall Symposium; November 8-12, Washington, DC. Bethesda, MD: American Medical Informatics Association; 2003:318-22). The reference standard comprised ICD-9 discharge diagnoses of pneumonia, influenza, and bronchiolitis for respiratory illness and of rotavirus and pediatric gastroenteritis for gastrointestinal illness. Results: For technical accuracy, areas under the ROC curve ranged from 78% for botulinic syndrome to 96% for respiratory syndrome. For case classification, sensitivity and specificity, respectively, were as follows: respiratory: 63%, 94%; botulinic: 30%, 99%; gastrointestinal: 69%, 95%; neurologic: 67%, 93%; rash: 47%, 99%; constitutional: 46%, 97%; and hemorrhagic: 75%, 99%. For outbreak detection, three respiratory and three gastrointestinal outbreaks were detected by CoCo with 100% sensitivity and specificity. Time series of chief complaints correlated with hospital admissions and preceded them by an average of 10.3 days for respiratory outbreaks and 29 days for gastrointestinal outbreaks. Conclusion: Three stages of evaluation are useful in determining the performance of syndromic surveillance from chief complaints.…
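To make the outbreak-detection stage more concrete, the sketch below applies a textbook one-sided EWMA control chart to a daily count series. The abstract does not report the parameters used, so the smoothing weight `lam`, the threshold multiplier `k`, the baseline length `baseline_days`, and the function name `ewma_alarms` are all assumptions chosen for illustration; this is not the cited study's implementation.

```python
from statistics import mean, stdev
from typing import List, Sequence


def ewma_alarms(
    counts: Sequence[float],
    baseline_days: int = 7,  # assumed baseline window length
    lam: float = 0.4,        # assumed smoothing weight (0 < lam <= 1)
    k: float = 3.0,          # assumed threshold in standard deviations
) -> List[int]:
    """Return indices of days on which the EWMA statistic exceeds its upper limit."""
    baseline = counts[:baseline_days]
    mu = mean(baseline)
    sigma = stdev(baseline)
    # Upper control limit using the asymptotic standard deviation of the EWMA.
    limit = mu + k * sigma * (lam / (2 - lam)) ** 0.5
    z = mu  # start the smoothed statistic at the baseline mean
    alarms = []
    for day in range(baseline_days, len(counts)):
        z = lam * counts[day] + (1 - lam) * z
        if z > limit:
            alarms.append(day)
    return alarms


# Toy daily counts of one syndrome (illustrative only, not study data).
daily_counts = [10, 12, 11, 9, 10, 12, 11, 10, 11, 25, 30, 28]
alarm_days = ewma_alarms(daily_counts)
```

With these toy counts, the alarms begin on the first day of the simulated surge (index 9). Timeliness in the sense used above could then be estimated by comparing the first alarm day with the first alarm day of a reference series such as hospital admissions.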