Tayefi, Maryam, Saberi-Karimian, Maryam, Esmaeili, Habibollah, Zadeh, Alireza, Ebrahimi, Mahmoud, Mohebati, Mohsen, Heidari-Bakavoli, Alireza, Azarpajouh, Mahmoud, Heshmati, Masoud, Safarian, Mohammad, Nematy, Mohsen, Parizadeh, Seyed, Ferns, Gordon, and Ghayour-Mobarhan, Majid
Metabolic syndrome is a clustering of metabolic abnormalities that includes central obesity, dyslipidemia, insulin resistance, and increased blood pressure. The aim of this study was to identify and evaluate the risk factors associated with metabolic syndrome using a decision tree algorithm as a data mining tool. A total of 6578 individuals were included in the analysis, using a body mass index (BMI) cutoff of ≥ 25 kg/m² to define overweight, in accordance with International Diabetes Federation (IDF) criteria. Subjects with obesity plus two or more of the other criteria for defining metabolic syndrome were assigned to the metabolic syndrome group. Of the 6578 subjects, 70% (4539 subjects) were selected as the training dataset and 30% (2039 subjects) were used as the testing dataset to evaluate the performance of the decision tree. Two models were evaluated. In model I, the input variables were age, sex, educational level, marital and employment status, fasting serum triglyceride (TG), total cholesterol (TC), high-density lipoprotein cholesterol (HDL-C), uric acid, fasting blood glucose (FBG), high-sensitivity C-reactive protein (Hs-CRP), systolic (SBP) and diastolic (DBP) blood pressure, and physical activity level. In model II, the input variables were age, sex, Hs-CRP, white blood cell count (WBC), red blood cell count (RBC), hemoglobin (HGB), hematocrit (HCT), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), platelet count (PLT), red cell distribution width (RDW), and platelet distribution width (PDW). The models were validated by constructing a receiver operating characteristic (ROC) curve. The results showed that in model I, fasting serum TG was the most important risk factor associated with metabolic syndrome. In model II, serum Hs-CRP was identified as a risk factor for metabolic syndrome.
The sensitivity, specificity, accuracy, and area under the ROC curve (AUC) were 99%, 94%, 97%, and 0.972 for model I and 74%, 77%, 76%, and 0.812 for model II, respectively. Our findings for model I suggest that the IDF criteria are suitable for classifying individuals within the Iranian population into those with or without metabolic syndrome. Furthermore, model II identified serum Hs-CRP concentration as a risk factor for metabolic syndrome within the Iranian population.
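
The group assignment and the performance measures reported above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy labels, and the toy scores are all hypothetical, and the AUC is computed with the standard rank-based (Mann-Whitney) formulation rather than by whatever software the study used. The grouping rule encodes only what the abstract states (BMI of at least 25 kg/m² plus two or more of the other criteria); the individual criterion thresholds are not given in the abstract, so the function takes the count of criteria met as input.

```python
def in_metabolic_syndrome_group(bmi, other_criteria_met):
    """Grouping rule as described in the abstract (hypothetical helper):
    BMI >= 25 kg/m^2 plus two or more of the other defining criteria."""
    return bmi >= 25 and other_criteria_met >= 2


def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from binary labels (1 = MetS)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy test-set labels, predictions, and scores (invented for illustration only).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.3, 0.2, 0.4, 0.7]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

In practice these quantities would be computed on the held-out 30% testing dataset, with the scores taken from the fitted decision tree's predicted class probabilities.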