Analysis of the αβ T cell receptor (TCR) repertoire in patients with myelodysplastic syndrome (MDS) using the technique of TCR β chain spectratyping has provided valuable insights into the pathophysiology of cytopenias in a subset of patients. TCR β chain spectratypes represent complex datasets, however, and statistical tools for their comprehensive analysis are limited. To enable global comparison of spectratype data from different individuals, we developed a robust statistical method based on k-means clustering analysis, and applied it to the analysis of the αβ TCR repertoires in the peripheral blood of 50 MDS patients and 23 age-matched healthy controls. From each of the 23 CDR3 length distributions (one for each of 23 Vβ families) comprising each spectratype, 4 features were extracted: the number of peaks in the distribution, the relative height of the largest peak, the skewness of the distribution, and the kurtosis. K-means clustering was applied at the Vβ family level to the extracted feature data from the CDR3 length distributions across the 73 subjects. This analysis typically identified two distinct clusters: a “normal” cluster characterized by a higher number of peaks, lower maximum relative height, lower skewness, and lower kurtosis, and a second “abnormal” cluster with the opposite characteristics. Thus, each CDR3 length distribution was classified as normal or abnormal according to its assignment to one of these two clusters. The mean number of abnormal CDR3 length distributions per individual was 1.6 (range, 0 to 5) for the age-matched controls and 3.7 (range, 1 to 18) for the MDS patients (p=0.03). K-means clustering was also applied at the individual level to composite datasets consisting of the four features extracted from each of the 23 CDR3 length distributions in each individual’s spectratype. This higher-level clustering again generated 2 clusters.
The “normal” cluster contained all of the age-matched control subjects as well as 39 MDS patients, while the “abnormal” cluster contained the remaining 11 MDS patients, all of whom had profoundly abnormal TCR Vβ spectratypes. The median age in the abnormal group was higher than in the normal group, 67 versus 61 years, respectively (p=0.031). When the individuals in the two groups were analyzed according to the IPSS and WHO classification systems, 82% of patients in the abnormal group had high-risk disease (IPSS int-2 and high), compared with only 45% in the normal group (p=0.03), and 73% of patients in the abnormal group had advanced disease by WHO classification (>5% blasts), as opposed to 41% in the normal group (p=0.027). The 11 MDS patients in the abnormal cluster also had a higher median expression level of the Wilms’ tumor-1 (WT1) gene, as determined by quantitative RT-PCR, in the peripheral blood (0.034 versus 0.0062, p=0.047), and a higher median bone marrow blast count (10% versus 2%, p=0.056). The two groups of MDS patients were evaluated for differences in three variables that could confound the analysis of TCR Vβ spectratyping: peripheral blood lymphopenia, active infection, and a history of transfusion. The median peripheral blood lymphocyte count (1310 × 10³/ml versus 1410 × 10³/ml, respectively; p=0.37), the proportion of patients with a history of transfusion (70% versus 70%), and the incidence of MDS-related infection (27% versus 21%, respectively; p=0.69), defined as a viral, fungal or bacterial infection identified after the diagnosis of MDS but before sample acquisition, were not significantly different between the normal and abnormal groups. In order to evaluate the stability of spectratypes over time and during therapy, serial Vβ spectratyping analysis of bone marrow mononuclear cells was performed in 4 patients before and after treatment with azacytidine and etanercept.
In all 4 cases, the spectratypes remained stably abnormal over months of observation, during which time 2 patients achieved complete and 2 achieved partial remissions of their disease. In conclusion, we have developed a new statistical algorithm for spectratyping analysis that enabled the identification of a group of MDS patients with high-risk disease and highly abnormal αβ TCR repertoires. These findings further highlight the biological and clinical heterogeneity of MDS and provide the rationale for further studies of the αβ TCR repertoire in MDS.
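
The Vβ family-level step described above (four features per CDR3 length distribution, followed by two-cluster k-means) can be illustrated with a minimal Python sketch. This is not the authors' implementation, and several details are assumptions made only for illustration: a peak is counted as any length bin above a relative-height threshold `min_rel`, skewness is taken as an absolute value, features are z-scored before clustering, the two centroids are initialized deterministically from the first point and the point farthest from it, and the "abnormal" cluster is taken to be the one with the higher mean relative peak height. The function names and the toy spectratypes are hypothetical.

```python
import math

def extract_features(heights, min_rel=0.0):
    """Return [n_peaks, max relative height, |skewness|, kurtosis] for one
    CDR3 length distribution given as peak heights per length bin."""
    total = sum(heights)
    rel = [h / total for h in heights]
    n_peaks = sum(1 for r in rel if r > min_rel)  # assumed peak definition
    mean = sum(i * r for i, r in enumerate(rel))
    var = sum((i - mean) ** 2 * r for i, r in enumerate(rel))
    if var == 0.0:
        # A single-peak distribution has undefined skewness and kurtosis;
        # a real analysis would need an explicit convention for this case.
        raise ValueError("distribution needs at least two non-empty bins")
    m3 = sum((i - mean) ** 3 * r for i, r in enumerate(rel))
    m4 = sum((i - mean) ** 4 * r for i, r in enumerate(rel))
    return [float(n_peaks), max(rel), abs(m3) / var ** 1.5, m4 / var ** 2]

def standardize(rows):
    """Z-score each feature column (assumption: features are scaled)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def kmeans_two(points, max_iter=100):
    """Lloyd's algorithm with k=2 and deterministic farthest-point start."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    first = points[0]
    far = max(points, key=lambda p: dist2(p, first))
    centroids = [list(first), list(far)]
    labels = None
    for _ in range(max_iter):
        new = [0 if dist2(p, centroids[0]) <= dist2(p, centroids[1]) else 1
               for p in points]
        if new == labels:  # assignments stable: converged
            break
        labels = new
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = [sum(c) / len(members) for c in zip(*members)]
    return labels

# Toy data: three roughly bell-shaped and two oligoclonal-looking distributions.
spectra = [
    [1, 4, 10, 18, 18, 10, 4, 1],
    [2, 5, 11, 17, 19, 9, 4, 1],
    [1, 3, 9, 20, 17, 11, 5, 2],
    [0, 0, 1, 60, 2, 0, 0, 1],
    [1, 0, 0, 2, 55, 1, 0, 0],
]
features = [extract_features(s) for s in spectra]
labels = kmeans_two(standardize(features))

def mean_height(k):
    vals = [f[1] for f, lab in zip(features, labels) if lab == k]
    return sum(vals) / len(vals) if vals else 0.0

abnormal_cluster = max((0, 1), key=mean_height)
is_abnormal = [lab == abnormal_cluster for lab in labels]
print(is_abnormal)
```

Under these assumptions, summing `is_abnormal` over the 23 Vβ families of one subject would give that subject's count of abnormal CDR3 length distributions, and concatenating the 23 feature vectors per subject before calling `kmeans_two` would correspond to the individual-level clustering.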