This paper studies efficient, scalable classification algorithms that tightly integrate data-mining classification techniques with relational database technology. It proposes an approach that builds a classifier by grouping and counting, using the SQL (Structured Query Language) provided by the relational database system to carry out the major computation tasks. To improve performance, several optimization strategies and a strategy for pruning redundant rules are also proposed, and a feature (relevant attribute) selection method is integrated with the process of discovering classification rules. With these methods and strategies, the classification algorithm can quickly find a compact set of classification rules in a large volume of data. Besides achieving accuracy comparable to that of current popular classification algorithms and a high training speed, the algorithm scales linearly with both the number of training samples and the number of attributes, and it is simple to implement.
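The grouping-and-counting idea can be illustrated with a minimal sketch in which the database does the counting through a single `GROUP BY` query and the client only turns the counts into candidate rules. This is not the paper's algorithm: the table `training`, its columns, the helper names `count_by_group` and `single_attribute_rules`, and the `min_confidence` threshold are all hypothetical, and the optimization, pruning, and feature selection strategies mentioned in the abstract are not modeled.

```python
import sqlite3


def count_by_group(conn, table, attribute, class_column):
    """Return {(attribute_value, class_label): count} from one GROUP BY query."""
    # SQL placeholders cannot stand for identifiers, so the names are
    # interpolated; they are assumed to come from a trusted schema.
    query = (
        f'SELECT "{attribute}", "{class_column}", COUNT(*) '
        f'FROM "{table}" '
        f'GROUP BY "{attribute}", "{class_column}" '
        f'ORDER BY "{attribute}", "{class_column}"'
    )
    return {(value, label): n for value, label, n in conn.execute(query)}


def single_attribute_rules(conn, table, attribute, class_column, min_confidence):
    """Derive rules 'attribute = value -> label' from the grouped counts.

    Hypothetical rule criterion: keep a rule when the share of tuples with
    that attribute value belonging to the label is at least min_confidence.
    """
    counts = count_by_group(conn, table, attribute, class_column)

    # Total number of tuples per attribute value, summed over class labels.
    totals = {}
    for (value, _label), n in counts.items():
        totals[value] = totals.get(value, 0) + n

    rules = []
    for (value, label), n in counts.items():
        confidence = n / totals[value]
        if confidence >= min_confidence:
            rules.append((attribute, value, label, n, confidence))
    return rules


# Tiny illustrative training set (made-up data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE training (outlook TEXT, windy TEXT, play TEXT)")
conn.executemany(
    "INSERT INTO training VALUES (?, ?, ?)",
    [
        ("sunny", "no", "no"),
        ("sunny", "yes", "no"),
        ("overcast", "no", "yes"),
        ("overcast", "yes", "yes"),
        ("rainy", "no", "yes"),
        ("rainy", "yes", "no"),
    ],
)

outlook_rules = single_attribute_rules(conn, "training", "outlook", "play", 1.0)
windy_rules = single_attribute_rules(conn, "training", "windy", "play", 1.0)
for rule in outlook_rules:
    print(rule)
```

In this sketch each attribute costs one aggregate query, so the work pushed to the database grows with the number of attributes and the number of tuples scanned; whether that matches the linear scalability reported in the abstract depends on details the abstract does not give.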