
Large Margin Distribution Learning with Cost Interval and Unlabeled Data.

Authors :
Zhou, Yu-Hang
Zhou, Zhi-Hua
Source :
IEEE Transactions on Knowledge & Data Engineering; 7/1/2016, Vol. 28 Issue 7, p1749-1763, 15p
Publication Year :
2016

Abstract

In many real-world applications, different types of misclassification incur different costs, but the exact cost is often hard to determine; usually one can obtain only an interval estimate, for example that one type of mistake is about 5 to 10 times more serious than the other. At the same time, abundant unlabeled data are usually available, which has motivated a great deal of research on semi-supervised learning. Cost intervals and unlabeled data often appear together in practical tasks; however, few studies tackle them jointly. In this paper, we propose the cisLDM approach, which handles cost intervals and exploits unlabeled data in a principled way. Rather than maximizing the minimum margin as traditional large margin classifiers do, cisLDM optimizes the margin distribution on both labeled and unlabeled data while simultaneously minimizing the worst-case total cost and the mean total cost according to the cost interval. Experiments on a broad range of datasets and cost settings demonstrate the strong performance of cisLDM. In particular, cisLDM reduces total cost by 47 percent more than standard SVM and by 27 percent more than cost-sensitive semi-supervised SVM, which assumes the true cost value is known in advance. [ABSTRACT FROM PUBLISHER]
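
The two quantities named in the abstract, the worst-case and the mean total cost under a cost interval, can be illustrated with a minimal sketch. This is not the cisLDM algorithm and does not come from the paper. It only evaluates a fixed set of predictions, under assumptions that are ours: labels are +1/-1, a false positive costs 1, a false negative costs an unknown amount in [c_low, c_high], and the mean is taken over a uniform distribution on that interval. The function names are hypothetical.

```python
import numpy as np


def total_cost(y_true, y_pred, c_fn, c_fp=1.0):
    """Total misclassification cost for labels in {+1, -1}.

    Assumption (not from the record): a false negative costs c_fn
    and a false positive costs c_fp.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    n_fn = np.sum((y_true == 1) & (y_pred == -1))   # missed positives
    n_fp = np.sum((y_true == -1) & (y_pred == 1))   # false alarms
    return float(c_fn * n_fn + c_fp * n_fp)


def interval_costs(y_true, y_pred, c_low, c_high):
    """Worst-case and mean total cost when c_fn lies in [c_low, c_high].

    Total cost is linear and non-decreasing in c_fn, so the worst case
    is attained at c_high. Assuming c_fn is uniform on the interval,
    the mean cost equals the cost at the midpoint.
    """
    worst = total_cost(y_true, y_pred, c_high)
    mean = total_cost(y_true, y_pred, 0.5 * (c_low + c_high))
    return worst, mean


# Illustrative data: two false negatives and one false positive,
# with a false negative 5 to 10 times as costly as a false positive.
y_true = [1, 1, 1, -1, -1, -1]
y_pred = [1, -1, -1, 1, -1, -1]
worst, mean = interval_costs(y_true, y_pred, c_low=5.0, c_high=10.0)
print(worst, mean)
```

The interval 5 to 10 mirrors the example in the abstract; the labels and predictions are made up for illustration.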

Details

Language :
English
ISSN :
1041-4347
Volume :
28
Issue :
7
Database :
Complementary Index
Journal :
IEEE Transactions on Knowledge & Data Engineering
Publication Type :
Academic Journal
Accession number :
116115932
Full Text :
https://doi.org/10.1109/TKDE.2016.2535283