Large Margin Distribution Learning with Cost Interval and Unlabeled Data.
- Source :
- IEEE Transactions on Knowledge & Data Engineering; 7/1/2016, Vol. 28 Issue 7, p1749-1763, 15p
- Publication Year :
- 2016
Abstract
- In many real-world applications, different types of misclassification usually suffer from different costs, but the accurate cost is often hard to be determined and usually one can only get an interval-estimation like that one type of mistake is about 5 to 10 times more serious than the other type. On the other hand, there are usually abundant unlabeled data available, leading to great research effort about semi-supervised learning. It is noticeable that cost interval and unlabeled data usually appear simultaneously in practice tasks; however, there is rare study tackling them together. In this paper, we propose the cisLDM approach which is able to handle cost interval and exploit unlabeled data in a principled way. Rather than maximizing the minimum margin like traditional large margin classifiers, cisLDM tries to optimize the margin distribution on both labeled and unlabeled data when minimizing the worst-case total-cost and the mean total-cost simultaneously according to the cost interval. Experiments on a broad range of datasets and cost settings exhibit the impressive performance of cisLDM. In particular, cisLDM is able to reduce 47 percent more total-cost than standard SVM and 27 percent more total-cost than cost-sensitive semi-supervised SVM which assumes the true cost value is known in advance. [ABSTRACT FROM PUBLISHER]
- Subjects :
- LEARNING
COST
CONFIDENCE intervals
ELECTRONIC data processing
BIG data
Details
- Language :
- English
- ISSN :
- 1041-4347
- Volume :
- 28
- Issue :
- 7
- Database :
- Complementary Index
- Journal :
- IEEE Transactions on Knowledge & Data Engineering
- Publication Type :
- Academic Journal
- Accession number :
- 116115932
- Full Text :
- https://doi.org/10.1109/TKDE.2016.2535283
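
The abstract refers to the worst-case total-cost and the mean total-cost under a cost interval. As a rough illustration only, the sketch below computes both quantities for a fixed set of classifier errors. It is not the cisLDM formulation from the paper: the function names, the normalization of the cheaper error to cost 1, and the use of the interval midpoint as the "mean" cost are all assumptions made here for illustration.

```python
def total_cost(false_pos, false_neg, cost_ratio):
    """Total cost when a false positive costs 1 and a false negative costs cost_ratio.

    Normalizing the cheaper error to 1 is an assumption of this sketch.
    """
    return false_pos * 1.0 + false_neg * cost_ratio


def cost_interval_summary(false_pos, false_neg, c_low, c_high):
    """Return (worst_case, mean) total cost for a cost ratio known only to lie in [c_low, c_high].

    Hypothetical helper: the worst case uses the upper end of the interval, and
    the mean uses the midpoint, i.e. it assumes the true cost is uniformly
    distributed over the interval.
    """
    if not 0 < c_low <= c_high:
        raise ValueError("expected 0 < c_low <= c_high")
    worst_case = total_cost(false_pos, false_neg, c_high)
    mean = total_cost(false_pos, false_neg, (c_low + c_high) / 2.0)
    return worst_case, mean


# Example from the abstract's wording: one mistake type is "about 5 to 10 times"
# more serious than the other. The error counts are made up for illustration.
worst_case, mean = cost_interval_summary(false_pos=8, false_neg=3, c_low=5.0, c_high=10.0)
print(worst_case, mean)
```

With these made-up counts, the expensive error type dominates both figures, which is why a method that only knows the interval has to control both quantities rather than a single fixed-cost total.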