Shenzhen University
College of Computer Science and Software Engineering, SZU

Incorporating Diversity and Informativeness in Multiple-Instance Active Learning

IEEE Transactions on Fuzzy Systems

 

Ran Wang1    Xizhao Wang1    Sam Kwong2    Chen Xu1

1Shenzhen University    2City University of Hong Kong

 

Abstract

Multiple-instance active learning (MIAL) is a paradigm to collect sufficient training bags for a multiple-instance learning (MIL) problem, by selecting and querying the most valuable unlabeled bags iteratively. Existing works on MIAL evaluate an unlabeled bag by its informativeness with regard to the current classifier, but neglect the internal distribution of its instances, which can reflect the diversity of the bag. In this paper, two diversity criteria, i.e., clustering-based diversity and fuzzy rough set based diversity, are proposed for MIAL by utilizing a support vector machine (SVM) based MIL classifier. In the first criterion, a kernel k-means clustering algorithm is used to explore the hidden structure of the instances in the feature space of the SVM, and the diversity degree of an unlabeled bag is measured by the number of unique clusters covered by the bag. In the second criterion, the lower approximations in fuzzy rough sets are used to define a new concept named dissimilarity degree, which depicts the uniqueness of an instance so as to measure the diversity degree of a bag. By incorporating the proposed diversity criteria with existing informativeness measurements, new MIAL algorithms are developed, which can select bags with both high informativeness and diversity. Experimental comparisons demonstrate the feasibility and effectiveness of the proposed methods.
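The clustering-based criterion described above can be illustrated with a minimal sketch. It assumes cluster labels for the instances have already been obtained (for example, from a kernel k-means step) and that an informativeness score per bag is available from the classifier. The function names, the `weight` parameter, and the linear combination of the two scores are illustrative assumptions, not the combination rule used in the paper.

```python
from typing import Dict, List, Sequence


def cluster_diversity(cluster_labels: Sequence[int]) -> int:
    """Diversity of one bag: the number of distinct clusters its instances cover."""
    return len(set(cluster_labels))


def select_bag(
    bag_clusters: Dict[str, List[int]],
    informativeness: Dict[str, float],
    n_clusters: int,
    weight: float = 0.5,
) -> str:
    """Pick the unlabeled bag with the best combined score.

    Assumption: the two criteria are mixed linearly with a hypothetical
    `weight`; the paper's actual combination may differ.
    """
    def score(bag_id: str) -> float:
        # Normalize the cluster count to [0, 1] so both terms are comparable.
        diversity = cluster_diversity(bag_clusters[bag_id]) / n_clusters
        return weight * informativeness[bag_id] + (1.0 - weight) * diversity

    return max(bag_clusters, key=score)


# Toy data: cluster label of every instance in three unlabeled bags.
bags = {
    "bag_a": [0, 0, 0, 0],  # all instances fall in one cluster
    "bag_b": [0, 1, 2, 1],  # instances cover all three clusters
    "bag_c": [1, 1, 2, 2],  # instances cover two clusters
}
# Hypothetical informativeness scores in [0, 1].
info = {"bag_a": 0.9, "bag_b": 0.7, "bag_c": 0.6}

chosen = select_bag(bags, info, n_clusters=3)
print(chosen)
```

With these toy values, `bag_a` is the most informative bag but covers a single cluster, so the combined score favors `bag_b`; setting `weight=1.0` reduces the selection to informativeness alone.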

Fig. 1. Instances and bags in MIL.

 

Fig. 2. Investigation on unlabeled bags for an SVM classifier.

 

Fig. 3. Illustrative examples for (a) k-means clustering and (b) kernel k-means clustering.

 

Fig. 4. Computing the dissimilarity degree of instances in a bag. (a) Bag 1, (b) Bag 2, and (c) Bag 3.

Fig. 5. Training samples in MNIST dataset.

 

Fig. 6. Performance comparison of different learning strategies on MNIST MIL datasets. (Base-learner: mi-SVM). (a) Digit “0” (50 trials), (b) digit “1” (50 trials), (c) digit “2” (50 trials), (d) digit “3” (50 trials), (e) digit “4” (50 trials), (f) digit “5” (50 trials), (g) digit “6” (50 trials), (h) digit “7” (50 trials), (i) digit “8” (50 trials), (j) digit “9” (50 trials), (k) average result for ten digits, and (l) legend.

 

Fig. 7. Sensitivity analysis of parameter α for (a) and (b) SoftMax and (c) and (d) CombinU.

 

Fig. 9. Performance comparison of different learning strategies on CorelMIL datasets. (Base-learner: mi-SVM). (a) Elephant (50 trials). (b) Fox (50 trials). (c) Tiger (50 trials).

 

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 61672122, Grant 61602077, Grant 61772344 and Grant 61732011, in part by the Public Welfare Funds for Scientific Research of Liaoning Province of China under Grant 20170005, in part by the Natural Science Foundation of Liaoning Province of China under Grant 20170540097, and in part by the Fundamental Research Funds for the Central Universities under Grant 3132016348.

 

Bibtex

@ARTICLE{7953641,
  author={Wang, Ran and Wang, Xi-Zhao and Kwong, Sam and Xu, Chen},
  journal={IEEE Transactions on Fuzzy Systems},
  title={Incorporating Diversity and Informativeness in Multiple-Instance Active Learning},
  year={2017},
  volume={25},
  number={6},
  pages={1460-1475},
  doi={10.1109/TFUZZ.2017.2717803}
}

Downloads
