
Original Article

Int. J. Fuzzy Log. Intell. Syst. 2017; 17(1): 10-16

Published online March 31, 2017

https://doi.org/10.5391/IJFIS.2017.17.1.10

© The Korean Institute of Intelligent Systems

Simultaneous Kernel Learning and Label Imputation for Pattern Classification with Partially Labeled Data

Minyoung Kim

Department of Electronics & IT Media Engineering, Seoul National University of Science & Technology, Seoul, Korea

Correspondence to :
Minyoung Kim (mikim@seoultech.ac.kr)

Received: November 23, 2016; Revised: December 12, 2016; Accepted: December 14, 2016

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

The kernel function plays a central role in modern pattern classification for its ability to capture the inherent affinity structure of the underlying data manifold. While the kernel function can be chosen by human experts with domain knowledge, it is often more principled and promising to learn it directly from data. This idea of kernel learning has been studied considerably in machine learning and pattern recognition. However, most kernel learning algorithms assume fully supervised setups requiring expensive class label annotation for the training data. In this paper we consider kernel learning in the semi-supervised setup where only a fraction of data points need to be labeled. We propose two approaches: the first extends the idea of label propagation along the data similarity graph, in which we simultaneously learn the kernel and impute the labels of the unlabeled data. The second aims to minimize the dual loss in support vector machine (SVM) classifier learning with respect to the kernel parameters and the missing labels. We provide reasonable and effective approximate solution methods for these optimization problems. Both approaches exploit labeled and unlabeled data in kernel learning, and we empirically demonstrate their effectiveness on several benchmark datasets with partially labeled learning setups.

Keywords: Kernel learning, Semi-supervised learning, Pattern classification, Optimization
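The first approach builds on label propagation along the data similarity graph. As a minimal illustrative sketch (not the paper's exact algorithm, which additionally optimizes the kernel parameters jointly with the imputed labels), the following shows plain label propagation over a fixed RBF kernel similarity graph; the function names and the gamma parameter are assumptions for illustration only.

```python
import numpy as np

def rbf_kernel(X, gamma):
    # Pairwise RBF similarities: K[i, j] = exp(-gamma * ||x_i - x_j||^2).
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def label_propagation(X, y, labeled_mask, gamma=1.0, n_iter=100):
    """Impute missing labels by propagating the known labels along the
    kernel similarity graph. y holds class indices for labeled points
    (entries where labeled_mask is False are ignored)."""
    n = X.shape[0]
    classes = np.unique(y[labeled_mask])
    K = rbf_kernel(X, gamma)
    np.fill_diagonal(K, 0.0)                  # no self-loops in the graph
    P = K / K.sum(axis=1, keepdims=True)      # row-normalized transition matrix
    one_hot = (y[labeled_mask, None] == classes[None, :]).astype(float)
    F = np.zeros((n, len(classes)))
    F[labeled_mask, :] = one_hot
    for _ in range(n_iter):
        F = P @ F                             # propagate label scores to neighbors
        F[labeled_mask, :] = one_hot          # clamp the observed labels
    return classes[np.argmax(F, axis=1)]
```

In the full method, the propagation step above would alternate with updates to the kernel parameters (e.g., gamma), so that the learned similarity graph and the imputed labels reinforce each other.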

No potential conflict of interest relevant to this article was reported.

Minyoung Kim received his BS and MS degrees, both in Computer Science and Engineering, from Seoul National University, South Korea. He earned a PhD degree in Computer Science from Rutgers University in 2008. From 2009 to 2010 he was a postdoctoral researcher at the Robotics Institute of Carnegie Mellon University. He is currently an associate professor in the Department of Electronics and IT Media Engineering at Seoul National University of Science and Technology in Korea. His primary research interests are machine learning and computer vision, including graphical models, motion estimation/tracking, discriminative models/learning, kernel methods, and dimensionality reduction.

E-mail: mikim@seoultech.ac.kr


Table 1. Statistics of the UCI datasets.

Dataset   Number of data points   Input dimension
Sonar     208                     60
Vote      320                     16
Wpbc      180                     33
Liver     345                     6

Table 2. Test errors (%) on the UCI datasets with three different labeled training set proportions.

Dataset   Method      10%            30%            50%
Sonar     KTA         48.43 ± 4.42   43.95 ± 3.23   36.67 ± 3.94
          SLM         47.14 ± 3.83   43.10 ± 3.45   38.81 ± 3.82
          SSKL-LP     43.05 ± 3.79   40.24 ± 3.80   33.10 ± 3.05
          SSKL-SDM    42.14 ± 4.22   40.00 ± 3.81   30.48 ± 3.73

Vote      KTA         25.86 ± 3.99   22.66 ± 3.50   13.57 ± 4.55
          SLM         23.66 ± 3.21   20.76 ± 4.62   13.74 ± 3.46
          SSKL-LP     19.86 ± 4.62   16.57 ± 3.55   11.89 ± 2.78
          SSKL-SDM    18.29 ± 3.90   15.29 ± 3.69   10.97 ± 3.53

Wpbc      KTA         35.43 ± 4.60   31.87 ± 3.97   29.05 ± 3.13
          SLM         34.39 ± 3.94   31.45 ± 4.60   29.87 ± 4.79
          SSKL-LP     31.82 ± 3.42   28.43 ± 3.57   26.06 ± 3.65
          SSKL-SDM    31.39 ± 3.65   27.56 ± 3.13   25.82 ± 3.42

Liver     KTA         48.84 ± 4.38   46.80 ± 3.32   40.87 ± 3.59
          SLM         47.25 ± 3.67   45.30 ± 3.90   40.75 ± 3.73
          SSKL-LP     42.03 ± 3.72   40.29 ± 4.72   38.87 ± 3.92
          SSKL-SDM    42.88 ± 3.32   41.30 ± 3.67   38.72 ± 3.59
