Original Article

International Journal of Fuzzy Logic and Intelligent Systems 2020; 20(4): 346-357

Published online December 25, 2020

https://doi.org/10.5391/IJFIS.2020.20.4.346

© The Korean Institute of Intelligent Systems

Cluster Size-Constrained Fuzzy C-Means with Density Center Searching

Jiarui Li¹, Yukio Horiguchi², and Tetsuo Sawaragi¹

¹ Department of Mechanical Engineering and Science, Graduate School of Engineering, Kyoto University, Kyoto, Japan
² Faculty of Informatics, Kansai University, Osaka, Japan

Correspondence to: Jiarui Li (ljr10225008@gmail.com)

Received: June 18, 2020; Revised: December 7, 2020; Accepted: December 15, 2020

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Fuzzy C-means (FCM) has a clear limitation when partitioning a dataset into clusters of varying sizes and densities, because it ignores differences in scale across the dimensions of the input data objects. To alleviate this cluster size insensitivity, we propose CSCD-FCM, a wrapper algorithm for FCM that introduces cluster size as a priori information and limits the search direction on the basis of density benchmarks. The method consists of two stages. The first stage adjusts the position of each cluster while maintaining its shape, and the second stage changes the shape of each cluster while maintaining its center. Both stages modify the fuzzy partitions generated by FCM-like soft clustering methods by optimizing a “size-constrained” objective function. Numerical and practical experiments with unbalanced cluster size settings demonstrate that the method effectively extracts the actual cluster structures and achieves the desired cluster populations.

Keywords: Fuzzy C-means, Clustering, Cluster size insensitivity
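
To make the cluster size insensitivity described in the abstract concrete, the following is a minimal sketch of standard FCM (not CSCD-FCM itself) on one-dimensional data. The function name `fcm_1d`, the evenly spaced initialization, the toy dataset, and the definition of fuzzy cluster size as the sum of memberships are all assumptions made for illustration; none of them is taken from the article.

```python
def fcm_1d(points, c=2, m=2.0, iters=100):
    """Standard fuzzy C-means on 1-D data (plain FCM, not CSCD-FCM)."""
    lo, hi = min(points), max(points)
    # Illustrative initialization: centers spread evenly over the data range.
    centers = [lo + (hi - lo) * i / (c - 1) for i in range(c)]
    n = len(points)
    for _ in range(iters):
        # Membership update: u[i][k] is the degree to which point k
        # belongs to cluster i; memberships of one point sum to 1.
        u = [[0.0] * n for _ in range(c)]
        for k, x in enumerate(points):
            d = [abs(x - v) for v in centers]
            if min(d) == 0.0:
                u[d.index(0.0)][k] = 1.0  # point sits on a center: crisp membership
                continue
            for i in range(c):
                u[i][k] = 1.0 / sum(
                    (d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c)
                )
        # Center update: mean of the points weighted by u ** m.
        centers = [
            sum(u[i][k] ** m * points[k] for k in range(n))
            / sum(u[i][k] ** m for k in range(n))
            for i in range(c)
        ]
    return centers, u


# Hypothetical unbalanced data: 200 spread-out points and 10 tightly packed ones.
big = [10.0 * i / 199 for i in range(200)]
small = [12.0 + 0.05 * i for i in range(10)]
data = big + small

centers, u = fcm_1d(data)
# Assumed definition: fuzzy cluster size = sum of memberships in that cluster.
fuzzy_sizes = [sum(row) for row in u]
print("centers:", [round(v, 2) for v in centers])
print("fuzzy sizes:", [round(s, 1) for s in fuzzy_sizes])
```

On data like this, one would expect plain FCM to report cluster sizes far from the true 200 and 10, which is the kind of mismatch a size-constrained wrapper is meant to correct.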

No potential conflict of interest relevant to this article was reported.

Jiarui Li received her B.S. and M.S. degrees from Beijing Jiaotong University, China, in 2014 and 2017, respectively. She is currently a Ph.D. student at Kyoto University. Her research interests include fuzzy logic, human factors, and human-machine interference.

E-mail: ljr10225008@gmail.com


Yukio Horiguchi received his B.S., M.S., and Ph.D. degrees from Kyoto University, Japan. He is currently a Professor at the Faculty of Informatics, Kansai University. His research interests include human factors, human-machine interference, and interface design. His personal homepage is http://www.syn.me.kyoto-u.ac.jp/horiguchi/.


Tetsuo Sawaragi received his B.S., M.S., and Ph.D. degrees from Kyoto University, Japan. He is currently a Professor at the Department of Mechanical Engineering, Kyoto University. His research interests include systems engineering, human-machine systems, human-machine interference, and cognitive engineering. His personal homepage is http://www.design.kyoto-u.ac.jp/faculty/t-sawaragi.html.



Figure 1. Overall flow of the algorithm.

Figure 2. Data distribution of the input dataset.

Figure 3. FCM results and peak and floor of cluster 2.

Figure 4. Density center of cluster 2.

Figure 5. Cluster position adjustment of cluster 2.

Figure 6. Cluster shape adjustment of cluster 2.

Figure 7. Final results.

Figure 8. Numerical dataset with two clusters and different data distributions.

Figure 9. Clustering results using the four algorithms: (a) FCM, (b) SIIB-FCM, (c) KL-FCM, and (d) CSCD-FCM.

Figure 10. Comparison of the accuracy of the four algorithms with various distances.

Figure 11. Comparison of the F1_score of the four algorithms with various distances.

Figure 12. Changes in indices as cluster size decreases.

Figure 13. Changes in indices as cluster size increases.

Table 1. Aims and contents of the four experiments.

| Experiment | Contents |
| Experiment 1 | Data structure extraction test |
| Experiment 2 | Distance tolerance test |
| Experiment 3 | Robustness test |
| Experiment 4 | Practical example of a healthcare problem |

Table 2. Cluster size results for two clusters with different distributions.

| Method | Cluster 1 size | Cluster 1 difference | Cluster 2 size | Cluster 2 difference |
| FCM | 1177±26 | 823±26 | 923±26 | −823±26 |
| SIIB-FCM | 807±105 | 1193±105 | 1293±105 | −1193±105 |
| KL-FCM | 1582±13 | 418±13 | 518±13 | −418±13 |
| CSCD-FCM | 2000±1 | 0±1 | 100±1 | 0±1 |
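
The difference columns of Table 2 can be checked with simple arithmetic. The sketch below assumes that the target sizes are 2000 and 100 (inferred from the CSCD-FCM row, whose differences are zero) and that each difference is the target size minus the obtained size; neither assumption is stated explicitly in this excerpt. Only the mean values are used, and the ± spreads are omitted.

```python
# Mean cluster sizes (cluster 1, cluster 2) transcribed from Table 2.
sizes = {
    "FCM": (1177, 923),
    "SIIB-FCM": (807, 1293),
    "KL-FCM": (1582, 518),
    "CSCD-FCM": (2000, 100),
}

# Mean differences as reported in Table 2.
reported = {
    "FCM": (823, -823),
    "SIIB-FCM": (1193, -1193),
    "KL-FCM": (418, -418),
    "CSCD-FCM": (0, 0),
}

# Assumption: target sizes inferred from the zero-difference CSCD-FCM row.
target = (2000, 100)

# Assumption: difference = target size - obtained size.
computed = {
    method: tuple(t - s for t, s in zip(target, obtained))
    for method, obtained in sizes.items()
}

for method in sizes:
    print(method, computed[method], computed[method] == reported[method])
```

Under these assumptions every computed difference matches the reported one, and each method's two cluster sizes add up to the same total of 2100 data points.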

Table 3. Evaluation indices for two clusters with different distributions.

| Method | Accuracy | F1_score | DI | XB |
| FCM | 0.6080±0.01 | 0.6528±0.0033 | 0.0043±0.0016 | 0.1532±0.0032 |
| SIIB-FCM | 0.4321±0.05 | 0.6095±0.0119 | 0.0027±0.0013 | 0.0428±0.0031 |
| KL-FCM | 0.7171±0.13 | 0.6922±0.035 | 0.0016±0.0001 | 72.3254±58.074 |
| CSCD-FCM | 0.9998±0.0003 | 0.9991±0.0015 | 0.303±0.2565 | 0.1868±0.0086 |

Table 4. Evaluation indices for the practical example.

| Method | Accuracy | F1_score | Sensitivity | Cp size |
| FCM | 0.3885 | 0.3531 | 0.4608 | 2190 (−540) |
| SIIB-FCM | 0.5785 | 0.5517 | 0.6150 | 2285 (−445) |
| KL-FCM | 0.4861 | 0.5523 | 0.3795 | 1362 (−1368) |
| CSCD-FCM | 0.7202 | 0.6552 | 0.8359 | 2934 (+204) |
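
For readers unfamiliar with the indices in Table 4, the sketch below shows the usual binary definitions of accuracy, sensitivity, and F1 score computed from confusion counts. The counts passed to `binary_metrics` are hypothetical and are not taken from the article, and the article may use a different averaging scheme. The second part assumes that the parenthesized value in the "Cp size" column is the obtained size minus the desired size, and checks that all four rows then imply the same desired size. Note that this sign convention appears to be the opposite of the one in the difference columns of Table 2.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary-classification indices from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)  # also called recall
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "sensitivity": sensitivity, "f1": f1}


# Hypothetical confusion counts, for illustration only.
example = binary_metrics(tp=80, fp=20, fn=20, tn=880)
print(example)

# "Cp size" column of Table 4: (obtained size, parenthesized deviation).
cp_sizes = {
    "FCM": (2190, -540),
    "SIIB-FCM": (2285, -445),
    "KL-FCM": (1362, -1368),
    "CSCD-FCM": (2934, 204),
}

# Assumption: deviation = obtained size - desired size.
implied_targets = {
    method: size - deviation for method, (size, deviation) in cp_sizes.items()
}
print(implied_targets)
```

Under that assumption every row implies a desired size of 2730, which suggests the transcription of the column is internally consistent.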
