Int. J. Fuzzy Log. Intell. Syst. 2018; 18(2): 154-160
Published online June 25, 2018
https://doi.org/10.5391/IJFIS.2018.18.2.154
© The Korean Institute of Intelligent Systems
Yagya Raj Pandeya and Joonwhoan Lee
Department of Computer Science and Engineering, Chonbuk National University, Jeonju, Korea
Correspondence to: Joonwhoan Lee (chlee@jbnu.ac.kr)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
The domestic cat (Felis catus) is an ancient human companion animal that can alert humans to environmental changes through the varied sounds it produces. Cat sound classification with deep neural networks has suffered from a scarcity of labeled data, which impelled us to build the CatSound dataset, spanning 10 categories of sound. The dataset was still too small for a data-driven, end-to-end learning approach, so we chose transfer learning for feature extraction. The extracted features are input to six different classifiers, and ensemble techniques are applied to the predicted probabilities of all the classifiers. Both the ensemble and data augmentation improved performance in this research. Finally, the results are evaluated using confusion matrices and receiver operating characteristic curves.
Keywords: Labeled dataset, Transfer learning, Ensemble method, Data augmentation
No potential conflict of interest relevant to this article was reported.
E-mail: yagyapandeya@gmail.com
E-mail: chlee@jbnu.ac.kr
CatSound dataset class representation.
Overview of feature extraction from the CNN. From each layer of the CNN, the globally averaged 32-dimensional features are concatenated into one feature vector and fed into the various classifiers. The predicted probabilities of the classifiers are ensembled for the final prediction result.
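The pipeline described in this caption can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the number of layers (five), the feature-map sizes, the function names, and the choice of simple probability averaging (soft voting) as the ensemble rule are all assumptions; the source states only that each layer yields a globally averaged 32-dimensional feature and that predicted probabilities from six classifiers are ensembled.

```python
import numpy as np

def global_average_pool(feature_map):
    """Average a (time, freq, channels) feature map over its two spatial axes."""
    return feature_map.mean(axis=(0, 1))

def extract_features(layer_outputs):
    """Concatenate the pooled vector of every layer into one feature vector."""
    return np.concatenate([global_average_pool(fm) for fm in layer_outputs])

def soft_vote(probabilities):
    """Average per-classifier class probabilities, then pick the top class.

    probabilities has shape (n_classifiers, n_samples, n_classes).
    Plain averaging is an assumed ensemble rule, used here for illustration.
    """
    mean_proba = np.mean(probabilities, axis=0)
    return mean_proba.argmax(axis=1), mean_proba

rng = np.random.default_rng(0)

# Hypothetical activations from 5 convolutional layers, 32 channels each.
sizes = [(96, 64), (48, 32), (24, 16), (12, 8), (6, 4)]
layer_outputs = [rng.random((t, f, 32)) for t, f in sizes]
features = extract_features(layer_outputs)  # one 5 * 32 = 160-dim vector

# Hypothetical predicted probabilities: 6 classifiers, 4 clips, 10 classes.
probas = rng.dirichlet(np.ones(10), size=(6, 4))
labels, mean_proba = soft_vote(probas)
```

In practice the random arrays would be replaced by activations from the pretrained network and by the probability outputs of the fitted classifiers.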
ROC curve of the best performing classifiers with 3
Comparison of the average accuracy, F1-score, and area under the curve of our classifiers (six classifiers and one ensemble) on the original and augmented datasets.
Confusion matrix of the best performing ensemble classifier with 3
Table 1. Best performance of classifier in 3
Classifiers | Accuracy (%) | F1-Score | AUC score |
---|---|---|---|
RF | 78.99 | 0.79 | 0.978 |
KNN | 79.07 | 0.79 | 0.884 |
Extra Trees | 77.30 | 0.77 | 0.977 |
LDA | 73.67 | 0.74 | 0.967 |
QDA | 80.76 | 0.81 | 0.974 |
SVM | 78.57 | 0.78 | 0.978 |
Ensemble | 87.76 | 0.88 | 0.990 |
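For readers who want to reproduce the accuracy and F1 columns on their own predictions, the sketch below derives both from a confusion matrix. It is an illustration under stated assumptions: the source does not say how F1 is averaged across the 10 classes, so macro averaging is assumed here, and the function names and the three-class toy labels are hypothetical.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Count samples per (true class, predicted class) pair."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

def accuracy_and_macro_f1(cm):
    """Return overall accuracy and the unweighted mean of per-class F1."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)  # column sums: samples predicted as each class
    actual = cm.sum(axis=1)     # row sums: samples truly in each class
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return tp.sum() / cm.sum(), f1.mean()

# Toy labels for three classes (hypothetical, not from the CatSound dataset).
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
cm = confusion_matrix(y_true, y_pred, n_classes=3)
acc, macro_f1 = accuracy_and_macro_f1(cm)
```

Classes that never occur or are never predicted get a precision or recall of zero here rather than raising a division warning, which is one common convention.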