Original Article

Int. J. Fuzzy Log. Intell. Syst. 2018; 18(2): 154-160

Published online June 25, 2018

https://doi.org/10.5391/IJFIS.2018.18.2.154

© The Korean Institute of Intelligent Systems

Domestic Cat Sound Classification Using Transfer Learning

Yagya Raj Pandeya and Joonwhoan Lee

Department of Computer Science and Engineering, Chonbuk National University, Jeonju, Korea

Correspondence to: Joonwhoan Lee (chlee@jbnu.ac.kr)

Received: May 29, 2018; Revised: June 16, 2018; Accepted: June 21, 2018

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

The domestic cat (Felis catus) is one of the oldest human companion animals, and it can alert people to changes in its environment through a wide variety of vocalizations. Cat sound classification with deep neural networks has been hampered by a scarcity of labeled data, which led us to build the CatSound dataset, covering 10 categories of sound. Even this dataset was too small for a data-driven, end-to-end learning approach, so we used transfer learning for feature extraction. The extracted features are fed to six different classifiers, and ensemble techniques are applied to the predicted probabilities of all classifiers. The ensemble and data augmentation both improved performance in this study. Finally, the results are evaluated using confusion matrices and receiver operating characteristic (ROC) curves.

Keywords: Labeled dataset, Transfer learning, Ensemble method, Data augmentation

No potential conflict of interest relevant to this article was reported.

Yagya Raj Pandeya was born in Dadeldhura, Nepal, in 1988. He received the B.E. and M.E. degrees in Computer Engineering from Pokhara University, Nepal, in 2010 and 2013, respectively. He was Head of the Department of Computer Engineering at NAST College in Dhangadhi, Nepal. He joined the Ministry of Home Affairs, Nepal, in 2015 and worked there until 2017. He is currently a Ph.D. fellow at the Fuzzy Logic and Artificial Intelligence Laboratory at Chonbuk National University, Korea.

E-mail: yagyapandeya@gmail.com

Joonwhoan Lee received his BS degree in Electronic Engineering from Hanyang University, Korea, in 1980. He received his MS degree in Electrical and Electronics Engineering from KAIST, Korea, in 1982, and his Ph.D. degree in Electrical and Computer Engineering from the University of Missouri, USA, in 1990. He is currently a Professor in the Department of Computer Engineering, Chonbuk National University, Korea. His research interests include image and audio processing, computer vision, and emotion engineering.

E-mail: chlee@jbnu.ac.kr

Figure 1. CatSound dataset class representation.

The International Journal of Fuzzy Logic and Intelligent Systems 2018; 18: 154-160. https://doi.org/10.5391/IJFIS.2018.18.2.154

Figure 2. Overview of feature extraction from the CNN. The globally averaged 32-dimensional features from each CNN layer are concatenated into one feature vector and fed into the various classifiers. The predicted probabilities of the classifiers are then ensembled to produce the final prediction.
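The pipeline in Figure 2 can be illustrated with a minimal, pure-Python sketch. It is not the authors' implementation: the function names are hypothetical, the toy feature maps use 2 channels per layer instead of 32, and averaging the classifier probabilities (soft voting) is an assumption, since the page does not state how the probabilities are combined.

```python
from statistics import fmean


def global_average_pool(feature_map):
    """Reduce one layer's feature map to one mean value per channel.

    feature_map is a list of channels, each a 2D list (rows x columns).
    """
    return [fmean(v for row in channel for v in row) for channel in feature_map]


def concat_layer_features(layer_maps):
    """Pool every layer and concatenate the results into one feature vector."""
    features = []
    for feature_map in layer_maps:
        features.extend(global_average_pool(feature_map))
    return features


def soft_vote(prob_vectors):
    """Average per-class probabilities across classifiers (assumed rule).

    Returns the index of the winning class and the averaged probabilities.
    """
    n_classifiers = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    averaged = [
        sum(p[k] for p in prob_vectors) / n_classifiers for k in range(n_classes)
    ]
    predicted = max(range(n_classes), key=averaged.__getitem__)
    return predicted, averaged


# Toy example: two layers with two channels each (the paper uses 32 per layer).
layer1 = [[[1, 2], [3, 4]], [[0, 0], [0, 0]]]
layer2 = [[[2]], [[4]]]
feature_vector = concat_layer_features([layer1, layer2])

# Hypothetical probability outputs of three classifiers for one sample.
predicted_class, averaged_probs = soft_vote(
    [[0.6, 0.4, 0.0], [0.2, 0.5, 0.3], [0.1, 0.7, 0.2]]
)
```

With 32 channels per layer, the concatenated vector would have 32 entries per layer used, and each of the six classifiers would contribute one probability vector over the 10 sound classes.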


Figure 3. ROC curve of the best performing classifiers with 3x Aug datasets.


Figure 4. Comparison of the average accuracy, F1-score, and area under the curve (AUC) of our classifiers (six classifiers and one ensemble) on the original and augmented datasets.


Figure 5. Confusion matrix of the best performing ensemble classifier with 3x Aug dataset.


Table 1. Best performance of each classifier on the 3x Aug dataset using three metrics: accuracy, F1-score, and AUC score.

Classifier     Accuracy (%)   F1-score   AUC score
RF             78.99          0.79       0.978
KNN            79.07          0.79       0.884
Extra Trees    77.30          0.77       0.977
LDA            73.67          0.74       0.967
QDA            80.76          0.81       0.974
SVM            78.57          0.78       0.978
Ensemble       87.76          0.88       0.990
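As a rough illustration of how the accuracy and F1-score columns relate to a confusion matrix such as the one in Figure 5, the sketch below computes both from a small matrix. The function name and the 2-class counts are made up for the example, and macro averaging of the per-class F1 is an assumption; the page does not say which averaging the authors used.

```python
def accuracy_and_macro_f1(cm):
    """Compute accuracy and macro-averaged F1 from a confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))

    f1_scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted otherwise
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)

    return correct / total, sum(f1_scores) / n


# Made-up 2-class confusion matrix, for illustration only.
toy_cm = [[8, 2], [1, 9]]
toy_accuracy, toy_macro_f1 = accuracy_and_macro_f1(toy_cm)
```

The same function applies unchanged to a 10 x 10 matrix for the 10 CatSound classes.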
