International Journal of Fuzzy Logic and Intelligent Systems 2022; 22(1): 1-10
Published online March 25, 2022
https://doi.org/10.5391/IJFIS.2022.22.1.1
© The Korean Institute of Intelligent Systems
Tosin Akinwale Adesuyi1, Byeong Man Kim1, and Jongwan Kim2
1Department of Computer and Software Engineering, Kumoh National Institute of Technology, Gumi, Korea
2Division of Computer and Information Engineering, Daegu University, Gyeongsan, Korea
Correspondence to: Byeong Man Kim (bmkim@kumoh.ac.kr)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Sound is an essential element of human relationships and communication. The sound recognition process involves three phases: signal preprocessing, feature extraction, and classification. This paper describes research on the classification of snoring data, which is used to assess sleep health in humans. However, current deep learning approaches to sound classification do not yield desirable models because some of the salient features required to sufficiently discriminate sounds, and thereby improve classification accuracy, are poorly captured during training. In this study, we propose a new convolutional neural network (CNN) model for sound classification based on multi-feature extraction. The extracted features were combined to form a new dataset that served as the input to the CNN. Experiments were conducted on snoring and non-snoring datasets. The proposed model achieved an accuracy of 99.7% on snoring sounds, demonstrating almost perfect classification and superior results compared with existing methods.
Keywords: Sound recognition, Snoring sound, CNN, Multi-feature extraction
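The abstract describes multi-feature extraction feeding a CNN. The sketch below illustrates, under our own assumptions, how the five feature families reported later in Table 1 (MFCC, ZCR, STFT, spectral descriptors, and RMS) could be extracted with librosa and pooled into one fixed-length vector per clip. The librosa function names are real; the mean-pooling strategy, sampling rate, and dimensions are illustrative and not necessarily the authors' exact pipeline.

```python
# Hypothetical multi-feature extraction for one audio clip (librosa).
# Per-frame features are mean-pooled over time to give a fixed-length vector;
# the paper's actual aggregation may differ.
import numpy as np
import librosa

def extract_features(path, sr=16000, n_mfcc=13):
    y, sr = librosa.load(path, sr=sr)

    mfcc      = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # (n_mfcc, frames)
    zcr       = librosa.feature.zero_crossing_rate(y)             # (1, frames)
    stft_mag  = np.abs(librosa.stft(y))                           # (freq_bins, frames)
    centroid  = librosa.feature.spectral_centroid(y=y, sr=sr)     # (1, frames)
    bandwidth = librosa.feature.spectral_bandwidth(y=y, sr=sr)    # (1, frames)
    rolloff   = librosa.feature.spectral_rolloff(y=y, sr=sr)      # (1, frames)
    rms       = librosa.feature.rms(y=y)                          # (1, frames)

    # Mean-pool each feature over time and concatenate
    # (MFCC+ZCR+STFT+Spectral+RMS, the best combination in Table 1).
    parts = [mfcc, zcr, stft_mag, centroid, bandwidth, rolloff, rms]
    return np.concatenate([p.mean(axis=1) for p in parts])

# Example: build a dataset matrix from labeled snoring / non-snoring clips.
# X = np.stack([extract_features(f) for f in wav_files]); y = labels
```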
No potential conflict of interest relevant to this article was reported.
E-mail: atadesuyi@kumoh.ac.kr
E-mail: bmkim@kumoh.ac.kr
E-mail: jwkim@daegu.ac.kr
Figure captions:
A generic architecture for sound classification.
Overall proposed system architecture for 1D-CNN.
Acquisition sequence of snoring data.
A graphical view of MFCC feature from a snore signal.
Conceptual diagrams of (a) 1D-CNN model and (b) 2D-CNN model.
A 2D-CNN architecture using spectrogram images.
Classification accuracy for snoring and non-snoring using multi-feature techniques and spectrogram image.
A 1D-CNN architecture for snoring sound classification.
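The captions above name the 1D-CNN and 2D-CNN architectures used in the paper. The following is a minimal, hypothetical Keras sketch of a 1D-CNN that takes a pooled multi-feature vector as input; the layer counts, filter sizes, and other hyperparameters are placeholders, not the configuration reported by the authors.

```python
# Minimal illustrative 1D-CNN for binary snoring / non-snoring classification.
# Layer sizes and hyperparameters are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_1d_cnn(input_len):
    model = models.Sequential([
        layers.Input(shape=(input_len, 1)),        # pooled feature vector as a 1D input
        layers.Conv1D(32, kernel_size=3, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(64, kernel_size=3, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(1, activation="sigmoid"),     # snoring vs. non-snoring
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```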
Table 1. Variation in accuracy of features using the proposed 1D-CNN on the snoring dataset.

| Feature | Accuracy (%) |
|---|---|
| Spectral (Centroid+Bandwidth+RollOff) | 50.18 |
| STFT+RMS+Spectral+ZCR | 60.79 |
| STFT+RMS+Spectral | 89.46 |
| MFCC | 99.00 |
| MFCC+ZCR | 99.00 |
| Spectral+ZCR+MFCC | 99.20 |
| MFCC+ZCR+STFT+Spectral+RMS | 99.70 |
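Table 1 compares feature combinations by classification accuracy. The snippet below shows, purely as an illustration, how a feature matrix built from one such combination could be evaluated with the 1D-CNN sketch above; the random placeholder data, split ratio, and training settings are assumptions made only so the example runs end to end.

```python
# Illustrative evaluation of one feature combination with the 1D-CNN sketch above.
# X and y would normally come from extract_features() applied to labeled clips;
# random placeholders are used here so the snippet is self-contained.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 1043))     # placeholder pooled feature vectors
y = rng.integers(0, 2, size=500)     # placeholder snoring / non-snoring labels

X = X[..., np.newaxis]               # add a channel axis for Conv1D
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

model = build_1d_cnn(input_len=X.shape[1])
model.fit(X_tr, y_tr, epochs=30, batch_size=32, validation_split=0.1)

loss, acc = model.evaluate(X_te, y_te)
print(f"Test accuracy: {acc:.3f}")   # compare with the accuracies in Table 1
```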
Table 2. Classification results in existing studies on snoring data.

| Study | Feature extraction technique | Classifier | Data size (subjects) | Data size (training) | Test accuracy (%) |
|---|---|---|---|---|---|
| Demir et al. [20] | LBP+HOG | SVM | - | 828 | 72.00 |
| Lim et al. [11] | ZCR+STFT+MFCC | RNN | 8 | 5600 | 98.80 |
| Kang et al. [6] | MFCC | CNN+LSTM | 24 | 24 | 88.28 |
| Arsenali et al. [9] | MFCC | RNN | 20 | 5670 | 95.00 |
| Khan [21] | MFCC | CNN | - | 1000 | 96.00 |
| Wang et al. [22] | - | Dual CNN+GRU | - | 828 | 63.80 |
| Tuncer et al. [10] | PTT signal+AlexNet+VGG16 | SVM+KNN | 100 | 100 | 92.78 |
| Dalal and Triggs [23] | SCAT+GMM+MAP | MLP | 224 | 282 | 67.71 |
GRU, gated recurrent unit; SCAT, deep scattering spectrum; GMM, Gaussian mixture model; MAP, maximum a posteriori; MLP, multilayer perceptron.