
Original Article


International Journal of Fuzzy Logic and Intelligent Systems 2022; 22(1): 1-10

Published online March 25, 2022

https://doi.org/10.5391/IJFIS.2022.22.1.1

© The Korean Institute of Intelligent Systems

Snoring Sound Classification Using 1D-CNN Model Based on Multi-Feature Extraction

Tosin Akinwale Adesuyi1, Byeong Man Kim1 , and Jongwan Kim2

1Department of Computer and Software Engineering, Kumoh National Institute of Technology, Gumi, Korea
2Division of Computer and Information Engineering, Daegu University, Gyeongsan, Korea

Correspondence to: Byeong Man Kim (bmkim@kumoh.ac.kr)

Received: June 30, 2021; Revised: September 23, 2021; Accepted: November 9, 2021

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Sound is an essential element of human relationships and communication. The sound recognition process involves three phases: signal preprocessing, feature extraction, and classification. This paper describes research on the classification of snoring data, which can be used to assess sleep health in humans. However, current sound classification methods based on deep learning do not yield sufficiently accurate models. This is because some of the salient features needed to discriminate between sounds, and thereby improve classification accuracy, are poorly captured during training. In this study, we propose a new convolutional neural network (CNN) model for sound classification using multi-feature extraction. The extracted features were combined to form a new dataset that served as the input to the CNN. Experiments were conducted on snoring and non-snoring datasets. The proposed model achieved an accuracy of 99.7% on snoring sounds, demonstrating near-perfect classification and superior results compared with existing methods.

Keywords: Sound recognition, Snoring sound, CNN, Multi-feature extraction
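The multi-feature approach concatenates MFCC, ZCR, STFT, spectral, and RMS descriptors into a single input vector for the 1D-CNN. The paper does not specify its implementation; the sketch below is a rough illustration in plain NumPy of how several of these features could be framed and combined — the frame length, hop size, 85% roll-off threshold, and synthetic test signal are all assumptions of this example, and MFCC is omitted for brevity.

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=512):
    """Split a 1-D signal into overlapping frames of length frame_len."""
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop:i * hop + frame_len] for i in range(n)])

def extract_features(x, sr=16000, frame_len=1024, hop=512):
    """Concatenate frame-averaged ZCR, RMS, and spectral statistics
    (centroid, bandwidth, roll-off) into one feature vector."""
    frames = frame_signal(x, frame_len, hop)
    # Zero-crossing rate: fraction of sign changes within each frame.
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    # Root-mean-square energy per frame.
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    # Magnitude spectrum of each windowed frame (one STFT column per frame).
    mag = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    p = mag / (mag.sum(axis=1, keepdims=True) + 1e-10)
    # Spectral centroid and bandwidth from the normalized spectrum.
    centroid = (p * freqs).sum(axis=1)
    bandwidth = np.sqrt((p * (freqs - centroid[:, None]) ** 2).sum(axis=1))
    # Roll-off: frequency below which 85% of the spectral energy lies.
    cum = np.cumsum(p, axis=1)
    rolloff = freqs[np.argmax(cum >= 0.85, axis=1)]
    # Average each feature over time and concatenate.
    return np.array([zcr.mean(), rms.mean(), centroid.mean(),
                     bandwidth.mean(), rolloff.mean()])

sr = 16000
t = np.arange(sr) / sr
# A 1-second low-frequency tone plus noise, standing in for a snore clip.
snore_like = np.sin(2 * np.pi * 120 * t) + 0.1 * np.random.randn(sr)
feat = extract_features(snore_like, sr)
print(feat.shape)  # (5,)
```

In practice each feature family contributes many more dimensions (e.g., per-coefficient MFCCs), but the principle — one flat vector per audio clip, fed to the 1D-CNN — is the same.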

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. NRF-2020R1F1A104833611).

No potential conflict of interest relevant to this article was reported.

Tosin A. Adesuyi received his B.Tech. and M.Tech. degrees in computer science from the Federal University of Technology, Akure, Nigeria, in 2010 and 2014, respectively. He received his Ph.D. in artificial intelligence from the Department of Software Engineering, Kumoh National Institute of Technology, Korea, in 2020. His research has been published in premier conferences and journals. His research areas include artificial intelligence, computer vision, privacy, e-learning, deep learning, and accelerated and optimized AI models. He currently works as a GPU advocate at NVIDIA Corp.

E-mail: atadesuyi@kumoh.ac.kr

Byeong Man Kim received his B.S. degree in computer engineering from Seoul National University (SNU), Korea, in 1987, and his M.S. and Ph.D. degrees in computer science from the Korea Advanced Institute of Science and Technology (KAIST), Korea, in 1989 and 1992, respectively. He has been with Kumoh National Institute of Technology since 1992 as a faculty member of the Department of Computer Software Engineering. From 1998 to 1999, he was a postdoctoral fellow at the University of California, Irvine. From 2005 to 2006, he was a visiting scholar in the Department of Computer Science at Colorado State University, working on the design of a collaborative web agent based on friend networks. His current research areas include artificial intelligence, information filtering, information security, and brain-computer interfaces.

E-mail: bmkim@kumoh.ac.kr

Jongwan Kim received his B.S., M.S., and Ph.D. degrees in computer engineering from Seoul National University, Korea, in 1987, 1989, and 1994, respectively. He has been with Daegu University since 1995, where he is currently a professor. His research interests include artificial intelligence, internet dysfunction, human-computer interaction, and IT convergence education.

E-mail: jwkim@daegu.ac.kr


Figure 1. A generic architecture for sound classification.

Figure 2. Overall proposed system architecture for 1D-CNN.

Figure 3. Acquisition sequence of snoring data.

Figure 4. A graphical view of MFCC feature from a snore signal.

Figure 5. Conceptual diagrams of (a) 1D-CNN model and (b) 2D-CNN model.

Figure 6. A 2D-CNN architecture using spectrogram images.

Figure 7. Classification accuracy for snoring and non-snoring using multi-feature techniques and spectrogram image.

Figure 8. A 1D-CNN architecture for snoring sound classification.
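Figures 5 and 6 contrast the proposed 1D-CNN, which operates on extracted feature vectors, with a 2D-CNN that operates on spectrogram images. As a small sketch of how such a spectrogram "image" can be built from a waveform — the window length, hop size, Hann window, and log scaling here are assumptions of this example, not details taken from the paper:

```python
import numpy as np

def log_spectrogram(x, frame_len=256, hop=128):
    """STFT log-magnitude spectrogram: rows are frequency bins, columns
    are time frames. This 2-D array is the kind of 'image' a 2D-CNN
    (as in Figure 6) would consume."""
    n = 1 + (len(x) - frame_len) // hop
    win = np.hanning(frame_len)
    frames = np.stack([x[i * hop:i * hop + frame_len] * win for i in range(n)])
    mag = np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, n_bins)
    return np.log1p(mag).T                     # (n_bins, n_frames)

# One second of a 200 Hz tone at an 8 kHz sample rate.
x = np.sin(2 * np.pi * 200 * np.arange(8000) / 8000.0)
S = log_spectrogram(x)
print(S.shape)  # (129, 61)
```

The 1D-CNN path skips this image step entirely: it takes the concatenated feature vector directly, which is part of why it can be both smaller and more accurate here.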

Table 1. Variation in accuracy of features using the proposed 1D-CNN on the snoring dataset.

Feature                                  Accuracy (%)
Spectral (Centroid+Bandwidth+RollOff)    50.18
STFT+RMS+Spectral+ZCR                    60.79
STFT+RMS+Spectral                        89.46
MFCC                                     99.00
MFCC+ZCR                                 99.00
Spectral+ZCR+MFCC                        99.20
MFCC+ZCR+STFT+Spectral+RMS               99.70
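The table above suggests that the fully concatenated feature vector is the most separable input. As a minimal NumPy sketch of how a 1D-CNN forward pass might consume such a vector — the single convolutional layer, filter count, kernel width, 40-dimensional input, and random weights are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernels, bias):
    """Valid 1-D convolution: x (L,), kernels (F, K) -> (L-K+1, F)."""
    F, K = kernels.shape
    windows = np.stack([x[i:i + K] for i in range(len(x) - K + 1)])
    return windows @ kernels.T + bias          # (L-K+1, F)

def forward(x, kernels, kb, W, b):
    h = np.maximum(conv1d(x, kernels, kb), 0.0)  # ReLU activation
    pooled = h.max(axis=0)                       # global max pooling -> (F,)
    logits = W @ pooled + b                      # dense layer -> (2,)
    e = np.exp(logits - logits.max())
    return e / e.sum()                           # softmax: snore vs. non-snore

feat_vec = rng.standard_normal(40)           # stand-in concatenated features
kernels = rng.standard_normal((8, 5)) * 0.1  # 8 filters of width 5
kb = np.zeros(8)
W = rng.standard_normal((2, 8)) * 0.1
b = np.zeros(2)
probs = forward(feat_vec, kernels, kb, W, b)
print(probs)  # two class probabilities summing to 1
```

A trained model would of course learn the kernel and dense weights from the snoring dataset; the point of the sketch is the shape flow from a flat feature vector to a two-class prediction.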

Table 2. Classification results in existing studies on snoring data.

Study                  Feature extraction technique  Classifier    Subjects  Training size  Test accuracy (%)
Demir et al. [20]      LBP+HOG                       SVM           -         828            72.00
Lim et al. [11]        ZCR+STFT+MFCC                 RNN           85        600            98.80
Kang et al. [6]        MFCC                          CNN+LSTM      24        24             88.28
Arsenali et al. [9]    MFCC                          RNN           20        5670           95.00
Khan [21]              MFCC                          CNN           -         1000           96.00
Wang et al. [22]       -                             Dual CNN+GRU  -         828            63.80
Tuncer et al. [10]     PTT signal+AlexNet+VGG16      SVM+KNN       100       100            92.78
Dalal and Triggs [23]  SCAT+GMM+MAP                  MLP           224       282            67.71

GRU, gated recurrent unit; SCAT, deep scattering spectrum; GMM, Gaussian mixture model; MAP, maximum a posteriori; MLP, multilayer perceptron.

