Original Article

International Journal of Fuzzy Logic and Intelligent Systems 2023; 23(2): 117-129

Published online June 25, 2023

https://doi.org/10.5391/IJFIS.2023.23.2.117

© The Korean Institute of Intelligent Systems

Improved Swarm Intelligence Optimization Model with Mutual Information Estimation for Feature Selection in Microarray Bioinformatics Datasets for Diseases Diagnosis

Peddarapu Rama Krishna and Pothuraju Rajarajeswari

Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh, India

Correspondence to: Peddarapu Rama Krishna (peddarapuramakrishna@gmail.com)

Received: January 28, 2023; Revised: May 22, 2023; Accepted: June 19, 2023

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Bioinformatics has emerged as a promising field with innovative applications in various biological domains. Microarray data analysis has become the preferred technology for estimating gene expression and diagnosing diseases. However, processing and computing the vast amount of information contained in microarray data pose significant challenges. This study focuses on developing an effective feature selection model for disease diagnosis using microarray datasets. The proposed approach, mutual fuzzy swarm optimization (MFSO), uses mutual information (MI) values to compute features in a microarray gene dataset. The computed MI values are then applied to a fuzzy expert system (FES) for sample classification. To improve classification accuracy, fuzzy logic is iteratively estimated for unclassified instances in the microarray sample. The feature selection model employs a particle swarm optimization model to extract microarray sample features based on the derived MI. The swarm optimization movement is guided by an objective function. The expert system integrates fuzzy logic with human expert knowledge to perform medical diagnoses, with a specific focus on the selected features. The proposed MFSO model utilizes if-then rules to estimate the membership values of the microarray dataset features. Through simulation analysis, the performance of the proposed MFSO model is evaluated using sensitivity, specificity, and ROC values for five different microarray datasets. The results demonstrate improved performance compared to existing methods. In conclusion, this study presents a novel approach for effective feature selection in a microarray dataset for disease diagnosis. The proposed MFSO model integrates MI computation, fuzzy logic, and expert knowledge to achieve improved classification accuracy. The simulation analysis validates the effectiveness of the proposed model, highlighting its superior performance compared to existing methods.

Keywords: Microarray, Fuzzy expert system (FES), Particle swarm optimization, Bioinformatics, Feature selection
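
To illustrate the MI step described in the abstract, the following minimal sketch ranks genes by the mutual information between a discretized expression level and the class label. The equal-width binning, the function names `discretize`, `mutual_information`, and `rank_features`, and the toy data are assumptions made for illustration; this page does not specify which MI estimator MFSO uses.

```python
import math
from collections import Counter

def discretize(values, bins=3):
    """Equal-width binning of floats into integer bin ids (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y):
    """I(X;Y) in bits for two equal-length sequences of discrete symbols."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi

def rank_features(columns, labels, bins=3):
    """Return (feature_index, MI) pairs sorted by decreasing MI."""
    scores = [(j, mutual_information(discretize(col, bins), labels))
              for j, col in enumerate(columns)]
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Toy data (hypothetical): gene 0 separates the classes, gene 1 does not.
labels = [0, 0, 0, 1, 1, 1]
genes = [
    [0.1, 0.2, 0.15, 0.9, 0.95, 0.85],
    [0.5, 0.1, 0.9, 0.5, 0.1, 0.9],
]
ranking = rank_features(genes, labels)
```

In this toy case the first gene carries one full bit about the label and the second carries none, so a selector would keep the first and drop the second.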

We would like to express our gratitude to all the individuals and organizations who contributed to the successful completion of this research. Their support and assistance were invaluable in conducting the study and achieving the results presented here.

No potential conflict of interest relevant to this article was reported.

Peddarapu Ramakrishna has been a research scholar in the Department of Computer Science and Engineering at the Koneru Lakshmaiah Education Foundation (KL Deemed to be University), Vaddeswaram, Andhra Pradesh, India, since July 2019. He received his B.Tech. and M.Tech. degrees in Computer Science and Engineering from JNTU University (JNTUH), Hyderabad, in 2004 and 2010, respectively. His research interests include machine learning, artificial intelligence, and deep learning. He is currently conducting research in biotechnology. He is a member of the CSI and ISTE. He also works as an Assistant Professor in the Department of Computer Science and Engineering at VNR Vignana Jyothi Institute of Engineering and Technology, Bachupally, Hyderabad, Telangana. He has 18 years of teaching experience and has published more than 15 research articles in ESCI and Scopus/international journals. E-mail: peddarapuramakrishna@gmail.com

Pothuraju Rajarajeswari completed her Ph.D. in Computer Science and Engineering and is currently working as a Professor at KL University, Guntur, Andhra Pradesh. She has 22 years of academic teaching experience and has guided many M.Tech. and Ph.D. students. Her areas of interest include artificial intelligence, intelligent systems, data mining, machine learning, and bioinformatics. She has published more than 65 research articles in SCI/ESCI and Scopus/international journals. E-mail: rajilikhitha@gmail.com

Figure 1. Flow diagram showing MI computation in the MFSO.

Figure 2. Design of the fuzzy system.

Figure 3. Flow chart of the PSO model.

Figure 4. Representation of solution variable in the hybrid ant stem algorithm.

Figure 5. Flow chart of the PSO.

Figure 6. Comparison of (a) sensitivity, (b) specificity, (c) false positive, and (d) false negative.

Algorithm 1. MFSO feature selection in microarray.

Begin
  for i = 1 ... I do:
    Compute the fitness value of swarm optimization P using an objective function
    Estimate the fitness function based on the swarm objective function
    Repeat the fuzzy set function $F_{i-1} == F_i$
    Calculate the swarm movement in the objects:
      $P_i = \dfrac{SC_{\text{Optimum}}}{\sum_{i=1}^{N} SC_i}$
    Estimate the fitness value of the particle swarm function
      $SC_{\text{Optimum}}(t+1) = P_i \cdot SC_{\text{Optimum}}(t)$
    Compare the optimal value
    Perform the differentiation process for the MF
      $x_{ij}(t+1) = \mu_{ij} + \phi\,(\mu_{ij}(t) - \mu_{kj}(t)) + \omega\,(\mu_{mi}(t) - \mu_{ij}(t)), \quad \omega = \mu_{kj} / \mu_{ij}$
  End for
  return
End
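
The differentiation step for the membership function (MF) at the end of Algorithm 1 can be sketched as a single update. The function name `differentiate` and the sample values are hypothetical, and the value of φ is an assumption because the algorithm listing does not give it; the argument `mu_m` stands for the third membership term in the formula.

```python
def differentiate(mu_ij, mu_kj, mu_m, phi):
    """One MF update: x = mu_ij + phi*(mu_ij - mu_kj) + omega*(mu_m - mu_ij).

    omega is defined in the listing as mu_kj / mu_ij, so mu_ij must be non-zero.
    """
    if mu_ij == 0:
        raise ValueError("mu_ij must be non-zero because omega = mu_kj / mu_ij")
    omega = mu_kj / mu_ij
    return mu_ij + phi * (mu_ij - mu_kj) + omega * (mu_m - mu_ij)

# Hypothetical membership values chosen only to show the arithmetic.
x_next = differentiate(mu_ij=0.5, mu_kj=0.25, mu_m=0.75, phi=0.5)
```

With these inputs, ω = 0.5 and both correction terms equal 0.125, so the updated value is 0.75.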

Table 1. Dataset features.

| Dataset | Sample count | Gene | Labels | Class samples |
|---|---|---|---|---|
| ISR | 11 | 6,783 | IS | 5 |
| | | | IR | 5 |
| T2D | 33 | 21,793 | DM2 | 16 |
| | | | NGT | 16 |
| Colon cancer | 59 | 1,974 | Tumor | 38 |
| | | | Normal | 23 |
| Leukemia | 69 | 6,943 | ALL | 43 |
| | | | AML | 29 |
| Prostate | 106 | 11,678 | Tumor | 49 |
| | | | Normal | 53 |

IS, insulin sensitivity; IR, insulin resistance; DM, diabetes mellitus; NGT, normoglycaemia; ALL, acute lymphocytic leukemia; AML, acute myeloid leukemia.


Table 2. Estimation of features using particle swarm optimization.

| Parameters | Value |
|---|---|
| Delay factor (ρ) | 0.5 |
| Number of swarm | 75 |
| Minimal value (τ0) | 0.02 |
| Error factor (σ) | 1 |
| Csize | 27 |
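
For orientation, a generic particle swarm loop guided by an objective function can be sketched as follows. Only the swarm size of 75 is taken from Table 2; the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the search bounds, and the toy `sphere` objective are assumptions, and the remaining Table 2 parameters are not used because their roles are not defined on this page. This is textbook global-best PSO, not necessarily the variant used in MFSO.

```python
import random

def pso(objective, dim, swarm_size=75, iters=100, w=0.7, c1=1.5, c2=1.5,
        bounds=(-5.0, 5.0), seed=0):
    """Minimize `objective` with a basic global-best particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]                 # personal best positions
    pbest_val = [objective(p) for p in pos]     # personal best scores
    g = min(range(swarm_size), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

best, best_val = pso(sphere, dim=2)
```

For feature selection, the objective would instead score a candidate gene subset, for example by combining MI values with classification error; that scoring function is outside the scope of this sketch.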

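Table 3. Interpretability of the MFSO based on rules.

| Dataset | Rule set | $N_{\text{correct}}$ | $N_{\text{covers}}$ | Coverage | Accuracy (%) |
|---|---|---|---|---|---|
| ISR | Rule 1 | 5 | 6 | 78 | 94 |
| | Rule 2 | 6 | 8 | 91 | 90 |
| T2D | Rule 1 | 14 | 18 | 83 | 91 |
| | Rule 2 | 9 | 11 | 86 | 89 |
| | Rule 3 | 4 | 17 | 88 | 93 |
| | Rule 4 | 5 | 9 | 72 | 93 |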

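Table 4. Gene rule set with the MFSO.

| Datasets | Gene ID | #MF | Labels |
|---|---|---|---|
| T2D | NM_021131.1 | 1 | medium |
| | NM_0223491.1 | 2 | low, medium |
| | AW291218 | 1 | low |
| | BC000229.1 | 1 | low |
| | AL523575 | 1 | high |
| | NM_005260.2 | 1 | high |
| ISR | D32129_f_at | 2 | low, medium |
| | Z83805_at | 1 | medium |
| | Y09615_at | 2 | low, high |
| | X07730_at | 2 | medium, high |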
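
The low, medium, and high labels in Table 4 correspond to fuzzy membership functions over gene expression. A minimal sketch of how an if-then rule could combine such memberships is given below. The triangular shapes, the normalization of expression to [0, 1], the gene names `gene_a` and `gene_b`, and the min operator for AND are all assumptions for illustration; this page does not give the actual membership parameters or rule consequents.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet a and c and peak b (a <= b <= c)."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Assumed membership functions on expression values normalized to [0, 1].
MEMBERSHIP = {
    "low": (0.0, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.0),
}

def degree(x, label):
    """Membership degree of value x in the fuzzy set named by label."""
    return triangular(x, *MEMBERSHIP[label])

def fire_rule(sample, antecedents):
    """Firing strength of 'IF g1 is L1 AND g2 is L2 ...' using min for AND."""
    return min(degree(sample[gene], label) for gene, label in antecedents)

# Hypothetical rule: IF gene_a is medium AND gene_b is low THEN <class>.
rule = [("gene_a", "medium"), ("gene_b", "low")]
sample = {"gene_a": 0.4, "gene_b": 0.1}
strength = fire_rule(sample, rule)
```

Here both antecedents hold to degree 0.8, so the rule fires with strength 0.8; a sample would typically be assigned to the class of the strongest firing rule and left unclassified when no rule fires.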

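Table 5. Rule set with the MFSO.

| Dataset | Rule set | Value of $N_{\text{covers}}$ | Value of $N_{\text{correct}}$ | Value of coverage | Accuracy of rule (%) |
|---|---|---|---|---|---|
| ISR | 1 | 8 | 6 | 77 | 94.56 |
| | 2 | 10 | 8 | 90 | 98.73 |
| T2D | 1 | 21 | 17 | 74.63 | 97.64 |
| | 2 | 17 | 11 | 69.84 | 98.74 |
| | 3 | 21 | 13 | 47.32 | 99.04 |
| | 4 | 9 | 9 | 38.94 | 98.74 |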

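Table 6. Comparison of interpretability based on datasets.

| Dataset | Mean coverage (GA) | Mean coverage (HCA) | Mean coverage (MFSO) | Number of variables (GA) | Number of variables (HCA) | Number of variables (MFSO) | Membership function (GA) | Membership function (HCA) | Membership function (MFSO) |
|---|---|---|---|---|---|---|---|---|---|
| T2D | 21.3 | 25.83 | 36.85 | 2.9 | 3.98 | 16.84 | 0.45 | 3.78 | 22.74 |
| ISR | 25.82 | 28.93 | 39.43 | 3.2 | 4.94 | 19.32 | 0.83 | 3.74 | 29.83 |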

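Table 7. Generalization ability.

| Dataset | Approach | LOOCV: Correct | LOOCV: Incorrect | LOOCV: Unclassified |
|---|---|---|---|---|
| ISR | GA | 84.3 | 5.4 | 10.3 |
| | PSO | 87.8 | 5.9 | 6.3 |
| | HCA | 92.4 | 4.3 | 3.3 |
| | MFSO | 98.2 | 0.8 | 0.8 |
| T2D | GA | 44.11 | 29.41 | 26.48 |
| | PSO | 85.3 | 8.82 | 5.88 |
| | HCA | 94.11 | 2.8 | 3.09 |
| | MFSO | 98.4 | 0.7 | 1.1 |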
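
The three outcome columns of Table 7 come from leave-one-out cross-validation (LOOCV), in which each sample is held out once and predicted by a model trained on the rest. The sketch below shows only that tallying logic. The nearest-centroid classifier with a rejection margin is a stand-in chosen for brevity, not the fuzzy expert system, and the one-dimensional toy data, the `margin` value, and all function names are assumptions.

```python
def fit_centroids(xs, ys):
    """Stand-in model: mean value of each class (1-D features)."""
    return {label: sum(x for x, y in zip(xs, ys) if y == label) / ys.count(label)
            for label in set(ys)}

def predict(model, x, margin=1.0):
    """Nearest centroid; returns None when the two closest classes are
    within `margin` of each other, i.e. the sample is left unclassified."""
    ranked = sorted((abs(x - c), label) for label, c in model.items())
    if len(ranked) > 1 and ranked[1][0] - ranked[0][0] < margin:
        return None
    return ranked[0][1]

def loocv(samples, labels):
    """Percentages of correct, incorrect, and unclassified held-out samples."""
    tally = {"correct": 0, "incorrect": 0, "unclassified": 0}
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        pred = predict(fit_centroids(train_x, train_y), samples[i])
        if pred is None:
            tally["unclassified"] += 1
        elif pred == labels[i]:
            tally["correct"] += 1
        else:
            tally["incorrect"] += 1
    n = len(samples)
    return {k: 100.0 * v / n for k, v in tally.items()}

# Toy data (hypothetical): one borderline sample (7.0) and one mislabeled (10.5).
samples = [1.0, 2.0, 3.0, 9.0, 10.0, 11.0, 7.0, 10.5]
labels = ["A", "A", "A", "B", "B", "B", "B", "A"]
result = loocv(samples, labels)
```

On this toy set, six of the eight held-out samples are classified correctly, the mislabeled one is counted as incorrect, and the borderline one is rejected, giving 75.0, 12.5, and 12.5 percent respectively.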

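Table 8. Comparison of classification performance.

| Data set | Approaches | # IG | Sensitivity | Specificity | False positive | False negative |
|---|---|---|---|---|---|---|
| ISR | MI-PSO | 9 | 0.985 | 0.835 | 0.038 | 0.068 |
| | MI-HCA | 5 | 0.993 | 0.797 | 0.184 | 0.296 |
| | MI-MFSO | 5 | 0.986 | 0.825 | 0.059 | 0.083 |
| T2D | MI-PSO | 10 | 0.983 | 0.873 | 0.048 | 0.63 |
| | MI-HCA | 6 | 0.993 | 0.926 | 0.041 | 0.783 |
| | MI-MFSO | 6 | 0.997 | 0.948 | 0.028 | 0.73 |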
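
For reference, the textbook definitions of sensitivity and specificity from confusion-matrix counts can be sketched as follows. The function name `diagnostic_rates` and the example counts are hypothetical, and treating the false positive and false negative columns as rates is an assumption, since this page does not state how those columns of Table 8 were computed.

```python
def diagnostic_rates(tp, fn, tn, fp):
    """Standard rates from counts of true/false positives and negatives."""
    positives = tp + fn   # samples that truly have the condition
    negatives = tn + fp   # samples that truly do not
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")
    return {
        "sensitivity": tp / positives,          # true positive rate
        "specificity": tn / negatives,          # true negative rate
        "false_positive_rate": fp / negatives,  # 1 - specificity
        "false_negative_rate": fn / positives,  # 1 - sensitivity
    }

# Hypothetical counts, not taken from the article.
rates = diagnostic_rates(tp=45, fn=5, tn=40, fp=10)
```

With these counts the sensitivity is 0.9 and the specificity is 0.8.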

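Table 9. Comparison of mean coverage.

Values are mean coverage ($\mu_C$).

| Dataset | BCGA | PSO | GSA | HCA | MFSO |
|---|---|---|---|---|---|
| T2D | 26.4 | 23.6 | 24.6 | 23.73 | 36.82 |
| ISR | 23.7 | 27.2 | 27.3 | 28.42 | 41.75 |
| Colon | 34.67 | 37.93 | 34.9 | 31.35 | 40.94 |
| Leukemia | 32.82 | 39.1 | 37.1 | 33.94 | 44.88 |
| Prostate | 28.63 | 26.4 | 28.4 | 36.72 | 53.94 |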

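Table 10. Comparison of variables.

Values are the number of variables (#V).

| Dataset | BCGA | PSO | GSA | HCA | MFSO |
|---|---|---|---|---|---|
| T2D | 4.2 | 3.8 | 4.3 | 5.38 | 14.63 |
| ISR | 3.8 | 3.9 | 4.9 | 5.16 | 13.94 |
| Colon | 4.2 | 5.3 | 5.1 | 5.53 | 10.94 |
| Leukemia | 4.9 | 5.2 | 5.9 | 5.49 | 9.84 |
| Prostate | 5.1 | 4.1 | 5.6 | 5.28 | 10.35 |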

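Table 11. Comparison of membership function.

Values are the mean number of membership functions ($\mu_{MF}$).

| Dataset | BCGA | PSO | GSA | HCA | MFSO |
|---|---|---|---|---|---|
| T2D | 0.11 | 0.9 | 2.3 | 2.13 | 4.08 |
| ISR | 0.6 | 1.3 | 2.2 | 2.28 | 3.98 |
| Colon | 0.1 | 1.1 | 2.1 | 2.23 | 3.45 |
| Leukemia | 0.2 | 1.3 | 2.1 | 2.46 | 3.06 |
| Prostate | 0.4 | 1.2 | 2.3 | 2.69 | 4.67 |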
