International Journal of Fuzzy Logic and Intelligent Systems 2024; 24(1): 10-18
Published online March 25, 2024
https://doi.org/10.5391/IJFIS.2024.24.1.10
© The Korean Institute of Intelligent Systems
Nishant Chauhan and Byung-Jae Choi
Department of Electronic Engineering, Daegu University, Gyeongsan, Korea
Correspondence to: Byung-Jae Choi (bjchoi@daegu.ac.kr)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Attention-deficit hyperactivity disorder (ADHD) is a prevalent neurodevelopmental condition in children characterized by impairments in attention, hyperactivity, and impulse control. Despite extensive research, the underlying cause of ADHD remains unclear. Electroencephalography (EEG), a noninvasive method for recording brain activity, is valuable for studying ADHD-related neural patterns. This study explored the potential of EEG data to differentiate children with ADHD from healthy controls (HC) to enhance diagnostic accuracy. We analyzed EEG recordings from 61 children with ADHD and 60 healthy controls. The EEG data comprised signals from 19 scalp channels. Our primary objective was to develop a machine learning model capable of distinguishing children with ADHD from HC using EEG data as discriminatory features. To select the most relevant features, we utilized mutual information (MI), a measure of the statistical dependence between two variables. The top features were selected based on their minimum MI values, ensuring that they captured meaningful information from both ADHD and HC groups. Principal component analysis was employed to reduce dimensionality while preserving the essential features, aiming to mitigate computational complexity. The selected features were then used to train ten different classifiers: random forest, multilayer perceptron (MLP), k-nearest neighbors, extra tree classifier, XGBoost, support vector machines, logistic regression, AdaBoost, classification and regression trees, and gradient boosting machines. A stacked classifier was constructed by combining the outputs of all 10 individual classifiers, with the MLP acting as a meta-classifier. The stacked classifier outperformed the individual models, achieving an accuracy of 92%. Its precision (91%) and sensitivity (93%) were also higher than those of the individual models, indicating its ability to correctly identify ADHD-positive cases. Furthermore, the specificity of the stacked classifier (93%) was superior, highlighting its improved proficiency in correctly classifying HC. This evaluation established the stacked classifier as an effective approach for ADHD classification, surpassing the performance of the standalone models. The proposed approach offers a noninvasive, objective, and cost-effective means of identifying children with ADHD, which may support earlier diagnosis, intervention, and improved treatment outcomes.
Keywords: EEG, ADHD, Machine learning, Mutual information, PCA, Stacked classifier
Attention-deficit hyperactivity disorder (ADHD) is a neurodevelopmental disorder marked by inattention, hyperactivity, and impulsivity, which are inconsistent with an individual’s age and developmental level [1]. The exact cause of ADHD remains unknown; however, several factors, including genetics, brain injury, toxin exposure, and substance abuse during pregnancy, may contribute to its development [2]. Currently, diagnosis of ADHD often relies on comprehensive evaluation by a pediatrician, psychiatrist, or psychologist. However, clinical manifestations of ADHD are subtle and difficult to detect [3]. Imaging techniques such as magnetic resonance imaging (MRI), positron emission tomography (PET), and computed tomography (CT) scans are not yet reliable for ADHD diagnosis, and even in developed countries, there is ongoing debate among specialists regarding diagnostic criteria. Moreover, the risk of false-positive diagnoses, especially in children, remains high. The diagnostic process for ADHD typically involves gathering a significant amount of data, making it complex, time-consuming, and costly. This extensive data collection is necessary to determine whether ADHD underlies the observed symptoms as opposed to normal behavioral variations, potential differential diagnoses, or co-occurring conditions [4]. Yet, the process relies heavily on subjective assessments, increasing the likelihood of diagnostic biases and potentially delaying treatment initiation despite the availability of effective ADHD treatments. Therefore, it is imperative to streamline, abbreviate, and standardize the diagnostic process for ADHD. This can be achieved by identifying the most relevant data elements that can accurately predict the diagnostic outcomes. Machine learning (ML) techniques offer a promising approach to achieve this goal.
The growing availability of ML models has sparked a surge in interest in their application in the study of psychiatric disorders. ML models are mathematical algorithms capable of identifying intricate patterns in existing datasets. These learned patterns can then be utilized for predictive tasks in new datasets (e.g., patient vs. control participant classification and symptom score prediction), as well as to identify the most critical variables influencing these predictions. They have demonstrated remarkable effectiveness in capturing complex interactions underlying discrete alterations in schizophrenia, Alzheimer’s disease (AD), and autism spectrum disorder (ASD) [5].
One study demonstrated the effectiveness of electroencephalography (EEG) analysis in differentiating 253 children with ADHD from 67 age-matched controls. The authors used EEG signals recorded during eyes-closed resting-state conditions. Using logistic regression (LR) as a classifier, they achieved an accuracy of 87%, a sensitivity of 89%, and a specificity of 79.6% [6]. Another study [7] explored the use of nonlinear features extracted from EEG recordings during a cognitive attention task for classification. By combining minimum redundancy maximum relevance (mRMR) feature selection and a multilayer perceptron (MLP) with one hidden layer containing five neurons, the researchers distinguished 30 children with ADHD from 30 age-matched controls, achieving an accuracy of 92.28%. A recent nationwide Swedish study employed multiple ML models, including random forest (RF), elastic net, deep neural network, and gradient boosting machine (GBM), to identify significant predictors of ADHD based on the family and medical histories of a large cohort of 238,696 individuals [8]. The best-performing model attained a sensitivity of 71.7% and a specificity of 65.0%. Key risk factors for ADHD in children included having parents with criminal convictions, male sex, having a relative with ADHD, academic difficulties, and learning disabilities.
The potential of event-related potentials (ERPs) as diagnostic tools for ADHD was explored in a study that examined ERPs derived from the prefrontal and inferior parietal regions in 14 children with ADHD and 16 age-matched controls during a spatial Stroop task. Using k-nearest neighbor (KNN) and support vector machine (SVM) classifiers, researchers achieved an impressive 83.33% accuracy with KNN, outperforming the SVM’s accuracy of 56.42% [9]. Beyond enabling subject classification into traditional diagnostic groups, research on ML-aided ADHD diagnosis has also contributed to a deeper understanding of the clinical presentation and heterogeneity of ADHD by facilitating the identification of novel subgroups of participants, which can ultimately enhance diagnostic accuracy [10].
Feature selection is vital in ML, refining model performance by pinpointing the most relevant and informative features, thereby reducing dimensionality, improving interpretability, and mitigating the risk of overfitting. Moreover, ML techniques that can explore the heterogeneities in ADHD (e.g., feature selection and reduction methods) have the potential to not only improve diagnosis, but also contribute to advancements in future research investigating the underlying mechanisms of ADHD by providing more appropriately defined samples.
In our study, we investigated the potential of EEG data to differentiate between children with ADHD and HC to improve diagnostic accuracy. Using mutual information (MI) for feature selection, we identified key features with minimum MI values from both groups. To manage computational complexity, principal component analysis (PCA) was applied for dimensionality reduction while preserving essential features. Ten classifiers (RF, MLP, KNN, extra tree classifier [ETC], XGBoost [XGB], SVM, LR, AdaBoost, classification and regression tree [CART], and gradient boosting machine [GBM]) were trained using the selected features. A stacked classifier, combining outputs from all 10 classifiers with MLP as the meta-classifier, demonstrated promising results.
The remainder of this paper is organized as follows. Section 2 outlines the materials and methodology employed. Section 3 discusses results comprehensively. Finally, Section 4 concludes the paper with a summary of key findings and future directions.
In this study, we proposed an ML framework that utilizes MI-based feature selection, dimensionality reduction, and stacked classifier ensemble learning to address the challenges of traditional ADHD diagnosis and enhance classification accuracy.
Figure 1 illustrates the architecture of the proposed method for classifying ADHD and healthy controls (HC) using a stacked classifier. The framework begins by preprocessing the collected EEG data to ensure quality and consistency. Subsequently, MI is employed to select the most relevant features from the preprocessed EEG signals, eliminating redundant and less informative features. The data representation is further refined through PCA, which reduces the dimensionality of the feature set while preserving essential information. The selected features are then used to train a stacked classifier, an ensemble-learning approach that combines the predictions of multiple individual classifiers. This stacked classifier leverages the strengths of different algorithms to achieve superior performance compared with standalone classifiers. Finally, the trained stacked classifier predicts ADHD or HC, and its performance is compared with that of the individual classifiers. This approach aims to improve the accuracy and efficiency of ADHD diagnosis, potentially paving the way for improved clinical decision-making and patient outcomes.
This study utilized a validated online EEG dataset available at https://ieee-dataport.org/open-access/eegdata-adhd-control-children [11]. This dataset includes EEG recordings from two distinct groups: 61 children diagnosed with ADHD (48 boys and 13 girls, average age 9.61 ± 1.75 years) and 60 healthy children (50 boys and 10 girls, average age 9.85 ± 1.77 years). EEG data were collected using an SDC24 device equipped with 19 channels following the 10–20 electrode placement system commonly used in EEG. The 10–20 standard system provides a consistent method for electrode positioning by dividing the scalp into defined regions with labeled points based on standardized distances relative to prominent anatomical landmarks. Electrodes were placed at predetermined locations to ensure uniform and reproducible EEG signal acquisition across diverse brain regions. The “10–20” nomenclature signifies the standardized distances of 10% or 20% between electrode placements, creating a systematic grid for consistent electrode positioning across subjects and research settings. This method ensures methodological rigor in data collection, facilitates meaningful cross-subject and cross-study comparisons, and is crucial in EEG-based ADHD research. ADHD was diagnosed by an experienced child and adolescent psychiatrist using the Diagnostic and Statistical Manual of Mental Disorders, fourth edition (DSM-IV) criteria [12].
EEG signals are susceptible to various artifacts and noise sources, necessitating thorough preprocessing before reliable analysis. The datasets in this study underwent the following preprocessing steps.
The dataset owners employed a customized iteration of Makoto’s preprocessing pipeline adapted for use with EEGLab functions (version 14.1.1; Delorme & Makeig, 2004) in MATLAB 2018a. For artifact elimination, a band-pass finite impulse response filter covering 0.5 Hz to 48 Hz was applied to the continuous EEG data. The CleanLine plugin was used to suppress line noise. Independent component analysis (ICA) was then employed to decompose the EEG data, and components related to eye blinks and muscle artifacts were manually excluded based on spectral properties, scalp maps, and temporal characteristics. For each subject, the time series was segmented into 1,024-sample (8-second) segments, with variable counts owing to task-specific timings. The minimum task duration was 50 seconds for the control group participants, while the maximum duration was 285 seconds for the ADHD participants. The mean segment count was 13.18 (standard deviation = 3.15) in the control group and 16.14 (standard deviation = 6.42) in the ADHD group.
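The filtering and segmentation steps can be illustrated with a minimal Python sketch. This is not the dataset owners' EEGLab pipeline: the use of SciPy, the filter length (`numtaps=423`), the zero-phase `filtfilt` call, and the random stand-in signal are all assumptions made for illustration. The 128 Hz sampling rate is inferred from the stated 1,024 samples per 8-second segment. Line-noise removal and ICA are not shown.

```python
import numpy as np
from scipy.signal import filtfilt, firwin

FS = 128        # Hz, inferred from 1,024 samples per 8-second segment
SEG_LEN = 1024  # samples per segment, as stated in the text


def bandpass_fir(eeg, fs=FS, low=0.5, high=48.0, numtaps=423):
    """Apply a 0.5-48 Hz FIR band-pass along the time axis (last axis).

    numtaps is an illustrative choice, not a value from the paper.
    """
    taps = firwin(numtaps, [low, high], pass_zero=False, fs=fs)
    # filtfilt runs the filter forward and backward (zero phase shift).
    return filtfilt(taps, [1.0], eeg, axis=-1)


def segment(eeg, seg_len=SEG_LEN):
    """Split (channels, samples) into (segments, channels, seg_len).

    Trailing samples that do not fill a whole segment are dropped.
    """
    n_seg = eeg.shape[-1] // seg_len
    trimmed = eeg[:, : n_seg * seg_len]
    return trimmed.reshape(eeg.shape[0], n_seg, seg_len).transpose(1, 0, 2)


# Stand-in for one 60-second, 19-channel recording (random noise, not real EEG).
rng = np.random.default_rng(0)
raw = rng.normal(size=(19, 60 * FS))

filtered = bandpass_fir(raw)
segments = segment(filtered)
print(segments.shape)
```

Because the recordings differ in length, the number of segments per subject varies, which is consistent with the variable segment counts reported above.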
MI quantifies the statistical dependence between two variables. In the context of feature selection for ADHD classification using EEG data, MI helps identify EEG features highly correlated with the target variable (i.e., ADHD diagnosis). This method is powerful for selecting the most relevant and informative features in a dataset. PCA is a dimensionality-reduction technique. It reduces the number of features in a dataset while preserving the most important information. The use of both methods in this study is described below.
MI gauges the dependence between two variables. In this study, MI was used to identify the EEG features that were most informative for distinguishing children with ADHD from healthy controls. Features with high MI values are crucial as they provide more information regarding the diagnosis of ADHD. To calculate the MI, the minimum MI values between each EEG feature and class label (ADHD vs. HC) were computed using the mutual_info_score function in Python’s scikit-learn library.
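A minimal sketch of scoring features with `mutual_info_score` is given below. Because that function operates on discrete labels, each continuous feature is first binned. The bin count, the helper name `mi_per_feature`, and the synthetic data are assumptions for illustration. The study's own selection rule (how many features are kept, and by which MI criterion) is not reproduced here.

```python
import numpy as np
from sklearn.metrics import mutual_info_score


def mi_per_feature(X, y, n_bins=10):
    """Return the MI (in nats) between each column of X and the labels y."""
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        # mutual_info_score expects discrete values, so bin the feature first.
        edges = np.histogram_bin_edges(X[:, j], bins=n_bins)
        binned = np.digitize(X[:, j], edges[1:-1])
        scores[j] = mutual_info_score(y, binned)
    return scores


# Synthetic stand-in: 200 samples, 19 features, binary labels (0 = HC, 1 = ADHD).
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 19))
X[:, 3] += 2.0 * y  # make one feature depend on the class label

scores = mi_per_feature(X, y)
ranking = np.argsort(scores)[::-1]  # feature indices, highest MI first
print(ranking[:5], scores[ranking[:5]].round(3))
```

The estimate depends on the binning: with few samples and many bins, even unrelated features receive a small positive MI, so scores are best compared relative to one another.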
Finally, topographic maps were generated to visualize the spatial distribution of the most informative EEG features. These maps, created using the minimum norm estimate (MNE) toolbox, provide a visual perspective on how these significant features are distributed across the scalp.
Figure 2 displays a topographic comparison of features between ADHD and HC based on minimum MI. The figure shows that several features are significantly different between individuals with ADHD and controls. These features include:
The figure also illustrates that the direction of the difference in MI between ADHD and control individuals was not always consistent. For instance, the MI between individuals with ADHD and control individuals was higher for the temporal region of the brain in some cases (e.g., Ept and TB) and lower in others (e.g., P3 and Pz). This suggests that the underlying neural mechanisms of ADHD are complex and that there is no single feature that can reliably distinguish between individuals with ADHD and controls.
This process entails reducing the dimensionality of the features chosen through MI. The cumulative variance plot in Figure 3 visually indicates the proportion of explained variance with an increasing number of components, which helps to identify an optimal cutoff. Here, the 95% cutoff threshold guided the selection of the number of components. Subsequently, PCA was applied to transform the original dataset into a new, more compact representation with a reduced number of dimensions (five in this instance). Five PCA components were selected as they accounted for over 95% of the variance in the brain network data.
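The 95% cumulative-variance rule can be sketched with scikit-learn's `PCA`, which accepts a fraction between 0 and 1 as `n_components` and keeps the smallest number of components reaching that share of variance. The synthetic data below (five latent factors mixed into 19 features) is an assumption chosen so that a small number of components suffices. It does not reproduce the study's feature matrix.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in: 19 observed features driven by 5 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 5))
mixing = rng.normal(size=(5, 19))
X = latent @ mixing + 0.05 * rng.normal(size=(300, 19))

# A float n_components keeps the fewest components whose cumulative
# explained variance reaches that fraction (here 95%).
pca = PCA(n_components=0.95, svd_solver="full")
X_reduced = pca.fit_transform(X)

cumulative = np.cumsum(pca.explained_variance_ratio_)
print(pca.n_components_, cumulative.round(3))
```

Plotting `cumulative` against the component index gives a curve of the kind described for Figure 3, with the cutoff read off where the curve crosses 0.95.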
In this study, we employed an ensemble of ten ML classifiers to detect patterns in EEG data and enhance ADHD classification in children. The classifiers encompassed a range of algorithms to ensure a comprehensive exploration of the dataset. RF [13], MLP [14], KNN [15], ETC [16], and XGB [17] represent various decision-based models. Additionally, SVM [18], LR using stochastic gradient descent (SGD) [19], AdaBoost [20], CART [21], and GBM [22] introduce diverse learning approaches. This broad range of classifiers ensures a thorough examination of EEG data, capturing the nuances and complexities essential for accurate classification of ADHD in children.
A stacked classifier, also known as a stacked ensemble, is an ML approach that combines the predictions of multiple base classifiers to improve overall performance. In this study, ten diverse classifiers were employed, and their outputs were fed into a meta-classifier, specifically an MLP, to form a stacked classifier. This ensemble strategy aims to leverage the strengths of individual classifiers and enhance predictive accuracy and robustness.
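The stacking arrangement can be sketched with scikit-learn's `StackingClassifier`, using an MLP as the final estimator. This is an illustrative sketch only. It uses six of the base learners named above (XGBoost, AdaBoost, GBM, and the base-level MLP are omitted for brevity), and the hyperparameters, feature scaling, plain `LogisticRegression` in place of the SGD-trained variant, and synthetic data are all assumptions rather than the study's configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (ExtraTreesClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for five PCA components with binary labels.
X, y = make_classification(n_samples=300, n_features=5, n_informative=4,
                           n_redundant=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Base (level-0) learners; scale-sensitive models are wrapped with a scaler.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("etc", ExtraTreesClassifier(n_estimators=100, random_state=0)),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("cart", DecisionTreeClassifier(random_state=0)),
]

# The MLP meta-classifier is trained on out-of-fold predictions of the base
# learners (cv=5), so it does not see predictions made on their training data.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                                  random_state=0),
    cv=5,
)
stack.fit(X_train, y_train)
test_accuracy = stack.score(X_test, y_test)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

Training the meta-classifier on out-of-fold predictions is the usual safeguard against the meta-level simply memorizing overfitted base-learner outputs.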
It is important to use various metrics when evaluating the performance of a classifier. This is because no single metric can capture all aspects of the classifier performance. In this study, we used the following five performance metrics.
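Assuming the standard confusion-matrix definitions of the five metrics reported in the results (accuracy, precision, sensitivity, specificity, and F1-score), they can be written as:

```latex
\begin{align}
\text{Accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Precision}   &= \frac{TP}{TP + FP} \\
\text{Sensitivity} &= \frac{TP}{TP + FN} \\
\text{Specificity} &= \frac{TN}{TN + FP} \\
\text{F1-score}    &= \frac{2 \times \text{Precision} \times \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}}
\end{align}
```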
where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
The performance comparison results of the ten ML classifiers for the ADHD classification task are shown in Figure 4. The stacked classifier exhibited the best performance, with 92% accuracy, 91% precision, 93% sensitivity, 93% specificity, and an F1-score of 92%.
Furthermore, based on the confusion matrix shown in Figure 5, the classifier detected 56 TP, 5 FP, 5 FN, and 55 TN. Five-fold cross-validation was applied, and the average of each performance metric was reported. These results suggest that the stacked classifier learned more complex patterns in the data than the other classifiers.
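As a worked check, the metrics can be computed directly from the confusion-matrix counts quoted above, using the standard definitions. The helper name `confusion_metrics` is hypothetical. Because the reported percentages are averages over five folds, values derived from the pooled counts need not coincide exactly with them.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from raw counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Counts as quoted from Figure 5: 56 TP, 5 FP, 5 FN, 55 TN.
metrics = confusion_metrics(tp=56, fp=5, fn=5, tn=55)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The counts sum to 61 ADHD cases (56 + 5) and 60 HC (5 + 55), matching the group sizes in the dataset, and each pooled metric comes out at roughly 92%.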
The XGB and ETC also demonstrated strong performance, achieving accuracies of 88% and 79%, respectively. These classifiers are ensemble-learning methods that combine the predictions of multiple individual classifiers to produce a final prediction. This is one reason why they were able to achieve such good performance.
In contrast, the other classifiers exhibited lower performances, with accuracies ranging from 54% to 79%.
Overall, the results indicate that ensemble learning methods hold promise for ADHD classification. The stacked classifier achieves the best performance, whereas the XGB and ETC perform satisfactorily.
Here are some additional observations from the results:
• The stacked classifier had the highest precision and specificity, indicating that it was very good at correctly identifying both positive and negative ADHD cases.
• The XGB classifier exhibited the highest sensitivity, indicating its strong ability to detect all cases of ADHD, even if it resulted in false positives.
• The MLP and SVM classifiers had the lowest performance, suggesting that they were unable to learn complex patterns in the ADHD data as well as the other classifiers.
• Varied MI differences between ADHD and HC were observed in a variety of brain regions, suggesting that ADHD is a complex neurodevelopmental disorder that affects multiple brain regions.
• The direction of the MI differences was not always consistent, suggesting that there is no single neural signature for ADHD.
• The observed MI differences were relatively small, suggesting that ADHD is a heterogeneous disorder with a wide range of presentations.
In conclusion, our findings offer important insights into the neural mechanisms underlying ADHD. The results indicate that MI could be used to develop more accurate diagnostic tools for ADHD, and that a multimodal approach to ADHD diagnosis and treatment may be most effective.
This study extensively explored the use of ML techniques for the classification of children with ADHD and HC using EEG data. By employing MI-based feature selection, dimensionality reduction via PCA, and a stacked classifier ensemble, we significantly improved diagnostic accuracy. The robust performance of the individual classifiers, particularly the XGBoost model, highlights the potential of advanced ML algorithms to discern subtle patterns in EEG data associated with ADHD. The stacked classifier exhibits superior accuracy, emphasizing the benefits of combining diverse classifiers to enhance diagnostic outcomes. Our findings contribute to the growing body of literature on ADHD diagnosis, showcasing the effectiveness of ML methodologies and emphasizing the importance of feature selection and ensemble learning strategies. Further research on larger and more diverse datasets is needed to consolidate and extend our findings, with the ultimate goal of refining ADHD diagnosis and advancing our understanding of neurodevelopmental disorders.
This study showcased the potential of utilizing minimum MI-based features and a stacked classifier for ADHD diagnosis. However, the balanced and preprocessed nature of the dataset may limit the generalizability of the findings to real-world clinical settings. Future investigations should evaluate the performance of stacked classifiers on imbalanced datasets that reflect wider patient variations. Furthermore, exploring the integration of different neuroimaging modalities with EEG features could offer valuable avenues for enhancing diagnostic accuracy and deepening our comprehension of ADHD.
No potential conflict of interest relevant to this article was reported.
E-mail: nishantsep1090@daegu.ac.kr.
E-mail: bjchoi@daegu.ac.kr.
International Journal of Fuzzy Logic and Intelligent Systems 2024; 24(1): 10-18
Published online March 25, 2024 https://doi.org/10.5391/IJFIS.2024.24.1.10
Copyright © The Korean Institute of Intelligent Systems.
Nishant Chauhan and Byung-Jae Choi
Department of Electronic Engineering, Daegu University, Gyeongsan, Korea
Correspondence to:Byung-Jae Choi (bjchoi@daegu.ac.kr)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Attention-deficit hyperactivity disorder (ADHD) is a prevalent neurodevelopmental condition in children characterized by impairments in attention, hyperactivity, and impulse control. Despite extensive research, the underlying cause of ADHD remains unclear. Electroencephalography (EEG), a noninvasive method for recording brain activity, is valuable for studying ADHD-related neural patterns. This study explored the potential of EEG data to differentiate children with ADHD and healthy controls (HC) to enhance diagnostic accuracy. We analyzed EEG recordings from 61 children with ADHD and 60 healthy controls. The EEG data comprised signals from 19 scalp channels. Our primary objective was to develop a machine learning model capable of classifying ADHD subjects with ADHD from HC using EEG data as discriminatory features. To select the most relevant features, we utilized mutual information (MI), a measure of the statistical dependence between two variables. The top features were selected based on their minimum MI values, ensuring that they captured meaningful information from both ADHD and HC groups. Principal component analysis was employed to reduce dimensionality while preserving the essential features, aiming to mitigate computational complexity. The selected features were then used to train ten different classifiers: random forest, multilayer perceptron (MLP), k-nearest neighbors, extra tree classifier, XGBoost, support vector machines, logistic regression, AdaBoost, classification and regression trees, and gradient boosting machines. A stacked classifier was constructed by combining the outputs of all 10 individual classifiers, with the MLP acting as a meta-classifier. The stacked classifier outperformed individual models, achieving an impressive accuracy of 92%. Its precision (91%) and sensitivity (93%) were also higher than those of the individual models, indicating its ability to correctly identify ADHD-positive cases. 
Furthermore, the specificity of the stacked classifier (93%) was superior, highlighting its improved proficiency in correctly classifying HC. This comprehensive evaluation established the stacked classifier as an effective approach for ADHD classification, surpassing the performance of several standalone models. Our proposed method offers a noninvasive, objective, and cost-effective method for identifying children with ADHD, leading to earlier diagnosis, intervention, and improved treatment outcomes.
Keywords: EEG, ADHD, Machine learning, Mutual information, PCA, Stacked classifier
Attention-deficit hyperactivity disorder (ADHD) is a neurodevelopmental disorder marked by inattention, hyperactivity, and impulsivity, which are inconsistent with an individual’s age and developmental level [1]. The exact cause of ADHD remains unknown; however, several factors, including genetics, brain injury, toxin exposure, and substance abuse during pregnancy, may contribute to its development [2]. Currently, diagnosis of ADHD often relies on comprehensive evaluation by a pediatrician, psychiatrist, or psychologist. However, clinical manifestations of ADHD are subtle and difficult to detect [3]. Imaging techniques such as magnetic resonance imaging (MRI), positron emission tomography (PET), and computed tomography (CT) scans are not yet reliable for ADHD diagnosis, and even in developed countries, there is ongoing debate among specialists regarding diagnostic criteria. Moreover, the risk of false-positive diagnoses, especially in children, remains high. The diagnostic process for ADHD typically involves gathering a significant amount of data, making it complex, time-consuming, and costly. This extensive data collection is necessary to determine whether ADHD underlies the observed symptoms as opposed to normal behavioral variations, potential differential diagnoses, or co-occurring conditions [4]. Yet, it heavily relies on subjective assessments, increasing the likelihood of diagnostic biases and potentially delaying treatment initiation despite the availability of effective ADHD treatments. Therefore, it is imperative to streamlining, abbreviate and standardize the diagnostic process of ADHD. This can be achieved by identifying the most relevant data elements that can accurately predict the diagnostic outcomes. Machine learning (ML) techniques offer a promising approach to achieve this goal.
The growing availability of ML models has sparked a surge in interest in their application in the study of psychiatric disorders. ML models are mathematical algorithms capable of identifying intricate patterns in existing datasets. These learned patterns can then be utilized for predictive tasks in new datasets (e.g., patient vs. control participant classification and symptom score prediction), as well as to identify the most critical variables influencing these predictions. They have demonstrated remarkable effectiveness in capturing complex interactions underlying discrete alterations in schizophrenia, Alzheimer’s disease (AD), and autism spectrum disorder (ASD) [5].
One study showcased electroencephalography (EEG) analysis effectiveness in differentiating 253 children with ADHD from 67 age-matched controls. These used EEG signals recorded during eyes-closed resting-state conditions. Using logistic regression (LR) as a classifier, they achieved an impressive accuracy of 87%, sensitivity of 89%, and specificity of 79.6% [6]. Another study [7] explored the using nonlinear features extracted from EEG recordings during a cognitive attention task for classification. By combining minimum redundancy maximum relevance (mRMR) feature selection and multilayer perceptron (MLP) with one hidden layer containing five neurons, the researchers successfully distinguished 30 children with ADHD from 30 age-matched controls, achieving an accuracy of 92.28%. A recent nationwide Swedish study employed multiple ML models, including random forest (RF), elastic net, deep neural network, and gradient boosting machine (GBM) to identify significant predictors of ADHD based on the family and medical histories of a large cohort of 238,696 individuals [8]. The best-performing model attained a sensitivity of 71.7% and a specificity of 65.0%. Key risk factors for ADHD in children included having parents with criminal convictions, male sex, having a relative with ADHD, academic difficulties, and learning disabilities.
The potential of event-related potentials (ERPs) as diagnostic tools for ADHD was explored in a study that examined ERPs derived from the prefrontal and inferior parietal regions in 14 children with ADHD and 16 age-matched controls during a spatial Stroop task. Using k-nearest neighbor (KNN) and support vector machine (SVM) classifiers, researchers achieved an impressive 83.33% accuracy with KNN, outperforming the SVM’s accuracy of 56.42% [9]. Beyond enabling subject classification into traditional diagnostic groups, research on ML-aided ADHD diagnosis has also contributed to a deeper understanding of the clinical presentation and heterogeneity of ADHD by facilitating the identification of novel subgroups of participants, which can ultimately enhance diagnostic accuracy [10].
Feature selection is vital in ML, refining model performance by pinpointing the most relevant and informative features, thereby reducing dimensionality, improving interpretability, and mitigating the risk of overfitting. Moreover, ML techniques that can explore the heterogeneities in ADHD (e.g., feature selection and reduction methods) have the potential to not only improve diagnosis, but also contribute to advancements in future research investigating the underlying mechanisms of ADHD by providing more appropriately defined samples.
In our study, we investigated the potential of EEG data to differentiate between children with ADHD and HC to improve diagnostic accuracy. Using mutual information (MI) for feature selection, we identified key features with minimum MI values from both groups. To manage computational complexity, principal component analysis (PCA was applied for dimensionality reduction while preserving essential features. Ten classifiers (RF, MLP, KNN, extra tree classifier [ETC], XGBoost [XGB]), SVM, LR, AdaBoost, classification and regression tree [CART], and gradient boosting machine [GBM]) were trained using the selected features. A Stacked classifier, combining outputs from all 10 classifiers with MLP as the meta-classifier, demonstrated promising results.
The remainder of this paper is organized as follows. Section 2 outlines the materials and methodology employed. Section 3 discusses results comprehensively. Finally, Section 4 concludes the paper with a summary of key findings and future directions.
In this study, we proposed an ML framework that utilizes MI-based feature selection, dimensionality reduction, and stacked classifier ensemble learning to address the challenges of traditional ADHD diagnosis and enhance classification accuracy.
Figure 1 illustrates the architecture of the proposed method for ADHD and health controls (HC) classification using a stacked classifier. The framework begins by preprocessing collected EEG data to ensure quality and consistency. Subsequently, MI was employed to select the most relevant features from the preprocessed EEG signals, eliminating redundant and less informative features. Further refinement of the data representation is achieved through PCA, which reduced the dimensionality of the feature set while preserving essential information. The selected features are then used to train a stacked classifier, which is an ensemble-learning approach that combines the predictions of multiple individual classifiers. This stacked classifier leverages the strengths of different algorithms to achieve superior performance compared with standalone classifiers. Finally, the trained stacked classifier predicts ADHD and HC, with performance compared to other individual classifiers. This comprehensive approach aims to improve the accuracy and efficiency of ADHD diagnosis, potentially paving the way for improved clinical decision-making and patient outcomes.
This study utilized a validated online EEG dataset available at https://ieee-dataport.org/open-access/eegdata-adhd-control-children [11]. This dataset includes EEG recordings from two distinct groups: 61 children diagnosed with ADHD (48 boys and 13 girls, average age 9.61 ± 1.75 years) and 60 healthy children (50 boys and 10 girls, average age 9.85 ± 1.77 years). EEG data were collected using an SDC24 device equipped with 19 channels following the 10–20 electrode placement system commonly used in EEG. The standard 10–20 system provides a consistent method for electrode positioning by dividing the scalp into defined regions with labeled points based on standardized distances relative to prominent anatomical landmarks. Electrodes were placed at predetermined locations to ensure uniform and reproducible EEG signal acquisition across diverse brain regions. The “10–20” nomenclature signifies the standardized distances of 10% or 20% between electrode placements, creating a systematic grid for consistent electrode positioning across subjects and research settings. This method ensures methodological rigor in data collection, facilitates meaningful cross-subject and cross-study comparisons, and is crucial in EEG-based ADHD research. ADHD was diagnosed by an experienced child and adolescent psychiatrist using the Diagnostic and Statistical Manual of Mental Disorders, fourth edition (DSM-IV) criteria [12].
EEG signals are susceptible to various artifacts and noise sources, necessitating thorough preprocessing before reliable analysis. The datasets in this study underwent the following preprocessing steps.
The dataset owners employed a customized iteration of Makoto’s preprocessing pipeline adapted for use with EEGLab functions (version 14.1.1; Delorme & Makeig, 2004) in MATLAB 2018a. For artifact elimination, a band-pass finite impulse response filter covering 0.5 Hz to 48 Hz was applied to the continuous EEG data. The CleanLine plugin was used to suppress line noise. Independent component analysis (ICA) was then employed to decompose the EEG data, and components related to eye blinks and muscle artifacts were manually excluded based on spectral properties, scalp maps, and temporal characteristics. For each subject, the time series was segmented into 1,024-sample (8-second) segments, with the number of segments varying owing to task-specific timings. The minimum task duration was 50 seconds for the control group participants, while the maximum duration was 285 seconds for the ADHD participants. The mean segment count was 13.18 (standard deviation = 3.15) in the control group and 16.14 (standard deviation = 6.42) in the ADHD group.
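The filtering and segmentation steps can be sketched in Python as a rough stand-in for the EEGLab pipeline described above. This is a minimal illustration under stated assumptions, not the dataset owners' code: the 128 Hz sampling rate is inferred from 1,024 samples per 8-second segment, the filter length (`numtaps=257`) and the zero-phase `filtfilt` application are illustrative choices, the helper names are hypothetical, and the input is synthetic noise. Line-noise suppression and ICA are omitted.

```python
import numpy as np
from scipy.signal import firwin, filtfilt

FS = 128         # Hz; assumption inferred from 1,024 samples per 8-second segment
SEG_LEN = 1024   # samples per segment, as stated in the text
N_CHANNELS = 19  # number of scalp channels, as stated in the text


def bandpass_fir(eeg, low=0.5, high=48.0, fs=FS, numtaps=257):
    """Apply a zero-phase FIR band-pass filter along the time axis.

    numtaps is an illustrative choice, not a value reported in the paper.
    """
    taps = firwin(numtaps, [low, high], pass_zero=False, fs=fs)
    return filtfilt(taps, [1.0], eeg, axis=-1)


def segment(eeg, seg_len=SEG_LEN):
    """Split a (channels, samples) array into non-overlapping segments.

    Trailing samples that do not fill a whole segment are dropped.
    Returns an array of shape (n_segments, channels, seg_len).
    """
    n_channels, n_samples = eeg.shape
    n_segments = n_samples // seg_len
    trimmed = eeg[:, : n_segments * seg_len]
    return trimmed.reshape(n_channels, n_segments, seg_len).transpose(1, 0, 2)


# Synthetic 50-second recording (the minimum task duration mentioned above).
rng = np.random.default_rng(0)
raw = rng.standard_normal((N_CHANNELS, 50 * FS))

filtered = bandpass_fir(raw)
segments = segment(filtered)
print(segments.shape)  # (6, 19, 1024)
```

A 50-second recording at the assumed rate holds 6,400 samples, so it yields six full segments and the remaining 256 samples are discarded.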
MI quantifies the statistical dependence between two variables. In the context of feature selection for ADHD classification using EEG data, MI helps identify EEG features highly correlated with the target variable (i.e., ADHD diagnosis). This method is powerful for selecting the most relevant and informative features in a dataset. PCA is a dimensionality-reduction technique. It reduces the number of features in a dataset while preserving the most important information. The use of both methods in this study is described below.
MI gauges the dependence between two variables. In this study, MI was used to identify the EEG features that were most informative for distinguishing children with ADHD from healthy controls. Features with high MI values are crucial as they provide more information regarding the diagnosis of ADHD. To calculate the MI, the minimum MI values between each EEG feature and class label (ADHD vs. HC) were computed using the mutual_info_score function in Python’s scikit-learn library.
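As a minimal sketch of how MI between one feature and the class label might be computed with `mutual_info_score`, the example below bins a continuous feature before scoring it, since that function expects discrete labels. The equal-width binning, the number of bins, the helper name `feature_label_mi`, and the synthetic data are all assumptions made for illustration. The text does not state how the continuous EEG features were discretized or how the resulting scores were ranked.

```python
import numpy as np
from sklearn.metrics import mutual_info_score


def feature_label_mi(feature, labels, n_bins=10):
    """Estimate MI (in nats) between a continuous feature and class labels.

    The feature is discretized into equal-width bins first; n_bins is an
    illustrative choice, not a value taken from the paper.
    """
    edges = np.histogram_bin_edges(feature, bins=n_bins)
    binned = np.digitize(feature, edges[1:-1])  # bin indices 0 .. n_bins - 1
    return mutual_info_score(labels, binned)


# Synthetic stand-in for the cohort: 61 ADHD (label 1) and 60 HC (label 0).
rng = np.random.default_rng(0)
labels = np.array([1] * 61 + [0] * 60)

# One feature that tracks the label and one that is pure noise.
informative = labels + 0.3 * rng.standard_normal(labels.size)
noise = rng.standard_normal(labels.size)

mi_scores = {
    "informative": feature_label_mi(informative, labels),
    "noise": feature_label_mi(noise, labels),
}
for name, score in mi_scores.items():
    print(f"{name}: {score:.3f}")
```

On such data, the feature that depends on the label receives a clearly larger score than the noise feature, which is the property a feature-ranking step relies on.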
Finally, topographic maps were generated to visualize the spatial distribution of the most informative EEG features. These maps, created using the minimum norm estimate (MNE) toolbox, provide a visual perspective on how these significant features are distributed across the scalp.
Figure 2 displays a topographic comparison of features between ADHD and HC based on minimum MI. The figure shows that several features are significantly different between individuals with ADHD and controls. These features include:
The figure also illustrates that the direction of the difference in MI between ADHD and control individuals was not always consistent. For instance, the MI between individuals with ADHD and control individuals was higher for the temporal region of the brain in some cases (e.g., Ept and TB) and lower in others (e.g., P3 and Pz). This suggests that the underlying neural mechanisms of ADHD are complex and that there is no single feature that can reliably distinguish between individuals with ADHD and controls.
This process entails reducing the dimensionality of the features chosen through MI. The cumulative variance plot in Figure 3 visually indicates the proportion of explained variance with an increasing number of components, which helps to identify an optimal cutoff. Here, the 95% cutoff threshold guided the selection of the number of components. Subsequently, PCA was applied to transform the original dataset into a new, more compact representation with a reduced number of dimensions (five in this instance). Five PCA components were selected as they accounted for over 95% of the variance in the brain network data.
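The 95% cutoff rule can be illustrated with scikit-learn as follows. This is a sketch on synthetic data, not the study's feature matrix: the matrix size, the low-rank structure, and the noise level are assumptions, so the number of components it selects need not be five.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic feature matrix: 121 subjects, 30 MI-selected features (assumed sizes),
# generated from a few latent factors plus a small amount of noise.
rng = np.random.default_rng(0)
latent = rng.standard_normal((121, 5))
mixing = rng.standard_normal((5, 30))
X = latent @ mixing + 0.05 * rng.standard_normal((121, 30))

# Fit PCA with all components and accumulate the explained-variance ratios.
pca_full = PCA().fit(X)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)

# Smallest number of components whose cumulative variance reaches 95%.
n_components = int(np.searchsorted(cumulative, 0.95) + 1)

# Project the data onto that many components.
X_reduced = PCA(n_components=n_components).fit_transform(X)
print(n_components, X_reduced.shape)
```

Plotting `cumulative` against the component index gives a curve of the kind shown in Figure 3. scikit-learn also accepts a fractional `n_components` (for example `PCA(n_components=0.95)`), which applies a similar variance-based selection internally.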
In this study, we employed an ensemble of ten ML classifiers to detect patterns in EEG data and enhance ADHD classification in children. The classifiers encompassed a range of algorithms to ensure a comprehensive exploration of the dataset. RF [13], MLP [14], KNN [15], ETC [16], and XGB [17] represent various decision-based models. Additionally, SVM [18], LR using stochastic gradient descent (SGD) [19], AdaBoost [20], CART [21], and GBM [22] introduce diverse learning approaches. This broad range of classifiers ensures a thorough examination of EEG data, capturing the nuances and complexities essential for accurate classification of ADHD in children.
Stacking, also known as stacked ensembling, is an ML approach that combines the predictions of multiple base classifiers to improve the overall performance. In this study, ten diverse classifiers were employed, and their outputs were fed into a meta-classifier, specifically an MLP, to form a stacked classifier. This ensemble strategy aims to leverage the strengths of individual classifiers and enhance predictive accuracy and robustness.
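A stacked classifier of this kind can be sketched with scikit-learn's `StackingClassifier`. The example below is illustrative only: it uses four of the ten base learners to keep it short, default or arbitrary hyperparameters, and a synthetic five-feature dataset in place of the PCA-reduced EEG features. None of these settings are taken from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary problem with five input features (assumed stand-in data).
X, y = make_classification(
    n_samples=200, n_features=5, n_informative=3, n_redundant=0, random_state=0
)

# A subset of the base learners named in the text; hyperparameters are illustrative.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("knn", KNeighborsClassifier()),
    ("lr", LogisticRegression(max_iter=1000)),
    ("cart", DecisionTreeClassifier(random_state=0)),
]

# The MLP meta-classifier is trained on out-of-fold predictions of the base learners.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=MLPClassifier(
        hidden_layer_sizes=(16,), max_iter=2000, random_state=0
    ),
    cv=5,
)

# Evaluate the whole stack with five-fold cross-validation.
scores = cross_val_score(stack, X, y, cv=5, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"mean accuracy: {mean_accuracy:.3f}")
```

Training the meta-classifier on out-of-fold predictions (the inner `cv=5`) helps keep it from simply memorizing base-learner outputs on data those learners have already seen.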
It is important to use various metrics when evaluating the performance of a classifier. This is because no single metric can capture all aspects of the classifier performance. In this study, we used the following five performance metrics.
Here, TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
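For reference, the standard definitions of five such metrics in terms of these counts are given below. That the set consists of accuracy, precision, sensitivity, specificity, and F1-score is an assumption based on the metrics discussed in the results.

```latex
\begin{align}
\text{Accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN}, \\
\text{Precision}   &= \frac{TP}{TP + FP}, \\
\text{Sensitivity} &= \frac{TP}{TP + FN}, \\
\text{Specificity} &= \frac{TN}{TN + FP}, \\
\text{F1-score}    &= \frac{2 \cdot \text{Precision} \cdot \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}}.
\end{align}
```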
The performance comparison results of the ten ML classifiers for the ADHD classification task are shown in Figure 4. The stacked classifier exhibited the best performance, with 92% accuracy, 91% precision, 93% sensitivity, 93% specificity, and an F1-score of 92%.
Furthermore, based on the confusion matrix shown in Figure 5, the classifier detected 56 TP, 5 FP, 5 FN, and 55 TN. Five-fold cross-validation was applied, and the average of each performance metric was reported. This indicates that the stacked classifier potentially learned more complex patterns in the data than the other classifiers.
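As a worked check, the standard metric definitions can be applied directly to the confusion-matrix counts quoted above. The sketch below assumes those standard definitions and pools all 121 subjects, so its values can differ by about a percentage point from the fold-averaged figures reported in the text.

```python
# Confusion-matrix counts reported for the stacked classifier.
tp, fp, fn, tn = 56, 5, 5, 55
total = tp + fp + fn + tn  # 121 subjects (61 ADHD, 60 HC)

accuracy = (tp + tn) / total
precision = tp / (tp + fp)
sensitivity = tp / (tp + fn)  # also called recall
specificity = tn / (tn + fp)
f1 = 2 * precision * sensitivity / (precision + sensitivity)

for name, value in [
    ("accuracy", accuracy),
    ("precision", precision),
    ("sensitivity", sensitivity),
    ("specificity", specificity),
    ("f1", f1),
]:
    print(f"{name}: {value:.3f}")
# accuracy: 0.917
# precision: 0.918
# sensitivity: 0.918
# specificity: 0.917
# f1: 0.918
```

The counts are consistent with the cohort sizes: TP + FN = 61 children with ADHD and TN + FP = 60 healthy controls.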
The XGB and ETC also demonstrated strong performance, achieving accuracies of 88% and 79%, respectively. These classifiers are ensemble-learning methods that combine the predictions of multiple individual classifiers to produce a final prediction. This is one reason why they were able to achieve such good performance.
In contrast, the other classifiers exhibited lower performances, with accuracies ranging from 54% to 79%.
Overall, the results indicate that ensemble learning methods hold promise for ADHD classification. The stacked classifier achieves the best performance, whereas the XGB and ETC perform satisfactorily.
Here are some additional observations from the results:
• The stacked classifier had the highest precision and specificity, indicating that it was very good at correctly identifying both positive and negative ADHD cases.
• The XGB classifier exhibited the highest sensitivity, indicating its strong ability to detect all cases of ADHD, even if it resulted in false positives.
• The MLP and SVM classifiers had the lowest performance, suggesting that they were unable to learn complex patterns in the ADHD data as well as the other classifiers.
• Varied MI differences between ADHD and HC were observed in a variety of brain regions, suggesting that ADHD is a complex neurodevelopmental disorder that affects multiple brain regions.
• The direction of the MI differences was not always consistent, suggesting that there is no single neural signature for ADHD.
• The observed MI differences were relatively small, suggesting that ADHD is a heterogeneous disorder with a wide range of presentations.
In conclusion, our findings offer important insights into the neural mechanisms underlying ADHD. The results indicate that MI could be used to develop more accurate diagnostic tools for ADHD, and that a multimodal approach to ADHD diagnosis and treatment may be most effective.
This study extensively explored the use of ML techniques for the classification of children with ADHD and HC using EEG data. By employing MI-based feature selection, dimensionality reduction via PCA, and a stacked classifier ensemble, we significantly improved diagnostic accuracy. The robust performance of the individual classifiers, particularly the XGBoost model, highlights the potential of advanced ML algorithms to discern subtle patterns in EEG data associated with ADHD. The stacked classifier exhibits superior accuracy, emphasizing the benefits of combining diverse classifiers to enhance diagnostic outcomes. Our findings contribute to the growing body of literature on ADHD diagnosis, showcasing the effectiveness of ML methodologies and emphasizing the importance of feature selection and ensemble learning strategies. Further research on larger and more diverse datasets is needed to consolidate and extend our findings, with the ultimate goal of refining ADHD diagnosis and advancing our understanding of neurodevelopmental disorders.
This study showcased the potential of utilizing minimum MI-based features and a stacked classifier for ADHD diagnosis. However, the balanced and preprocessed nature of the dataset may limit the generalizability of the findings to real-world clinical settings. Future investigations should evaluate the performance of stacked classifiers on imbalanced datasets that reflect wider patient variations. Furthermore, exploring the integration of different neuroimaging modalities with EEG features could offer valuable avenues for enhancing diagnostic accuracy and deepening our comprehension of ADHD.
Figure 1. Proposed stacked classifier-based framework.
Figure 2. Topographic maps of minimum mutual information values between EEG features and ADHD diagnosis, comparing ADHD and HC.
Figure 3. Cumulative variance explained by principal components after feature selection using MI.
Figure 4. Performance comparison of stacked classifier with other individual ML models for ADHD classification.
Figure 5. Confusion matrix of stacked classifier for ADHD classification.