A Deep Dive into EEG Analysis for Predicting Neurological Outcomes
Today we are joined by Allan Moser, Jackie Le, and Lys Kang of Team Swarthbeat, who are going to talk about their approach to this year’s George B. Moody PhysioNet Challenge. Their code and research paper can be found in this GitHub repository. Over to you guys!

Figure 1. From left, Team Swarthbeat: Allan Moser, Jackie Le, Lys Kang
Introduction to Swarthbeat and the PhysioNet Challenge 2023
Neurological recovery post-cardiac arrest is critical yet challenging to predict accurately. The goal of the 2023 George B. Moody PhysioNet Challenge was to develop open-source software to predict good and poor neurological outcomes for patients after cardiac arrest using longitudinal electroencephalograms (EEGs) and other patient information. This challenge was rooted in a critical medical need: to enhance the prognostication of neurological recovery or impairment following cardiac arrest. Such predictions are vital for informing treatment decisions and counseling patients’ families. EEG data, with its intricate patterns and subtle nuances, provides a rich yet complex source of information for these predictions. The task at hand was not only to analyze this data, but to translate it into meaningful, actionable insights for patient care.
We represent Team Swarthbeat, an undergraduate group affiliated with Swarthmore College. Seniors Lys Kang and Jackie Le, both majoring in engineering, collaborated closely with Visiting Professor of Engineering Allan Moser to participate in this challenge.
Our team approached this challenge by treating the selection of model, features, and parameters as an optimization problem. The objective was to maximize the true positive rate (how often we correctly predicted a ‘poor’ outcome) while keeping the false positive rate (how often we predicted a ‘poor’ outcome for a patient who actually had a good one) below 5%. To do this, we leveraged various EEG feature extraction techniques, including electrode selection and standard frequency-domain quantities. The challenge score ranged from 0.00 to 1.00 and measured the true positive rate, given a false positive rate of less than 5%, at 72 hours after return of spontaneous circulation (ROSC). Performance at 12, 24, and 48 hours after ROSC was also measured, but these results were not used for the final challenge score. The 5% cap on the false positive rate was chosen because a false prediction of a poor outcome is very serious: life support could be withdrawn from a patient with the potential to recover on the basis of the algorithm’s prognosis.
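To make the scoring idea concrete, here is a minimal Python sketch of a "true positive rate at a capped false positive rate" metric. It is an illustration only, not the official PhysioNet scoring code or our MATLAB implementation. The function name `challenge_score`, the threshold sweep, and the choice to treat the 5% cap as inclusive are all assumptions made for this example.

```python
def challenge_score(labels, scores, max_fpr=0.05):
    """Largest true positive rate reachable while the false positive rate stays <= max_fpr.

    labels: 1 for a poor outcome (the positive class), 0 for a good outcome.
    scores: predicted probability of a poor outcome, one value per patient.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one patient with each outcome")

    best_tpr = 0.0
    # Try every observed score as a decision threshold (predict poor if score >= t).
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        if fp / n_neg <= max_fpr:
            best_tpr = max(best_tpr, tp / n_pos)
    return best_tpr


# Toy example: one good-outcome patient (score 0.5) outranks two poor-outcome patients,
# so only the two highest-scoring poor-outcome patients can be flagged without a false positive.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.1, 0.05]
print(challenge_score(labels, scores))
```

In this toy data the cap forces a high threshold, so the score reflects only the poor-outcome patients the model ranks above every good-outcome patient.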
Our Technical Approach
Our journey began with MATLAB example code (provided by the PhysioNet Challenge organizers), diving straight into the realm of signal processing. We built on and experimented with this example code to create a new, more accurate algorithm. Figure 2 highlights how we tackled this multifaceted challenge:

Figure 2. PhysioNet Challenge submission workflow
Figure 3 highlights the timeline of our approaches during the challenge.

Figure 3. Sequence of approaches taken during the PhysioNet Challenge
Considering the massive size of the dataset (2.63 TB) provided by the challenge organizers, the largest in PhysioNet’s history, we decided to incorporate high-performance computing into our workflow by utilizing our campus supercomputer, Strelka. We would write MATLAB programs on our local machines to run with small subsets of the PhysioNet data before transferring these scripts to Strelka to execute larger batch jobs and evaluate our results.
Our approach to the challenge can be broken down into three primary categories:
1. Preprocessing the EEG Data
We started with signal preprocessing, which involved resampling over 2 terabytes of 19-channel EEG data to a uniform rate and applying filters to remove noise and artifacts, some of which appeared as dead or corrupted EEG channels.
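The two ideas in this step, bringing recordings to a common sampling rate and spotting dead channels, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not our MATLAB pipeline: the helper names, the flatness threshold `min_std`, and the channel names are hypothetical, and the linear interpolation shown here omits the anti-aliasing filter that a production resampler would apply.

```python
import statistics


def flag_dead_channels(channels, min_std=1e-6):
    """Return the names of channels whose signal is essentially flat."""
    return [name for name, samples in channels.items()
            if statistics.pstdev(samples) < min_std]


def resample_linear(samples, rate_in, rate_out):
    """Resample to rate_out by linear interpolation (no anti-aliasing filter)."""
    duration = len(samples) / rate_in
    n_out = int(round(duration * rate_out))
    out = []
    for i in range(n_out):
        pos = i * rate_in / rate_out          # fractional index into the input
        lo = min(int(pos), len(samples) - 1)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


# Hypothetical two-channel record: one flat channel and one active channel.
record = {"Fp1": [0.0] * 10, "C3": [1.0, -1.0] * 5}
print(flag_dead_channels(record))          # the flat channel is flagged
print(resample_linear([0, 1, 2, 3], 4, 2)) # halve the sampling rate
```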
2. Feature Extraction
The next step was feature extraction, where we divided the EEG data into different attributes: patient information, time-domain features, and frequency-domain features. For each EEG record, we extracted features such as signal amplitude, power spectral density, and coherence.
When plotted in time, the raw EEG data was difficult to analyze, as can be seen in Figure 4. As a result, these signals were examined in the frequency domain to calculate more predictive quantities for accurately prognosticating patient outcomes, as shown in Figure 5.
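As an illustration of the frequency-domain quantities mentioned above, the following Python sketch computes the power in each brainwave band from a plain discrete Fourier transform and then forms a delta-to-theta ratio. It is a teaching example rather than our MATLAB feature code. The band edges in `BANDS` are commonly used values that we assume here, and the function name `band_powers` is hypothetical.

```python
import cmath
import math

# Assumed band edges in Hz; conventions vary slightly between labs.
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0),
         "alpha": (8.0, 12.0), "beta": (12.0, 30.0)}


def band_powers(samples, rate):
    """Sum one-sided DFT power into each band (naive O(n^2) transform)."""
    n = len(samples)
    powers = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2):
        freq = k * rate / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        # Factor 2 folds the matching negative-frequency bin into this one.
        power = 2 * abs(coeff) ** 2 / n ** 2
        for name, (lo, hi) in BANDS.items():
            if lo <= freq < hi:
                powers[name] += power
    return powers


# Synthetic 2-second signal at 64 Hz: a 2 Hz (delta) tone plus a weaker 10 Hz (alpha) tone.
rate, n = 64, 128
signal = [math.sin(2 * math.pi * 2 * i / rate)
          + 0.5 * math.sin(2 * math.pi * 10 * i / rate) for i in range(n)]
p = band_powers(signal, rate)
print(p["delta"], p["alpha"])  # power of a sine of amplitude A is A^2 / 2
```

A sine of amplitude A carries power A²/2, so the delta tone should contribute about 0.5 and the alpha tone about 0.125. Ratios such as delta-to-alpha follow directly from the returned dictionary.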

3. Machine Learning and Optimization
The last step was to implement and work through the optimization problem, which we approached in three ways.
Cross Validation Strategy:
We trained on balanced data drawn from 607 patient records, with outcomes split equally between good and poor. Because fewer good-outcome records were available, we randomly sampled poor-outcome records to match their number. We repeated this for five randomized subsets, built a predictive model on each, and determined final predictions by averaging the scores from each balanced iteration.
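The balancing and averaging described above can be sketched as follows. This is a minimal Python illustration, not our MATLAB code: the function names, the patient identifiers, and the fixed random seed are assumptions made for the example, and model training itself is left out.

```python
import random


def balanced_subsets(good_ids, poor_ids, n_subsets=5, seed=0):
    """Pair all good-outcome records with an equally sized random draw of poor-outcome records."""
    if len(poor_ids) < len(good_ids):
        raise ValueError("expected at least as many poor-outcome as good-outcome records")
    rng = random.Random(seed)
    return [list(good_ids) + rng.sample(list(poor_ids), len(good_ids))
            for _ in range(n_subsets)]


def average_scores(per_model_scores):
    """Average the predicted probabilities of several models, patient by patient."""
    return [sum(col) / len(col) for col in zip(*per_model_scores)]


# Hypothetical identifiers: 3 good-outcome patients, 10 poor-outcome patients.
good = [0, 1, 2]
poor = list(range(100, 110))
for subset in balanced_subsets(good, poor):
    print(subset)

# Two models scoring the same two patients; the final prediction is the mean score.
print(average_scores([[0.2, 0.8], [0.4, 0.6]]))
```

Each subset keeps every good-outcome record and a different random draw of poor-outcome records, so each model sees a 50/50 class split while the ensemble as a whole still uses much of the poor-outcome data.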
Feature Significance Estimation for Selection to Maximize Score:
Using the OOBPredictorImportance option of MATLAB’s TreeBagger method, we estimated the importance of each of the 622 features from out-of-bag instances, with changes in error indicating feature significance. A final significance value was obtained by averaging the 5-fold cross-validation results. To optimize the PhysioNet Challenge score, a variable significance threshold was applied, which initially selected 24 features across different electrodes and attribute groups. However, because all of the data was exposed during feature significance estimation, the resulting model was overtrained, as revealed by its challenge score on a validation set.
Going back to the original 622 features, we eliminated the features with lowest significance, but retained all features within a given class, such as the bandpower for all electrodes. This reduced the number of features in our model to 170, which improved the challenge score on the validation set to 0.687. Further optimization was achieved by experimenting with pairs of EEG channels, with a 6-channel combination yielding the best score. To balance prediction models for poor and good outcome cases, random sampling and probability averaging were employed. In the absence of EEG data, the VFIB value (whether a shockable rhythm was induced) served as the prediction basis.
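One way to read the class-level selection described above is sketched below in Python: average each feature's importance over the folds, then keep every feature belonging to a class in which at least one feature clears the threshold. This is our simplified interpretation for illustration, not the exact rule in our MATLAB code. The feature names, class labels, threshold, and function names are all hypothetical.

```python
def mean_importance(per_fold):
    """Average per-fold importance values feature by feature."""
    return {f: sum(fold[f] for fold in per_fold) / len(per_fold)
            for f in per_fold[0]}


def select_by_class(importance, feature_class, threshold):
    """Keep all features of any class that has at least one feature at or above threshold."""
    kept_classes = {feature_class[f] for f, v in importance.items() if v >= threshold}
    return sorted(f for f in importance if feature_class[f] in kept_classes)


# Hypothetical importances from two folds for three features.
folds = [
    {"delta_Fp1": 0.4, "delta_C3": 0.0, "amplitude_Fp1": 0.0},
    {"delta_Fp1": 0.6, "delta_C3": 0.2, "amplitude_Fp1": 0.1},
]
classes = {"delta_Fp1": "bandpower_delta", "delta_C3": "bandpower_delta",
           "amplitude_Fp1": "amplitude"}

importance = mean_importance(folds)
print(select_by_class(importance, classes, threshold=0.3))
```

Here the delta bandpower on the second electrode is retained despite its low individual importance, because another member of its class is significant. The amplitude class is dropped entirely.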
Selection of Classification Model:
We explored a variety of supervised machine learning classifiers using MATLAB’s Classification Learner app, which provides 32 machine learning models. Through its tools, we were able to rank the significance of features, and found redundancy amongst our large number of features. The models that produced the highest classification accuracies utilized ensemble tree methods, specifically AdaBoost, RUSBoost, and TreeBagger. AdaBoost gave the highest classification accuracy of 78.1%, though it had a rather high false positive rate of 37.1%. RUSBoost and TreeBagger offered slightly lower accuracy but had comparatively better false positive rates. Based on these results, our work considered only boosted and bagged tree ensemble methods for optimal machine learning outcomes. Notably, AdaBoost proved to be the best performer, offering high scores on the validation data.
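The gap between accuracy and false positive rate noted above is easy to reproduce on toy data. The Python sketch below uses made-up labels and predictions, not our results, and the function name is hypothetical. It shows how a classifier can look accurate overall while still mislabeling a large share of good-outcome patients.

```python
def accuracy_and_fpr(labels, predictions):
    """Return (accuracy, false positive rate); 1 = poor outcome, 0 = good outcome."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    n_neg = tn + fp
    if n_neg == 0:
        raise ValueError("need at least one good-outcome patient")
    return (tp + tn) / len(pairs), fp / n_neg


# Made-up data: every poor outcome is caught, but half of the good outcomes are mislabeled.
labels      = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
predictions = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
print(accuracy_and_fpr(labels, predictions))
```

Because the challenge score caps the false positive rate, a model chosen on accuracy alone is not guaranteed to score well, which is why both numbers matter when comparing classifiers.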
The Results and Their Implications
Through our meticulous approach, we reduced the initial 622 features to a representative 59. Our AdaBoost classifier showed promising results, accurately differentiating between good and poor neurological outcomes. The analysis highlighted specific EEG channels and patient features as significant predictors, as shown in Figure 6.

The significant patient features shown in this figure are patient age and VFIB (shockable rhythm). The significant frequency-domain features are the bandpower in all brainwave bands (δ, θ, α, and β); the slope and goodness-of-fit for a linear fit to the delta band; the ratios of bandpower for the delta-to-theta and delta-to-alpha bands; and coherence measurements for opposite-side-of-the-brain electrodes for the delta, theta, and alpha bands. Figure 7 shows the most significant electrodes used for these features. An interesting question that could be investigated is the physiological significance of these electrodes and features for the brain damage caused by the interruption of circulation resulting from cardiac arrest.

Figure 7. Positions of electrodes used for significant features
Ultimately, our optimization approach resulted in a challenge score of 0.72 on the validation data, which placed Team Swarthbeat 3rd during the official phase of the challenge. The final score, determined by the PhysioNet Challenge organizers, used a test set that had been hidden during all training. Our score on this test data was 0.52, which ranked Team Swarthbeat 17th out of the 36 teams that successfully completed the test, among 110 teams that initially registered for the challenge.
We had the opportunity to present our work at the Computing in Cardiology Conference, an annual conference that brings together researchers from around the world who are doing innovative work in the field of computational cardiology. As one of the few undergraduate teams participating in the challenge, this was an exciting opportunity that allowed us to connect with experts in the biomedical field.
Reflecting on Our Findings
Our journey with EEG data revealed the critical role of frequency-domain features and the effectiveness of ensemble machine learning methods in handling complex biomedical data. We learned the importance of fine-tuning and optimizing classifiers to achieve high accuracy in medical predictions.
Future Directions
We are now exploring further enhancements to our methodology, such as incorporating more sophisticated signal processing techniques and expanding our feature set. We are also interested in integrating more sophisticated machine learning algorithms to make better use of our extracted features. The potential applications of our research are vast, and we’re excited to contribute to the evolving field of EEG analysis within the area of cardiovascular health.