The Neurophysiological Biomarker Toolbox (NBT)

# Automatic and Semi-Automatic methods for EEG pre-processing

One of the most fundamental preprocessing steps of EEG analysis is the removal of artifacts from the EEG signal. An artifact is electrical activity from a source other than the brain that contributes to the signal measured by the EEG electrodes. Examples include activity generated by eye blinks, eye movements, muscle contractions (e.g. neck or shoulders), heartbeat, current drift, movement of electrodes on the scalp, or line noise (due to electrical equipment).

Artifacts can be divided into two main categories:

1. Stereotypic artifacts, including eye movements (EOG = O(10² μV); EEG = O(10 μV)) and eye blinks
2. Non-stereotypic artifacts, which are much harder to identify visually, including movements of the electrodes on the scalp, current drift, spurious electrical activity picked up by the EEG amplifier, and muscle contractions (EMG = O(10² μV); EEG = O(10 μV))

Pre-processing often consists of different filtering steps:

• Transient artifact removal
• Independent component analysis (ICA), which separates the EEG signal into maximally independent components, and which is well-suited for the detection of eye blinks.

A major drawback of pre-processing EEG data by hand is that it is time-consuming, especially for long recordings and for studies with many time series. Moreover, distinguishing artifacts from physiological signals requires a certain degree of expertise, and even experts can disagree slightly. In recent years, automatic approaches have been developed to aid in the cleaning of EEG data.

In this section we briefly introduce techniques that have been developed in recent years for (semi-)automated artifact detection and removal. The first method, called FASTER, is from the Neuronal Engineering Group of the Trinity College for Engineering, Dublin, Ireland. The second approach that we discuss is ADJUST, from the INSERM-CEA Cognitive Neuroimaging Unit, Paris, France. The third method is the NBT cleaning protocol, an experimental algorithm that combines FASTER and ADJUST. More detailed information about these techniques can be found in the corresponding reference articles.

Both FASTER and ADJUST come with fully developed open source Matlab functions, and they are both implemented in EEGlab and NBT.

We will also describe how to use these functions within the NBT framework.

## FASTER (Fully Automated Statistical Thresholding for EEG artifact Rejection)

FASTER is an automatic artifact rejection tool. After filtering the data in the selected frequency range (e.g. 1-45 Hz), FASTER can act at different levels of the pre-processing analysis:

1. it removes contaminated channels and uses spherical spline interpolation to reconstruct the removed channels
2. it removes contaminated epochs
3. it removes artifactual Independent Components
4. it inspects and interpolates again bad channels, but this time, within single epochs
5. it allows for computing the grand average of subjects in the analysis, in order to detect outliers in the pool of experimental data.

FASTER uses thresholding methods over several parameters, in each of the above listed aspects. Statistical parameters of the data are calculated and a Z-score of ± 3 for each parameter is used as the metric for defining contaminated data.
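
The thresholding rule can be illustrated with a minimal MATLAB sketch. This is not FASTER's own code: the property matrix `props` is made-up data, and all variable names are assumptions chosen for this example. Each column is standardized to Z-scores, and a row is flagged when any of its parameters falls outside ± 3.

```matlab
% Made-up property matrix: 20 rows (e.g. channels) x 2 statistical parameters.
n     = 20;
props = [1 + 0.1*sin(1:n)', 0.5 + 0.02*cos(1:n)'];
props(20, 1) = 9;                       % plant one extreme value in row 20

z_threshold = 3;                        % |Z| > 3 counts as contaminated

% Standardize each column: subtract its mean, divide by its standard deviation.
mu    = mean(props, 1);
sigma = std(props, 0, 1);
z     = (props - repmat(mu, n, 1)) ./ repmat(sigma, n, 1);

exceeds = abs(z) > z_threshold;         % n x 2 logical: which parameter is out of range
is_bad  = any(exceeds, 2);              % a row is bad if any parameter exceeds
bad_idx = find(is_bad)';                % row indices of the flagged items

fprintf('Flagged rows: %s\n', mat2str(bad_idx));
```

Note that with few rows a single outlier can never reach a large Z-score (the maximum for n values is (n-1)/sqrt(n)), which is one reason such thresholds behave better on data with many channels or epochs.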

See the flow chart below (Source) and the reference paper for more details on the statistical parameters used.

FASTER comes with a GUI, where you can work on files in .set or .bdf formats. You can access the FASTER GUI from the EEGlab interface (go to Tools|Process with FASTER) or from the NBT interface (go to Pre-processing| Auto Clean functions | Run Faster).

If you are using the FASTER GUI and your files are in NBT format, you can use the NBT function nbt_fromNBTtoSET to convert the NBT data to the .set format, and then use nbt_fromSETtoNBT after cleaning the data with FASTER to get your data back in NBT format.

You can also run FASTER directly by calling its functions. In this case you can use the data in NBT format and you do not have to convert your data to the .set format. You will have to use the EEGlab environment, as the FASTER functions require inputs such as EEG (the struct created by EEGlab that contains your data), eeg_chans (vector of the EEG channel indices, e.g. 1:129), ref_chan (the channel you are using as reference), and EOG_chans and blink_chans (eye channels).

Below are the main FASTER functions that you can adapt for specific uses (in this tutorial we cover the first three items: channels, epochs and components). Remember that FASTER works best with epoched signals, so before using these functions we suggest epoching the signal with the EEGlab function EEG = eeg_regepochs(EEG, recurrence,limits,remove_baseline); (e.g. EEG = eeg_regepochs(EEG, 2,[0 2],NaN); produces 2-second epochs).

• for channels
list_properties = channel_properties(EEG,eeg_chans,ref_chan); % one row per channel, one column per statistical parameter
rejection_options.measure=ones(1,size(list_properties,2)); % statistical parameters to use, 1 = use (see flow chart)
rejection_options.z=3*ones(1,size(list_properties,2)); % Z-score threshold for each parameter
[indelec] = min_z(list_properties,rejection_options); % rejected channels
• for epochs
list_properties = epoch_properties(EEG,eeg_chans); % one row per epoch
rejection_options.measure=ones(1,size(list_properties,2)); % statistical parameters to use, 1 = use (see flow chart)
rejection_options.z=3*ones(1,size(list_properties,2)); % Z-score threshold for each parameter
[indelec] = min_z(list_properties,rejection_options); % rejected epochs
• for components
list_properties = component_properties(EEG,blink_chans,lpf_band); % one row per component; lpf_band must be defined beforehand (see the FASTER documentation)
rejection_options.measure=ones(1,size(list_properties,2)); % statistical parameters to use, 1 = use (see flow chart)
rejection_options.z=3*ones(1,size(list_properties,2)); % Z-score threshold for each parameter
[indelec] = min_z(list_properties,rejection_options); % rejected components
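
To picture what the regular epoching recommended before these calls does to the data, here is a minimal sketch in plain MATLAB. It is an illustration only and does not reproduce eeg_regepochs, which operates on the EEGlab struct and also manages events. The sampling rate, the dummy data and all variable names are assumptions made for this example.

```matlab
srate     = 250;                           % assumed sampling rate in Hz
epoch_sec = 2;                             % epoch length in seconds
n_chans   = 4;
n_samples = 2750;                          % 11 s of continuous dummy data
data      = reshape(1:n_chans*n_samples, n_chans, n_samples);

epoch_len = epoch_sec * srate;             % samples per epoch (500)
n_epochs  = floor(n_samples / epoch_len);  % incomplete trailing epoch is dropped
used      = n_epochs * epoch_len;          % number of samples actually kept

% Rearrange channels x samples into channels x samples-per-epoch x epochs.
epochs = reshape(data(:, 1:used), n_chans, epoch_len, n_epochs);

fprintf('%d epochs of %d samples for %d channels\n', n_epochs, epoch_len, n_chans);
```

The channels x points x epochs layout is the one EEGlab uses for epoched data, so each row of an epoch-wise property matrix corresponds to one slice along the third dimension.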

When using the FASTER GUI, the results of the analysis are stored in .log files. If you select the Intermediate option before running FASTER (see figure below), you can also inspect the FASTER job at each step of the analysis. This allows you to assess the quality of the FASTER pre-processing.


Try to load the pre-interpolation dataset and inspect which of the FASTER statistical parameters contributed to the bad channels detection:

[EEG LASTCOM] = pop_loadset; % select pre_inter_... set file
eeg_chans = 1:129;
ref_chan = 129;
c_properties = channel_properties(EEG,eeg_chans,ref_chan);

figure
plot((c_properties(:,1)-mean(c_properties(:,1)))/std(c_properties(:,1)),'r','linewidth',2)
hold on
plot((c_properties(:,2)-mean(c_properties(:,2)))/std(c_properties(:,2)),'b','linewidth',2)
plot((c_properties(:,3)-mean(c_properties(:,3)))/std(c_properties(:,3)),'g','linewidth',2)
plot(1:129,3*ones(1,129),'k-.','linewidth',2)
plot(1:129,-3*ones(1,129),'k-.','linewidth',2)
legend('Mean correlation','Variance of the channels','Hurst exponent','Threshold')
xlabel('Channels')
ylabel('Z-score')
grid off
axis tight

Channels that have at least one parameter outside the Z-score interval are considered contaminated and will then be removed and interpolated. You can use similar code to inspect which parameters contribute to epoch removal and independent component rejection.
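
As a complement to the plot, the same information can be listed as text. The sketch below uses a made-up stand-in for c_properties (the matrix `c_props` and its values are assumptions for illustration, as are the other variable names) and reports, for every flagged channel, which parameters exceeded the threshold. With real data you would replace `c_props` by the output of channel_properties.

```matlab
param_names = {'Mean correlation', 'Variance', 'Hurst exponent'};

% Made-up stand-in for c_properties: 20 channels x 3 parameters.
n = 20;
c_props = [0.8 + 0.05*sin(1:n)', 2 + 0.3*cos(1:n)', 0.7 + 0.02*sin(2*(1:n))'];
c_props(5, 2)  = 40;     % channel 5: extreme variance
c_props(17, 3) = 0.1;    % channel 17: extreme Hurst exponent

% Z-score every column and compare against the +/- 3 threshold.
z = (c_props - repmat(mean(c_props, 1), n, 1)) ./ repmat(std(c_props, 0, 1), n, 1);
exceeds = abs(z) > 3;

% Build one line of text per flagged channel naming the offending parameters.
flagged = find(any(exceeds, 2))';
report  = cell(1, numel(flagged));
for k = 1:numel(flagged)
    ch = flagged(k);
    report{k} = sprintf('Channel %d: %s', ch, strjoin(param_names(exceeds(ch, :)), ', '));
    disp(report{k});
end
```

The same loop works for epochs or components if `param_names` is adapted to the columns returned by the corresponding property function.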

In NBT some FASTER functions are called by nbt_FindBadChannels, where you can select FASTER as an automatic method for bad channel detection, and by nbt_AutoRejectICA where FASTER is used together with ADJUST for detecting artifactual independent components (you need to run ICA and filter ICA before using this function).

## ADJUST (Automatic EEG artifact detection based on the joint use of spatial and temporal features)

ADJUST is an automatic tool for Independent Component Rejection. It also incorporates methods for gross artifact removal, although it is strongly suggested to visually inspect transient artifacts before running the independent component analysis.

ADJUST also relies on thresholding methods, but in this case the thresholds are computed automatically by an Expectation-Maximization algorithm (see References below for a detailed description of the algorithm).
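
To give an intuition for how an Expectation-Maximization procedure can produce a threshold from the data itself, here is a generic sketch that fits a two-Gaussian mixture to a one-dimensional feature and places the threshold where the two weighted densities cross. This is not ADJUST's implementation; the feature values, the initialisation and all variable names are assumptions made for this example.

```matlab
% Made-up 1-D feature values: 40 "clean" components and 10 "artifact" components.
x = [linspace(0.5, 1.5, 40), linspace(4.5, 5.5, 10)]';

% Initial guess: one Gaussian at each extreme of the data, equal weights.
mu = [min(x); max(x)];
sg = [std(x); std(x)];
w  = [0.5; 0.5];

gauss = @(v, m, s) exp(-0.5 * ((v - m) ./ s).^2) ./ (s * sqrt(2*pi));

for it = 1:100
    % E-step: responsibility of each Gaussian for every sample.
    p1 = w(1) * gauss(x, mu(1), sg(1));
    p2 = w(2) * gauss(x, mu(2), sg(2));
    r2 = p2 ./ max(p1 + p2, realmin);   % realmin avoids division by zero
    r1 = 1 - r2;

    % M-step: re-estimate weights, means and standard deviations.
    w  = [mean(r1); mean(r2)];
    mu = [sum(r1 .* x) / sum(r1); sum(r2 .* x) / sum(r2)];
    sg = [sqrt(sum(r1 .* (x - mu(1)).^2) / sum(r1));
          sqrt(sum(r2 .* (x - mu(2)).^2) / sum(r2))];
    sg = max(sg, 1e-6);                 % guard against a collapsing component
end

% Threshold: first point between the two means where the weighted
% log-density of the second Gaussian overtakes that of the first.
logdens = @(v, k) log(w(k)) - log(sg(k)) - 0.5 * ((v - mu(k)) ./ sg(k)).^2;
grid_x  = linspace(mu(1), mu(2), 10001);
k_cross = find(logdens(grid_x, 2) >= logdens(grid_x, 1), 1, 'first');
threshold   = grid_x(k_cross);
is_artifact = x > threshold;            % samples assigned to the upper cluster

fprintf('Threshold = %.2f, %d of %d values flagged\n', threshold, sum(is_artifact), numel(x));
```

The point of this design is that no fixed cut-off such as Z = 3 is chosen in advance: the threshold adapts to the distribution of the feature in each dataset.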

Independent Components are examined on the basis of spatial and temporal features (see flowchart below, Source).

Four classes are then defined:

1. Eye blinks,
2. Vertical eye movements,
3. Horizontal eye movements,
4. Generic discontinuity (including muscle movements).

Each class is defined by specific statistical parameters exceeding certain threshold values (see the figure below).

Like FASTER, ADJUST is an EEGlab plugin (from the EEGlab interface go to Tools|ADJUST to access ADJUST processing).

In NBT some ADJUST functions are called by nbt_AutoRejectICA (in the NBT menu Pre-Processing|ICA|Auto reject ICA components). Prior to running nbt_AutoRejectICA, you need to run ICA and filter ICA components. The autorejection function will evaluate ICA components according to ADJUST and FASTER parameters. The two methods together allow more robust artifact rejection.

You can directly visualize and compare the outcomes provided by FASTER and the ones provided by ADJUST.
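
Outside the GUI, such a comparison boils down to set operations on the two lists of flagged components. The sketch below uses hypothetical component indices and makes no claim about how nbt_AutoRejectICA itself merges the two results; all variable names are assumptions for illustration.

```matlab
% Hypothetical indices of components flagged by each method.
faster_flags = [1 3 7 12];
adjust_flags = [1 2 7 15];

both        = intersect(faster_flags, adjust_flags);  % flagged by both methods
either      = union(faster_flags, adjust_flags);      % flagged by at least one method
only_faster = setdiff(faster_flags, adjust_flags);    % FASTER only
only_adjust = setdiff(adjust_flags, faster_flags);    % ADJUST only

fprintf('Both methods : %s\n', mat2str(both));
fprintf('FASTER only  : %s\n', mat2str(only_faster));
fprintf('ADJUST only  : %s\n', mat2str(only_adjust));
fprintf('Either method: %s\n', mat2str(either));
```

Components flagged by only one method are natural candidates for visual inspection before you decide whether to reject them.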

Although these methods work quite well, we suggest checking whether any artifactual components have gone undetected. These can then be added manually for rejection.

After the independent component analysis, you can visualize the results using different plotting tools: scrolling through the components in time, spectral plots, and plots of the FASTER and ADJUST statistical parameters for each individual component. At this point you can accept or reject every automatically detected component and add new components.

By editing the code directly you can disable the visualization of the rejection results. In this way you accept the output of the automatic algorithms without supervision.

Note: fully automated artifact rejection has the risk of removing too much activity, e.g. activity generated by the brain, or may not detect some artifacts. It is recommended to check the signal of at least a few subjects to confirm whether the cleaning method is suited for your data.

## NBT autoclean protocol

In NBT Alpha RC3b we released a protocol for automatic cleaning of signals using a mixture of both FASTER and ADJUST. This function is highly experimental! However, we believe it works quite well and can save a lot of time otherwise spent on manual cleaning of EEG signals. Please check the cleaning carefully and give us feedback on your experiences (we are also happy to hear suggestions for improvements).

The NBT automatic cleaning can be found in the NBT menu Pre-processing| Auto Clean functions|NBT Auto clean signals. You first need to set the EyeCh (eye channels) and NonEEGch (non-EEG channels) fields using the functions in Pre-processing| Auto Clean functions|Setup. In most cases the EyeCh field should be set to the same channels as the NonEEGch field. See the nbt_AutoClean function for details of the protocol.

# Conclusion

Automatic methods for EEG pre-processing are becoming quite popular in scientific research. The main reason is that they reduce the human time needed to analyze large amounts of data, leaving researchers more room for statistical analysis and data interpretation. These methods are also becoming more sophisticated and robust, looking at both spatial and temporal features.

Despite these powerful advantages, there are some limitations. For example, better performance is achieved with high-density EEG and long recordings. Moreover, these techniques can require substantial processing power and, last but not least, they cannot yet replace the eye of a human expert in the detection of artifacts. Indeed, although the number of false positives in the automatic detection of contaminated channels/epochs/components is very low, there can still be artifacts that the algorithm is not able to detect or classify.

For these reasons, we offer the possibility for the user to choose between 'manual' analysis, completely automated analysis (use of FASTER and ADJUST via EEGlab interface) and semi-automatic analysis, where the algorithms work together and are supervised by the user.


# References

Bell AJ, Sejnowski TJ (1995). An information-maximization approach to blind separation and blind deconvolution. Neural Computation, 7(6):1129-1159.

Nolan H, Whelan R, Reilly RB (2010). FASTER: Fully Automated Statistical Thresholding for EEG artifact Rejection. Journal of Neuroscience Methods, 192(1):152-162.