Why donate?

- Tutorials, follow the NBT course

**Follow us **

You are here: Frontpage : Introduction » Tutorials » Granger causality analysis of EEG data » Part III: Fitting a VAR model to real EEG data

Table of Contents

In the part II we learnt how to create a VAR model with a known topology but having otherwise random coefficients. In this part we are going to learn how to fit a VAR model to some real EEG data, i.e. how to create a VAR model whose coefficients are such that the dynamics of the model match as close as possible the dynamics of the EEG data.

The first thing that you need to do is to clear your workspace and close any figure that may still be open:

close all; clear classes;

Unless you have already done it, you should also run the command:

import external.eeglab.eegplot;

The EEG data is stored in a .mat file located under your `tutorial_sync`

directory. Let's load it into MATLAB's workspace:

load('eeg.mat');

where I have assumed that you are currently in the `tutorial_sync`

directory. If you are not then you can either change your current working
directory (using MATLAB's command `cd`

) or especify the full path of
file `eeg.mat`

when calling to load.

The EEG is sampled at 1000 Hz so let's inspect the data with the command:

eegplot(eeg, 'srate', 1000);

Fitting a model to this EEG data is the same as saying that we are going to learn what coefficient matrices would generate VAR observations that look as close as possible to the EEG dataset. We can perform this learning task using the command:

>> arfitObj = learn(var.arfit, eeg)

arfitObj =

var.arfit Package: var

Properties: MinOrder: 1 MaxOrder: 15 OrderCriterion: 'sbc' AsymptoticMinOrder: 1 DataMean: [7x1 double] Coeffs: [7x77 double] ResCov: [7x7 double] Residuals: [7x23258 double] Order: 11 NbDims: 7

Notice that variable `arfitObj`

is not of class `dynamics.var`

but of a
class called `var.arfit`

:

>> whos Name Size Bytes Class Attributes

arfitObj 1x1 1307335 var.arfit eeg 7x23258 1302448 double

The reason for this is that `var.arfit`

objects are used to _estimate_
VAR models from data while `dynamics.var`

objects are simply used to generate
artificial VAR models of known characteristics.

`MinOrder`

and `MaxOrder`

of object `arfitObj`

are the minimum
and maximum model orders considered by the VAR estimation algorithm. Property
`OrderCriterion`

indicates that
Swartz Bayesian Criterion (SBC)
was used to automatically select, between the minimum and maximum allowed
orders, the model order that matched best the EEG data. Property `Order`

shows
that the SBC criterion decided that the correct model order was `p=11`

.

Let's plot our EEG data together with the innovation process of its VAR model:

eegplot(eeg, 'srate', 1000,'data2', arfitObj.Residuals);

Notice how the innovation process of the model is now stored in a property
called `Residuals`

. The reason is that most VAR estimation algorithms estimate
the model coefficients by minimizing the variance of the innovation process.
Thus the name of `Residuals`

, because it is the residual that is left after
fitting the VAR coefficients to the data.

You may be tempted to think that the smaller the variance of the residuals (i.e. of the estimated innovation process) the better the model fit. However, you would be wrong. Remember that when we defined the VAR model we did not say anything about the variance of the innovation process. That is, in the model equation:

where we did not say anything about the variance of . The only thing that we did say was that the innovation process is an independent and identically distributed (i.i.d.) process. This means that is completely unrelated (independent from) for any . And the same should hold for the other components of the innovation process . An obvious consequence of this requirement is that the individual innovation processes should not be autocorrelated. We can test whether this model requirement is fulfilled by plotting their corresponding autocorrelation functions:

for i = 1:7

figure; misc.acf(arfitObj.Residuals(i,:),50); end

which should create these figures (click on them to zoom in):

The dashed-dotted lines mark the 95% confidence interval for the autocorrelation of an i.i.d. Gaussian process with the same number of data samples. Ideally, you would like all the autocorrelation values to be within those two lines (points outside of that margin are marked with asterisks). Indeed, it seems that the fit is not perfect but it will (almost) never be when you work with real data. You can get a qualitative idea of how good your model fit is by comparing the autocorrelation functions of the model residuals with the autocorrelation of the EEG data that you are attempting to model:

for i = 1:7

figure; misc.acf(eeg(i,:),50);print('-dpng', ['eeg-acf-' num2str(i)]); print('-dpng',['eeg-acf-' num2str(i)]); end

which will generate the figures:

As you can see, the autocorrelations of the original EEG data are much larger than the autocorrelations of the model residuals. Therefore you may say that the model fits your data reasonably well. But to really consider it a good fit we also have to take into consideration how large our model model order is relative to the amount of data that we used to _learn_ the model coefficients. The question below is related to this issue.

You can force the VAR estimator to use a especific model order (instead of the order chosen by the SBC criterion) by learning the model with the following command:

arfitObj = var.arfit('MinOrder', 30, 'MaxOrder', 30); % will use order 30 arfitObj = learn(arfitObj, eeg);