SOUND
Detect and remove recording noise with SOUND
TMS–EEG data often suffer from various noise components. TESA enables the use of the SOUND algorithm to automatically detect and remove recording noise. SOUND is an efficient way to clean noisy channels and to suppress noise components whose topography differs significantly from neuronal sources. SOUND uses spatial Wiener filtering to clean the data. It is based on cross-validating the channels with the help of the conductive profile of the head. In addition to TMS–EEG data, SOUND can be applied to any other multi-sensor EEG or MEG data.
Example of TMS–EEG data before and after SOUND correction. The underlying red curves correspond to the original data, whereas the black curves show the same data after the SOUND correction.
The basic idea of SOUND is simple. The noise level of each channel is evaluated iteratively by comparing the trace of the channel of interest against the recordings of all the other channels. SOUND uses minimum-norm estimation (MNE) to find the most likely cortical current distribution given the recordings of the other channels. From the obtained MNE, SOUND estimates the most likely signal in the channel of interest. The noise level of the analysed channel is then set according to the discrepancy between the estimated and the actual recording in that channel. After the noise level in one channel has been evaluated, SOUND estimates the noise level in the remaining channels in the same way. As the iterations proceed, the noise estimates, and thus also the MNEs, become increasingly precise. Once SOUND has converged to the final channel noise-level estimates, it computes the final noise-suppressed MNE using all the channels simultaneously. From this noise-suppressed MNE, the final cleaned versions of the channel traces are calculated.
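The iteration described above can be illustrated with a toy example. The following is a minimal sketch on simulated data, assuming an identity source covariance and a random lead field. It is a simplified illustration, not the TESA implementation, and all variable names (L, Y, K, sigma2, lambda, nIter) are hypothetical.

```matlab
% Toy illustration of the SOUND idea on simulated data.
% Simplified sketch only; NOT the TESA implementation.
rng(1);                                  % reproducible random numbers
nChan = 20; nSrc = 5; nTime = 500;
L = randn(nChan, nSrc);                  % hypothetical lead field (channels x sources)
clean = L * randn(nSrc, nTime);          % noiseless sensor signals
noiseStd = ones(nChan, 1);
noiseStd(3) = 20;                        % make channel 3 very noisy
Y = clean + diag(noiseStd) * randn(nChan, nTime);

lambda = 0.1;                            % regularization level (assumed value)
nIter  = 5;                              % number of iterations (assumed value)
K = L * L';                              % sensor-space signal covariance (identity source prior)
sigma2 = ones(nChan, 1);                 % initial noise-variance guess for each channel

for it = 1:nIter
    for i = 1:nChan
        others = setdiff(1:nChan, i);
        % Predict channel i from all the other channels via a regularized MNE
        W = K(i, others) / (K(others, others) + lambda * diag(sigma2(others)));
        yHat = W * Y(others, :);
        % The discrepancy between prediction and recording sets the noise level
        sigma2(i) = mean((Y(i, :) - yHat) .^ 2);
    end
end

% Final noise-suppressed estimate using all channels simultaneously
Yclean = (K / (K + lambda * diag(sigma2))) * Y;

[~, noisiest] = max(sigma2);
errBefore = mean((Y(3, :) - clean(3, :)) .^ 2);
errAfter  = mean((Yclean(3, :) - clean(3, :)) .^ 2);
fprintf('Noisiest channel: %d, error before: %.2f, after: %.2f\n', ...
        noisiest, errBefore, errAfter);
```

In this sketch the deliberately corrupted channel receives the largest noise estimate, so it is down-weighted when the final estimate is computed from all channels.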
To use MNE in the cleaning process, SOUND requires a forward head model (lead-field matrix or gain matrix). If the user does not provide an appropriate head model as an input, the TESA-SOUND implementation uses a three-layer spherical head model and the theoretical 10–20 electrode locations in the EEGLAB EEG structure to compute the lead field. In most cases, the spherical lead field provides satisfactory results but, when possible, the use of anatomically accurate, subject-specific lead fields is recommended.
When using your own custom lead fields, make sure that the channels are either in the same order as in the EEGLAB EEG data structure or that you have correctly specified the channel order of the lead field in a separate file. If you see unexpected or peculiar changes in your data after SOUND, first check that the channels are correctly defined both in the EEGLAB EEG structure and in the applied lead-field matrix. See more details about loading your lead-field matrices and about specifying the channel order below. Furthermore, if your EEG system records the TMS-induced artefact (most likely any TMS-EEG system apart from Nexstim), the artefact should be removed with the pop_tesa_removedata() and pop_tesa_interpdata() functions prior to using SOUND. Otherwise, the huge TMS-pulse artefacts might completely dominate the noise estimation process of SOUND, leading to suboptimal cleaning.
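Matching the lead-field rows to the EEG channel order can be sketched as follows. This is an illustrative example, assuming the channel labels of the lead field are available as a cell array. The labels, the toy matrix, and the variable names are hypothetical, and this is not TESA's internal sorting code.

```matlab
% Hypothetical channel labels: order in the EEG data vs. order in the lead field
eegLabels       = {'Fz', 'Cz', 'Pz', 'Oz'};      % e.g. taken from {EEG.chanlocs.labels}
leadfieldLabels = {'Pz', 'Fz', 'Oz', 'Cz'};      % order used by the head-modelling software
leadfield       = [3 3 3; 1 1 1; 4 4 4; 2 2 2];  % toy 4 x 3 matrix (channels x sources)

% For each EEG channel, find the matching row of the lead field
% (the comparison is case-sensitive, so the labels must match exactly)
[found, rowIdx] = ismember(eegLabels, leadfieldLabels);
if ~all(found)
    error('Lead field is missing channels: %s', strjoin(eegLabels(~found), ', '));
end

% Reorder the rows so that row k corresponds to EEG channel k
leadfieldSorted = leadfield(rowIdx, :);
```

Stopping with an error when a label is missing is a deliberate choice in this sketch: a silently mismatched lead field would make the cross-validation between channels meaningless.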
In addition to the choice of lead field, two input parameters can affect the cleaning result of SOUND. As stated, SOUND is an iterative algorithm, so the user needs to specify the number of iterations used to evaluate the noise in each channel. With typical 30–60-channel EEG systems, 5–10 iterations are enough to ensure convergence [1]. However, if the data consist of a hundred or more channels, 20–30 iterations is a more conservative choice. Since SOUND uses MNE in the cross-validation process, the regularization level (lambda value) also needs to be chosen. The lambda value controls the amount of cleaning: a larger lambda value removes more noise, but it also increases the risk of over-correcting the signals of interest.
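The effect of the regularization level can be illustrated numerically. The sketch below assumes equal noise levels in all channels and a random toy lead field (both are assumptions, as are all variable names). It only shows that a larger lambda removes a larger part of the recorded signal and does not reproduce how TESA scales lambda internally.

```matlab
% Toy demonstration: a larger lambda removes more of the recorded signal.
rng(2);                                  % reproducible random numbers
nChan = 10; nSrc = 30; nTime = 200;
L = randn(nChan, nSrc);                  % hypothetical lead field (channels x sources)
Y = L * randn(nSrc, nTime) + randn(nChan, nTime);   % signal plus unit-variance noise
K = L * L';                              % sensor-space signal covariance

lambdas = [0.01 0.1 1 10];               % assumed values, for illustration only
removedFraction = zeros(size(lambdas));
for k = 1:numel(lambdas)
    % Regularized estimate with an identity noise covariance
    Yclean = (K / (K + lambdas(k) * eye(nChan))) * Y;
    % Fraction of the recorded signal norm that the cleaning removed
    removedFraction(k) = norm(Y - Yclean, 'fro') / norm(Y, 'fro');
end
disp([lambdas; removedFraction]);
```

The removed fraction grows with lambda, which is why an overly large value risks suppressing the signals of interest along with the noise.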
If the user has already rejected some channels completely from the dataset prior to SOUND, the TESA-SOUND implementation allows these channels to be interpolated with the best possible Wiener estimate rather than with an arbitrary spatial interpolation function (e.g., the spline function). See more details below.
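Replacing a rejected channel with a lead-field-based estimate can be sketched as follows. This is a simplified illustration under stated assumptions (simulated data, identity source covariance, a known and equal noise variance in the retained channels, hypothetical variable names). It is not the TESA code behind the 'replaceChans' input.

```matlab
% Toy example: reconstruct a rejected channel from the retained channels.
rng(3);                                  % reproducible random numbers
nChan = 20; nSrc = 5; nTime = 300;
L = randn(nChan, nSrc);                  % hypothetical lead field for ALL channels
truth = L * randn(nSrc, nTime);          % noiseless signals in all channels
noiseVar = 0.01;                         % assumed noise variance of the retained channels
Y = truth + sqrt(noiseVar) * randn(nChan, nTime);

missing = 7;                             % channel rejected earlier in the pipeline
kept = setdiff(1:nChan, missing);        % channels still present in the data
lambda = 0.1;                            % regularization level (assumed value)

% Estimate the missing channel from the retained ones through the lead field
K = L * L';
W = K(missing, kept) / (K(kept, kept) + lambda * noiseVar * eye(numel(kept)));
estimate = W * Y(kept, :);

relErr = norm(estimate - truth(missing, :)) / norm(truth(missing, :));
fprintf('Relative error of the reconstructed channel: %.3f\n', relErr);
```

Unlike a purely geometric spline, this estimate uses the head model to decide how much each retained channel should contribute, which is why the lead field must include the channel being reconstructed.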
Often the research question requires comparisons between datasets recorded in different conditions. The TESA-SOUND implementation allows the user to clean two or more datasets simultaneously with SOUND to ensure an identical cleaning process and a direct comparison between the conditions. See below for details.
For more information about the algorithm, and for the appropriate citation, see [1].
[1] Mutanen, T. P., Metsomaa, J., Liljander, S., & Ilmoniemi, R. J. (2018). Automatic and robust noise suppression in EEG and MEG: The SOUND algorithm. NeuroImage, 166, 135-151.
1. Select an appropriate lambda value. This controls the amount of cleaning. A larger lambda value removes more noise but also increases the risk of over-correction.
2. Set the number of iterations. Typically, with EEG systems of 30–60 channels, 5–10 iterations are enough. However, if the data consist of hundreds of channel recordings, SOUND might require more iterations to converge. Please refer to the original work for more details.
3. Load a custom lead-field matrix (optional). SOUND requires a forward model of the head for solving the MNE. Users can either provide an individualised/custom lead field or use the TESA default. For custom lead fields, save the lead-field matrix in .mat format. The saved file should contain only an n x m lead-field matrix, n being the number of channels and m the number of sources in the lead field. If the user provides no lead field, TESA automatically uses a spherical three-layer model and the theoretical electrode positions to compute a lead-field matrix.
4. When using a custom lead-field matrix, the user can specify the channel order in the lead field as an input parameter. NOTE: For SOUND to work correctly, it is essential that the channels in the EEG dataset as well as in the uploaded lead field are defined correctly!
5. If some noisy channels have already been removed during previous signal processing steps, TESA's SOUND implementation allows extrapolation of these channels with an optimal SOUND estimate. Provide an EEGLAB .set file that contains all the original channel locations so that TESA can identify and extrapolate the missing channels.
6. Often, the research question requires comparing data collected in different conditions. The TESA SOUND implementation enables the user to clean different conditions simultaneously, ensuring identical SOUND steps for all the datasets and thus an unbiased comparison. To use this feature, first pre-process each dataset of interest and save them to separate EEGLAB variables. When applying SOUND, provide the list of the datasets to be cleaned simultaneously. SOUND outputs only the cleaned version of the input EEG data. The other datasets are saved directly to the MATLAB workspace with the suffix "_after_SOUND".
EEG = tesa_sound(EEG );
%Run SOUND using default values and a spherical leadfield
EEG = tesa_sound(EEG, 'key1',value1... );
%Run SOUND using customised inputs
EEG = pop_tesa_sound(EEG );
%Pop up window
EEG = pop_tesa_sound(EEG, 'key1',value1... );
%Run SOUND using customised inputs
Input
Description
Example
Default
EEG
EEGLAB EEG structure
EEG
-
Key
Input Value
Description
Example
Default
'lambdaValue'
double
Double providing the lambda value for SOUND
10.5
0.1
'iter'
int
Integer providing the number of iterations for SOUND
10
5
'leadfieldInVar'
matrix
An n x m matrix with individualised leadfield where n = channels and m = dipoles.
IMPORTANT – The channel number and order must match the channels in the analyzed EEGLAB structure OR The channel order should be sorted to match EEGLAB using the 'leadfieldChansFile' input below.
leadfield
e.g. where leadfield = 62 x 15006 matrix (channel x dipole) stored in MATLAB workspace
If not specified, a spherical three layer model is used
'leadfieldInFile'
str
File path and name of individualised leadfield matrix where n = channels and m = dipoles.
This command can be used as an alternative to 'leadfieldInVar' and will load the required leadfield matrix into the workspace.
Note that the file must only contain the leadfield matrix and nothing else.
IMPORTANT – The channel number and order must match the channels in the EEG structure OR the channel order should be sorted to match EEGLAB using the 'leadfieldChansFile' input below.
'C:\data\myLeadfield.mat' which contains leadfield variable [e.g. where leadfield = 62 x 15006 matrix (channel x dipole)]
If not specified, a spherical three layer model is used
'leadfieldChansFile'
str
File path and name of cell array listing the channel order of the individual leadfield matrix.
If called, this command will load the indicated cell array and then sort the leadfield matrix channel order to match the EEGLAB data channel order.
This is important if the leadfield has been generated in another program, which stores the channels in the lead field in a different order than in the EEG variable.
NOTE: The channel names must match those in the EEGLAB data.
'C:\data\myLeadfieldChans.mat' which contains cell array with channel order.
'multipleConds'
cell array
A cell string array indicating the variables of the additional datasets that need to be cleaned simultaneously with input variable EEG.
Use this input if you would like to apply SOUND identically to multiple conditions (e.g. pre and post an intervention).
Conditions must first be epoched. Epochs from different conditions identified in the 'multipleConds' input will be separated, averaged across epochs within a condition, and then concatenated in time before performing SOUND.
The names of the additional datasets will also be saved into an extra field in the output variable EEG, i.e., EEG.multipleCondsClean
{'EEG2', 'EEG3'};
For instance:
EEG1 = pop_tesa_sound( EEG1, 'multipleConds', {'EEG2', 'EEG3'} ); would clean the datasets EEG1, EEG2, and EEG3 simultaneously and identically. Note that only the cleaned version of EEG1 is returned as an output; the datasets defined within the 'multipleConds' variable are saved directly to the workspace with the new names EEG2_after_SOUND and EEG3_after_SOUND.
'replaceChans'
str
Path and file name for EEGLAB EEG structure containing all required output channels stored in the EEG.chanlocs field.
This command uses SOUND to replace missing channels (e.g. ones that have been removed earlier in the processing pipeline).
If called, this command will replace any channels missing from the current data set relative to the data set called by 'replaceChans'.
NOTE: This utility also works with the 'multipleConds' option, with the restriction that the same channels must have been removed in all datasets prior to SOUND.
NOTE: When using your own leadfield, it should have channels matching the 'replaceChans'-input dataset.
'C:\data\fullChannelEegData.set';
Outputs
Output
Description
EEG
EEGLAB EEG structure
EEG = pop_tesa_sound(EEG);
%Default command
EEG = pop_tesa_sound(EEG, 'lambdaValue', 0.2, 'iter', 10 );
%Run SOUND using customised input values
EEG = pop_tesa_sound(EEG, 'leadfieldInVar', leadfield, 'leadfieldChansFile', 'C:\data\myLeadfieldChans.mat' );
%Run SOUND using default values and an individual leadfield matrix (stored in variable called leadfield). Sort the leadfield channel order (stored in 'C:\data\myLeadfieldChans.mat') to match the EEGLAB data channel order.
EEG = pop_tesa_sound(EEG, 'leadfieldInFile', 'C:\data\myLeadfield.mat' );
%Run SOUND using default values and an individual leadfield matrix (stored in variable called leadfield loaded from 'C:\data\myLeadfield.mat')
EEG = pop_tesa_sound(EEG, 'multipleConds', {'EEG2','EEG3'} );
%Run SOUND using default values simultaneously across several different TMS-EEG conditions saved to separate variables EEG, EEG2, and EEG3.
EEG = pop_tesa_sound(EEG, 'replaceChans', 'C:\data\fullChannelEegData.set' );
%Run SOUND using default values and replacing missing channels in the current data set (all channels defined in the loaded file).