
1 Introduction

This chapter provides an overview of electroencephalography (EEG) and magnetoencephalography (MEG) to help readers with no previous experience with these modalities understand the information that can be extracted from them and its neurophysiological meaning, with a view to their use for studying brain disorders. These two modalities, which share common characteristics, are often designated together with the acronym M/EEG.

To this end, instead of providing an exhaustive presentation of the M/EEG clinical applications, we focus on the main aspects of these modalities. This chapter is organized as follows: we first describe the basic principles in terms of the origins of the signals and of the electrophysiological activity exploited in M/EEG (Subheading 2). We then present the principles of M/EEG experiments (Subheading 3), the data analysis techniques (Subheading 4), and in particular the features that can be extracted from the data (Subheading 5). The last part of this chapter presents illustrations of M/EEG applications to brain disorders (Subheading 6). To go further, additional resources are provided to the reader in Boxes 1 and 2 and in a dedicated Github repository.

2 Basic Principles

Extracting the information of interest to perform a classification from M/EEG data requires some neurophysiological background knowledge to assess the relevance of the selected features. This section provides some general elements regarding the origin of the signals and the recorded activity.

2.1 Origin of the Signals

Neurons create electrical signals that are transmitted to other cells via synapses. First, an action potential (AP) arrives at a synaptic cleft (step 1 in Fig. 1), where it transmits chemical information via neurotransmitters (step 2 in Fig. 1) that generate postsynaptic potentials (PSPs) and local currents (step 3 in Fig. 1). A PSP creates a current sink and propagates to the cell body, where it generates a current source (step 4 in Fig. 1). As a result, the PSP creates an electrical dipole consisting of a negative pole (i.e., the sink) and a positive pole (i.e., the source). This dipole generates primary (intracellular) currents and secondary (extracellular) currents. M/EEG signals result from postsynaptic potentials; more specifically, they result from the spatial and temporal summation of the activity of a large population of synchronous neurons. But notable differences exist between MEG and EEG.

Fig. 1

Origin of M/EEG signals

Firstly, regarding the signals themselves, MEG signals are mainly caused by the intracellular currents generated by the PSPs at the dendrite level and, to a lesser extent, by the extracellular currents; EEG signals correspond to a difference of electrical potentials, mainly due to extracellular currents. Secondly, regarding the sensitivity to dipole orientation, EEG is sensitive both to radial currents (activity located at the gyrus level) and to tangential currents (generated within sulci), with a stronger sensitivity to radial currents, whereas MEG is more sensitive to tangential currents. Finally, regarding the sensitivity to conductivity, EEG is strongly attenuated and deformed when crossing the skull, whereas MEG is less sensitive to the different layers crossed (i.e., skull, brain, etc.). Such differences between MEG and EEG have an impact on the way data are preprocessed, analyzed, and, therefore, interpreted. The differences between MEG and EEG are summarized in Table 1.

Table 1 Main features to compare MEG and EEG

2.2 Evoked and Oscillatory Activity

There are two main types of electrophysiological activity of interest that are exploited in the M/EEG domain: the evoked and the oscillatory activity. Evoked responses are weak variations of electromagnetic activity resulting from a stimulation (for instance, in response to a task performed by the participant). Given their amplitude, it is often necessary to average over chunks of signal, referred to as epochs, to reduce noise. To identify and describe these evoked responses, there is a specific way to name them according to their latency, amplitude, shape, and polarity. Let's take an example (see Fig. 2), which represents evoked responses from a study where we simulated a visual stimulation. We first see a positive deflection occurring 300 ms after the presentation of the stimulus, which is referred to as the P300. These waves can reflect different mechanisms: the early components are mostly exogenous and related to the stimulus characteristics; the late components are endogenous and related to the performed task and to the subject's state.

Fig. 2

Evoked activity. Results from a simulation where two sources, located in the visual area, generated an activity after a stimulus. On the right, we plotted the associated time course over the scalp (synthetic signals), resulting from the averaging of 1000 repetitions. One can observe notably a positive wave around t = 300 ms. The code to generate this figure is accessible via the dedicated Github repository
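To make the averaging procedure concrete, here is a minimal sketch using MNE-Python and its bundled sample dataset (an assumption: the chapter's own simulation code lives in the dedicated Github repository).

```python
import mne

# Load MNE-Python's bundled sample recording (an assumption; any
# epoched recording with stimulus triggers would do).
path = mne.datasets.sample.data_path()
raw = mne.io.read_raw_fif(path / "MEG" / "sample" / "sample_audvis_raw.fif")
events = mne.find_events(raw, stim_channel="STI 014")

# Cut the continuous recording into epochs around the visual stimuli,
# then average: noise cancels out while the evoked response remains.
epochs = mne.Epochs(raw, events, event_id={"visual/left": 3},
                    tmin=-0.2, tmax=0.5, baseline=(None, 0))
evoked = epochs.average()
evoked.plot()  # look for the deflections described in the text
```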

The oscillatory activity, or induced activity, results from the summation of the activity in a given brain region. These rhythms are mainly defined by their frequency, amplitude, shape, location, and duration. In Fig. 3, we provide examples of the main rhythms found in the literature. Each frequency band is referred to by a Greek letter. Delta ([0.5–3 Hz]) and Theta ([3–7 Hz]) rhythms are detected in deep and light sleep, respectively. Alpha ([8–12 Hz] in posterior areas) and Mu ([7–13 Hz] in central areas) rhythms are both observed in quiet wakefulness and resting state (with the eyes closed for Alpha). The Beta ([13–30 Hz]) rhythm is detected during active wakefulness and during cognitive tasks such as motor imagery, for instance. The Gamma rhythm (divided into two sub-rhythms: slow, in 30–70 Hz, and fast, beyond 70 Hz) is observed during specific cognitive processing.

Fig. 3

We plotted the time course associated with the main rhythms that one can observe from M/EEG recordings. These plots were obtained from synthetic signals. The code to generate this figure is accessible via the dedicated Github repository
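As an illustration of these frequency bands, the following sketch estimates the power of each rhythm from a synthetic signal via Welch's method; here a 10 Hz component stands in for the alpha rhythm.

```python
import numpy as np
from scipy.signal import welch

# Synthetic 10-s "EEG" trace sampled at 250 Hz: a 10 Hz alpha-like
# rhythm buried in noise (for illustration only).
fs = 250
t = np.arange(0, 10, 1 / fs)
signal = 2 * np.sin(2 * np.pi * 10 * t) + np.random.randn(t.size)

# Welch power spectral density, then the mean power inside each band
# defined in the text.
freqs, psd = welch(signal, fs=fs, nperseg=2 * fs)
bands = {"delta": (0.5, 3), "theta": (3, 7), "alpha": (8, 12),
         "beta": (13, 30), "gamma": (30, 70)}
for name, (lo, hi) in bands.items():
    mask = (freqs >= lo) & (freqs < hi)
    print(f"{name}: {psd[mask].mean():.3f}")  # alpha should dominate
```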

3 M/EEG Experiments

This section provides an overview of the devices currently used and the main steps that constitute an M/EEG experiment. As a take-home message, in Table 1, we propose a comparison of the main features of MEG and EEG.

3.1 Instrumentation

3.1.1 EEG

EEG signals are recorded through electrodes placed over the scalp; EEG relies on differences of electrical potentials. The first EEG recordings were performed by Hans Berger in 1924, who described an oscillatory activity at 8 Hz occurring in the posterior area of the scalp when the subject is awake with eyes closed. There are different types of electrodes: wet/dry electrodes and active/passive electrodes. Wet electrodes are generally made of tin, silver, or silver chloride (Ag/AgCl). They need an electrolytic gel to enable conduction between the skin and the electrode. Dry electrodes are made of stainless steel, which acts as a conductor between the skin and the electrode. Active electrodes contain an electronic module that performs a pre-amplification of the signal to ensure the stability of the system with respect to changes in impedance and noise. Passive electrodes do not use a pre-amplification module.

Naming

Even though some differences may be found from one EEG device to another, there are standardized ways to name and localize EEG sensors (also called channels). Each channel is often referred to by a letter and a number. Most of the time, odd numbers are located on the left hemisphere and even numbers on the right hemisphere. The letters correspond to the underlying area: frontal (F), temporal (T), parietal (P), central (C), and occipital (O). In addition to the sensors themselves, one can also find landmarks: the nasion, the inion, and the preauricular points. An example of such naming is shown in Fig. 4b.

Fig. 4

M/EEG instrumentation. (a) EEG experimental setup. (b) Example of EEG montage. For an illustrative purpose, each color corresponds to a brain area. Each circle represents either a sensor or a landmark. Sensors appear in color while landmarks appear in gray. Sensors are designated with a letter and a number. The letter is indicative of the brain region. Odd numbers correspond to the left hemisphere and even ones to the right hemisphere. (c) MEG experimental setup

List of Montages

Depending on the scientific question to be addressed and, therefore, on the brain areas of interest, different montages can be found. One can build an EEG montage from fewer than 5 electrodes up to 256 channels. EEG measurements rely on a difference of electrical potentials; for this purpose, two montages can be considered: the referential montage and the bipolar montage. In the referential montage, each difference of electrical potentials involves an electrode placed over the scalp and a reference; as a result, each scalp electrode is compared to the reference electrode. The choice of the reference is crucial. The most commonly chosen locations for the reference electrode are the mastoids (i.e., the temporal bone behind the ears), even though several studies prefer placing the reference at the vertex (Cz, i.e., the midline center of the scalp). Again, the location depends on the scientific question to be addressed. The bipolar montage consists in computing the difference between two electrodes placed over the scalp, which can be done after the experiment. Another electrode, referred to as the ground electrode, is also used; among the privileged locations is the scapula (i.e., the shoulder blade). An example of an EEG setup and a standard montage are proposed in Fig. 4a, b. For a complete description of the standardized EEG electrode arrays, the reader can refer to [1].
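As an illustration, here is a minimal re-referencing sketch with MNE-Python; the use of the sample dataset and of the average reference (instead of the mastoids, which this dataset lacks) are assumptions.

```python
import mne

# Load a sample recording (MNE-Python's bundled dataset).
path = mne.datasets.sample.data_path()
raw = mne.io.read_raw_fif(path / "MEG" / "sample" / "sample_audvis_raw.fif",
                          preload=True)

# Referential montage: every EEG channel re-expressed against a common
# reference (here the average reference; with mastoid electrodes one
# would pass their channel names instead).
raw_ref = raw.copy().set_eeg_reference("average")

# Bipolar montage: the difference between two scalp electrodes,
# computed after the recording (channel names are the dataset's).
raw_bip = mne.set_bipolar_reference(raw.copy(), anode="EEG 001",
                                    cathode="EEG 002")
```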

Future of EEG Hardware

In the past years, there has been an increasing interest in developing wearable EEG, to remove wires and reduce the size of the device but also to enable long-lasting recordings in less constrained environments. Three bottlenecks need to be overcome: the EEG electrodes, hard to put on and to keep in place on the head; the EEG hardware, which needs to be less power-consuming and miniaturized; and the EEG software, which should provide the most intelligible and reliable information regarding the captured brain activity [2]. In particular, EEG systems relying on dry electrodes have received growing attention: by not requiring conductive gel, they reduce the preparation time. Recent studies relying on commercialized dry-electrode systems show performances close to those obtained with wet electrodes [2].

3.1.2 MEG

Sensors and Main Devices

The difficulty here is to detect signals that are \(10^9\) times weaker than the Earth's magnetic field. The current devices rely on superconducting quantum interference devices (SQUIDs) that can detect such small MEG signals [3]. One of the first proofs of concept was made by D. Cohen in the 1970s [4]. SQUIDs present a sensitivity, defined here as the smallest variation of magnetic field that can be detected by the sensor, of 1 \( fT/\sqrt{Hz} \). To obtain such performance, a magnetic shielding room is required to remove the environmental noise, and a part of the device needs to be cooled via a cryogenic system (see Fig. 4c). Two types of sensors are used to record MEG signals: magnetometers and gradiometers. Magnetometers measure the magnetic field itself, whereas gradiometers, which consist of a combination of magnetometers, measure the gradient of the magnetic field and are used for noise reduction. The main difference from one manufacturer to another lies in the type of gradiometers used:

  • CTF manufacturer: radial gradiometers consisting of two magnetometers placed one above the other

  • MEGIN manufacturer: planar gradiometers consisting of two magnetometers placed side by side

The type of gradiometer has an influence on the way brain activity is recorded and, therefore, on how to interpret the recorded signal [5]. Magnetometers and radial gradiometers are more sensitive to sources around the sensor, whereas planar gradiometers are more sensitive to sources located right below the sensor (Fig. 5).

Fig. 5

Examples of artifacts in M/EEG. (a) Cardiac artifacts recorded with magnetometers. (b) Ocular artifacts recorded with EEG. (c) Power line noise recorded with gradiometers. Given its characteristics, plotting the power spectra makes it easy to identify. The code to generate this figure is accessible via the dedicated Github repository

New Generation of Sensors

The current devices rely on a cryogenic cooling system that engenders technical and financial constraints. New cryogenic-free sensors have recently emerged: the optically pumped magnetometers (OPMs) [6, 7]. Developing cryogenic-free sensors presents two main advantages: an increase in the amplitude of the signal recorded by the sensor and a reduction of the dimensions of the magnetic shielding room. Recent studies showed that OPMs present a better signal-to-noise ratio than EEG [8], can detect deep sources [9], and are suited for pediatric or movement disorder studies [10]. Promising results have also been obtained with triaxial measurements from OPMs [11, 12].

3.2 Data Acquisition

Depending on the tasks and on the hardware used, the duration of an M/EEG experiment may vary. This section aims to present the main steps that constitute the data acquisition.

The first step consists in preparing all the materials needed to perform the experiment. For EEG, this means cleaning the locations where electrodes will be in contact with the skin (e.g., forehead and mastoids). The electrodes and the EEG cap are then placed. Several key distances can be measured to verify that the cap is well placed or to record fiducial points to be matched to other modalities afterward (e.g., MRI). Then, the experimenter needs to ensure that a good contact between the electrodes and the scalp is established. For that purpose, the impedance of each electrode is assessed: the lower, the better. In the case of wet electrodes, the experimenter injects gel at each sensor location. Once the impedances are below a certain threshold, typically a few kOhms, the experiment can start.

Regarding MEG, the experimenter places head-tracking coils to measure the head position before each recording. This helps prevent large head movements that could lead to motion artifacts and errors in the localization of source activity. The locations of fiducial points (nasion, left and right preauricular points) are registered, and the information is stored in each data file. The subject is then placed in the magnetic shielding room after removing all items that could generate magnetic interference with the device (e.g., jewelry, belt). The experimenter helps the subject place his/her head in the MEG helmet. Once the subject is in a comfortable position, the experimenter saves the head position, which will be used as a reference during the whole session.

Once the subject is correctly installed, the experimenter can start some pre-recordings to check the quality of the signal and give specific instructions to the subject accordingly (e.g., loosening the jaw to avoid muscular artifacts). Finally, the experimenter gives further instructions regarding the task to perform before starting the recordings. After the end of the session, the data are stored on dedicated servers to be processed.

4 Data Analysis

This section aims at providing recommendations for analyzing M/EEG data. An overview of the main steps of the M/EEG data analysis is provided in Fig. 6.

Fig. 6
Data analysis in M/EEG: general workflow, from data inspection and artifact removal with quality checks (QC; for MEG, including head movement compensation) to advanced analyses such as time-frequency analysis, connectivity, and networks. Source reconstruction is not compulsory but advisable in specific cases. The code to generate this figure is accessible via the dedicated Github repository

4.1 Types of Artifacts/Noise

The notion of artifacts depends strongly on the signal of interest. Here, we consider as artifacts the signals that make the recording more difficult and may hamper the analysis of the brain activity recorded with EEG and/or MEG. Such artifacts can be divided into three categories: neurophysiological artifacts, environmental noise, and system noise. This section presents their main features.

4.1.1 Neurophysiological Artifacts

This category of artifacts corresponds to noise generated by the subjects themselves, whether voluntarily or not. In a nutshell, it is important to bear in mind that the brain is far from being the only organ that generates electromagnetic activity. In particular, the eyes and the heart produce electromagnetic activity whose amplitude is higher than that of the brain. As a result, the main neurophysiological artifacts are related to cardiac activity and ocular activity (via blinks and saccades) and can be visually spotted during an M/EEG recording (see Fig. 5a, b). A possible way to reduce ocular artifacts is to instruct the subject to avoid moving their eyes and, for short recordings only, to avoid blinking. Another neurophysiological artifact may be induced by the subjects' voluntary motion. Indeed, motion engenders muscular activity that can distort the recorded brain signals. Typical examples are jaw clenching and swallowing, which generate high-frequency activity that propagates to temporal electrodes. In the specific case of MEG, since the device consists of a fixed helmet, it is strongly sensitive to head motion. A possible way to reduce muscular artifacts is to instruct the subjects to remain as still as possible and to avoid moving their jaws.

4.1.2 Environmental Noise

This category refers to the artifacts generated by the environment that surrounds the experimental setup. They can be magnetic (e.g., magnetized devices that can interfere with the MEG sensors), linked with mechanical vibrations (e.g., presence of a tramway nearby), or simply associated with the power line (occurring at 50 Hz or 60 Hz; see Fig. 5c). We do not aim at being exhaustive; we simply want the reader to be aware of the possible sources of environmental noise when analyzing M/EEG signals, even though the Faraday cage and the magnetically shielded room, used respectively in EEG and MEG, can partly prevent them.

4.1.3 System Noise

This category refers to artifacts generated by the sensors themselves. For example, in MEG, one can observe SQUID jumps or saturation. In both MEG and EEG, one can have broken sensors.

4.2 Preprocessing

This section presents the main steps that constitute the preprocessing pipeline, dedicated to artifact removal. This is probably the most crucial part of M/EEG data analysis. Indeed, the point here is to remove noise without eliminating information of interest or distorting the signal. Attention must be paid to building the pipeline best suited to the dataset and to the scientific question to be addressed. As such, the first thing to do when working with a new dataset is to study it extensively, in particular by inspecting the M/EEG signals but also the associated broadband power spectra. This preliminary step makes it possible to identify most of the artifacts and, more importantly, whether they have a specific temporal and/or frequency signature (e.g., presence of periodic artifacts).

From this point, it is possible to choose a specific strategy to remove the observed noise. In the case of cardiac and ocular artifacts, given their clear pattern, an efficient way to isolate and reduce them consists in applying independent component analysis (ICA) [13]. One can visually identify the components to be removed from both their time courses and their topographies (to avoid removing too many components) and manually select them. Another, more reproducible, possibility consists of using biosignals (e.g., electrocardiogram and electrooculogram) and computing correlations between their time series and the components. This technique ensures the robustness of the decision to remove a component.
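A minimal sketch of this ICA-plus-biosignals strategy with MNE-Python, assuming its bundled sample dataset, could look as follows.

```python
import mne
from mne.preprocessing import ICA

# Load the sample recording (an assumption; any raw file would do).
path = mne.datasets.sample.data_path()
raw = mne.io.read_raw_fif(path / "MEG" / "sample" / "sample_audvis_raw.fif",
                          preload=True)
raw.filter(1.0, None)  # ICA behaves better on high-pass filtered data

# Decompose the recording into independent components.
ica = ICA(n_components=20, random_state=42)
ica.fit(raw)

# Reproducible selection: correlate the components with the recorded
# biosignals (EOG channel here; the ECG is estimated from MEG channels
# in this dataset) instead of picking components by eye.
eog_idx, _ = ica.find_bads_eog(raw)
ecg_idx, _ = ica.find_bads_ecg(raw)
ica.exclude = eog_idx + ecg_idx
raw_clean = ica.apply(raw.copy())
```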

In the case of artifacts at a specific frequency (e.g., power line noise at 50 Hz or 60 Hz), one can consider applying notch filters. In the same spirit, in the case of muscular activity, applying a low-pass filter with a cutoff frequency at 40 Hz can be of interest. Nevertheless, one objection can be raised: the signal distortion induced by the filtering. As previously explained, we aim here at finding a trade-off between removing artifacts and preserving the information of interest; that is why the pipeline strongly depends on the scientific question to be addressed. In the case of muscular activity, if one is interested in activity in the gamma band (>30 Hz), applying a low-pass filter would be a poor choice, and removing noisy trials can be an option instead. Regarding head motion, as explained in Subheading 3.2, MEG systems enable the registration of the head position. Methods relying notably on signal space separation [14] can correct small movements (i.e., less than a few centimeters).
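The corresponding filtering operations can be written in a few lines with MNE-Python; the sample dataset (recorded in North America, hence 60 Hz line noise) is an assumption.

```python
import mne

path = mne.datasets.sample.data_path()
raw = mne.io.read_raw_fif(path / "MEG" / "sample" / "sample_audvis_raw.fif",
                          preload=True)

# Notch filters at the power line frequency and its harmonics.
raw.notch_filter(freqs=[60, 120, 180])

# Low-pass at 40 Hz to attenuate muscular activity -- a poor choice,
# as noted in the text, if gamma-band (>30 Hz) activity matters.
raw.filter(l_freq=None, h_freq=40.0)
```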

Another type of artifact is a broken channel. To avoid having a different number of sensors from one subject to another, the proposed solution depends on the sensor location. If the sensor has four neighbors, interpolation strategies can be considered: a virtual sensor is created as a linear combination of the signals recorded by the broken sensor's neighbors. If the sensor is located on the periphery, the interpolation is no longer reliable, and the experimenter may consider removing the channel from the dataset. In the specific case of MEG, after an optional head movement correction step, if SQUID jump artifacts remain, one should consider reapplying the head movement correction on the raw data after having labeled as "bad" the sensors that show jumps. The bad MEG channels will then be reconstructed.
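A minimal interpolation sketch with MNE-Python follows; the choice of the "broken" channel name is an arbitrary assumption.

```python
import mne

path = mne.datasets.sample.data_path()
raw = mne.io.read_raw_fif(path / "MEG" / "sample" / "sample_audvis_raw.fif",
                          preload=True)

raw.info["bads"] = ["EEG 053"]          # label the broken sensor
raw.interpolate_bads(reset_bads=True)   # rebuild it from its neighbors
```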

Once the pipeline has been chosen and tested, it is important to check that the signals have been correctly preprocessed. This step corresponds to the quality check. There are different ways to perform it. A qualitative way consists in superimposing the signals before and after preprocessing (displayed as time series and/or power spectra) and visualizing potential differences. A more reliable way consists in identifying a judgment criterion to assess to which extent the output signals are noisy. Possible metrics are the variance, the z-score, or the kurtosis. Using one of these metrics on the output may lead to both noisy channels and noisy trials being discarded. As a rule of thumb, the eliminated trials must not exceed 10% of the total number of trials, to ensure that enough data remain to perform a relevant analysis [15].
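As a sketch of such a quantitative quality check, the following flags trials with outlying kurtosis; the array shape and the threshold are assumptions.

```python
import numpy as np
from scipy.stats import kurtosis

# Hypothetical epoched array of shape (n_trials, n_channels, n_times).
rng = np.random.default_rng(0)
data = rng.standard_normal((100, 32, 250))

k = kurtosis(data, axis=-1)           # kurtosis per trial and channel
worst = np.abs(k).max(axis=1)         # worst channel within each trial
bad_trials = np.where(worst > 5)[0]   # flag clearly non-Gaussian trials

# Rule of thumb from the text: do not discard more than 10% of trials.
print(f"{bad_trials.size} / {data.shape[0]} trials flagged")
```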

4.3 Source Reconstruction

It is possible to directly analyze the signals recorded by the sensors; in such a case, one says that the analysis is performed in sensor space. However, it is also possible to go one step further and estimate the activity within the brain. This processing step is called source reconstruction and consists in estimating the locations of the neural generators of the M/EEG signals. It can be performed when one wants access to a higher spatial resolution, to provide a more accurate description, and interpretation, of the neurophysiological phenomena at play. For that purpose, both the direct (or forward) problem and the inverse problem need to be solved [15, 16].

4.3.1 Direct Problem

Here, we aim at modeling the electromagnetic field produced by a cerebral source with known characteristics. For that purpose, it is necessary to consider both a physical model of the sources and a model that predicts the way that these sources will generate electromagnetic fields at the scalp level. The simplest model is the spherical model, which considers the head as an ensemble of spheres. Each sphere corresponds to a given tissue (brain, cerebrospinal fluid, skull, or skin) characterized by a given conductivity. Even though it is possible to adjust the spheres to the geometry of the head or restrict them to a limited number of regions of interest, this model is an oversimplification of the head geometry. More realistic models rely on geometrical reconstruction of the different layers that form the head tissues, directly extracted from the anatomical magnetic resonance imaging (MRI) data of, ideally, the participant (the MRI thus needs to be acquired separately) or a dedicated template (e.g., MNI Colin 27). They consist in building meshes of the interfaces between different tissues. We can cite three approaches: the boundary element method (BEM) [17] that is the most widely used, the finite difference method (FDM), and the finite element method (FEM). Another model, called overlapping spheres [18], consists of fitting a given sphere under each sensor.

Even though there are no strict guidelines regarding the choice of the method, we can provide some elements of recommendation: given the high sensitivity of EEG to variations in conductivity, the BEM model can be a tool of choice. As for MEG, being less sensitive to changes in conductivity, the overlapping spheres can be considered.
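For illustration, a BEM-based forward (direct) model can be assembled as follows with MNE-Python, using the anatomy shipped with its sample dataset (paths, filenames, and subject name are the dataset's defaults).

```python
import mne

# Sketch of a BEM-based direct (forward) model on the sample subject.
path = mne.datasets.sample.data_path()
subjects_dir = path / "subjects"
meg_dir = path / "MEG" / "sample"

# Candidate source locations on the cortical surface.
src = mne.setup_source_space("sample", spacing="oct6",
                             subjects_dir=subjects_dir)
# Three-layer BEM (inner skull, outer skull, skin) and its solution.
model = mne.make_bem_model("sample", subjects_dir=subjects_dir)
bem = mne.make_bem_solution(model)
# Forward operator: how each source projects onto the sensors.
fwd = mne.make_forward_solution(meg_dir / "sample_audvis_raw.fif",
                                trans=meg_dir / "sample_audvis_raw-trans.fif",
                                src=src, bem=bem)
```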

4.3.2 Inverse Problem

One of the main challenges of the inverse problem lies in the nonuniqueness of its solution: a large number of brain activity patterns could generate the same signature at the sensor level. Therefore, some constraints or assumptions are essential to lead to a unique solution that best reflects the acquired data [15, 16]. In this section, we provide a short overview of the methods most commonly used in practice.

The dipole modeling methods rely on modeling the sources via a reduced number of equivalent dipoles, each of them representing the activity of one source. As a result, such methods are based on an a priori hypothesis on the number of sources.

Scanning methods, such as the MUSIC approach [19], consist in estimating the probability of presence of a current dipole inside each voxel. Among them are the beamformer methods [20], which consist in applying a spatial filtering to estimate the source activity at each location. We can cite the linearly constrained minimum variance (LCMV) and the synthetic aperture magnetometry (SAM) [21] as examples of beamformer methods [22].

The approaches relying on distributed source models consist in estimating the amplitudes of dipoles located on the cortical surface. The characteristics of the groups of dipoles are fixed or estimated via the individual MRI of the participant. The most famous methods relying on distributed source models are the weighted minimum norm estimate (wMNE) [23, 24] and LORETA [25].

Similar to the preprocessing step, there is no ideal choice of method for the inverse problem, as it depends on the question to be addressed. A general recommendation would be to consider the minimum norm method when expecting distributed sources and the dipole modeling for focal sources.
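A minimal minimum-norm sketch with MNE-Python, using the precomputed forward model and noise covariance shipped with its sample dataset (filenames are the dataset's defaults), could look as follows.

```python
import mne
from mne.minimum_norm import make_inverse_operator, apply_inverse

p = mne.datasets.sample.data_path() / "MEG" / "sample"
evoked = mne.read_evokeds(p / "sample_audvis-ave.fif",
                          condition="Left Auditory", baseline=(None, 0))
fwd = mne.read_forward_solution(p / "sample_audvis-meg-oct-6-fwd.fif")
noise_cov = mne.read_cov(p / "sample_audvis-cov.fif")

# Build the inverse operator and estimate distributed source amplitudes.
inv = make_inverse_operator(evoked.info, fwd, noise_cov)
stc = apply_inverse(evoked, inv, lambda2=1.0 / 9.0, method="MNE")
```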

5 Feature Extraction and Selection

When considering M/EEG from the machine learning perspective, an important aspect is the extraction and selection of features. This section presents the main features that can be extracted from M/EEG. As previously mentioned, the selection of features depends on the scientific question to be addressed but also on the neurophysiological phenomenon underlying the M/EEG experiment. In M/EEG, it is common to filter both in the time domain and in the spatial domain to select the most relevant features.

The two main types of features used in the literature rely on the information in the frequency domain and in the time domain. In an effort of completeness, we will see alternative features that reflect the interconnected nature of the brain.

The event-related features consist of chunks of time series concatenated from all the channels, resulting from a low-pass or band-pass filtering and/or from a down-sampling step. This category of features is relevant when considering evoked activity after the presentation of a given stimulus (e.g., visual, auditory, or sensory). They are therefore of interest when one expects significant changes in signal amplitudes occurring at a given moment. In the example presented in Subheading 2.2, a positive wave occurred 300 ms after the visual stimulation; one could consider using chunks of time series centered at t = 300 ms to automatically detect the P300 wave.
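A sketch of this pipeline, assuming an mne.Epochs object named epochs with two event types (e.g., target vs. non-target in a P300 task):

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# Event-related features: band-pass filtered, down-sampled chunks of
# the epoched time series, flattened into one vector per trial.
epochs_f = epochs.copy().load_data().filter(1.0, 20.0).decimate(4)
X = epochs_f.get_data().reshape(len(epochs_f), -1)  # trials x features
y = epochs_f.events[:, 2]                           # class labels

print(cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5).mean())
```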

Spectral features are used for the detection of oscillatory activity (see Subheading 2.2), when changes in the amplitudes of M/EEG rhythms are expected. The features are associated with the power spectra estimated for a given channel and a given frequency band over a specific time window. Power spectra can be computed via a plethora of methods; we can notably cite the spectrogram, the Morlet wavelet scalogram, and auto-regressive models. For a thorough comparison of spectral feature extraction techniques on EEG signals, please refer to [26].

Spatial filtering can be a valuable tool both for event-related and spectral features [27]. It relies on the combination of signals, recorded from different sensors, to obtain a new one associated with an improved signal-to-noise ratio. We can divide the spatial-filtering methods into three categories. The first one, not data-driven, relies on physical considerations regarding the way the signals propagate through the different brain tissues. The most famous illustration of this category is the Laplacian filter. In its simplest version, the small Laplacian consists, for each electrode location, of a derivation of the EEG waveform via the average signal computed from the four nearest neighbors [28]. The second category of spatial filtering is data-driven and unsupervised. It can rely, for example, on a principal component analysis (PCA) approach (see Chap. 2, Sect. 13.1). The third category is data-driven and supervised. The most famous examples in M/EEG are the common spatial patterns (CSP) for spectral features [29] and xDAWN for event-related features [30]. The CSP consists of a linear combination of EEG signals that maximizes the difference between two classes in terms of variance. The xDAWN approach aims at improving the signal-to-noise ratio of evoked potentials via a projection of the raw EEG signals onto an estimated evoked subspace. Recent efforts have combined these approaches to optimize spectral and spatial filters simultaneously, with, for example, the filter bank CSP (FBCSP) [31].
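As a sketch of the supervised case, the classic CSP-plus-LDA pipeline can be written as follows with MNE and scikit-learn; the epochs object (assumed band-passed to the band of interest, e.g., 8–30 Hz, with two classes) is an assumption.

```python
from mne.decoding import CSP
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X = epochs.get_data()      # trials x channels x times
y = epochs.events[:, 2]    # two class labels

# CSP learns spatial filters maximizing the between-class variance
# ratio; log-power of the filtered signals feeds an LDA classifier.
clf = make_pipeline(CSP(n_components=4, log=True),
                    LinearDiscriminantAnalysis())
print(cross_val_score(clf, X, y, cv=5).mean())
```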

Even though spectral and event-related features are the most used in the M/EEG literature, alternative features have been considered in the past years. Firstly, features relying on covariance matrices have recently been extensively used, in particular for Riemannian geometry-based classification [32]. Despite an unclear neurophysiological interpretation, they have reached state-of-the-art performance and won a large number of competitions. Secondly, new features, which take into account the interconnected nature of brain functioning, have recently emerged [33].

There is a plethora of estimators to assess the intensity of the interactions between brain areas [34]. The most frequent estimators used as features in M/EEG are derived from coherency, i.e., the normalized cross-spectral density obtained from two signals (e.g., the imaginary part of coherence), or rely on the assessment of the phase synchrony between two signals (e.g., phase-locking value (PLV), phase-lag index (PLI)). Here, two challenges need to be dealt with: the volume conduction, which can lead to spurious connectivity, and the online implementation. In the first case, even though some estimators, such as the imaginary coherence, are less sensitive to volume conduction, working in the source space is recommended. In the second case, a large majority of studies that consider estimators of functional interactions between two brain areas (i.e., functional connectivity estimators) as features are performed offline. Estimating brain interactions in real time is not trivial: it consists in finding a compromise between ensuring the quasi-stationarity of the signals and the statistical reliability of the functional connectivity estimation [33].

Recent studies have considered the use of brain network metrics as potential features. Again, there is a plethora of metrics that characterize brain networks [33]; here, we cite the most used ones. At the local scale, the node degree counts the number of connections linking one node to the others. In weighted networks (i.e., without having filtered the connectivity/adjacency matrix), it is referred to as node strength and consists in summing the weights of the connections of the considered node [35]. Another local-scale property of interest is the betweenness centrality, defined as the extent to which a node lies "between" other pairs of nodes via the proportion of shortest paths in the network passing through it. This metric enables the identification of the nodes that are crucial for the information transfer between distant regions. At the global scale, we can cite two metrics: the characteristic path length and the clustering coefficient. The characteristic path length indicates the global tendency of the nodes in the network to integrate and exchange information. The clustering coefficient measures the tendency of a node's neighbors to be mutually interconnected.

Lastly, it is worth noting that the use of heterogeneous features (e.g., relying on both functional connectivity estimators and power spectra) improves the classification accuracy [27]. Such an approach increases the dimensionality, requiring caution in selecting the most relevant features via dimensionality reduction methods.
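Two of these features, the PLV and the node strength, can be computed in a few lines; the signals and the connectivity matrix below are synthetic, for illustration only.

```python
import numpy as np
from scipy.signal import hilbert

# Two synthetic 10 Hz signals with a fixed phase offset plus noise.
fs, dur = 250, 2.0
t = np.arange(0, dur, 1 / fs)
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.standard_normal(t.size)
y = np.sin(2 * np.pi * 10 * t + 0.3) + 0.5 * rng.standard_normal(t.size)

# PLV: mean resultant length of the instantaneous phase difference.
dphi = np.angle(hilbert(x)) - np.angle(hilbert(y))
plv = np.abs(np.exp(1j * dphi).mean())
print(f"PLV = {plv:.2f}")  # near 1: phase-locked; near 0: unrelated

# Node strength: sum of a node's connection weights in a (here random)
# weighted connectivity matrix.
A = rng.random((32, 32))
A = (A + A.T) / 2
np.fill_diagonal(A, 0)
strength = A.sum(axis=1)
```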

Feature selection is a crucial step as it prevents redundancy, ensures the reliability of the features, reduces the dimensionality, and helps in providing interpretable results. In this section, we present the most popular feature selection methods in the M/EEG domain; for a complete description, the reader can refer to [27]. They can be divided into three categories: filter, wrapper, and embedded methods. In filter methods, the feature selection is performed independently of, and before, the classification. Different criteria can be chosen to select features; the most popular is the R2 score, which assesses to which extent a given feature is influenced by the task performed by the subject. In wrapper methods, the feature selection relies on the classification itself: in an iterative process, the relevance of each subset of features is assessed via the classification performance until a given criterion is met. Embedded methods integrate both the feature selection and the classification in the same process, via a decision tree, for example, or an \( \ell_1 \) penalty term.
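A minimal sketch contrasting a filter method and an embedded method with scikit-learn, on synthetic data (the univariate criterion used here, the ANOVA F-score, stands in for the R2 score mentioned above):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression

# Synthetic feature matrix and labels, for illustration only.
X, y = make_classification(n_samples=200, n_features=100, random_state=0)

# Filter method: rank features with a univariate criterion,
# independently of (and before) the classifier.
X_filt = SelectKBest(k=20).fit_transform(X, y)

# Embedded method: an l1 penalty drives irrelevant feature weights
# to zero inside the classifier itself.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
print((clf.coef_ != 0).sum(), "features kept by the l1 penalty")
```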

Box 1: Tools for M/EEG analysis

All these tools provide a wide range of tutorials, publicly available datasets, and codes.

Python-based:

  • MNE-Python [36]

  • MOABB [37]

MATLAB-based:

  • EEGLAB [38]

  • Fieldtrip [39]

  • Brainstorm [40]

  • SPM [41]

6 M/EEG and Brain Disorders

6.1 Clinical Applications of M/EEG

The spatial and temporal resolutions of M/EEG enable the observation of a large number of processes. Notably, they can detect both evoked responses and oscillatory activity. As such, using this information could pave the way to biomarkers of brain disorders. To illustrate this point, we will focus our presentation on two specific clinical applications: epilepsy and Alzheimer disease. Nevertheless, M/EEG can be useful for a wider range of applications, both in neurological and psychiatric disorders [42, 43].

6.1.1 Epilepsy

Epilepsy is a neurological disorder with a high prevalence of 1% [44]. It is established that between 20% and 30% of patients present a pharmacoresistant form of epilepsy [45]. Among these patients, only 30% can undergo surgery [46]. Epilepsy is a distributed disease that induces brain network reorganization and brain rhythm alterations, both during ictal and interictal periods [47, 48]. Due to its time resolution, compatible with the capture of dynamical changes, as well as its wide availability, EEG is a key modality for the evaluation of epilepsy [44]. In addition to scalp EEG, stereotactic EEG (SEEG) can be used to further localize epileptogenic foci and has proven to provide valuable information on epileptogenic networks [48]. MEG can also be used for presurgical evaluation and for functional mapping [49], but it is much more costly and less widely available.

The use of network theory in epilepsy provides a useful framework to characterize the seizure (onset and propagation) and its clinical expression (e.g., comorbidities) [47, 48]. At the local scale, the node strength or degree and the betweenness centrality have been used to characterize the epileptic network [48, 50]. At the global scale, two metrics have proven to be of interest in epilepsy [51]: the characteristic path length and the clustering coefficient.
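As a sketch, these two global metrics can be computed with networkx on a hypothetical thresholded connectivity matrix (the matrix and the threshold are arbitrary assumptions).

```python
import networkx as nx
import numpy as np

# Random symmetric "connectivity" matrix, binarized to keep only
# strong connections (for illustration only).
rng = np.random.default_rng(0)
A = rng.random((20, 20))
A = (A + A.T) / 2
G = nx.from_numpy_array((A > 0.7).astype(int))

print("clustering coefficient:", nx.average_clustering(G))
if nx.is_connected(G):
    print("characteristic path length:",
          nx.average_shortest_path_length(G))
```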

6.1.2 Alzheimer Disease

Alzheimer disease is the most common dementia, accounting for 60–80% of the cases. The first symptoms are a deficit in short-term memory and concentration, followed later by a decline of linguistic skills, visuospatial orientation, and abstract reasoning and judgment. As the pathophysiological process of the disease starts many years before the occurrence of symptoms [52], it is crucial to elicit biomarkers to provide a diagnosis as early as possible. Efforts have been made to describe mild cognitive impairment (MCI) and Alzheimer disease (AD) with M/EEG. These studies essentially focus on oscillatory activity and on interactions between brain areas [53, 54]. In particular, patients present a reduced synchrony [55], and a decrease of the alpha power (i.e., between 8 and 12 Hz) correlates with lower cognitive status and hippocampal atrophy. Studies performed with MEG in preclinical and prodromal stages of AD showed that the effects of amyloid-beta deposition were associated with an increase of the prefrontal alpha power and that altered connectivity in the default mode network was present in normal individuals at risk for AD [56, 57].

A recent EEG work showed that the effects of neurodegeneration were focused in frontocentral regions, with an increase in high-frequency bands (beta and gamma) and a decrease in lower-frequency bands (delta) [58]. In particular, EEG patterns differ depending on the degree of amyloid burden, suggesting a compensatory mechanism: delta power follows a U-shaped curve, while the other tested metrics follow an inverted U-shaped curve.

6.2 Advanced Uses: The Example of BCI as a Rehabilitation Tool

6.2.1 Presentation of the BCI

Brain-computer interfaces (BCIs) consist of acquiring, analyzing, and translating brain signals into commands in real time for control or communication. These systems present a large number of clinical applications and assistive technologies, including the control of wheelchairs and brain-based communication. BCI devices can be a valuable tool in the treatment of neurological disorders such as stroke [59] and can provide assistive solutions for patients with spinal cord injury [60] or amyotrophic lateral sclerosis [59]. With regard to communication, devices such as the P300 speller, which relies on the evoked response occurring 300 ms after a visual stimulation, allow users to communicate by selecting letters to form words and even sentences. For an overview of the main steps to be considered when performing a BCI experiment, please refer to Fig. 7.

Fig. 7
BCI experiment workflow: brain activity recording, feature extraction, classification, and feedback

6.2.2 BCI as a Rehabilitation Tool

Stroke is one of the most common neurological conditions; in 2010, it was the second leading cause of death worldwide [61]. After a stroke, most patients require rehabilitation and assistance for daily tasks. Motor deficit of the upper limbs affects 70% of survivors [62], and 85% of those presenting paralysis will have persistent damage [63]. Rapid recovery is observed during the first 3 months (acute phase) but can continue for several months after the accident (chronic phase) [64]. Motor imagery (MI)-based BCI can constitute a motor substitution in the case of stroke by building alternative pathways from the stimulation to the brain [65]. In this particular case, the system relies on the desynchronization effect, i.e., a decrease of the spectral power computed within the contralateral sensorimotor area [66]. In a recent meta-analysis [67], the authors observed that BCI-based rehabilitation to restore upper limb motor function could improve motor performance, assessed via the Fugl-Meyer scale, more than other therapies. A part of the screened studies showed that BCI could induce neuroplasticity.

Brain network changes in stroke patients represent a very promising clinical application of closed-loop systems in rehabilitation strategies. Motor imagery has proven to be a valuable tool in the study of upper limb recovery after stroke [68]. It has enabled the observation of changes in ipsilesional intrahemispheric connectivity [69] but also of modifications of connectivity in prefrontal areas and of correlations between node strengths and motor outcome [70]. Based on previous observations in resting state [71], a recent double-blind study involving ten stroke patients at the chronic stage revealed that the node strength computed from the ipsilesional primary motor cortex in the alpha band could be a target for motor imagery-based neurofeedback and lead to significant improvement in motor performance [72].

6.2.3 Current Challenges and Perspectives

Despite being beneficial for patients, controlling a BCI system is a learned skill that 15–30% of users cannot develop, even after several training sessions. This phenomenon, called "BCI inefficiency" [73], has been presented as one of the main limitations to a wider use of BCI. From the machine learning perspective, the main challenges to overcome in current BCI paradigms relying on EEG recordings are the low signal-to-noise ratio of the signals; the non-stationarity over time, mainly resulting from the difference between calibration and feedback sessions; the reduced amount of data available to train the classifier, explained by the number of classes to be discriminated and/or the need to avoid the subject's tiredness; and the lack of robustness and reliability of BCI systems, in particular when decoding the users' mental commands.

To tackle these challenges, efforts have been made to improve the classification algorithms. They can be divided into three main groups: adaptive classifiers, transfer learning techniques, and matrix- or tensor-based algorithms. The adaptive classifiers aim at dealing with EEG non-stationarity by taking into account changes in signal properties, and in feature distribution, over time. Their parameters are updated when new EEG signals are available [74]. Even though most adaptive classifiers rely on a supervised approach, the unsupervised approach has proven to outperform classifiers that cannot capture temporal dynamics [75]. Besides, it can be a valuable tool to reduce the training duration and potentially remove the calibration part. Nevertheless, the adaptive classifiers present one main pitfall: their lack of online validation with a user in most of the current literature. This leads to two potential issues: the difficulty of finding a trade-off between fully retraining the classifier and updating some key parameters, and an adaptation that may not follow the actual user's intent by being too fast or too slow [76].

Transfer learning consists here in exploiting changes in EEG signal properties over time and across subjects to extract knowledge. More specifically, it relies on classifiers that are trained on one task (called a domain here) and are adapted to another task with little or no new training data [77]. For example, it can be applied to a dataset formed by two motor imagery tasks performed by two different subjects. There is a plethora of methods to solve the transfer learning problem [78]. The most common approaches in the EEG-based BCI domain consist in learning a transformation to correct the mismatch between the domains (occurring, for instance, when one domain corresponds to a hand motor imagery and the other to a foot motor imagery), finding a common feature representation for the domains, or learning a transformation of the data to make their distributions match [27]. Despite its robustness and recent advances in proposing guidelines [79], there is a lack of online experiments relying on transfer learning to fully validate this approach and assess to which extent it can be beneficial to patients.

Among the classification methods relying on matrices and tensors, the most well-known is the Riemannian geometry-based one. One of the main original characteristics of this approach is its ability to manipulate and classify the data by representing them as symmetric positive definite matrices, such as covariance matrices, and by mapping them onto a dedicated geometrical space, involving fewer steps than the classic approaches. This approach relies on the assumption that the task-specific sources are encoded in the covariance matrix computed from EEG signals. Here, trials are classified via nearest-neighbor methods relying on the Riemannian distance and the geometric mean. With the minimum distance to mean (MDM) method, each class is associated with a geometric mean computed from the training data; the MDM then attributes an unlabeled trial to the class showing the closest mean [80]. The Riemannian approaches present many advantages: they can be applied to all BCI paradigms, no parameter tuning is required, they are robust to noise, and, combined with transfer learning methods, they can lead to calibration-free BCI sessions [81]. In particular, Riemannian geometry-based methods [80, 82] are now the state of the art in terms of performance [27] and have won several data competitions [83].
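A minimal MDM sketch with the pyriemann package (an assumption; any Riemannian toolbox would do), on random data, hence chance-level accuracy:

```python
import numpy as np
from pyriemann.classification import MDM
from pyriemann.estimation import Covariances
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Hypothetical trials of shape (n_trials, n_channels, n_times) with
# random labels, for illustration only.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 8, 250))
y = rng.integers(0, 2, 100)

# Trials -> covariance matrices -> minimum distance to (geometric) mean.
clf = make_pipeline(Covariances(estimator="oas"), MDM())
print(cross_val_score(clf, X, y, cv=5).mean())  # ~0.5 on random labels
```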

Box 2: To go further

Guidelines and books of reference

  • Hari, R., and Puce, A. (2017). MEG-EEG Primer. Oxford University Press.

  • M. Clerc, L. Bougrain, and F. Lotte. (2016) Brain-Computer Interfaces 1: Methods and Perspectives, Wiley.

  • M. Clerc, L. Bougrain, and F. Lotte. (2016) Brain-Computer Interfaces 2: Technology and Applications, Wiley.

  • Gross, J., Baillet, S., Barnes, G. R., Henson, R. N., Hillebrand, A., Jensen, O., Jerbi, K., Litvak, V., Maess, B., Oostenveld, R., Parkkonen, L., Taylor, J. R., van Wassenhove, V., Wibral, M., and Schoffelen, J.-M. (2013). Good practice for conducting and reporting MEG research. Neuroimage, 65, 349–363.


  • Puce, A. and Hämäläinen, M. S. (2017). A Review of Issues Related to Data Acquisition and Analysis in EEG/MEG Studies. Brain Sci, 7(6).

7 Conclusion

EEG and MEG are key modalities for the study of brain disorders. In particular, EEG is relatively cheap and widely available and is thus a widely used tool in neurology. When dealing with EEG and MEG data, it is important to understand the origin of the signals as well as the different steps in their preprocessing and feature extraction. Machine learning is increasingly used on EEG and MEG data, in particular for BCI but also for computer-aided diagnosis and prognosis of brain disorders.