Automatic denoising of functional MRI data: Combining independent component analysis and hierarchical fusion of classifiers
Introduction
Functional magnetic resonance imaging (fMRI) has become a widely used approach for mapping brain function. In most fMRI experiments, however, many sources of temporal fluctuation (e.g., head movement, respiratory motion, and scanner artefacts) contribute to the recorded voxel-wise time series. Such artefacts reduce the signal-to-noise ratio, complicate the interpretation of the data, and can mislead statistical analyses (in both subject- and group-level inference) that attempt to investigate neuronally-related brain activation. Thus, separating “signal” from “noise” is a very important challenge in fMRI neuroscience. This is particularly important for resting-state fMRI, because functional networks are identified on the basis of spontaneous correlations between distinct regions, and spatially-extended artefacts can easily bias the estimated correlations.
There are two major types of noise removal techniques for fMRI datasets: approaches that employ additional physiological recordings (“model-based approaches”) and those that are data driven (for a detailed review, see Murphy et al., NeuroImage Special Issue on Mapping the Connectome, in press). One of the best-known techniques of the former type, RETROspective Image CORrection (RETROICOR; Glover et al., 2000), measures the phases of the cardiac and respiratory cycles, and attempts to remove low-order Fourier terms that are synchronised with these exogenous measurements. Similar approaches are taken in Shmueli et al. (2007) and Birn et al. (2006): these filter out the aspects of the imaging data that demonstrate strong correspondence with the measurements (e.g., in terms of phase or correlation). While these approaches can perform quite well in cleaning respiratory and cardiac noise, their success depends heavily on the availability and quality of the physiological measurements. Moreover, physiological monitoring data, even when collected, are not expected to relate to all common forms of artefact (e.g., scanner artefacts and head movements). This is the fundamental reason behind the development and adoption of “data-driven” approaches.
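The Fourier-term removal described above can be illustrated with a minimal sketch. It assumes the cardiac (or respiratory) phase has already been estimated for every volume; the function names, the choice of `order=2`, and the synthetic data are hypothetical and are not taken from RETROICOR itself.

```python
import numpy as np

def fourier_regressors(phase, order=2):
    """Low-order Fourier regressors from a physiological phase (radians, one value per volume)."""
    phase = np.asarray(phase, dtype=float)
    cols = []
    for m in range(1, order + 1):
        cols.append(np.cos(m * phase))
        cols.append(np.sin(m * phase))
    return np.column_stack(cols)

def remove_phase_locked_signal(data, phase, order=2):
    """Regress phase-locked fluctuations out of data with shape (n_volumes, n_voxels).

    The voxel means are left untouched because the regressors are demeaned.
    """
    X = fourier_regressors(phase, order)
    X = X - X.mean(axis=0)
    beta, *_ = np.linalg.lstsq(X, data - data.mean(axis=0), rcond=None)
    return data - X @ beta

# Synthetic example (assumed values): 200 volumes, TR = 0.72 s, 1.1 Hz cardiac cycle
rng = np.random.default_rng(0)
tr = 0.72
t = np.arange(200) * tr
cardiac_phase = (2 * np.pi * 1.1 * t) % (2 * np.pi)
data = rng.standard_normal((200, 5)) + 3.0 * np.cos(cardiac_phase)[:, None]
cleaned = remove_phase_locked_signal(data, cardiac_phase)

corr_before = np.corrcoef(data[:, 0], np.cos(cardiac_phase))[0, 1]
corr_after = np.corrcoef(cleaned[:, 0], np.cos(cardiac_phase))[0, 1]
```

After the regression, the cleaned time series is orthogonal to the Fourier regressors by construction, so `corr_after` is numerically zero while `corr_before` is large.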
Many data-driven approaches employ independent component analysis (ICA), which has been shown to be a powerful tool for separating various sources of fluctuations found in fMRI data. ICA was first used for fMRI by McKeown et al. (1998) for decomposing the data into distinct components (each consisting of a map and its representative time course) that are maximally spatially independent. Some components were considered artefactual, while others reflected the brain's activation in response to the task imposed on the subject. Later work (e.g., Kiviniemi et al., 2003) showed that, amongst the structured processes identifiable through ICA, resting-state networks could be found as components distinct from each other and from artefactual effects in the data.
Since ICA requires a large number of samples to function well, its application to fMRI (where there are normally orders of magnitude more voxels than time points) is believed to be more robust in the spatial than in the temporal domain. Also, the underlying neural processes in the data may well be more non-Gaussian in space than in time (particularly for resting-state data), adding to the greater robustness of spatial ICA (Smith et al., 2012). With respect to the separation of activation from artefacts, and of spatially distinct activations from each other, spatial independence has been a successful and enduring model, and nearly all applications of ICA (to both task and resting fMRI) to date have used spatial ICA.
The success of ICA in separating BOLD signal from noise makes it an attractive preprocessing tool for denoising both task and resting fMRI. If ICA can decompose the data into a set of noisy components (i.e., artefactual fluctuations) and non-artefactual components (i.e., fluctuations of interest), one can “clean” the data by subtracting the artefactual components from the data (or regressing them out of the data). However, identifying the artefact components manually can be very labour-intensive, and requires in-depth knowledge of the spatiotemporal characteristics of (ideally all possible) signal and noise fluctuations. Therefore, several previous approaches have attempted to offer fully-automatic solutions to ICA classification. As one of the first attempts, Kochiyama et al. (2005) proposed an automatic solution for removing the effects of task-related motion, which characterises the ICs by their task-related changes in signal intensity and variance; this may therefore be effective for task fMRI, but does not naturally extend to resting-state data. Perlbarg et al. (2007) proposed an approach that characterises the activity of the voxels in certain regions of interest (ROIs) that are known a priori to correspond to noisy behaviour. Given the wide range of artefacts that can be present in fMRI data, Tohka et al. (2008) proposed a set of 6 spatial and temporal features that capture a wider range of ICs' characteristics, while De Martino et al. (2007) defined 11 features. Such features might include the fraction of spatial map supra-threshold voxels lying on the brain edge, or the fraction of temporal spectral power lying above some frequency threshold. In both cases the features were then fed into a trained multivariate classifier, which attempted to automatically classify newly-seen components into signal vs. noise.
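The two example features mentioned above can be sketched as follows. This is an illustration only: the function names, the 0.1 Hz cutoff, the z-threshold of 2.3, and the toy inputs are assumptions, not the definitions used in the cited papers.

```python
import numpy as np

def high_freq_power_fraction(timecourse, tr, cutoff_hz):
    """Fraction of a component time course's spectral power above cutoff_hz."""
    tc = np.asarray(timecourse, dtype=float)
    tc = tc - tc.mean()
    power = np.abs(np.fft.rfft(tc)) ** 2
    freqs = np.fft.rfftfreq(tc.size, d=tr)
    total = power.sum()
    return float(power[freqs > cutoff_hz].sum() / total) if total > 0 else 0.0

def edge_fraction(zmap, edge_mask, z_thresh=2.3):
    """Fraction of supra-threshold voxels of a spatial map that lie in a brain-edge mask."""
    supra = np.abs(np.asarray(zmap, dtype=float)) > z_thresh
    n_supra = supra.sum()
    return float((supra & edge_mask).sum() / n_supra) if n_supra > 0 else 0.0

# Toy inputs (assumed): TR = 1 s, one slow and one fast oscillation
tr = 1.0
t = np.arange(200) * tr
slow_frac = high_freq_power_fraction(np.sin(2 * np.pi * 0.02 * t), tr, cutoff_hz=0.1)
fast_frac = high_freq_power_fraction(np.sin(2 * np.pi * 0.30 * t), tr, cutoff_hz=0.1)

# Toy spatial map over 10 voxels; the first two voxels are on the "edge"
zmap = np.array([3.0, 3.0, 3.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
edge_mask = np.zeros(10, dtype=bool)
edge_mask[:2] = True
edge_frac = edge_fraction(zmap, edge_mask)
```

A slow, network-like time course gives a high-frequency fraction near 0 and a fast oscillation gives a fraction near 1; in the toy map, half of the supra-threshold voxels lie on the edge.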
Our approach is broadly similar, but we define more than 180 features (including features similar to those defined in the previous papers), and use several different classifiers, combined via classifier stacking.
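Classifier stacking in general can be sketched with scikit-learn as below. The base learners, the logistic-regression meta-learner, and the synthetic feature table are arbitrary choices made for illustration; this text does not specify which classifiers FIX combines, so none of this should be read as the FIX implementation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a (components x features) table with hand labels
# (label 1 = "noise", label 0 = "signal"; purely illustrative)
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=5, class_sep=2.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Base learners are hypothetical choices for this sketch
base_learners = [
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ("svm", SVC(kernel="rbf", probability=True, random_state=0)),
]

# The meta-learner is trained on cross-validated predictions of the base learners
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

accuracy = stack.score(X_test, y_test)
noise_probability = stack.predict_proba(X_test)[:, 1]
```

Using out-of-fold predictions (`cv=5`) to train the meta-learner keeps it from simply trusting whichever base learner overfits the training components most.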
In this paper, we introduce FIX (FMRIB's ICA-based X-noiseifier), which is a fully automatic (once hand-trained) solution for cleaning both task and resting fMRI data of various types of structured noise. Using FIX consists of five steps: spatial ICA, estimation of a large number of spatial/temporal features for each component of each dataset, classifier training (using hand labeling of components), application of the classifier to new datasets, and denoising (removal of artefact components from the data). In the ICA step, we employ MELODIC (Multivariate Exploratory Linear Optimised Decomposition into Independent Components; Beckmann and Smith, 2004) from the FMRIB Software Library (FSL). We assessed the performance of FIX against manual component classifications across various fMRI datasets and found good to excellent performance across a wide range of resting fMRI datasets.
In an associated paper (Griffanti et al., submitted for publication), we have evaluated in detail the effect of ICA + FIX fMRI cleanup on both standard fMRI datasets and accelerated (Feinberg et al., 2010, Moeller et al., 2010) datasets. We also compared the various approaches that one might take to remove the artefactual components from the data once they have been classified as artefact by FIX. These investigations include evaluation (of the effect of the various cleanup options) on both the spatial and temporal (and hence network) characteristics of resting-state networks.
Methods
The general approach for applying FIX is:
1. Apply standard preprocessing steps, typically: rigid-body head motion correction, optional spatial smoothing, and high-pass temporal filtering to remove slow drifts.
2. Apply ICA to decompose the preprocessed data into a set of independent components.
3. Use FIX to identify which of the ICA components correspond to artefactual processes in the data.
4. Remove those components from the preprocessed fMRI data.
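The final removal step can be sketched as a regression. The version below fits all component time courses jointly and subtracts only the contribution of the components labelled as artefact; this is one possible variant, and the function name, array shapes, and synthetic data are assumptions made for illustration.

```python
import numpy as np

def remove_components(data, mixing, noise_idx):
    """Remove artefact components from fMRI data by regression.

    data:      (n_timepoints, n_voxels) preprocessed data
    mixing:    (n_timepoints, n_components) ICA component time courses
    noise_idx: indices of the components classified as artefact
    """
    # Fit every component at once, so variance shared with signal
    # components is not attributed to the artefact components alone
    beta, *_ = np.linalg.lstsq(mixing, data, rcond=None)
    return data - mixing[:, noise_idx] @ beta[noise_idx, :]

# Synthetic example (assumed): 3 components, the last one is an artefact
rng = np.random.default_rng(0)
mixing = rng.standard_normal((120, 3))
maps = rng.standard_normal((3, 50))
data = mixing @ maps
noise_idx = [2]

cleaned = remove_components(data, mixing, noise_idx)
expected = mixing[:, :2] @ maps[:2, :]
```

In this noise-free toy case the cleaned data equal the contribution of the two retained components, up to floating-point error.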
The spatial smoothing step in the pre-processing might
Results
Example results showing several different kinds of ICA components from a range of fMRI acquisition protocols have been presented above (Fig. 1, Fig. 2, Fig. 3, Fig. 4, Fig. 5, Fig. 6, Fig. 7). In this section we present quantitative results relating to the accuracy of FIX in correctly classifying ICA components as signal vs. noise. As discussed above, the evaluation of optimal methods for the removal of noise components (once identified by FIX), and investigation of the effects of this removal
Conclusions and discussion
We have described a new tool for the automated denoising of artefacts in fMRI data, achieved by running independent component analysis, identifying which components correspond to artefactual processes in the data, and removing those from the data. Our tool, FIX, can achieve over 99% classification accuracy on the best fMRI datasets, and around 95% accuracy on more “standard” acquisitions (particularly if study-specific training is carried out). FIX therefore can be a very valuable tool for the
Acknowledgments
We are very grateful to Erin Reid and Donna Dierker (WashU), for helping with the FIX training (hand-labeling of ICA components) from HCP data, to Eugene Duff and other members of the FMRIB Analysis Group for input on the FIX feature set and scripting, and to David Flitney (Oxford), for creating the Melview ICA component viewing and labeling tool. We are grateful for partial funding via the following NIH grants: 1U54MH091657-01, P30-NS057091, P41-RR08079/EB015894, and F30-MH097312. Gwenalle
References (37)
- et al. Separating respiratory-variation-related fluctuations from neuronal-activity-related fluctuations in fMRI. NeuroImage (2006)
- et al. Classification of fMRI independent components using IC-fingerprints and support vector machine classifiers. NeuroImage (2007)
- et al. Independent component analysis of nondeterministic fMRI signal sources. NeuroImage (2003)
- et al. Removing the effects of task-related motion using independent-component analysis. NeuroImage (2005)
- et al. Spectral characteristics of resting state networks.
- et al. CORSICA: correction of structured noise in fMRI by automatic identification of ICA components. Magn. Reson. Imaging (2007)
- et al. Adjusting the effect of nonstationarity in cluster-based and TFCE inference. NeuroImage (2011)
- et al. Low-frequency fluctuations in the cardiac rate as a source of variance in the resting-state fMRI BOLD signal. NeuroImage (2007)
- et al. Threshold-free cluster enhancement: addressing problems of smoothing, threshold dependence and localisation in cluster inference. NeuroImage (2009)
- et al. Resting-state fMRI in the Human Connectome Project. NeuroImage (2013)