Elsevier

NeuroImage

Volume 90, 15 April 2014, Pages 449-468

Automatic denoising of functional MRI data: Combining independent component analysis and hierarchical fusion of classifiers

https://doi.org/10.1016/j.neuroimage.2013.11.046

Abstract

Many sources of fluctuation contribute to the fMRI signal, and this makes identifying the effects that are truly related to the underlying neuronal activity difficult. Independent component analysis (ICA) – one of the most widely used techniques for the exploratory analysis of fMRI data – has been shown to be a powerful technique in identifying various sources of neuronally-related and artefactual fluctuation in fMRI data (both with the application of external stimuli and with the subject “at rest”). ICA decomposes fMRI data into patterns of activity (a set of spatial maps and their corresponding time series) that are statistically independent and add linearly to explain voxel-wise time series. Given the set of ICA components, if the components representing “signal” (brain activity) can be distinguished from the “noise” components (effects of motion, non-neuronal physiology, scanner artefacts and other nuisance sources), the latter can then be removed from the data, providing an effective cleanup of structured noise. Manual classification of components is labour-intensive and requires expertise; hence, a fully automatic noise detection algorithm that can reliably detect various types of noise sources (in both task and resting fMRI) is desirable. In this paper, we introduce FIX (“FMRIB's ICA-based X-noiseifier”), which provides an automatic solution for denoising fMRI data via accurate classification of ICA components. For each ICA component FIX generates a large number of distinct spatial and temporal features, each describing a different aspect of the data (e.g., what proportion of temporal fluctuations are at high frequencies). The set of features is then fed into a multi-level classifier (built around several different classifiers). Once trained through the hand-classification of a sufficient number of training datasets, the classifier can then automatically classify new datasets. The noise components can then be subtracted from (or regressed out of) the original data, to provide automated cleanup. On conventional resting-state fMRI (rfMRI) single-run datasets, FIX achieved about 95% overall accuracy. On high-quality rfMRI data from the Human Connectome Project, FIX achieves over 99% classification accuracy, and as a result is being used in the default rfMRI processing pipeline for generating HCP connectomes. FIX is publicly available as a plugin for FSL.

Introduction

Functional magnetic resonance imaging (fMRI) has become a widely-used approach for mapping brain function. In most fMRI experiments, however, many sources of temporal fluctuation (e.g., head movement, respiratory motion, scanner artefacts, etc.) contribute to the recorded voxel-wise time series. Such artefacts reduce the signal-to-noise ratio, complicate the interpretation of the data, and can mislead statistical analyses (in both subject- and group-level inference) that attempt to investigate neuronally-related brain activation. Thus, separating “signal” from “noise” is a very important challenge in fMRI neuroscience. This is particularly important for resting-state fMRI, because functional networks are identified on the basis of spontaneous correlations between distinct regions, where spatially-extended artefacts can easily contribute problematically to estimated correlations.

There are two major types of noise removal techniques for fMRI datasets: approaches that employ additional physiological recordings (“model-based approaches”) and those that are data driven (for a detailed review, see Murphy et al., NeuroImage Special Issue on Mapping the Connectome, in press). One of the most well-known techniques of the former type, RETROspective Image CORrection (RETROICOR; Glover et al., 2000), measures the phases of the cardiac and respiratory cycles, and attempts to remove low-order Fourier terms that are synchronised with these exogenous measurements. Similar approaches are taken in Shmueli et al. (2007) and Birn et al. (2006): these filter the aspects of the imaging data that demonstrate strong correspondence with the measurements (e.g., in terms of phase or correlation). While these approaches can perform quite well in cleaning respiratory and cardiac noise, their success depends heavily on the availability and quality of the physiological measurements. Moreover, physiological monitoring data, if available/collected, are not expected to relate to all common forms of artefact (e.g., scanner artefacts and head movements). This is the fundamental reason behind the development and adoption of “data-driven” approaches.
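As an illustration of the model-based idea described above, the following minimal sketch builds low-order Fourier regressors from a physiological phase trace and removes the fitted terms from a single voxel time series. It is not the RETROICOR implementation: the function names, the default number of harmonics, and the assumption that the phase has already been estimated for each volume are all hypothetical choices made for the example.

```python
import numpy as np


def fourier_regressors(phase, order=2):
    """Build low-order Fourier regressors from a physiological phase trace.

    phase : 1-D array, cardiac or respiratory phase in radians, one value per
            volume (assumed to be already estimated from the recordings).
    order : number of harmonics to include (an assumed default).
    Returns an array of shape (n_volumes, 2 * order).
    """
    phase = np.asarray(phase, dtype=float)
    columns = []
    for m in range(1, order + 1):
        columns.append(np.cos(m * phase))
        columns.append(np.sin(m * phase))
    return np.column_stack(columns)


def remove_physio(timeseries, regressors):
    """Fit the regressors plus an intercept by least squares and subtract
    only the physiological part, leaving the mean of the series in place."""
    ts = np.asarray(timeseries, dtype=float)
    design = np.column_stack([np.ones(ts.size), regressors])
    beta, *_ = np.linalg.lstsq(design, ts, rcond=None)
    return ts - regressors @ beta[1:]
```

The same regressors could be fitted to every voxel at once by passing a (volumes × voxels) matrix to the least-squares solver; the single-voxel form is used here only to keep the example short.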

Many data-driven approaches employ independent component analysis (ICA), which has been shown to be a powerful tool for separating various sources of fluctuations found in fMRI data. ICA was first used for fMRI by McKeown et al. (1998) for decomposing the data into distinct components (each consisting of a map and its representative time course) that are maximally spatially independent. Some components were considered artefactual, while others reflected the brain's activation in response to the task imposed on the subject. Later, it was shown (e.g., Kiviniemi et al., 2003) that amongst the structured processes identifiable through ICA, resting-state networks could be found as components distinct from each other and from artefactual effects in the data.

Since ICA requires a large number of samples to function well, its application to fMRI (where there are normally orders of magnitude more voxels than time points) is believed to be more robust in the spatial than in the temporal domain. Also, the underlying neural processes in the data may well be more non-Gaussian in space than in time (particularly for resting-state data), adding to the greater robustness of spatial ICA (Smith et al., 2012). With respect to the separation of activation from artefacts, and of spatially distinct activations from each other, spatial independence has been a successful and enduring model, and nearly all applications of ICA (to both task and resting fMRI) to date have used spatial ICA.
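The spatial ICA model discussed above can be written compactly as follows. The symbols are notation assumed for this illustration and do not appear in the text: X is the data matrix with T time points and V voxels, K is the number of components, column k of A is the time course of component k, and row k of S is its spatial map.

```latex
\[
  X \approx A\,S, \qquad
  X \in \mathbb{R}^{T \times V},\quad
  A \in \mathbb{R}^{T \times K},\quad
  S \in \mathbb{R}^{K \times V},\qquad V \gg T .
\]
```

In spatial ICA the rows of S are estimated to be maximally statistically independent of each other, so the V voxels act as the samples; this is why the imbalance between voxels and time points favours the spatial over the temporal formulation.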

The success of ICA in separating BOLD signal from noise makes it an attractive preprocessing tool for denoising both task and resting fMRI. If ICA can decompose the data into a set of noisy components (i.e., artefactual fluctuations) and non-artefactual components (i.e., fluctuations of interest), one can “clean” the data by subtracting the artefactual components from the data (or regressing them out of the data). However, identifying the artefact components manually can be very labour-intensive, and requires in-depth knowledge of (ideally all possible) signal and noise fluctuations' spatiotemporal characteristics. Therefore, several previous approaches have attempted to offer fully-automatic solutions to ICA classification. As one of the first attempts, Kochiyama et al. (2005) proposed an automatic solution for removing the effects of task-related motion, which characterises the independent components (ICs) by their task-related changes in signal intensity and variance; therefore this may be effective for task fMRI, but does not naturally extend to resting-state. Perlbarg et al. (2007) proposed an approach that characterises the activity of the voxels in certain regions of interest (ROIs) that are known a priori to correspond to noisy behaviour. Given the wide range of artefacts that can be present in fMRI data, Tohka et al. (2008) proposed a set of 6 spatial and temporal features that capture a wider range of ICs' characteristics, while De Martino et al. (2007) defined 11 features. Such features might include the fraction of spatial map supra-threshold voxels lying on the brain edge, or the fraction of temporal spectral power lying above some frequency threshold. In both cases the features were then fed into a trained multivariate classifier, which attempted to automatically classify newly-seen components into signal vs. noise. Our approach is roughly similar, but we define more than 180 features (including features similar to those defined in the previous papers), and utilise multiple different classifier approaches, combined via classifier stacking.
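To make one of the temporal features mentioned above concrete, the sketch below computes the fraction of a component time course's spectral power that lies above a frequency threshold. It is an illustrative example only and not the FIX feature code: the function name, the parameter names, and the choice to remove the mean before computing the spectrum are assumptions.

```python
import numpy as np


def high_freq_power_fraction(timeseries, tr, threshold_hz):
    """Fraction of spectral power above threshold_hz.

    timeseries   : 1-D component time course.
    tr           : sampling interval (repetition time) in seconds.
    threshold_hz : frequency cut-off in Hz (a hypothetical, user-chosen value).
    """
    ts = np.asarray(timeseries, dtype=float)
    ts = ts - ts.mean()  # drop the mean so the zero-frequency bin is empty
    power = np.abs(np.fft.rfft(ts)) ** 2
    freqs = np.fft.rfftfreq(ts.size, d=tr)
    total = power.sum()
    if total == 0:
        return 0.0  # constant time course: no fluctuation at any frequency
    return float(power[freqs > threshold_hz].sum() / total)
```

A slowly fluctuating time course gives a value near 0 and a rapidly fluctuating one a value near 1, so a single number of this kind can be passed to a classifier alongside the other spatial and temporal features.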

In this paper, we introduce FIX (FMRIB's ICA-based X-noiseifier), which is a fully automatic (once hand-trained) solution for cleaning (both task and resting) fMRI data of various types of structured noise. Using FIX consists of five steps: spatial ICA, estimation of a large number of spatial/temporal features for each component of each dataset, classifier training (using hand labeling of components), application of the classifier to new datasets, and denoising (removal of artefact components from the data). In the ICA step, we employ MELODIC (Multivariate Exploratory Linear Optimised Decomposition into Independent Components) (Beckmann and Smith, 2004) from the FMRIB Software Library (FSL). We assessed the performance of FIX against manual component classifications across various fMRI datasets and found good to excellent performance across a wide range of resting fMRI datasets.

In an associated paper (Griffanti et al., submitted for publication), we have evaluated in detail the effect of ICA + FIX fMRI cleanup on both standard fMRI datasets and accelerated (Feinberg et al., 2010, Moeller et al., 2010) datasets. We also compared the various approaches that one might take to remove the artefactual components from the data once they have been classified as artefact by FIX. These investigations include evaluation (of the effect of the various cleanup options) on both the spatial and temporal (and hence network) characteristics of resting-state networks.


Methods

The general approach for applying FIX is:

  1. Apply standard preprocessing steps, typically: rigid-body head motion correction, optional spatial smoothing, and high-pass temporal filtering to remove slow drifts.

  2. Apply ICA to decompose the preprocessed data into a set of independent components.

  3. Use FIX to identify which of the ICA components correspond to artefactual processes in the data.

  4. Remove those components from the preprocessed fMRI data.
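The final removal step can be sketched as a least-squares regression. In the example below, all component time courses are fitted to the data jointly and only the contribution of the components labelled as noise is subtracted. This is one of several possible removal strategies and is shown under stated assumptions; the source text does not specify which variant FIX applies, and the function and argument names are hypothetical.

```python
import numpy as np


def remove_noise_components(data, mixing, noise_idx):
    """Subtract the fitted contribution of noise components from the data.

    data      : (T, V) array, preprocessed fMRI data (time points x voxels).
    mixing    : (T, K) array, ICA component time courses.
    noise_idx : indices of the components classified as noise.
    """
    data = np.asarray(data, dtype=float)
    mixing = np.asarray(mixing, dtype=float)
    noise_idx = list(noise_idx)
    # Fit all components jointly, so that variance shared between signal and
    # noise time courses is not attributed entirely to the noise components.
    betas, *_ = np.linalg.lstsq(mixing, data, rcond=None)
    noise_part = mixing[:, noise_idx] @ betas[noise_idx, :]
    return data - noise_part
```

An alternative would be to regress the noise time courses out on their own, which removes more variance but can also remove signal that is correlated with the noise components.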

The spatial smoothing step in the pre-processing might

Results

Example results showing several different kinds of ICA components from a range of fMRI acquisition protocols have been presented above (Fig. 1, Fig. 2, Fig. 3, Fig. 4, Fig. 5, Fig. 6, Fig. 7). In this section we present quantitative results relating to the accuracy of FIX in correctly classifying ICA components as signal vs. noise. As discussed above, the evaluation of optimal methods for the removal of noise components (once identified by FIX), and investigation of the effects of this removal

Conclusions and discussion

We have described a new tool for the automated denoising of artefacts in fMRI data, achieved by running independent component analysis, identifying which components correspond to artefactual processes in the data, and removing those from the data. Our tool, FIX, can achieve over 99% classification accuracy on the best fMRI datasets, and around 95% accuracy on more “standard” acquisitions (particularly if study-specific training is carried out). FIX therefore can be a very valuable tool for the

Acknowledgments

We are very grateful to Erin Reid and Donna Dierker (WashU), for helping with the FIX training (hand-labeling of ICA components) from HCP data, to Eugene Duff and other members of the FMRIB Analysis Group for input on the FIX feature set and scripting, and to David Flitney (Oxford), for creating the Melview ICA component viewing and labeling tool. We are grateful for partial funding via the following NIH grants: 1U54MH091657-01, P30-NS057091, P41-RR08079/EB015894, and F30-MH097312. Gwenaëlle

References (37)

  • J. Tohka et al. Automatic independent component labeling for artifact removal in fMRI. NeuroImage (2008)
  • K. Ugurbil et al. Pushing spatial and temporal resolution for functional and diffusion MRI in the Human Connectome Project. NeuroImage (2013)
  • D. Van Essen et al. The WU-Minn Human Connectome Project: an overview. NeuroImage (2013)
  • D. Wolpert. Stacked generalization. Neural Netw. (1992)
  • Y. Amit et al. Shape quantization and recognition with randomized trees. Neural Comput. (1997)
  • C.F. Beckmann et al. Probabilistic independent component analysis for functional magnetic resonance imaging. IEEE Trans. Med. Imaging (2004)
  • L. Breiman. Random forests. Mach. Learn. (2001)
  • B. Caputo et al. Appearance-based object recognition using SVMs: which kernel should I use
