Elsevier

NeuroImage

Volume 19, Issue 2, June 2003, Pages 261-270
Regular article
Functional magnetic resonance imaging (fMRI) “brain reading”: detecting and classifying distributed patterns of fMRI activity in human visual cortex

https://doi.org/10.1016/S1053-8119(03)00049-1

Abstract

Traditional (univariate) analysis of functional MRI (fMRI) data relies exclusively on the information contained in the time course of individual voxels. Multivariate analyses can take advantage of the information contained in activity patterns across space, from multiple voxels. Such analyses have the potential to greatly expand the amount of information extracted from fMRI data sets. In the present study, multivariate statistical pattern recognition methods, including linear discriminant analysis and support vector machines, were used to classify patterns of fMRI activation evoked by the visual presentation of various categories of objects. Classifiers were trained using data from voxels in predefined regions of interest during a subset of trials for each subject individually. Classification of subsequently collected fMRI data was attempted according to the similarity of activation patterns to prior training examples. Classification was done using only small amounts of data (20 s worth) at a time, so such a technique could, in principle, be used to extract information about a subject’s percept on a near real-time basis. Classifiers trained on data acquired during one session were equally accurate in classifying data collected within the same session and across sessions separated by more than a week, in the same subject. Although the highest classification accuracies were obtained using patterns of activity including lower visual areas as input, classification accuracies well above chance were achieved using regions of interest restricted to higher-order object-selective visual areas. In contrast to typical fMRI data analysis, in which hours of data across many subjects are averaged to reveal slight differences in activation, the use of pattern recognition methods allows a subtle 10-way discrimination to be performed on an essentially trial-by-trial basis within individuals, demonstrating that fMRI data contain far more information than is typically appreciated.

Introduction

The idea that the activity of a population of neurons in the brain represents some aspect of the external sensory world is as old as neuroscience itself. From the earliest single unit recording experiments through the relatively recent development of functional magnetic resonance imaging (fMRI), one of the predominant themes in neuroscience has been the development and understanding of the relationship between the activity of neurons and the sensory world.

The human brain is capable of representing an almost limitless collection of complex visual objects. This ability extends from the most common of everyday objects to objects that have never before been seen or imagined. However, while some of the details of the basic representational architecture of early visual cortex are known (e.g., retinotopy, hypercolumns; Hubel and Wiesel, 1968, 1969), relatively little is known about how the higher-order visual cortex represents complex real-world visual objects and the conjunctions of features that comprise them.

There exists a continuum of possible coding schemes that could be used to represent complex objects in the brain, ranging from highly localized architectures (“grandmother” coding), where individual functional units are used to represent individual classes of stimuli, to fully distributed schemes, where all functional units participate in representation, and it is the relative pattern of activity that counts. It is not clear where along this continuum human extrastriate cortex lies, and it is possible that the answer may be different at different spatial scales (e.g., a representation could be mostly localized when considered at the scale of a large region of cortex, but distributed at finer scales, or vice versa). A debate surrounding this question has emerged recently based on fMRI evidence for the relative modularity (Downing et al., 2001; Kanwisher et al., 1997) or distributedness (Haxby et al., 2001; Ishai et al., 1999) of activity in human ventral extrastriate visual cortex.

Functional MRI is a technology well-suited to asking questions about representation in the human brain, as it offers a noninvasive window onto brain function with whole-brain coverage and reasonable spatial resolution. However, most commonly used analysis methods for fMRI data are ill-suited to dealing with distributed patterns of activity. fMRI data are fundamentally multivariate (that is, a single fMRI acquisition in time contains information about the local brain hemodynamics at thousands of locations), yet fMRI data are almost always analyzed in an essentially univariate way, treating each voxel as a separate entity as far as statistical analysis is concerned. While this is a natural way to seek functional localization, such an approach by definition ignores the interrelationships between the activity at different locations and the possibility that the variables and organization of interest may not have a one-to-one correspondence with the voxels in an fMRI data set. A variety of multivariate techniques have been applied to fMRI data (Friston et al., 1999; McIntosh et al., 1996; McKeown et al., 1998), but to date, relatively few of these efforts have been aimed at studying fine-grained questions about how the brain represents different classes of stimuli (with Haxby et al., 2001, and more recently, Spiridon and Kanwisher, 2002, being notable exceptions).
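What a voxel-by-voxel analysis can miss may be illustrated with a toy example. The sketch below is not taken from the study: the two-voxel "responses" are invented, and the helper name `best_threshold_accuracy` is hypothetical. It constructs two classes whose values overlap almost completely along each voxel considered alone, yet separate perfectly along a direction that combines both voxels.

```python
def best_threshold_accuracy(values_a, values_b):
    """Best accuracy achievable by any single-threshold rule on 1-D values."""
    n = len(values_a) + len(values_b)
    best = 0.0
    for t in sorted(set(values_a) | set(values_b)):
        # Rule: predict class A when value >= t; also score the mirrored rule.
        correct = sum(v >= t for v in values_a) + sum(v < t for v in values_b)
        best = max(best, correct / n, (n - correct) / n)
    return best


# Invented two-voxel patterns (assumption for illustration only):
# both classes share the same range on voxel 1 and differ by a small
# offset on voxel 2 that is swamped by the shared variation.
xs = [i * 0.5 for i in range(-10, 11)]
class_a = [(x, x + 0.5) for x in xs]
class_b = [(x, x - 0.5) for x in xs]

# Univariate view: each voxel analyzed on its own.
acc_voxel1 = best_threshold_accuracy([p[0] for p in class_a],
                                     [p[0] for p in class_b])
acc_voxel2 = best_threshold_accuracy([p[1] for p in class_a],
                                     [p[1] for p in class_b])

# Multivariate view: project onto the difference (voxel 2 minus voxel 1).
acc_joint = best_threshold_accuracy([p[1] - p[0] for p in class_a],
                                    [p[1] - p[0] for p in class_b])

print(acc_voxel1, round(acc_voxel2, 3), acc_joint)
```

In this contrived case neither voxel alone does much better than guessing, while the combined projection classifies every point correctly. Real fMRI patterns are noisier and higher-dimensional, so the example shows only the principle.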

The purpose of the present investigation is to apply a new family of multivariate techniques directly to the problem of object representation in fMRI. Statistical pattern recognition algorithms are designed to learn and later classify multivariate data points based on statistical regularities in the data set. Fundamentally, pattern recognition algorithms operate by dividing a high-dimensional space into regions corresponding to different classes of data. This and other multivariate approaches are powerful because they can potentially discriminate between different classes of multivariate data even when the data, as projected along any individual dimension, are statistically indistinguishable (see Fig. 1).

A subject is shown blocks of various categories of visually presented objects while in the scanner. fMRI volumes are acquired while the subject looks at the objects, and the pattern of activity over an independently selected set of voxels is extracted (see Feature Selection under Materials and Methods). This pattern is then given to a classifier, along with a label that identifies the category corresponding to the stimulus the subject was viewing, and the classifier learns a mapping between patterns of brain activity and stimulus categories. Then, in an independent imaging session (separated by as much as several weeks), the same subject views the same categories of objects with either the same (Experiment 1) or different (Experiment 2) exemplars. Functional MRI volumes are collected with nearly identical spatial sampling, and the same voxel subset is extracted. These patterns of activity are then given to the trained classifier, which attempts to infer the category of objects the subject was viewing.
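The train-then-test procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the study used linear discriminant analysis and support vector machines, whereas the sketch substitutes a simpler nearest-centroid rule based on correlation. The data are simulated, and the voxel count, noise level, block count, and the third category label are invented for the example.

```python
import random


def pearson(u, v):
    """Pearson correlation between two equal-length patterns."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den if den else 0.0


def train(patterns, labels):
    """Learn one mean pattern (centroid) per category from labeled examples."""
    grouped = {}
    for pattern, label in zip(patterns, labels):
        grouped.setdefault(label, []).append(pattern)
    return {label: [sum(col) / len(col) for col in zip(*group)]
            for label, group in grouped.items()}


def classify(centroids, pattern):
    """Assign the category whose centroid correlates best with the pattern."""
    return max(centroids, key=lambda label: pearson(centroids[label], pattern))


def simulate_session(templates, n_blocks, noise_sd, rng):
    """Simulated stand-in for one imaging session: template plus noise."""
    patterns, labels = [], []
    for label, template in templates.items():
        for _ in range(n_blocks):
            patterns.append([t + rng.gauss(0, noise_sd) for t in template])
            labels.append(label)
    return patterns, labels


rng = random.Random(0)
categories = ["horse", "cow", "other"]   # illustrative subset, not the study's 10
n_voxels = 50                            # assumed size of the voxel subset
templates = {c: [rng.gauss(0, 1) for _ in range(n_voxels)] for c in categories}

# Session 1 provides labeled training patterns; session 2 is held out.
train_x, train_y = simulate_session(templates, 8, 1.0, rng)
test_x, test_y = simulate_session(templates, 8, 1.0, rng)

centroids = train(train_x, train_y)
predictions = [classify(centroids, p) for p in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

The essential design point carried over from the text is the separation of sessions: the classifier never sees the test patterns during training, and the same voxel subset is used in both sessions.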

In the present investigation, we attempted to classify the category of object a subject was looking at (of 10 possible categories, including similar categories, such as horses and cows) using only very small amounts of data (20 s worth, roughly corresponding to the time scale of the hemodynamic response function).

fMRI data acquisition and subjects

Data were collected using a 3-T Siemens Allegra head-only MRI scanner (21–24 transaxial slices for whole-brain coverage, 3.125 × 3.125 mm in-plane, 5 mm thick, TR = 2 s, TE = 30 ms, GRE EPI) using a custom-designed thermoplastic head-restraint system that permitted repeatable subject placement to within a few millimeters over months of scanning. Four subjects participated in varying numbers of sessions (S1, male aged 52, 8 sessions; S2, male aged 23, 8 sessions; S3, male aged 44, 2 sessions;

Results

In all cases, it was possible, with accuracies far above chance, to determine what object a subject was looking at, based purely on isolated collections of just 20 s worth of fMRI data at a time. Accuracy remained high even when training and test data sets were separated by days or weeks (as in Experiment 1).
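To make "far above chance" concrete for a 10-way discrimination, where guessing yields 10% accuracy, one common check is a binomial tail probability. The sketch below is an assumption-laden illustration rather than the paper's statistical procedure: the trial count and number of correct classifications are hypothetical placeholders, not results from the study.

```python
from math import comb


def binomial_tail(n_trials, n_correct, p_chance):
    """P(at least n_correct successes in n_trials) if every guess is chance."""
    return sum(
        comb(n_trials, k) * p_chance ** k * (1 - p_chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


n_categories = 10            # the 10-way discrimination described in the text
chance = 1 / n_categories    # expected accuracy from guessing

# Hypothetical counts chosen only to illustrate the calculation.
n_trials = 100
n_correct = 40

p_value = binomial_tail(n_trials, n_correct, chance)
print(f"chance = {chance:.0%}, observed = {n_correct / n_trials:.0%}, "
      f"p = {p_value:.2e}")
```

With these placeholder numbers, an accuracy of 40% against a 10% chance level would be very unlikely to arise from guessing. The test assumes independent trials, which may not hold for blocks drawn from the same scanning run.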

A summary of the classifier performance for all subjects is shown in Fig. 3 (it should be noted that the accuracies presented here are computed in a more conservative manner than in Haxby

The information content of small quantities of fMRI data

The fact that it is possible to gain large amounts of information about which category of objects a subject is viewing from such small quantities of fMRI data is surprising for several reasons. First, while most fMRI experiments pool data collected over many minutes for each of many subjects to find subtle differences in activation across tasks, the present method can infer what category of object a subject is viewing (including categories as similar as horses and cows) using just 20 s worth of

Acknowledgements

We thank Michael Burns and Michael Beauchamp for helpful discussions and comments. This work was supported by the Rowland Institute for Science (now The Rowland Institute at Harvard University) and the Athinoula A. Martinos Center for Biomedical Imaging.
