Abstract
In human adults, multiple cortical regions respond robustly to faces, including the occipital face area (OFA) and fusiform face area (FFA), implicated in face perception, and the superior temporal sulcus (STS) and medial prefrontal cortex (MPFC), implicated in higher-level social functions. When in development does face selectivity arise in each of these regions? Here, we combined two awake infant functional magnetic resonance imaging (fMRI) datasets to create a sample twice the size of those in previous reports (n = 65 infants; 2.6–9.6 months). Infants watched movies of faces, bodies, objects, and scenes while fMRI data were collected. Despite variable amounts of data from each infant, individual subject whole-brain activation maps revealed responses to faces compared to nonface visual categories in the approximate location of OFA, FFA, STS, and MPFC. To determine the strength and nature of face selectivity in these regions, we used cross-validated functional region of interest analyses. Across this larger sample, face responses in OFA, FFA, STS, and MPFC were significantly greater than responses to bodies, objects, and scenes. Even the youngest infants (2–5 months) showed significantly face-selective responses in FFA, STS, and MPFC, but not OFA. These results demonstrate that face selectivity is present in multiple cortical regions within months of birth, providing powerful constraints on theories of cortical development.
- cerebral cortex
- faces
- FFA
- fMRI
- infant brain
- MPFC
- OFA
- STS
Significance Statement
Social cognition often begins with face perception. In adults, several cortical regions respond robustly to faces, yet little is known about when and how these regions first arise in development. To test whether face selectivity changes in the first year of life, we combined two datasets, doubling the sample size relative to previous reports. In the approximate location of the fusiform face area, superior temporal sulcus, and medial prefrontal cortex but not occipital face area, face selectivity was present in the youngest group. These findings demonstrate that face-selective responses are present across multiple lobes of the brain very early in life.
Introduction
Faces are highly salient visual and social features of our environment. In human adults, many cortical regions show robust and selective responses to faces (Haxby et al., 2000; Dai and Scherf, 2023). We focus on four such regions: the occipital face area (OFA) in the inferior occipital gyrus (IOG; Gauthier et al., 2000), the fusiform face area (FFA) in the fusiform gyrus (Kanwisher et al., 1997), and regions in the superior temporal sulcus (STS) and the medial prefrontal cortex (MPFC). Here we ask when in development each of these regions first responds selectively to faces.
OFA and FFA are both visual regions, responding robustly to visually presented faces and much less to any other visual category. OFA is anatomically posterior to FFA and appears to encode face features or parts (Henriksson et al., 2015). FFA, on the other hand, appears to encode the presence and identity of a face more holistically (Grill-Spector et al., 2004; Yovel and Kanwisher, 2005). Individual neurons within FFA are highly face-selective (Axelrod et al., 2019; Khuvis et al., 2021), and electrically stimulating this area can distort or create face percepts (Parvizi et al., 2012; Rangarajan et al., 2014; Schalk et al., 2017; Jonas et al., 2018).
In contrast, STS and MPFC contain regions that respond robustly to faces, but these responses are modulated by social context, and the same regions also respond to socially relevant stimuli that are not faces. A region in STS is face-selective when compared to other visual non-face categories and preferentially responds to socially relevant movements of faces, like facial expressions and shifts of eye gaze (Pitcher et al., 2011), but also responds to human voices (Deen et al., 2020). Similarly, a region in MPFC is face-selective on visual tasks, but the response to faces is influenced by the faces’ social attributes such as moral goodness and attractiveness (LaBar, 2003; O’Doherty et al., 2003; Cheng et al., 2022). The same region in MPFC also responds to socially relevant diagrams and verbal narratives (Kosakowski et al., 2022b). In sum, in adults OFA, FFA, STS, and MPFC all have face-selective responses, though plausibly performing different visual and social functions.
Initial functional magnetic resonance imaging (fMRI) studies revealed substantial changes in the extent and magnitude of face-selective responses throughout childhood and early adolescence, suggesting that face selectivity is slow to develop (Golarai et al., 2007, 2010, 2015; Scherf et al., 2007; Peelen et al., 2009; Cohen Kadosh et al., 2011, 2013; Joseph et al., 2011; Haist et al., 2013; Natu et al., 2016; Nordt et al., 2021; Tian et al., 2021; Feng et al., 2022). On the other hand, human infants have at least modest preferential responses to faces in the approximate location of OFA, FFA, STS, and MPFC (Tzourio-Mazoyer et al., 2002; Lloyd-Fox et al., 2009, 2011, 2017; Ichikawa et al., 2010; Deen et al., 2017; Powell et al., 2018; Lisboa et al., 2020b). Most of these prior studies did not measure responses to other visual categories and so could not establish whether the responses were face-selective. One recent study reported face-selective responses in FFA in infants but did not investigate STS or MPFC (Kosakowski et al., 2022a). Additionally, because of small sample sizes, prior studies could not resolve when within the first year of life face-selective responses in these regions first appear. Thus, it remains an open question when each of these cortical regions first shows face-selective responses (Scott and Arcaro, 2023).
fMRI is the only neuroimaging method that has the coverage and spatial resolution to measure neural responses simultaneously in OFA, FFA, STS, and MPFC. A substantial challenge for fMRI with infant populations is that fMRI requires the participant to be still for long periods of time during data acquisition, making awake infant fMRI studies rare (Ellis et al., 2020). For the current study, we combined two fMRI datasets that were collected (Fig. 1a) while infants watched dynamic videos of faces (Fig. 1b), bodies, objects, and scenes (Fig. 1c). In this combined dataset, infants ranged in age from 2 to 9 months (n = 65; Fig. 1d). As a result, the oldest infants had three times as much postbirth experience as the youngest ones. Thus, these data allow us to test whether, and when, cortical face selectivity emerges in OFA, FFA, STS, and MPFC in the first year of human infants’ lives.
Materials and Methods
Infant fMRI data
To investigate possible age-related changes in face selectivity in each region, we made two key changes to our analyses, compared with Kosakowski et al. (2022a). First, to get the largest possible sample of infants in each age group, we combined data that were collected using two different infant head coils and three different sequences, adjusting for differences in spatial resolution and distortions and temporal signal-to-noise ratio (tSNR) by using larger parcels created using the Glasser atlas (Glasser et al., 2016). Second, we used a contrast of faces greater than the response to all non-face conditions (Kosakowski et al., 2022a used face > objects). Other differences between the two analysis streams are noted below.
One group of infants (n = 31) was scanned from June 2016 to July 2019 with one coil (Coil 2011 described below; Keil et al., 2011) using a sinusoidal acquisition sequence (Zapp et al., 2012) on a 3 T Siemens Trio Scanner. A second group of infants (n = 56) was scanned from July 2019 to February 2020 with a different coil (Coil 2021 described below; Ghotra et al., 2021) using a higher-resolution acquisition sequence (see below, Data collection) on a 3 T Siemens Prisma Scanner. Overall, the tSNR is lower in Coil 2011 than Coil 2021 data. Data from the two coils do not differ significantly in age (Coil 2011 mean, 5.04 months; Coil 2021 mean, 5.61 months; t(57.07) = −1.23; p = 0.22; ci = −1.50 to 0.36) or motion (Coil 2011 mean, 0.20; Coil 2021 mean, 0.19; t(62.41) = 0.41; p = 0.68; ci = 0.20–0.19).
Each infant was scheduled for up to eight visits. Any usable data that were collected within a 30 d window on the same coil with the same acquisition sequence were analyzed as a single session [see below, Data selection (subrun creation)]. In total, we had 74 sessions from 61 individuals (2.0–9.8 months; mean, 5.2 months). For each session, we collected 4.30–112.70 min of data (mean, 34.92 min; SD, 22.45 min). To be eligible for analysis, we identified segments of usable data with <2 mm/radians of frame-to-frame displacement, resulting in 65 usable sessions (2.1–9.8 months; mean, 5.3) from 53 unique individuals with 2.05–57.25 min of usable data per session (Fig. 1d; mean, 16.99; SD, 12.82). Of these, 49 sessions from 46 unique individuals had enough data to be included in whole-brain random effects analyses, and 37 sessions from 33 unique individuals met the inclusion criteria for functional region of interest (fROI) analyses (Fig. 1d).
Participants
Infants (n = 86; 2.0–11.9 months; mean, 5.4 months; 41 females) were recruited from the metro area in and around Boston, MA through word of mouth, fliers, and social media. These data have been previously reported in Kosakowski et al. (2022a), mainly reporting data from Coil 2021 and measuring responses only in visually responsive category-selective regions. Information about which infants were included in the present analyses compared with those in Kosakowski et al. (2022a) is included in a table on OSF (https://osf.io/h7rbv/). Here, the data were reanalyzed with a focus on face-responsive regions across the cerebral cortex and combining data from both Coil 2021 (Ghotra et al., 2021) and Coil 2011 (Keil et al., 2011) to create a larger sample. Usable data [see below, Data selection (subrun creation)] were collected from 65 infants (2.6–11.9 months; 24 females; 31 from Coil 2021 and 34 from Coil 2011). Parents of participants were provided parking or reimbursed travel expenses. Participants received a small compensation for each visit and, when possible, printed images of their brain. Parents of participants provided informed consent, and all protocols were approved by the Institutional Review Board at MIT.
Experimental paradigms
Paradigm 1
Infants watched videos of faces (Fig. 1b), bodies, objects, and scenes (Fig. 1c; Pitcher et al., 2011). A colorful, curvy, abstract baseline was used to maintain infants’ attention during baseline blocks. Videos were selected to be categorically homogeneous within blocks and heterogeneous between blocks. Each block was 18 s and was composed of six 3 s videos from the same category. Face videos showed one child's face on a black background. Object videos showed toys moving. Body videos showed children's hands or feet on a black background. Scene videos showed natural landscapes. Baseline blocks were also 18 s and consisted of six 3 s videos that featured abstract color scenes such as liquid bubbles or tie-dyed patterns. The block order was pseudorandom such that all blocks played once prior to playing again. Videos played continuously for as long as the infant was content, paying attention, and awake.
Paradigm 2
Infants watched videos from the same five conditions as in Paradigm 1. However, the videos were shortened to 2.7 s and interleaved with still images from the same category (but not drawn from the videos) presented for 300 ms. All blocks were 18 s and included six videos and six images. Video and image orders were randomized within blocks, and the block order was pseudorandom by category. Paradigm 2 contained one additional block depicting hand–object interactions which was not included in the present analysis.
Data collection
Infants were swaddled if possible (Fig. 1a). A parent or researcher went into the scanner with the infant, while a second adult stood outside the bore of the scanner. Infants heard lullabies (https://store.jammyjams.net/products/pop-goes-lullaby-10) for the duration of the scan. For data collected with Coil 2011, lullabies were played over a loudspeaker into the scanning room. For data collected with Coil 2021, lullabies were played through custom infant headphones (Fig. 1a).
Coil 2011
For data collected with Coil 2011, we used a custom 32-channel infant coil designed for 3 T Siemens Trio Scanner (Keil et al., 2011) and a quiet EPI with sinusoidal trajectory (Zapp et al., 2012) with 22 near-axial slices [repetition time (TR), 3 s; echo time (TE), 43 ms; flip angle, 90°; field of view (FOV), 192 mm; matrix, 64 × 64; slice thickness, 3 mm; slice gap, 0.6 mm]. The sinusoidal acquisition sequence caused substantial distortions in the functional images.
Coil 2021
Infants wore custom infant MR-safe headphones. Infant headphones attenuated scanner noises and allowed infants to hear the lullabies. An adjustable coil design (Ghotra et al., 2021) increased infant comfort and accommodated headphones as well as a variety of head sizes (Fig. 1a). The new infant coil and infant headphones designed for 3 T Siemens Prisma Scanner enabled the use of an EPI with standard trajectory with 44 near-axial slices (TR, 3 s; TE, 30 ms; flip angle, 90°; FOV, 160 mm; matrix, 80 × 80; slice thickness, 2 mm; slice gap, 0 mm). Six infants had data collected using a different EPI with standard trajectory with 52 near-axial slices (TR, 2 s; TE, 30 ms; flip angle, 90°; FOV, 208 mm; matrix, 104 × 104; slice thickness, 2 mm; slice gap, 0 mm). Functional data collected with Coil 2021 were less distorted than data collected with Coil 2011.
Data selection (subrun creation)
To be included in the analysis, data had to meet criteria for low head motion (Deen et al., 2017; Kosakowski et al., 2022a). Data were cleaved between consecutive timepoints with >2 radians or millimeter of frame-to-frame displacement, creating subruns, which had to contain at least 24 consecutive low-motion volumes to be included in further analysis. All volumes included in a subrun were extracted from the original run data and combined to create a new NIfTI file for each subrun. Paradigm files indicating which condition occurred at each time point were similarly updated for each subrun. Volumes with greater than 0.5 radians or millimeter of frame-to-frame displacement from the previous or following volume were scrubbed (i.e., removed) from all analyses. Data collected within a 30 d window from a single subject were analyzed as one session. Figure 2a shows the amount of data collected and the amount of data included in the subruns, prior to scrubbing, for each participant session. The total amount of data initially collected was negatively correlated with age (r = −0.26; p = 0.003), and the proportion of data we retained per session was positively correlated with age (Fig. 2b; r = 0.39; p < 0.001). That is, older infants had shorter sessions and tended to be still for greater proportions of those sessions. Younger infants were more likely to move and required longer sessions to produce the same eventual yield of usable data.
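The cleaving and scrubbing rules above can be illustrated with a minimal Python sketch. It is not the analysis code used here (that code is on OSF); the function names, the constant names, and the convention that `fd[i]` holds the displacement between volume `i-1` and volume `i` are assumptions made for illustration. Only the thresholds (2, 0.5, and 24 volumes) come from the text.

```python
from typing import List, Tuple

CLEAVE_THRESHOLD = 2.0    # mm or radians: cleave the run between such volumes
SCRUB_THRESHOLD = 0.5     # mm or radians: scrub volumes adjacent to such motion
MIN_SUBRUN_VOLUMES = 24   # minimum consecutive low-motion volumes per subrun


def cleave_into_subruns(fd: List[float]) -> List[Tuple[int, int]]:
    """Return (start, stop) volume ranges (stop exclusive) of usable subruns.

    Assumed convention: fd[i] is the frame-to-frame displacement between
    volume i-1 and volume i, with fd[0] == 0.
    """
    subruns = []
    start = 0
    for i in range(1, len(fd)):
        if fd[i] > CLEAVE_THRESHOLD:
            # Cut between volume i-1 and volume i; keep the segment if long enough.
            if i - start >= MIN_SUBRUN_VOLUMES:
                subruns.append((start, i))
            start = i
    if len(fd) - start >= MIN_SUBRUN_VOLUMES:
        subruns.append((start, len(fd)))
    return subruns


def scrub_mask(fd: List[float], start: int, stop: int) -> List[bool]:
    """True for volumes to keep within one subrun.

    A volume is scrubbed if its displacement from the previous volume or
    to the following volume exceeds SCRUB_THRESHOLD.
    """
    keep = []
    for i in range(start, stop):
        from_prev = fd[i] if i > start else 0.0
        to_next = fd[i + 1] if i + 1 < stop else 0.0
        keep.append(from_prev <= SCRUB_THRESHOLD and to_next <= SCRUB_THRESHOLD)
    return keep
```

In this sketch a segment shorter than 24 volumes between two cleave points is simply discarded, which matches the inclusion rule stated above but ignores the additional sleep-related exclusions described under preprocessing.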
Figure 2-1
Analyses of temporal signal-to-noise ratio (tSNR) in awake infant fMRI data. Plots show estimated effects of run length and run type [i.e., subruns (red) vs concatenated (blue)] on tSNR in (a) Coil 2011 and (b) Coil 2021 data. In Coil 2011 data (a), the tSNR decreases as a function of run length, but the decrease is greater when the subruns are concatenated. In Coil 2021 data (b), the tSNR decreases as a function of run length, but this effect is not modulated by concatenating the runs. All estimated effects are plotted using plot_model from the sjPlot package in R.
Participants had to have at least 5 min of low-motion data to be included in whole-brain analyses. Only one session from each participant was included in each RFX analysis (which were run separately for Coil 2011 and Coil 2021). For fROI analyses, subruns were combined or split, as necessary, to create subruns with at least 96 volumes each. Subruns were designed to have approximately the same number of volumes, within participant (see below for more information). To be included in the fROI analysis, participants had to have at least two subruns (one to choose voxels and the other to extract independent response magnitudes from the selected voxels).
fMRI data preprocessing
Each subrun was processed individually. First, an individual functional image was extracted from the middle of the subrun to be used for registering the subruns to one another for further analysis. Then, each subrun was motion corrected using FSL MCFLIRT. If >3 consecutive images had >0.5 mm or 0.5 radians of motion, there had to be at least seven consecutive low-motion volumes following the last high-motion volume for those volumes to be included in the analysis. Additionally, each subrun had to have at least 24 volumes after accounting for motion and timepoints when the infants appeared to be asleep (e.g., with eyes closed). Functional data were skull-stripped (FSL BET2), intensity normalized, and spatially smoothed with a 3 mm FWHM Gaussian kernel (FSL SUSAN).
Data registration
All subruns were aligned within subjects, and then each subject was registered to a standard template. First, the middle image of each subrun was extracted and used as an example image for registration. If the middle image was corrupted by motion or distortion, a better image was selected as the example image. The example image from the middle subrun of the first visit with usable data was used as the target image. All other subruns from each subject were registered to that subject's target image using FSL FLIRT. The target image for each subject was then registered to a template image using FSL FLIRT. For data collected with Coil 2011, the template image was taken from Deen et al. (2017). For data collected with Coil 2021, the template image was taken from Kosakowski et al. (2022a). Given the distortion of the images and the lack of an anatomical image for each subject, traditional registration tools do not effectively register infant data between subjects. As such, we attempted to register each image using a rigid, an affine, and a partial affine registration with FSL FLIRT. The best image registration was selected by eye from the three options and manually tuned using the FreeSurfer GUI for the best possible data alignment. Each image took between 2 and 8 h of human labor to register. Images collected with Coil 2021 were transformed into the anatomical space of the template image for visualization.
A potential concern is the impact of concatenating subruns on the measured timecourses. To address this, we directly tested the effect of analytic partition types (i.e., subruns vs concatenated subruns) on the tSNR across all voxels in all parcels (see below, fROI analysis, for information about parcels). The model also included predictors for run length, coil, and parcel and only included runs with at least 96 volumes. There was no main effect of partition type [i.e., subruns with vs without concatenation (F(1/576.58) = 0.95; p = 0.33)], but tSNR was lower in longer runs (F(2/82.27) = 26.42; p < 0.00001) and higher in data collected using Coil 2021 (F(2/82.27) = 26.42; p < 0.00001), and there was a significant three-way interaction between run length, coil, and partition type (F(2/567.55) = 3.49; p = 0.03). In Coil 2011 data (Extended Data Fig. 2-1a), tSNR decreased as a function of run length, and this decrease was more pronounced in concatenated subruns (F(1/332.42) = 10.44; p = 0.001), but there was no such effect in Coil 2021 data (Extended Data Fig. 2-1b; F(1/288.20) = 0.0009; p = 0.98). These results demonstrate that Coil 2011 data have lower tSNR, likely due to differential distortion patterns.
Subject-level beta and contrast maps
Functional data were analyzed with a whole-brain voxel–wise general linear model (GLM) using custom MATLAB scripts. The GLM included four condition regressors (faces, bodies, objects, and scenes), six motion regressors, a linear trend regressor, and five principal component analysis (PCA) noise regressors. PCA noise regressors are analogous to GLMdenoise (Kay et al., 2013). Condition regressors were defined as a boxcar function for the duration of each condition block (18 s). Infant inattention or sleep was accounted for using a single nuisance (“sleep”) regressor. The sleep regressor was defined as a boxcar function with a 1 for each TR the infant was not looking at the stimuli, and the corresponding TR was set to 0 for all condition regressors. Boxcar condition and sleep regressors were convolved with an infant hemodynamic response function (HRF) that is characterized by a longer time to peak and deeper undershoot compared with the standard adult HRF (Arichi et al., 2012). Next, data and all regressors except PCA noise regressors were concatenated across subruns. PCA noise regressors were computed across concatenated data, and beta values were computed for each condition in a whole-brain voxel–wise GLM. Subject-level contrast maps to test for face-selective responses were computed as the difference between the face beta and the average of all nonface (i.e., bodies, objects, and scenes) betas for each voxel using in-house MATLAB code.
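As an illustration of the regressor and contrast definitions above, the following Python sketch builds unconvolved boxcars for the four conditions plus the sleep regressor and computes the faces > nonfaces contrast from a set of betas. It is a simplified stand-in for the MATLAB pipeline, not a reproduction of it: the function names and the per-TR label format are hypothetical, and HRF convolution, motion, trend, and PCA regressors are omitted.

```python
from typing import Dict, List

CONDITIONS = ["faces", "bodies", "objects", "scenes"]


def build_boxcars(block_labels: List[str], looking: List[bool]) -> Dict[str, List[int]]:
    """Build per-TR boxcar regressors before HRF convolution.

    block_labels[t] names the condition shown at TR t (anything outside
    CONDITIONS, e.g. "baseline", gets no condition regressor).
    looking[t] is False when the infant was not looking at the stimuli.
    """
    n_trs = len(block_labels)
    regs = {c: [0] * n_trs for c in CONDITIONS}
    regs["sleep"] = [0] * n_trs
    for t, (label, is_looking) in enumerate(zip(block_labels, looking)):
        if not is_looking:
            # Inattentive TR: modeled by the sleep regressor and
            # left at 0 in every condition regressor.
            regs["sleep"][t] = 1
        elif label in CONDITIONS:
            regs[label][t] = 1
    return regs


def face_contrast(betas: Dict[str, float]) -> float:
    """Face beta minus the average of the body, object, and scene betas."""
    nonface = [betas[c] for c in CONDITIONS if c != "faces"]
    return betas["faces"] - sum(nonface) / len(nonface)
```

With a 3 s TR, each 18 s block spans six TRs, so a face block with one inattentive TR contributes five ones to the face regressor and one to the sleep regressor.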
Group random effect analysis
Due to variable distortions in the BOLD images across participants, and lack of a T1 and/or T2 image from most infants, data registration across participants is imperfect. Additionally, the sequences used with each coil created very different patterns of spatial distortion, so we conducted separate group random effect analyses for data collected from each coil. First, subject-level contrast difference (faces–nonface) maps were transformed into coil-specific template space. Group RFX analyses were performed using FreeSurfer mri_concat and FreeSurfer mri_glmfit. In the dataset used for whole-brain analyses, the amount of motion (the number of scrubbed volumes divided by the number of total volumes) was negatively correlated with age (Fig. 2c; r = −0.27; p = 0.03).
fROI analysis
Group RFX analyses are imperfect because they rely on high-quality registrations to a common template across subjects and they do not respect idiosyncratic anatomical and functional differences across individuals. Thus, to determine if cortical responses are face-selective, we utilized an fROI approach. Using fROI analyses enables us to (1) account for individual anatomical variability (Saxe et al., 2006), (2) more rigorously characterize responses using a cross-validation procedure (Nieto-Castañón and Fedorenko, 2012), and (3) require high-quality within–subject registrations while tolerating imperfect across-subject registrations.
We account for the variable amount of data in each subrun for each subject (n = 37 sessions; 33 unique individuals) and the impact this could have on reliable parameter estimates from the GLM by first combining or splitting subruns. This allowed us to approximately equate the amount of data across subruns within each subject. For example, if a subject had three subruns and the first had 30 volumes, the second had 75 volumes, and the third had 325 volumes, then we concatenated the first two subruns to create one subrun, and we split the third subrun into three resulting in a total of four subruns with approximately 100 volumes each. For data included in fROI analyses, motion and age were not correlated (Fig. 2d; r = −0.04; p = 0.8). Thus, the fROI analyses allow for a measurement of age effects that is not confounded with motion.
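The combine-or-split step can be sketched in Python as follows. The exact rule used to decide which subruns to merge and where to split is not specified in the text, so the greedy heuristic below is an assumption chosen only because it reproduces the worked example (30, 75, and 325 volumes becoming four subruns of roughly 100 volumes). The function name is hypothetical, the 96-volume minimum comes from the inclusion criteria, and the sketch tracks subrun lengths rather than the image data themselves.

```python
from typing import List

MIN_VOLUMES = 96  # minimum number of volumes per subrun for fROI analyses


def rebalance_subruns(lengths: List[int], min_volumes: int = MIN_VOLUMES) -> List[int]:
    """Return new subrun lengths after merging short and splitting long subruns."""
    # Step 1: merge consecutive subruns until each chunk reaches the minimum.
    merged = []
    acc = 0
    for n in lengths:
        acc += n
        if acc >= min_volumes:
            merged.append(acc)
            acc = 0
    if acc and merged:
        merged[-1] += acc  # fold a short trailing remainder into the last chunk

    # Step 2: split each chunk into as many near-equal parts as still
    # satisfy the minimum length.
    balanced = []
    for n in merged:
        k = n // min_volumes
        base, extra = divmod(n, k)
        balanced.extend([base + 1] * extra + [base] * (k - extra))
    return balanced


print(rebalance_subruns([30, 75, 325]))  # [105, 109, 108, 108]
```

A session whose total falls below the minimum yields an empty list under this heuristic, and a session that yields a single subrun would still fail the two-subrun requirement for cross-validation.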
To constrain search areas for voxel selection, we used anatomically defined parcels transformed to subject-specific BOLD space. Due to the distortions in the Coil 2011 dataset, we opted to use larger parcels than the FFA and OFA parcels used in Kosakowski et al. (2022a). We created large parcels that extended well beyond the boundaries of face regions in the IOG (the approximate location of OFA) and ventral temporal cortex (VTC, the approximate location of FFA), STS, and MPFC using the Glasser atlas (Glasser et al., 2016). The large OFA parcel included Glasser areas LO1, LO2, LO3, V4, V4t, and PIT. The large FFA parcel included Glasser areas VMV1, VMV2, VMV3, VVC, PHA1, PHA2, PHA3, and FFC. For the MPFC parcel, we used Glasser areas p24, d32, 9m, and p32. For the STS parcel, we used Glasser areas STSvp, STSva, STSdp, STSda, and STV. All parcels were transformed into infant-specific functional space by concatenating the subrun-to-infant template registration matrix with the infant template-to-MNI registration matrix and inverting those transformations.
We used an iterative leave-one-subrun-out procedure such that data were concatenated across all subruns except one. Then, whole-brain voxel–wise GLMs and contrast maps were computed. The top 5% of voxels that had a greater response to faces than the average response to nonface conditions within an anatomical constraint parcel were selected as the fROI for that subject. Then, the parameter estimates (i.e., beta values) for all four conditions were extracted from the left-out subrun. For all bar plots, beta values were averaged across participant sessions.
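The logic of the leave-one-subrun-out procedure can be sketched in Python as below. This is a simplified illustration under stated assumptions: it takes per-subrun beta values for the voxels of one parcel as given and averages them across the training subruns, whereas the analysis described here refits the GLM on the concatenated training data. All names are hypothetical.

```python
from typing import Dict, List

CONDITIONS = ["faces", "bodies", "objects", "scenes"]
Betas = Dict[str, List[float]]  # condition -> one beta per voxel in the parcel


def mean_over_subruns(subruns: List[Betas], cond: str) -> List[float]:
    """Voxel-wise mean beta for one condition across subruns."""
    n_vox = len(subruns[0][cond])
    return [sum(s[cond][v] for s in subruns) / len(subruns) for v in range(n_vox)]


def select_top_voxels(train: List[Betas], fraction: float = 0.05) -> List[int]:
    """Indices of the top `fraction` of voxels by faces > nonfaces contrast."""
    faces = mean_over_subruns(train, "faces")
    nonface = [mean_over_subruns(train, c) for c in CONDITIONS[1:]]
    contrast = [
        faces[v] - sum(m[v] for m in nonface) / len(nonface)
        for v in range(len(faces))
    ]
    n_select = max(1, int(round(fraction * len(contrast))))
    ranked = sorted(range(len(contrast)), key=lambda v: contrast[v], reverse=True)
    return ranked[:n_select]


def leave_one_subrun_out(subruns: List[Betas], fraction: float = 0.05) -> Dict[str, float]:
    """Mean held-out response per condition, averaged across folds."""
    totals = {c: 0.0 for c in CONDITIONS}
    for held_out in range(len(subruns)):
        # Voxels are chosen from all subruns except the held-out one ...
        train = subruns[:held_out] + subruns[held_out + 1:]
        froi = select_top_voxels(train, fraction)
        # ... and responses are read out from the held-out subrun only.
        for c in CONDITIONS:
            totals[c] += sum(subruns[held_out][c][v] for v in froi) / len(froi)
    return {c: totals[c] / len(subruns) for c in CONDITIONS}
```

Because voxel selection and response extraction never use the same subrun, a voxel that looks face-selective in the training data only by chance contributes no systematic face preference to the held-out estimate.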
To determine whether a region's response was category-selective, we fit the beta values using a linear mixed effects model. In each model, we indicator-coded the three control conditions to test the hypothesis that the response to each control condition was significantly lower than the response to the face condition. Specifically, we fit a model in R using the lme4 software package (Bates et al., 2014) with the following expression:
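The model expression itself does not appear in this text. Going by the description in the caption of Table 1-1 (indicator-coded body, object, and scene predictors; sex and z-scored age as fixed effects; subject as a random effect), a plausible form can be sketched as below. The symbols are illustrative notation rather than the lme4 call that was used, and the random effect is assumed to be a random intercept.

```latex
\beta_{ik} = \gamma_0
  + \gamma_1\,\mathrm{body}_{ik}
  + \gamma_2\,\mathrm{object}_{ik}
  + \gamma_3\,\mathrm{scene}_{ik}
  + \gamma_4\,\mathrm{sex}_{i}
  + \gamma_5\,z(\mathrm{age})_{ik}
  + u_i + \varepsilon_{ik},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
\varepsilon_{ik} \sim \mathcal{N}(0, \sigma^2)
```

Here \beta_{ik} is the k-th extracted beta value from subject i, and body, object, and scene are 0/1 indicators with faces as the reference condition. Under this coding, \gamma_0 is the face response relative to baseline, and a significantly negative \gamma_1, \gamma_2, or \gamma_3 indicates a lower response to that condition than to faces.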
To test for condition by age interactions, we used the following model in R:
To test for laterality effects between the left and right hemispheres, we fit the following model for each fROI in R:
To test if EVC has a different functional profile than face-selective regions, we fit the following model in R:
We also computed weighted LME models to account for the variable amount of data each subject contributed to each condition. The results were similar for these additional models and are reported in Extended Data Tables S1–S3.
Code and data availability
All code is provided in a repository on Open Science Framework (OSF; https://osf.io/h7rbv/). All data for the fROI analysis, figures with face activations for all participants, and infant ID comparison to Kosakowski et al. (2022a) are also provided on OSF.
Results
Whole-brain contrast maps
First, we asked if infants have face responses in the approximate locations of OFA, FFA, STS, and MPFC by visualizing whole-brain maps in individual infants. At a lenient statistical threshold (p < 0.01, uncorrected), visual inspection of contrast maps (faces > nonfaces) from all sessions with sufficient usable data (n = 65) revealed face activations in the approximate location of OFA, FFA, STS, and MPFC in many infants scanned on both Coil 2011 and Coil 2021. Importantly, infants with different amounts of data had face responses in the approximate location of each expected region (e.g., Fig. 3; all infants on OSF; https://osf.io/h7rbv/).
The Coil 2021 group RFX map (Fig. 4) showed face > nonface activations in the approximate locations of OFA, FFA, STS, and MPFC. For Coil 2011 group RFX, there were face > nonface activations in the approximate locations of STS and MPFC but not in the approximate locations of OFA or VTC (Extended Data Fig. 4-1). These activations did not survive correction for multiple comparisons, though note that registration across infants was only approximate, given the highly distorted functional images and the absence of a high-resolution individual anatomical image for most individuals.
Figure 4-1
Group face responses in infant cerebral cortex. Whole-brain group random effects analysis of Coil 2011 at a lenient threshold (p < 0.05) revealed face activations (faces > non-faces) in (c) superior temporal sulcus (STS) and (d) medial prefrontal cortex (MPFC). Face activations were not observed in the group in (a) inferior occipital gyrus (IOG), the approximate location of OFA in adults, or (b) ventral temporal cortex (VTC), the approximate location of FFA in adults. Hot colors indicate face activations; cool colors indicate average response to non-faces. Activation clusters did not survive correction for multiple comparisons. Activations for each region are shown on an infant template BOLD image in coronal (top row), axial (middle row), and sagittal (bottom row) views and highlighted with a red circle. Results for Coil 2021 data are visualized in Figure 4.
fROIs
All fROI analyses were conducted in n = 37 sessions (33 unique individuals) who had at least two subruns. IOG is the approximate location of OFA in adults. Across all infants in the IOG fROI analysis (Fig. 5a; Table 1), the response to faces was significantly greater than the response to bodies (p = 0.004), objects (p = 0.049), and scenes (p < 0.001). The overall magnitude of response across all conditions increased with age (p = 0.000006), and there was an age by condition interaction (p = 0.003; Table 2). Post hoc analyses revealed that the age by condition interaction was driven by a lower response to scenes in older infants (p = 0.003; Extended Data Fig. 5-1a; Table 3). Using a median split by age, we ran separate fROI analyses of younger and older infants (Fig. 5a; Table 1). In the younger infants alone, the response to faces was greater than the scene response (p = 0.03) but not significantly different from the object or body responses (ps > 0.1), but in the older infants alone, the face response was significantly greater than the response to each other condition (all ps < 0.03). To test for any laterality effects, we analyzed left and right hemispheres separately. Despite a hemisphere by age interaction (p = 0.04; Table 4) and a condition by age interaction (p = 0.03), the condition by hemisphere by age interaction was not significant (p > 0.3). Post hoc analyses indicated there was no difference in responses to faces in right versus left hemispheres (p > 0.4; Fig. 6a; Extended Data Table 4-4).
Figure 5-1
Effect of age on the response magnitude for each condition in each fROI. Scatter plots show magnitudes for each condition from fROI analyses collapsed across Coil 2011 and Coil 2021 datasets as a function of age. ROIs include (a) inferior occipital gyrus (IOG), the approximate location of OFA, (b) ventral temporal cortex (VTC), the approximate location of FFA, (c) superior temporal sulcus (STS), (d) medial prefrontal cortex (MPFC), and (e) early visual cortex (EVC). Face betas are plotted in purple, body betas are plotted in pink, object betas are plotted in yellow, and scene betas are plotted in green. Age is z-scored. Symbols indicate statistics from linear mixed effects models: *p < 0.05. Additional statistics are reported in Table 3.
Figure 6-1
Effect of age on the response magnitude for each condition in each fROI. Scatter plots show magnitudes for each condition from fROI analyses collapsed across Coil 2011 and Coil 2021 datasets as a function of age. ROIs include (a, b) left and right inferior occipital gyrus (IOG), the approximate location of OFA, (c, d) left and right ventral temporal cortex (VTC), the approximate location of FFA, and (e, f) superior temporal sulcus (STS). Age is z-scored. Face betas are plotted in purple, body betas are plotted in pink, object betas are plotted in yellow, and scene betas are plotted in green. Symbols indicate statistics from linear mixed effects models: p < 0.1; *p < 0.05; **p < 0.01. Additional statistics are reported in Table 4-2.
Table 1-1
Face selectivity in functional regions of interest with condition weights. Parameter estimates from linear mixed effects models with beta values for each condition as predictors. Indicator-coded vectors were used to test if body, object, and scene responses are each significantly less than the response to faces. Sex and z-scored age were coded as fixed effects and subject was coded as a random effect. Standard error is indicated in parentheses; p < 0.05 is indicated in bold; p < 0.10 is indicated in italics. A negative number in bold indicates a significantly lower response to that condition than to faces. The intercept indicates the magnitude of the face response relative to baseline. Statistical models without weights are reported in Table 1.
Table 2-1
Interaction effects of age and fROI with condition weights. All results from linear mixed effects models converted to ANOVA table with R function anova; p < 0.05 is indicated in bold, p < 0.10 is indicated in italics. The same models without weights are reported in Table 2. Models testing for an interaction between age and condition in each hemisphere are in Table 2-2 (without weights) and Table 2-3 (with weights). Download Table 2-1, DOC file.
Table 2-2
Interaction effects of age in each fROI for each hemisphere. All results from linear mixed effects models converted to ANOVA table with R function anova; p < 0.05 is indicated in bold, p < 0.10 is indicated in italics. Models with weights are in Table 2-3. Download Table 2-2, DOC file.
Table 2-3
Interaction effects of age in each fROI for each hemisphere with condition weights. All results from linear mixed effects models converted to ANOVA with R function anova; p < 0.05 is indicated in bold, p < 0.10 is indicated in italics. Models without weights are in Table 2-2. Download Table 2-3, DOC file.
Table 3-1
Effect of age on each condition with condition weights. † Parameters estimated with a linear mixed effects model in R. Condition responses indicated in the left column are the predictors, z-scored age coded as a fixed effect, subject coded as a random effect. Standard error is indicated in parentheses. p < 0.05 is indicated in bold, p < 0.10 is indicated in italics. * Model was singular due to negligible contribution of participant in the random effects term. A linear model without subject as a random effect produces the same results without a singular fit. Statistical models without weights are reported in Table 3. Download Table 3-1, DOC file.
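To make the role of z-scored age concrete, the following Python sketch standardizes a set of ages and estimates the slope of one condition's response on standardized age. It is a simplified stand-in using ordinary least squares, without the subject random effect or condition weights described in this caption; the ages, response values, and function names are hypothetical, and the use of the sample standard deviation is an assumption.

```python
# Illustrative only: z-score age, then regress one condition's response on it.
from statistics import mean, stdev


def zscore(values):
    """Center on the mean and scale by the sample standard deviation."""
    mu = mean(values)
    sd = stdev(values)
    return [(v - mu) / sd for v in values]


def ols_slope_intercept(x, y):
    """Closed-form simple linear regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical ages (months) and face response magnitudes, one per infant.
ages_months = [2.6, 4.0, 5.5, 7.0, 9.6]
face_betas = [0.2, 0.3, 0.5, 0.6, 0.9]

age_z = zscore(ages_months)
slope, intercept = ols_slope_intercept(age_z, face_betas)
print(f"change in response per SD of age: {slope:.3f}")
print(f"response at the mean age: {intercept:.3f}")
```

Because standardized age has a mean of zero, the intercept in this sketch is the estimated response at the average age, and the slope is the change in response per standard deviation of age.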
Table 4-1
Face selectivity in each fROI for each hemisphere. Parameter estimates from linear mixed effects models with beta values for each condition as predictors. Indicator-coded vectors used to test if body, object, and scene responses are each significantly less than the response to faces. Sex and z-scored age were coded as fixed effects and subject was coded as a random effect. Standard error is indicated in parentheses; p < 0.05 is indicated in bold; p < 0.10 is indicated in italics. A negative number in bold indicates a significantly lower response to that condition than to faces. The intercept indicates the magnitude of the face response relative to baseline. Models with weights are in Table 4-2. Download Table 4-1, DOC file.
Table 4-2
Face selectivity in each fROI for each hemisphere with condition weights. Parameter estimates from linear mixed effects models with beta values for each condition as predictors. Indicator-coded vectors used to test if body, object, and scene responses are each significantly less than the response to faces. Sex and z-scored age were coded as fixed effects and subject was coded as a random effect. Standard error is indicated in parentheses; p < 0.05 is indicated in bold; p < 0.10 is indicated in italics. A negative number in bold indicates a significantly lower response to that condition than to faces. The intercept indicates the magnitude of the face response relative to baseline. Models without weights are in Table 4-1. Download Table 4-2, DOC file.
Table 4-3
Effect of age and hemisphere on face selectivity in each fROI with condition weights. All results from linear mixed effects models converted to ANOVA with R function anova; p < 0.05 is indicated in bold, p < 0.10 is indicated in italics. Statistics for models without weights are reported in Table 4. Download Table 4-3, DOC file.
Table 4-4
Effect of age on each condition in each hemisphere. † Parameters estimated with a linear mixed effects model in R. Condition responses indicated in the left column are the predictors, z-scored age coded as a fixed effect, subject coded as a random effect. Standard error is indicated in parentheses. p < 0.05 is indicated in bold, p < 0.10 is indicated in italics. * Model was singular due to negligible contribution of participant in the random effects term. A linear model without subject as a random effect produces the same results without a singular fit. No weights included in analyses. Download Table 4-4, DOC file.
Table 4-5
Effect of age and hemisphere on condition responses for each fROI. All results from linear mixed effects models converted to ANOVA with R function anova; p < 0.05 is indicated in bold, p < 0.10 is indicated in italics. No weights included in analyses. Download Table 4-5, DOC file.
VTC is the location of FFA in adults. Across all infants, in the fROI in VTC (Fig. 5b; Table 1), the response to faces was significantly greater than responses to objects, bodies, and scenes (all ps < 0.007). The overall magnitude of response across all conditions increased with age (p = 0.01), but the age by condition interaction did not reach significance (p = 0.09; Table 2). Post hoc analyses of each condition separately (Extended Data Fig. 5-1; Table 3) showed that face and object responses were significantly greater in older infants (faces p = 0.04; objects p = 0.03). In both the younger and older infants separately (Fig. 5b; Table 1), the face response was significantly greater than the response to bodies (younger, p = 0.001; older, p = 0.0009), objects (younger, p = 0.02; older, p = 0.04), and scenes (younger, p = 0.03; older, p = 0.00002). The response to all stimuli was higher overall in the right hemisphere (p = 0.0007; Table 4) and in older infants (p = 0.005), but there was no interaction between hemisphere and condition (p = 0.76) or hemisphere and age (p = 0.18). Post hoc analyses indicated there was no difference in responses to faces in right versus left hemispheres (p > 0.4; Fig. 6a; Tables 4–8).
Are the age effects observed in IOG reliably stronger than those in VTC, where the age by condition interaction did not reach significance? To address this question, we tested for an fROI (IOG vs VTC) by age by condition interaction using a linear mixed effects model (Table 2). Although we observed a significant fROI (IOG vs VTC) by age interaction (p = 0.03; Table 2) and an interaction between condition and age (p = 0.0004), the fROI by age by condition interaction was not significant (p = 0.46).
STS contains a region that responds to both faces and voices in adults. Across all infants, in the fROI analysis of STS, the response to faces was significantly greater than to objects, bodies, and scenes (all ps < 0.00002; Fig. 5c; Table 1). There was no main effect of age (p = 0.7) and no condition by age interaction (p = 0.15; Table 2). In fROI analyses of the younger and older infants separately, the face response was significantly greater than responses to any other condition (all ps < 0.004). Overall, response magnitudes were higher in older infants (age by condition interaction, p = 0.004), but interactions with the hemisphere were not significant (hemisphere by age, p = 0.08; hemisphere by age by condition, p > 0.8).
In adults, MPFC contains a region that responds to socially relevant stimuli, including faces. Across all infants, in the fROI analysis of MPFC, the response to faces was significantly greater than to objects, bodies, and scenes (all ps < 0.00005; Fig. 5d; Table 1). There was no main effect of age (p = 0.7) and no age by condition interaction (p = 0.3; Table 2). In the younger infants alone, the MPFC face response was greater than the response to each other condition (bodies, p = 0.02; objects, p = 0.08; scenes, p = 0.005; all ps < 0.0002 in weighted LME; Table 1 and Extended Data Table 1-1). Older infants also had face-selective responses in MPFC (all ps < 0.00003), and responses to each condition did not change with age (all ps > 0.14; Extended Data Fig. 5-1; Table 3).
To test whether the face-selective responses described above are spatially specific to cortical regions with face-selective responses in adults, we conducted an fROI analysis in occipital areas, the location of early visual cortex (EVC) in adults, where we do not expect to observe face-selective responses. In the fROI analysis of EVC (Fig. 5e), the face response was significantly greater than the response to objects (p = 0.006), but it was numerically lower than the baseline (p = 0.06) and was not statistically different from body and scene responses (all ps > 0.1; Table 1). Further, the face selectivity in IOG, VTC, and MPFC fROIs was significantly different from the absence of face selectivity in EVC (all ps < 0.05; Table 5). The condition by fROI interaction for STS versus EVC did not reach significance (p = 0.30; Table 5). Thus, even in voxels selected for maximally face-selective responses, we confirmed that responses in infant EVC are not face-selective and differ significantly from responses in IOG, VTC, and MPFC.
Discussion
Here we tested when face-selective cortical responses first arise in the approximate location of adult OFA, FFA, STS, and MPFC. We combined data collected with two different coils to double the sample size compared with prior reports (Kosakowski et al., 2022a) and tested whether face selectivity changes during the first year of life. To accommodate the different spatial distortions in the two coils, we used larger anatomical parcels, sacrificing spatial resolution in exchange for a larger sample of younger infants. Throughout the first year of life, including at the earliest ages we could measure, we observed face-selective responses in the approximate location of FFA, STS, and MPFC. In the approximate location of OFA, we did not observe face-selective responses in the youngest infants, but we did observe face-selective responses in older infants. Taken together, our results suggest that in humans, face-selective responses in multiple cortical regions emerge in infancy.
OFA
The region in IOG, near adult OFA, appeared to show the most developmental change in our dataset. The region was face-selective on average in infants, and there was a significant age by condition interaction. In the older half of infants, we found face-selective responses, but not in the younger half of infants. This difference was driven by a decreased response to scenes in older infants, not a change in the response to faces.
The possibility that face selectivity might arise in OFA relatively later than in FFA is intriguing, particularly because OFA was initially presumed to be the source of face-specific input for FFA (Haxby et al., 2000; Gobbini and Haxby, 2007). Yet face selectivity is still observed in FFA following focal damage to OFA (Weiner et al., 2016; Gao et al., 2019; Rossion, 2022). Thus, it is debated whether OFA is a necessary source of input to FFA in a single hierarchy or whether FFA receives sufficient input from other sources (Pitcher et al., 2014; Pitcher, 2022; Rossion, 2022). The current results, finding face-selective responses in the youngest infants in FFA but not yet in OFA, could be construed as evidence supporting this latter idea.
However, the weaker response in the location of OFA in younger infants should be interpreted with caution. Even in adults, OFA is small, variable, and difficult to detect (Rossion et al., 2012; Zhen et al., 2015; Schwarz et al., 2019; Dai and Scherf, 2023). Although we could not confirm a face-selective response near OFA in the youngest infants, we also did not find strong evidence for its absence. More data from young infants will likely be needed to establish more precisely when a face-selective response can first be detected in IOG.
FFA
Using the higher-resolution subset of these data collected with Coil 2021, a previous study reported that infants have face-selective responses in the approximate location of adult FFA (Kosakowski et al., 2022a). However, the sample was too small to test for developmental change within the first year. In the current analyses of the larger sample, we find no evidence that face selectivity is late to develop. Although older infants have a greater response to faces than younger infants, the youngest infants still have a detectable face-selective response in the approximate location of FFA, and there was no age by condition interaction.
The current results contribute to a long-standing debate about the origins of face selectivity in FFA. In children, FFA is face-selective, responding more to faces than to nonface visual categories (Passarotti et al., 2003; Aylward et al., 2005; Golarai et al., 2007, 2010, 2015; Scherf et al., 2007; Peelen et al., 2009; Pelphrey et al., 2009; Cantlon et al., 2011; Joseph et al., 2011; Natu et al., 2016; Dehaene-Lambertz et al., 2018; Nordt et al., 2021; Tian et al., 2021; Feng et al., 2022). However, compared with adults, children have (1) a smaller volume of the cortex with a significant face-selective response and (2) a smaller magnitude response difference between face and nonface categories (Golarai et al., 2007, 2010, 2015; Scherf et al., 2007; Peelen et al., 2009; Joseph et al., 2011; Haist et al., 2013; Natu et al., 2016; Nordt et al., 2021; Tian et al., 2021; Feng et al., 2022; although see Passarotti et al., 2003; Aylward et al., 2005; Dehaene-Lambertz et al., 2018). In children, the extent of face-selective cortex is correlated with face recognition and memory abilities (Golarai et al., 2007), which continue to develop into adulthood (Carey et al., 1992; Germine et al., 2011; Dundas et al., 2013).
Clearly, face-selective responses increase in extent and magnitude over childhood, but those observations do not establish when face-selective responses first emerge. The first neuroimaging studies with human infants suggested an early preferential response to faces in the approximate location of FFA, without full face selectivity. An early PET study demonstrated that young infants have responses to faces in the approximate location of FFA, STS, and MPFC (Tzourio-Mazoyer et al., 2002). But this study was limited because infants saw only two kinds of stimuli: static images of female faces and one control condition, colorful diodes. Similarly, an initial fMRI study found face preferences, but not selectivity, in a small sample of human infants (Deen et al., 2017). Consistent with evidence in humans, face-selective responses were observed in infant macaques only late in the first year of life (Livingstone et al., 2017), which is thought to correspond to approximately age 3 years in human development, and macaques that have never seen a face do not have face responses that are detectable with fMRI (Arcaro et al., 2017). Thus, initial PET and fMRI investigations in infants and children suggested that face selectivity in FFA might initially arise in toddlers or young children, only after substantial visual experience.
However, other fMRI and EEG data challenge that view. Two recent fMRI studies in awake infants observed face-selective responses in FFA, using different stimuli and task procedures (Kosakowski et al., 2022a; Yates et al., 2023). These results are consistent with EEG evidence of a distinctive response to faces, compared with that to other visual objects, in the brains of 4- to 6-month-old infants, with a source likely in VTC (De Heering and Rossion, 2015). Similarly, fMRI studies have found early origins for retinotopic organization of the visual cortex (Kourtzi et al., 2006; Ellis et al., 2021b) and for responses to motion in MT (Biagi et al., 2015, 2023). Compared with these other visual regions, FFA development might be less dependent on visual experience. Adults born with cataracts that were removed by age 2 have reduced motion-related responses in MT but have preserved responses in FFA (Guerreiro et al., 2022). Further, FFA responses are heritable (Polk et al., 2007; Chen et al., 2023) and are partially preserved in congenitally blind adults (Van Den Hurk et al., 2017; Ratan Murty et al., 2020). In sum, the current results fit with a growing body of evidence that face selectivity in FFA initially arises within a few months after birth and requires little visual experience.
Still, it is likely that many aspects of the FFA response change with both age and visual experience. The initial response to faces increases with age, even in infancy, and appears to expand to cover more of the fusiform gyrus during childhood (Golarai et al., 2007, 2010, 2015; Scherf et al., 2007; Peelen et al., 2009; Joseph et al., 2011; Natu et al., 2016; Nordt et al., 2021; Tian et al., 2021; Feng et al., 2022). Because of the spatial distortions, we cannot confidently estimate the size of the region we observed in infants. One important question for future research will be whether the size or selectivity of FFA in infants corresponds to their face recognition abilities.
STS
A region in STS showed a robust response to faces compared with all other visual categories, in both the younger and older infants. The face stimuli were videos of children's faces, including changing facial expressions and gaze. In adults, these videos are ideal to elicit responses in STS, which strongly prefers dynamic to static faces (Sato et al., 2004; Pitcher et al., 2011, 2014, 2019). These results converge with prior evidence of face-selective responses in the approximate location of STS in children (Pitcher et al., 2011; Walbrin et al., 2020) and, using functional near-infrared spectroscopy (fNIRS), in infants (Lloyd-Fox et al., 2009; Farroni et al., 2013; Powell et al., 2018).
Although the STS responds selectively to faces among visual categories, in adults and older children, the same region also responds to other kinds of social stimuli. For example, the face-selective region in STS responds more to point-light displays depicting two bodies interacting versus two bodies not interacting (Isik et al., 2017; Walbrin et al., 2020). The same region responds more to human speech than nonspeech sounds (Deen et al., 2015). In infants, parts of STS similarly respond more to speech compared with nonspeech sounds (Grossmann et al., 2010; Lloyd-Fox et al., 2014) and to point-light displays depicting biological versus non-biological motion (Lloyd-Fox et al., 2011; Lisboa et al., 2020a,b). However, it is not known whether these responses are colocated in the same regions of STS as the face-selective response.
The infants studied here, as young as 2 months, are among the youngest in whom STS responses to dynamic faces have been reported. A limitation of the current design is that we cannot test whether this same region already also responds to human voices or other social stimuli. It is an interesting developmental question whether responses to faces and voices are processed separately in early development and then gradually associated or whether these responses are already integrated within the first few months of life. Behaviorally, infants do seem to integrate face and voice information remarkably early. Young infants prefer to look at one face more than another based on nonvisual properties (e.g., speaker language, prosody, social behavior; Kelly et al., 2005; Kinzler et al., 2007; Turati et al., 2011). Even within the first 12 h after birth, infants prefer to look at their own mother's face, compared with a female stranger's face (Pascalis et al., 1995; Bushnell, 2001; Sai, 2005), identified by association with the mother's voice (Sai, 2005). In one series of studies, 3-h-old infants looked more at their mother's face than a female stranger's face, when they had experienced their mother's voice and face together in those 3 h, but not if their mother had been instructed to remain silent during those hours (Bushnell et al., 1989; Sai, 2005). Future research could investigate whether early developing multimodal responses in STS are related to infants’ behavioral preferences for faces associated with specific voices.
Laterality of face responses
The robust lateralization of face responses to the right hemisphere in adults (Pitcher et al., 2007; Yovel et al., 2008; Rangarajan et al., 2014; Jonas et al., 2018) was not evident in our infant data, as none of the four face-selective regions showed a significant difference between hemispheres in the profile of response across conditions. Although this result could indicate that lateralization arises later in development (Behrmann and Plaut, 2020; Rossion and Lochy, 2022; Kubota et al., 2024), it is also possible that we simply lacked the power to detect effects of hemisphere (e.g., compare the nonsignificant trend in infant VTC in Fig. 6b). In the future, it will be important to use high-quality data from well-powered studies to determine when in development the lateralization of face selectivity in the right hemisphere emerges.
MPFC
A region in MPFC showed face-selective responses in both younger and older infants that did not change with age in this sample. Finding selective functional responses in MPFC in infants as young as 2 to 5 months old is intriguing in light of the protracted anatomical development of this region (Brody et al., 1987; Kinney et al., 1988; Tau and Peterson, 2010; Dubois et al., 2016; Vasung et al., 2019; Bethlehem et al., 2022). Signatures of cortical maturation including expansion (Li et al., 2013), increased sulcal depth (Meng et al., 2014), and myelination (Hasegawa et al., 1992; Carmody et al., 2004; Miller et al., 2012) occur relatively later in MPFC than in other cortical regions. Indeed, new neurons are still migrating and being integrated into the prefrontal cortex well into the second year of life (Sanai et al., 2011), whereas this process is completed in primary sensory areas around the time of birth (Kostović et al., 2019).
In adults, a region in MPFC has been reported that responds more to images of faces than to other categories (Schwarz et al., 2019; Gu et al., 2023) and dynamic faces compared with dynamic objects similar to the current videos (Julian et al., 2012; Kosakowski et al., 2022b). Yet, the MPFC is not classically considered a face perception region (Haxby et al., 2000; Gobbini and Haxby, 2007) or a visual region, and its response to faces is modulated by the social content (LaBar, 2003; O’Doherty et al., 2003; Cheng et al., 2022). In addition, while OFA and FFA respond only to visually presented faces, face-selective regions in MPFC also respond to a variety of other social stimuli, including animations of social interactions and stories about people presented visually and aurally (Kosakowski et al., 2022b).
As in adults, studies using fNIRS have reported responses to dynamic faces in infant MPFC (Tzourio-Mazoyer et al., 2002; Grossmann et al., 2008; Krol and Grossmann, 2020; Porto et al., 2020; Farris et al., 2022). These responses are greater for socially relevant faces (e.g., a parent, direct gaze, using infant-directed speech) than faces with less social relevance (e.g., a diagram image, an averted gaze; Grossmann et al., 2008; Naoi et al., 2012; Imafuku et al., 2014; Lloyd-Fox et al., 2015; Urakawa et al., 2015; Xu et al., 2017; Uchida-Ota et al., 2019; Krol and Grossmann, 2020). One intriguing possibility is that infants’ MPFC, like adult MPFC, is engaged in processing the social and emotional meaning associated with faces. However, the current evidence cannot exclude the possibility that initially infants’ MPFC contains purely visual representations of faces and only later in infancy is used to ascribe social and emotional meaning to those faces.
Our results are broadly consistent with other recent evidence of functional responses in infants’ prefrontal cortex. For example, a region in the lateral prefrontal cortex in infants responds more to sequences with statistical regularity compared with unstructured input (Gervain et al., 2008; Werchan et al., 2016; Ellis et al., 2021a). Another area in the infant prefrontal cortex responds more to native language compared with foreign language or other nonspeech sounds (Dehaene-Lambertz et al., 2010; Minagawa-Kawai et al., 2011; Vouloumanos et al., 2010; May et al., 2011; Altvater-Mackensen and Grossmann, 2018). Thus, despite structural immaturity, the prefrontal cortex appears to be functionally active in infancy and may play a key role in infant cognitive development.
Limitations and future directions
The current results provide an upper bound on the age at which face selectivity first arises in each of the regions considered, but do not directly establish that age. Particularly in FFA, STS, and MPFC, face-selective responses are already present in the youngest group of infants, aged 2–5 months. We cannot resolve the time of first face-selective responses more specifically than this 3-month window, because we have limited data (average 16 min) from each infant. As a result, we cannot confidently estimate the selectivity of a region in a single infant. Moreover, we have no measurements in the first 2 months of infants’ lives, and so we cannot determine how much earlier face-selective responses first arise. These limitations leave open the intriguing possibility that face selectivity may not arise simultaneously across these cortical regions but, instead, arises in a sequence. A strong test of this hypothesis would ideally require substantially more data per infant, collected in a dense longitudinal sample, so that the age of first face-selective responses in each region could be confidently identified.
In order to increase the number of younger infants included in this sample, we combined data across two coils. The lower-resolution and more distorted images collected with Coil 2011 meant that we had to use very large parcels to identify voxels putatively near OFA and FFA in particular. Also, we did not have high-resolution anatomical images for most infants. As a result, the location of the fROIs reported here is approximate. Techniques for acquiring functional data from awake infants are improving rapidly (Cusack et al., 2018; Ellis et al., 2020; Yates et al., 2021), so the current results can be replicated in the future with greater confidence in the spatial origins of the measured signals.
Finally, although each of the regions tested showed a significantly greater response to faces than the other visual categories, the actual magnitude of the face responses was very small compared with those previously measured in children and adults. There are many possible explanations of the change in hemodynamic response magnitude over development, including both changes in neural firing rates and synchrony (Uhlhaas et al., 2010; Kiorpes, 2015) and changes in vasculature and neurovascular coupling (Colonnese et al., 2008). The current study cannot differentiate between these explanations. However, the change in magnitude we observed between age 2 and 9 months was gradual and moderate, consistent with other evidence that the magnitude of hemodynamic responses changes slowly and gradually throughout infancy and childhood (Cohen Kadosh et al., 2011, 2013; Arichi et al., 2012; Cusack et al., 2015; De Oliveira et al., 2017).
Summary
In sum, using fMRI data from a large sample of awake infants, we measured face-selective responses in multiple regions of infants’ brains. Robust face selectivity was present in the approximate location of FFA, STS, and MPFC as early as we could measure. Putative OFA also had face-selective responses in older infants but not yet in younger infants. Despite undergoing rapid anatomical change in the first postnatal year, the infant cortex already has structured responses to meaningful, self-relevant stimuli such as faces. These results importantly constrain theories of cortical development and the origins of face selectivity.
Footnotes
The authors declare no competing financial interests.
This research was carried out at the Athinoula A. Martinos Imaging Center at the McGovern Institute for Brain Research at Massachusetts Institute of Technology (MIT). The authors thank Nayanika Das and Somaia Saba for their help with registrations; Steven Shannon, Atsushi Takahashi, and Boris Keil for their technical support; members of Saxe Lab and members of the Kanwisher Lab for their help during recruitment and data collection; the Cambridge Writing Group and members of the Saxe Lab and Kanwisher Lab for their helpful comments on various versions of the manuscript; Michelle Hung and Kirsten Lydic for code review; Sofia Riskin for data reconciliation; Hannah LeBlanc for all the things; and all the infants and their families. We gratefully acknowledge support of this project by the National Science Foundation (graduate fellowship to H.L.K.; Collaborative Research Award No. 1829470 to M.A.C.), National Institutes of Health (No. 1F99NS124175 to H.L.K.; No. 8K00DA058542-02 to H.L.K.; No. R21-HD090346-02 to R.S.; No. DP1HD091947 to N.K.; shared instrumentation Grant S10OD021569 for the MRI scanner), Templeton World Charity Foundation (No. 2022-30268 and No. 2022-30269 to M.A.C.), Canadian Institute for Advanced Research Azrieli Global Scholars Fellowship (to M.A.C.), the McGovern Institute for Brain Research at MIT, and the Center for Brains, Minds and Machines (CBMM), funded by a National Science Foundation Science and Technology Center award (CCF-1231216).
- Received March 18, 2024.
- Revision received May 28, 2024.
- Accepted June 4, 2024.
- Copyright © 2024 Kosakowski et al.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.