Research Article: New Research, Cognition and Behavior

Gaze and Arrows: Does the Gaze-Following Patch in the Posterior Temporal Cortex Differentiate Social and Symbolic Spatial Cues?

Marius Görner, Hamidreza Ramezanpour, Peter Dicke and Peter Thier
eNeuro 3 July 2024, 11 (7) ENEURO.0065-24.2024; https://doi.org/10.1523/ENEURO.0065-24.2024
Marius Görner
1Cognitive Neurology Laboratory, Hertie Institute for Clinical Brain Research, 72076 Tübingen, Germany
2GTC of Neuroscience, 72076 Tübingen, Germany
3IMPRS for Cognitive and Systems Neuroscience, 72076 Tübingen, Germany
Hamidreza Ramezanpour
4Centre for Vision Research, York University, Toronto, Ontario M3J 1P3, Canada
Peter Dicke
1Cognitive Neurology Laboratory, Hertie Institute for Clinical Brain Research, 72076 Tübingen, Germany
Peter Thier
1Cognitive Neurology Laboratory, Hertie Institute for Clinical Brain Research, 72076 Tübingen, Germany
5Werner Reichardt Centre for Integrative Neuroscience, 72076 Tübingen, Germany

Abstract

The gaze-following patch (GFP) is located in the posterior temporal cortex and has been described as a cortical module dedicated to processing other people's gaze-direction in a domain-specific manner. Thus, it appears to be the neural correlate of Baron-Cohen's eye direction detector (EDD), which is one of the core modules in his mindreading system—a neurocognitive model for the theory of mind concept. Inspired by Jerry Fodor's ideas on the modularity of the mind, Baron-Cohen proposed that, among other things, the individual modules are domain specific. In the case of the EDD, this means that it exclusively processes eye-like stimuli to extract gaze-direction and that other stimuli, which may carry directional information as well, are processed elsewhere. If the GFP is indeed the EDD's neural correlate, it must meet this expectation. To test this, in the present human fMRI study we compared the GFP's BOLD activity during gaze-direction following with the activity during arrow-direction following. Contrary to the expectation based on the assumption of domain specificity, we did not find a differentiation between gaze- and arrow-direction following. In fact, we were not able to reproduce the GFP as described in previous studies. A possible explanation is that in the present study—unlike the previous work—the gaze stimuli did not contain an obvious change of direction that constituted visual motion. Hence, the critical stimulus component responsible for the identification of the GFP in the previous experiments might have been visual motion.

  • fMRI
  • joint attention
  • social cognition
  • spatial cueing

Significance Statement

This study presents evidence against the notion of domain specificity of an area in the posterior temporal cortex [the gaze-following patch (GFP)] previously described as specifically serving eye gaze following. This conclusion is suggested by the finding that using arrows to identify a target object among distractors is accompanied by a comparable or even larger BOLD response than when the participants are asked to use the gaze-direction of a demonstrator’s face for target selection. The fact that even the best candidate to date, the posterior temporal GFP, does not stand up to critical scrutiny casts doubt on the assumption that the brain uses a specific module to enable gaze following, as proposed by Simon Baron-Cohen.

Introduction

The gaze-following patch (GFP) is a circumscribed region in the posterior part of the temporal cortex which was discovered in healthy human subjects who participated in fMRI experiments in which the task was to use the gaze-direction of a demonstrator to identify a target object among distractors (Materna et al., 2008a,b; Laube et al., 2011; Marquardt et al., 2017; Kraemer et al., 2020). In contrast with the respective control condition—iris-color mapping, in which the observer had to shift gaze to an object whose color corresponded to the color of the demonstrator's iris—the gaze-following condition yielded a significantly larger BOLD response within the GFP. This preference for gaze-direction suggests that the GFP might be the neural realization of Baron-Cohen's eye direction detector (EDD; Baron-Cohen, 1994). As an integral component of his mind-reading model, Baron-Cohen proposed the EDD to be domain specific (Fodor, 1983). This implies that it exclusively processes eye-like stimuli and forwards information on eye-direction to downstream modules to form a theory of mind (ToM), a concept that captures the assignment of desires, beliefs, and intentions to another person. Electrophysiological studies in nonhuman primates (NHPs) that investigated the response preferences of individual neurons in a presumably homologous brain area in the superior temporal sulcus (STS) seemed to be in line with the assumed domain specificity of the GFP, in accordance with the central tenet of the Baron-Cohen concept (Ramezanpour and Thier, 2020).

In his work, Baron-Cohen suggested that, in conjunction with shape and contrast patterns, the visual motion signal inevitably yoked with the view of an eye movement plays a crucial role in the detection of eye-gaze stimuli. Hence, under the assumption that the GFP indeed corresponds to Baron-Cohen's EDD, its location in a brain region known for its role in visual motion processing appears plausible. However, one may wonder to what extent the GFP is indeed selective for motion of the eyes, a selectivity that must be met in order to satisfy the assumption of domain specificity. In fact, a critical examination of this question is still lacking. This is why we embarked on the current fMRI study, in which we compared the BOLD activity patterns resulting from contrasting gaze-direction following and iris-color mapping with an analogous contrast: arrow-direction following versus arrow-color mapping. We predicted that the GFP should remain silent in the arrow contrast if it met the premise of domain specificity.

In this study, we tested 20 healthy human participants using the same stimuli as in Marquardt et al. (2017) with an important modification: while individual trials in the original version of the study always started with the demonstrator's overt gaze directed toward the participant, followed by a second frame depicting the demonstrator looking toward the target object, in the current study the initial part was replaced by a blank screen. As a consequence, no overt gaze shift is seen, because the blank screen is directly followed by the demonstrator's gaze directed toward the target (Fig. 1). Hence, the spatiotemporal discontinuity (apparent motion) of the two views of the eyes in the original version, which created the impression of a saccadic gaze shift, was absent. This modification was necessary to allow fully analogous sequences in the gaze and arrow conditions, since in the two-dimensional views of arrows there is no orientation equivalent to the gaze being directed toward the participant that is still recognizable as an arrow. The presence of apparent motion (Wertheimer, 1912; Ramachandran and Anstis, 1986) in the original version of the paradigm had an important consequence: by contrasting the gaze-following condition with the respective control condition—e.g., iris-color mapping—we implicitly contrasted a stimulus component that comprised a motion event (the gaze shift) with a component that comprised a color change requiring the motion event to be ignored. Or, to put it differently, in the gaze-following condition, the behaviorally relevant stimulus component, i.e., the gaze-direction, was intrinsically linked to visual motion, while in the iris-color condition, it was not. It has been shown that both electrophysiological and BOLD signatures of visual motion in parts of the posterior STS are boosted whenever motion patterns are behaviorally relevant (Stemmann and Freiwald, 2016, 2019). This raises the question of whether the human GFP as described by Marquardt et al. and other studies, as well as the monkey's homolog, may be the result of this implicit contrast between behaviorally relevant motion and a control condition lacking behaviorally relevant motion cues. Therefore, in the current study, the first question was whether we would be able to reproduce the GFP despite the absence of any motion information provided by the stimulus. Second, we compared the activity patterns emerging from the contrast gaze-direction > iris-color with those resulting from the contrast arrow-direction > arrow-color as well as with the location of the GFP as reported by Marquardt et al. (2017). We expected that if the GFP is domain specific and does not depend on visual motion, we would not find overlapping activation between the two contrasts at the expected location of the GFP at a given statistical threshold. Moreover, for each condition, we estimated the hemodynamic response functions (HRFs) based on the GFP region of interest (ROI) reported by Marquardt et al. and based on a ROI stemming from a search for visual motion in the Neurosynth (Yarkoni, n.d.) database.

Figure 1.

Illustration of the paradigm. The stimulus used in the experiment was a photograph of a real person, replaced here by a drawing for privacy reasons. The left and right columns display the sequence of events during a gaze-following/iris-color mapping and an arrow-direction/arrow-color mapping trial, respectively. Each trial started with a blank fixation period followed by the presentation of the cue. In each individual trial, the demonstrator’s face and the arrows looked/pointed toward a randomly selected target, and the color of the iris and of the arrows, chosen independently of each other, matched one of the targets. After cue presentation, the participants had to withhold their response until the central fixation dot disappeared.

Materials and Methods

Participants

Twenty-three healthy participants were recruited, three of whom withdrew from the experiment prematurely. Of the remaining 20 participants, 12 were female and 8 male; their average age was 31.5 years (SD 3.6 years). Each participant completed two sessions. The first was conducted outside the scanner to familiarize the participant with the task. Participants gave written consent to the procedures of the experiment. The study was approved by the local Ethics Review Board and was conducted in accordance with the principles of human research ethics of the Declaration of Helsinki.

Paradigm and setup

The paradigm consisted of two stimulus types with two conditions each. One stimulus type was a photo of a face, the other two arrows, displayed on a computer monitor. Below the stimuli, five differently colored squares were shown, which served as gaze targets. Each trial started with a black screen with a red fixation dot in the middle, shown for 4 s. After this, either the face or the arrows were displayed, still with the red fixation dot in the center of the monitor. The face looked in the direction of one of the squares and the iris-color matched the color of one of the squares; likewise, the arrows pointed toward one of the squares and their color matched one of the squares as well. After 2 more seconds, the fixation dot disappeared, which was the go-cue telling the participants to shift their own visual focus onto the target square defined by the current experimental condition. Conditions were organized in blocks such that the participants always had to identify the target square by the same stimulus feature for 20 consecutive trials. Which stimulus feature mattered was announced by a written instruction before each block. Four blocks made up a run, within which each condition occurred in randomized order. In total, four runs had to be completed, such that each condition was represented by 80 trials.
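For illustration, the block and run structure described above can be written down as a short Python sketch; the condition labels are placeholders of our own, not identifiers from the stimulus software:

```python
import random

# Placeholder labels for the four conditions (two stimulus types x two tasks).
CONDITIONS = ["gaze_direction", "iris_color", "arrow_direction", "arrow_color"]
TRIALS_PER_BLOCK = 20
N_RUNS = 4

# Each run contains one block of 20 trials per condition, in randomized order.
runs = [random.sample(CONDITIONS, k=len(CONDITIONS)) for _ in range(N_RUNS)]

# Sanity check: 4 runs x 1 block x 20 trials = 80 trials per condition.
trials_per_condition = {
    c: sum(TRIALS_PER_BLOCK for run in runs for block in run if block == c)
    for c in CONDITIONS
}
assert all(n == 80 for n in trials_per_condition.values())
```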

During the experiment, participants lay in the scanner and viewed the stimulus monitor via a mirror system. The distance between the participants’ eyes and the monitor was ∼190 cm, and the monitor covered ∼20° of the field of view horizontally and ∼12° vertically. Additionally, eye-tracking data were recorded during the fMRI experiment. However, as the data were of poor quality, they could not be analyzed consistently. In a separate session, participants were familiarized with the task outside of the scanner.

Data collection

MR images were acquired in a 3 T scanner (Siemens Magnetom Prisma) with a 20-channel phased-array head coil. The subjects' heads were stabilized inside the head coil with plastic foam cushions to avoid head movements. An AutoAlign sequence was used to standardize the alignment of images across sessions and subjects. A high-resolution T1-weighted (T1w) anatomic scan (MP-RAGE, 176 × 256 × 256 voxels, voxel size 1 × 1 × 1 mm) and local field maps were acquired. Functional scans were conducted using a T2*-weighted echo-planar multiband 2D sequence (multiband factor = 2; TE = 35 ms; TR = 1,500 ms; flip angle = 70°) covering the whole brain (44 × 64 × 64 voxels, voxel size 3 × 3 × 3 mm, interleaved slice acquisition, no gap).
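For reference, the acquisition parameters above can be gathered into a plain configuration dictionary; the key names are ours and purely illustrative, not a scanner API:

```python
# Acquisition parameters from the Methods, collected for downstream scripts.
ACQUISITION = {
    "scanner": "Siemens Magnetom Prisma (3 T), 20-channel phased-array head coil",
    "anatomical": {
        "sequence": "MP-RAGE (T1w)",
        "matrix": (176, 256, 256),
        "voxel_mm": (1, 1, 1),
    },
    "functional": {
        "sequence": "T2*-weighted 2D EPI",
        "multiband_factor": 2,
        "te_ms": 35.0,
        "tr_ms": 1500.0,
        "flip_angle_deg": 70,
        "matrix": (44, 64, 64),
        "voxel_mm": (3, 3, 3),
        "slices": "interleaved, no gap",
    },
}
```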

Preprocessing

Results included in this manuscript were preprocessed using fMRIPrep 1.5.2 (Esteban et al., 2019, 2022).

Copyright waiver

The below boilerplate text was automatically generated by fMRIPrep with the express intention that users should copy and paste this text into their manuscripts unchanged. It is released under the CC0 license.

Anatomical data preprocessing

The T1w image was corrected for intensity nonuniformity with N4BiasFieldCorrection (Tustison et al., 2010), distributed with ANTs 2.2.0 (Avants et al., 2008), and used as T1w reference throughout the workflow. The T1w reference was then skull stripped with a Nipype implementation of the antsBrainExtraction.sh workflow (from ANTs), using OASIS30ANTs as the target template. Brain tissue segmentation of the cerebrospinal fluid (CSF), white matter (WM), and gray matter (GM) was performed on the brain-extracted T1w using fast (Zhang et al., 2001; FSL 5.0.9). Brain surfaces were reconstructed using recon-all (Dale et al., 1999; FreeSurfer 6.0.1), and the brain mask estimated previously was refined with a custom variation of the method to reconcile ANTs-derived and FreeSurfer-derived segmentations of the cortical gray matter of Mindboggle (Klein et al., 2017). Volume-based spatial normalization to one standard space (MNI152NLin2009cAsym) was performed through nonlinear registration with antsRegistration (ANTs 2.2.0), using brain-extracted versions of both the T1w reference and the T1w template. The following template was selected for spatial normalization: ICBM 152 Nonlinear Asymmetrical template version 2009c (Fonov et al., 2009; TemplateFlow ID: MNI152NLin2009cAsym).

Functional data preprocessing

For each of the four BOLD runs found per subject (across all tasks and sessions), the following preprocessing was performed. First, a reference volume and its skull-stripped version were generated using a custom methodology of fMRIPrep. A deformation field to correct for susceptibility distortions was estimated based on a field map that was coregistered to the BOLD reference, using a custom workflow of fMRIPrep derived from D. Greve's epidewarp.fsl script and further improvements of HCP Pipelines (Glasser et al., 2013). Based on the estimated susceptibility distortion, an unwarped BOLD reference was calculated for a more accurate coregistration with the anatomical reference. The BOLD reference was then coregistered to the T1w reference using bbregister (FreeSurfer) which implements boundary-based registration (Greve and Fischl, 2009). Coregistration was configured with six degrees of freedom. Head motion parameters with respect to the BOLD reference (transformation matrices and six corresponding rotation and translation parameters) are estimated before any spatiotemporal filtering using mcflirt (Jenkinson et al., 2002; FSL 5.0.9). BOLD runs were slice-time corrected using 3dTshift from AFNI 20160207 (Cox and Hyde, 1997). The BOLD time series were resampled to surfaces on the following spaces: fsaverage5. The BOLD time series (including slice-timing correction when applied) were resampled onto their original, native space by applying a single, composite transform to correct for head motion and susceptibility distortions. These resampled BOLD time series will be referred to as preprocessed BOLD in original space, or just preprocessed BOLD. The BOLD time series were resampled into standard space, generating a preprocessed BOLD run in [“MNI152NLin2009cAsym”] space. Several confounding time series were calculated based on the preprocessed BOLD: framewise displacement (FD), derivative of root mean square variance over voxel (DVARS), and three region-wise global signals. FD and DVARS are calculated for each functional run, both using their implementations in Nipype [following the definitions by Power et al. (2014)]. The three global signals are extracted within the CSF, the WM, and the whole-brain masks. Additionally, a set of physiological regressors were extracted to allow for component-based noise correction (CompCor; Behzadi et al., 2007). Principal components are estimated after high-pass filtering the preprocessed BOLD time series (using a discrete cosine filter with 128 s cutoff) for the two CompCor variants: temporal (tCompCor) and anatomical (aCompCor). tCompCor components are then calculated from the top 5% variable voxels within a mask covering the subcortical regions. This subcortical mask is obtained by heavily eroding the brain mask, which ensures it does not include cortical GM regions. For aCompCor, components are calculated within the intersection of the aforementioned mask and the union of CSF and WM masks calculated in the T1w space, after their projection to the native space of each functional run (using the inverse BOLD-to-T1w transformation). Components are also calculated separately within the WM and CSF masks. For each CompCor decomposition, the k components with the largest singular values are retained, such that the retained components’ time series are sufficient to explain 50% of variance across the nuisance mask (CSF, WM, combined, or temporal).
The remaining components are dropped from consideration. The head motion estimates calculated in the correction step were also placed within the corresponding confound file. The confound time series derived from head motion estimates and global signals were expanded with the inclusion of temporal derivatives and quadratic terms for each (Satterthwaite et al., 2013). Frames that exceeded a threshold of 0.5 mm FD or 1.5 standardized DVARS were annotated as motion outliers. All resamplings can be performed with a single interpolation step by composing all the pertinent transformations (i.e., head motion transform matrices, susceptibility distortion correction when available, and coregistrations to anatomical and output spaces). Gridded (volumetric) resamplings were performed using antsApplyTransforms (ANTs), configured with Lanczos interpolation to minimize the smoothing effects of other kernels (Lanczos, 1964). Nongridded (surface) resamplings were performed using mri_vol2surf (FreeSurfer).
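Beyond the boilerplate above, the motion outlier criterion can be illustrated with a short sketch that flags frames in an fMRIPrep confounds table; the file name follows fMRIPrep 1.5 naming conventions but is hypothetical:

```python
import pandas as pd

# Load the confounds table written by fMRIPrep (hypothetical file name).
confounds = pd.read_csv(
    "sub-01_task-gaze_run-1_desc-confounds_regressors.tsv", sep="\t"
)

# Flag frames exceeding 0.5 mm framewise displacement or 1.5 standardized DVARS,
# mirroring the outlier annotation described above.
outliers = (confounds["framewise_displacement"] > 0.5) | (confounds["std_dvars"] > 1.5)
print(f"{outliers.sum()} of {len(confounds)} frames flagged as motion outliers")
```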

Many internal operations of fMRIPrep use Nilearn 0.5.2 (Abraham et al., 2014), mostly within the functional processing workflow. For more details of the pipeline, see the section corresponding to workflows in fMRIPrep’s documentation.

Analysis

Contrasts

For each participant, we computed a general linear model (GLM) across all runs (first level) to obtain the respective β-images for each condition using the Python package Nilearn (Nilearn, n.d.). For modeling, we aligned the onsets of each trial to the onsets of the spatial/color cue (4 s after the actual trial onset, which started with the blank screen) and set the stimulus duration to 0, i.e., we modeled the relevant stimulus component as a single event since it does not change over time. To mitigate the effects of motion artifacts and other noise sources, the nuisance regressors global_signal, csf, white_matter, trans_x, trans_y, trans_z, rot_x, rot_y, and rot_z and their respective first derivatives estimated by fMRIPrep were included in the design matrices. As the model for the HRF, we used the glover + derivative + dispersion model provided by Nilearn and included a polynomial drift model of order 3 to remove slow drifts in the data. Further, we masked the data with the average of the run-wise mask images provided by fMRIPrep and applied smoothing with fwhm = 8 mm.
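A minimal Nilearn sketch of this first-level model, assuming a recent Nilearn release in which the GLM tools live under nilearn.glm; the events table holds made-up onsets, and the fit call is commented out because it requires the actual imaging data:

```python
import pandas as pd
from nilearn.glm.first_level import FirstLevelModel

# Events aligned to the spatial/color cue (4 s after trial onset), duration 0;
# onsets and trial types here are invented for illustration.
events = pd.DataFrame({
    "onset": [4.0, 14.0, 24.0],
    "duration": [0.0, 0.0, 0.0],
    "trial_type": ["gaze_direction", "gaze_direction", "iris_color"],
})

glm = FirstLevelModel(
    t_r=1.5,                                       # TR of the EPI sequence
    hrf_model="glover + derivative + dispersion",  # HRF model named above
    drift_model="polynomial",
    drift_order=3,                                 # removes slow drifts
    smoothing_fwhm=8.0,                            # smoothing with fwhm = 8 mm
)
# bold_imgs and confounds would be the preprocessed runs and the nuisance
# regressors (global_signal, csf, white_matter, motion + first derivatives):
# glm = glm.fit(bold_imgs, events=events, confounds=confounds)
```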

For each participant, the resulting first-level β-images were used to compute the contrasts gaze-direction minus iris-color and arrow-direction minus arrow-color. The resulting effect-size images (Nilearn terminology) of each contrast were fed into second-level analyses that were fitted as one-sample t tests. The resulting second-level contrasts were thresholded at p < 0.001 [false positive rate (fpr)] and p < 0.01 (fpr). All coordinates reported in this manuscript refer to MNI space. For each of the reported coordinates, a Neurosynth search was conducted, and we report the first five associations.
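A corresponding sketch of the second-level one-sample t test with uncorrected fpr thresholding; the effect-map file names are hypothetical placeholders, so the fit and contrast steps are commented out:

```python
import pandas as pd
from nilearn.glm import threshold_stats_img
from nilearn.glm.second_level import SecondLevelModel

# One first-level effect-size image per participant for a given contrast
# (hypothetical file names).
effect_maps = [f"sub-{i:02d}_gaze-iris_effect.nii.gz" for i in range(1, 21)]

# A column of ones turns the second-level GLM into a one-sample t test.
design = pd.DataFrame({"intercept": [1] * len(effect_maps)})

model = SecondLevelModel()
# These steps require the actual images on disk:
# model = model.fit(effect_maps, design_matrix=design)
# z_map = model.compute_contrast(output_type="z_score")
# thr_map, thr = threshold_stats_img(z_map, alpha=0.001, height_control="fpr")
```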

ROI definition and HRF estimation

To define the GFP ROI, we extracted the activity patches of both hemispheres as reported by Marquardt et al. (2017) and converted them into mask images. To define a ROI corresponding to brain areas that process visual motion, we searched the Neurosynth database (term-based meta-analysis) using the term visual motion. After downloading the result (association test), we extracted the largest components in both hemispheres spanning the posterior temporal cortex and converted them into mask images. The left GFP consists of 76 voxels and the right of 61 voxels; the visual motion ROIs consist of 377 (left) and 306 (right) voxels. All ROIs are plotted along with the contrasts described above in Figure 2.
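A sketch of how such masks can be derived from a downloaded Neurosynth map by keeping the largest connected components. For simplicity, the sketch keeps the two largest components overall rather than explicitly selecting one per hemisphere, and a random volume stands in for the downloaded map so that it runs:

```python
import nibabel as nib
import numpy as np
from scipy import ndimage

# In practice, img would be the Neurosynth association-test map for the term
# "visual motion"; here a random volume stands in for illustration.
rng = np.random.default_rng(0)
img = nib.Nifti1Image((rng.random((40, 48, 40)) > 0.9).astype(np.float32), np.eye(4))

binary = np.asarray(img.dataobj) > 0
labels, n_components = ndimage.label(binary)            # connected components
sizes = ndimage.sum(binary, labels, index=range(1, n_components + 1))
keep = np.argsort(sizes)[::-1][:2] + 1                  # two largest components
mask = np.isin(labels, keep).astype(np.uint8)           # binary ROI mask

nib.save(nib.Nifti1Image(mask, img.affine), "visual_motion_roi.nii.gz")
```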

Figure 2.

Both panels a and b show the same contrasts/ROIs at two threshold levels. The visual motion ROI (Yarkoni, n.d.) is encircled in white, and the mGFP (Marquardt et al., 2017) in pink. Blue-encircled areas belong to the arrow contrast and red-encircled areas to the gaze contrast at the respective statistical thresholds.

For each participant and ROI, we estimated the hemodynamic response of each condition. To do so, the BOLD signals of each ROI (averaged across voxels) were extracted. Prior to signal extraction, the BOLD images were denoised using the same nuisance regressors as used to fit the GLMs described above. To recover the condition-specific hemodynamic responses, we deconvolved the signals using the nideconv package (Hollander and Knapen, 2017). We applied the Fourier basis set with nine regressors over a period of 19.5 s starting from spatial/color cue onset, again omitting the blank screen period at the beginning of each trial. Individual models were fitted using the standard settings of nideconv's fitting method. The first-level response estimates were then fed into the group-level model using nideconv's GroupResponseFitter functionality with the same settings as for the first level. As a result, we obtained the estimated HRFs together with the 95% credible intervals (CIs) of the estimates. Periods in which the CIs do not include an activation level of 0 (au) are considered statistically different from 0 at the 5% level.
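A condensed sketch of the first-level deconvolution with nideconv, following the package's documented interface as we understand it; synthetic data stand in for the denoised ROI time series and the per-condition cue onsets:

```python
import numpy as np
import pandas as pd
from nideconv import ResponseFitter

# Synthetic stand-ins: a noisy BOLD time series sampled at TR = 1.5 s and
# invented cue onsets for two of the four conditions.
tr = 1.5
roi_signal = pd.Series(np.random.randn(400), index=np.arange(400) * tr)
condition_onsets = {
    "gaze_direction": np.array([4.0, 64.0, 124.0]),
    "iris_color": np.array([34.0, 94.0, 154.0]),
}

rf = ResponseFitter(input_signal=roi_signal, sample_rate=1 / tr)
for condition, onsets in condition_onsets.items():
    # Fourier basis set, nine regressors, 19.5 s window from cue onset, as above.
    rf.add_event(condition, onsets=onsets, basis_set="fourier",
                 n_regressors=9, interval=[0, 19.5])
rf.fit()
hrf_estimates = rf.get_timecourses()  # deconvolved response per condition
```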

Resources

Analysis code is available at https://github.com/maalaria/fMRIus/tree/gaze-arrows. The dataset is available at https://doi.org/10.18112/openneuro.ds005203.v1.0.0.

Results

Behavior

Because of insufficient quality of the eye tracking, we were only able to analyze the oculomotor behavior in ∼38% of all trials. The performance (∼80–90% correct responses) in this fraction of the data matched the performance observed in previous experiments (∼80%) using the same stimuli in all experimental conditions (Marquardt et al., 2017). Further, the eye-tracking camera enabled us to monitor participants’ behavior during the experiment, giving us confidence that participants generally complied with the task requirements. Before the fMRI session, participants underwent a practice session in which their behavior was closely monitored and, where necessary, corrected. This, together with the fact that the task was easy and intuitive, makes us confident that the number of error trials was indeed small and, thus, can be ignored as in previous experiments (Marquardt et al., 2017).

Contrasts

At a threshold of p < 0.001 (uncorrected; Fig. 2, top), contrasting gaze-direction following with iris-color mapping (regions encircled in red in Fig. 2) did not yield activity overlapping with the GFP [as described by Marquardt et al. (2017); from now on, we refer to the GFP ROI as defined by Marquardt et al. (2017) as mGFP—the region encircled in pink in Fig. 2]. Contrasting arrow-direction following with arrow-color mapping (regions encircled in blue in Fig. 2) yielded activity in the left hemisphere that overlaps with the mGFP. The coordinates of the local maximum of this patch are (x, y, z) = (−54, −72, 6). The first five Neurosynth (Yarkoni, n.d.) associations for this location are motion, videos, v5, mt, and visual.

At this threshold, the gaze-iris contrast yielded activity dorsal to the mGFP, beginning to emerge around z = 8 and extending toward z = 18 (encircled in red in Fig. 2). The local maximum of this patch is located at (x, y, z) = (−51, −63, 15). The first five associations for this location are action observation, intentions, mentalizing, social, and temporoparietal.

The mGFP as well as the activity related to the arrow contrast fall within the visual motion ROI (regions encircled in white in Fig. 2) given by the Neurosynth database. Activity related to the gaze-iris contrast lies slightly anterior to this ROI.

Liberalizing the threshold to p < 0.01 (uncorrected; Fig. 2, bottom) additionally gave rise to patches close to or overlapping with the mGFP for the gaze-iris contrast in both hemispheres [peak coordinates: (x, y, z) = (−60, −60, −3) and (x, y, z) = (57, −60, −6)]. Neurosynth associations with these two locations are word form, judgment task, interactive, semantics, and timing (left) and visual, unfamiliar, objects, multisensory, and visually (right). Further, a patch dorsal to the mGFP emerged in the right hemisphere at this threshold for this contrast. This patch had one ventral peak at (48, −54, 0) and a dorsal peak at (48, −58, 12) which closely matched the dorsal activity in the left hemisphere at (−51, −63, 15) already visible at the more conservative threshold of p < 0.001. The Neurosynth associations for (48, −54, 0) are unfamiliar, motion, social interactions, gestures, and preparatory and for (48, −58, 12) motion, video clips, gaze, and action observation.

At the more liberal threshold, the activity patch at (−54, −72, 6) associated with arrow-direction following was more extended and matched the mGFP, especially in the left hemisphere. In the right hemisphere, a small activity patch emerged at this threshold at (51, −63, −3) falling into the area of the mGFP as well. The Neurosynth associations for (51, −63, −3) are visual, motion, objects, movements, and action observation.

Table 1 lists the peak coordinates and the Neurosynth associations of the activations found in this study, together with the GFP locations reported in two preceding studies on the GFP.

Table 1.

List of locations of the GFP reported in earlier studies as well as locations in its vicinity found in this study

HRF estimates

Figure 3 shows the estimated HRFs and the 95% CIs for each condition for the mGFP ROI and the ROI based on a search for visual motion in the Neurosynth database. The HRFs for the mGFP show a statistically significant (95% CIs do not include the activation level of 0) deflection ∼5 s after cue onset in both hemispheres in all conditions but the iris-color condition. This matches the temporal progression of the canonical HRF, which describes the expected BOLD signal in response to a given impulse stimulus (Lindquist et al., 2009). Though the difference is not statistically significant at the 5% level, the estimated activation in the arrow-direction condition is nearly twice as large as in the gaze-direction following condition in the left hemisphere. This difference is not present in the right hemisphere.

Figure 3.

Panels a and b show the HRF estimates for the four conditions in the left- and right-hemispheric mGFP and visual motion ROI, respectively. Shaded areas represent the 95% CIs. Especially in the left mGFP, the dominance of activity related to the arrow-direction following condition is obvious.

In the visual motion ROI, the activation level is overall smaller than in the mGFP but still significant at the 5% level (Fig. 3, insets, bottom) for all conditions but the iris-color condition in both hemispheres at ∼5 s.

Discussion

In this study, we asked if the GFP—a brain area described in several studies as a domain-specific module dedicated to the processing of others’ gaze-direction (Materna et al., 2008a,b; Laube et al., 2011; Marquardt et al., 2017; Kraemer et al., 2020; Ramezanpour and Thier, 2020)—is indeed domain specific or if it is also active when the participants use arrows instead of the gaze of a demonstrator to identify a target object. We used the same portraits as stimuli that were used in the previous experiments upon which the GFP was originally defined but had to introduce a modification to the temporal structure of the task to allow a direct comparison between the gaze and the arrow condition. In the original version of the task that was used in human studies (Materna et al., 2008a; Marquardt et al., 2017; Kraemer et al., 2020) as well as in studies with NHPs (Kamphuis et al., 2009; Ramezanpour and Thier, 2020), each trial started with a frame displaying the demonstrator looking straight ahead toward the participant, followed by a second frame displaying the demonstrator looking toward the target object. Whenever two consecutive frames feature a coherent shift of pixels that does not break the correspondence of the depicted scene in the two frames, the visual system interprets the sequence as motion (Wertheimer, 1912; Ramachandran and Anstis, 1986). In the current study, the initial frame was replaced by a black screen which was directly followed by the frame displaying the demonstrator looking toward the target object. In light of this modification, the first question here was whether we could replicate the GFP despite the absence of apparent motion in the form of the gaze shift. We have to answer this question in the negative. Even though the sample size and the MRI scanner used in this study were the same as in previous studies, at the same statistical threshold used by Marquardt et al. (2017) and even at more liberal thresholds, we could not detect any activation for the gaze-direction minus iris-color contrast that coincided with the GFP. However, lacking a positive control for the role of apparent motion, we cannot confirm that its absence is indeed what caused the failure to reproduce the GFP. We propose, however, that this gap is compensated for by the second analysis, involving the estimation of HRFs, which demonstrated a significant activation in the mGFP ROI in all conditions but the iris-color mapping condition [Fig. 3; with “mGFP” we denote the GFP ROI as delineated based on the data of Marquardt et al. (2017); a more detailed discussion follows further below].

Even at the most liberal statistical threshold that we applied (p < 0.01, uncorrected), there was only a small patch related to the contrast gaze-direction following minus iris-color mapping partially overlapping with the mGFP (Fig. 2b, Table 1). Surprisingly, however, the contrast arrow-direction following minus arrow-color mapping yielded activity patches overlapping with the mGFP at both statistical thresholds (Fig. 2, Table 1). Both this area of relatively stronger activity during arrow-direction following compared with arrow-color mapping and the mGFP fall within parts of a ROI associated with the search term visual motion in the neurosynth.org (Yarkoni, n.d.) database (Fig. 2, white encircled areas). When searching for the functional associations of the locations activated in the arrow contrast, Neurosynth accordingly provides visual, motion, video, eye movements, etc., as expected (Table 1).

Activity only detectable for the gaze but not the arrow contrast at p < 0.001 (uncorrected) can be found dorsal to the mGFP in the left hemisphere, with its center at x, y, z = −51, −63, 15 (MNI). Anatomically, this location corresponds to BA39 or the temporoparietal junction. Searching for functional associations of this location at neurosynth.org (Yarkoni, n.d.) yields action observation, intention, mentalizing, social, and ToM as the first five results (Table 1). This location was not reported by Marquardt et al. (2017) or Materna et al. (2008a) to be activated during gaze following but fits the idea that perceiving the other's gaze evokes the assignment of intentions (Baron-Cohen, 1994, n.d.; Perrett and Emery, 1994; Nummenmaa and Calder, 2009).

These results provide a surprising picture of the GFP. First, a gaze stimulus lacking apparent motion does not yield relatively stronger activation at the mGFP location when compared with the color mapping condition. Second, however, arrows pointing toward the target, despite likewise lacking apparent motion, yield detectable activity confined to a region overlapping very well with the mGFP when contrasted with the color mapping condition. Note, however, that even the more conservative threshold of p < 0.001 is not corrected for multiple comparisons.

To capture the activity within the mGFP ROI beyond the contrasts and compare it to the activity within the visual motion ROI, we modeled the HRFs for each of the conditions and hemispheres individually (Fig. 3). We found that all conditions but the iris-color mapping condition yielded a significant activation in all ROIs. Moreover, a comparable pattern is apparent in all ROIs, with the arrow-direction condition featuring the strongest activation (Fig. 3, blue curves). However, there is no statistically significant difference between any of the conditions, as the 95% CIs overlap at all time points. Comparison of the general activity patterns obtained for both ROIs shows their similarity, which is not surprising since the mGFP overlaps nearly completely with the more lateral part of the visual motion ROI. The overall smaller amplitude of the HRF for the latter ROI is likely due to the fact that the visual motion ROI was about five times larger than the mGFP and therefore most probably comprised voxels that added task-independent signals.

This sheds new light on the negative result of the contrast analysis and raises the question of what triggers the activation of the mGFP, given the absence of apparent motion. Taking the results of the contrast analysis and the HRF estimates together, we can think of three possible explanations. First, it might be a nonspecific response within the visual system to the switch from a blank screen to the stimulus image. Indeed, it has been shown that BOLD activity in the early visual cortex is positively correlated with scene complexity (Groen et al., 2018). Unfortunately, it remains unclear whether the apparently higher visual complexity/size of the face stimulus compared with the arrows should lead to a stronger activation in the visual system, or whether this is to be expected from the less conspicuous arrows, which are for this very reason—corresponding to a de facto higher complexity—more difficult to recognize, thus resulting in a higher workload. On that ground, we cannot exclude this possibility. The second possibility is that the location corresponding to the mGFP ROI is relevant for processing the behaviorally relevant orientation of objects, independent of the type of object. However, we are not aware of any studies that would directly support this hypothesis. Some studies suggest neural tuning to object orientations in area V4 (Moore, 1999) as well as in parietal and frontal areas (Henderson and Serences, 2019). But because their paradigms and ours are very different, it is not clear how the results can be related to each other. As a third possibility, we propose that the observed activity patterns can be attributed to an effect described as implied motion. Both psychophysical (Faivre and Koch, 2014) and fMRI studies (Kourtzi and Kanwisher, 2000; Senior et al., 2000) have demonstrated that static images of moving objects elicit experimental effects that are known from the perception of visual motion (motion adaptation). How could this effect explain our results, given that the stimulus images used here do not depict moving objects? Guterstam et al. published a series of studies [Guterstam et al., 2019; Guterstam and Graziano, 2020a,b; but see our comment letter (Görner et al., 2020)] in which they demonstrated that viewing a static image of a schematic face looking toward an object is indeed able to evoke motion adaptation in the viewer. In an fMRI study, Guterstam et al. (2020) furthermore showed that these behavioral effects are accompanied by activity within the human MT+ complex. However, in their data, these effects are unique to gaze and are not present if the stimulus is an arrow, contradicting our results, which demonstrate a significant BOLD response to a face gazing at an object as well as to arrows pointing at an object. A study by Lorteije et al. (2011), however, presented evidence that the effects typically attributed to implied motion can be explained by low-level features of the stimuli such as orientation and size. Thus, the effect we find here for arrows seems plausible, though the interpretation would change accordingly. In fact, the results from Lorteije et al. relate to the second explanation (orientation tuning) that we describe above.

Other authors before us have used arrows as attentional cues and compared them to gaze in spatial cueing tasks inspired by Posner (1980). Due to the differences in the tasks, no conclusive comparisons can be made with our results, yet they help to contextualize our findings. For example, Hietanen et al. (2006) found that gaze and arrows, when used as central endogenous cues, elicit the same behavioral effects, but the BOLD activations differ. Activity related to arrows was much more widespread, involving all cortical lobes, while activity related to gaze was confined to the inferior and middle occipital gyri, however at some distance from the mGFP. A study by Callejas et al. (2014), on the other hand, reported that largely the same neural networks encode gaze and arrow cues but that some differential modulations occurred in parts of the intraparietal sulcus and in the MT+ region. Interestingly, they found that the MT+ region exhibited effects only in relation to arrow cues. Our results are consistent with both studies insofar as both report a stronger BOLD activation associated with arrow cues compared with gaze cues. However, unlike Callejas and colleagues, we also find a significant, albeit smaller, activation in the MT+ region in response to static gaze cues. As an explanation for the greater activity associated with the arrow stimuli, the authors suggest a greater automaticity of responses to gaze stimuli, as well as the possibility that the processing of arrow stimuli is more demanding. Given that the gaze of others is ubiquitous from birth and plays a crucial role in ontogeny, the processing of faces and gaze is likely to be highly optimized, resulting in smaller signal changes when investigated using fMRI. There are two neuropsychological cases that are of great interest as well. One is reported by Akiyama et al. (2006) and consists of an impairment in using gaze but not arrows as a spatial cue after a lesion to the (entire) right superior temporal gyrus. This case confirms that the superior temporal region, which has been reported to be involved in face perception (Kanwisher et al., 1997; Halgren et al., 1998; Haxby et al., 1999), especially in the perception of changeable aspects of faces such as eye and mouth movements (Puce et al., 1998; Hoffman and Haxby, 2000), is an integral component and that its loss leads to impairments in tasks in which eye recognition plays a crucial role. The other case is reported in two studies by Vecera and Rizzo (2004, 2006) and points toward another brain region that may be crucial for the perception and interpretation of gaze cues as well—the frontal lobe. They report the case of a patient with frontal lobe damage who had lost the ability to volitionally direct attention to peripheral locations through endogenous cues such as words and gaze, but could still attend automatically to exogenous cues. Based on this, they propose an association hypothesis according to which the gaze of another person is understood, analogous to words, through associations of gaze-directions with locations in space.

All this leaves a number of perspectives that are impossible to reconcile at this point. Given the lack of a positive control for the role of visual motion, we cannot be sure that the reason for the negative finding in the contrast analysis is indeed the absence of visual motion in the stimuli. However, the HRF estimates do provide strong evidence that the GFP does not differentiate between arrow and gaze stimuli, and that, if it differentiates at all, arrow stimuli elicit even stronger responses than gaze stimuli, contradicting the previous assumption of a specific functional role of the GFP in gaze following.

Footnotes

  • The authors declare no competing financial interests.

  • We thank Dr. Friedemann Bunjes for his technical assistance. We acknowledge support from the German Research Foundation Project TH425/17-1 and from the Open Access Publication Fund of the University of Tübingen.

  • M.G.'s present address: Max Planck Institute of Psychiatry, Department Emotion Research, 80804 Munich, Germany.

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.

References

  1. Abraham A, Pedregosa F, Eickenberg M, Gervais P, Mueller A, Kossaifi J, Gramfort A, Thirion B, Varoquaux G (2014) Machine learning for neuroimaging with scikit-learn. Front Neuroinform 8. https://doi.org/10.3389/fninf.2014.00014
  2. Akiyama T, Kato M, Muramatsu T, Saito F, Umeda S, Kashima H (2006) Gaze but not arrows: a dissociative impairment after right superior temporal gyrus damage. Neuropsychologia 44:1804–1810. https://doi.org/10.1016/j.neuropsychologia.2006.03.007
  3. Avants BB, Epstein CL, Grossman M, Gee JC (2008) Symmetric diffeomorphic image registration with cross-correlation: evaluating automated labeling of elderly and neurodegenerative brain. Med Image Anal 12:26–41. https://doi.org/10.1016/j.media.2007.06.004
  4. Baron-Cohen S (1994) How to build a baby that can read minds: cognitive mechanisms in mind reading. Curr Psychol Cogn 13:513–552.
  5. Baron-Cohen S (n.d.) The empathizing system: a revision of the 1994 model of the mindreading system.
  6. Behzadi Y, Restom K, Liau J, Liu TT (2007) A component based noise correction method (CompCor) for BOLD and perfusion based fMRI. NeuroImage 37:90–101. https://doi.org/10.1016/j.neuroimage.2007.04.042
  7. Callejas A, Shulman GL, Corbetta M (2014) Dorsal and ventral attention systems underlie social and symbolic cueing. J Cogn Neurosci 26:63–80. https://doi.org/10.1162/jocn_a_00461
  8. Cox RW, Hyde JS (1997) Software tools for analysis and visualization of fMRI data. NMR Biomed 10:171–178. https://doi.org/10.1002/(SICI)1099-1492(199706/08)10:4/5<171::AID-NBM453>3.0.CO;2-L
  9. Dale AM, Fischl B, Sereno MI (1999) Cortical surface-based analysis: I. Segmentation and surface reconstruction. NeuroImage 9:179–194. https://doi.org/10.1006/nimg.1998.0395
  10. De Hollander G, Knapen T (2017) nideconv.
  11. Esteban O, et al. (2019) fMRIPrep: a robust preprocessing pipeline for functional MRI. Nat Methods 16:111–116. https://doi.org/10.1038/s41592-018-0235-4
  12. Esteban O, et al. (2022) fMRIPrep: a robust preprocessing pipeline for functional MRI.
  13. Faivre N, Koch C (2014) Inferring the direction of implied motion depends on visual awareness. J Vis 14:4. https://doi.org/10.1167/14.4.4
  14. Fodor JA (1983) The modularity of mind: an essay on faculty psychology. Cambridge, MA: MIT Press.
  15. Fonov V, Evans A, McKinstry R, Almli C, Collins D (2009) Unbiased nonlinear average age-appropriate brain templates from birth to adulthood. NeuroImage 47:S102. https://doi.org/10.1016/S1053-8119(09)70884-5
  16. Glasser MF, et al. (2013) The minimal preprocessing pipelines for the Human Connectome Project. NeuroImage 80:105–124. https://doi.org/10.1016/j.neuroimage.2013.04.127
  17. Görner M, Ramezanpour H, Chong I, Thier P (2020) Does the brain encode the gaze of others as beams emitted by their eyes? Proc Natl Acad Sci U S A 117:20375–20376. https://doi.org/10.1073/pnas.2012462117
  18. Greve DN, Fischl B (2009) Accurate and robust brain image alignment using boundary-based registration. NeuroImage 48:63–72. https://doi.org/10.1016/j.neuroimage.2009.06.060
  19. Groen IIA, Jahfari S, Seijdel N, Ghebreab S, Lamme VAF, Scholte HS (2018) Scene complexity modulates degree of feedback activity during object detection in natural scenes. PLoS Comput Biol 14:e1006690. https://doi.org/10.1371/journal.pcbi.1006690
  20. Guterstam A, Graziano MSA (2020a) Implied motion as a possible mechanism for encoding other people’s attention. Prog Neurobiol 190:101797. https://doi.org/10.1016/j.pneurobio.2020.101797
  21. Guterstam A, Graziano MSA (2020b) Visual motion assists in social cognition. Proc Natl Acad Sci U S A 117:32165–32168. https://doi.org/10.1073/pnas.2021325117
  22. Guterstam A, Kean HH, Webb TW, Kean FS, Graziano MSA (2019) Implicit model of other people’s visual attention as an invisible, force-carrying beam projecting from the eyes. Proc Natl Acad Sci U S A 116:328–333. https://doi.org/10.1073/pnas.1816581115
  23. Guterstam A, Wilterson AI, Wachtell D, Graziano MSA (2020) Other people’s gaze encoded as implied motion in the human brain. Proc Natl Acad Sci U S A 117:13162–13167. https://doi.org/10.1073/pnas.2003110117
  24. Halgren E, Dale AM, Sereno MI, Tootell RBH, Marinkovic K, Rosen BR (1998) Location of human face-selective cortex with respect to retinotopic areas. Hum Brain Mapp 7:29–37. https://doi.org/10.1002/(SICI)1097-0193(1999)7:1<29::AID-HBM3>3.0.CO;2-R
  25. Haxby JV, Ungerleider LG, Clark VP, Schouten JL, Hoffman EA, Martin A (1999) The effect of face inversion on activity in human neural systems for face and object perception. Neuron 22:189–199. https://doi.org/10.1016/S0896-6273(00)80690-X
  26. Henderson M, Serences JT (2019) Human frontoparietal cortex represents behaviorally relevant target status based on abstract object features. J Neurophysiol 121:1410–1427. https://doi.org/10.1152/jn.00015.2019
  27. Hietanen JK, Nummenmaa L, Nyman MJ, Parkkola R, Hämäläinen H (2006) Automatic attention orienting by social and symbolic cues activates different neural networks: an fMRI study. NeuroImage 33:406–413. https://doi.org/10.1016/j.neuroimage.2006.06.048
  28. Hoffman EA, Haxby JV (2000) Distinct representations of eye gaze and identity in the distributed human neural system for face perception. Nat Neurosci 3:80–84. https://doi.org/10.1038/71152
  29. Jenkinson M, Bannister P, Brady M, Smith S (2002) Improved optimization for the robust and accurate linear registration and motion correction of brain images. NeuroImage 17:825–841. https://doi.org/10.1006/nimg.2002.1132
  30. Kamphuis S, Dicke PW, Thier P (2009) Neuronal substrates of gaze following in monkeys. Eur J Neurosci 29:1732–1738. https://doi.org/10.1111/j.1460-9568.2009.06730.x
  31. Kanwisher N, McDermott J, Chun MM (1997) The fusiform face area: a module in human extrastriate cortex specialized for face perception. J Neurosci 17:4302–4311. https://doi.org/10.1523/JNEUROSCI.17-11-04302.1997
  32. Klein A, et al. (2017) Mindboggling morphometry of human brains. PLoS Comput Biol 13:e1005350. https://doi.org/10.1371/journal.pcbi.1005350
  33. Kourtzi Z, Kanwisher N (2000) Activation in human MT/MST by static images with implied motion. J Cogn Neurosci 12:48–55. https://doi.org/10.1162/08989290051137594
  34. Kraemer PM, Görner M, Ramezanpour H, Dicke PW, Thier P (2020) Frontal, parietal, and temporal brain areas are differentially activated when disambiguating potential objects of joint attention. eNeuro 7. https://doi.org/10.1523/ENEURO.0437-19.2020
  35. Lanczos C (1964) Evaluation of noisy data. J Soc Ind Appl Math Ser B Numer Anal 1:76–85. https://doi.org/10.1137/0701007
  36. Laube I, Kamphuis S, Dicke PW, Thier P (2011) Cortical processing of head- and eye-gaze cues guiding joint social attention. NeuroImage 54:1643–1653. https://doi.org/10.1016/j.neuroimage.2010.08.074
  37. Lindquist MA, Loh JM, Atlas LY, Wager TD (2009) Modeling the hemodynamic response function in fMRI: efficiency, bias and mis-modeling. NeuroImage 45:S187–S198. https://doi.org/10.1016/j.neuroimage.2008.10.065
  38. Lorteije JAM, Barraclough NE, Jellema T, Raemaekers M, Duijnhouwer J, Xiao D, Oram MW, Lankheet MJM, Perrett DI, van Wezel RJA (2011) Implied motion activation in cortical area MT can be explained by visual low-level features. J Cogn Neurosci 23:1533–1548. https://doi.org/10.1162/jocn.2010.21533
  39. Marquardt K, Ramezanpour H, Dicke PW, Thier P (2017) Following eye gaze activates a patch in the posterior temporal cortex that is not part of the human “face patch” system. eNeuro 4:1–10. https://doi.org/10.1523/ENEURO.0317-16.2017
  40. Materna S, Dicke PW, Thier P (2008a) Dissociable roles of the superior temporal sulcus and the intraparietal sulcus in joint attention: a functional magnetic resonance imaging study. J Cogn Neurosci 20:108–119. https://doi.org/10.1162/jocn.2008.20008
  41. Materna S, Dicke PW, Thier P (2008b) The posterior superior temporal sulcus is involved in social communication not specific for the eyes. Neuropsychologia 46:2759–2765. https://doi.org/10.1016/j.neuropsychologia.2008.05.016
  42. Moore T (1999) Shape representations and visual guidance of saccadic eye movements. Science 285:1914–1917. https://doi.org/10.1126/science.285.5435.1914
  43. Nilearn contributors (n.d.) Nilearn: statistics for neuroimaging in Python.
  44. Nummenmaa L, Calder AJ (2009) Neural mechanisms of social attention. Trends Cogn Sci 13:135–143. https://doi.org/10.1016/j.tics.2008.12.006
  45. Perrett DI, Emery NJ (1994) Understanding the intentions of others from visual signals: neuropsychological evidence. Cah Psychol Cogn 13:683–694.
  46. Posner MI (1980) Orienting of attention. Q J Exp Psychol 32:3–25. https://doi.org/10.1080/00335558008248231
  47. Power JD, Mitra A, Laumann TO, Snyder AZ, Schlaggar BL, Petersen SE (2014) Methods to detect, characterize, and remove motion artifact in resting state fMRI. NeuroImage 84:320–341. https://doi.org/10.1016/j.neuroimage.2013.08.048
  48. Puce A, Allison T, Bentin S, Gore JC, McCarthy G (1998) Temporal cortex activation in humans viewing eye and mouth movements. J Neurosci 18:2188–2199. https://doi.org/10.1523/JNEUROSCI.18-06-02188.1998
  49. Ramachandran VS, Anstis SM (1986) The perception of apparent motion. Sci Am 254:102–109. https://doi.org/10.1038/scientificamerican0686-102
  50. Ramezanpour H, Thier P (2020) Decoding of the other’s focus of attention by a temporal cortex module. Proc Natl Acad Sci U S A 117:2663–2670. https://doi.org/10.1073/pnas.1911269117
  51. Satterthwaite TD, et al. (2013) An improved framework for confound regression and filtering for control of motion artifact in the preprocessing of resting-state functional connectivity data. NeuroImage 64:240–256. https://doi.org/10.1016/j.neuroimage.2012.08.052
  52. Senior C, Barnes J, Giampietro V, Simmons A, Bullmore ET, Brammer M, David AS (2000) The functional neuroanatomy of implicit-motion perception or representational momentum. Curr Biol 10:16–22. https://doi.org/10.1016/S0960-9822(99)00259-6
  53. Stemmann H, Freiwald WA (2016) Attentive motion discrimination recruits an area in inferotemporal cortex. J Neurosci 36:11918–11928. https://doi.org/10.1523/JNEUROSCI.1888-16.2016
  54. Stemmann H, Freiwald WA (2019) Evidence for an attentional priority map in inferotemporal cortex. Proc Natl Acad Sci U S A 116:23797–23805. https://doi.org/10.1073/pnas.1821866116
  55. Tustison NJ, Avants BB, Cook PA, Zheng Y, Egan A, Yushkevich PA, Gee JC (2010) N4ITK: improved N3 bias correction. IEEE Trans Med Imaging 29:1310–1320. https://doi.org/10.1109/TMI.2010.2046908
  56. Vecera SP, Rizzo M (2004) What are you looking at? Impaired “social attention” following frontal-lobe damage. Neuropsychologia 42:1657–1665. https://doi.org/10.1016/j.neuropsychologia.2004.04.009
  57. Vecera SP, Rizzo M (2006) Eye gaze does not produce reflexive shifts of attention: evidence from frontal-lobe damage. Neuropsychologia 44:150–159. https://doi.org/10.1016/j.neuropsychologia.2005.04.010
  58. Wertheimer M (1912) Experimentelle Studien über das Sehen von Bewegung [Experimental studies on the seeing of motion]. Leipzig: J.A. Barth.
  59. Yarkoni T (n.d.) Neurosynth. https://neurosynth.org/ (accessed August 31, 2022).
  60. Zhang Y, Brady M, Smith S (2001) Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Trans Med Imaging 20:45–57. https://doi.org/10.1109/42.906424

Synthesis

Reviewing Editor: Niko Busch, Westfälische Wilhelms-Universität Münster

Decisions are customarily a result of the Reviewing Editor and the peer reviewers coming together and discussing their recommendations until a consensus is reached. When revisions are invited, a fact-based synthesis statement explaining their decision and outlining what is needed to prepare a revision will be listed below. The following reviewer(s) agreed to reveal their identity: Arvid Guterstam.

# Synthesis

This manuscript has undergone review by two experts in the field of gaze perception. Their detailed feedback is appended below for your consideration. Both reviewers have raised important points regarding further analyses, methodological clarifications, and the interpretation of your findings, which I strongly encourage you to address comprehensively.

A common thread in the reviews is the emphasis on substantiating the null findings presented in your study. I agree with Reviewer 1 that null results are, in principle, scientifically valuable, and eNeuro is committed to publishing any type of result that is based on rigorous research. However, I also agree that such findings should ideally demonstrate that the null finding was not due to an ineffective experimental manipulation, confounds (e.g. due to eye movements), or insufficient data quality. In this vein, I concur with Reviewer 1's suggestion that including a positive control could significantly strengthen your manuscript by affirming that the observed null findings are not due to the reasons mentioned above.

Finally, I would like to ask you to consider making data and code publicly available.

# Reviewer 1

In this study, the authors compared BOLD activity within the gaze-following patch (GFP) during gaze-direction following versus arrow-direction following. Contrary to their initial hypothesis, they did not find a significant difference between gaze- and arrow-direction following, and were not able to reproduce the GFP as presented in previous studies. They speculate that a possible explanation for these negative results is that the gaze stimuli in the present study, in contrast to previous studies, did not contain an obvious change of direction that represented a visual motion.

This study addresses an important topic relating to the role of the GFP in gaze-following. The results are negative across the board; there is not a single activation or contrast that survives adequate correction for multiple comparisons. Interpreting negative results is notoriously difficult, and I therefore commend the authors for submitting their negative results for publication instead of putting them in the drawer. That said, I have some major concerns/comments.

1. In the significance statement, the authors state: "This study presents evidence that the activation of a brain area previously thought to be a domain-specific cortical module specialized in the processing of other people's direction of visual attention does, in fact, depend on the presence of visual motion. Therefore, it must be characterized in a domain-general sense." This conclusion cannot be drawn from the presented data due to a lack of a positive control for the role of apparent motion (which the authors acknowledge in the Discussion), and therefore the conclusions need to be toned down.

2. It would be very interesting to apply multivoxel pattern analysis (MVPA) to the current data set. MVPA is arguably more sensitive than simple univariate contrasts, and relevant information regarding gaze-following in the GFP might be contained in the fine-grained relative activities of neighbouring voxels, information that is lost in univariate contrasts. If gaze-following vs iris-color mapping can be decoded significantly better than chance from patterns of BOLD activity in the GFP, and if the GFP is domain-specific, one would predict that a classifier trained on gaze-following vs iris-color mapping cannot generalize to decode arrow-direction following versus arrow-color mapping. However, if the alternative hypothesis is true, and the GFP is domain-general, one would expect the classifier to generalize. Using this approach, it might be possible to directly contrast the two alternative hypotheses (a concrete sketch of this scheme follows this list).

3. Since the results are negative and a positive control for the role of apparent motion is lacking, the conclusions of the paper in its current form are speculative. I leave it up to the editor to determine whether this work meets the standards for publication in eNeuro.
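To make Reviewer 1's proposal in comment 2 concrete, the following minimal sketch shows one way such a cross-condition decoding analysis could be set up. It is an editorial illustration, not the authors' pipeline: the variable names, data shapes, and choice of classifier are all assumptions, and it presumes per-trial beta patterns have already been extracted from a GFP region of interest (e.g., with Nilearn's NiftiMasker).

```python
"""Sketch of the cross-condition decoding scheme from comment 2 above.

Train a classifier to separate gaze-direction-following from iris-color-mapping
trials using voxel patterns inside the GFP ROI, then test whether the SAME
classifier separates arrow-direction-following from arrow-color-mapping trials.
All names and shapes are hypothetical; this is not the authors' code.
"""
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC


def within_and_cross_decoding(X_gaze, y_gaze, X_arrow, y_arrow, n_splits=5):
    """X_*: (n_trials, n_roi_voxels) beta patterns; y_*: 0/1 labels
    (direction following vs. color mapping within each cue domain)."""
    clf = make_pipeline(StandardScaler(), LinearSVC())

    # Within-domain decoding: gaze-direction vs. iris-color, cross-validated.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    within_acc = cross_val_score(clf, X_gaze, y_gaze, cv=cv).mean()

    # Cross-domain transfer: train on all gaze trials, test on arrow trials.
    clf.fit(X_gaze, y_gaze)
    transfer_acc = clf.score(X_arrow, y_arrow)
    return within_acc, transfer_acc


if __name__ == "__main__":
    # Synthetic stand-in data: 40 trials x 200 ROI voxels per cue domain.
    rng = np.random.default_rng(0)
    X_gaze, X_arrow = rng.normal(size=(40, 200)), rng.normal(size=(40, 200))
    y = np.repeat([0, 1], 20)  # balanced labels for stratified CV
    print(within_and_cross_decoding(X_gaze, y, X_arrow, y))
```

Note the interpretive caveat the authors raise in their response below: chance-level transfer would not by itself establish domain specificity, because low-level stimulus differences between faces and arrows could block generalization on their own.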

# Reviewer 2

Results:

page 8. "The HRFs for the mGFP show a statistically significant (95% CIs do not include the activation level of 0) deflection around 5 sec after cue onset in both hemispheres in all conditions but the iris-color condition". How does the author decide on the 5sec timepoint? What is the rationale/motivation on choosing this particular time? How long do we expect to see this effect last?

Methods:

Figure 1. "After cue presentation the participants had to wait with their response until the central fixation dot disappeared". How does the author make sure the participants look at central fixation during the process? If participants make saccades/eye movements before the central fixations, how did the author deal with the data?

Was there a difference in the results between trials in which participants looked at the correct target and trials in which they looked at an incorrect target?

What eye tracker did the authors use, and what are its specifications? How were eye-tracking calibration and setup implemented?

Discussion:

The authors propose in the Discussion: "As a third possibility, we propose that the observed activity patterns can be attributed to an effect described as implied motion." I have trouble understanding why, if the arrow-direction following case (without apparent motion) elicits significant results in the mGFP, gaze-direction following does not show this "implied motion" effect. Could there be a difference in the temporal development of these two cases? Perhaps the authors could add more explanation in the Discussion section.

Author Response

# Synthesis

This manuscript has undergone review by two experts in the field of gaze perception. Their detailed feedback is appended below for your consideration. Both reviewers have raised important points regarding further analyses, methodological clarifications, and the interpretation of your findings, which I strongly encourage you to address comprehensively.

A common thread in the reviews is the emphasis on substantiating the null findings presented in your study. I agree with Reviewer 1 that null results are, in principle, scientifically valuable, and eNeuro is committed to publishing any type of result that is based on rigorous research. However, I also agree that such findings should ideally demonstrate that the null finding was not due to an ineffective experimental manipulation, confounds (e.g. due to eye movements), or insufficient data quality. In this vein, I concur with Reviewer 1's suggestion that including a positive control could significantly strengthen your manuscript by affirming that the observed null findings are not due to the reasons mentioned above.

Authors' answer: This point is indeed of utmost importance, and we are aware that the lack of a positive control is the study's greatest weakness, limiting the interpretability of the first part of the analysis (BOLD contrasts). Unfortunately, for organizational reasons (several of the subjects we tested are no longer available in Tübingen), it is not possible to add a meaningful positive control experiment post hoc. Such a control experiment would have to be based on a new group of subjects, therefore limiting its value. Hence, in view of the need to compare different populations of subjects, we argue that adding such a control experiment would not strengthen the study more than referring to our old studies on the GFP, including, for example, Marquardt et al., 2017, eNeuro, from which we took the stimuli with the modification described in the manuscript. (You may remove the latter sentence when handing over our answers to the reviewers, but we also agree to open up the blinding if it helps to clarify this point.)

That said, we do argue that the results presented in the study are not solely negative; even though we were not able to reproduce the GFP using the univariate methods used previously (Fig. 2, Tab. 1), we were able to demonstrate a statistically significant activation in the ROI analysis in all but the iris-color mapping condition and, importantly, that this BOLD activation is strongest in the arrow-direction following condition (Fig. 3). This result alone demonstrates that the GFP is not a dedicated gaze module, as was proposed in earlier studies. This is the main reason why we think the study is worth publishing without having the aforementioned positive control condition for the influence of visual motion. We thank you and the reviewers for having pointed out this issue, and we have revised the manuscript to discuss this important consideration in more detail (pp. 10/11, 12 and 14).

Finally, I would like to ask you to consider making data and code publicly available.

Authors' answer: We have added the URLs to the resources to the manuscript.

# Reviewer 1

In this study, the authors compared BOLD activity within the gaze-following patch (GFP) during gaze-direction following versus arrow-direction following. Contrary to their initial hypothesis, they did not find a significant difference between gaze- and arrow-direction following, and were not able to reproduce the GFP as presented in previous studies. They speculate that a possible explanation for these negative results is that the gaze stimuli in the present study, in contrast to previous studies, did not contain an obvious change of direction that represented a visual motion.

This study addresses an important topic relating to the role of the GFP in gaze-following. The results are negative across the board; there is not a single activation or contrast that survives adequate correction for multiple comparisons. Interpreting negative results is notoriously difficult, and I therefore commend the authors for submitting their negative results for publication instead of putting them in the drawer. That said, I have some major concerns/comments.

1. In the significance statement, the authors state: "This study presents evidence that the activation of a brain area previously thought to be a domain-specific cortical module specialized in the processing of other people's direction of visual attention does, in fact, depend on the presence of visual motion. Therefore, it must be characterized in a domain-general sense." This conclusion cannot be drawn from the presented data due to a lack of a positive control for the role of apparent motion (which the authors acknowledge in the Discussion), and therefore the conclusions need to be toned down.

Authors' answer: We fully agree, and we have revised the significance statement as follows:

Significance Statement. This study presents evidence against the notion of domain-specificity of an area in the posterior temporal cortex (the gaze-following patch; GFP) previously described to specifically serve eye gaze following. This conclusion is suggested by the finding that using arrows to identify a target object among distractors is accompanied by a comparable or even larger BOLD response than when the participants are asked to use the gaze direction of a demonstrator face for target selection. The fact that even the best candidate to date, the posterior temporal GFP, does not stand up to critical scrutiny casts doubt on the assumption that the brain uses a specific module to enable gaze following, as proposed by Simon Baron-Cohen.

2. It would be very interesting to apply multivoxel pattern analysis (MVPA) to the current data set. MVPA is arguably more sensitive than simple univariate contrasts, and relevant information regarding gaze-following in the GFP might be contained in the fine-grained relative activities of neighboring voxels, information that is lost in univariate contrasts. If gaze-following vs iris-color mapping can be decoded significantly better than chance from patterns of BOLD activity in the GFP, and if the GFP is domain-specific, one would predict that a classifier trained on gaze-following vs iris-color mapping cannot generalize to decode arrow-direction following versus arrow-color mapping. However, if the alternative hypothesis is true, and the GFP is domain-general, one would expect the classifier to generalize. Using this approach, it might be possible to directly contrast the two alternative hypotheses.

Authors' answer: We have considered using MVPA, and especially cross-condition decoding. However, we came to the conclusion that this would not help to resolve the alternative of domain-specificity vs. generality, for the following reason: we think it is false to assume that if we cannot decode arrow-direction vs arrow-color using a decoder trained on gaze-direction vs iris-color, then the GFP could be considered domain-specific. This is because if the GFP were a "domain-general early visual" area, its activity would depend on the low-level features of the stimuli, which are very different between the two (face and arrows). This alone would hinder successful cross-decoding without the necessity of domain-specificity. On the other hand, if we could decode, we would not have learned more than we already have from the univariate analysis and the HRF modeling. The only interesting result here would be the inability to decode arrow-direction vs arrow-color at all (meaning also not when trained on arrow-direction vs arrow-color) but only gaze-direction vs iris-color. However, this appears unlikely given that the activation during both arrow conditions is actually stronger, though differential, than during the gaze/iris conditions (see Fig. 3). If the reviewer thinks, and convinces us, that this argument is erroneous, we would be happy to reconsider applying MVPA.

3. Since the results are negative and a positive control for the role of apparent motion is lacking, the conclusions of the paper in its current form are speculative. I leave it up to the editor to determine whether this work meets the standards for publication in eNeuro.

Authors' answer: We fully agree with this reviewer's criticism regarding the limited interpretation concerning the role of visual motion, given the lack of a positive control. Nevertheless, we think that the second part of the analysis, the HRF estimations, does provide direct evidence that the GFP does not differentiate between an arrow pointing towards an object and a face gazing at an object, irrespective of the role of visual motion. We have toned down the discussion regarding the statements on the role of visual motion and emphasized the ROI analysis more. (See also our answer to the Synthesis.)

# Reviewer 2

Results:

page 8. "The HRFs for the mGFP show a statistically significant (95% CIs do not include the activation level of 0) deflection around 5 sec after cue onset in both hemispheres in all conditions but the iris-color condition". How did the authors decide on the 5 sec time point? What is the rationale for choosing this particular time? How long should we expect this effect to last?

Authors' answer: Here we had assumed, without justification, that the reader is familiar with the standard hemodynamic response function. The 5 second time point was highlighted because it reflects the peak of the canonical HRF; that is, it is not a time point we chose but the canonical maximum of the BOLD response. We have clarified this in the manuscript. (A brief illustration of this canonical peak follows this response.)

Methods:

Figure 1. "After cue presentation the participants had to wait with their response until the central fixation dot disappeared". How did the authors make sure the participants maintained central fixation during this period? If participants made saccades or other eye movements before the fixation dot disappeared, how were those data handled? Was there a difference in the results between trials in which participants looked at the correct target and trials in which they looked at an incorrect target? What eye tracker did the authors use, and what are its specifications? How were eye-tracking calibration and setup implemented?

Authors' answer: This is a valid criticism. We recorded eye-tracking data during the fMRI recordings. However, the recordings suffered from bad quality and could not be analyzed consistently. When we tried, we were able to analyze only ~30% of trials. In ~10% of these "good" trials, participants made an error. This was the case for all experimental conditions and roughly matches the behavioral results reported in Marquardt et al. 2017, eNeuro. Further, the eye-tracking camera enabled us to manually monitor participants' behavior during the experiment, giving us confidence that participants generally complied with the task requirements. Before the fMRI session, participants underwent a practice session in which their behavior was closely monitored, and corrected, as well. This, together with the fact that the task was easy and intuitive, makes us confident that the number of error trials was indeed small and thus can be ignored. We have added this reasoning to the manuscript. That said, it would indeed be very interesting to specifically analyze error trials. However, this would require a different experimental design and, especially, more successful efforts in recording eye-tracking data.

Discussion:

The authors propose in the Discussion: "As a third possibility, we propose that the observed activity patterns can be attributed to an effect described as implied motion." I have trouble understanding why, if the arrow-direction following case (without apparent motion) elicits significant results in the mGFP, gaze-direction following does not show this "implied motion" effect. Could there be a difference in the temporal development of these two cases? Perhaps the authors could add more explanation in the Discussion section.

Authors' answer: We think there is a misunderstanding here (we have revised the respective paragraph in the manuscript). There is indeed a significant activation in the gaze-following condition as well! However, the response correlated with gaze following appears to be smaller than the one related to arrow-direction following (see Fig. 3). We find this result particularly interesting because it is at odds with a finding from Michael Graziano's lab, which reported that there is no "implied motion effect" when observing arrows (see Discussion).
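For readers unfamiliar with the canonical hemodynamic response invoked in the answer to Reviewer 2 above, the short sketch below (an editorial illustration, not the authors' code) evaluates the canonical Glover HRF with Nilearn, which is cited in the reference list, and shows that it peaks roughly 5 s after event onset.

```python
# Editorial illustration: the canonical (Glover) HRF used in standard fMRI
# GLMs peaks about 5 s after event onset, which is the latency at which the
# mGFP responses were read out in the ROI analysis.
import numpy as np
from nilearn.glm.first_level import glover_hrf

dt = 0.1  # sample the canonical HRF on a fine 0.1 s grid
hrf = glover_hrf(dt, oversampling=1, time_length=32.0)
t = np.arange(hrf.size) * dt
print(f"Canonical HRF peaks {t[np.argmax(hrf)]:.1f} s after onset")  # ~5 s
```

The same curve also outlines an answer to the reviewer's duration question: after the peak, the canonical response decays through a shallow undershoot and returns to baseline within roughly 20 to 30 s.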

Keywords

  • fMRI
  • joint attention
  • social cognition
  • spatial cueing
