Abstract
Visual perception includes ventral and dorsal stream processes. However, it is still unclear whether the former is predominantly related to conscious and the latter to nonconscious visual perception as argued in the literature. In this study upright and inverted body postures were rendered either visible or invisible under continuous flash suppression (CFS), while brain activity of human participants was measured with functional MRI (fMRI). Activity in the ventral body-sensitive areas was higher during visible conditions. In comparison, activity in the posterior part of the bilateral intraparietal sulcus (IPS) showed a significant interaction of stimulus orientation and visibility. Our results provide evidence that dorsal stream areas are less associated with visual awareness.
Significance Statement
The occipital and parietal lobes of the human brain include two visual processing streams, a ventral one more involved in object recognition, and a dorsal one for spatial-, attention-, and action-related processes. It is currently still unclear what the relation is between activity in the dorsal processing stream and consciousness, as evidence so far has been scarce and inconsistent. Our study with whole-body stimuli shows that activity in the dorsal pathway is substantially less influenced by subjective stimulus awareness, while activity in the ventral pathway is significantly higher for consciously perceived stimuli. Our results clarify this important difference between visual perception in the dorsal and ventral streams.
Introduction
The occipito-temporal and parietal lobes of the human brain contain two major processing streams: the ventral stream is involved more in processes related to object recognition, and the dorsal one more in spatial processing, attention, and online control of actions (Milner and Goodale, 2006).
An important open question concerns the relation of these two processing streams to subjective awareness. A dissociation has been shown in patients with brain lesions: a patient with lateral occipital cortex damage could perform visually guided actions according to the size, shape or orientation of objects and tools, despite being unable to consciously differentiate those properties (Carey et al., 1996; Milner, 2012); patients with parieto-temporal cortex damage (McIntosh et al., 2004) or bilateral V1 damage (de Gelder et al., 2008) showed obstacle avoidance without being consciously aware of the obstacle. Addressing the relationship between the two streams in neurotypical participants requires controlled presentation of subjectively unseen stimuli, as can be achieved using the continuous flash suppression (CFS) method. Under CFS, the dichoptic presentation of a target stimulus and a dynamic noise pattern renders the target invisible for several seconds (Tsuchiya and Koch, 2005; Tsuchiya et al., 2006; Yang et al., 2014).
The two-stream view does not imply an absolute division, and processing of some object categories clearly involves both streams. For example, tools (Johnson-Frey, 2004; Culham et al., 2006) trigger activity related to the object category in ventral areas, but also to action-observation-execution in dorsal areas. Several functional MRI (fMRI) studies directly compared ventral and dorsal activity and their relationship with visual awareness using CFS. They varied in experimental design, but all used either tools alone or tools together with faces as stimuli. For the ventral stream, these studies consistently showed that the activity in ventral-lateral areas including the fusiform area and the lateral occipital area covaried with subjective perceptual awareness for both faces and tools, and that the activity for invisible faces/tools was significantly lower than for visible ones (Fang and He, 2005; Hesselmann et al., 2011; Hesselmann and Malach, 2011; Ludwig et al., 2015).
For the dorsal stream, however, the evidence is still not conclusive. One of the four abovementioned studies presented visible and invisible trials in separate runs without trial-by-trial subjective reports, and used long baseline conditions (Fang and He, 2005). They found that activity in dorsal areas was diminished for invisible faces but not for invisible tools. This dissociation may be due to the different relations of these two categories to the function of reaching and grasping in the dorsal stream. By comparison, the other three studies presented visible and invisible trials in the same run, with trial-by-trial subjective reports. They all found higher activity for visible tools in both ventral and dorsal areas that covaried with visual awareness (Hesselmann et al., 2011; Hesselmann and Malach, 2011; Ludwig et al., 2015). Of the two studies that performed multivariate pattern analyses between faces and tools, Hesselmann et al. (2011) found that invisible faces and tools were only decodable in the fusiform area, whereas Ludwig et al. (2015) found them decodable both in the right V3a/V7 in the dorsal stream, and in FFA in the ventral stream. A fifth study did not examine the amplitude of activity across the two streams, but specifically examined the decodability of faces and tools, with five different strengths of CFS masks. They found that the faces and tools were decodable in both streams with the no-mask condition and the weaker masks, which were associated with higher levels of subjective visibility, but were not decodable for the stronger masks with lower levels of subjective visibility (Ludwig et al., 2016).
Here, we examined whether the assumed dissociation between the ventral and dorsal streams holds, and whether the ventral stream is mainly related to conscious and the dorsal stream to nonconscious perception, using stimuli of whole-body images. Human body stimuli have a unique combination of properties and are particularly useful to explore this issue, because they activate the action-related dorsal network (Rizzolatti and Sinigaglia, 2010) similarly to tool stimuli, and at the same time, bodies are processed in category specific areas (extrastriate body area, EBA; fusiform body area, FBA) in the ventral stream (Peelen and Downing, 2007). It has been shown that information of the body stimuli could be processed without visual awareness, through pathways other than V1, activating the EBA of the same patient with bilateral V1 lesions who showed object avoidance (Van den Stock et al., 2014). As a biologically meaningful category, the processing of bodies is also disrupted by the inversion of stimuli (Reed et al., 2003) similar to faces, and under the breaking CFS paradigm (b-CFS) which measures suppression time of stimuli and indirectly reflects the nonconscious processing, inverted bodies have been shown to be suppressed longer than upright bodies (Stein et al., 2012).
We presented the body stimuli either upright or inverted, and rendered them either invisible or visible with CFS, using a slow event-related design. By including inverted versions of exactly the same upright stimuli, the experimental design enabled us to examine the interaction of body orientation and subjective visibility, and to clarify the relationship between dorsal and ventral areas with respect to visual awareness. We measured blood-oxygenation level-dependent (BOLD) activity with fMRI of relatively high resolution (2 × 2 × 2 mm³), while participants passively viewed dichoptic stimuli through a pair of prism glasses. The four types of trials [orientation (upright, inverted) × visibility (visible, invisible)] were balanced and presented within the same runs (Fig. 1). We examined the relationships between orientation and visibility both with the general linear model (GLM) analysis and ANOVA in the whole brain, and performed the ANOVAs in ventral and dorsal regions of interest (ROIs) defined with a separate functional localizer for individual participants.
Materials and Methods
Participants
Eight participants took part in the current study. The data of seven were used for the analysis (two males, mean age = 25.7, SD = 4.1, two left-handed); the data of the remaining participant (participant 8) were excluded because of imperfect suppression (see Validation of the suppression effect in this section). Because the current experiment involved long scan sessions (2 h), sustained and complete CFS suppression was crucial, so we applied stringent recruitment criteria regarding suppression effects. Participants 2, 4, and 5 were recruited based on their performance in a separate CFS priming experiment using faces and bodies as prime stimuli where they did not perceive either faces or bodies (a total of 25 participants had taken part in this experiment at the time of invitation for the current fMRI study, for 16 of whom the percept of the body stimuli was well suppressed, and for nine of those 16 the percept of the face stimuli was also suppressed). Participants 1, 3, 6, and 7 were colleagues in the department who had participated in various similar CFS pilot experiments with either suppressed faces or bodies. Participant 8 was recruited from another CFS experiment (20 participants in that experiment in total), where the visibility of body and object stimuli was manipulated in the same way as in the current fMRI experiment. For participant 8 the body stimuli were suppressed in 90.5% of the unseen trials in that experiment. This screening procedure also precluded any noticeable training effect of the task during the current fMRI experiment. As it was difficult to find participants with strong and stable suppression, we also included left-handed participants. Although all participants were familiar with the CFS paradigm, they were all naïve to the aim of the current study. The participants had no history of neurological disorders, had normal or corrected-to-normal visual acuity, and had normal stereoscopic color vision.
They provided written informed consent for participation and received monetary rewards. The experimental procedures were approved by the ethical committee of Maastricht University, and were conducted in accordance with the standards established by the Declaration of Helsinki.
The main experiment
Stimuli
Images of upright body postures expressing fear (24 identities, 12 female) were selected from a validated set of whole-body stimuli (Stienen and de Gelder, 2011). To ensure successful suppression of stimuli during the scanning sessions, we used fearful body postures, because a previous behavioral study found that fearful bodies were suppressed longer under CFS than neutral and angry bodies (Zhan et al., 2015). The bodies were aligned at the feet level, with facial information removed, and embedded in a gray background (RGB value = 128,128,128, size = 240 × 160 pixels, 3.81 × 2.54° of visual angle). The bodies in the images had a height within 161 pixels (2.56°), and a width within 80 pixels (1.27°). The inverted body stimuli were created by turning the upright stimuli upside-down. For catch trials, the stimulus image contained one to four dots, randomly located in the image.
Dynamic Mondrian noise images with the same size as the body stimuli were presented at 10 Hz to achieve the suppression effect. The noise images contained colorful small rectangles (with height and width within 2°) that overlapped with each other; 600 unique noise images were created, and the images presented in each trial were randomly selected from this pool.
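As a rough illustration of the noise timing, the following Python sketch draws the noise images for one trial from the pool of 600 and holds each image for the number of display frames implied by a 10-Hz update on a 60-Hz display. The 5-s trial duration is taken from the procedure of the main experiment; drawing without replacement within a trial, and all names in the code, are assumptions made for illustration and do not describe the original Psychtoolbox implementation.

```python
import random

REFRESH_HZ = 60   # display refresh rate reported for both setups
NOISE_HZ = 10     # update rate of the Mondrian noise
POOL_SIZE = 600   # number of unique noise images
TRIAL_S = 5.0     # 3 s of stimulus ramp plus 2 s of trailing noise


def noise_schedule(rng, trial_s=TRIAL_S):
    """Return (image_index, n_display_frames) pairs for one trial."""
    n_images = int(round(trial_s * NOISE_HZ))
    frames_per_image = REFRESH_HZ // NOISE_HZ
    # Assumption: images are drawn without replacement within a trial.
    chosen = rng.sample(range(POOL_SIZE), n_images)
    return [(index, frames_per_image) for index in chosen]


schedule = noise_schedule(random.Random(0))
print(len(schedule), schedule[0][1])
```

Under these assumptions a trial uses 50 noise images, each shown for six display frames.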
Setup for dichoptic presentation
The dichoptic presentation both inside and outside the scanner was achieved by viewing the stimuli through a pair of prism glasses. Stimulus presentation was implemented in MATLAB (the MathWorks) with Psychtoolbox (Brainard, 1997; Pelli, 1997). The dichoptic stimuli were presented in two rectangles (240 × 160 pixels) side by side, their centers displaced at equal distance from the center of the screen (792 pixels between centers of two rectangles, 12.41°). A frame of 10 pixels delineated the border of the rectangles, and a black fixation cross was placed in the center of each rectangle. A cardboard divider was positioned between the participant and the screen, dividing the distance between the two rectangles equally, to make sure that each eye of the participant only saw the rectangle ipsilateral to that eye. The diopters of the prism glasses were chosen according to the visual angles between the rectangles (Schurger, 2009). When viewed under this setup, the displacement of each rectangle was removed by the prism glasses, thus shifting both rectangles back to the center of the screen. Participants were asked to free-fuse the two rectangles into one, using the frame and fixation cross of each rectangle. On successful fusion, participants would perceive a tunnel-like view, with the divider showing up on either side as the wall of the tunnel, and with one rectangle at the end of the tunnel in the center of the screen. Because the width of the perceived tunnel depended on the distance between the rectangles, to ensure that the horizontal field of view of the gray background in the scanner was not too narrow for the participants, and for practical reasons (we had only one pair of prism glasses for each diopter), we used prism glasses of a bigger diopter (diopter = 12 for each eye) in the scanner, and a smaller diopter (diopter = 8 for each eye) outside the scanner.
Apart from the distance between the two rectangles, other parameters of the experiment were kept the same both inside and outside the scanner. Participants reported no difficulty in merging the two rectangles into one, either inside or outside the scanner.
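The relation between prism power and rectangle separation can be sketched with the standard definition of the prism diopter (a deflection of 1 cm at a distance of 1 m). The Python snippet below is an illustration only: it assumes that the two prisms deflect in opposite directions so that their angular offsets add, and the function names are hypothetical. Under that assumption, the 8-diopter glasses correspond to a separation of about 9.15°, the value reported for the setup outside the scanner; the in-scanner separation may rest on additional geometric considerations that this sketch does not capture.

```python
import math


def prism_deviation_deg(prism_diopters):
    """Angular deviation of one prism, using 1 diopter = 1 cm shift at 1 m."""
    return math.degrees(math.atan(prism_diopters / 100.0))


def fused_separation_deg(diopters_per_eye):
    """Separation between rectangle centers that the two prisms cancel.

    Assumption: the prisms for the two eyes deflect in opposite
    directions, so their deviations add.
    """
    return 2.0 * prism_deviation_deg(diopters_per_eye)


print(round(fused_separation_deg(8), 2))
```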
Procedure of the main experiment
The main experiment used a slow event-related design. In each trial, a stimulus image was projected into one of the rectangular frames, and the dynamic noise was simultaneously projected into the other frame. The stimulus image was faded in from 0% to 50% contrast in 2.5 s, and subsequently faded out to 0% contrast in 0.5 s. The dynamic noise was presented at full contrast. To eliminate any possible afterimages of the stimulus, the dynamic noise was kept on the screen for another 2 s after the stimulus faded out. Each trial was followed by an interval of 13, 15, or 17 s. The fixation cross changed to white at 1 s before the start of each trial, remained white through the trial, and changed to black at the intertrial interval (ITI).
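The contrast time course of a single trial can be written as a small piecewise function. The Python sketch below is illustrative only: it assumes linear ramps (the shape of the ramp is not stated here), and the function names are hypothetical.

```python
def stimulus_contrast(t, peak=0.5, fade_in=2.5, fade_out=0.5):
    """Stimulus contrast (0 to 1) at time t in seconds from trial onset.

    Assumption: both ramps are linear.
    """
    if t < 0.0 or t >= fade_in + fade_out:
        return 0.0
    if t <= fade_in:
        return peak * t / fade_in
    return peak * (1.0 - (t - fade_in) / fade_out)


def noise_is_on(t, fade_in=2.5, fade_out=0.5, trailing=2.0):
    """The full-contrast noise stays on for 2 s after the stimulus is gone."""
    return 0.0 <= t < fade_in + fade_out + trailing


for t in (0.0, 1.25, 2.5, 2.75, 3.0, 4.9):
    print(t, stimulus_contrast(t), noise_is_on(t))
```

With these values a trial lasts 5 s in total: 3 s of stimulus ramp followed by 2 s of noise alone.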
The experiment used a 2 × 2 factorial design, where stimuli were presented upright or inverted (orientation), invisible or visible (visibility). To keep the visible and invisible conditions as close as possible, the visible trials were created by overlaying the stimuli onto the noise pattern, resulting for all participants in a subjective percept of the body stimuli fading in and out with the presence of the noise pattern, which was a distinct percept from the invisible trials (percept of noise pattern only). To exclude confounds due to introspection and to motor response, we refrained from including a trial-by-trial report of the subjective percept.
For catch trials, the dot images were presented in both rectangles, similar to the body stimuli in the visible condition, so that the participants could see the dots fading in and out in the noise pattern. The trials were followed by a response screen for 2 s, indicated by a white circle replacing the fixation cross.
Participants were instructed to respond only to the dot trials, where they should indicate during the response screen presentation whether the number of the dots was odd or even, by pressing one of two corresponding buttons on an MR-compatible button box. For all the other trials, they were asked to fixate on the cross and passively view the presentation. The passive viewing task was used to avoid any confound of response-related activation, as could be observed in the parietal and frontal areas. Participants were also advised not to blink during the trials if possible, and to blink between the trials if needed.
The main scanning session consisted of four functional runs of 19 min and 10 s each. Within each run were 48 target trials (12 per condition) and 8 catch trials, presented in pseudorandom order. The side of the eyes that the dynamic noise projected into was also randomized and counterbalanced within the session. In total each individual stimulus was projected onto each eye twice: once visible and once invisible. For one participant, three functional runs were acquired; for the other participants, four functional runs were acquired. The anatomical scan was performed after two functional runs.
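The composition of one run can be sketched as follows. This Python example is a minimal illustration: the actual pseudorandomization constraints and the eye assignment are not specified here, so a plain shuffle is used as a stand-in, and all names are hypothetical.

```python
import random
from collections import Counter


def build_run(rng):
    """One run: 12 trials for each of the 4 conditions plus 8 catch trials."""
    conditions = [(orientation, visibility)
                  for orientation in ("upright", "inverted")
                  for visibility in ("visible", "invisible")]
    trials = [{"kind": "target", "orientation": o, "visibility": v}
              for (o, v) in conditions for _ in range(12)]
    trials += [{"kind": "catch", "n_dots": rng.randint(1, 4)} for _ in range(8)]
    # Assumption: an unconstrained shuffle stands in for the pseudorandom order.
    rng.shuffle(trials)
    return trials


trials = build_run(random.Random(1))
counts = Counter((t["orientation"], t["visibility"])
                 for t in trials if t["kind"] == "target")
print(len(trials), dict(counts))
```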
Scanning parameters
The scanning was conducted in a Siemens 3T Prisma whole-body scanner (Siemens), with a 64-element head-neck coil. In the scanner, stimuli were back-projected with an LCD projector (Panasonic PT-EZ570, screen resolution = 1920 × 1200, refresh rate = 60 Hz) on a screen 75 cm away from the head of the participant. The cardboard divider was placed in the bore between the head coil and the screen. A T2*-weighted gradient echo EPI sequence was used to acquire functional data covering the whole brain, with 2 × 2 × 2 mm³ resolution (64 slices without gaps, TR (repetition time) = 2000 ms, TE (echo time) = 30 ms, flip angle = 77°, simultaneous multi-slice acquisition acceleration factor = 2, FOV = 200 × 200 mm, matrix size = 100 × 100). A T1-weighted MPRAGE sequence was used to acquire the anatomical structure images (1 × 1 × 1 mm³, TR = 2300 ms, TE = 2.98 ms).
Validation of the suppression effect
For each participant, the effectiveness of suppression was validated by verbal reports during the scan after each run, and by behavioral validation runs before and after the scan. We based our decision of data selection mainly on the results of the behavioral validation runs.
To obtain online estimates of the CFS suppression efficiency during scanning, participants responded to the following three questions after each run. (1) In what percentage of trials did you see something in the noise? (2) Were there any merging problems during the scan? (3) Did you see a sudden appearance of the stimulus in the noise, rather than a gradual fading-in? A run with a response of >60% seen trials (the actual percentage would be 57% when taking the seen catch trials into account), a sudden perception of the stimulus in the noise, or any merging problem would indicate that the suppression was not working perfectly during the scan. None of the runs included for data analysis had these problems.
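One reading of the 57% figure is that it is the proportion of trials in a run that should be seen if suppression is perfect: the visible half of the 48 target trials plus the 8 catch trials, out of 56 trials. The short Python check below follows that reading; the variable names are illustrative.

```python
TARGET_TRIALS = 48  # target trials per run
CATCH_TRIALS = 8    # catch trials per run, always visible

visible_targets = TARGET_TRIALS // 2  # half of the target trials are visible
expected_seen = visible_targets + CATCH_TRIALS
expected_seen_fraction = expected_seen / (TARGET_TRIALS + CATCH_TRIALS)

print(f"{expected_seen_fraction:.0%}")
```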
The behavioral validation runs were conducted immediately before and after the scan, outside the scanner. The stimuli were presented on an LCD screen (Acer VG248, 3D capable, resolution = 1920 × 1080, refresh rate = 60 Hz), in a room with dim light. The distance between the two rectangles was adjusted according to the diopter of the prism glasses (276 pixels between centers of two rectangles, 9.15°, diopter = 8) to render stable fusion. Trials and their order in the runs before and after the scan corresponded to runs 1 and 2 in the scanner. There were no catch trials in the validation runs; instead, a response screen with a circle (same as the one in the main experiment) was presented after stimulus presentation for each trial. Participants were required to respond whether they saw anything in the noise, by pressing either 1 (seen) or 2 (unseen) on the keyboard during the response screen on a trial-by-trial basis. If a participant responded “seen” more than twice for the unseen (suppressed) trials in either one of the validation runs, including trials without response, the dataset of the participant would be excluded from analyses. The data of seven participants in this study satisfied the inclusion criterion (average accuracy for visible trials: 99.4%, average accuracy for invisible trials: 96.7%), showing that their subjective percept tightly followed our planned visibility manipulation. To further ensure that the stimuli were suppressed in the invisible trials during the fMRI scan, participants were asked again after the scan whether their visual experience of the stimuli was similar to that in the behavioral tasks before the scan. The percept of a stimulus escaping suppression (a stimulus suddenly appearing in the noise, instead of fading in slowly) was also clearly explained to the participants. All seven participants reported not having such a percept.
The 8th participant reported >70% seen trials after three fMRI runs in the scanner (with catch trials, reported 65–70%, 70–75%, and 60–70%, respectively), and responded “seen” three times for unseen trials in the behavioral test after the scan (with no catch trials, reported percentage of seen trials: 50–60%, actual percentage 56%). Consistent with the behavioral tests, this participant reported after the scan that she had seen 50% of the trials in the behavioral test before the scan (actual percentage = 52%, one trial breaking suppression), and that she had been “seeing more” both in the 2nd to 4th runs in the scanner and in the behavioral test after the scan. This participant was excluded from the analysis. The decrease of stimulus suppression efficiency for this participant might be the same effect reported by a few previous CFS studies, where participants saw more stimuli as the experiment progressed (Ludwig et al., 2013; Lupyan and Ward, 2013; Mastropasqua et al., 2015; Stein and Peelen, 2015).
In total, 26 runs (575 volumes each) from seven participants were included in the analysis. One run from one participant was excluded because of merging problems caused by a contact lens issue that occurred during that run. Another participant completed three runs instead of four.
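The behavioral exclusion rule can be expressed as a one-line check. The Python sketch below is illustrative: the function name is hypothetical, and it assumes that trials without a response have already been added to the per-run count, as the criterion states.

```python
def passes_validation(seen_on_invisible_per_run, max_allowed=2):
    """True if no validation run has more than `max_allowed` invisible
    trials answered "seen" (missing responses counted as "seen")."""
    return all(count <= max_allowed for count in seen_on_invisible_per_run)


# Counts for the validation runs before and after the scan.
print(passes_validation([0, 1]))  # stays within the criterion
print(passes_validation([1, 3]))  # pattern reported for participant 8
```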
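The run counts and durations can be cross-checked against the acquisition parameters. This short Python calculation uses the TR of 2000 ms given under Scanning parameters; the variable names are illustrative.

```python
TR_S = 2.0             # repetition time in seconds
VOLUMES_PER_RUN = 575

run_seconds = VOLUMES_PER_RUN * TR_S
minutes, seconds = divmod(int(run_seconds), 60)

planned_runs = 7 * 4                # seven participants, four runs each
included_runs = planned_runs - 1 - 1  # one run excluded, one run not acquired

print(minutes, seconds, included_runs)
```

The result, 19 min 10 s per run and 26 included runs, agrees with the values reported in the text.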
Functional localizer
Participants were also scanned with a functional localizer run (432 volumes) in a separate session, where they passively viewed stimuli of faces, bodies, houses, tools and words in blocks. Facial stimuli were front-view neutral faces from the Karolinska Directed Emotional Faces (Lundqvist et al., 1998; 24 identities, 12 males). The part below the neck (clothes, hair, etc.) was removed from the face images. Body stimuli (de Gelder and Van den Stock, 2011) were neutral still front-view bodies different from the ones used in the main experiment (20 identities, 10 males), with the facial information removed. House and tool images were obtained from the Internet. The house images consisted of 19 facades of houses with two-to-three-storey height, and the tool images consisted of 18 hand-held tools. Word images consisted of high-frequency English words of four to six letters in Arial font. All the images were embedded within a gray background (RGB value = 157,157,157), spanning a visual angle of 3.65° (230 pixels). Each block consisted of 12 stimuli from the same category; each stimulus was presented for 800 ms, followed by an interval of 200 ms. An interblock interval of 12 s followed each block presentation. Blocks of each category were presented seven times, and the presentation order of the stimuli and the blocks was pseudorandomized.
Data processing
The acquired data were processed in BrainVoyager (Brain Innovation). Functional data underwent default slice scan time correction, 3D motion correction, and temporal high-pass filtering (GLM with Fourier basis set, two cycles). The functional datasets were then aligned to the anatomical images, brought into Talairach space, and spatially smoothed with a Gaussian filter of 4-mm FWHM.
GLM analyses
Random effects group analyses with GLM were applied to the functional data of the main experiment. Predictors for each condition were convolved with the default two-γ hemodynamic response function. The parameters from 3D head motion correction were z-transformed and added as confound predictors into the GLM analyses. The percentage signal change values for each participant were extracted for subsequent ROI analyses. A 2 × 2 ANOVA with orientation and visibility was performed on a whole-brain basis. To observe the configural processing of fearful bodies, the contrast upright invisible > inverted invisible was also performed. The clusters of the ANOVA and the contrast analyses were corrected for multiple comparisons by cluster threshold estimation (initial threshold p = 0.005 for the ANOVA results, initial threshold p = 0.01 for the contrast results, Monte Carlo simulation n = 5000).
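The construction of a condition predictor and of the motion confounds can be sketched in a few lines of Python. This is a minimal illustration and not the BrainVoyager implementation: the two-gamma parameters below (gamma shapes of 6 and 16, undershoot ratio of 1/6) are a commonly used generic choice and are assumed here, not taken from the software defaults; the event duration, the onsets, the motion values, and all function names are likewise hypothetical.

```python
import math
import statistics


def gamma_pdf(t, shape, scale=1.0):
    """Gamma density, defined as zero for non-positive t."""
    if t <= 0.0:
        return 0.0
    return (t ** (shape - 1.0) * math.exp(-t / scale)
            / (math.gamma(shape) * scale ** shape))


def two_gamma_hrf(t, peak_shape=6.0, undershoot_shape=16.0, ratio=6.0):
    """Response gamma minus a scaled undershoot gamma (assumed parameters)."""
    return gamma_pdf(t, peak_shape) - gamma_pdf(t, undershoot_shape) / ratio


def make_predictor(onsets_s, duration_s, n_volumes, tr_s=2.0, dt=0.5):
    """Convolve a boxcar with the HRF on a fine grid, then sample every TR."""
    n_fine = int(round(n_volumes * tr_s / dt))
    boxcar = [0.0] * n_fine
    for onset in onsets_s:
        first = int(round(onset / dt))
        last = min(n_fine, int(round((onset + duration_s) / dt)))
        for i in range(first, last):
            boxcar[i] = 1.0
    kernel = [two_gamma_hrf(k * dt) * dt for k in range(int(round(32.0 / dt)))]
    fine = [sum(boxcar[i - k] * kernel[k] for k in range(min(i + 1, len(kernel))))
            for i in range(n_fine)]
    return fine[::int(round(tr_s / dt))]


def zscore(values):
    """z-transform one motion parameter time course (sample SD assumed)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


predictor = make_predictor(onsets_s=[10.0, 30.0], duration_s=3.0, n_volumes=30)
motion_z = zscore([0.00, 0.02, 0.05, 0.03, 0.08, 0.10])
print(len(predictor), round(max(predictor), 3))
```

Each condition would receive one such predictor, with the z-transformed motion time courses appended as additional columns of the design matrix.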
ROI analyses
Functional ROIs were defined by GLM contrasts on the functional localizer data, individually for each participant (Fig. 2). Ventral ROIs were defined by the contrast bodies > houses (p = 0.001 uncorrected). Clusters that were located in the lateral occipital sulcus were marked as EBA; clusters located in the fusiform region were marked as FBA. Dorsal ROIs were defined for the anterior, middle and posterior intraparietal sulcus (IPS) bilaterally, by contrasting tools > baseline (p = 0.001 uncorrected). Spheres (radius = 4 mm) were defined at the peak activation sites located in the anterior (connecting postcentral sulcus), middle, and posterior segments of IPS, respectively. As a comparison to the ventral and dorsal areas, sphere ROIs of the primary visual cortex (V1) were defined at the occipital pole, at the spots in the bilateral occipitopolar sulci to which the calcarine sulci pointed (radius = 4 mm). The anatomically defined V1 ROIs were located within the extensive cluster activated by visual presentation of the five conditions in the functional localizer versus baseline (p = 0.00001, uncorrected). For FBA, EBA, pIPS, mIPS, aIPS, and V1, we performed a group-level ANOVA of ROI (six areas) × laterality (left, right) × orientation × visibility, where for each unilateral ROI one averaged percentage signal change value per participant was entered as input. This group-level ANOVA did not show either a significant main effect of laterality (F(1,1) = 2.453, p = 0.362), or interactions with laterality (ROI × laterality: F(5,5) = 0.550, p = 0.736; laterality × orientation: F(1,1) = 3.903, p = 0.298; ROI × laterality × orientation: F(5,5) = 1.048, p = 0.480; laterality × visibility: F(1,1) = 0.606, p = 0.579; laterality × orientation × visibility: F(1,1) = 0.537, p = 0.597; ROI × laterality × orientation × visibility: F(5,5) = 1.430, p = 0.352). Thus, we merged the bilateral ROI pairs into single ROIs.
For some of the dorsal areas, only unilateral ROIs could be defined in some participants (e.g., the right aIPS could only be defined in three participants); in those cases, the data of the unilateral ROI were entered into further analysis. To compare the ventral and dorsal ROIs directly, the bilateral FBA and EBA ROIs were merged into one combined ventral ROI, and the bilateral ROIs along the IPS were merged into one combined dorsal ROI. The mean percentage signal change values from the GLM analysis were extracted for each resulting ROI of each participant. Group-level repeated-measures ANOVAs were performed in SPSS. We first conducted an ANOVA of stream (ventral, dorsal) × orientation (upright, inverted) × visibility (visible, invisible) with the data of the combined ventral and dorsal ROIs. If an interaction was present, we examined the orientation × visibility ANOVA in the specific stream, then conducted subsequent ANOVAs with the data of individual ROIs.
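The sphere ROIs with a 4-mm radius described above can be illustrated by listing the voxels whose centers fall within the radius of a peak voxel. The Python sketch below assumes an isotropic voxel grid; the voxel size at which the spheres were actually drawn is not stated here, so 1 mm is used as a default, and the function names and the example peak coordinate are hypothetical.

```python
def sphere_offsets(radius_mm=4.0, voxel_mm=1.0):
    """Voxel offsets whose centers lie within radius_mm of the center voxel."""
    reach = int(radius_mm // voxel_mm)
    offsets = []
    for dx in range(-reach, reach + 1):
        for dy in range(-reach, reach + 1):
            for dz in range(-reach, reach + 1):
                dist_sq = (dx * dx + dy * dy + dz * dz) * voxel_mm ** 2
                if dist_sq <= radius_mm ** 2:
                    offsets.append((dx, dy, dz))
    return offsets


def sphere_roi(peak_voxel, radius_mm=4.0, voxel_mm=1.0):
    """Voxel coordinates of a sphere ROI centered on a peak voxel."""
    px, py, pz = peak_voxel
    return [(px + dx, py + dy, pz + dz)
            for dx, dy, dz in sphere_offsets(radius_mm, voxel_mm)]


roi = sphere_roi((30, 40, 50))  # hypothetical peak coordinate
print(len(roi), len(sphere_offsets(4.0, 2.0)))
```

On a 1-mm grid such a sphere contains 257 voxels; on a 2-mm grid it contains 33.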
ROI analysis in individual participants
To rule out that the observed results of group-level ANOVAs in our ROI analysis were driven by a minority of participants, we performed within-participant ROI analysis in the seven individual participants, examining the prevalence of effects (or no effects in the dorsal stream). To be most comparable to the group-level ROI analysis, we fitted the same GLM to each run in individual participants. The percentage signal changes (parameter estimates) of each condition were extracted from the same bilateral ROIs of the ROI analysis (including the combined ventral and dorsal ROIs, and the six individual ROIs), in individual participants, and entered into within-participant repeated-measures ANOVAs. Because the number of runs was different across participants, the number of parameter estimates included in the ANOVAs was different (three estimates per condition in participants S1 and S3, four estimates per condition in all five other participants). The ANOVAs included the stream × orientation × visibility ANOVA in the ventral/dorsal combined ROIs, and the orientation × visibility ANOVAs in the six individual ROIs. Lastly, to compare with the results obtained in Fang and He (2005), we performed the pairwise comparisons of upright visible versus upright invisible conditions in these eight ROIs.
Results
Whole-brain analysis
We conducted a whole-brain ANOVA at the group level, with orientation (upright, inverted) and visibility (visible, invisible) as factors (Fig. 3; Table 1). The main effect of orientation (upright, inverted) was observed in clusters mainly in the frontal lobe, and a cluster close to the EBA region defined with the functional localizer. A main effect of visibility (visible, invisible) was observed mainly in clusters in the ventral pathway, including bilateral EBA, FBA, lateral occipitotemporal cortex, and right anterior inferior temporal cortex. Clusters in the dorsal pathway were located in bilateral anterior IPS, and right middle frontal gyrus (corresponding to the frontal eye field, FEF). Other clusters were located at the right inferior frontal lobe, and right posterior cingulate sulcus.
Importantly, the interaction of visibility and orientation was observed mainly in clusters of the parietal and frontal cortex that overlap with regions of the dorsal attention network (Corbetta and Shulman, 2011). The parietal clusters included left medial IPS, left precuneus, and right posterior IPS. The frontal clusters were located along bilateral superior frontal sulci, mostly at the location of FEF, but also more anteriorly for two clusters. Another cluster was located in the right anterior cingulate sulcus, close to the presupplementary motor area. The interaction effect also revealed clusters in subcortical areas, including the left pulvinar and the right caudate nucleus. When mirrored to the right hemisphere, the coordinates of these two clusters corresponded to the focal lesion sites found in spatial neglect patients with restricted subcortical lesions (Karnath et al., 2002).
We also conducted a whole-brain contrast of upright invisible > inverted invisible, which showed clusters mainly in the frontal lobe. Importantly, a cluster was present in the right inferior occipital sulcus, showing higher activity for upright bodies. This indicates that despite being invisible, the upright bodies were nonetheless processed more extensively than the inverted ones in the ventral pathway. A cluster was also present in the right caudate nucleus.
ROI analysis
The functional localizer included still images of faces, bodies, houses, tools and words. We defined ventral ROIs by the bodies > houses contrast (p = 0.001 uncorrected), leading to ROIs of bilateral EBA and FBA. Because the tools activate dorsal action observation and execution related structures, we defined the dorsal ROIs by the tools > baseline contrast (p = 0.001 uncorrected). The areas activated by tools largely overlapped with those activated by bodies (bodies > baseline, p = 0.001 uncorrected), especially at the posterior IPS. For the overlaps in individual participants, see Figure 2. Sphere ROIs of 4mm radius were defined at the peak activation sites in the anterior (connecting postcentral sulcus), middle, and posterior segments of IPS, respectively (labeled aIPS, mIPS, and pIPS). For comparison with the ROIs in the ventral and dorsal streams, we also defined sphere ROIs in the bilateral primary visual cortex (V1) that was activated by visual presentation of these 5 stimuli categories. For ROIs of individual participants, see Figure 2. Because we did not find main effects or interactions related to the laterality factor in the group-level ANOVA of areas (six ROI pairs) × laterality (left, right) × orientation × visibility, we merged the bilateral ROIs in each area into one ROI, and then combined the ventral and dorsal ROIs, respectively, to directly examine whether the dorsal stream areas indeed show a different response pattern than the ventral stream.
First, we performed a repeated-measures ANOVA of stream (ventral/dorsal) × orientation × visibility on the averaged percentage signal changes of the combined ventral ROI and the combined dorsal ROI. If the response patterns differed between the two streams across the conditions, this would produce an interaction of stream × visibility. Indeed, we found a significant interaction of stream × visibility (F(1,6) = 30.821, p = 0.001, ηp² = 0.837), and a significant interaction of stream × orientation × visibility (F(1,6) = 7.307, p = 0.035, ηp² = 0.549), in line with our prediction. The main effect of visibility was also significant (F(1,6) = 33.370, p = 0.001, ηp² = 0.848). We subsequently performed the orientation × visibility ANOVA with the averaged activity separately for each stream.
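For readers who wish to relate the reported effect sizes to the F statistics, partial eta squared can be recovered from F and its degrees of freedom as ηp² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies this standard identity to the three values reported above; the function name is ours and is used only for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, reported partial eta squared)
reported = [
    ("stream x visibility", 30.821, 1, 6, 0.837),
    ("stream x orientation x visibility", 7.307, 1, 6, 0.549),
    ("visibility", 33.370, 1, 6, 0.848),
]

for label, f_value, df1, df2, eta in reported:
    recomputed = partial_eta_squared(f_value, df1, df2)
    print(f"{label}: recomputed {recomputed:.3f}, reported {eta:.3f}")
```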
The combined ventral ROI showed strong main effects of orientation and visibility, with no interaction effect. Similar to the ventral clusters shown by the main effect of visibility in the whole-brain ANOVA, visible bodies consistently elicited higher activity than suppressed invisible bodies (F(1,6) = 38.063, p = 0.001, ηp² = 0.864), which is in accordance with the findings in CFS studies using other stimulus categories (Fang and He, 2005; Jiang and He, 2006; Hesselmann and Malach, 2011; Yang et al., 2014). Upright bodies also elicited higher activity than inverted ones (F(1,6) = 16.297, p = 0.007, ηp² = 0.731), also consistent with studies using other categories of inverted stimuli, such as faces (Pinsk et al., 2009; Gilaie-Dotan et al., 2010). See Table 2 for the statistical results of the ANOVA. We also examined the averaged percentage signal changes for each condition in the FBA and EBA ROIs separately (Fig. 4; Table 2). Notably, the reduced activation for inverted bodies was consistent across visibility conditions, as a main effect of orientation was found in both the FBA and EBA ROIs (FBA: F(1,6) = 9.950, p = 0.020, ηp² = 0.624; EBA: F(1,6) = 13.230, p = 0.011, ηp² = 0.688), without interactions with visibility. For the invisible conditions, post hoc paired t tests showed significantly higher activity for the upright bodies in the FBA ROI (t(6) = 3.111, p = 0.021), and a trend toward significance in the EBA ROI (t(6) = 2.154, p = 0.075). Together with the activation in right inferior occipital gyrus observed under the contrast upright invisible > inverted invisible in the whole-brain analysis, this ROI result shows that ventral body-specific areas are sensitive to the orientation of body stimuli even when the bodies are presented without visual awareness.
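The two-sided p values of the post hoc paired t tests follow from the t statistic and its degrees of freedom alone. The sketch below is one possible way to recompute them, using only the Python standard library and Simpson's rule to integrate the Student t density; it is not the software used for the reported analysis, and the function names and the number of integration steps are our own choices.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_value, df, steps=2000):
    """Two-sided p value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t_value)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * total * h / 3  # probability that |T| <= |t|
    return 1 - central_mass


# t statistics reported above for the invisible upright vs. inverted comparison.
for roi, t_value in [("FBA", 3.111), ("EBA", 2.154)]:
    print(f"{roi}: t(6) = {t_value}, p = {two_sided_p(t_value, 6):.3f}")
```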
Extended Data Figure 4-1
The average percent signal change data for the four main conditions and the catch trials (judging number of visible dots, followed by a button press) for individual participants (XLSX file).
In the combined dorsal ROI, the ANOVA of orientation × visibility again showed a main effect of visibility (F(1,6) = 9.172, p = 0.023, ηp² = 0.605). Importantly, however, it also showed an interaction of orientation × visibility (F(1,6) = 13.624, p = 0.010, ηp² = 0.694). To directly compare our results to other CFS studies without manipulation of stimulus orientation, we also performed the ANOVA stream × visibility with only the upright conditions. Again a strong interaction of stream × visibility was observed (F(1,6) = 34.612, p = 0.001, ηp² = 0.852), together with the main effect of visibility (F(1,6) = 24.987, p = 0.002, ηp² = 0.806). To better understand the interaction effects found in the dorsal stream, we performed an ANOVA of area (pIPS, mIPS, and aIPS) × orientation × visibility. Again we found the interaction orientation × visibility (F(1,5) = 10.853, p = 0.022, ηp² = 0.685), but we also found a significant main effect of area (F(2,10) = 9.962, p = 0.004, ηp² = 0.666), and a strong interaction of area × orientation × visibility (F(2,10) = 9.449, p = 0.005, ηp² = 0.654), indicating that the response patterns changed across the areas within the dorsal stream. The main effect of visibility showed a trend toward significance (F(1,5) = 6.149, p = 0.056, ηp² = 0.552). Indeed, separate inspections of the activity in pIPS, mIPS, and aIPS ROIs showed that the interaction effect of orientation × visibility was present in both pIPS and mIPS ROIs, but was not present in the aIPS ROI, which showed a main effect of visibility instead, with higher activity for visible than invisible bodies, similar to the pattern of the ventral areas. For the pIPS ROI, the main effect of visibility was also present.
In both the pIPS and mIPS ROIs, post hoc paired t tests showed that the activity between visible and invisible upright bodies did not differ (pIPS: t(6) = 1.166, p = 0.288; mIPS: t(6) = −0.040, p = 0.970), but the activity between the two inverted conditions differed (pIPS: t(6) = 4.886, p = 0.003, mIPS: t(6) = 4.630, p = 0.004; Fig. 4; Table 2).
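In a 2 × 2 within-participant design, the orientation × visibility interaction amounts to a difference of differences per participant: (upright visible − upright invisible) − (inverted visible − inverted invisible). A one-sample t test of this contrast against zero is equivalent to the interaction test of the repeated-measures ANOVA, with F(1, n − 1) = t². The sketch below illustrates this with made-up percent signal change values for seven participants; the numbers are hypothetical and are not the data of this study.

```python
from math import sqrt
from statistics import mean, stdev


def interaction_contrasts(up_vis, up_inv, in_vis, in_inv):
    """Per-participant difference of differences for a 2 x 2 within design."""
    return [(uv - ui) - (iv - ii)
            for uv, ui, iv, ii in zip(up_vis, up_inv, in_vis, in_inv)]


def one_sample_t(values):
    """One-sample t statistic against zero and its degrees of freedom."""
    n = len(values)
    return mean(values) / (stdev(values) / sqrt(n)), n - 1


# Hypothetical percent signal changes (one value per participant, n = 7).
upright_visible = [0.52, 0.48, 0.61, 0.40, 0.55, 0.47, 0.58]
upright_invisible = [0.50, 0.45, 0.63, 0.38, 0.52, 0.49, 0.55]
inverted_visible = [0.60, 0.57, 0.66, 0.49, 0.63, 0.54, 0.65]
inverted_invisible = [0.35, 0.30, 0.42, 0.28, 0.36, 0.33, 0.40]

contrasts = interaction_contrasts(
    upright_visible, upright_invisible, inverted_visible, inverted_invisible
)
t_value, df = one_sample_t(contrasts)
print(f"interaction: t({df}) = {t_value:.3f}, equivalent F(1,{df}) = {t_value ** 2:.3f}")
```

In these illustrative numbers the visibility difference is small for upright bodies and large for inverted bodies, which is the kind of pattern that yields a negative interaction contrast.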
In comparison to the ventral and dorsal ROIs, no significant main effect or interaction was observed for V1 ROIs (all p > 0.05).
ROI analysis in individual participants
To rule out the possibility that the abovementioned ROI results were driven by a minority of participants, we performed within-participant repeated-measures ANOVAs in the bilateral ROIs of the seven individual participants. See Figure 5 and Table 3 for averaged responses per condition, the p values for the statistical tests, and the directions of significant main effects.
The within-participant results were consistent with the group results. ANOVA of stream (ventral/dorsal ROIs) × orientation × visibility showed significant interactions of stream × visibility in all seven participants, while the main effect of visibility was present in five participants. The upright visible condition had higher activity than the upright invisible condition in six participants in the combined ventral ROI. In the combined dorsal ROI, however, this comparison was not significant in any of the participants (all p > 0.131).
In individual ventral ROIs, a consistent orientation effect was found in both the FBA and the EBA ROIs (six out of seven participants), as was the case for pairwise comparisons of the upright visible versus invisible bodies.
In individual dorsal ROIs, two participants (S1 and S6) showed higher activity for visible trials, in pIPS and aIPS ROIs. In the dorsal ROIs of the other participants, the main effect of visibility was nonsignificant, or ran in the opposite direction to the ventral ROIs (higher activity for invisible trials than visible ones, in mIPS for one participant), or interactions of orientation and visibility were observed (in pIPS for one participant, in mIPS for two participants). One participant further showed a main effect of higher activity for inverted bodies in both pIPS and mIPS; another showed the opposite effect in aIPS. Pairwise comparisons of upright visible and upright invisible conditions showed higher activity for the upright visible condition in pIPS for one participant, and higher activity for the upright invisible condition in mIPS for two participants.
In the V1 ROI, one participant showed a main effect of visibility, with higher activity for invisible bodies. For pairwise comparisons, another participant showed higher activity for upright visible bodies.
From these results, it appeared that the group-level effects were driven by the majority of the participants, and our results were consistent with the ones found by Fang and He (2005).
Discussion
Our results show that activity in the dorsal processing stream is relatively independent of visual awareness, in marked contrast to activity in ventral areas, which is strongly linked to visual awareness. Whole-brain ANOVA showed an interaction effect of stimulus orientation and visibility in regions including the IPS in the dorsal stream, and in subcortical structures. Also, the ROI analysis showed a strong two-way interaction between stream and visibility (validated in within-participant analysis in all seven participants), and a three-way interaction of stream × orientation × visibility, while a main effect of visibility was also present. This overall difference between the two processing streams was caused by response patterns in the posterior and middle IPS ROIs that differed from those in the ventral and aIPS ROIs, with the former two areas showing an interaction between stimulus visibility and orientation. Specifically, activity in these two ROIs did not differ between the visible and invisible upright body stimuli. The FBA ROI also showed higher activity for upright bodies than inverted bodies, even when neither was consciously perceived.
The locations of our pIPS and mIPS ROIs correspond to the ROIs of V3A/V7 and IPS in the two previous fMRI CFS studies using tool stimuli (Fang and He, 2005; Hesselmann and Malach, 2011). Our finding that dorsal stream activity for upright body stimuli dissociates from visual awareness is consistent with the findings of Fang and He (2005) using tools. The similarity between our results and theirs underscores that not only tools but also bodies trigger action representation, in which the IPS plays an important role (Culham et al., 2006). The other CFS studies did not find an interaction between stream and visibility but found lower activity for invisible tools in both ventral and dorsal streams (Hesselmann et al., 2011; Hesselmann and Malach, 2011; Ludwig et al., 2015). We also found this main effect of visibility in the dorsal areas, especially for the inverted bodies. However, given the significant interaction between orientation and visibility, our evidence does not support an invariant processing across dorsal and ventral streams.
Previous reviews discussed explanations for the discrepancies between the available studies (Yang et al., 2014; Ludwig and Hesselmann, 2015). One is that the presentation of visible trials was different (presented without dynamic noise in separate runs in Fang and He, 2005; but presented with dynamic noise in the same run in Hesselmann et al., 2011, Hesselmann and Malach, 2011; Ludwig et al., 2015). Experiments of nonconscious tool perception have also been criticized, with the reasoning that the results might be shape-specific and caused by the elongated shape only, rather than other tool-specific properties (Yang et al., 2014), although the elongated invisible tools indeed showed an enhanced decodability (Ludwig and Hesselmann, 2015; Ludwig et al., 2015). However, our results show that these two reasons do not fully account for the previous discrepancies and underlying mechanisms. In our study visible and invisible trials were presented within the same run, always with the dynamic noise present. In addition, the body posture stimuli used in our study have elongated shapes in both upright and inverted forms, but body inversion led to a significant interaction with visibility in posterior and middle IPS, indicating that the underlying mechanisms for nonconscious tool and body perception are not likely to be shape-specific in purely lower-level visual-form aspects, but are more likely linked to higher-level processes associated with these two specific categories, especially their ability to trigger action-related processing.
The discrepancies between those studies may instead be caused by the averaging of activity in dorsal ROIs. Our ROI definition was more fine-grained, and gave the same weight to each dorsal ROI (same spherical ROI size across the three areas). We observed a change of response patterns at the group level along the IPS, where the interaction between orientation and visibility in posterior and middle IPS ROIs was not present in the anterior ROIs, indicating a change of involvement and function across these areas, consistent with the functional heterogeneity found along the IPS in previous research (Freud et al., 2016). This change of responses was also consistent with the CFS study that specifically examined the decodability of faces and tools across the two streams (Ludwig et al., 2016). In that study, the authors defined inferior and superior dorsal ROIs, roughly corresponding to a location posterior to our pIPS ROIs, and our mIPS ROI, respectively. They found that the decodability was modulated by the mask contrast in the superior dorsal ROIs, but not in the inferior dorsal ROIs. Taken together, the response patterns of the main conditions along the IPS are likely to be influenced by the size and location of the ROIs, and by the subsequent averaging of the BOLD responses. In view of the heterogeneity of response patterns along the IPS, further studies with higher functional resolution and fine-grained dorsal ROI definition will help to resolve the discrepancies.
Another possible reason for the discrepancies may be related to the active report of the percept with a button press in the three previous studies (Hesselmann et al., 2011; Hesselmann and Malach, 2011; Ludwig et al., 2015). Here, we did not use active reports of visibility for each trial, because active reports under rivalry states induce significantly higher brain activity linked to introspection and action, mainly in frontal areas, but also in superior and inferior parietal areas (Frässle et al., 2014). Since the medial and anterior IPS areas are known to be activated by hand actions such as touching, reaching and grasping (Culham et al., 2006), adding a button response per trial would introduce confounds in perceptual tasks that aim to compare ventral and dorsal activity. Indeed, we can observe the influence of a button press task in our data (Figs. 3, 4), as activity in the dorsal areas was much higher for the catch trials than for the main conditions. Further studies explicitly comparing brain activity with and without active reports under CFS would shed more light on the actual influence of reports on dorsal activity. However, the no-report paradigm also has its limitations. Without an explicit requirement for subjective reports such as button presses, the participants may still form an implicit “report”; conversely, in cases where a participant did consciously perceive a stimulus, the stimulus might be forgotten, or fall below the participant’s subjective report criteria, or not even be reportable (Tsuchiya et al., 2015). These between-participant variabilities could not all be assessed and accounted for by subjective reports, and may have contributed to the discrepancies in the literature.
Lastly, the previous CFS studies differed in the length of intertrial/block intervals, and in the data analysis methods. Fang and He (2005) presented 20-s blocks of faces and tools (interleaved with 20-s texture blocks as baseline), and only used the average signal of 8-20 s within each block. In addition, because the invisible and visible conditions were presented in different runs, there was no cross talk of signals between conditions, and the BOLD signal dropped to baseline in the texture blocks (shown in their Fig. 3 of objects/scrambled objects experiment; their face/tool experiment was of a similar design). On the other hand, the three other studies used another design, with considerably shorter ITIs (1-6.5, 1.5-4.5, and 1-5 s) following the trial-by-trial awareness ratings with button presses, and used the GLM to estimate the BOLD signal changes. Under that specific design, the BOLD signal of the button press in the previous trial was likely to overlap with the upcoming trial despite the jitter of ITI, affecting all conditions. Our slow event-related design included sufficiently long ITIs, which allowed the BOLD response to return to baseline (in our case, the average time for the BOLD response to return to baseline was 8-12 s after the end of dynamic noise presentation; Fig. 3). The next trial started 9-11 TRs after the previous trial onset, and started 1 additional TR later for catch trials with the presence of the response screen (2 s). Thus our design precluded any possible confounds related to effects carried over from preceding trials, and resulted in better estimation for responses of single conditions (Friston et al., 1999).
The time course of the main conditions in Figure 3 seemed to show double peaks, especially in the left pulvinar, with an early peak after trial onset, and a later peak after trial offset. Given the long ITI it is unlikely that this is due to contamination from previous trials. It might be related to the prolonged noise pattern displayed after the target stimulus offset (from 1.5 to 2.5 TR after stimulus onset to remove possible afterimages), during which the two eyes were still under a rivalrous situation, where the noise pattern was rivaling with the blank rectangle instead of the stimulus. This corresponded to the mask-only condition in two previous studies, where two related observations were made. Hesselmann et al. (2011) found that the activity of Mondrian mask-only trials was not significantly different from that of the invisible trials in the ROIs they investigated, although it showed a trend toward significance in IPS. Ludwig et al. (2015) reconstructed the activity of invisible conditions by subtracting the mask-only activity from the mask-plus-stimulus activity, and found that parametrically modulating the mask contrasts did not show a corresponding difference in the activity in the ventral and dorsal ROIs. Both observations suggest that the activity of invisible conditions under CFS was not modulated in an additive manner relating to the inputs of the two eyes. If the second peak in our data was also induced by the rivaling situation of blank rectangle and the noise pattern, it would question the validity of using the mask-only condition as a baseline. We could not disentangle the mask-only effect from the mask-plus-stimulus effect in our study, but future studies with higher temporal resolution may help to better understand the mechanism of CFS.
Since we did not have subjective reports of visibility on a trial-to-trial basis, we used a different way of establishing suppression. We screened for participants whose percept of the stimuli was well suppressed by CFS, and then verified outside the scanner that their percepts closely followed our experimental manipulation of visibility on a trial-to-trial basis. The strict screening resulted in the relatively low number of participants in the current study, which may not represent the whole population well. To avoid creating differences of processing between the two hemispheres, we balanced the presentation of the noise pattern across the two eyes in both the screening and fMRI experiments. A recent study found that CFS presentation for 3–15 min to one eye would enhance its dominance in a subsequent presentation of binocular rivalry (Kim et al., 2017). Although we used short trials, our screening experiments were relatively long (0.5–1 h), so this effect may help explain why we did not find a large number of participants whose percept of the stimuli was fully suppressed under CFS. In the current fMRI experiment, there was the possibility that the stimuli occasionally broke the suppression for some participants. If this was the case, the activity for the invisible conditions in the ventral ROIs could be affected. However, this cannot account for the sustained activity we observed in the dorsal ROIs for upright bodies. Thus, our findings are robust in the participants we examined. Future studies may benefit from larger sample sizes, but may also benefit from experimental designs that are less demanding on the performance of the participants.
Activity in the posterior part of the IPS (pIPS and mIPS ROIs), apart from the possibility that it is linked to the action-perception-related aspect of bodies and tools, may also reflect a more general attentional mechanism triggered by the stimulus. The IPS is part of the dorsal attention network and is known to be activated in multiple tasks. It is involved in the direction of attention, eye movements, and detection of salient events (Corbetta and Shulman, 2011), all of which could have played a role in our experiment. We presented the stimuli in the center of the visual field, and instructed the participants to always fixate centrally on the fixation cross; thus the voluntary spatial attention of participants was always directed to the center of the rectangles. We could not rule out the possibility that microsaccades differed between visible and invisible trials, as a previous CFS study found an increase of gaze directed to the locations of invisible stimuli compared with contralateral control locations (Rothkirch et al., 2012). Also, activity in the ventral pathway is known to be modulated by attention (Gilbert and Li, 2013), and an attentional modulation of activity was found under CFS as early as V1 regardless of visibility (Watanabe et al., 2011). However, since the subjective percepts of the invisible trials were the same, the attentional mechanism alone could not explain the higher activity we found for invisible upright vs. inverted bodies in the FBA ROI. Instead, an interplay of dorsal and ventral mechanisms may be present, as recent research suggested for object perception (Freud et al., 2016).
In our case, after the information from invisible body stimuli is relayed to both dorsal and ventral pathways, the ventral pathway representation may gain a category-specific processing advantage based on shape and orientation of the upright bodies, and the dorsal pathway representation may gain an advantage relating to action-observation-execution information in the upright bodies. These two in turn may drive the involuntary attention and affect the microsaccades.
The human posterior IPS regions may be homologous to the lateral intraparietal area in monkeys (Culham and Kanwisher, 2001), whose activity is modulated by stimulus salience and behavioral relevance (Corbetta and Shulman, 2002; Baluch and Itti, 2011). In our study the interaction between orientation and visibility is shown in the posterior and middle IPS ROIs as well as in the dorsal attentional network clusters from the whole-brain ANOVA. The interaction found in the posterior part of IPS may reflect a salience competition between the body stimuli and the dynamic noise pattern, caused by binocular disparity. If so, the salience and behavioral relevance of body stimuli may well result from the interplay between ventral and dorsal mechanisms. Given that under CFS salient and behaviorally-relevant stimuli were found to break through suppression faster (Yang et al., 2014), our findings suggest that the posterior part of IPS may act as an important transition stage in mediating stimuli entering into awareness by representing the salience of the stimuli. Our findings add to the link between visual perception and action, and are relevant for understanding the neural basis of perception of affective stimuli outside awareness.
Footnotes
The authors declare no competing financial interests.
This work was supported by the European Research Council, under the European Union’s Seventh Framework Programme (FP7/2007–2013)/ERC Grants 295673 (to B.d.G.) and 269853 (to R.G.).
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.