Abstract
A simple cue can be sufficient to elicit vivid recollection of a past episode. Theoretical models suggest that upon perceiving such a cue, disparate episodic elements held in neocortex are retrieved through hippocampal pattern completion. We tested this fundamental assumption by applying functional magnetic resonance imaging (fMRI) while objects or scenes were used to cue participants' recall of previously paired scenes or objects, respectively. We first demonstrate functional segregation within the medial temporal lobe (MTL), showing domain specificity in perirhinal and parahippocampal cortices (for object-processing vs scene-processing, respectively), but domain generality in the hippocampus (retrieval of both stimulus types). Critically, using fMRI latency analysis and dynamic causal modeling, we go on to demonstrate functional integration between these MTL regions during successful memory retrieval, with reversible signal flow from the cue region to the target region via the hippocampus. This supports the claim that the human hippocampus provides the vital associative link that integrates information held in different parts of cortex.
Introduction
How does the sight of a vase on one's desk rekindle a memory of the market stall in which it was purchased? Theoretical accounts and computational models posit that after initial binding, a partial retrieval cue will elicit a pattern completion process that reinstates the constituents of the original experience. Importantly, this process is thought to be accomplished by the hippocampus, a key region of the medial temporal lobe (MTL) memory system (Marr, 1971; Teyler and DiScenna, 1986; Treves and Rolls, 1994; Norman and O'Reilly, 2003; Teyler and Rudy, 2007). For instance, according to the “hippocampal memory indexing theory,” the actual contents of a memory representation are held in cortex, and these cortical regions are dynamically linked by the hippocampus during successful memory retrieval (Teyler and DiScenna, 1986; Teyler and Rudy, 2007). Although neuropsychological studies have shown clear evidence for the destructive effects of hippocampal damage on episodic memory (Squire, 1992; Yonelinas et al., 2002; Mayes et al., 2004, 2007; Squire et al., 2004; Vann et al., 2009), these findings cannot answer how the hippocampus dynamically interacts with cortical modules during intact episodic memory retrieval.
However, understanding the MTL's network dynamics (or “functional integration”) during memory retrieval first requires understanding the separate contributions (or “functional segregation”) among its subregions. While there is broad consensus about the role of the hippocampus (HIPP) in associative encoding and retrieval (for reviews, see Squire et al., 2004; Davachi, 2006; Eichenbaum et al., 2007; Mayes et al., 2007), controversy still surrounds the putative roles of perirhinal (PrC) and parahippocampal cortex (PhC) (comprising the anterior and posterior portions of the parahippocampal gyrus, respectively). While “process-based” accounts emphasize differential contributions of PrC and PhC to familiarity-based versus recollection-based recognition (Aggleton and Brown, 1999; Diana et al., 2007; Eichenbaum et al., 2007; Mayes et al., 2007), more recent “domain-based” accounts, building largely on neuroanatomical data (Suzuki and Amaral, 1994; Burwell and Amaral, 1998; Lavenex and Amaral, 2000), emphasize different stimulus properties processed by PrC and PhC (Davachi, 2006; Graham et al., 2010; Wixted and Squire, 2011). A key prediction of such domain-based accounts is that both PrC and PhC may contribute to recollection, but differentially so as a function of the stimulus material used to define successful memory performance.
In the current study, we used a novel object-scene cued recall paradigm (Fig. 1) to first assess whether PrC and PhC differentially process object- and scene-related information, respectively (consistent with domain-based accounts). Importantly, we then set out to reveal how these stimulus-specific contributions might be dynamically integrated during object-scene recall, and whether there is indeed a role of HIPP in linking the episodic elements held in cortex.
Materials and Methods
Participants.
Twenty (11 female) right-handed native English speakers with normal or corrected-to-normal vision participated in the experiment (mean age: 25 years, range: 22–32). Informed consent was obtained in a manner approved by a local Psychological Research Committee and participants were paid for their participation.
Experimental design.
The stimulus material consisted of 384 color pictures (Konkle et al., 2010), half of which (192) depicted objects and half of which depicted scenes (16 additional pictures were used for practice). For each of the two stimulus categories (objects, scenes), there were two similar exemplars for each of 96 subcategories (e.g., a glass of red wine and a glass of white wine for the object subcategory “wine glass,” or a scene with a volcano emitting lava and a scene with a volcano emitting an ash cloud for the scene subcategory “volcano”). As detailed below, the two exemplars per subcategory were used to enforce attention to event-specific details during encoding and retrieval. The stimulus material was counterbalanced so that half of the participants were presented with set 1 of subcategory exemplars during the first half of the experiment and with set 2 during the second half (and vice versa for the other half of the participants). Thus, no two exemplars of a given subcategory were presented within the same encoding-retrieval cycle.
The experiment consisted of six runs, with each run containing three blocks: an encoding block, a delay block, and a retrieval block (Fig. 1). Scanning was performed continuously across the three blocks, with short unscanned breaks between runs. Only retrieval data are reported here. The experiment was presented via the Psychophysics Toolbox (Brainard, 1997) implemented in MATLAB. During each encoding block, participants were presented with 32 unique object-scene pairs. The pairing of objects and scenes was randomized across participants. Object and scene pictures were each presented in a 250 × 250 pixels frame placed to the left and right of the screen center. During half of the trials (16, randomly selected), the object appeared to the left of the screen center and the scene to the right, with the reverse order during the other half of the trials. The trial duration was 4 s, and for the last 0.5 s the picture pair was replaced with a fixation cross (responses were still recorded), alerting participants that another trial would appear shortly. The encoding task was to indicate via button press whether the given object-scene pair is plausible or implausible, i.e., likely to appear in real life or nature (Staresina and Davachi, 2006). “Plausible” responses were given with the index finger and “implausible” responses with the middle finger. Across participants, use of left versus right hand was counterbalanced (but the finger assignment was held constant). Object-scene encoding trials were intermixed with an active baseline condition (Stark and Squire, 2001). Here, random numbers between 0 and 100 were shown, and participants pressed the index finger key for even numbers and the middle finger key for odd numbers. As soon as a response was given, another random number was shown. The response time for each number was self-paced and participants were encouraged to perform this task as fast as possible without sacrificing accuracy. Each encoding block lasted ∼3 min.
After the last encoding trial, participants saw a transition screen for 16 s, alerting them to the upcoming delay block. During the delay block, participants again performed the odd/even numbers task described above for 2 min. Odd/even response accuracy was reported to participants on the computer screen following the completion of the task to encourage accuracy.
At the end of the delay block, another 16 s transition screen alerted participants to the upcoming retrieval block. Each retrieval block consisted of 32 trials, each trial lasting 6 s. For a given trial, participants saw either the object or the scene of a given object-scene pair from the previous encoding block and were asked to indicate whether they remembered the corresponding paired associate (“recall”; index finger) or not (“forgot”; middle finger). Half of the cues (16) were object pictures, the other half scene pictures. Across the 32 retrieval trials, each cue type (object cue or scene cue) was presented in mini-blocks of eight consecutive trials (A-B-A-B), with a random assignment of object and scene cue trials to A and B in each run. Participants received the following instructions regarding “recall” responses: “Remembering the associate means that you could describe it in such a way that another person who has not seen the stimuli can pick the correct stimulus based on your description. Keep in mind that there are multiple exemplars per category, so your description has to be as detailed as possible.” As mentioned above, there were only two exemplars per subcategory. To ensure that participants gave “recall” responses when they indeed recalled the correct paired associate, we asked them to verbally describe the target after ∼10% of the “recall” responses. In particular, three catch trials per block were randomly determined beforehand (e.g., trials 5, 14, and 32 for block 1). If the participant did not give a recall response on a designated catch trial, the next trial on which a recall response was given served as a catch trial. As a result, some blocks had fewer than three catch trials; in the example above, if the participant indicated “forgot” on trial 32 (the last trial in a block), no alternative catch trial could be chosen.
Catch trials started after the 6 s trial period, showing a 2 s warning screen (“prepare to describe the associated image…”) followed by a 10 s period during which a verbal response was recorded (“please describe the associated image”), followed again by a 6 s fade-out screen (“prepare to continue with the experiment”). For scoring purposes, verbal responses were classified as accurate if they encompassed the target image's basic level label as well as some characteristic feature (e.g., “a glass of red wine” in the example above). As during the encoding block, retrieval trials were intermixed with odd/even number baseline trials. The retrieval block lasted ∼6 min.
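The catch-trial rule described above can be sketched in a few lines of Python. This is only an illustration of the stated logic, not the experiment code: the function name `select_catch_trials` and the example response sequence are hypothetical, and the handling of a fallback trial that coincides with a later designated trial (it is skipped) is an assumption, since the text does not specify it.

```python
def select_catch_trials(responses, designated):
    """Return the 1-based trial numbers that end up serving as catch trials.

    responses  : list of "recall" / "forgot" strings, one per retrieval trial
    designated : trial numbers (1-based) determined before the block starts
    """
    catch = []
    used = set()
    for d in sorted(designated):
        # Walk forward from the designated trial to the first unused "recall" trial.
        for trial in range(d, len(responses) + 1):
            if responses[trial - 1] == "recall" and trial not in used:
                catch.append(trial)
                used.add(trial)
                break
        # If no "recall" trial remains, the block simply has fewer catch trials.
    return catch


# Hypothetical block: odd-numbered trials recalled, even-numbered trials forgotten.
responses = ["recall", "forgot"] * 16
catch_trials = select_catch_trials(responses, designated=[5, 14, 32])
print(catch_trials)
```

In this made-up block, trial 5 is used directly, trial 14 is replaced by trial 15, and trial 32 has no replacement, so only two catch trials occur.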
For the imaging analysis, the four conditions of interest were as follows: (1) object cue, target scene recalled (O-S(R)), (2) object cue, target scene forgotten (O-S(F)), (3) scene cue, object target recalled (S-O(R)), and (4) scene cue, object target forgotten (S-O(F)).
Magnetic resonance imaging scanning details.
Scanning was performed on a 3 T Siemens Tim Trio magnetic resonance imaging (MRI) system using a 32-channel whole-head coil. Functional data were acquired using a gradient-echo, echo-planar pulse sequence (TR = 1000 ms, TE = 30 ms, 16 horizontal slices oriented parallel to the hippocampal axis, descending slice acquisition, 3 × 3 × 3 mm voxel size, 0.75 mm interslice gap, 702 volume acquisitions per run). The first 7 volumes of each run were discarded to allow for magnetic field stabilization. High-resolution (1 × 1 × 1 mm) T1-weighted (MP-RAGE) images were collected for anatomical visualization. Foam padding was used to minimize head motion. Visual stimuli were projected onto a screen that was viewed through a mirror, and responses were collected with magnet-compatible button boxes placed under the participant's hands.
The active baseline task (odd/even-task; Stark and Squire, 2001) comprised a fourth of the total scanning time. The sequence of encoding/retrieval trials and the variable number of baseline trials were pseudorandom and optimized for rapid event-related functional MRI (fMRI; using the “optseq” algorithm; Dale, 1999).
fMRI analysis.
Data were analyzed using SPM8 (http://www.fil.ion.ucl.ac.uk/spm/). During preprocessing, images were corrected for differences in slice acquisition timing, followed by motion correction across all runs. Neural activity for the conditions of interest (O-S(R), O-S(F), S-O(R), S-O(F)) was modeled as an impulse (delta function) in a design matrix that concatenated all retrieval blocks and included nuisance regressors for invalid trials, head movement, low-frequency scanner drift, and run means. Additionally, the 10 s overt speech plus the surrounding 2 s fade-in and 6 s fade-out periods of catch trials were modeled as user-specified nuisance regressors (using unconvolved stick functions for each volume). For the conventional general linear model (GLM) analysis, condition onsets were convolved using a single, canonical hemodynamic response function (HRF), as provided in SPM8. The resulting β-parameter estimates were then averaged across voxels within each region of interest (ROI; see below) in the participant's native space, and the resulting values were used in repeated-measures ANOVAs and t tests. For ANOVA factors with more than one numerator degree of freedom (df), we used a Greenhouse–Geisser df-correction for nonsphericity of the error.
Extraction of time-resolved blood oxygenation level-dependent (BOLD) data was based on the same design matrix, but condition responses were modeled via a finite impulse response (FIR) basis set (rather than the canonical HRF), with 20 bins and a 1 s bin-width equal to the TR (and converted to percentage signal change via the MarsBaR toolbox; Brett et al., 2002). We focused on the evoked BOLD response, corresponding to the first 11 bins, or 0.5–11.5 s (given that data were aligned to the middle slice acquired). These FIR parameter estimates were averaged across voxels within each ROI in the participant's native space.
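To make the FIR approach concrete, the following minimal sketch builds the FIR part of a design matrix for one condition, with 20 bins of 1 s (one bin per TR). It is not the SPM8 implementation: the function name `fir_design_matrix`, the onsets, and the number of scans are hypothetical, and nuisance regressors are omitted.

```python
def fir_design_matrix(onsets, n_scans, n_bins=20, tr=1.0):
    """Build an n_scans x n_bins FIR design matrix as a list of rows.

    Column k is 1 on the scan that lies k bins after an event onset,
    so each column's estimate is the response at one post-stimulus delay.
    """
    X = [[0.0] * n_bins for _ in range(n_scans)]
    for onset in onsets:
        first_scan = int(round(onset / tr))
        for k in range(n_bins):
            scan = first_scan + k
            if scan < n_scans:
                X[scan][k] = 1.0
    return X


onsets = [4.0, 40.0]                 # hypothetical event onsets in seconds
X = fir_design_matrix(onsets, n_scans=80)
evoked_columns = list(range(11))     # the first 11 bins analyzed as the evoked response
print(len(X), len(X[0]))
```

Because no response shape is assumed, the fitted column estimates trace out the evoked time course directly, which is what the latency analysis requires.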
We analyzed the data in PrC, PhC, and HIPP using hand-drawn, participant-specific ROIs, based on the individual structural image. Anatomical demarcation was done according to Insausti et al. (1998) and Pruessner et al. (2002). As there were no hemispheric differences (data not shown), we combined left and right hemisphere ROIs. Specifically, data were separately extracted for left and right hemisphere ROIs and collapsed before entering analyses. Note that no spatial smoothing was performed on the data, ensuring that there was minimal signal overlap between the regions.
Nonlinear HRF fitting.
To estimate the BOLD onset latency, we fit the model y = g(t, p1, p2, p3) + e, where y is the vector of the above 11 FIR parameter estimates across time points t = [0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5 10.5 11.5] from a given ROI and condition of a given participant, e is random error, and g is a nonlinear function with parameters p1–3. Given the positively skewed nature of the BOLD response, we defined g as a Gamma function in which p1 is a scaling parameter (amplitude), p2 is the onset latency, and p3 is a dispersion factor that affects the shape and scale of a gamma probability density function (G) over time t, relative to a peak of 6 s when sampled every dt = 1 s here (Evans et al., 1993).
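As an illustration of a response function with these three parameters, the sketch below evaluates a shifted, scaled gamma density. The exact parameterization is an assumption made for illustration: the shape and scale are chosen so that the response always peaks 6 s after onset p2, the peak height equals p1, and larger p3 gives a broader response. The helper names `gamma_pdf` and `g` are hypothetical.

```python
import math


def gamma_pdf(x, shape, scale):
    """Gamma probability density, computed in log space for stability."""
    if x <= 0:
        return 0.0
    log_pdf = ((shape - 1) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def g(t, p1, p2, p3, peak=6.0, dt=1.0):
    """Assumed form: amplitude p1, onset latency p2, dispersion p3."""
    scale = p3 * dt
    shape = peak / scale + 1.0        # the mode (shape - 1) * scale stays at `peak`
    return p1 * gamma_pdf(t - p2, shape, scale) / gamma_pdf(peak, shape, scale)


t = [0.5 + i for i in range(12)]                      # time points from the text
early = [g(ti, 0.76, 0.0, 1.0) for ti in t]           # onset latency 0 s
late = [g(ti, 0.76, 0.5, 1.0) for ti in t]            # onset latency 0.5 s
print([round(v, 3) for v in early])
print([round(v, 3) for v in late])
```

Separating onset (p2) from dispersion (p3) is the point of such a parameterization: a later peak can then be attributed to one or the other.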
In general, these parameters are fit numerically by an iterative algorithm that maximizes some goodness of fit (GOF) metric between the data (y) and fitted response (g(t, p1, p2, p3)). However, this GOF metric can have local maxima, particularly with noisy data for some participants and ROIs, and particularly when some parameters (such as onset latency and dispersion) have correlated effects on the fitted response. To accommodate this, we regularized the problem by imposing Gaussian priors on the parameters and on the noise, corresponding to a variational Bayesian approach that can be solved by maximizing a free-energy metric that approximates the model evidence (probability of the data given the model; Friston et al., 2007). In this case, e is assumed to be drawn from a zero-mean Gaussian distribution with variance p0. The prior mean of the onset latency (p2) was 0 s, the prior mean of the dispersion factor (p3) was 1, and the prior mean of the amplitude (p1) was set to the mean peak response over all regions, conditions, and participants (0.76% signal change). The variance of each of these four priors for parameters p0–3 was varied over a range [0.01 0.1 0.5 1 5 10 100], and the maximal free-energy used to select the best of the resulting 7^4 = 2401 models. This optimal model, which had prior variances of 10 for p0, 0.5 for p1, 0.5 for p2, and 0.1 for p3, was then used to estimate the posterior mean of the parameters reported in the main text.
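The size of this model space follows from taking every combination of the seven candidate variances for the four priors. The sketch below enumerates the grid and shows the selection step; the `free_energy` argument of `select_best` is a hypothetical callable standing in for the variational fit, which is not reproduced here.

```python
from itertools import product

variances = [0.01, 0.1, 0.5, 1, 5, 10, 100]

# One tuple per model: (variance for p0, p1, p2, p3).
grid = list(product(variances, repeat=4))
print(len(grid))                      # 7 ** 4 combinations


def select_best(grid, free_energy):
    """Return the prior-variance tuple with the highest free energy.

    `free_energy` is a placeholder for a function that fits the model
    under the given prior variances and returns its free energy.
    """
    return max(grid, key=free_energy)


reported_optimum = (10, 0.5, 0.5, 0.1)   # values reported in the text
print(reported_optimum in grid)
```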
Dynamic causal modeling.
Dynamic causal modeling (DCM) was performed using version DCM10 in SPM8, using the same model described above, except that the inputs had a duration of 2 s to allow sufficient sensitivity to modulation of connections (Henson et al., 2013). The volumes of interest were defined based on both anatomical and functional criteria. First, only voxels within the anatomically defined ROIs (PrC, PhC, HIPP) were considered. Second, within each ROI, we chose the 15 voxels with the strongest univariate effect sizes based on contrasts within a GLM: S-O(R) versus S-O(F) for PrC, O-S(R) versus O-S(F) for PhC, and R versus F (collapsed across object-cue and scene-cue trials) for HIPP. Note that this selection step merely served to identify the voxels most responsive within a given region and does not bias the subsequent DCM analysis to show any of the observed directionality effects. The top 15 voxels were identified for left and right hemisphere regions separately and then combined for the DCM analysis.
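The voxel-selection step can be sketched as follows. The effect sizes here are made-up numbers, the function name `top_voxels` is hypothetical, and "strongest" is taken to mean the largest signed contrast value, which is an assumption.

```python
def top_voxels(effect_sizes, n=15):
    """Return the n voxel identifiers with the largest effect sizes.

    effect_sizes : dict mapping a voxel identifier to its contrast effect size
    """
    ranked = sorted(effect_sizes, key=effect_sizes.get, reverse=True)
    return ranked[:n]


# Hypothetical effect sizes for 60 voxels in each hemisphere of one ROI.
left = {("L", i): (i * 37 % 101) / 100 for i in range(60)}
right = {("R", i): (i * 53 % 103) / 100 for i in range(60)}

# Select within each hemisphere separately, then pool for the volume of interest.
voi = top_voxels(left) + top_voxels(right)
print(len(voi))
```

Selecting per hemisphere before pooling guarantees that both hemispheres contribute equally to the volume of interest.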
Bayesian model selection was performed using a random effects model, as described by Stephan et al. (2009). This allows estimation of the “exceedance probability,” i.e., the extent to which each model is more likely than any other model tested to have generated the data from a randomly selected participant. The choice of models is described below.
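In random-effects model selection, the posterior over model frequencies in the population is a Dirichlet distribution, and the exceedance probability of a model is the probability that its frequency is the largest. A minimal Monte Carlo sketch of that last step is given below; the Dirichlet parameters are hypothetical, the function name is illustrative, and the estimation of those parameters from participant-wise model evidences is not shown.

```python
import random


def exceedance_probabilities(alpha, n_samples=20000, seed=0):
    """Estimate P(model k is the most frequent) under Dirichlet(alpha)."""
    rng = random.Random(seed)
    wins = [0] * len(alpha)
    for _ in range(n_samples):
        # A Dirichlet draw is a vector of independent gamma draws, normalized.
        # Normalizing does not change which entry is largest, so it is skipped.
        draws = [rng.gammavariate(a, 1.0) for a in alpha]
        wins[draws.index(max(draws))] += 1
    return [w / n_samples for w in wins]


alpha = [16.0, 6.0]                  # hypothetical Dirichlet parameters for two models
xp = exceedance_probabilities(alpha)
print([round(p, 3) for p in xp])
```

Exceedance probabilities sum to one across the models compared, so a value near 1 for one model means it is very likely the most frequent model in the population.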
Results
Behavioral results
During retrieval, the proportions of scene recall and forgot responses when cued with an object were 49.8 and 49.4%, respectively (with no response given on 0.8% of the trials). The corresponding proportions of object recall and forgot responses when cued with a scene were 43.6 and 54.8%, respectively (with no response given on the remaining 1.6% of the trials). Importantly, reaction times for recall responses did not differ statistically between object-cue and scene-cue trials (2.23 s vs 2.16 s; t(19) = 1.24, p = 0.23). On randomly interspersed catch trials (where participants verbally described their memory after giving a recall keypress), the answer was correct on 98% of the trials when cued with an object and on 96% of the trials when cued with a scene, demonstrating that participants indeed recalled the correct target when indicating so. The average numbers of trials for our conditions of interest were 48 for O-S(R) (range 33–69), 47 for O-S(F) (range 26–62), 42 for S-O(R) (range 25–61), and 52 for S-O(F) (range 35–70).
Functional segregation of PrC, PhC, and HIPP
ROIs were hand-drawn individually for PrC, PhC, and HIPP (Fig. 2a), where PrC and PhC were defined as the anterior and posterior third of the parahippocampal gyrus, respectively (Staresina et al., 2011). Our first set of predictions (for functional segregation) concerned the involvement of different MTL regions as a function of retrieval success (recalled (R) vs forgotten (F)) and cue type (object (O) cue vs scene (S) cue). If the contributions of PrC and PhC are domain specific, we would expect PrC to show a greater response during trials in which object information is represented, regardless of whether the object is perceived as the cue (O-S(R) and O-S(F)), or retrieved as the target after being cued with a scene image (S-O(R)), compared with when no object information is perceived or retrieved (S-O(F)). The same logic applies to PhC: an increased response would be expected whenever scene information is perceived (S-O(R) and S-O(F)), or retrieved from memory (O-S(R)), compared with when no scene information is perceived or retrieved (O-S(F)).
Using a conventional analysis of the parameter estimate for a canonical HRF derived from a GLM (see Materials and Methods), a repeated-measures ANOVA with the factors Region (PrC, PhC, HIPP), Cue Type (object, scene), and Memory (R, F) showed a highly significant three-way Region × Cue Type × Memory interaction (F(1.41,26.87) = 110.33, p < 0.001). Subsidiary repeated-measures ANOVAs conducted separately for each region showed a significant Cue Type × Memory interaction in PrC (F(1,19) = 8.39, p = 0.009) and in PhC (F(1,19) = 94.86, p < 0.001), but only a significant main effect of Memory in HIPP (F(1,19) = 95.17, p < 0.001; Cue Type × Memory interaction, F(1,19) = 0.31, p = 0.583). The pattern of significant pairwise differences is shown in Figure 2. In summary, PrC showed a significant memory effect (greater response to recalled than forgotten trials) for recalling objects, but not for recalling scenes; PhC showed a significant memory effect for recalling scenes, but not for recalling objects; and HIPP showed a significant memory effect for recalling both objects and scenes. This three-way interaction constitutes compelling evidence for functional segregation in the MTL: while PrC and PhC contributions are domain specific, driven by object and scene representations, respectively (either as the perceived cue or as the retrieved target), the contribution of HIPP is domain general and driven by success versus failure of associative recall. The same pattern of significant ROI results was obtained when allowing for latency differences (see below) by using instead as the dependent variable: (1) the percentage signal change from the peak time point of each region and condition or (2) the amplitude parameter from nonlinear fitting of the HRF.
Latencies of evoked responses within PrC and PhC
If PrC and PhC provide domain-specific contributions, their relative engagement over time would be expected to vary as a function of the cue–target relationship. For instance, if PrC holds object representations, engagement of this region should occur earlier when an object serves as the cue than when an object is successfully retrieved as the target, assuming that information must undergo additional processing stages when retrieved from memory relative to being perceived in the environment. Likewise, engagement of PhC should occur earlier when a scene serves as the cue than when a scene is the successfully retrieved target. Correspondingly, we predicted an earlier response for O-S(R) relative to S-O(R) in PrC, but an earlier response for S-O(R) relative to O-S(R) in PhC, reflecting a reversal of the relative temporal ordering of conditions across regions.
Evidence for this prediction was apparent when plotting the trial-averaged time courses of the evoked BOLD response every 1 s (Fig. 3): while the greatest BOLD response in PrC occurred during the fifth TR for the O-S(R) condition (when averaging responses across participants), the greatest mean response for the S-O(R) condition occurred later, in the sixth TR (Fig. 3a). The opposite pattern can be seen in PhC (Fig. 3b). To assess this statistically, we used nonlinear fitting of an HRF that was explicitly parameterized by its amplitude, onset latency, and dispersion (see Materials and Methods), and compared the onset latency estimates across conditions and regions. In PrC, the average onset latency (relative to the stimulus onset at 0 s) was 0.38 s for O-S(R) and 0.52 s for S-O(R). In PhC, on the other hand, the average onset latency was 0.65 s for O-S(R) and 0.23 s for S-O(R) (Fig. 3c). In a repeated-measures ANOVA with the factors Region (PrC, PhC) and Cue Type (object, scene), the onset latencies showed a significant cross-over interaction (F(1,19) = 15.23, p = 0.001). Importantly, no such difference in onset latencies was seen for the corresponding “forgot” trials (F(1,19) = 1.98, p = 0.175), and there was a significant three-way interaction between Region, Cue Type, and Memory (F(1,19) = 10.48, p = 0.004), suggesting that the delayed response indeed reflected the retrieval of the associated target, rather than general stimulus-related properties. There was no significant difference in HIPP between onset latency parameters for the O-S(R) and S-O(R) conditions (t(19) = 1.40, p = 0.177).
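The cross-over pattern in the recalled-trial latencies can be made explicit with a little arithmetic on the group means reported above. The sketch uses only those four means; it does not reproduce the ANOVA, which requires the participant-level estimates.

```python
# Mean onset latencies (s) for recalled trials, as reported in the text.
latency = {
    ("PrC", "O-S(R)"): 0.38, ("PrC", "S-O(R)"): 0.52,
    ("PhC", "O-S(R)"): 0.65, ("PhC", "S-O(R)"): 0.23,
}

# Delay when a region's preferred stimulus is the retrieved target
# rather than the perceived cue.
prc_delay = latency[("PrC", "S-O(R)")] - latency[("PrC", "O-S(R)")]   # object as target vs cue
phc_delay = latency[("PhC", "O-S(R)")] - latency[("PhC", "S-O(R)")]   # scene as target vs cue

# Region x Cue Type interaction contrast: difference of the cue-type differences.
interaction = ((latency[("PrC", "S-O(R)")] - latency[("PrC", "O-S(R)")])
               - (latency[("PhC", "S-O(R)")] - latency[("PhC", "O-S(R)")]))

print(round(prc_delay, 2), round(phc_delay, 2), round(interaction, 2))
```

Both regions respond later when their preferred stimulus type is the target (by 0.14 s in PrC and 0.42 s in PhC), and because the cue-type effect has opposite sign in the two regions, the interaction contrast is the sum of the two delays (0.56 s).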
Interestingly, there was no significant Region × Cue Type interaction for the dispersion parameter (F(1,19) = 0.65, p = 0.431), which suggests that the above interaction between Region and Cue Type on BOLD onset latency reflected a true difference in onset of neural activity, rather than a difference in the duration of that activity. This is important because a difference in (peak) latency of a BOLD response can also arise if there is a difference in duration, rather than onset, of underlying neural activity (Henson et al., 2013). More specifically, under a simple convolution model of the BOLD response, an increase in the duration of neural activity will produce a more dispersed BOLD response, with an increase in BOLD amplitude and peak latency. However, this possibility is captured by our inclusion of an explicit dispersion parameter. In sum, although both PrC and PhC showed similar response amplitudes for the O-S(R) and S-O(R) conditions, time-resolved BOLD analysis revealed a temporal dissociation within and across these regions: The PrC response preceded the PhC response when an object cue elicited successful recall of a scene target (O-S(R) trials), but the PhC response preceded the PrC response when a scene cue elicited successful recall of an object target (S-O(R) trials). This reversible temporal order across the MTL cortex suggests that information flows from cue to target region during successful recall.
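The distinction between onset and duration effects under a simple convolution model can be illustrated with a small simulation. All settings here are assumptions chosen for illustration (a gamma-shaped impulse response with its mode at 5 s, boxcar neural activity, a 0.1 s time step) and are not the parameters used in the study.

```python
import math

DT = 0.1  # simulation step in seconds


def hrf(t, shape=6.0, scale=1.0):
    """Illustrative gamma-shaped impulse response (mode at 5 s)."""
    if t <= 0:
        return 0.0
    return math.exp((shape - 1) * math.log(t) - t / scale
                    - math.lgamma(shape) - shape * math.log(scale))


def bold(onset, duration, t_max=30.0):
    """Convolve a boxcar of neural activity with the impulse response."""
    n = int(round(t_max / DT))
    start = int(round(onset / DT))
    stop = int(round((onset + duration) / DT))
    neural = [1.0 if start <= i < stop else 0.0 for i in range(n)]
    kernel = [hrf(i * DT) for i in range(n)]
    return [DT * sum(neural[j] * kernel[i - j] for j in range(i + 1))
            for i in range(n)]


def summarize(signal):
    """Return (peak latency in s, peak amplitude, width at half maximum in s)."""
    peak = max(signal)
    latency = signal.index(peak) * DT
    width = DT * sum(1 for v in signal if v >= peak / 2)
    return latency, peak, width


base = summarize(bold(onset=0.0, duration=1.0))
later_onset = summarize(bold(onset=0.5, duration=1.0))
longer = summarize(bold(onset=0.0, duration=3.0))
print(base, later_onset, longer)
```

In this toy setting, delaying the onset shifts the peak without changing amplitude or width, whereas lengthening the activity shifts the peak and also increases amplitude and width. This is why a latency difference without a dispersion difference points toward onset.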
Dynamic interactions across PrC, PhC, and HIPP
While the previous results are suggestive of a signaling cascade from the region representing the cue to the region representing the target, it is still unclear (1) whether there is a causal relationship between the two regions, in the sense that successful recall is driven by increased effective connectivity from the region representing the cue to the region representing the target, and (2) whether this directional flow encompasses HIPP, as suggested by our BOLD amplitude results (Fig. 2). To address these questions, we used DCM (Friston et al., 2003). Briefly, this method is based on simulating a dynamic model of neural activity within each region as a function of its connectivity to other regions, together with a region-specific hemodynamic model that maps such neural activity to the dependent variable, i.e., BOLD time series. Different sets of connections (networks) correspond to different models, and these models can be compared in terms of their Bayesian model evidence. Models are defined by (1) the location of driving inputs (here, the cues) to one or more regions, (2) the presence and direction of connections between regions (intrinsic connectivity), and (3) the connections between regions that are modulated by condition (here, successful vs unsuccessful cued recall). We grounded the basic network architecture (inputs and intrinsic connectivity) in primate anatomy (Suzuki and Amaral, 1994), in that the driving object and scene cue inputs entered the MTL via PrC and PhC, respectively, and all regions were fully, reciprocally connected. The key question was then which of these intrinsic connections was modulated by success vs failure of cued recall.
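The neural part of such a model can be sketched as a bilinear system, dz/dt = (A + m·B)z + C·u, where A holds the intrinsic connections, B the modulation by recall success (m), and C the driving input (u). The toy simulation below covers an object-cue trial only; all parameter values are hypothetical, the hemodynamic model is omitted, and nothing here reproduces the fitted DCMs.

```python
REGIONS = ["PrC", "PhC", "HIPP"]


def boxcar(start, stop):
    """Return a function of time that is 1 between start and stop, else 0."""
    return lambda t: 1.0 if start <= t < stop else 0.0


def simulate(A, B, C, drive, modulator, dt=0.01, t_max=10.0):
    """Euler integration of dz/dt = (A + m * B) z + C * u.

    A[i][j] and B[i][j] describe the connection from region j to region i.
    """
    n = len(A)
    z = [0.0] * n
    trace = []
    for step in range(int(round(t_max / dt))):
        t = step * dt
        u, m = drive(t), modulator(t)
        dz = [sum((A[i][j] + m * B[i][j]) * z[j] for j in range(n)) + C[i] * u
              for i in range(n)]
        z = [z[i] + dt * dz[i] for i in range(n)]
        trace.append(list(z))
    return trace


# Hypothetical values: self-decay of -1, full reciprocal coupling of 0.2.
A = [[-1.0, 0.2, 0.2],
     [0.2, -1.0, 0.2],
     [0.2, 0.2, -1.0]]
# Recall success strengthens PrC->PhC, PrC->HIPP, and HIPP->PhC (cue toward target).
B = [[0.0, 0.0, 0.0],
     [0.4, 0.0, 0.4],
     [0.4, 0.0, 0.0]]
C = [1.0, 0.0, 0.0]                  # object cue drives PrC

cue = boxcar(0.0, 2.0)               # 2 s input, as in the DCM specification
recalled = simulate(A, B, C, cue, modulator=cue)
forgotten = simulate(A, B, C, cue, modulator=lambda t: 0.0)

phc = REGIONS.index("PhC")
peak_recalled = max(z[phc] for z in recalled)
peak_forgotten = max(z[phc] for z in forgotten)
print(round(peak_recalled, 3), round(peak_forgotten, 3))
```

With these made-up values, PhC activity is larger on recalled than on forgotten trials because the modulated connections pass more of the PrC-driven activity forward, both directly and via HIPP.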
To address the first question of causal information flow from cue region to target region, we directly compared two types of models (Fig. 4A). In model M+, recall success modulated the effective connectivity from the cue region to the target region, i.e., from PrC to PhC for O-S trials, and from PhC to PrC for S-O trials. This was contrasted with model M−, where recall success modulated the effective connectivity in the reverse direction, i.e., from the target region to the cue region (from PhC to PrC for O-S trials, and from PrC to PhC for S-O trials). Results showed compelling evidence in favor of M+, with an exceedance probability of 0.996 (Fig. 4B, left).
While this result corroborates the notion of directed information flow from cue to target region that was suggested by the BOLD latency analysis (Fig. 3), it still leaves open whether successful recall encompasses information flow via the hippocampus. We thus devised a third model, which again had the same network architecture as M+, but additionally allowed modulation of connections from the cue region to the hippocampus and from the hippocampus to the target region (M++). Results showed that inclusion of a hippocampal route strongly increased the model evidence (exceedance probability of 0.892 in a direct comparison; Fig. 4B, right). Note also that model M++ outperformed all other models in a larger model space that systematically varied all modulatory connections while holding driving inputs and intrinsic connectivity constant, confirming that this is the optimal model among these multiple alternatives. Last, we compared model M++ (in which the flow of information varies flexibly as a function of the cue and target stimulus type) with two alternative models, one in which the connectivity from PrC toward PhC was modulated by recall success for both O-S and S-O trials, and one in which the connectivity from PhC toward PrC was modulated by recall success for both O-S and S-O trials. In other words, the latter two models assumed that information would always flow in a particular direction, regardless of the cue–target relationship. Again, the exceedance probabilities of those three models strongly favored the flexible bidirectional model M++ (0.67 vs 0.15 and 0.18, respectively).
As a final test, we fit a fourth, “full” model in which every connection to and from each region was modulated by recall success, and tested whether the coupling parameters were indeed reversible as a function of the cue–target relationship (O-S vs S-O trials). In a repeated-measures ANOVA on the modulatory coupling parameters, we included the factors Cue Type (object, scene) and Direction (from PrC toward PhC vs from PhC toward PrC, averaging across PrC-PhC, PrC-HIPP, and PhC-HIPP connections). Critically, we observed a significant Cue Type × Direction interaction (F(1,19) = 37.92, p < 0.001), due to a significant increase in effective connectivity from PrC toward PhC (compared with PhC toward PrC) during O-S(R) trials (t(19) = 5.25, p < 0.001), but a significant increase in effective connectivity from PhC toward PrC (compared with PrC toward PhC) during S-O(R) trials (t(19) = 4.08, p < 0.001). Indeed, when testing the individual connection strengths during successful recall (the sum of the intrinsic and modulatory connection parameters), five of the six modulations in the forward direction (from cue region toward target region) were significantly greater than zero (Fig. 4C). This included the output connection from HIPP to PhC during O-S trials, though the output connection from HIPP to PrC during S-O trials did not significantly differ from zero (see Discussion).
To summarize our DCM analysis: following the observation of a reversal of relative response latencies across PrC and PhC (Fig. 3), we obtained further evidence for a causal dynamic relationship between PrC and PhC, such that PrC drives activation in PhC when PrC represents the cue and PhC represents the target, but PhC drives PrC when the cue-target assignment is reversed. We then went on to demonstrate that this directional flow from cue toward target region is better captured by adding a further indirect route via the hippocampus. This was demonstrated both across models that differed in which connections were modulated by successful recall, and across coupling parameters within a fully modulated model. Relating back to the pattern of functional segregation (Fig. 2), these results suggest that the hippocampus flexibly links domain-specific representations in MTL cortex during successful recall.
Discussion
Ever since the hallmark case of patient H.M. (Scoville and Milner, 1957), whose episodic memory was devastated by a large lesion to his MTLs, memory research has primarily focused on teasing apart the contributions of different regions within the MTL (Cohen and Eichenbaum, 1993; Aggleton and Brown, 1999; Cohen et al., 1999; Norman and O'Reilly, 2003; Eichenbaum, 2004, 2007; Squire et al., 2004; Henson, 2005; Davachi, 2006; Diana et al., 2007; Mayes et al., 2007). However, controversy about the precise principles of functional segregation has hindered progress on the arguably more important question of how these regions dynamically interact to enable our rich and integrated episodic memories (functional integration). The question of functional integration is not unique to memory research; it is a fundamental challenge in neuroscience that emerges whenever specialized modules must be integrated to enable coherent perception, thought, and action (Zeki and Shipp, 1988; Edelman, 1993; Friston, 2002; Macaluso and Driver, 2005). In the current study, we first showed a pattern of functional segregation across MTL regions that supports recent neuroanatomically based accounts of MTL functions. Building on this division of labor, we then proceeded to assess how the separate contributions are dynamically integrated during successful recall.
Functional segregation: three-way dissociation in the contributions of PrC, PhC, and HIPP
As mentioned in the Introduction, recent efforts to capture the division of labor among MTL regions have emphasized the anatomical inputs and stimulus representations processed by these regions (Lee et al., 2005; Buffalo et al., 2006; Davachi, 2006; Diana et al., 2007; Graham et al., 2010; Staresina et al., 2011; Wixted and Squire, 2011; Liang et al., 2013). Regarding the MTL cortex, our current data provide strong support for this view (Fig. 2). Engagement of PrC and PhC was driven by processing of objects or scenes, respectively, regardless of whether their preferred stimuli were perceived as a cue, or retrieved as a target. Regarding the retrieval effects, it is interesting to note that despite the strong interaction of Cue Type × Memory in both regions (reflecting differential recall effects for each region's preferred stimulus type), there was a numerical trend in PrC toward a recall effect for scene targets. This pattern is reminiscent of an fMRI study that assessed MTL activation during encoding of objects and spatial locations (Buffalo et al., 2006) and found only spatial encoding effects in PhC, but both object and spatial encoding effects in PrC (albeit stronger effects for objects). Moreover, a recent study using a cued recall paradigm in which objects were used as items and scenes were used as contexts (Hannula et al., 2013) found recall effects in PhC only when retrieving the scene context, but recall effects in PrC both when retrieving the object item and the scene context. Collectively, these results suggest that the assignment of PrC to object processing versus PhC to scene processing may not be perfectly symmetrical. One explanation might be that at conventional fMRI resolutions, PrC may include signal from the adjacent entorhinal cortex, which processes both spatial and nonspatial representations along its mediolateral gradient (Schultz et al., 2012). Higher resolution imaging would be needed to address this possibility. 
Another explanation (as suggested by Buffalo et al., 2006) might be that there is stronger anatomical input from PhC to PrC than vice versa (Suzuki and Amaral, 1994). The stronger direct connections from PhC to PrC than from PrC to PhC may also explain why modulation of output from HIPP to PrC in the DCM analysis did not reach significance for scene cues (see Results).
On a related note, it is worth considering that the relatively strict criterion for recall responses likely induced a fairly conservative response bias, such that forgot responses may include less confident target recall and/or different levels of stimulus familiarity. Likewise, in searching for the associated target, participants may continue to mentally generate and scan multiple exemplars from the target category during F trials. While such transient stimulus representations are unlikely to achieve the same representational fidelity as successfully retrieved targets, they may still engage PrC and PhC to certain levels. This would explain the clear above-baseline activation levels of S-O(F) and O-S(F) trials in PrC and PhC (Fig. 2), respectively. However, the key finding with regard to PrC and PhC activation levels is their dissociation in supporting recall of object targets versus scene targets, respectively, consistent with a role of these regions in domain-specific retrieval.
Unlike in the MTL cortex, we observed no domain specificity in HIPP; rather, HIPP engagement reflected success versus failure of cued recall regardless of the stimulus type being recalled. This is in agreement with the idea that HIPP contributions are domain general (Cohen and Eichenbaum, 1993; Eichenbaum, 2004; Davachi, 2006; Staresina and Davachi, 2008; Konkel and Cohen, 2009; Kumaran et al., 2012), and is again consistent with the multimodal array of anatomical inputs this region receives (Suzuki and Amaral, 1994; Lavenex and Amaral, 2000; van Strien et al., 2009). There is abundant evidence for the role of HIPP in associative binding/pattern completion (for reviews, see Cohen and Eichenbaum, 1993; Aggleton and Brown, 1999; Squire et al., 2004; Davachi, 2006; Eichenbaum et al., 2007; Mayes et al., 2007; Konkel and Cohen, 2009), but our data are the first to reveal how recall modulates the functional connectivity of HIPP with other MTL structures. This is arguably more direct evidence for a role of HIPP in pattern completion than has been furnished by previous activation analyses. To be explicit, while our results on functional segregation across the MTL corroborate and extend previous findings, the novel aspect of the current study is that we build on these different contributions to ask how their dynamic interplay enables memory retrieval, as elaborated below.
Network dynamics across the MTL: from functional segregation to integration
Given the stimulus-specific contributions of PrC and PhC, we first asked whether their engagement reflects different stages in the MTL signaling cascade during episodic retrieval. Specifically, we hypothesized that during O-S(R) trials, PrC activation reflects processing of the object cue, whereas PhC activation reflects retrieval of the scene target. Similarly, during S-O(R) trials, PhC activation should reflect processing of the scene cue, whereas PrC activation reflects retrieval of the object target. Given that the cue, by definition, precedes the recalled item in a cued-recall paradigm (and given that the response latency for recalled responses was ∼2 s; see Results), one would expect these different functions to be expressed with different temporal profiles: engagement of the region representing the perceived cue should precede engagement of the region representing the retrieved target. As illustrated in Figure 3, the data bear out this prediction: during O-S(R) trials, the PrC BOLD response preceded the PhC response, whereas during S-O(R) trials, the PhC BOLD response preceded the PrC response.
Is the BOLD response sensitive enough to reveal temporal differences across conditions at such a short timescale? Despite the tacit assumption that the fMRI signal is proportional to neural firing rates, skepticism is warranted when interpreting BOLD time course effects (Friston et al., 2000; Heeger and Ress, 2002; Logothetis and Wandell, 2004). Therefore, any main effect of Region would be difficult to interpret due to potentially different neural-to-BOLD mappings across regions, and any main effect of Cue Type would be difficult to interpret due to potentially different dynamics earlier in object-processing versus scene-processing pathways. Importantly, however, the cross-over interaction of Region × Cue Type we observed here (Fig. 3C) rules out any such region-specific or processing-pathway explanations. Furthermore, although it is difficult to infer backward from BOLD latency differences to underlying neural latency differences, our BOLD latency findings were obtained from nonlinear fitting of a model that included separate parametrization of onset delay, amplitude, and dispersion, allowing for more confident interpretation of differences in the BOLD onset latency parameter in terms of neural onset latency. Indeed, in a previous fMRI study, BOLD latency differences in PrC and HIPP across different memory retrieval conditions were directly confirmed by intracranial electroencephalography recordings (Staresina et al., 2012).
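The value of separately parametrizing onset delay, amplitude, and dispersion can be illustrated with a minimal sketch. The gamma-shaped response function, the parameter names, and the grid search used here in place of a nonlinear optimizer are all assumptions made for illustration. The time courses are synthetic and noise-free.

```python
import math

def response(t, onset, amplitude, dispersion, shape=2.0):
    """Gamma-shaped response that is zero before `onset` (illustrative form only)."""
    x = (t - onset) / dispersion
    if x <= 0.0:
        return 0.0
    return amplitude * x ** shape * math.exp(-x)

def fit_onset(times, signal, onsets, dispersions):
    """Grid search over onset and dispersion; amplitude by closed-form least squares."""
    best = None
    for onset in onsets:
        for disp in dispersions:
            basis = [response(t, onset, 1.0, disp) for t in times]
            denom = sum(b * b for b in basis)
            if denom == 0.0:
                continue
            amp = sum(b * s for b, s in zip(basis, signal)) / denom
            sse = sum((s - amp * b) ** 2 for b, s in zip(basis, signal))
            if best is None or sse < best[0]:
                best = (sse, onset, amp, disp)
    return best[1], best[2], best[3]

times = [0.5 * i for i in range(40)]           # sample times in seconds
onset_grid = [i / 10 for i in range(31)]       # candidate onsets, 0 to 3 s
disp_grid = [(8 + i) / 10 for i in range(10)]  # candidate dispersions, 0.8 to 1.7

# Two synthetic time courses that differ only in onset delay (0.5 s vs 1.5 s)
early = [response(t, 0.5, 1.0, 1.2) for t in times]
late = [response(t, 1.5, 1.0, 1.2) for t in times]

onset_early, amp_early, disp_early = fit_onset(times, early, onset_grid, disp_grid)
onset_late, amp_late, disp_late = fit_onset(times, late, onset_grid, disp_grid)
print(onset_early, onset_late)
```

Because amplitude and dispersion are estimated alongside the onset, a latency difference between two time courses is attributed to the onset parameter only when it cannot be absorbed by a change in response height or width.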
While the latency of the trial-averaged evoked BOLD response is suggestive of a directional interplay between PrC and PhC during successful cued recall, simple latency differences are only indirect evidence for a causal relationship between the region representing the cue stimulus and the region representing the target stimulus. Such causality is better inferred from temporal dependencies between regions across the whole fMRI time series, as in DCM. Furthermore, the latency analysis did not illuminate the putative role of HIPP as a pattern completer in this cue-target cascade (Marr, 1971; Teyler and DiScenna, 1986; Treves and Rolls, 1994; Norman and O'Reilly, 2003; Teyler and Rudy, 2007). Indeed, recent studies using direct electrophysiological recordings in humans have shown results partly consistent with the notion of HIPP as the interface between cue and target: a source retrieval signal was shown to be initiated in HIPP in response to a preceding old/new signal in PrC (Staresina et al., 2012), and an increase in HIPP firing rates and gamma power was shown to precede free recall of target items (Sederberg et al., 2007; Gelbard-Sagiv et al., 2008). However, due to restricted coverage of cortical sites in those studies, the network dynamics between HIPP and stimulus-specific MTL cortical regions (PrC and PhC) during successful recall have remained elusive. Here, we included HIPP in a DCM to explicitly test for changes in effective connectivity between all three MTL regions during successful memory retrieval. First and foremost, our DCM results provided clear evidence in favor of the expected flow of information across the MTL cortex (Fig. 4), i.e., from PrC toward PhC when perceiving an object and retrieving an associated scene, and in the opposite direction (from PhC toward PrC) when perceiving a scene and retrieving an associated object.
Second, a model that additionally incorporated an indirect transmission route, (1) from the cortical region representing the cue to HIPP and (2) from HIPP to the cortical region representing the target, further outperformed the model in which only direct PrC–PhC connections were modulated. Again, this directional flow including the hippocampal route was reversible as a function of the cue-target relationship on a given trial. These results are consistent with the notion that HIPP serves as the site of associative recall/pattern completion, and that different parts of the MTL cortex serve as stimulus-specific input and output modules.
Footnotes
This work was supported by a Sir Henry Wellcome Postdoctoral Fellowship to B.P.S. and the UK Medical Research Council Program MC_A060_5PR10 to R.N.H. We thank Mike Anderson for helpful discussion.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Bernhard Staresina, MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge CB2 7EF, UK. bernhard.staresina{at}mrc-cbu.cam.ac.uk