Structural and Functional MRI Data Differentially Predict Chronological Age and Behavioral Memory Performance

Abstract Human cognitive abilities decline with increasing chronological age, with decreased explicit memory performance being most strongly affected. However, some older adults show “successful aging,” that is, relatively preserved cognitive ability in old age. One explanation for this could be higher brain-structural integrity in these individuals. Alternatively, the brain might recruit existing resources more efficiently or employ compensatory cognitive strategies. Here, we approached this question by testing multiple candidate variables from structural and functional neuroimaging for their ability to predict chronological age and memory performance, respectively. Prediction was performed using support vector machine (SVM) classification and regression across and within two samples of young (N = 106) and older (N = 153) adults. The candidate variables were (1) behavioral response frequencies in an episodic memory test; (2) recently described functional magnetic resonance imaging (fMRI) scores reflecting preservation of functional memory networks; (3) whole-brain fMRI contrasts for novelty processing and subsequent memory; (4) resting-state fMRI maps quantifying voxel-wise signal fluctuation; and (5) gray matter volume estimated from structural MRIs. While age group could be reliably decoded from all variables, chronological age within young and older subjects was best predicted from gray matter volume. In contrast, memory performance was best predicted from task-based fMRI contrasts and particularly single-value fMRI scores, whereas gray matter volume has no predictive power with respect to memory performance in healthy adults. Our results suggest that superior memory performance in healthy older adults is better explained by efficient recruitment of memory networks rather than by preserved brain structure.


Introduction
Episodic memory performance peaks in young adulthood and declines with increasing age. Notably, a subpopulation of older adults show "successful aging," with memory performance comparable to that of younger adults (Nyberg et al., 2012; Nyberg and Pudas, 2019). An early assessment of changes in cognitive performance can help to identify people at risk of pathologic aging, such as various forms of dementia, and allows for early medical and behavioral interventions (Naismith et al., 2009; Cabeza et al., 2018; Whitty et al., 2020). Machine learning-based techniques such as support vector machine (SVM) classification and regression provide promising approaches to differentiate normal from pathologic neurocognitive aging. They have been employed to predict chronological age from structural magnetic resonance imaging (MRI; Cole et al., 2018), to estimate brain age (Bashyam et al., 2020; Habes et al., 2021), or to distinguish health from disease (Dyrba et al., 2021; Eitel et al., 2021).
In contrast to the abundant literature on age prediction from structural MRI (Luders et al., 2016; Steffener et al., 2016; Cole et al., 2018; Soch, 2020), few studies have been devoted to predicting cognitive function, particularly memory performance, from neuroimaging data. One such study found that a combination of ApoE genotype and functional MRI (fMRI) was the most effective predictor for future cognitive decline (Woodard et al., 2010). The wide range of cognitive functioning even within narrowly defined age groups suggests that chronological age and cognitive performance might be predicted by different modalities. Several studies evaluated potential structural, functional, physiological and behavioral predictors of age-related cognitive decline (Gross et al., 2011; Hou et al., 2020; Chen et al., 2021), but only a few studies systematically compared different predictors and their joint predictive value (Woodard et al., 2010).
Comparing the predictive value of MRI biomarkers for chronological age versus individual memory performance appears to be a promising endeavor, because "successful aging" may reflect dissociable neural mechanisms: differences in the manifestation of age-related physiological changes ("brain maintenance") and/or differences in cognitive processing ("cognitive reserve"; Nyberg et al., 2012). Thus, data from different modalities may differentially predict chronological age and memory performance, respectively.
We compared SVM-based prediction of chronological age versus prediction of memory performance from behavioral data, task-based fMRI, resting-state fMRI, and structural MRI markers associated with increasing age. Our analyses were based on a large sample of 106 young and 153 older subjects (Soch et al., 2021a). Episodic memory performance was measured in the fMRI task and in various neuropsychological tests, using either incidental or intentional memory formation.
In addition to task-based fMRI, we also included recently described single-value fMRI scores (Soch et al., 2021b;Richter et al., 2022). These scores are derived from fMRI contrasts and describe the amount of deviation from or similarity with prototypical activations seen in young adults during novelty processing and successful encoding, by focusing on either typical versus atypical activations (FADE, functional activity deviation during encoding) or activations and deactivations (SAME, similarity of activations during memory encoding). These scores might constitute more robust predictors than voxel-wise fMRI contrasts, as a recent meta-analysis suggested that test-retest reliability of task-based fMRI is mediocre, and the authors recommended whole-brain aggregate analysis rather than voxel-based or ROI-based analyses to improve reliability (Elliott et al., 2020).
As an intermediate variable between task-based fMRI and structural MRI, we included the strength of resting-state fMRI signal fluctuations (Jia et al., 2020). Although resting-state fMRI, like task-based fMRI, measures the BOLD signal, it is, like structural MRI, not selective with respect to specific cognitive functions, because subjects are not performing a specific cognitive task (Buckner et al., 2008).
We hypothesized that both chronological age and memory performance could be best predicted from structural MRI, because age-related decrease of memory performance is typically accompanied by structural brain alterations (Cabeza et al., 2004; de Mooij et al., 2018). Whether any MRI modality would outperform the others in predicting memory performance was assessed in an exploratory manner.

Participants
The study was approved by the Ethics Committee of the Otto von Guericke University Magdeburg, Faculty of Medicine, and written informed consent was obtained from all participants in accordance with the Declaration of Helsinki (World Medical Association, 2013).
Participants were recruited via flyers at the local universities (mainly young subjects), advertisements in local newspapers (mainly older participants), and during public outreach events of the institute (e.g., Long Night of the Sciences).
The study cohort consisted of a total of 259 neurologically and psychiatrically healthy adults, including 106 young (47 male, 59 female, age range 18-35, mean age 24.12 ± 4.00 years) and 153 older (59 male, 94 female, age range 51-80, mean age 64.04 ± 6.74 years) participants. According to self-report, all participants were right-handed and did not use neurologic or psychiatric medication. The Mini-International Neuropsychiatric Interview (M.I.N.I.; Sheehan et al., 1998; German version by Ackenheil et al., 1999) was used to exclude present or past psychiatric illness, alcohol or drug dependence.
Please note that this study is based on the same participant sample as described by Soch et al. (2021a,b) and Richter et al. (2022). The analyses and results described in this study are novel and have not been described or shown elsewhere.

Experimental paradigm
During the fMRI experiment, participants performed a visual memory encoding paradigm with an indoor/outdoor judgment as the incidental encoding task. Compared with earlier publications of this paradigm (Düzel et al., 2011; Barman et al., 2014; Schott et al., 2014; Assmann et al., 2021), the trial timings had been adapted as part of the DZNE-Longitudinal Cognitive Impairment and Dementia (DELCODE) study protocol (Düzel et al., 2018; Bainbridge et al., 2019; for a detailed comparison of trial timings and acquisition parameters, see Soch et al., 2021a). Subjects viewed photographs showing indoor and outdoor scenes, which were either novel at the time of presentation (44 indoor and 44 outdoor scenes) or were repetitions of two highly familiar "master" images (22 indoor and 22 outdoor trials), i.e., one indoor and one outdoor scene prefamiliarized before the actual experiment (cf. Soch et al., 2021a, their Fig. 1B). Thus, every subject was presented with 88 unique images and 2 master images that were presented 22 times each. Participants were instructed to categorize images as "indoor" or "outdoor" via button press. Each picture was presented for 2.5 s, followed by a variable delay between 0.70 and 2.65 s. To allow reliable estimation of the condition-specific BOLD responses despite the short delay, simulations were employed to optimize the trial order and jitter, as described previously (Hinrichs et al., 2000; Düzel et al., 2011).
Approximately 70 min (70.23 ± 3.77 min) after the start of the fMRI session, subjects performed a computer-based recognition memory test outside the scanner, in which they were presented with the 88 images that were shown once during the fMRI encoding phase (old) and 44 images they had not seen before (new). Participants rated each image on a five-point Likert scale from 1 ("definitely new") to 5 ("definitely old"). For a detailed description of the experimental procedure, see Assmann et al. (2021) and Soch et al. (2021a).

fMRI data preprocessing
Data preprocessing was performed using Statistical Parametric Mapping (SPM12; Wellcome Trust Center for Neuroimaging, University College London, London, United Kingdom). EPIs were corrected for acquisition time delay (slice timing), head motion (realignment), and magnetic field inhomogeneities (unwarping), using voxel-displacement maps (VDMs) derived from the fieldmaps. The MPRAGE image was spatially co-registered to the mean unwarped image and segmented into six tissue types, using the unified segmentation and normalization algorithm implemented in SPM12. The resulting forward deformation parameters were used to normalize unwarped EPIs into a standard stereotactic reference frame (Montreal Neurological Institute, MNI; voxel size = 3 × 3 × 3 mm). Normalized images were spatially smoothed using an isotropic Gaussian kernel of 6 mm full width at half maximum (FWHM).
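To make the smoothing parameter concrete, the following minimal sketch converts the stated kernel width into the standard deviation of the corresponding Gaussian, using the general relation FWHM = 2·sqrt(2·ln 2)·sigma. The function name is hypothetical and the snippet is an illustration only; it is not part of the SPM12 pipeline.

```python
import math

def fwhm_to_sigma(fwhm: float) -> float:
    """Standard deviation of a Gaussian with the given full width at half maximum."""
    return fwhm / (2 * math.sqrt(2 * math.log(2)))

sigma_mm = fwhm_to_sigma(6.0)   # 6 mm kernel, as stated in the text
sigma_vox = sigma_mm / 3.0      # expressed in units of 3 mm voxels
print(round(sigma_mm, 3), round(sigma_vox, 3))
```

With 3 mm voxels, the kernel's standard deviation is therefore somewhat below one voxel.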

General linear modeling
For first-level fMRI data analysis, which was also performed in SPM12, we used a parametric general linear model (GLM) of the subsequent memory effect that has recently been demonstrated to outperform the so far more commonly employed categorical models of fMRI subsequent memory effects (Soch et al., 2021a) when subsequent memory responses are recorded as memory confidence ratings on a parametric scale.
This model included two onset regressors, one for novel images at the time of presentation ("novelty regressor") and one for presentations of the two prefamiliarized images ("master regressor"). Both regressors were created as short box-car stimulus functions with an event duration of 2.5 s, convolved with the canonical hemodynamic response function, as implemented in SPM12.
The regressor reflecting subsequent memory performance was obtained by parametrically modulating the novelty regressor with a function describing subsequent memory report. Specifically, the parametric modulator (PM) was given by:

$$\mathrm{PM} = \arcsin\left(\frac{x - 3}{2}\right) \cdot \frac{2}{\pi}$$

where $x \in \{1, 2, 3, 4, 5\}$ is the subsequent memory report, such that $-1 \leq \mathrm{PM} \leq +1$. Compared with a linear-parametric model, this transformation puts a higher weight on definitely remembered (5) or forgotten (1) items compared with probably remembered (4) or forgotten (2) items (cf. Soch et al., 2021a, their Fig. 2A).
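As a numerical illustration of this weighting, the following sketch evaluates an arcsine-based modulator, PM = arcsin((x − 3)/2) · 2/π, for the five confidence levels and compares it with a linear rescaling of the ratings. The arcsine form is taken here as an assumption based on the parametric model described by Soch et al. (2021a); the function names are hypothetical.

```python
import math

def arcsine_pm(x: int) -> float:
    """Arcsine-transformed modulator arcsin((x - 3) / 2) * 2 / pi (assumed form)."""
    return math.asin((x - 3) / 2) * 2 / math.pi

def linear_pm(x: int) -> float:
    """Linear rescaling of the rating to [-1, +1], shown for comparison."""
    return (x - 3) / 2

# Print rating, arcsine-based value, and linear value for each confidence level.
for rating in range(1, 6):
    print(rating, round(arcsine_pm(rating), 3), round(linear_pm(rating), 3))
```

Under this assumed form, ratings 2 and 4 map to −1/3 and +1/3 rather than −1/2 and +1/2, so the extreme ratings 1 and 5 carry relatively more weight than in a linear model.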
The model also included the six rigid-body movement parameters obtained from realignment as covariates of no interest and a constant representing the implicit baseline.

Extraction of target variables
For each subject, age group (young vs older), chronological age (in years) and memory performance (area under the curve, AUC; see Soch et al., 2021b, Appendix B) were extracted as dependent variables, i.e., target variables for prediction analyses (see Table 1).
Note that our measure of memory performance is not completely independent from some of the source variables, because it was obtained from the same task during which behavioral data and fMRI were acquired (see below, Extraction of source variables). For this reason, we also used independent measures of memory performance to test the predictive performance of our candidate variables. These measures include (1) the number of items retrieved in a verbal learning task (verbal learning and memory test, VLMT; Helmstaedter et al., 2001), in a recall after 30 min or 1 d; and (2) the number of points obtained in a semantic memory test (Wechsler memory scale, WMS; Härting et al., 2000), in a recall after 30 min or 1 d (see Table 2). For detailed description of these neuropsychological assessments, see Richter et al. (2022).

Extraction of source variables
For each subject, the following variables were extracted as independent variables, i.e., source variables for prediction analyses (see Table 3):
• behavioral response frequencies: In the surprise recognition memory test, subjects provided memory confidence ratings between 1 and 5 for all 88 old stimuli (i.e., items presented during the encoding session) and 44 new stimuli (i.e., items not seen during the encoding session; see above, Experimental paradigm). From the responses of subject i, we calculated $o_{ij}$, the proportion of old items rated with confidence level j, and $n_{ij}$, the proportion of new items rated with j. The variables $o_{i3}$ and $n_{i3}$ were dropped to avoid collinearity of predictor variables, since all "old" proportions and all "new" proportions added up to 1, respectively.
• fMRI contrast images: The GLM for first-level fMRI data analysis contained one regressor for novel images, parametrically modulated with a nonlinear transformation of memory confidence, and another regressor for master images (see above, General linear modeling). From this, we generated fMRI contrast maps for "novelty processing" as such, by subtracting the master regressor from the novelty regressor, and for "subsequent memory" effects, identical to the estimated regression coefficient for the PM.
• fMRI summary statistics: We then identified regions with group-level significant positive and negative activations on these contrasts in young subjects. Using these voxels as masks, we calculated two recently described fMRI scores quantifying the deviation of older adults from the prototypical activation of young subjects (for detailed procedure and extracted scores, see Soch et al., 2021b). Both scores, FADE-classic (FADE = functional activity deviation during encoding; Düzel et al., 2011) and FADE-SAME (SAME = similarities of activations during memory encoding; Soch et al., 2021b), were computed from both contrasts, novelty processing and subsequent memory.
• resting-state fMRI maps: We then applied the RESTplus toolbox (Jia et al., 2019) to the preprocessed resting-state fMRI scans of each subject and calculated the voxel-wise percent of amplitude fluctuation (PerAF) of signals in the frequency range from 0.01 to 0.08 Hz. PerAF is the average absolute deviation from the signal mean, measured in percent (Jia et al., 2020, eq. 1). Here, we used "mean PerAF" (mPerAF), which additionally divides PerAF by the global mean (Jia et al., 2020, their Table 1) and was already employed in a previous study (Kizilirmak et al., 2022).
• structural MRI maps: Finally, the T1 image of each subject was submitted to structural MRI analyses (i.e., voxel-based morphometry; VBM) using the Computational Anatomy Toolbox (CAT12; Structural Brain Mapping Group, Department of Neurology, University Jena, Germany), resulting in gray matter volume (GMV) maps. These maps were additionally smoothed using a Gaussian kernel (isotropic FWHM = 6 mm) before entering whole-brain decoding analyses.
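The behavioral response frequencies in the first list item can be sketched as follows. This is a minimal illustration with made-up ratings, not the authors' code: the function name and the example data are hypothetical, and the middle confidence level is dropped as described above.

```python
from collections import Counter

def response_proportions(ratings, levels=(1, 2, 3, 4, 5), drop=3):
    """Proportion of items rated with each confidence level, omitting one level."""
    counts = Counter(ratings)
    total = len(ratings)
    return [counts[level] / total for level in levels if level != drop]

# Hypothetical ratings of one subject (not real data): 8 old and 4 new items.
old_ratings = [5, 5, 4, 4, 3, 2, 5, 1]
new_ratings = [1, 1, 2, 3]

# Eight behavioral features: proportions for levels 1, 2, 4, 5 of old and new items.
features = response_proportions(old_ratings) + response_proportions(new_ratings)
print(features)
```

Because the five proportions for old items (and likewise for new items) sum to 1, any one of them is a linear function of the other four, which is why one level per item type can be omitted without losing information.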

Prediction of target from source variables
After source and target variables were extracted, several analyses were performed, each of which consisted of predicting a single target variable from a feature set of source variables using SVMs (see Fig. 1; Table 4).
To decode the age group a subject belonged to, we used support vector classification (SVC) with a linear SVM and C = 1. For predicting chronological age and memory performance, we used support vector regression (SVR) with a linear SVM and C = 1. For both SVC and SVR, subjects were split using k-fold cross-validation (CV) on subjects per group with k = 10 CV folds. All SVM analyses were implemented using LibSVM in MATLAB via in-house scripts available from GitHub (https://github.com/JoramSoch/ML4ML).
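The cross-validation scheme can be illustrated with a short sketch. It assumes that "CV on subjects per group" means that each fold contains a roughly equal share of young and older subjects; this interpretation, the function name, and the random seed are assumptions, and the snippet does not reproduce the ML4ML scripts or the SVM training itself.

```python
import random

def stratified_folds(groups, k=10, seed=0):
    """Assign each subject to one of k folds, balancing folds within each group."""
    rng = random.Random(seed)
    fold_of = [None] * len(groups)
    for label in sorted(set(groups)):
        members = [i for i, g in enumerate(groups) if g == label]
        rng.shuffle(members)
        for position, subject in enumerate(members):
            fold_of[subject] = position % k
    return fold_of

# Group labels matching the reported sample sizes (106 young, 153 older).
groups = ["young"] * 106 + ["older"] * 153
folds = stratified_folds(groups, k=10)

for fold in range(10):
    test_idx = [i for i, f in enumerate(folds) if f == fold]
    train_idx = [i for i, f in enumerate(folds) if f != fold]
    # A linear SVM (C = 1) would be trained on train_idx and applied to test_idx here.
    n_young = sum(groups[i] == "young" for i in test_idx)
    print(fold, len(train_idx), len(test_idx), n_young)
```

Each subject appears in exactly one test fold, so every prediction is made by a model that was not trained on that subject.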

Distributional transformation
When predicting chronological age and memory performance, distributional transformation (DT) was applied to preserve the observed distribution of the target variable (Soch, 2020). DT is a postprocessing operation that maps predicted values to the variable's distribution in the training data and can improve prediction precision.
For example, memory measured as AUC always falls into the range between 0 and 1, but a trained SVM may also return values smaller than 0 or larger than 1. In this case, DT brings predicted values into the natural range of the target variable while keeping the ranks of all predicted values identical before and after transformation (Soch, 2020). The same holds when predicting age, which was always between 18 and 80 years in our study. For subgroup analyses, only the age range of the respective group (young vs older) was applied.
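The idea behind DT can be illustrated with a simplified rank-based sketch: each predicted value is replaced by the training-set value at the matching quantile, so predictions stay ordered but fall within the range observed in the training data. This is an assumption-laden simplification for illustration, not the implementation described by Soch (2020); the function name and the example values are hypothetical.

```python
def distributional_transformation(predictions, training_targets):
    """Map predictions onto the empirical distribution of the training targets.

    Each prediction is replaced by the training-target value whose quantile
    matches the prediction's rank, so the ordering of predictions is preserved.
    """
    sorted_targets = sorted(training_targets)
    n_pred, n_train = len(predictions), len(sorted_targets)
    order = sorted(range(n_pred), key=lambda i: predictions[i])
    transformed = [None] * n_pred
    for rank, index in enumerate(order):
        # Mid-rank quantile (rank + 0.5) / n_pred, converted to an index into
        # the sorted training targets using integer arithmetic.
        position = ((2 * rank + 1) * n_train) // (2 * n_pred)
        transformed[index] = sorted_targets[position]
    return transformed

# Hypothetical AUC values (not real data): raw SVR output may leave [0, 1].
train_auc = [0.62, 0.70, 0.75, 0.81, 0.88, 0.93]
raw_pred = [1.07, 0.55, -0.02]
print(distributional_transformation(raw_pred, train_auc))
```

In this toy example, the out-of-range predictions 1.07 and -0.02 are mapped to values inside the training range, while the largest prediction remains the largest after transformation.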

Performance assessment
The prediction precision was assessed using balanced accuracy (BA; ranging between 0 and 1) when decoding age group, i.e., by averaging the decoding accuracies for young and older subjects (Brodersen et al., 2010), and using correlation coefficients (ranging between -1 and +1) when predicting chronological age and memory performance, i.e., as the sample correlation coefficient between actual and predicted values of those variables. For each precision measure, a 90% confidence interval (CI) was established. CIs were generated using the MATLAB functions binofit for accuracies (assuming that the numbers of correct predictions are binomially distributed with unknown success probability) and corrcoef for correlations (assuming that actual and predicted continuous variables are linearly related).
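Both precision measures can be sketched in a few lines. The balanced accuracy follows the definition given above; the correlation CI uses the Fisher z-transformation, which is assumed here to approximate what corrcoef returns. The exact binomial interval from binofit is not reproduced, and all function names and labels are hypothetical.

```python
import math
from statistics import NormalDist

def balanced_accuracy(true_labels, predicted_labels):
    """Average of the per-class decoding accuracies."""
    per_class = []
    for c in sorted(set(true_labels)):
        idx = [i for i, t in enumerate(true_labels) if t == c]
        per_class.append(sum(predicted_labels[i] == c for i in idx) / len(idx))
    return sum(per_class) / len(per_class)

def correlation_ci(r, n, level=0.90):
    """Confidence interval for a correlation via the Fisher z-transformation."""
    z = math.atanh(r)
    margin = NormalDist().inv_cdf(0.5 + level / 2) / math.sqrt(n - 3)
    return math.tanh(z - margin), math.tanh(z + margin)

# Hypothetical labels: 3 of 4 young and 5 of 6 older subjects decoded correctly.
true = ["young"] * 4 + ["older"] * 6
pred = ["young", "young", "young", "older"] + ["older"] * 5 + ["young"]
print(balanced_accuracy(true, pred))

# 90% CI for a correlation of 0.48 in a sample of 153 subjects.
print(correlation_ci(0.48, 153))
```

Averaging per-class accuracies keeps the chance level at 0.5 even though the two age groups differ in size (106 vs 153 subjects).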
When predicting chronological age and memory performance, we additionally calculated absolute errors (AE) between predicted and actual target values and submitted them to Wilcoxon signed-rank tests to check for significant reduction of the mean AE (MAE) from one feature set to another. This nonparametric test was chosen because of the presumably non-normal distribution of AEs. For each target variable, AEs of the feature set with the highest correlation coefficient were compared against AEs of each other feature set to test whether performances of the feature sets were significantly different from that of the most predictive feature set (Fig. 3).

Table 3. Source variables for prediction analyses.
Source variable | Notation | Description
behavioral response frequencies | | proportion of old items replied to with 1, ..., 5 and proportion of new items replied to with 1, ..., 5
fMRI summary statistics | $y_{i1}, \ldots, y_{i4} \in \mathbb{R}$ | two scores (FADE-classic, FADE-SAME) computed from two fMRI contrasts (novelty processing, subsequent memory)
fMRI contrast images | $Y_i \in \mathbb{R}^v$ | voxel-wise fMRI contrasts computed in SPM, representing activations related to novelty processing (novel images − master images) or subsequent memory (PM with memory response)
resting-state fMRI maps | $Y_i \in \mathbb{R}^v$ | voxel-wise PerAF (mPerAF) computed using the REST toolbox, based on fMRI signals measured during a resting-state session
structural MRI maps | $Y_i \in \mathbb{R}^v$ | voxel-wise gray matter volumes computed in CAT12, based on each subject's T1 image
FADE = functional activity deviation during encoding, SAME = similarities of activations during memory encoding, $\mathbb{R}$ = real numbers, v = number of (in-mask) voxels.

Results
Chronological age is best predicted from structural MRI maps
The age group a subject belonged to (young vs older subjects) could be predicted from all feature sets with above-chance decoding accuracy (see Extended Data Fig. 2-1). The highest accuracy was obtained with GMV maps (BA = 96.01%; CI = [0.931, 0.976]) and the lowest accuracy was obtained with response frequencies to old items (BA = 59.68%, CI = [0.542, 0.646]).
When predicting chronological age (in years) across all subjects, we found significant correlations for all feature sets (see Fig. 2A; old items: r = 0.40; GMV maps: r = 0.95). However, this was mainly attributable to the inherent correlation between chronological age and age group (see Materials and Methods, Participants), such that decoding age group is already a good predictor for chronological age. Therefore, we performed the same analyses separately within young subjects (18-35 years) and within older subjects (51-80 years).
In young subjects, chronological age could only be reconstructed from whole-brain GMV maps.

Dependent memory performance is best predicted from task-based fMRI
Similar to chronological age, memory performance (AUC) across all subjects could be predicted from all feature sets (see Fig. 3A; GMV maps: r = 0.13; SAME scores: r = 0.48). [Note that we are here not using behavioral data as source variables, because the target variable of memory performance is a mathematical function of the behavioral response frequencies. For this reason, prediction from response frequencies to all items would reach ceiling performance and is not shown.] However, as memory performance is also strongly influenced by age group, with young subjects performing significantly better than older subjects (young: $\mu_1$ = 0.82; older: $\mu_2$ = 0.77; effect size: d = 0.72; two-sample t test: t = 5.67, p < 0.001), we again analyzed this target variable separately within young and older subjects, respectively.
In both age groups, memory performance predicted by GMV maps was not correlated with actual memory performance (young: r = 0.11; older: r = 0.11). Instead, memory performance was best predicted by the fMRI memory contrast in young subjects (see Fig. 3B; r = 0.19, CI = [0.032, 0.342]) and the SAME scores in older subjects (see Fig. 3C; r = 0.53, CI = [0.421, 0.616]). Note that the predictive accuracy when predicting from just four single-value fMRI scores (FADE and SAME: r = 0.48, CI = [0.368, 0.575]) was better than using two whole-brain task-based fMRI contrasts (novelty and memory: r = 0.35, CI = [0.227, 0.461]).

Table 4. Feature sets used for prediction analyses.
Short name | Long name | Number of features | Description
 | novelty contrast | v | whole-brain novelty contrast maps
mem. | memory contrast | v | whole-brain memory contrast maps
both | nov. and mem. | 2v | whole-brain novelty and memory contrast maps
mPerAF | mPerAF maps | v | whole-brain percent amplitude fluctuation maps
GMV | GMV maps | v | whole-brain gray matter volume maps
all | all features | 4v+12 | all unique features listed in this table
Short and long feature set names are used as x-axis labels on Figures 2-5. The number of features corresponds to the number of columns in the data matrix used for prediction. FADE = functional activity deviation during encoding, SAME = similarities of activations during memory encoding, v = number of (in-mask) voxels.

Figure 2. Prediction of chronological age from different feature sets. Bar plots show correlation coefficients for predicting chronological age (in years; A) across all subjects, (B) in young subjects only, or (C) in older subjects only from behavioral data (red), fMRI scores (magenta), task-based fMRI contrasts (blue), resting-state fMRI maps (cyan) and structural MRI (green), or all features (yellow). Error bars denote 90% CIs; x-axis labels are explained in Table 4. The feature set with the highest predictive correlation is denoted with an "o"; other feature sets are labeled with asterisks to indicate significantly different MAE (*p < 0.05, **p < 0.01, ***p < 0.001, otherwise not significant). For classification of age group from these features, see Extended Data Figure 2-1.

Research Article: New Research
Independent memory performance is best predicted from single-value fMRI scores
When predicting independent measures of memory performance (see Materials and Methods, Extraction of target variables; Table 2), we restrict the report of results to the older subjects, because those measures could not be reliably predicted at all in young subjects (see Extended Data Fig. 4-1), probably because of the lower variation in their close-to-ceiling memory performance.
Generally, the prediction of memory performance in independent tests was less accurate than that of behavioral memory performance in the fMRI task itself (compare Figs. 4 and 3C). Besides this, outcomes from all memory tests are best predicted by the SAME scores (see Fig. 4A). Moreover, there appears to be a dissociation by type of memory test. Whereas performance in the verbal-semantic VLMT could be predicted from behavioral responses to old items, but not from task-based fMRI contrast maps, the reverse pattern was seen for performance in the auditory-episodic WMS (see Fig. 4, red and blue bars). [This is presumably because the verbal-semantic VLMT includes a distractor list and the distractors act similarly to the new items in the FADE task, requiring subjects to decide during item retrieval whether an item they remember was in the target list or the distractor list. This similarity of discrimination requirements might induce a correlation between the number of old items recalled (VLMT) and the fraction of old images recognized (FADE), leading to a significant predictive correlation. This interpretation would be in line with a two-process model for recognition and retrieval (Anderson and Bower, 1972), which points out the importance of contextual information, e.g., distractor lists during learning (Cox and Dobbins, 2011).] Notably, the two SAME scores and all four fMRI-based scores were the only feature sets that allowed for above-chance prediction of all four independent measures of memory performance (see Fig. 4, magenta bars).

Effects of age and memory are specific to structural MRI versus fMRI
To follow up on the findings of predictive analyses, especially the differences in predicting participants' age versus memory (compare Figs. 2C and 3C), we explicitly compared functional and structural MRI data in older subjects using subgroup analyses. To this end, we partitioned all older subjects into four groups based on (1) chronological age, separating into "young" and "old" older subjects; and (2) memory performance, separating higher from lower memory performance subjects (see Extended Data Fig. 5-1). Then, the voxel-wise data of the quarter with the lowest values and the quarter with the highest values were submitted to second-level two-sample t tests in SPM. This analysis was performed for both fMRI contrasts, mPerAF maps and GMV maps. Thresholded statistical parametric maps were FWE-cluster-corrected (cluster-defining threshold, CDT: p < 0.001, k = 0), resulting in a minimum cluster size for each analysis [novelty: k = 42; memory: k = 27; mPerAF: k = 23; GMV: k = 33 (separating by age) and k = 42 (separating by memory); see Fig. 5].

Figure 3. Reconstruction of memory performance from different feature sets. Bar plots show correlation coefficients for predicting memory performance (AUC; A) across all subjects, (B) in young subjects only, or (C) in older subjects only from fMRI scores (magenta), task-based fMRI contrasts (blue), resting-state fMRI maps (cyan) and structural MRI (green), or all features (yellow). Note that memory performance can be directly derived from behavioral data, which is why the corresponding prediction analyses were not performed. The layout follows that of Figure 2.
Taken together, we observed a double dissociation of structural MRI versus task-based fMRI and age versus memory, in the sense that (1) when partitioning subjects by chronological age, there were significant effects on structural MRI (see Fig. 5A); and (2) when partitioning subjects by memory performance, there were significant effects on task-based fMRI (see Fig. 5B); at the same time, there were no age-related differences with respect to task-based fMRI and no memory-related differences with respect to structural MRI. Resting-state fMRI maps showed differences between younger and older subjects, but not between those with high versus low memory performance (see Fig. 5, third row), suggesting that their informational content is closer to structural MRI than to task-based fMRI.

Single-value fMRI scores have moderate predictive utility
To assess the predictive utility of fMRI summary statistics, we used FADE and SAME scores computed from novelty and memory contrasts (i.e., four features; compare Table 4) and evaluated the precision by which these scores predict memory performance in two ways.
First, we compared predicted with actual values when reconstructing AUC in the fMRI memory paradigm from FADE and SAME scores (compare Fig. 3B,C). In older subjects, there was a correlation of 0.48 (p < 0.001) and AUC could be predicted with a MAE of 0.06 (see Fig. 6B). For comparison, the same correlation was 0.17 (p = 0.082) with an MAE of 0.08 in young subjects (see Fig. 6A).
Second, we tested how well subgroups of the older subjects formed for the previous analysis (see above, Effects of age and memory are specific to structural MRI versus fMRI; compare Fig. 5A,B) could be classified from fMRI scores. When classifying older subjects with lower versus higher memory performance based on FADE and SAME scores (N = 76), the decoding accuracy was 72.37% (sensitivity: 76.32%; specificity: 68.42%). For comparison, the decoding accuracy was 84.93% (sensitivity: 81.08%; specificity: 88.89%) when classifying "old" versus "young" older subjects based on GMV maps (N = 73).

Discussion
In the present study, we have comparatively evaluated the ability of structural and functional (resting-state and task-based) MRI data as well as behavioral measures to predict chronological age versus memory performance in young and older healthy adults (see Fig. 1). While all modalities could predict age group, within-group prediction of age and memory performance revealed distinct patterns. Among young and older subjects, chronological age was best predicted by structural MRI and also resting-state fMRI (see Fig. 2B,C), whereas memory performance was best predicted by fMRI contrasts (novelty and subsequent memory effects) and especially single-value fMRI-based scores (see Figs. 3C,4) in older participants only.

Prediction of chronological age from structural MRI
All of the candidate predictors employed in the present study have previously been shown to exhibit age-related differences: (1) behavioral memory responses are different between age groups, with older adults producing more false positives, which reduces memory performance (cf. Soch et al., 2021a, Tab. S2; also see Duarte et al., 2010); (2) memory-related fMRI responses differ between age groups, with older adults showing reduced parahippocampal activations and reduced default mode network (DMN) deactivations during novelty processing and subsequent memory (cf. Soch et al., 2021b, their Fig. 2; also see Maillet and Rajah, 2014; Billette et al., 2022); (3) resting-state fMRI patterns exhibit global age-related differences (Foo et al., 2021; Xing, 2021); and (4) quantitative structural MRI approaches like VBM yield robust and well-replicated age-related differences, with older adults showing reduced hippocampal volumes (cf. Kizilirmak et al., 2022; also see Veldsman et al., 2021) as well as reduced cortical and subcortical GMV, particularly in structures of the human memory network like the medial temporal lobe (Schiltz et al., 2006; Minkova et al., 2017).
Figure 5. Differential effects of age and memory in structural MRI and fMRI. Significant differences (A) between "young" and "old" older subjects and (B) between older subjects with higher versus lower memory performance, with respect to fMRI activity during novelty processing (first row), subsequent memory (second row), fMRI amplitudes during rest (third row), and voxel-wise gray matter volume (fourth row). Thresholded SPMs are FWE-corrected for cluster size (CDT: p < 0.001, k = 0). Colored voxels indicate significantly higher values for either young subjects and those with higher memory performance (red) or old subjects and those with lower memory performance (blue). For distributions of chronological age and memory performance underlying these analyses, see Extended Data
In line with the aforementioned observations, all variables could discriminate between age groups, but within the group of older adults, a distinct pattern emerged regarding the prediction of chronological age and memory performance, respectively. Chronological age was best predicted from voxel-wise GMV, reflecting the well-replicated observation that both cortical and subcortical GM show age-related volume loss (Minkova et al., 2017; Soch, 2020; Veldsman et al., 2021), which is, longitudinally, already observable within a year's time (Fjell et al., 2009, 2013; Bagarinao et al., 2022). Predictive correlation of whole-brain GMV and chronological age within the group of older adults was, however, only moderate, most likely reflecting the considerable interindividual variability in age-related structural brain changes. This phenomenon has in fact been conceptualized within the brain-age framework, a widely researched approach to employ differences between predicted brain age and chronological age as a biomarker for brain health in aging (Cole and Franke, 2017; Bashyam et al., 2020). Including other predictors in the model did not improve age prediction among older adults (Fig. 2C), suggesting that the biological information actually predicting chronological rather than brain age might be limited.
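To illustrate the two quantities discussed above, the following minimal sketch computes a predictive correlation (Pearson correlation between actual and predicted age) and the per-subject difference between predicted and chronological age, as used in the brain-age framework. All values and function names are hypothetical and serve illustration only; they are not data or code from the present study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def brain_age_gaps(predicted, chronological):
    """Predicted minus chronological age; positive values mean an 'older-looking' brain."""
    return [p - c for p, c in zip(predicted, chronological)]


# Hypothetical example values (assumption), in years.
chronological_age = [62.0, 66.0, 70.0, 74.0, 78.0]
predicted_age = [66.0, 64.0, 71.0, 70.0, 77.0]

r = pearson_r(chronological_age, predicted_age)
gaps = brain_age_gaps(predicted_age, chronological_age)
print(f"predictive correlation r = {r:.2f}")
print("brain-age gaps:", gaps)
```

A prediction that is imperfect in terms of chronological age therefore need not be uninformative: in this framework, the residual itself is interpreted as interindividual variability in brain aging.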
In a recent competition to predict chronological age from structural neuroimaging (Fisch et al., 2021), the winning performance, an MAE of 2.90 years, was achieved using lightweight 3D convolutional neural networks (Gong et al., 2021). Moreover, it was shown that DT can improve the MAE by about half a year, using the distribution of the target values in the training data (Soch, 2020), an approach that was also used in the present study (see Materials and Methods, Distributional transformation).
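The idea behind DT can be sketched as follows, assuming that it amounts to mapping each predicted value onto the empirical distribution of the training targets via its rank among the predictions (quantile mapping). This is an illustrative simplification with hypothetical function names and toy values, not the implementation used in the present study or in Soch (2020).

```python
def distributional_transformation(predictions, train_targets):
    """Map predictions onto the empirical distribution of the training targets.

    Assumption: rank-based quantile mapping, i.e. each prediction is replaced by
    the training-target value found at the same empirical quantile.
    """
    sorted_targets = sorted(train_targets)
    n_pred = len(predictions)
    n_train = len(sorted_targets)
    # Indices of the predictions, ordered from smallest to largest prediction.
    order = sorted(range(n_pred), key=lambda i: predictions[i])
    transformed = [0.0] * n_pred
    for rank, idx in enumerate(order):
        quantile = (rank + 0.5) / n_pred                    # empirical CDF of predictions
        position = min(int(quantile * n_train), n_train - 1)
        transformed[idx] = sorted_targets[position]         # inverse CDF of training targets
    return transformed


def mean_absolute_error(actual, predicted):
    """Mean absolute error (MAE) between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy example (assumption): predictions are compressed toward the mean age,
# a common pattern in regression-based age prediction.
train_ages = [60, 65, 70, 75, 80]
actual_ages = [60, 65, 70, 75, 80]
raw_predictions = [68, 69, 70, 71, 72]

dt_predictions = distributional_transformation(raw_predictions, train_ages)
mae_raw = mean_absolute_error(actual_ages, raw_predictions)
mae_dt = mean_absolute_error(actual_ages, dt_predictions)
print("MAE before DT:", mae_raw)
print("MAE after DT:", mae_dt)
```

The toy example is deliberately idealized: the rank order of the predictions is perfect, so restoring the spread of the training distribution removes the error entirely. With real data, the transformation can only correct the distribution of the predicted values, not their rank order.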

fMRI as predictor of cognitive performance in old age
Unlike chronological age, memory performance could not be reliably predicted from GMV. This is compatible with the fact that in previous studies, we found no correlations between hippocampal volume and our task-based fMRI summary statistics for both hemispheres, using two scores, computed from two contrasts (cf. Soch et al., 2021b, their Fig. 4). It is also supported by another study, in which a combination of ApoE genotype and task-based fMRI was identified as the best predictor of cognitive decline in healthy older adults (Woodard et al., 2010). In line with those findings, we here observed that memory performance could be predicted from single-value fMRI scores (see Fig. 4), especially when extracting both FADE and SAME scores, from both novelty and memory contrasts (Soch et al., 2021b).
It should be noted that the cognitive task underlying our fMRI data set (incidental encoding of visual scenes) in fact targeted declarative long-term memory. In this respect, the high predictive value of functional measures derived from activity during such a task (i.e., fMRI novelty and memory contrast maps, FADE and SAME scores) for other measures of declarative memory appears to be a natural outcome, as these measures target the to-be-predicted variable more specifically than GMV or mPerAF. The same is true for the study of Woodard and colleagues, in which participants encoded names (famous vs unfamiliar names) and the independent measures of cognitive decline comprised different types of neuropsychological memory assessments. On the other hand, we could recently show that, while the scores derived from the novelty contrast were rather specifically associated with tests of explicit memory, the scores computed from the memory contrast were also associated with measures of global cognition (Richter et al., 2022). More generally, our findings are in line with the notion that cognitive reserve may to a certain degree be independent from structural age-related changes of the brain (Nyberg et al., 2012).
Figure 6. Prediction of memory performance from single-value fMRI scores. Scatter plots of actual versus predicted memory performance when reconstructing memory performance from FADE and SAME scores (see Fig. 3, magenta bars) in (A) young subjects and (B) older subjects. r = correlation coefficient, MAE = mean absolute error, ***p < 0.001.

Informational content of resting-state maps
It is also noteworthy that resting-state fMRI behaved more similarly to structural MRI than to task-based fMRI, with BA for mPerAF maps being close to that of GMV maps (see Extended Data Fig. 2-1) and mPerAF similarly predicting chronological age (see Fig. 2C), but not capturing memory performance in older subjects (see Fig. 3C). This suggests that at least voxel-wise mPerAF maps derived from resting-state fMRI provide information that is closer to the brain-anatomic information of structural MRI maps than to the neural-processing information of task-based fMRI contrasts. This is compatible with the line of thought discussed above: while task-based fMRI measures provide informational value for cognitive performance measures, especially when the fMRI task falls into the same cognitive domain as the to-be-predicted performance indicator, resting-state fMRI measures appear to reflect brain integrity more generally (Mevel et al., 2011).

Successful aging, brain structural integrity, and memory performance
Overall, our results suggest that successful aging, that is, relatively preserved memory in healthy older adults, may not be primarily attributable to lower gray matter loss, but rather to better preserved functional brain networks, as evident in a higher similarity of memory-related brain activity with that of young adults (see Fig. 5). This might be different in pathologic aging when brain anatomy is affected to a larger extent but is compatible with earlier studies suggesting that in healthy older adults, functional neurocognitive resources may be more important for cognitive performance than structural measures of brain integrity (Scarmeas et al., 2003; Stern, 2009, 2012; Cabeza et al., 2018).
The observation that structural MRI had no predictive power for memory performance in our study may at first seem surprising, given that there are very large differences with respect to GMV between young and older adults (Farokhian et al., 2017), who typically also differ with respect to memory performance (Soch et al., 2021a; Richter et al., 2022). One potential explanation for this finding may be that, in our study, the sample investigated consisted of neurologically and psychiatrically healthy older adults without signs of cognitive impairment. This suggests that brain atrophy (i.e., structural volume loss) may to some extent occur invariably with increasing age, but does not necessarily affect cognitive performance as long as (1) the degree is still within the bounds of normal aging and (2) it is not accompanied by functional processing changes (reflected in fMRI scores), potentially because of compensatory mechanisms (Kizilirmak et al., 2021). This is in line with previous studies that reported a decoupling between gray and white matter measures and memory performance in older age (de Mooij et al., 2018), underscoring that cognitive maintenance or reserve is, at least to a degree, independent of neural maintenance. A large meta-analysis also highlights the lack of a strong dependency between structural and cognitive decline (Oschwald et al., 2019), suggesting that the healthy aging brain possesses a considerable potential to compensate for inevitable age-related structural decline (Stern, 2009; Nyberg et al., 2012; Cabeza et al., 2018).
In conclusion, we have shown a systematic difference in predictive ability between structural MRI markers (and resting-state fMRI) on the one hand and task-based fMRI markers (especially fMRI summary statistics) on the other hand. Whereas the former are most strongly related to chronological age, reflecting the mere progression of time, the latter allow better prediction of cognitive performance in episodic memory. In a sense, this double dissociation supports the concept of cognitive reserve as a phenomenon that may to some degree be independent from structural brain aging. Further research has to elucidate the sources of preserved memory performance in older adults who show structural degradation but functional maintenance.