Elsevier

NeuroImage

Volume 101, 1 November 2014, Pages 390-403
Multisite longitudinal reliability of tract-based spatial statistics in diffusion tensor imaging of healthy elderly subjects

https://doi.org/10.1016/j.neuroimage.2014.06.075

Highlights

  • We implement a multi-site 3 T MRI protocol for brain DTI on 10 EU sites.

  • We acquire across-session test–retest data on 50 healthy elderly subjects.

  • We use full brain TBSS and ROI analysis to calculate FA, MD, RD and AD.

  • Reproducibility errors are in the 2–6% range.

  • Reproducibility errors tended to be lower in sites with shorter acquisitions.

Abstract

Large-scale longitudinal neuroimaging studies with diffusion imaging techniques are necessary to test and validate models of white matter neurophysiological processes that change in time, both in healthy and diseased brains. The predictive power of such longitudinal models will always be limited by the reproducibility of repeated measures acquired during different sessions. At present, there is limited quantitative knowledge about the across-session reproducibility of standard diffusion metrics in 3 T multi-centric studies on subjects in stable conditions, in particular when using tract-based spatial statistics and with elderly people. In this study we implemented a multi-site brain diffusion protocol in 10 clinical 3 T MRI sites distributed across 4 countries in Europe (Italy, Germany, France and Greece) using vendor-provided sequences from Siemens (Allegra, Trio Tim, Verio, Skyra, Biograph mMR), Philips (Achieva) and GE (HDxt) scanners. We acquired DTI data (2 × 2 × 2 mm³, b = 700 s/mm², 5 b0 and 30 diffusion-weighted volumes) from a group of healthy stable elderly subjects (5 subjects per site) in two separate sessions at least a week apart. For each subject and session four scalar diffusion metrics were considered: fractional anisotropy (FA), mean diffusivity (MD), radial diffusivity (RD) and axial diffusivity (AD). The diffusion metrics from multiple subjects and sessions at each site were aligned to their common white matter skeleton using tract-based spatial statistics. The reproducibility at each MRI site was examined by looking at group averages of absolute changes relative to the mean (%) on various parameters: i) reproducibility of the signal-to-noise ratio (SNR) of the b0 images in centrum semiovale, ii) full brain test–retest differences of the diffusion metric maps on the white matter skeleton, iii) reproducibility of the diffusion metrics on atlas-based white matter ROIs on the white matter skeleton.
Despite the differences of MRI scanner configurations across sites (vendors, models, RF coils and acquisition sequences) we found good and consistent test–retest reproducibility. White matter b0 SNR reproducibility was on average 7 ± 1% with no significant MRI site effects. Whole brain analysis resulted in no significant test–retest differences at any of the sites with any of the DTI metrics. The atlas-based ROI analysis showed that the mean reproducibility errors largely remained in the 2–4% range for FA and AD and 2–6% for MD and RD, averaged across ROIs. Our results show reproducibility values comparable to those reported in studies using a smaller number of MRI scanners, slightly different DTI protocols and mostly younger populations. We therefore show that the acquisition and analysis protocols used are appropriate for multi-site experimental scenarios.
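
The reproducibility measure named in the abstract (group averages of absolute changes relative to the mean, in %) can be illustrated with a minimal sketch. It assumes the "mean" is the average of the two session values for each subject, which is one plausible reading of the abstract and not confirmed by this excerpt. The function names and the FA values below are hypothetical and serve only as an illustration.

```python
from statistics import mean


def percent_abs_change(test, retest):
    """Absolute test-retest difference relative to the two-session mean, in percent.

    Assumption: the reference "mean" is the average of the two sessions.
    """
    session_mean = (test + retest) / 2.0
    return 100.0 * abs(test - retest) / session_mean


def site_reproducibility_error(pairs):
    """Group average of the per-subject percent absolute changes for one site."""
    return mean(percent_abs_change(t, r) for t, r in pairs)


# Hypothetical (session 1, session 2) FA values for five subjects at one site.
fa_pairs = [(0.45, 0.46), (0.50, 0.49), (0.42, 0.44), (0.48, 0.48), (0.47, 0.45)]
site_error = site_reproducibility_error(fa_pairs)
print(f"Site reproducibility error: {site_error:.2f}%")
```

The same function could be applied per ROI or per skeleton voxel before averaging across subjects; which aggregation order the study used is not stated in this excerpt.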

Introduction

Diffusion tensor imaging (DTI) is a quantitative MRI technique widely used for the in vivo characterization of white matter microstructural organization (Ciccarelli et al., 2008, Mori and Zhang, 2006). DTI can be applied to investigate both normal and pathological conditions, and in longitudinal studies it can measure changes of white matter tissue properties in normal aging (Lebel and Beaulieu, 2011, Sullivan and Pfefferbaum, 2007, Sullivan et al., 2010, Westlye et al., 2010) as well as in brain diseases such as Alzheimer's disease (Kantarci et al., 2010, Mielke et al., 2009, Scola et al., 2010, Teipel et al., 2010), Huntington's disease (Magnotta et al., 2009, Sritharan et al., 2010, Weaver et al., 2009), multiple sclerosis (Calabrese et al., 2011, Harrison et al., 2011, Rashid et al., 2008, Sage et al., 2009), stroke recovery (Wang et al., 2006) and traumatic brain injury (Sidaros et al., 2008). Such longitudinal DTI studies can be used to test and develop DTI-based biomarker models of disease progression/recovery, which may be of great utility in better understanding physiopathology as well as for evaluating therapeutic effects.

DTI allows the description of tissue microstructures modeling the Gaussian diffusion properties of water and the detection of white matter lesions (Basser and Pierpaoli, 1996). The most commonly used DTI metrics in clinical studies are fractional anisotropy (FA) and mean diffusivity (MD). Complementary information about white matter structure can be obtained from axial (AD) and radial (RD) diffusivity which, with some limitations, are considered indices of axonal injury and demyelination, respectively (Song et al., 2005, Wheeler-Kingshott and Cercignani, 2009). In addition to these diffusion metrics, orientation information in white matter tracts can be obtained using more advanced DTI acquisition and analysis methods, for example with probabilistic tractography (Behrens et al., 2003, Parker et al., 2003), diffusion spectrum imaging (Wedeen et al., 2005) and high angular resolution methods (Wedeen et al., 2008). These methods, however, typically require longer acquisition times and/or specialized MRI sequences not always available on clinical scanners, and their implementations can therefore be challenging in large multi-centric longitudinal studies, particularly when involving elderly subjects. For these reasons this study focuses on standard DTI acquisitions and their scalar derived metrics (FA, MD, AD, RD).
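
As an illustration of how the four scalar metrics relate to the diffusion tensor, the sketch below computes FA, MD, AD and RD from the three tensor eigenvalues using the standard textbook definitions. It is not the authors' processing pipeline; the function name and the example eigenvalues are hypothetical.

```python
import math


def dti_scalars(l1, l2, l3):
    """Return (FA, MD, AD, RD) from tensor eigenvalues sorted so that l1 >= l2 >= l3."""
    md = (l1 + l2 + l3) / 3.0   # mean diffusivity: average of the eigenvalues
    ad = l1                     # axial diffusivity: largest eigenvalue
    rd = (l2 + l3) / 2.0        # radial diffusivity: mean of the two smaller eigenvalues
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    # Fractional anisotropy: normalized dispersion of the eigenvalues, in [0, 1].
    fa = math.sqrt(1.5 * num / den) if den > 0 else 0.0
    return fa, md, ad, rd


# Hypothetical eigenvalues in mm^2/s for an anisotropic white matter voxel.
fa, md, ad, rd = dti_scalars(1.7e-3, 0.3e-3, 0.3e-3)
print(f"FA={fa:.3f}  MD={md:.2e}  AD={ad:.2e}  RD={rd:.2e}")
```

With equal eigenvalues (isotropic diffusion) FA is 0, and with a single non-zero eigenvalue FA is 1, which is why FA is commonly read as an index of directional coherence.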

Longitudinal multi-center MRI studies are becoming an increasingly common strategy to collect large datasets by distributing the data acquisition load across multiple partners (Van Horn and Toga, 2009). Moreover, longitudinal studies reduce between-subject variability because each subject serves as his/her own control. One critical factor that limits the sensitivity to detect changes in any longitudinal study is the reproducibility of repeated measures. Obtaining reproducible quantitative results from DTI data is not trivial given that the final results are sensitive to a large number of acquisition and analysis factors (Jones and Cercignani, 2010). Various aspects of DTI reproducibility have been investigated, including basic reproducibility measures of different populations (Bonekamp et al., 2007, Ciccarelli et al., 2003, Heiervang et al., 2006, Marenco et al., 2006), evaluation of the effects of region of interest (ROI) drawing protocols (Wakana et al., 2007), effects of signal averaging (Farrell et al., 2007), head motion effects (Yendiki et al., 2013), as well as the effects of various acquisition parameters such as b-value (Bisdas et al., 2008), diffusion weighting scheme (Landman et al., 2007, Vaessen et al., 2010), voxel size (Papinutto et al., 2013), and MRI scanner effects (Brander et al., 2010, Pagani et al., 2010, Pfefferbaum et al., 2003, Vollmar et al., 2010, White et al., 2011, Zhu et al., 2011).

However, despite the wide use of DTI as a tool to assess white matter integrity in 3 T MRI studies, across-session test–retest reliability of diffusion measures on subjects in stable conditions has not been thoroughly investigated using multiple MRI systems. Across-session reproducibility is useful to estimate the effective reproducibility errors that are part of a longitudinal study, since across-session acquisitions include additional sources of variance like MRI system instabilities, differences in head positioning and re-positioning within the RF coil, differences in automated acquisition procedures like auto shimming, as well as potential effects from how different operators follow instructions to execute the same acquisition protocol. These variability sources are negligible in within-session reproducibility studies. Table 1 outlines studies that, to the best of our knowledge, have reported across-session test–retest reproducibility measures of diffusion data derived from adult healthy volunteers using 3 T systems. These studies are limited to few sites with identical 3 T scanners (Huang et al., 2012, Takao et al., 2012, Vollmar et al., 2010), focused mainly on young subjects (< 40 years, except Takao et al., 2012), using DTI analysis mostly based on manual ROIs (Bisdas et al., 2008, Jansen et al., 2007) or aimed at evaluating MRI software upgrade effects (Fox et al., 2012). In other words, the impact of across-session reproducibility errors of DTI metrics derived from multi-site longitudinal 3 T studies is not clearly defined, in particular with the commonly used tract-based spatial statistics (TBSS) analysis (Smith et al., 2006). TBSS is particularly attractive for longitudinal voxel-wise analysis of DTI data given that individual diffusion parameter maps are projected onto a group-wise skeleton constructed from FA data to account for residual misalignments among individual white matter tracts in multiple measures and multiple subjects. 
These issues are relevant to the PharmaCog project, a new industry-academic European project aimed at identifying biomarkers sensitive to symptomatic and disease modifying effects of drugs for Alzheimer's disease (http://www.alzheimer-europe.org/FR/Research/PharmaCog).

The aims of the present study were the following: i) to implement a multi-site 3 T MRI data acquisition protocol for diffusion analysis (10 different MRI sites covering three common clinical MRI vendors in Europe), ii) to acquire across-session test–retest data (2 acquisitions at least one week apart) from a population of healthy stable elderly subjects (5 subjects per MRI site), and iii) to evaluate and compare the across-session reproducibility of FA, MD, AD and RD within and across MRI sites using both voxel-based TBSS and atlas-based ROI analyses. This multi-site DTI study is unique in that it characterizes brain diffusion reproducibility metrics particularly relevant to multi-center longitudinal studies of brain disease (within-site across-session reproducibility in healthy elderly subjects) derived from both full brain voxel-based TBSS and atlas-based ROIs using a wide range of clinical 3 T scanners (Table 2). The test–retest raw DTI data from the 10 MRI sites are made publicly available (100 brain volumes).

Materials and methods

Several aspects of the subjects, study design and data preparation steps used in this diffusion study were already described in a recent morphometry study (Jovicich et al., 2013) but are here repeated for completeness and with the appropriate modifications.

Results

In this multi-site 3 T study (10 clinical scanners from different vendors) we estimated the test–retest reliability (within-site, across two separate sessions at least a week apart) of diffusion measures derived from healthy elderly DTI data (FA, MD, radial and axial diffusivities). Both full brain (TBSS) and ROI (atlas-based) approaches were used.

The 50 subjects enrolled (Table 2) were all Caucasian with similar age distributions except for site 6 (younger group, mean age 52.4 ± 1.5 

Discussion

In this PharmaCog Consortium study, we found that the test–retest reliability/variability of DTI metrics estimated with TBSS in a 3 T consortium using vendor-provided sequences is consistent across sites despite the heterogeneity of MRI scanner configurations. This suggests that pooling DTI data from longitudinal multi-site studies has a potential for accelerating the evaluation of biomarkers related to water diffusion changes. We had three main findings: (1) in a group of healthy stable elderly

Conclusions

Longitudinal multisite neuroimaging designs are typically used to identify differential tissue property changes associated with normal development, plasticity or disease progression/regression. The reliability of neuroanatomical measurements over time and across MRI sites is crucial for the statistical power of longitudinal studies. The main result of this multi-site study with ten 3 T MRI sites is that the across-session test–retest reproducibility/variability obtained with the protocol used

Acknowledgments

PharmaCog is funded by the EU-FP7 for the Innovative Medicines Initiative (grant no. 115009). All members of the PharmaCog project deserve sincere acknowledgement for their significant efforts, but unfortunately, they are too numerous to mention. The authors would like to especially thank the people who contributed to the early phases of this study, including Luca Venturi, Genoveffa Borsci, Thomas Günther, and Aurélien Monnet, as well as Alberto Redolfi for his support with the Intellimaker

References (76)

  • S. Marenco et al.

    Regional distribution of measurement error in diffusion tensor imaging

    Psychiatry Res.

    (2006)
  • M.M. Mielke et al.

    Regionally-specific diffusion tensor imaging in mild cognitive impairment and Alzheimer's disease

    NeuroImage

    (2009)
  • S. Mori et al.

    Principles of diffusion tensor imaging and its applications to basic neuroscience research

    Neuron

    (2006)
  • N.D. Papinutto et al.

    Reproducibility and biases in high field brain diffusion MRI: an evaluation of acquisition and analysis variables

    Magn. Reson. Imaging

    (2013)
  • S.M. Smith et al.

    Tract-based spatial statistics: voxelwise analysis of multi-subject diffusion data

    NeuroImage

    (2006)
  • S.K. Song et al.

    Demyelination increases radial diffusivity in corpus callosum of mouse brain

    NeuroImage

    (2005)
  • M.J. Vaessen et al.

    The effect and reproducibility of different clinical DTI gradient sets on small world brain connectivity measures

    NeuroImage

    (2010)
  • C. Vollmar et al.

    Identical, but not the same: intra-site and inter-site reproducibility of fractional anisotropy measures on two 3.0 T scanners

    NeuroImage

    (2010)
  • S. Wakana et al.

    Reproducibility of quantitative tractography methods applied to cerebral white matter

    NeuroImage

    (2007)
  • C. Wang et al.

    Longitudinal changes in white matter following ischemic stroke: a three-year follow-up study

    Neurobiol. Aging

    (2006)
  • K.E. Weaver et al.

    Longitudinal diffusion tensor imaging in Huntington's Disease

    Exp. Neurol.

    (2009)
  • V.J. Wedeen et al.

    Diffusion spectrum magnetic resonance imaging (DSI) tractography of crossing fibers

    NeuroImage

    (2008)
  • T. Zhu et al.

    Quantification of accuracy and precision of multi-center DTI measurements: a diffusion phantom and human brain study

    NeuroImage

    (2011)
  • A. Alhamud et al.

    Volumetric navigators for real-time motion correction in diffusion tensor imaging

    Magn. Reson. Med.

    (2012)
  • J.L.R. Andersson et al.

    Non-linear optimisation. FMRIB technical report

  • J.L.R. Andersson et al.

    Non-linear registration, aka spatial normalisation. FMRIB technical report

  • T.E. Behrens et al.

    Characterization and propagation of uncertainty in diffusion-weighted MR imaging

    Magn. Reson. Med.

    (2003)
  • S. Bisdas et al.

    Reproducibility, interrater agreement, and age-related changes of fractional anisotropy measures at 3 T in healthy subjects: effect of the applied b-value

    AJNR Am. J. Neuroradiol.

    (2008)
  • A. Brander et al.

    Diffusion tensor imaging of the brain in a healthy adult population: Normative values and measurement reproducibility at 3 T and 1.5 T

    Acta Radiol.

    (2010)
  • M. Calabrese et al.

    Cortical diffusion-tensor imaging abnormalities in multiple sclerosis: a 3-year longitudinal study

    Radiology

    (2011)
  • O. Dietrich et al.

    Measurement of signal-to-noise ratios in MR images: influence of multichannel coils, parallel imaging, and reconstruction filters

    J. Magn. Reson. Imaging

    (2007)
  • V. Drago et al.

    Disease tracking markers for Alzheimer's disease at the prodromal (MCI) stage

    J. Alzheimers Dis.

    (2011)
  • A. Engvig et al.

    Memory training impacts short-term changes in aging white matter: a longitudinal diffusion tensor imaging study

    Hum. Brain Mapp.

    (2012)
  • J.A. Farrell et al.

    Effects of signal-to-noise ratio on the accuracy and reproducibility of diffusion tensor imaging-derived fractional anisotropy, mean diffusivity, and principal eigenvector measurements at 1.5 T

    J. Magn. Reson. Imaging

    (2007)
  • E. Fieremans et al.

    Novel white matter tract integrity metrics sensitive to Alzheimer disease progression

    AJNR Am. J. Neuroradiol.

    (2013)
  • R.J. Fox et al.

    A validation study of multicenter diffusion tensor imaging: reliability of fractional anisotropy and diffusivity values

    AJNR Am. J. Neuroradiol.

    (2012)
  • S. Galluzzi et al.

    The Italian Brain Normative Archive of structural MR scans: norms for medial temporal atrophy and white matter lesions

    Aging Clin. Exp. Res.

    (2009)
  • D.M. Harrison et al.

    Longitudinal changes in diffusion tensor-based quantitative MRI in multiple sclerosis

    Neurology

    (2011)
    1

    These authors contributed equally to this work.
