## Abstract

Lesion-symptom mapping is often employed to define brain structures that are crucial for human behavior. Even though poststroke deficits result from gray matter damage as well as secondary white matter loss, the impact of structural disconnection is overlooked by conventional lesion-symptom mapping because it does not measure loss of connectivity beyond the stroke lesion. This study describes how traditional lesion mapping can be combined with structural connectome lesion symptom mapping (CLSM) and connectome dynamics lesion symptom mapping (CDLSM) to relate residual white matter networks to behavior. Using data from a large cohort of stroke survivors with aphasia, we observed improved prediction of aphasia severity when traditional lesion symptom mapping was combined with CLSM and CDLSM. Moreover, only CLSM and CDLSM disclosed the importance of temporal-parietal junction connections in aphasia severity. In summary, connectome measures can uniquely reveal brain networks that are necessary for function, improving the traditional lesion symptom mapping approach.

## Significance Statement

In this article, we present a description of how the individualized structural connectome can be measured from individuals with large brain lesions, leveraging recently described approaches to calculate network dynamics. We then tested whether structural and dynamic connectome measures could be combined with conventional lesion-symptom mapping to better predict and elucidate the neuroanatomy of poststroke aphasia. Using machine learning applied to a large dataset of stroke survivors, we not only observed that adding connectome measures to lesion mapping improves the prediction of aphasia severity, but connectome measures disclose crucial language networks, notably within the temporal parietal regions, that are not identified by conventional lesion mapping.

## Introduction

Aphasia is a disorder of language processing that commonly results from acquired brain damage (Damasio, 1992) but the determinants of aphasia severity are not yet completely understood. Infarct volume is a well-defined predictor of severity, but it can only partly explain the degree of language problems (Pedersen et al., 1995; Forkel et al., 2014). Likewise, the cumulative amount of damage to language related cortical structures is another important determinant of severity, but it is still limited in its ability to define long-term impairments (Plowman et al., 2012). Some persons with aphasia with relatively intact language-related cortical structures may exhibit profound language deficits (Fridriksson et al., 2007), whereas others with more extensive damage display mild deficits (Epstein-Peterson et al., 2012).

This discrepancy may arise from white matter damage beyond the stroke lesion. Extensive damage to white matter has been well-documented after ischemic strokes (Matute et al., 2013): the white matter receives relative less cerebrovascular perfusion compared to gray matter (Fisher, 2011), and its anaerobic resistance declines with aging (Hamner et al., 2011). Furthermore, Wallerian degeneration from the cortical site of injury can lead to remote loss of structural connectivity (Kuhn et al., 1989; Thomalla et al., 2005; Wang et al., 2012; Jones et al., 2013). Interestingly, poststroke white matter disconnection of seemingly spared cortical structures can lead to dysfunction of the disconnected cortex that is tantamount to cortical necrosis (Bonilha et al., 2014). Several other studies have indicated an important role of white matter connections in shaping cognitive and language performance (Catani and ffytche, 2005; Catani and Mesulam, 2008; Kümmerer et al., 2013; Forkel et al., 2014; Duffau, 2015; Stockert and Saur, 2017). For this reason, to better understand the nature and magnitude of brain damage after strokes, it is important to take into account not only gray matter lesions, but also white matter pathways.

The structural human brain connectome is a novel methodological approach in clinical neurosciences. It entails mapping the entireness of medium to large-scale white matter connections in the brain, typically using diffusion tensor imaging (DTI) MRI, combined with high resolution anatomic images (volumetric T1- and T2-weighted scans). Our group has recently demonstrated that structural connectomes can be measured in brains with stroke lesions, and, more importantly, used to assess the relationship between regional network damage and language impairments in persons with aphasia (i.e., connectome lesion symptom mapping, CLSM). CLSM provides additional anatomic information regarding the neuronal networks whose lesions are statistically associated with impairments in behavior, evaluating damage more broadly defined as necrosis and disconnection.

Despite its advantages, CLSM based on connection weight does not provide information about network architecture beyond pair-wise connections. CLSM can identify which connections are crucial for behavior, but it is not sensitive to detecting the integrity of indirect connections. One possible method of addressing the problem of indirect pathways is the approach recently described by Sporns and colleagues (Mišić et al., 2015) to measure connectome dynamics. Their method leverages the structure of the connectome and evaluates the relative lag of transfer of information between different brain regions taking into account the presence and weight of direct and indirect connections. A faster spread on information is observed between regions that are linked by shorter path lengths, or by alternative nearly-shortest paths (the principal advantage of this approach over traditional path length).

This study had three aims. The first was to determine how much of the variance in aphasia severity can be explained by connectome dynamics lesion symptom mapping (CDLSM), CLSM or gray matter necrosis (regional based voxel-lesion symptom mapping). We tested the hypothesis that regional gray matter damage and network dysfunction are both crucial determinants of global aphasia and fluency impairments, using out of sample statistical predictions across a large cohort of chronic stroke survivors. The second aim was to test the hypothesis that combining connectome-based modalities (CLSM and CDLSM) with lesion damage improves prediction of aphasia severity. Finally, we aimed to disclose the cortical and subcortical networks related to aphasia severity and speech fluency.

## Materials and Methods

### Subjects

Ninety-two left hemisphere chronic stroke survivors (55 male, 37 female, mean age = 60.5 years, SD = 11.18 years), at least six months since the stroke, were included in this study. All participants were recruited through local advertisements and none had a premorbid history of other neurologic disorders affecting the brain or psychiatric disorders. All participants signed informed consent and the Institutional Review Boards at our institutions approved this study.

### Language testing

The presence and severity of aphasia was assessed using the Western Aphasia Battery-Revised (WAB-R; Kertesz, 2007), which was administered to all participants. Aphasia severity was determined using the WAB aphasia quotient (WAB-AQ). The WAB-AQ ranges from 0 to 100, with the cutoff for aphasia being set at as 93.8, and lower values indicate more severe aphasia. We also assessed speech fluency, as rated by clinicians using the WAB-R speech fluency rating scale, which ranges from 0 to 10 and lower values indicate less fluency.

### Image acquisition and postprocessing

Volumetric T1- and T2-weighted MRI sequences, as well as diffusion MRI, were obtained from all participants using Siemens 3T Trio Systems with a 12-channel head-coil. Scanning parameters were as follows: T1-weighted images: MP-RAGE sequence with 1 mm^{3} isotropic voxels, FOV matrix of 256 × 256 mm, 9-degree flip angle, and 192 sagittal slice sequence with TR = 2250 ms, TI = 925 ms, and TE = 4.15 ms, with parallel imaging (GRAPPA = 2, 80 reference lines); T2-weighted images: 3D SPACE, voxel size of 1 mm^{3}, 256 × 256 mm FOV matrix, 160 sagittal slice sequence, variable flip angle, TR = 3200 ms, TE = 352 ms, with no slice acceleration. Slice center and angulation were similar to the T1 image sequence. Diffusion MRI: EPI scan using 30 directions with b = 1000 s/mm^{2} and b = 2000 s/mm^{2}, TR = 6100 ms, TE = 101 ms, 82 × 82 matrix, 222 × 222 mm FOV, with parallel imaging GRAPPA = 2, 80 45 contiguous 2.7-mm axial slices, TA = 390 s. This diffusion sequences were acquired twice, and we also acquired an identical sequence with nine B = 0 volumes, yielding a total of 131 volumes (11 B = 0, 60 B = 1000; 60 B = 2000).

Lesion masks were drawn on the T2-weighted images by a neurologist who was unaware to the patient’s behavioral data at the time of the lesion drawing. A map of the lesion overlay is demonstrated in Figure 1. The T2 image and the lesion mask were coregistered to the T1-weighted image, which was normalized to standard space using an enantiomorphic approach to minimize the distortions caused by the brain lesion (Nachev et al., 2008) leveraging SPM12's (RRID:SCR_007037) unified segmentation-normalization, employing the lesion masks smoothed with a 3-mm full-width half maximum Gaussian kernel. The lesion masks were transformed into standard space and parcellated using an in-house anatomic atlas (described below) to determine the amount of damage to each cortical region, computed as the percentage of lesioned voxels. The inverse of the transformation matrix from T2/T1 space to standard space was used to transform the anatomic atlas into T2/T1 space, and subsequently into diffusion space, as described below.

### Anatomic atlas

To study the spread of information through the cortical network, it was important to use an atlas with high spatial resolution to define the nodes of the network with fine granularity. For this purpose, we created a cortical parcellation based on the AICHA atlas^{14}. The AICHA atlas defines 384 gray-matter regions of interest (ROIs); we further subdivided these ROIs to obtain 1358 ROIs of approximately the same volume. The subdivision was conducted using the procedure described by Fornito et al. (2010), which allowed us to triple the number of ROIs while keeping the ROI size approximately constant, leveraging the fact that nodes with approximately the same volume reduces biases in streamline tractography.

### ROIs

To reduce the probability of Type II statistical errors and focus on language relevant areas, all analyses were restricted to language-specific ROIs in the left hemisphere. These ROIs were derived from functional imaging studies by Fedorenko et al. (2012) and adapted for our parcellation. Specifically, eight ROIs were included: superior temporal gyrus, superior temporal sulcus, middle temporal gyrus, superior temporal pole, inferior frontal gyrus (IFG) pars triangularis, IFG pars orbitalis, middle frontal gyrus, angular gyrus (Fig. 2; Table 1).

As such, for each individual, the degree of ROI damage corresponded to a 1 × 8 vector indicating the percentage of damage to each of the 8 ROIs. Likewise, connectivity analyses (explained below) were restricted to the subnetworks involving these eight ROIs.

### Structural brain connectivity

The whole-brain brain connectome was obtained using probabilistic DTI. Fiber tracking was performed in diffusion space. As described above, an enantiomorphic normalization was applied to transform the T1 into standard space, taking into account the deformations caused by the brain lesion. The inverse of this transformation matrix was then used to transform the anatomic atlas into T1/T2 space. The T2-weighted image was then linearly normalized into diffusion space (using the B0 image as reference) employing an affine registration with 12 parameters (Jenkinson et al., 2012). This registration matrix was used to normalize the probabilistic maps of white and gray matter (the latter divided into ROIs), and the lesion mask, into diffusion MRI space. Note that this last registration employed the same individual anatomy (from T2 to diffusion) and only a linear transformation was necessary. Once the anatomic atlas was transformed into diffusion space, pair-wise ROI connectivity was performed using FSL FDT’s probabilistic tractography method (RRID:SCR_002823; Behrens et al., 2007). FDT’s Bedpost built default voxel-wise distributions of diffusion parameters, based on the diffusion weighted data, the nondiffusion image and the gradient table. Bedpost was followed by probabilistic tractography (FDT’s probtrackX, with parameters: 5000 individual pathways drawn through the probability distributions on principle fiber direction, curvature threshold set at 0.2, 200 maximum steps, step length 0.5 mm, using distance correction built into probtrackx). The probabilistic white-matter map excluding the stroke lesion was used as a waypoint mask and the number of streamlines arriving in one ROI, when another ROI was seeded, was computed, averaging the connections from ROI_{i} to ROI_{j} and vice versa, and the number of streamlines was corrected based on distance between ROIs (from probtrackX, as mentioned above) and by the sum of volume of the ROIs (i.e., the resulting number from distance corrected probtrackX was divided by the sum of the volumes of the connected ROIs).

For each participant, a structural connectome including 1358 × 1358 connections was constructed (i.e., a 1358 × 1358 symmetrical adjacency matrix), where each connection represented the connection weight between ROIs (number of streamlines corrected by distance between ROIs and by the combined volume of connected ROIs). From this matrix, we extracted a subnetwork of 8 × 8 connections including only the language-specific ROIs described above (with 28 unique connections, since reciprocal connections were identical). This reduction in the number of connections was performed to focus on connections more commonly associated with language and reduce model overfitting.

For the visual display of the anatomic subnetwork used in this study, we reconstructed the deterministic tractography streamlines between each pair of ROIs across 59 healthy controls with a similar age distribution [45 female, mean age 54.7 ± 8.3 years; DTI parameters: twice-refocused echo-planar imaging b = 0, 1000, 30 diffusion encoding directions, TR = 8500 ms, TE = 98 ms, FOV = 222 × 222 mm^{2}, matrix = 74 × 74, 3-mm slice thickness, and 40 axial slices; DSI studio: fiber reconstruction using Q-Space Diffeomorphic Reconstruction (Yeh and Tseng, 2011), 1.25 diffusion length sampling ratio, 2-mm output resolution]. Each connection was transferred to stereotaxic space using 7, 9, and 7 transformation parameters (Fourier basis) and then combined across subjects, with the resulting bundle being interpolated into 100 segments and the center of mass for each segment was calculated. Streamlines whose segment-wise deviation from the center of mass was within 0.5 SD were maintained, yielding an anatomically representative connectome link. Likewise, the core pathway traversed by each connection was also maintained and used for visual display. to define the length of each pair-wise connection (i.e., the average length of the fibers constituting each connection), the distance between each 100 interpolated points was accrued for each link. The language network used in this study is illustrated in Figure 3.

### Structural brain dynamics

Connectome dynamics were extracted using the method described by Mišić et al. (2015). This method measures the spread of information throughout the network. It is performed by changing the state of one of the network nodes (“seeding” the network). In the next time step, the neighbors of the seed-node change their state, and in following time step, their neighbors change their state, and so on. While this approach is novel in the context of neuroimaging, it has a long and distinguished history in the social sciences (Granovetter, 1978; Watts, 2002). This process is illustrated in Figure 4. Unless the network contains unreachable nodes, all nodes eventually change their state. If the threshold is set to 0, the number of time steps that it takes to change the state of a node is equal to the length of the shortest path to the seed node.

Lesion disrupts the connectome and creates unreachable nodes; for numerical convenience, we used the inverse of the lengths of propagation paths (equivalent to the “speed” of propagation from the seed to the destination node). This way, the propagation speed between a pair of mutually unreachable nodes is zero (corresponding to the shortest path of infinite length). As a result, we obtained 28 values of propagation speed between each pairing of the eight language-network nodes.

Importantly, the pair-wise measures in the connectome dynamics matrix are a reflection of the structure of the whole network, thus providing information about the integrity of the topological organization of the network beyond the pair-wise connection. Conversely, the pair-wise connection weight is related exclusively to the connection between one pair of ROIs, respective of the other connections in the connectome.

### Statistical analyses and out-of-sample prediction

From the methods described above, five sets of variables were obtained from each participant: two behavioral measures, WAB-AQ and WAB fluency subscore; and three neuroimaging measures, ROI damage (1 × 8 vector of proportional damage to the eight language-specific ROIs), CLSM = connectivity weights (1 × 28 vector of the anatomic connection weights between each pairing of the eight language-specific ROIs), and CDLSM = connectivity dynamics (1 × 28 vector representing the connectivity dynamics within the language networks, i.e., the inverse of the shortest path length between each pairing of nodes).

We used linear-kernel support vector regression (SVR) to predict the behavioral measures (WAB-AQ and, separately, WAB fluency) from the neuroimaging measures (ROI damage, connectivity weight, connectivity dynamics) and an ensemble method that combined the three neuroimaging measures.

### Individual measures

The pipeline that predicted a behavioral measure from a neuroimaging measure consisted of SVR, followed by a third-order polynomial regression to match the range of the SVR predictions to the actual observed range. We used the LIBSVM implementation of SVR (Chang and Lin, 2011). The pipeline was trained and tested using 5-fold cross-validation. The four folds that composed the training set were also used to find the optimum regularization parameter for the SVR model (the term that defines the trade-off between margin width and training-set accuracy, known as the C value in the support-vector literature). This parameter was optimized using a grid search with values (.1, .5, 1, 10, 50, 100, 500). We have observed that SVR prediction tends to underestimate the highest scores and to overestimate the lowest scores; to correct for this, a third-order polynomial function was applied to the SVR-predicted scores. To compute the parameters of the polynomial, we predicted the scores of the patients in the training set (after the estimation of the SVR regularization parameter), and applied the least-squares regression to find the relationship between actual and predicted training scores (this is similar to the procedure described by Yourganov et al., 2016, except here we use a polynomial rather than a linear function). The pipeline predicted on the left-out fold and a correlation score was calculated between the predictions and truths. After all five folds served as the left-out fold, the five resultant correlations were averaged to form a single value. This was repeated multiple times to assess prediction distributions. K-fold cross-validation is known to have high variance (Ojala and Garriga, 2010), with results varying for the same dataset depending on the combinations of observations allocated among the folds. To correct for this instability, we performed 1000 iterations of 5-fold cross-validation (i.e., 1000 times the average from each 5-fold cross-validation described above), giving us 1000 values of correlation between actual and predicted scores. Of note, we did not perform classical mass univariate statistical analyses of lesions or connections. We employed these values to build the SVR model instead. For this reason, the SVR approach is not corrected by multiple comparisons in the classical sense of mass univariate statistics, but takes into account the multiple values in a multivariate approach and type 1 and 2 errors are reflected in reduced accuracy of the resulting model.

To statistically compare measures, we calculated the ratio of iterations that a particular measure had a higher mean correlation than another measure. This was possible because each measure was allocated the same folds at each of the 1000 iterations, although the allocation changed across iterations.

We also performed permutation tests to address the well-known issue that it is possible to obtain results that appear better than chance with small datasets that are randomly generated (Combrisson and Jerbi, 2015). For each measure, we calculated a permuted distribution by performing the same pipeline over 1000 iterations. However, the truth values used to train and test the model (i.e., the dependent variables WAB-AQ and WAB fluency) were shuffled at the beginning of each iteration. This resulted in a length 1000 permutation distribution for each measure. For each measure, each of the 1000 real correlations was compared with the permuted distribution, and had a *p* value calculated from its rank on the distribution. These 1000 *p* values were then averaged (Ojala and Garriga, 2010).

The coefficients of the linear SVR models were saved for each of the five folds and averaged. Each coefficient represents the relative predictive power of an individual feature. For the three types of input, the features were: damage to an ROI; strength of the anatomic connection between a pair of ROIs; the inverse of the shortest path between a pair of ROIs. By repeating the predictive process 100 times, we estimated the distribution of each coefficient.

### Combined measures

We created an ensemble pipeline that calculated a prediction for each subject by performing a weighted average on the subject’s corresponding predictions from ROI damage, connectivity weight, and connectivity dynamics.

The coefficients of the weighted average were found by fitting an ordinary least squares (OLS) multiple linear regression model. The OLS model consisted of three predictor (explanatory) variables corresponding to the behavior score predictions of the three neuroimaging-based models, and the prediction (response) variable was the behavior score. In other words, the model predicted a behavior score by combining behavior score predictions. OLS regression assigns larger weights for predictor variables that are more correlated with the prediction variable. Therefore, the neuroimaging-based models that more closely predicted the data had higher coefficients.

The four folds of the training set were used to fit the OLS model. To control for overfitting, we used a leave-one-out procedure (within the training set) to predict the scores using the three types of neuroimaging inputs and to combine them into a regression model. For computational reasons, the regularization parameters associated with the each of the neuroimaging models already found earlier for the training set were used to fit the models.

This pipeline was executed on 1000 iterations of 5-fold cross-validation, where the data allocation was the same as for the other measures. Just as the case of the individual measure-generated distributions, the resultant distribution was compared with the other measure’s distributions and with chance. A visual diagram of the statistical and machine learning steps used here is demonstrated in Figure 5.

### Software tools

The connectome dynamics were computed using the Matlab (RRID:SCR_001622) code developed by R. Betzel and B. Misic (Mišić et al., 2015). The predictive analysis was largely developed in the Julia and Python programming languages (RRID:SCR_008394; http://julialang.org). The Python library Scikit-learn (RRID:SCR_002577; Pedregosa et al., 2011) provided the SVR models, and the linear regression models used for the ensemble method and the third-order polynomial regression. The SVR models provided by Scikit-learn are based in LIBSVM (RRID:SCR_010243; Chang and Lin, 2011). The Julia code base communicated with Scikit-learn through the PyCall package. The visualizations were rendered with Matlab, Surfice, and MRIcroGL (RRID:SCR_008264, RRID:SCR_002403).

## Results

### Individual measures

Both WAB-AQ and fluency were predicted with an accuracy significantly higher than chance (*p* < 0.05). The mean correlation coefficients between the predicted and actual WAB-AQ values were: 0.760 ± 0.026 (combined methods), 0.723 ± 0.025 (ROI damage), 0.719 ± 0.025 (connectivity dynamics), and 0.534 ± 0.035 (connectivity weights).

With regards to WAB fluency, the correlation coefficients were: 0.728 ± 0.023 (combined methods), 0.705 ± 0.026 (ROI damage), 0.697 ± 0.025 (connectivity dynamics), and 0.543 ± 0.030 (connectivity weights). ROI damage and connectivity dynamics were not significantly different from each other for predicting either WAB-AQ or WAB Fluency. However, both outperformed connectivity weights alone (Figure 6). Based on these results, connectome dynamics are as important as lesion location in predicting impairment. However, as explained below, Combined measures, the improved accuracy of the model based on their combination suggests that they provide additional and complementary information.

The ROIs with the highest influence on the predictive model based on ROI damage were, in descending order of importance: for WAB-AQ, superior temporal gyrus, superior temporal pole, angular gyrus, IFG triangularis, middle frontal gyrus, IFG orbitalis, middle temporal gyrus and superior temporal sulcus; for WAB fluency, superior temporal gyrus, angular gyrus, middle temporal gyrus, superior temporal pole, middle frontal gyrus, IFG triangularis, IFG orbitalis, superior temporal sulcus. These results are demonstrated in Figure 8.

The connections with the highest influence on the predictive model built with connectivity weights were: for WAB-AQ, superior temporal gyrus to superior temporal sulcus, superior temporal sulcus to middle temporal gyrus and superior temporal gyrus to superior temporal pole; for WAB fluency, middle temporal gyrus to superior temporal sulcus, superior temporal gyrus to superior temporal sulcus and superior temporal sulcus to angular gyrus (Fig. 7).

On the model built based on connectivity dynamics, the connections with the highest influence toward WAB-AQ were superior temporal gyrus to angular gyrus, superior temporal sulcus to IFG triangularis, and superior temporal pole to angular gyrus. Toward WAB fluency: the superior temporal sulcus to IFG triangularis, middle temporal gyrus to superior temporal pole, and the superior temporal gyrus to IFG triangularis. The coefficients for the connectivity models are listed in Table 2.

As expected, since WAB-AQ is a composite score that includes WAB fluency, there were strong correlations between WAB-AQ and WAB fluency (*R* = 0.92, *p* = 1.45E-39), and in the connection-wise coefficients related to WAB-AQ and WAB fluency (connectivity weights: *r* = 0.9, *p* = 3.54E-34; connectivity dynamics: *r* = 0.79, *p* = 5.06E-21), indicating that connections that were predictive of WAB-AQ were also likely to be predictive of WAB fluency. Nonetheless, there were no correlations between the connection-wise coefficients between connectivity weights and connectivity dynamics (WAB-AQ: *r* = 0.12, *p* = 0.61; WAB fluency: *r* = -0.3, *p* = 0.19), suggesting that links that were strongly influential in the predictive model built using connectivity weights were not necessarily influential in the model using connectivity dynamics.

It is important to emphasize that the model weights should be interpreted in the context of the SVR models (Haufe et al., 2014), and they have not been tested in a mass univariate approach to define the statistical significance of each feature. They are instead being reported in an ordinal order of importance with regards to how much each feature contributed to the model.

Interestingly, we observed an inverse linear relationship between connectome weight coefficients and the distance traveled by the fibers connecting ROIs (based on the normative data from control subjects; WAB-AQ: *R* = -0.46, *p* = 0.039; WAB fluency: *R* = -0.64, *p* = 0.001). However, there was not a relationship between dynamics coefficients and fiber distance (WAB-AQ: *R* = 0.23, *p* = 0.34; WAB fluency: *R* = 0.36, *p* = 0.11). Long connections had lower influence in the predictive model using connectivity weights, but not in the model built with connectivity dynamics. Of note, as expected, the distance traveled by the fibers connecting the ROIs was strongly but not perfectly correlated with the Euclidean distance between ROIs (*R* = 0.74, *p* = 7.27E-05) due to anatomic constraints on fiber trajectory, and the resulting fiber curvature.

### Combined measures

The highest predictive ability for WAB-AQ was observed with an ensemble model combining all three measures (ROI damage, connectivity weights, and connectivity dynamics; *R* = 0.761 ± 0.025) followed by the model combining connectivity dynamics and ROI damage. Regarding WAB fluency, the model with highest accuracy was composed of connectivity dynamics and ROI damage, followed by the model combining all three measures (*R* = 0.734 ± 0.02). The comparison between models is shown in Figure 9. Statistically, the model with three measures was significantly better at predicting language measures compared with the models based on ROI damage or connectivity dynamics alone: it outperformed ROI damage and connectivity dynamics when predicting WAB-AQ to a *p* value of 0.04 and 0.02, respectively. The ensemble performed better for predicting WAB fluency, but only to a *p* value of 0.08 (i.e., not statistically significant) but significantly better (*p* = 0.04) compared to ROI damage and connectivity dynamics, respectively (noting that *p* = 0.04 is weakly indicative of a rejection of the null hypothesis).

## Discussion

In this study, we evaluated the ability of neuroimaging methods to explain personalized severity of language problems across individuals with poststroke aphasia. We tested two specific hypotheses: (1) that connectome dynamics (CDLSM) could increase the correlation between residual connectome integrity and aphasia severity, i.e., provide a more comprehensive assessment of brain network integrity compared with CLSM; and (2) that combining connectome modalities with lesion damage would optimize the understanding of brain structures that are crucial for language and thus improve prediction. While testing these hypotheses, we also identified the crucial cortical and subcortical networks related to aphasia severity and speech fluency.

We observed that CLSM was statistically inferior to cortical ROI damage in predicting individual language deficits; however, CDLSM was comparable to cortical ROI damage and thus superior to CLSM. We also observed that the combination of measures was statistically better than each measure alone, supporting the hypothesis that each neuroimaging assessment may provide additional information about brain integrity and the crucial structures supporting language and fluency.

These results have broad implications for studies assessing crucial brain systems supporting cognitive function. First: how does CLSM complement classical lesion-symptom mapping using cortical damage? The results presented here suggest that CLSM has a lower personalized explanatory value toward aphasia severity compared with ROI damage. Therefore, is CLSM less well suited for behavioral mapping? It is important to consider that CLSM typically uses an exponentially larger number of predictive features compared with cortical ROI damage, since there are multiple possible combinations of ROIs. Thus, for the same number of individuals, the higher number of features in CLSM compared with ROI damage can lead to overfitting and reduce out of sample predictive performance. Likewise, CLSM is highly interrelated to cortical damage since damaged connections are likely those connecting lesioned ROIs. despite these caveats, we suggest that there are crucial features of CLSM that justify its use. CLSM can identify which connections are crucial within a set of all possible connections involving lesioned ROIs. As such, CLSM is a subset of conventional cortical lesion symptom mapping and refines the crucial network involving lesioned ROIs. Moreover, CLSM can also reveal the importance of connections involving lesioned ROIs and nonlesioned ROIs, thus elucidating the network extending beyond the cortical areas commonly injured due to vascular anatomy. If specific corticocortical connections did not influence behavior, all connections from a lesioned ROI would be equally predictors of outcome. Instead, there are crucial links that can be identified with CLSM, and the identification of which connections are crucial can improve the understanding of the systems underlying language production. A diagram representing this concept it presented in Figure 10.

Second, what does assessment of connectivity dynamics (CDLSM) add to ROI damage or connectivity weights? CDLSM is an expansion of CLSM; i.e., it requires the individual connectome, from which dynamic features can be extracted. CDLSM is unique because it can identify crucial corticocortical interactions that may occur indirectly even if their anatomic link is damaged. If the shortest path between nodes A and B is not through direct connections between A and B (because there is no structural connectivity remaining after the stroke), but rather through other nodes, CDLSM can measure this connectivity, while CLSM cannot. For this reason, it expands on CLSM, and provides a more comprehensive view of network integrity. CDLSM evaluates corticocortical interactions by taking into account shortest paths, even if the shortest path involves remote unaffected areas, such as the contralateral hemisphere or subcortical gray matter such as basal nuclei. Therefore, each link in CDLSM is a reflection of the topological organization of the entire network. This characteristic of CDLSM is particularly relevant in cases on neurologic recovery, where not only what was damaged during the lesion, but what was left after it, is crucial to the severity of chronic deficits. As demonstrated here, CDLSM is equivalent to cortical ROI damage with regards to out of sample personalized aphasia severity prediction. Importantly, our results also suggest that the combination of modalities yields the best personalized explanatory features, confirming that CLSM and CDLSM add additional information to mapping based on cortical damage and thus provide additional information about brain integrity that this crucial for aphasia. It should be mentioned that the connectivity dynamics approach used in this study, described by Mišić et al. (2015), is not the only approach to assess dynamics or communication within networks. Other alternatives include communicability (Crofts et al., 2011), search information (Goñi et al., 2014) and K-shortest paths (Avena-Koenigsberger et al., 2017), and these were not tested in this present study. Furthermore, it is important to emphasize that CLSM utilizes more basic edge-level information, and it does not take into account other topological features of network organization that can be obtained, for example, with graph theory. Likewise, CLSM used here employs number of streamlines, which is a measure that can vary due to signal-to-noise and method of tractography. A recent review article by Aerts et al. (2016) has provided a comprehensive overview of several studies that have investigated topological network changes associated with brain injury and strokes, with some few studies using DTI data. Our group also demonstrated in previous studies that temporal lobe betweenness centrality (Bonilha et al., 2016), cortical hub (“rich-club”), status (Gleichgerrcht et al., 2015), and global network small-worldness (Bonilha et al., 2016) are associated with aphasia severity and recovery. Given the plethora of possible measures, there are several possible combinations that have not yet been systematically assessed, and compose an interesting pathway for future studies. Furthermore, these methods could also be expanded to combined with connectomes obtained from functional MRI data (task based or resting state), with the caveat related to potential changes in hemodynamic properties and their effect on BOLD signal in stroke survivors with multiple cardiovascular risk factors and atherosclerosis (D'Esposito et al., 2003). Special attention should be paid to inferences about functional imaging (and neuroimaging in general) in the context of pathologically altered cerebral perfusion patterns (Siegel et al., 2017). In a recent and elegant study, Pustina et al. (2017) demonstrated that combining functional connectivity with structural connectivity increases the ability to explain and predict specific language deficits in chronic poststroke aphasia. They concluded that predictions based on multimodal data were more accurate than predictions using only one modality. It is reassuring that our results, i.e., improved prediction when combining CLSM and CDLSM with lesion based mapping, are similar to their conclusions, even if we used a different approach based only on structural integrity. Overall, their and our findings suggest that there is additional information from each modality that can leveraged to better understand the neurobiology of poststroke impairments and aphasia. Importantly, our findings suggest that CDLSM is particularly useful in improving predictions, indicating that connectome based studies, which are instrumental in expanding the understanding of the relationship between brain structural integrity and behavior (Yourganov et al., 2016; Gleichgerrcht et al., 2017; Marebwa et al., 2017), can be even further improved by including connectome dynamics. To our knowledge, this is one of the first studies to assess structural connectome dynamics related to stroke recovery.

Third, how did the results shown here, using additional modalities (CLSM and CDLSM) expand on the knowledge about aphasia severity and fluency impairment in aphasia? As expected, they revealed corticocortical interactions within the typical lesion bed are strongly predictive of chronic deficits. Notably, they demonstrated the crucial importance of ventral stream (Fig. 7). Moreover, they also demonstrated that connectivity structure and dynamics involving the angular gyrus had a strong influence in the model predicting WAB-AQ and QAB fluency, suggesting that the integrity of posterior temporal-parietal network plays a crucial role in aphasia severity. The posterior temporal-parietal is a watershed zone for the middle cerebral artery and the degree of cortical damage in this region may be unrelated to the degree of white matter damage, particularly given the decreased resilience of white matter to ischemia (Fisher, 2011; Hamner et al., 2011). As such, by elucidating this crucial component of the language networks, CDLSM can complement ROI damage and thus improve personalized prediction of severity. We propose that this approach should now be used to also evaluate the personalized mechanisms supporting spontaneous or treatment-related aphasia recovery.

One limitation of this study is that the measures of aphasia and fluency severity are imperfect. Not only is the WAB-AQ a multidimensional measure, but the fluency score also reflects multiple, distinct aspects of speech “fluency,” including grammaticality, articulation, hesitancy, among others. Furthermore, the inter-rater reliability in scoring is not high, because it is both subjective and multidimensional (Trupe, 1984). However, a more reliable and unidimensional score would likely have a higher, rather than lower, correlation with specific lesions and connections. Moreover, the specific measures of severity are not especially important for achieving our goal of evaluating the contribution of CDSLM to cortical necrosis and CLSM in predicting severity of language impairment.

## Acknowledgments

Acknowledgements: We thank Richard Betzel and Olaf Sporns for sharing the code that computes connectome dynamics and for instructing us how to use it.

## Footnotes

The authors declare no competing financial interests.

This work was supported by National Institutes of Health/National Institute on Deafness and Other Communication Disorders Grants DC014021 (to L.B.), DC011739 and DC014664 (to J.F.), and DC05375 (to A.E.H.) and by the American Heart Association Grant SFDRN26030003 (to L.B.).

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.