Pain-related fear - From different fear constructs to dissociable neural sources of

The ability to infer emotional states through self-reports is often limited. Measurement becomes even more challenging for emotional phenomena such as pain-related fear, for which different associated fear constructs have been proposed. Pain-related fear, which has demonstrated significant predictive value regarding disability in patients with persistent musculoskeletal pain, is assessed by questionnaires focusing on fear of movement/(re)injury/kinesiophobia, fear avoidance beliefs or pain anxiety. Furthermore, the relationship of general anxiety measures such as trait anxiety to pain-related fear remains ambiguous. Advances in neuroimaging might help to identify potential commonalities or differences across psychological constructs by means of machine learning techniques that can reveal predictive relationships between neural information and questionnaire scores. Here, we applied a pattern regression approach to functional magnetic resonance imaging data of 20 patients with non-specific chronic low back pain (LBP). More specifically, we applied a novel approach using Multiple Kernel Learning that allows investigating the contribution of experimental conditions and regional neural information to a prediction model. We sought evidence for or against a common fear construct by computing the prediction model of each questionnaire and comparing the models according to the contribution of fear-related neural information and conditions. The current results underpin the diversity of fear constructs among self-report measures of pain-related fear by demonstrating evidence of non-overlapping and differentially contributing neural sources within fear processing regions. Thus, the current approach might ultimately help to further understand and dissect the fear constructs captured by the various pain-related fear questionnaires.
Pain-related fear, often assessed through self-reports such as questionnaires, has shown prognostic value and clinical utility for a variety of musculoskeletal pain disorders. However, it remains difficult to determine a common underlying fear construct of pain-related fear because the questionnaires are based on several different proposed constructs. The current study describes a novel neuroscientific approach using machine learning of neural patterns within the fear network of chronic LBP patients that might have the potential to identify neural commonalities or differences among the various fear constructs. Ultimately, this approach might afford a deeper understanding of the suggested fear constructs of pain-related fear and might also be applied to other domains where ambiguity exists between emotional phenomena and underlying psychological constructs.


Introduction
Self-report measures of emotional states are paramount for behavioral neuroscience because they facilitate the understanding of brain activation patterns (Shrout et al., 2017). However, the validity of self-reports is limited (Choi and Pak, 2005), probably in part because overlapping psychological constructs are often assessed, as illustrated by the diversity of questionnaires that attempt to assess related constructs. One such example is pain-related fear (PRF), which has become the major explanatory determinant regarding disability in patients with persistent musculoskeletal pain (Crombez et al., 1999; Vlaeyen and Linton, 2000; Vlaeyen et al., 2016). However, despite the clinical utility of PRF, its construct validity remains ambiguous. Currently, there are several questionnaires that assess PRF based on different associated fear constructs. Of these, the most popular questionnaires focus either on fear of movement/(re)injury/kinesiophobia, fear avoidance beliefs or pain anxiety. Despite the demonstrated clinical utility of self-reports of PRF, there is an open debate on what exactly their scores and related fear constructs reflect (Lundberg et al., 2011; Caneiro et al., 2017). In this respect, advances in neuroimaging provide the potential to support or question the validity of self-reports through the identification of meaningful relationships between emotional states and brain responses. More specifically, machine learning techniques such as multivariate pattern analysis (MVPA), if applied to functional magnetic resonance imaging (fMRI) data, make it possible to directly study the predictive relationship between a content-selective cognitive or emotional state and corresponding multivoxel fMRI activity patterns on a single-subject basis (Haynes, 2015; Hebart and Baker, 2017). Pattern recognition algorithms "learn" a potential association between brain response patterns and an individual's perceptual state that is expressed in terms of a label.
The label may have discrete (classification approach) or continuous (regression approach) values such as questionnaire scores (Formisano et al., 2008). In particular, the latter is a novel, promising approach to identify predictive relationships between neural information and measures of behavior (Fernandes et al., 2017). Here, we applied pattern regression analysis in combination with Multiple Kernel Learning (MKL) to investigate the contribution of fear-related neural information and conditions to the prediction models of the various PRF questionnaires in a sample of 20 non-specific chronic low back pain (LBP) patients. In contrast to other pattern recognition approaches, the applied sparse version of MKL makes it possible to learn simultaneously the relative contribution of experimental conditions and brain regions (defined by an atlas) to the decision function (Schrouff et al., 2018).
Because of substantial a-priori knowledge regarding the involvement and functional diversification of fear processing regions in PRF (Neugebauer et al., 2004; Tovote et al., 2015; Simons et al., 2014b), we first compared the different PRF questionnaires in terms of their model performance, namely the ability to predict the score of the various PRF questionnaires based on brain response patterns across fear processing regions. Second, we aimed at comparing the different PRF questionnaire models according to the predictive contribution of the different fear processing brain regions. Specifically, if the PRF questionnaires share overlapping fear constructs, then the contributing set of fear processing regions should be more similar across the different prediction models. Conversely, if the contributing brain regions vary across the prediction models of the PRF questionnaires, this would provide evidence against a common fear construct across questionnaires. Ultimately, this approach might help to further understand and dissect the various PRF constructs in chronic LBP.

Patients
bioRxiv preprint first posted online Jan. 25, 2018; doi: http://dx.doi.org/10.1101/251751. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
belief that activity may result in (re)injury or stronger pain) and "somatic focus" (TSK-SF, the belief in underlying and serious medical problems) (Roelofs et al., 2007; Rusu et al., 2014).
(2) The German version of the fear avoidance beliefs questionnaire (FABQ) (Waddell et al., 1993; Pfingsten et al., 2000) consists of 16 back pain-specific items related to fear avoidance beliefs rated on a 7-point rating scale (0 = "completely disagree" to 6 = "completely agree"). It includes two distinct and established subscales related to beliefs about how work (FABQ-W) and physical activity (FABQ-PA) affect LBP, with internal consistencies of α = 0.88 and α = 0.77, respectively (Waddell et al., 1993).
(3) The short version of the Pain Anxiety Symptoms Scale (PASS-20) assesses fear and anxiety responses related to pain including cognitive, physiological and motor response domains (McCracken and Dhingra, 2002). Items on the PASS-20 are measured on a 6-point Likert scale and relate to four different subscales including cognitive anxiety (PASS-C), fear (PASS-F), physiology (PASS-P) and escape/avoidance (PASS-E) (Roelofs et al., 2004b). The German version of the PASS-20 has an internal consistency of α = 0.90 (Kreddig et al., 2015).
Furthermore, patients were asked to fill out the painDETECT (PD-Q) questionnaire that includes three 11-point numeric rating scales (NRS), with 0 being "no pain" and 10 being the "worst as "anxiety proneness" (Julian, 2011). All questionnaires were administered at the fMRI appointment prior to brain scanning. We tested the scores of the different questionnaires for the assumption of normality of the data using the Shapiro-Wilk test and visually using Q-Q plots implemented in IBM SPSS Statistics (version 23) (Ghasemi and Zahediasl, 2012).

Scanning protocol and design
Brain imaging was performed on a 3-T whole-body MRI system (Philips Achieva, Best, Netherlands), equipped with a 32-element receiving head coil and MultiTransmit parallel RF transmission. Each imaging session started with a survey scan, a B1 calibration scan (for MultiTransmit), and a SENSE reference scan. High resolution anatomical data were obtained with a 3D T1-weighted turbo field echo scan consisting of 145 slices in sagittal orientation with the following parameters: FOV = 230 × 226 mm²; slice thickness = 1.2 mm (resulting in a voxel resolution of 1.1 × 1.1 × 1.2 mm); acquisition matrix = 208 × 203; TR = 6.8 ms; TE = 3.1 ms; flip angle = 9°; number of signal averages = 1. Functional time series were acquired using whole-brain gradient-echo echo planar imaging (EPI) sequences (365 volumes), consisting of 37 slices in the axial direction (AC-PC angulation) with the following parameters: field of view (FOV) = 240 × 240 mm²; acquisition matrix = 96 × 96; slice thickness = 2.8 mm (resulting in a voxel resolution of 2.5 × 2.5 × 2.8 mm); interleaved slice acquisition; no slice gap; repetition time (TR) = 2100 ms; echo time (TE) = 30 ms; SENSE factor = 2.5; flip angle = 80°.
The PRF-evoking stimuli consisted of video clips with a duration of 4 s recorded from a 3rd person perspective (Meier et al., 2016). The video clips showed potentially harmful activities for the back selected from the Photograph Series of Daily Activities (PHODA) (Leeuw et al., 2007b). The original PHODA was developed in close collaboration with human movement scientists, physical therapists, and psychologists and is comprised of a fear hierarchy based on ratings of perceived harmfulness of daily activities in patients with chronic LBP. From the 40 potentially harmful activities included in the short electronic PHODA version (Leeuw et al., 2007b), we chose three scenarios from the top six most harmful activities, namely shoveling soil with a bent back, lifting a flowerpot with slightly bent back and vacuum cleaning under a coffee table with a bent back (harmful condition). Furthermore, we created video clips of three activities rated as less harmful, such as walking up and down the stairs and walking on even ground (harmless condition).
Presentation® software (Neurobehavioral Systems, Davis, CA, USA) was used to present the video clips in a pseudo-randomized order (no more than two identical consecutive trials). The patients were asked to carefully observe the video clips, which were displayed using MR-compatible goggles (Resonance Technology, Northridge, CA, USA). The three harmful and three harmless activities were each presented five times (30 trials total). After the observation of each video clip, the patients were asked to rate the perceived harmfulness of the activity on a visual analog scale (VAS) which was anchored with the endpoints "not harmful at all" (0) and "extremely harmful" (10). All ratings were performed using an MR-compatible trackball (Current Designs, Philadelphia, PA, USA). After the VAS rating, a black screen with a green fixation cross appeared (duration jittered between 6 and 8 s). This experimental protocol has been shown to be suitable for investigations of neural correlates of PRF in previous fMRI studies based on mass-univariate analyses (Meier et al., 2016; Meier et al., 2017).
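The pseudo-randomization constraint described above can be illustrated with a minimal Python sketch. It is not the Presentation® script used in the study. It assumes that "identical trials" refers to the same video clip (the constraint may instead have been applied at the level of the condition), uses hypothetical clip labels, and relies on simple rejection sampling, i.e. reshuffling until no clip occurs more than twice in a row.

```python
import random
from typing import List, Optional


def max_run_length(sequence: List[str]) -> int:
    """Return the length of the longest run of identical consecutive items."""
    longest, current, previous = 0, 0, None
    for item in sequence:
        current = current + 1 if item == previous else 1
        longest = max(longest, current)
        previous = item
    return longest


def pseudo_randomize(labels: List[str], repeats: int, max_run: int = 2,
                     seed: Optional[int] = None, max_tries: int = 10000) -> List[str]:
    """Shuffle the trial list until no label repeats more than max_run times in a row."""
    rng = random.Random(seed)
    trials = [label for label in labels for _ in range(repeats)]
    for _ in range(max_tries):
        rng.shuffle(trials)
        if max_run_length(trials) <= max_run:
            return list(trials)
    raise RuntimeError("no valid order found; relax the constraint")


# Hypothetical clip labels: three harmful and three harmless activities,
# each presented five times (30 trials in total).
clips = ["harmful_1", "harmful_2", "harmful_3",
         "harmless_1", "harmless_2", "harmless_3"]
order = pseudo_randomize(clips, repeats=5, seed=1)
```

Rejection sampling is adequate here because, with six clips and 30 trials, most random shuffles already satisfy the constraint; a stricter constraint would call for a constructive algorithm.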

MR data organization and pre-processing
We used fMRI raw data of previously reported studies (Meier et al., 2016; Meier et al., 2017). The fMRI data were organized according to the Brain Imaging Data Structure (BIDS), which provides a consensus on how to organize data obtained in neuroimaging experiments. Preprocessing was performed using FMRIPREP (version 1.0.0-rc2, https://github.com/poldracklab/fmriprep), a Nipype-based tool (Gorgolewski et al., 2011), which requires minimal user input and provides easily interpretable and comprehensive error and output reporting. This processing pipeline includes state-of-the-art software packages for each phase of preprocessing (see https://fmriprep.readthedocs.io/en/stable/workflows.html for a detailed description of the different workflows). Each T1-weighted (T1w) volume was skull-stripped using antsBrainExtraction.sh v2.1.0 (using the OASIS template). The skull-stripped T1w volume was co-registered to the skull-stripped ICBM 152 Nonlinear Asymmetrical MNI template version 2009c using a nonlinear transformation implemented in ANTs v2.1.0 (Avants et al., 2008). Functional data were slice-time corrected using AFNI (Cox, 1996) and motion corrected using MCFLIRT v5.0.9 (Jenkinson et al., 2002). This was followed by co-registration to the corresponding T1w volume using boundary-based registration with 9 degrees of freedom, implemented in FreeSurfer v6.0.0 (Greve and Fischl, 2009). Motion correcting transformations, T1w transformation and MNI template warp were applied in a single step using antsApplyTransformations v2.1.0 with Lanczos interpolation. Three tissue classes were extracted from T1w images using FSL FAST v5.0.9 (Zhang et al., 2001).
Voxels from cerebrospinal fluid and white matter were used to create a mask used to extract physiological noise regressors using aCompCor (Behzadi et al., 2007). The mask was eroded and limited to subcortical regions to limit overlap with grey matter and six principal components were estimated. gaussian kernel was applied. To accelerate data pre-processing we performed parallel computing using the Docker environment (https://www.docker.com/) and the GC3Pie framework (https://github.com/uzh/gc3pie) on the ScienceCloud supercomputing environment at the University of Zurich (S3IT, https://www.s3it.uzh.ch/).

MVPA input data
The pre-processed data were subsequently passed to Statistical Parametric Mapping software package (SPM12, version 6906, http://www.fil.ion.ucl.ac.uk/spm/) for model computation using a general linear model (GLM). For each patient a design matrix was built including the onsets of the video clips with a duration of 4s (harmful / harmless activities, each pooled across the three different activities resulting in 15 harmful and 15 harmless stimuli) as separate regressors. In addition, and for each patient, the following nuisance regressors were implemented in the GLM model: (1) the six regressors derived from the component based physiological noise correction method (aCompCor) and (2) the motion-related regressors generated by AROMA (see section 2.4).
A high-pass filter with a cut-off of 128 s was used to remove low-frequency noise. Trials were modeled as boxcar regressors and convolved with the standard canonical hemodynamic response function (HRF) as implemented in SPM12. Finally, for each patient, parameter estimates (beta images) for each condition were computed and served as the input images for the MVPA. 2017). MVPA was performed using routines implemented in PRoNTo v.2.0 (Schrouff et al., 2013).
For the read-out of multivariate neural information that might serve as a potential score estimator of the different PRF questionnaires, we applied a newly introduced pattern regression approach based on supervised machine learning and testing phases using Multiple Kernel Learning (MKL) (Schrouff et al., 2018). In brief, the objective in supervised pattern recognition regression analysis is to learn a function from data that can accurately predict continuous values (labels), i.e. f(x_i) = y_i, from a given dataset D = {(x_i, y_i)}, i = 1…N, consisting of pairs of samples (feature vectors) x_i and labels y_i. Ultimately, the function learned from the learning set is used to predict the labels of new and unseen data (Schrouff et al., 2013). MKL makes it possible to account for brain anatomy (determined by a brain atlas, see section 2.7) and different modalities (such as anatomical/functional data or experimental conditions) during the model estimation by considering each brain region and modality as separate kernels. This approach makes it possible to determine the contribution of each brain region (region weights) and modality to the final decision function of the model in a hierarchical manner by simultaneously learning and combining the different linear kernels (Fernandes et al., 2017; Schrouff et al., 2018). Compared to conventional MVPA methods based on whole-brain voxel weight maps, this procedure provides a straightforward approach to draw inferences on the region level without the need for multiple comparison correction (Schrouff et al., 2018). To account for possible differential contributions of the harmful and harmless conditions to the decision function, we included the individual SPM beta images of each condition as separate modalities in the MKL model (condition weights). The kernels were mean centered and normalized (to account for the different sizes of the involved brain regions) using standard routines implemented in PRoNTo.
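To make the kernel construction described above more concrete, the following Python sketch builds one linear kernel per region, normalizes and mean-centers it, and combines the kernels as a weighted sum. It is an illustration under assumptions, not PRoNTo code: the toy data are invented, the normalization (scaling each kernel to a unit diagonal) and the order of the two operations are one common choice that the text does not specify, and the kernel weights are fixed by hand, whereas MKL learns them from the data under a sparsity constraint.

```python
import math
from typing import List

Matrix = List[List[float]]


def linear_kernel(samples: Matrix) -> Matrix:
    """Gram matrix K[i][j] = <x_i, x_j> for one region (rows = subjects)."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in samples] for xi in samples]


def normalize_kernel(kernel: Matrix) -> Matrix:
    """Scale the kernel to a unit diagonal so that region size does not matter."""
    n = len(kernel)
    diag = [kernel[i][i] for i in range(n)]
    return [[kernel[i][j] / math.sqrt(diag[i] * diag[j])
             if diag[i] > 0 and diag[j] > 0 else 0.0
             for j in range(n)] for i in range(n)]


def center_kernel(kernel: Matrix) -> Matrix:
    """Mean-center the samples in feature space using only the (symmetric) kernel."""
    n = len(kernel)
    row_mean = [sum(row) / n for row in kernel]
    total_mean = sum(row_mean) / n
    return [[kernel[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def combine_kernels(kernels: List[Matrix], weights: List[float]) -> Matrix:
    """Weighted sum of kernels; weights are assumed non-negative and summing to 1."""
    n = len(kernels[0])
    return [[sum(w * k[i][j] for w, k in zip(weights, kernels))
             for j in range(n)] for i in range(n)]


# Invented toy data: 4 subjects, a 2-voxel region and a 5-voxel region.
small_region = [[1.0, 2.0], [2.0, 1.0], [0.5, 1.5], [1.5, 0.5]]
large_region = [[1.0, 0.0, 2.0, 1.0, 3.0], [0.0, 1.0, 1.0, 2.0, 1.0],
                [2.0, 2.0, 0.0, 1.0, 0.0], [1.0, 3.0, 1.0, 0.0, 2.0]]
kernels = [center_kernel(normalize_kernel(linear_kernel(region)))
           for region in (small_region, large_region)]
# Hand-picked weights standing in for the region weights that MKL would learn.
combined = combine_kernels(kernels, weights=[0.7, 0.3])
```

In an MKL model, a weight of zero removes the corresponding region from the decision function, which is why the learned weights can be read as region contributions.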
Subsequently, for each questionnaire, we trained a separate MKL regression model with the respective labels (FABQ, TSK 17-, 13- and 11-item versions, PASS and all subscale scores, state and trait anxiety). This resulted in a total of 15 MKL models providing outputs for model evaluation, including model performance, region weights and condition weights.
Furthermore, we trained MKL regression models based on the harmfulness ratings collected during the fMRI measurements (mean ratings of the harmful condition and harmless condition, respectively). To reduce overfitting of each model, we applied a nested cross-validation procedure using a "leave-one-subject-out" cross-validation scheme to train the model, including optimization of the model's hyperparameter "C" (values: 0.1, 1, 10, 100, 1000). Furthermore, to generate a data-based null distribution of the performance measures (r and nMSE, see section 2.8), 1000 permutations (permuting the labels across patients) were computed for each model. Results were considered significant at a threshold of p < 0.05. Finally, the MKL currently implemented in PRoNTo operates with sparsity (L1 regularization) in kernel combinations and might therefore not select all brain regions that are highly associated with each other and with the prediction variable (the non-selected regions will have kernel weights of zero) (Fernandes et al., 2017; Schrouff et al., 2013). This might influence the selection of regions across the models. Therefore, to confirm a dissociation regarding the selected brain regions across the predictive models, we performed a secondary cross-validation by choosing the regions contributing most to the prediction (>10%, see Table 3) of each significant questionnaire model as a separate predictive brain set and subsequently trained and tested the labels of each questionnaire on the predictive brain set of the other models. Non-significant model performance in this analysis would reinforce a dissociation of contributing brain regions between the different models and would therefore be indicative of non-overlapping fear constructs.
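The logic of the leave-one-subject-out scheme and the label permutation test can be sketched in Python as follows. This is a deliberately simplified illustration and not the PRoNTo procedure: a one-feature least-squares line stands in for the MKL regression model, the nested optimization of "C" is omitted, the toy data are invented, and the p-value formula (count of permutations reaching the observed r, plus one, divided by the number of permutations plus one) is a common convention that the text does not specify.

```python
import random
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def loso_predictions(features: Sequence[float], labels: Sequence[float]) -> List[float]:
    """Hold out each subject once, fit a line on the rest, predict the held-out label."""
    predictions = []
    for held_out in range(len(labels)):
        train_x = [f for i, f in enumerate(features) if i != held_out]
        train_y = [l for i, l in enumerate(labels) if i != held_out]
        n = len(train_x)
        mx, my = sum(train_x) / n, sum(train_y) / n
        sxx = sum((a - mx) ** 2 for a in train_x)
        sxy = sum((a - mx) * (b - my) for a, b in zip(train_x, train_y))
        slope = sxy / sxx if sxx > 0 else 0.0
        predictions.append(my + slope * (features[held_out] - mx))
    return predictions


def permutation_p_value(features: Sequence[float], labels: Sequence[float],
                        n_permutations: int = 1000, seed: int = 0) -> float:
    """Compare the observed r with r obtained after permuting labels across subjects."""
    rng = random.Random(seed)
    observed = pearson_r(labels, loso_predictions(features, labels))
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        # The whole cross-validation is repeated for every permutation.
        if pearson_r(shuffled, loso_predictions(features, shuffled)) >= observed:
            exceed += 1
    return (exceed + 1) / (n_permutations + 1)


# Invented toy data: one feature per subject, 20 subjects.
features = [float(i) for i in range(20)]
labels = [2.0 * f + (1.0 if i % 2 else -1.0) for i, f in enumerate(features)]
p_value = permutation_p_value(features, labels, n_permutations=200)
```

The same permutation scheme applies to any performance measure; for an error measure such as the nMSE, the comparison would count permutations with an error at or below the observed one.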

Definition of brain regions and atlas registration
Based on a-priori knowledge of brain regions involved in fear processing, we limited the feature space to bilateral fear-related brain regions including the amygdala, hippocampus, thalamus, anterior cingulate, insula, medial prefrontal and orbitofrontal cortices (Tovote et al., 2015;Braem et al., 2017;Meier et al., 2014). The respective brain regions were parcellated according to the Automated Anatomical Labeling (AAL, see Table 3 for the different labels) (Tzourio-Mazoyer et al., 2002) atlas and projected on the ICBM 152 Nonlinear template (section 2.4) by means of MATLAB (version R2017b) based surface-volume registration tools (svreg) implemented in BrainSuite (version 17a) (Shattuck and Leahy, 2002). BrainSuite was also used to generate surfaces of the selected AAL regions for visualization.

Model evaluation and interpretation
Model performance was assessed by two metrics commonly used to assess the performance of regression models (Ivanescu et al., 2016; Fernandes et al., 2017): Pearson's correlation coefficient (r) and the mean squared error (MSE). The correlation coefficient characterizes the linear relationship between observed and predicted labels; the MSE is calculated as the average of the squared differences between the observed and predicted labels. A significant positive correlation between observed and predicted labels would indicate strong decoding performance.
Unlike in conventional correlation analysis, however, a negative correlation would indicate poor performance. For each model, we report the normalized MSE (nMSE) because the different questionnaires are based on different score ranges. To explore possible differential contributions of fear-related brain regions to the prediction models, we report the contribution rank of each brain region (region weight) within each condition (condition weight) provided by the MKL approach (Table 3). Importantly, the selection of regions by the MKL model might be influenced by small variations in the dataset (as induced by cross-validation) and might therefore lead to different subsets of regions being selected across cross-validation steps (folds). Providing a quantification of this variability, the "expected ranking (ER)" (see Table 3) characterizes the stability of the region ranking across folds: The closer the ER to the ranking of the selected fold, the more consistent is the ranking of the respective brain region across folds. On the other hand, if the ER is different from the ranking, this means that the ranking might be variable across folds.
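The two performance metrics introduced at the beginning of this section can be written out in a few lines of Python. The sketch below is illustrative only: the example scores are invented, and the nMSE is assumed to be the MSE divided by the range of the observed labels, since the text does not state the normalization explicitly.

```python
from typing import Sequence


def pearson_r(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Linear correlation between observed and predicted labels."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / (var_o * var_p) ** 0.5


def mean_squared_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Average squared difference between observed and predicted labels."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def normalized_mse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """MSE divided by the range of the observed labels (assumed normalization)."""
    label_range = max(observed) - min(observed)
    return mean_squared_error(observed, predicted) / label_range


# Invented questionnaire scores and model predictions for four patients.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 41.0]
r = pearson_r(observed, predicted)
mse = mean_squared_error(observed, predicted)   # (4 + 4 + 9 + 1) / 4 = 4.5
nmse = normalized_mse(observed, predicted)      # 4.5 / (40 - 10) = 0.15
```

Dividing by the label range makes errors comparable across questionnaires with different score ranges, which is the motivation given above for reporting the nMSE.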

Ratings, questionnaire scores and correlations
Importantly, the comparison of the ratings during fMRI measurements demonstrated that the potentially harmful activities were perceived as being significantly more harmful compared to the harmless activities (paired t-test: t = 8.22, p < 0.001, two-tailed). Descriptive statistics of the different questionnaires as well as age of the patients are summarized in Table 1. Regarding the questionnaire data, visual inspection and the Shapiro-Wilk test indicated non-normality of the data (p < 0.05) of several questionnaires (FABQ, FABQ-W, FABQ-PA and T-Anxiety) and therefore, the non-parametric Spearman's rank correlation coefficient was used. Several significant positive correlations between the different PRF questionnaire scores were observed (p < 0.05, Table 2).
Most of the TSK scales significantly correlated with the PASS scales (0.46 < r's < 0.97, p < 0.05) whereas the FABQ work scale did not show significant relationships with the TSK and PASS scales (p > 0.05), except for the PASS-F scale (r = 0.49, p < 0.05). Furthermore, only the S-Anxiety scale of the STAI scale demonstrated significant correlations with some, but not all, TSK scales (0.44 < r's < 0.63, p < 0.05). Finally, only the PASS-F scale showed a positive and significant relationship with the mean rating of the harmful condition (r = 0.44, p < 0.05, Table 2).
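For readers unfamiliar with the rank-based correlation used in this section, the following Python sketch shows how Spearman's coefficient can be computed as the Pearson correlation of tie-averaged ranks. The analysis reported here was run in IBM SPSS Statistics; the code is only an illustration, and the example scores are invented.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Invented scores of two questionnaires for five patients.
scale_a = [12.0, 30.0, 25.0, 41.0, 18.0]
scale_b = [20.0, 35.0, 33.0, 50.0, 22.0]
rho = spearman_rho(scale_a, scale_b)
```

Because only the ordering of the scores enters the computation, the coefficient is insensitive to the non-normal score distributions noted above.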

Model performance
The MKL models with significant performance results (p < 0.05), characterized by Pearson's correlation coefficient (r) and the normalized mean squared error (nMSE), are depicted in Figure 1 (A-E). The FABQ model demonstrated a significant decoding performance characterized by a positive correlation between true and predicted labels (r = 0.61, nMSE = 4.25, p < 0.05).
Interestingly, the FABQ-W model showed strong predictive power (r = 0.74, nMSE = 1.81, p < 0.05) whereas the FABQ-PA scale was not decodable from fear-related brain response patterns (r

Condition and region weights
The condition and region weights of models with significant performance (p < 0.05, section 3.2) are illustrated in Figure 1 (A-E) and described in detail in Table 3 (A-E). The decoding performances of the FABQ models (FABQ and FABQ-W) were driven by a major contribution of the harmful condition (88% and 87%, respectively). Within this condition, the left thalamus (rank 1), the right amygdala (rank 2) and the left hippocampus (rank 3) contributed more than 69% of the total region weights in the FABQ model (Table 3A). Similarly, the right amygdala (rank 1) and the left thalamus (rank 2) carried the most predictive neural information with 79.42% of the total region weights in the FABQ-W model (Table 3B). In both FABQ models, the right amygdala also demonstrated an association with the harmless condition, although of minor relevance (~11%). By comparison, the TSK models demonstrated a moderate contribution of the harmful condition (TSK-13: 60%, TSK-11: 66%). Both predictive model performances of the TSK were driven by a major contribution of the right lateral orbitofrontal cortex (lOFC, TSK-13: 52.7%, TSK-11: 60.49%, Table 3C, 3D). Furthermore, the left medial orbitofrontal cortex (mOFC) and the right hippocampus carried predictive information within the harmless condition in both TSK models (TSK-13: left gyrus rectus: 19.51%, right hippocampus: 14.03% / TSK-11: left gyrus rectus: 21.29%, right hippocampus: 10.41%).
Interestingly, the predictive model of T-Anxiety showed almost equal contributions of the harmful (52%) and harmless (48%) conditions. It was mainly driven by neural contributions of the left medial prefrontal cortex (mPFC) and mOFC (accounting for 44% of the total region weights in the harmful condition) and the left thalamus (together with the mOFC accounting for 44% of the total region weights in the harmless condition, Table 3E).
Finally, the secondary cross-validation using each predictive brain set of the significant models (FABQ, TSK-13, TSK-11, T-Anxiety) and training and testing it with the labels of the other questionnaires did not result in significant performance results (p's > 0.05).

Discussion
A large body of evidence from cross-sectional and longitudinal studies underpins the prognostic value of PRF regarding disability in chronic pain (Leeuw et al., 2007a; Esteve et al., 2017; et al., 2014b). PRF and related disability have also received affirmation on the brain level: individuals with elevated PRF exhibit behaviors promoting pain-related disability that is reflected in altered neural processing of fear-related brain regions (Simons et al., 2014a; Simons, 2016; Seifert et al., 2011). However, the construct validity of PRF questionnaires has been challenged (Lundberg et al., 2011). Because differential neural sources within the fear network might underlie the various fear constructs, novel neuroimaging methods based on pattern regression techniques might help to support or question the various fear constructs used in PRF questionnaires. The results revealed that, while the individual variability among the FABQ, FABQ-W, TSK-13, TSK-11 and T-Anxiety scales was predictable from brain response patterns in fear-related but dissociable brain regions, this was not the case for the FABQ-PA scale, the TSK-11 subscales (TSK-11-AA and TSK-SF), the PASS scales and the S-Anxiety scale. Furthermore, the online ratings of perceived harmfulness were not decodable from fear-related brain response patterns.

FABQ and TSK
In comparison to the other self-report measures of PRF, the FABQ and FABQ-W scales were best predicted by fear-related brain response patterns, characterized by a strong contribution of neural information in the harmful condition (88% and 87%, respectively) and key regions of the fear network such as the amygdala, thalamus and hippocampus. These regional neural contributions were clearly dissociable from those of the TSK models, as revealed by the secondary cross-validation.
Interestingly, the FABQ-PA scale did not show a predictive association with fear-related brain response patterns. The strong association of the FABQ-W with fear-related neural information might underpin the emerging evidence indicating that the FABQ-W is a better moderator of treatment efficacy in chronic LBP compared to the FABQ-PA, although this might be dependent on the patient population (George et al., 2008; Waddell et al., 1993; 2014a). In support of this, the FABQ-W scale was the only PRF measure that qualified for a clinical prediction rule regarding improvement after spinal manipulation (Dougherty et al., 2014; Flynn et al., 2002).
Among the TSK scales, the TSK-13 and the TSK-11 scales demonstrated a predictive association with fear-related brain response patterns, albeit with moderate contribution of the harmful condition (60% and 66%, respectively). The TSK-11 version showed a stronger relationship between true and predicted labels compared to the TSK-13 version (r = 0.60, nMSE = 0.90, p < 0.05). This result might reflect the progress of previous research regarding the psychometric properties of the different TSK versions. Compared to the 17-item version, the 13-item version was found to have better psychometric properties without the four inversely phrased items (Roelofs et al., 2004a;Neblett et al., 2016) and the 11-item version has been recommended for future research and clinical settings (for a chronological summary see Tkachuk and Harris, 2012).
Interestingly, the right lateral orbitofrontal cortex (lOFC) provided the most predictive information for the two TSK scales (TSK-13: 52%, TSK-11: 60%). In agreement with the TSK's fear construct of kinesiophobia, dysfunction of the OFC has been shown to be implicated in the processing of phobia-related stimuli in disorders such as social anxiety disorder and specific phobia (Dilger et al., 2003). Specifically, lOFC activity was reduced when phobogenic trials were contrasted with fear-relevant trials (Aue et al., 2015). Furthermore, a hyperactive lOFC seems to be linked to anxiety-laden cognitions (Hahn et al., 2011). Therefore, compared to the FABQ scales, it seems that the TSK scales are more strongly associated with fear-related neural information of higher order cognitive brain regions. Interestingly, no predictive association could be "learned" by MKL using the TSK-11 subscale labels (TSK-11-SF and TSK-11-AA scores). Although these two lower order factors (activity avoidance and somatic focus) are reflective of the higher order construct "fear of movement and (re)injury/kinesiophobia", the non-significant result might indicate that they are associated with inconsistent neural patterns.
The superiority of the FABQ in decoding performance, driven by the FABQ-W scale, might be explained by the LBP-specific items of the FABQ in conjunction with the nature of the PRF stimuli (bending of the back) used in the current experiment. Compared to the TSK and PASS scales, the items of the FABQ were specifically related to LBP while the TSK and PASS can be used with various musculoskeletal pain diagnoses such as work-related upper extremity disorders, chronic LBP, fibromyalgia, and osteoarthritis (Roelofs et al., 2007). However, the FABQ has also been adapted to shoulder pain where it demonstrated better factor structure and a stronger association with disability compared to the TSK-11 (Mintken et al., 2010).

PASS
Surprisingly, the PASS failed to demonstrate a predictive association with fear-related brain response patterns. There may be several explanations. First, whereas the FABQ and the TSK scales have been specifically developed for patients with musculoskeletal pain, the PASS is suitable for various pain phenotypes (Crombez et al., 1999). Second, the PASS has been shown to be more strongly associated with negative affect and was less predictive of pain disability and behavioral performance (Crombez et al., 1999). Third, in a recent study assessing fear of bending, the PASS (and TSK) scores were not related to physiological startle responses (Caneiro et al., 2017). Fourth, all PASS subscales demonstrated significant multicollinearity in our sample, which suggests non-independence between the different subscales. All these aspects may have led to less sensitivity of fear-related neural patterns to the PASS and its subscales in the current study sample.
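As an illustration of how multicollinearity among subscales can be quantified, the sketch below computes variance inflation factors (VIFs) from the correlation matrix of the subscale scores. This is only one common diagnostic and an assumption here: the text does not state which procedure was used to assess multicollinearity. The function name and the toy data are hypothetical.

```python
import numpy as np


def variance_inflation_factors(scores):
    """Return one VIF per column of a (participants x subscales) array.

    Uses the identity that the VIFs are the diagonal entries of the
    inverse of the correlation matrix. A VIF of 1 means a subscale is
    uncorrelated with the others; larger values indicate redundancy.
    Perfectly collinear columns make the matrix singular, in which case
    numpy raises numpy.linalg.LinAlgError.
    """
    scores = np.asarray(scores, dtype=float)
    if scores.ndim != 2 or scores.shape[1] < 2:
        raise ValueError("need a 2-D array with at least two subscales")
    corr = np.corrcoef(scores, rowvar=False)
    return np.diag(np.linalg.inv(corr))


if __name__ == "__main__":
    # Toy subscale scores (made up): two nearly redundant columns.
    toy_scores = np.array([
        [1.0, 1.0],
        [2.0, 2.0],
        [3.0, 3.0],
        [4.0, 4.0],
        [5.0, 6.0],
    ])
    print(variance_inflation_factors(toy_scores))
```

A frequently cited rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic multicollinearity, although such thresholds are conventions rather than strict criteria.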

State and Trait anxiety
Besides fear, anxiety and depression significantly mediate the relationship between pain and disability (Marshall et al., 2017). Nevertheless, fear responses specifically related to a patient's pain and/or potentially painful movements might be more relevant for explaining disability in chronic LBP than general trait anxiety responses (McCracken et al., 1996). The current results are in line with this notion. First, most of the PRF measures did not show a significant relationship with state or trait anxiety. Second, state anxiety was not decodable from fear-related brain response patterns in chronic pain patients. Interestingly, with respect to the trait anxiety model (T-Anxiety, Figure 1E), the harmful (52%) and the harmless conditions (48%) carried almost equal predictive information. This suggests that the trait anxiety measure is associated with PRF-unrelated neural content provoked by, e.g., enhanced attention to visual information in fear processing brain regions (Berggren et al., 2015). In support of this, the respective predictive information was predominantly provided by fear-related brain regions that were less involved in the prediction of the other PRF measures, namely parts of the mPFC and mOFC (Table 3E). The strong neural contribution of the harmless condition might be driven by a generalized fear response. This notion is supported by a study showing that healthy individuals with high trait anxiety exhibit sustained PRF during extinction (Meulders et al., 2014).
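The condition and region percentages discussed in this section can be read as shares of the total kernel weight in a multiple kernel model. The sketch below shows one way such shares could be derived, assuming one non-negative weight per (region, condition) kernel; this bookkeeping is an assumption for illustration and is not taken from the authors' pipeline. The function name, region labels, and weights are hypothetical.

```python
from collections import defaultdict


def contribution_percent(kernel_weights, axis):
    """Aggregate kernel weights into percentage contributions.

    kernel_weights maps (region, condition) tuples to non-negative
    weights. axis=0 sums over conditions to give per-region shares;
    axis=1 sums over regions to give per-condition shares.
    """
    total = sum(kernel_weights.values())
    if total <= 0:
        raise ValueError("kernel weights must sum to a positive value")
    sums = defaultdict(float)
    for key, weight in kernel_weights.items():
        sums[key[axis]] += weight
    return {name: 100.0 * value / total for name, value in sums.items()}


if __name__ == "__main__":
    # Toy weights (made up for illustration, not study results).
    toy_weights = {
        ("amygdala", "harmful"): 0.4,
        ("amygdala", "harmless"): 0.1,
        ("lOFC", "harmful"): 0.3,
        ("lOFC", "harmless"): 0.2,
    }
    print(contribution_percent(toy_weights, axis=1))  # per condition
    print(contribution_percent(toy_weights, axis=0))  # per region
```

Under this reading, near-equal condition shares mean that the model draws about as much weight from kernels built on harmless stimuli as from those built on harmful stimuli.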

Harmfulness ratings
Interestingly, although the harmful activities were rated as significantly more harmful than the harmless activities, the ratings of perceived harmfulness during fMRI measurements were not decodable from fear-related brain response patterns. Furthermore, the ratings did not show a significant relationship with PRF measures (except the PASS-F scale, see Table 2), indicating non-overlapping constructs between these measures. Other investigations have made similar observations: in a study by Crombez and colleagues (1999), the FABQ and TSK were not significantly related to the expectancy of pain during a behavioral test. Furthermore, in a study investigating low- and high-avoidant patients, no significant differences were found in the anticipation of pain during the confrontation with back-straining movements (Crombez et al., 1998). These results suggest that measures such as ratings of pain anticipation or perceived harmfulness are not useful as proxies of PRF.

Conclusions
This is the first time that multivariate brain response patterns have been used to better understand a psychological construct, here PRF, that is conventionally assessed by self-report (questionnaires). This approach allowed the identification of non-overlapping neural source patterns associated with the different self-report measures, supporting the existence of various (pain-related) fear constructs. The FABQ, in particular the FABQ-W scale, demonstrated strong predictive power with high sensitivity to the harmful condition and was associated with subcortical fear processing regions (amygdala, thalamus, hippocampus). The TSK scales were more related to neural content of the OFC and showed less sensitivity to the harmful condition compared to the FABQ scales. Finally, the trait anxiety model did not favor a specific experimental condition, potentially indicating that trait anxiety is not greatly influenced by fear of movement and/or is linked to a generalized fear response.

Limitations
A limitation is the relatively small sample size in conjunction with the cross-validation framework.

Minimum
indicates not visible contralateral homologue.
bioRxiv preprint first posted online Jan. 25, 2018; doi: http://dx.doi.org/10.1101/251751. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.