Pain-Related Fear-Dissociable Neural Sources of Different Fear Constructs.

Fear of pain demonstrates significant prognostic value regarding the development of persistent musculoskeletal pain and disability. Its assessment often relies on self-report measures of pain-related fear by a variety of questionnaires. However, based either on "fear of movement/(re)injury/kinesiophobia," "fear avoidance beliefs," or "pain anxiety," pain-related fear constructs plausibly differ while it is unclear how specific the questionnaires are in assessing these different constructs. Furthermore, the relationship of pain-related fear to other anxiety measures such as state or trait anxiety remains ambiguous. Advances in neuroimaging such as machine learning on brain activity patterns recorded by functional magnetic resonance imaging might help to dissect commonalities or differences across pain-related fear constructs. We applied a pattern regression approach in 20 human patients with nonspecific chronic low back pain to reveal predictive relationships between fear-related neural pattern information and different pain-related fear questionnaires. More specifically, the applied multiple kernel learning approach allowed the generation of models to predict the questionnaire scores based on a hierarchical ranking of fear-related neural patterns induced by viewing videos of activities potentially harmful for the back. We sought to find evidence for or against overlapping pain-related fear constructs by comparing the questionnaire prediction models according to their predictive abilities and associated neural contributors. By demonstrating evidence of nonoverlapping neural predictors within fear-processing regions, the results underpin the diversity of pain-related fear constructs. This neuroscientific approach might ultimately help to further understand and dissect psychological pain-related fear constructs.


Introduction
Self-report measures of emotional states are paramount for behavioral neuroscience by enabling the understanding of brain response patterns (Shrout et al., 2018). However, the validity of self-reports is limited (Choi and Pak, 2005), probably also because often overlapping psychological constructs are assessed, illustrated by the fact that various questionnaires attempt to assess related constructs. One example is pain-related fear (PRF), which is a major explanatory variable of disability in patients with persistent musculoskeletal pain (Crombez et al., 1999;. For the assessment of PRF, various questionnaires exist based on potentially different constructs such as fear of movement/injury and reinjury/kinesiophobia, fear avoidance beliefs, or pain anxiety. There is an open debate on what their scores reflect on the fear-anxiety spectrum (Lundberg et al., 2011;Caneiro et al., 2017). Fear represents a reaction to an imminent threat, preparing the individual for "fight-flight-freeze," whereas anxiety is described as being more diffuse (e.g., cognitions about a future threat; LeDoux and Pine, 2016; Kreddig and Hasenbring, 2017). While PRF questionnaires do not clearly distinguish between these emotions (Lundberg et al., 2011;Kreddig and Hasenbring, 2017), brain research provides evidence for a functional differentiation of fear and anxiety. Both emotions are controlled by the fear circuit (Tovote et al., 2015); however, subcortical regions (e.g., the amygdala) seem to be more involved in fast and defensive fear reactions (short defensive distance to threat) while cortical regions (e.g., the prefrontal cortex) are more likely to be responsible for complex cognitions of anxiety (large defensive distance to threat; McNaughton and Corr, 2004;Qi et al., 2018). Therefore, advances in neuroimaging enable exploring the subcortical/cortical contributions to PRF constructs by examining interrelations between self-reported emotional states and brain response patterns. Specifically, machine learning techniques such as multivariate pattern analysis (MVPA) applied to functional magnetic resonance imaging (fMRI) data make it possible to directly study the predictive relationship between a contentselective cognitive or emotional state (expressed as a label) and corresponding multivoxel fMRI activity patterns (Haynes, 2015;Hebart and Baker, 2017). The label may have discrete (classification) or continuous (regression) values such as questionnaire scores (Formisano et al., 2008). Back-straining activities (i.e., bending and lifting) are the most feared and pain-provoking movements among people with low back pain (LBP), based on ratings of perceived harmfulness or physiologic responses (Leeuw et al., 2007a;Glombiewski et al., 2015;Stevens et al., 2016;Caneiro et al., 2017). As such, bending and lifting, either active or passive (e.g., through pictures) have been frequently used to provoke PRF (Leeuw et al., 2007c;Trost et al., 2009;Barke et al., 2016;Caneiro et al., 2017). Therefore, we provoked PRF by presenting video clips of daily activities including bending and lifting (harmful condition) and harmless activities such as walking (harmless condition) in a sample of 20 patients with nonspecific chronic LBP. We applied a pattern regression analysis in combination with multiple kernel learning (MKL) to assess potential neural predictors of the various PRF constructs based on the weighting of (1) harmful and harmless conditions (condition weights) and (2) pattern information within subcortical and cortical fear-processing regions (region weights). We first contrasted the different PRF questionnaires in terms of their model performance, namely the ability of the model to predict the questionnaire scores based on brain response patterns across fear-processing regions. Second, we compared the different prediction models according to the distributions of their condition and region weights to explore potential neural commonalities or differences of related PRF constructs. If the PRF questionnaires share overlapping PRF constructs, then the region weights should be similarly distributed across fear-processing regions. Conversely, if the contributing brain regions vary across the prediction models, this would provide evidence for nonoverlapping PRF constructs across questionnaires. Ultimately, this approach might help to further understand and dissect the various PRF constructs in chronic LBP. satisfactory internal consistency (Cronbach's ␣ ϭ 0.76 -0.84) contains statements focusing on the fear of physical activity rated on a 4-point Likert scale from 1 (strongly disagree) to 4 (strongly agree; Rusu et al., 2014). Due to additional versions of the original 17-item TSK questionnaire, we also calculated the questionnaire scores of the 13-and 11-item TSK versions (TSK-13, TSK-11). The 13and 11-item versions were previously validated by confirmatory factor analysis and demonstrated acceptable levels of internal consistency (Cronbach's ␣ ϭ 0.80; Goubert et al., 2004;Tkachuk and Harris, 2012). A two-factor solution of the TSK-11 version provides the best fit in terms of explaining variance across German, Dutch, Swedish, and Canadian samples, and included the subscales "activity-avoidance" (TSK-AA; the belief that that activity may result in injury/reinjury or stronger pain) and "somatic focus" (TSK-SF; the belief in underlying and serious medical problems; Roelofs et al., 2007;Rusu et al., 2014).
(2) The German version of the fear avoidance beliefs questionnaire (FABQ;Waddell et al., 1993;Pfingsten et al., 2000) consists of 16 back pain-specific items related to fear avoidance beliefs rated on a 7-point rating scale (0, completely disagree; 6, completely agree). It includes two distinct and established subscales related to beliefs about about how work (FABQ-W) and physical activity (FABQ-PA) affect LBP, with internal consistencies of ␣ ϭ 0.88 and ␣ ϭ 0.77, respectively (Waddell et al., 1993).
(3) The short version of the pain anxiety symptoms scale (PASS-20) assesses fear and anxiety responses related to pain including cognitive, physiologic, and motor response domains (McCracken and Dhingra, 2002). Items on the PASS-20 are measured on a 6-point Likert scale and relate to four different subscales, including cognitive anxiety (PASS-C), fear (PASS-F), physiology (PASS-P), and escape/avoidance (PASS-E; Roelofs et al., 2004b). The German version of the PASS-20 has an internal consistency of ␣ ϭ 0.90 (Kreddig et al., 2015).
Furthermore, patients were asked to fill out the painDE-TECT questionnaire, which includes three 11-point numeric rating scales, with 0 being "no pain" and 10 being the "worst imaginable pain" to assess current pain, strongest, and average pain intensity in the previous 4 weeks (Freynhagen et al., 2006). Finally, to investigate potential differences or shared variance between PRF and general anxiety, we used the State-Trait Anxiety Inventory (STAI), the most widely used self-report measure of anxiety, which includes two subscales (Spielberger and Gorsuch, 1983;Julian, 2011): the State Anxiety Scale (S-Anxiety) assesses current levels of anxiety, whereas the Trait Anxiety Scale (T-Anxiety) evaluates more stable aspects of anxiety such as "anxiety proneness" (Julian, 2011). All questionnaires were administered at the fMRI appointment before brain scanning. We tested the scores of the different questionnaires for the assumption of normality of the data using the Shapiro-Wilk test and visually using Q-Q plots implemented in SPSS Statistics (version 23, IBM;Ghasemi and Zahediasl, 2012).

Scanning protocol and design
Brain imaging was performed on a 3 T whole-body MRI system (Achieva, Philips), equipped with a 32-element receiving head coil and MultiTransmit parallel RF transmission. Each imaging session started with a survey scan, a B1 calibration scan (for MultiTransmit), and a SENSE reference scan. High-resolution anatomic data were obtained with a 3D T1-weighted (T1w) turbo field echo scan consisting of 145 slices in sagittal orientation with the following parameters: field of view (FOV) ϭ 230 ϫ 226 mm 2 ; slice thickness ϭ 1.2 mm; acquisition matrix ϭ 208 ϫ 203 (resulting in a voxel resolution of 1.1 ϫ 1.1 ϫ 1.2 mm); TR ϭ 6.8 ms; TE ϭ 3.1 ms; flip angle ϭ 9°; number of signal averages ϭ 1. Functional time series were acquired using whole-brain gradient-echo echoplanar imaging sequences (365 volumes), consisting of 37 slices in the axial direction (anterior commissure-posterior commissure angulation) with the following parameters: FOV ϭ 240 ϫ 240 mm 2 ; acquisition matrix ϭ 96 ϫ 96; slice thickness ϭ 2.8 mm (resulting in a voxel resolution of 2.5 ϫ 2.5 ϫ 2.8 mm); interleaved slice acquisition; no slice gap; TR ϭ 2100 ms; TE ϭ 30 ms; SENSE factor ϭ 2.5; flip angle ϭ 80°. The PRF-provoking stimuli (harmful condition) consisted of video clips with a duration of 4 s recorded from a third-person perspective (Meier et al., 2016). The video clips showed potentially harmful activities (back-straining movements such as bending and lifting) selected from the Photograph Series of Daily Activities (PHODA; Leeuw et al., 2007a). The original PHODA was developed in close collaboration with human movement scientists, physical therapists, and psychologists, and is composed of a fear hierarchy based on ratings of the perceived harmfulness of daily activities in patients with chronic LBP. From the 40 potentially harmful activities included in the short electronic PHODA version (Leeuw et al., 2007a), we chose three scenarios from the top six most harmful activities, namely shoveling soil with a bent back, lifting a flowerpot with slightly bent back, and vacuum cleaning under a coffee table with a bent back. Furthermore, we created video clips of three activities rated as less harmful, such as walking up and down the stairs and walking on even ground (harmless condition). Presentation software (Neurobehavioral Systems) was used to present the video clips in a pseudorandomized order (no more than two identical consecutive trials). The patients were asked to carefully observe the video clips, which were displayed using MRcompatible goggles (Resonance Technology). The three harmful and harmless activities were each presented five times (30 trials total). After the observation of each video clip, the patients were asked to rate the perceived harmfulness of the activity on a visual analog scale (VAS) anchored with the endpoints "not harmful at all" (0) and "extremely harmful" (10). All ratings were performed using an MR-compatible track ball (Current Designs). After the VAS rating, a black screen with a green fixation cross appeared (duration jittered between 6 and 8 s). We have used this experimental protocol successfully for investigations of neural correlates of PRF self-reports in previous fMRI studies based on mass-univariate analyses (Meier et al., 2016(Meier et al., , 2017.

MR data organization and preprocessing
We used an existing fMRI dataset of previously reported studies (Meier et al., 2016(Meier et al., , 2017. The fMRI data were organized according to the Brain Imaging Data Structure (RRID:SCR_016124; http://bids.neuroimaging.io/), which provides a consensus on how to organize data obtained in neuroimaging experiments. Preprocessing was performed using FMRIPREP (version 1.0.0-rc2, RRID:SCR_016216; https://github.com/poldracklab/fmriprep), a Nipype based tool (Gorgolewski et al., 2011), which requires minimal user input and provides easily interpretable and comprehensive error and output reporting. This processing pipeline includes state-of-the-art software packages for each step of preprocessing (for a detailed description of the different workflows, see https://fmriprep.readthedocs.io/en/stable/workflows. html). Each T1w volume was skullstripped using ants-BrainExtraction.sh version 2.1.0 (using OASIS template). The skullstripped T1w volume was coregistered to the skullstripped ICBM 152 Nonlinear Asymmetrical MNI template version 2009c using nonlinear transformation implemented in ANTs version 2.1.0 (Avants et al., 2008). Functional data were slice time corrected using AFNI (Cox, 1996) and motion corrected using MCFLIRT version 5.0.9 (Jenkinson et al., 2002). This was followed by coregistration to the corresponding T1w volume using boundarybased registration 9 df implemented in FreeSurfer version 6.0.0 (Greve and Fischl, 2009). Motion-correcting transformations, T1w transformation, and MNI template warp were applied in a single step using antsApplyTransformations version 2.1.0 with Lanczos interpolation. Three tissue classes were extracted from T1w images using FSL FAST version 5.0.9 (Zhang et al., 2001). Voxels from CSF and white matter were used to create a mask used to extract physiologic noise regressors using aCompCor (Behzadi et al., 2007). The mask was eroded and limited to subcortical regions to limit overlap with gray matter, and six principal components were estimated. Independent component analysis-based automatic removal of motion artifacts (AROMA) was used to generate aggressive motion-related noise regressors. The AROMA classifier identifies motion components with high accuracy and robustness and is superior to motion artifact detection using 24 motion parameters or spike regression (Pruim et al., 2015). Finally, to preserve high spatial frequency while reducing noise, spatial smoothing with a full-width at half-maximum 4 mm Gaussian kernel was applied. To accelerate data preprocessing, we performed parallel computing using the Docker environment (https://www.docker.com/) and the GC3Pie framework (https://github.com/uzh/gc3pie) on the Science-Cloud supercomputing environment at the University of Zurich (S3IT; https://www.s3it.uzh.ch/).

MVPA input data
The preprocessed data were subsequently passed onto the Statistical Parametric Mapping software package (SPM12, version 6906; RRID:SCR_007037; http://www. fil.ion.ucl.ac.uk/spm/) for model computation using a general linear model (GLM). For each patient, a design matrix was built with separate regressors for the harmful and harmless activities, respectively (15 harmful and 15 harmless stimuli). The video clips were modeled as boxcar functions (onset ϭ onset of video clip; duration ϭ 4 s) and convolved with the standard canonical hemodynamic response function, as implemented in SPM12. In addition, the following nuisance regressors were implemented in the GLM model for each patient: (1) the six regressors derived from the component-based physiologic noise correction method (aCompCor) and (2) the motion-related regressors generated by AROMA (see MR data organization and preprocessing section). A high-pass filter with a cutoff of 128 s was used to remove low-frequency noise. Finally, for each patient, voxelwise ␤ images for each condition were computed and served as the input images for the MVPA.

Multivariate pattern analysis
Compared with univariate analyses, MVPA can achieve greater sensitivity and is able to detect subtle and spatially distributed effects (Schrouff et al., 2013;Haynes, 2015). A pattern of activity can represent many more different states than each voxel individually, which leads to an information-based view compared with the activation-based view of univariate analyses (Hebart and Baker, 2017). MVPA was performed using routines implemented in PRoNTo version 2.0 (RRID:SCR_006908; http://www.mlnl.cs.ucl.ac.uk/pronto/; Schrouff et al., 2013). For the readout of multivariate neural information that might serve as a potential score estimator of the different PRF questionnaires, we applied a newly introduced pattern regression approach based on supervised machine learning and testing phases using MKL. In brief, the objective in supervised pattern recognition regression analysis is to learn a function from data that can accurately predict the continuous values (labels; i.e., f(x i ) ϭ y i from a given dataset D ϭ {x i , y i }, i ϭ 1. . .N, where x i represents pairs of samples or vectors and y i represents the different labels). Ultimately, the learned function from the learning set is used to predict the labels from new and unseen data (Schrouff et al., 2013). MKL allows accounting for brain anatomy (determined by a brain atlas; see Feature selection) and different modalities (e.g., anatomic/ functional data or in the current approach: conditions) during the model estimation by considering each brain region and modality as separate kernels. This approach allows determination of the contribution of each brain region (region weights) and condition (condition weights) to the final decision function of the model in a hierarchical manner by simultaneously learning and combining the different linear kernels that are based on support vector machines (SVMs; Rakotomamonjy et al., 2008;Fernandes et al., 2017;Schrouff et al., 2018). Compared with conventional MVPA methods based on whole-brain voxel weight maps, this procedure provides a straightforward approach to draw inferences on the region level without the need for multiple comparison correction (Schrouff et al., 2018). To account for possible differential contributions of the harmful and harmless conditions to the decision function, we included the individual SPM ␤ images of each condition as separate modalities in the MKL model (condition weights). The kernels were mean centered and normalized (to account for the different sizes of the involved brain regions) using standard routines implemented in PRoNTo. Subsequently, for each questionnaire, we trained a separate MKL regression model with the respective labels (FABQ, TSK-17-, TSK-13, TSK-11, PASS and all subscale scores, and state and trait anxiety). Furthermore, we trained MKL regression models based on the harmfulness ratings collected during the fMRI measurements (mean ratings of the harmful condition and harmless condition, respectively). This resulted in a total of 17 MKL models providing outputs for model evaluation, including model performance, region, and conditions weights. To reduce the risk of overfitting for each model, we applied a nested cross-validation procedure using a "leave-one-subject-out" cross-validation scheme to train the model including optimization of the model hyperparameter "C" (range, [0.1, 1, 10, 100, 1000]). Furthermore, to generate a data-based null distribution of the performance measures [r and normalized mean squared error (nMSE); see Model evaluation and interpretation], each model was recomputed 16,000 times with permuted labels (permuted questionnaire score per subject) using parallel computing. Multiple-comparison correction for the model performance (r values and nMSE) was based on a false discovery rate (FDR) of 5% (p(FDR) Ͻ 0.05). As a note, by controlling the expected proportion of false-positives, FDRcontrolling procedures provide less stringent control of type I errors compared with other procedures, such as the Bonferroni correction, which control the probability of at least one type I error. In addition, each model representing a potential PRF construct [i.e., a model with a significant (FDR corrected and uncorrected) performance] was trained and tested through an additional cross-validation procedure using each predictive feature set (brain regions that contributed Ͼ10%; see Table 5) of the other models (betweenmodel cross-validation; e.g., training and testing of the FABQ labels was repeated using the predictive feature sets of the TSK-11, TSK-13, and T-Anxiety models). A failure of predictive performance in the between-model crossvalidation would point toward a dissociation of brain regions contributing to the different models and would therefore be indicative of nonoverlapping PRF constructs.

Feature selection
To further reduce the risk of overfitting and based on a priori knowledge of brain regions involved in fear processing, we limited the feature space to bilateral fear-related brain regions including the amygdala, hippocampus, thalamus, anterior cingulate, insula, and medial prefrontal, and orbitofrontal cortices (Meier et al., 2014;Tovote et al., 2015;Braem et al., 2017). The respective brain regions were parcellated according to the automated anatomic labeling (AAL; RRID: SCR_003550; http://www.gin.cnrs.fr/en/tools/aal-aal2/; see Table 5 for the different labels; Tzourio-Mazoyer et al., 2002) atlas and projected on the ICBM 152 Nonlinear template (see MR data organization and preprocessing) by means of MATLAB (version R2017b)-based surface-volume registration tools (svreg) implemented in BrainSuite (version 17a; RRID:SCR_006623; http://brainsuite.org/; Shattuck and Leahy, 2002). BrainSuite was also used to generate surfaces of the selected AAL regions for visualization.

Model evaluation and interpretation
Model performance was assessed by two metrics commonly used to assess the performance of regression models (Ivanescu et al., 2016;Fernandes et al., 2017), as follows: Pearsons's correlation coefficient (r) and the MSE. The correlation coefficient characterizes the linear relationship between true and predicted labels; the MSE is calculated as the average of the squared differences between the true and predicted labels. A significant positive correlation between true and predicted labels would indicate strong decoding performance. Unlike in conventional correlation analysis, however, a negative correlation would indicate poor performance. Furthermore, for each model, we report the nMSE because the different questionnaires are based on different score ranges. To explore possible differential contributions of fear-related brain regions to the prediction models, we report the contribution rank of each brain region (region weight) within each condition (condition weight) provided by the MKL approach (see Table 5). Importantly, the selection of regions by the MKL model might be influenced by small variations in the dataset (because of the leave-one-subject-out cross-validation) and might therefore lead to different subsets of regions being selected across cross-validation steps (folds). Providing a quantification of this variability, the "expected ranking" (ER; see Table 5) characterizes the stability of the region ranking across folds, as follows: The closer the ER to the ranking of the selected fold, the more consistent is the ranking of the respective brain region across folds. On the other hand, if the ER is different from the ranking, this means that the ranking might be variable across folds.

Ratings, questionnaire scores, and correlations
Importantly, the comparison of the ratings during fMRI measurements demonstrated that the potentially harmful activities were perceived as being significantly more harmful compared with the harmless activities (paired t test: T ϭ 8.22; p Ͻ 0.001, two-tailed). Descriptive statistics of the different questionnaires as well as the age and sex of the patients are summarized in Table 1. Regarding the questionnaire data, visual inspection (Q-Q plots) and the Shapiro-Wilk test indicated the non-normality of the data (p Ͻ 0.05) of several questionnaires (FABQ, FABQ-W, TSK-11, FABQ-PA, and T-Anxiety); therefore, the nonparametric Spearman's rank correlation coefficient was used. Several significant positive correlations between the different PRF questionnaires scores were observed (p Ͻ 0.05; Table 2). Most of the TSK scales significantly correlated with the PASSs (0.97 Ͻ r's Ͼ 0.46, p Ͻ 0.05), whereas the FABQ work scale did not show significant relationships with the TSK and PASSs (p Ͼ 0.05), except for the PASS-F (r ϭ 0.49, p Ͻ 0.05). Furthermore, only the S-Anxiety scale of the STAI scale demonstrated significant correlations with some, but not all, TSK scales (0.44 Ͼ r's Ͻ 0.63, p Ͻ 0.05). Finally, only the PASS-F showed a positive and significant relationship with the mean rating of the harmful condition (r ϭ 0.44, p Ͻ 0.05; Table 2).

Condition and region weights
The condition and region weights of models with predictive performance (p Ͻ 0.05, FDR-corrected and uncorrected; see Model performance) are illustrated in Figure  1A-E and are described in detail in Table 5 (sections A-E). The decoding performances of the FABQ models (FABQ and FABQ-W) were driven by a major contribution of the harmful condition (88% and 87%, respectively). Within this condition, the left thalamus (rank 1), the right amygdala (rank 2) and the left hippocampus (rank 3) contributed Ͼ69% of the total region weights in the FABQ model (Table 5, section A, Fig. 1A). Similarly, the right amygdala (rank 1) and the left thalamus (rank 2) carried the most predictive neural information with 79.62% of the total region weights in the FABQ-W model (Table 5, section B, Fig. 1B). In both FABQ models, the right amygdala also demonstrated an association with the harmless con-dition, although it was of minor relevance (ϳ11%). In comparison, the TSK models demonstrated a moderate contribution of the harmful condition (TSK-13, 60%; TSK-11, 66%). Both predictive model performances of the TSK were driven by a major contribution of the right lateral orbitofrontal cortex (lOFC; TSK-13, 52.7%; TSK-11, 60.49%; Table 5, sections C and D, Fig. 1C,D). Furthermore, the left medial orbitofrontal cortex (mOFC) and the right hippocampus carried predictive information within the harmless condition in both TSK models (TSK-13: left gyrus rectus, 19.51%; right hippocampus, 14.03%; TSK-11: left gyrus rectus, 21.29%; right hippocampus, 10.41%). With almost equal contributions of the harmful (52%) and harmless conditions (48%), the prediction of the T-Anxiety scores was mainly driven by neural contributions of the left medial prefrontal cortex (mPFC) and mOFC (accounting for 44% of the total region weights in the harmful condition) and the left thalamus (together with the mOFC accounting for 44% of the total region weights in the harmless condition; Table 5, section E, Fig. 1E).

Discussion
Evidence from cross-sectional and longitudinal behavioral studies demonstrates a strong association between PRF and disability in chronic pain (Leeuw et al., 2007b;Wertli et al., 2014b;Esteve et al., 2017). However, the  different PRF constructs such as "fear of movement/ (re)injury/kinesiophobia," "fear avoidance beliefs," or "pain anxiety" are often used interchangeably in the literature (Lundberg et al., 2011), and it is unclear whether they share a common PRF construct reflected by similar neural sources. The subcortical/cortical neural basis of fear and anxiety that controls cognition and regulates appropriate behavior dependent on threat characteristics is well described (  The current MVPA approach using MKL demonstrated the feasibility to neuronally dissect the proposed constructs of PRF self-reports based on their subcortical/ cortical predictors during PRF-related brain activity. The results revealed that while the variability across individuals of some questionnaires, specifically the FABQ and FABQ-W, TSK-13, TSK-11, and T-Anxiety scales, was predictable from response patterns in fear-related, dissociable neural sources on subcortical and cortical levels, this was not the case for the FABQ-PA, the TSK-11 subscales (TSK-11-AA and TSK-SF), the PASSs, and the S-Anxiety scale. Furthermore, the on-line ratings of perceived harmfulness were not decodable from fear-related brain response patterns.

FABQ and TSK
The FABQ and FABQ-W scales demonstrated the best model performances among the investigated PRF questionnaires, which were characterized by a strong contribution of neural information in the harmful condition (condition weights, 88% and 87%, respectively). Interestingly, the FABQ-PA scale did not show a predictive association with fear-related brain response patterns. The better model performance of the FABQ-W based on fearrelated brain activity patterns is in line with the emerging evidence that the FABQ-W is a better predictor of treat-ment outcome in chronic LBP compared with the FABQ-PA, although this might be dependent on the patient population (Waddell et al., 1993;George et al., 2005George et al., , 2008Wertli et al., 2014a). In support of this, the FABQ-W scale qualified for a clinical prediction rule regarding improvement after spinal manipulation, whereas the FABQ-PA scale did not (Flynn et al., 2002;Dougherty et al., 2014).
With respect to the region weights, the FABQ models were mainly driven by subcortical neural contributions involving the thalamus, hippocampus, and the amygdala, while frontal brain regions played a minor role. The thalamus and, particularly, its midline structures have been considered to be a nonspecific arousing system (van der Werf et al., 2002). However, it has been shown recently that parts of dorsal midline thalamic structures are necessary for fear memory processing by directly targeting the hippocampus, which plays an important role for context-dependent emotional memory (Penzo et al., 2015;Lara-Vásquez et al., 2016;Zheng et al., 2017). Furthermore, the amygdala has long been considered a "fear center" (Darwin, 1873;Panksepp, 1998). However, the heterogeneous structure consisting of several nuclei is not essential for the experience of fear, which has been demonstrated in patients with amygdala lesions (Anderson and Phelps, 2002;Feinstein et al., 2013;LeDoux and Pine, 2016). Instead, the amygdala has been shown to be more strongly implicated in behavioral and physiologic responses to threats (i.e., defensive processes); its relation to complex cognitions like fear and anxiety is controversial (Panksepp, 2011;LeDoux and Pine, 2016;Fanselow and Pennington, 2017). A recent opinion article (LeDoux and Hofmann, 2018) suggested that subjective feelings of fear and anxiety do not initially arise from subcortical activity of the fear circuit centered around the amygdala. Thus, amygdala activity and mediated physiologic responses of fear and anxiety might be, at its best, only a correlate of subjective feelings of fear and anxiety (LeDoux and Hofmann, 2018). Nevertheless, the results presented here indicate a strong predictive association between subjective reports of PRF, assessed by the FABQ scales, and amygdala activity patterns.
Among the TSK scales, the TSK-13 and the TSK-11 demonstrated a predictive association with fear-related brain response patterns, albeit with less contribution of the harmful condition compared with the FABQ scales (TSK-13, 60%; TSK-11, 66%). The TSK-11 version   showed a stronger relationship between true and predicted labels compared with the TSK-13 version (r ϭ 0.60, nMSE ϭ 0.90, p Ͻ 0.05). This result might reflect the progress of previous research regarding the psychometric properties of the different TSK versions. Compared with the 17-item version, the 13-item version has better psychometric properties without the four inversely phrased items (Roelofs et al., 2004a;Neblett et al., 2016), and the 11-item version has been recommended for future research and clinical settings (for a chronological summary, Condition weight: harmful activities, 52%; harmless activities, 48%. ‫ء‬Brain regions included in the feature set for between model cross-validation (see Table 4 for results).
see Tkachuk and Harris, 2012). Interestingly, no predictive association could be "learned" by MKL using the TSK-11 subscale labels (TSK-11-SF and TSK-11-AA scores). Although these two lower-order factors (activity avoidance and somatic focus) are reflective of the higher-order construct "fear of movement and (re)injury/kinesiophobia," the nonsignificant result might indicate that they are associated with inconsistent neural patterns across individuals.
Regarding the region weights of the TSK models, the right lOFC provided the most predictive information for the two TSK scales (TSK-13, 52%; TSK-11, 60%). In agreement with the phobia-related construct (kinesiophobia), dysfunction of the OFC has been shown to be implicated in the processing of phobia-related stimuli in disorders such as social anxiety disorder (Dilger et al., 2003). Specifically, lOFC activity was reduced when phobogenic trials were contrasted with fear-relevant trials (Aue et al., 2015). Furthermore, a hyperactive lOFC has been shown to be linked to anxiety-laden cognitions (Hahn et al., 2011). Interestingly, the higher cortical contributions of the TSK models were clearly dissociable from the largely subcortical contributions involving the amygdala, hippocampus, and thalamus that predicted the FABQ scores.
To conclude, the FABQ scales demonstrated high PRF sensitivity (harmful condition weights Ͼ 87%) and were linked to subcortical predictors that have been associated with fear responses to an imminent threat and defensive behavior (McNaughton and Corr, 2004;LeDoux and Pine, 2016). In contrast, the TSK scales appeared to capture emotional states largely associated with cortical fear processing that might be related to cognitive aspects of PRF. In support of this, the observed higher harmless condition weights of the TSK compared with the FABQ models might indicate that the TSK scales are associated with more diffuse anxiety-related cognitions.

PASS
Surprisingly, the PASS failed to demonstrate a predictive association with fear-related brain response patterns. There may be several explanations. First, although the FABQ and the TSK have been specifically developed for patients with musculoskeletal pain, the PASS is suitable for various pain phenotypes (Crombez et al., 1999). Second, the PASS has been shown to be more strongly associated with negative affect and less predictive of pain disability and behavioral performance (Crombez et al., 1999). Third, all PASS subscales demonstrated significant multicollinearity in our sample, suggesting nonindependence between the different subscales. All these aspects may have led to less sensitivity of fear-related neural patterns to the PASS and its subscales in the current study.
The superiority of the FABQ scale (driven by the FABQ-W) in decoding performance compared with the TSK and PASSs might also be influenced by the backspecific items of the FABQ in conjunction with the nature of the PRF-provoking stimuli (back-straining movements). The items of the FABQ were specifically related to the back, while the TSK and PASS can be used with various musculoskeletal pain diagnoses such as work-related upper extremity disorders, chronic LBP, fibromyalgia, and osteoarthritis (Roelofs et al., 2007). Nevertheless, the FABQ has also been adapted to shoulder pain, where it demonstrated better factor structure and a stronger association with disability compared with the TSK-11 (Mintken et al., 2010).

State and trait anxiety
Beside PRF, anxiety, and depression significantly mediate the relationship between pain and disability (Marshall et al., 2017). However, fear responses specifically related to a patient's pain and/or potentially painful movements might be more relevant for explaining disability in chronic LBP than general trait anxiety responses (Mc-Cracken et al., 1996). The current results are in line with this notion. First, most of the PRF measures did not show a significant relationship with state or trait anxiety. Second, state anxiety was not decodable from fear-related brain responses to potentially harmful activities in chronic pain patients. Interestingly, with respect to the trait anxiety model (T-Anxiety; Fig. 1E), the harmful and the harmless conditions carried almost equal predictive neural information (52% vs 48%). This suggests that the trait anxiety measure is associated with neural content irrespective of the harmfulness of a stimulus, provoked by, for example, enhanced attention to visual information processed in fear-related brain regions; Berggren et al., 2015). It might further indicate that the T-Anxiety scale captures neural responses that are associated with a more generalized fear response.
Regarding the region weights, predictive information was predominantly provided by brain regions that were less involved in the prediction of the other PRF measures, namely parts of the mPFC and mOFC (Table 5, section E). This is in line with the proposed functional differentiation of neural structures regarding fear in response to an imminent threat (defensive response) and cognitive fear/ anxiety (distal, uncertain threat) whereas the latter involves more rostral cortical structures such as the mPFC and mOFC (McNaughton and Corr, 2004;LeDoux and Pine, 2016). Moreover, research on self-report measurements indicates that trait anxiety is relatively distinct from tissue damage fear, which supports a behavioral and neural dissociation of trait anxiety and PRF (Cooper et al., 2007;Perkins et al., 2007).

Harmfulness ratings
Interestingly, although the PRF-provoking harmful activities were rated as significantly more harmful compared with the harmless activities, the ratings of perceived harmfulness during fMRI measurements were not decodable from fear-related brain response patterns. Furthermore, the ratings did not show significant correlations with PRF measures (except the PASS-F scale; Table 2). Others reported only moderate relationships (r values Ͻ0.39) between perceived harmfulness ratings of PHODA items and self-report measures such as the TSK, paincatastrophizing scale, or pain intensity (Leeuw et al., 2007a), indicating that ratings of perceived harmfulness assess something akin to, but also distinct from, PRF self-report measures. The weak relationships between ratings of perceived harmfulness and self-report measures of PRF might be explained by the specificity of the potentially harmful movements depicted by the PHODA items. Namely, the ratings of perceived harmfulness were specifically related to back-straining movements such as bending and lifting, while the PRF measures might also be associated with other potentially harmful movements. As such, the fear-related neural patterns induced by the observation of potentially harmful activities for the back might not include information about movement specificity. Instead, these neural patterns might predict PRF and its constructs in a more general fashion that is captured by the TSK and FABQ.

Limitations
A limitation of this study is the relatively small sample size in conjunction with the cross-validation framework.
Ideally, the predictive model should be trained and tested with completely independent data. However, the results obtained are likely to be valid for the following several reasons: (1) the goal of the current study was not maximizing decoding performance, rather, multivariate decoding was used for the interpretation and understanding of the different PRF constructs, for which significant predictive accuracy was obtained (Hebart and Baker, 2017); (2) the applied linear SVM has been shown to exhibit good performance even in very high-dimensional settings with small sample sizes (Varoquaux and Thirion, 2014); (3) the applied regression approach using continuous variables enhances statistical power compared with a categorical analysis (e.g., low vs high fear; Altman and Royston, 2006); and (4) the variability of the regions that contribute most to the models across cross-validation folds was very small (indicated by the ER), demonstrating stable ranking, irrespective of which subjectЈs data were left out for validation. For these reasons, the differences of the prediction models are unlikely to be caused by the small sample size. A further limitation is related to the sparsity approach (L1 regularization) of the MKL algorithm currently implemented in PRoNTo, which does not select brain regions with highly correlated neural information. Therefore, potential lateralization effects of brain regions (e.g., left and right amygdala) should be carefully interpreted. Finally, the study design only allows interpretations of PRF to back-straining movements and LBP. Therefore, conclusions related to other musculoskeletal conditions should be drawn with caution. Nevertheless, the current approach might represent a promising new tool to dissect psychological constructs of self-report measures by using their neural predictors.

Conclusion
This is the first time that multivariate brain response patterns have been used to better understand and dissect a psychological construct, here, PRF, conventionally assessed by selfreport (questionnaires). Relating content-selective neural information to potentially different psychological constructs likely supports their construct validity by revealing (hidden) commonalities or differences across psychological constructs. Indeed, dissociable fear-related neural information served as score estimators of the FABQ (FABQ total and FABQ-W), the TSK (13-and 11-item versions), and the T-Anxiety questionnaire, supporting the distinctness of the fear constructs behind these questionnaires. The FABQ scales demonstrated strong predictive power with high sensitivity to the harmful condition and were associated with subcortical fear-processing regions (amygdala, thalamus, hippocampus). The TSKs were more related to neural response patterns of the OFC, potentially indicating that the construct of kinesiophobia is more related to higher-order brain structures associated with anxiety, while the FABQ scales are more related to subcortical defensive responses to fear. The PASS and its subscales failed to demonstrate a predictive association with fear-related brain response patterns. From a clinical point of view, it might indicate that the various PRF questionnaires, although often correlating, indeed measure different fear phenotypes related to pain. Therefore, the results emphasize the need to carefully consider the different PRF questionnaires in research and clinical settings as their constructs do not appear to be interchangeable.