Introduction

The ability to mount an appropriate response to stress is vital for the survival of every living organism. However, when the homeostatic mechanisms to cope with stressful stimuli are disrupted, either because the individual has a particular vulnerability or because the response system is exhausted by continuous activation, maladaptive responses take place and predispose to cognitive impairment and even to pathological conditions.1, 2, 3 Maladaptive stress affects cognitive behavior through sequential structural modulation of brain networks, mainly as a consequence of the release of corticosteroids.4, 5 In fact, several studies have revealed stress-induced deficits in spatial reference and working memory and in behavioral flexibility;6, 7 these behavioral changes are attributed to synaptic/dendritic reorganization in both the hippocampus6 and the medial prefrontal cortex.7, 8 Recently, we showed, in rodents, that chronic stress triggers changes in the frontostriatal networks that govern instrumental behavior decisions.9 Briefly, different corticostriatal circuits are thought to control competing behavioral strategies during choice situations: whereas the medial prefrontal cortex and the caudate nuclei (associative striatum) have been implicated in goal-directed actions, the putamen (sensorimotor striatum) has been implicated in habitual behavior.10, 11 In that study, we showed that chronically stressed rats display atrophy of the associative network (medial prefrontal cortex and dorsomedial striatum), in parallel with hypertrophy of the dorsolateral (sensorimotor) striatum and the most lateral portions of the orbitofrontal cortex. In addition, the structural changes were associated with a bias in decision-making strategies, as behaviors in stressed rats rapidly shifted from goal-directed actions to habits.9

This automatization of recurring decision processes into stereotypic behaviors or habits caused by exposure to stress can be viewed as ‘advantageous’, as it increases behavioral efficiency by releasing cognitive resources for more demanding tasks.10 Typically, habitual responses do not require the evaluation of their consequences and can be elicited by particular situations or stimuli.10, 11 However, to adapt to ever-changing life conditions, the ability to select the appropriate actions to obtain specific outcomes based on their consequences is of utmost importance. The capacity to shift between habit-based and goal-directed actions is a condition for appropriate decision-making.12 Importantly, this flexibility includes the ability to inhibit automatisms (habits) in order to use the more demanding goal-directed strategies, which was shown to be impaired in stressed rodents.9

Materials and methods

Subjects, psychological tests and cortisol measurements

Two cohorts of medical students participated in this experiment: one was under their normal academic activities (controls, n=12; 6 females, 6 males; mean age, 23.6±2.11), whereas the other included subjects who had just finished their long period of preparation for the medical residence selection exam (chronic psychosocial stress; stress group, n=12; 6 females, 6 males; mean age, 23.9±0.70). For the longitudinal assessment, stressed individuals (n=12) were reassessed in similar conditions 6–7 weeks after the end of the exposure to stress. To control for the influence of test–re-test, a smaller cohort of stress-recovered individuals (n=4; 2 females, 2 males; mean age, 24.25±0.96) naïve to the first experiment was also included. In addition, we reassessed a smaller cohort of control subjects (n=4; 2 females, 2 males; mean age, 24.33±1.15) 6 weeks after the first assessment. After arrival, the subjects responded to a laterality test and to a self-administered questionnaire regarding stress assessment (perceived stress questionnaire).13 The perceived stress questionnaire is a reliable and validated instrument to assess chronic psychosocial stress in both healthy and clinical adult samples.5, 13, 14 It measures four scales (worries, tension, joy, demands); the first three scales represent internal stress reactions, whereas the scale ‘demands’ relates to perceived external stressors. Participants were further assessed with the Hamilton anxiety scale15 and the Hamilton depression scale16 by a certified psychologist. The Hamilton scales provide a broad assessment and are widely used, including in healthy populations;17 here, the clinical cut-off scores for anxiety and depression were not used; instead, the variables were treated as continuous so that absolute scores could be compared. After completing the questionnaires, and immediately before the instrumental task, subjects collected saliva samples using Salivette (Sarstedt, Germany) collection devices.
Collection took place between 0900 h and 0500 h in all subjects. The variation in cortisol levels in this time period is relatively small;18 furthermore, subjects from the control and stressed groups were mixed so that collection times were similarly distributed, which minimized the impact of variation in collection time. Samples were stored at −20 °C until the biologically active, free fraction of the stress hormone cortisol was analyzed using an immunoassay (IBL, Hamburg, Germany). Subjects were preassessed to exclude those with a previous history of neurological or psychiatric illness; none indicated a history of eating disorders. The study was conducted in accordance with the principles expressed in the Declaration of Helsinki and was approved by the Ethics Committee of Hospital de S. Marcos (Braga, Portugal). The study goals and tests were explained to all participants and all gave informed written consent.

Instrumental task

The task was adapted from a validated protocol.19 Subjects were asked to fast for at least 12 h before their scheduled arrival time at the laboratory, but were permitted to drink water. Before starting the instrumental task, hunger level and pleasantness of the liquid foods were checked for each subject. The liquid-food rewards were chocolate milk and tomato juice. Alternatively, apple or strawberry juices were given to subjects who found neither chocolate milk nor tomato juice pleasant. These liquid foods were selected as they can be administered in liquid form, are palatable at room temperature, and their flavor and texture are distinguishable; in this way, sensory-specific satiety effects were kept and the likelihood of the subjects developing a generalized satiety to all liquid foods was minimized. The food rewards were delivered by means of separate syringe pumps (one for each liquid) positioned in the scanner room; subjects received the food through polyethylene plastic tubes (straw-like) while they lay supine in the scanner. The task, with an event-related jittered design, consisted of different sessions: sessions of valued (VAL) actions with reward delivery, followed by sessions of devalued (DEV) actions with outcome devaluation and extinction. Between the two sessions there was a 30-min break, during which subjects were fed to satiety with one of the two liquid foods, outside of the scanner. Each session consisted of 150 trials (50 trials per condition: chocolate, tomato and neutral) subdivided into five blocks of 30 trials each. In each trial, the decision time was 1.5 s; after each decision, the choice appeared highlighted for 4 s; this was followed by the reward delivery time, a black screen with a red fixation of 2 s duration, and the jittered interstimulus interval of 4 s mean duration (Supplementary Figure 1).
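As a rough check on the trial timing described above, the mean duration of one trial and of one 150-trial session can be computed directly from the stated intervals. The sketch below is illustrative only: it assumes that reward delivery coincides with the 2 s fixation screen (the text does not state a separate delivery duration) and it uses the 4 s mean of the jittered interval, so actual session lengths would differ slightly.

```python
# Durations in seconds, taken from the task description.
DECISION_S = 1.5
HIGHLIGHT_S = 4.0
REWARD_FIXATION_S = 2.0  # assumption: reward delivery coincides with the 2 s fixation screen
MEAN_ISI_S = 4.0         # mean of the jittered interstimulus interval

TRIALS_PER_BLOCK = 30
BLOCKS_PER_SESSION = 5


def mean_trial_duration_s() -> float:
    """Mean duration of a single trial, in seconds."""
    return DECISION_S + HIGHLIGHT_S + REWARD_FIXATION_S + MEAN_ISI_S


def mean_session_duration_min() -> float:
    """Mean duration of one session (all blocks), in minutes."""
    trials = TRIALS_PER_BLOCK * BLOCKS_PER_SESSION
    return trials * mean_trial_duration_s() / 60.0


print(mean_trial_duration_s())      # 11.5
print(mean_session_duration_min())  # 28.75
```

Under these assumptions each session lasts a little under half an hour of task time, excluding set-up and the other acquisition sequences.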
Before the experiment, subjects were informed about the pairs of fractal patterns that would appear on each trial and were instructed to select one of the possible actions on each trial. They were informed that according to their choices they would receive 0.75 ml of liquid food (valued outcome), the same quantity of a neutral solution (water) or nothing; although there was no information about which action was associated with which particular outcome, subjects were told that one of each pair of actions was associated with a higher probability of obtaining an outcome than the other. During the first session, subjects were instructed to learn to choose the actions that led to high probabilities of pleasant liquid foods, including chocolate and tomato juice. Choosing this option led to a chance of obtaining chocolate milk (P=0.4) or orange juice (P=0.3) in the chocolate condition, and tomato juice (P=0.4) or orange juice (P=0.3) in the tomato condition. After this session, in which subjects learned to preferentially choose the options that gave them the best chance of obtaining a juice reward, they were then removed from the scanner and invited to eat to satiety (selective satiation), until they did not want to eat any more, and the pleasantness rating for that food had decreased (devaluation), as checked by a reassessment similar to the one used before session 1. This selective outcome devaluation procedure served to devalue one of the outcomes associated with a particular instrumental action, leaving the value of the outcome associated with the other action intact. To test the effects of the devaluation procedure, subjects underwent a second session, in which they were presented with the same trial types involving the same actions and once again had to select whichever action they preferred. 
The chosen stimulus increased in brightness as it did during the first session but, in this session, the outcome was no longer presented (that is, the subjects were tested in extinction for these outcomes). In other words, the devalued and non-devalued outcomes were never presented again to the subjects during the test. Yet, to maintain responses on both actions, subjects still received the non-devalued orange juice outcome, so that the overall outcome was now available with equal probability on the two available actions (P=0.3 each). The total acquisition time was between 2.5 and 3 h. Using this design and by comparing the different patterns of activation between the first (first 30 trials) and the last block (last 30 trials), one can appreciate the different brain areas associated with distinct behaviors during the instrumental task.
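The outcome schedule described above can be illustrated with a short simulation. This is a minimal sketch, not the software used in the study: the function and variable names are hypothetical, the outcomes on a given trial are assumed to be mutually exclusive, and only the probabilities stated in the text are encoded (the schedule for the low-probability action is not given here and is therefore omitted).

```python
import random

# High-probability action in the chocolate condition during training (from the text).
TRAINING_CHOCOLATE_HIGH = {"chocolate milk": 0.4, "orange juice": 0.3}
# Extinction test: only orange juice is delivered, with P=0.3 on either action.
EXTINCTION_ANY_ACTION = {"orange juice": 0.3}


def sample_outcome(schedule, rng):
    """Draw one outcome from a schedule; returns None when nothing is delivered.

    Assumption: at most one outcome is delivered per trial.
    """
    draw = rng.random()
    cumulative = 0.0
    for outcome, probability in schedule.items():
        cumulative += probability
        if draw < cumulative:
            return outcome
    return None


def outcome_frequencies(schedule, n_trials, seed=0):
    """Empirical outcome frequencies over n_trials simulated choices."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_trials):
        outcome = sample_outcome(schedule, rng)
        counts[outcome] = counts.get(outcome, 0) + 1
    return {outcome: count / n_trials for outcome, count in counts.items()}


print(outcome_frequencies(TRAINING_CHOCOLATE_HIGH, 200_000))
print(outcome_frequencies(EXTINCTION_ANY_ACTION, 200_000))
```

With many simulated trials the frequencies approach the nominal probabilities; in extinction the two actions become indistinguishable in terms of delivered outcome, so any remaining preference reflects what was learned in the first session.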

Data acquisition

The different MRI acquisition sequences of the brain were conducted in different sessions on the same day, using a clinically approved Siemens Magnetom Avanto 1.5 T (Erlangen, Germany). The detailed description of data acquisition is provided in Supplementary Materials and methods.

Image processing

The detailed description of fMRI and volumetric data analysis20, 21, 22, 23, 24 is provided in Supplementary Materials and methods.

Statistical analysis

Results of the psychological scales, cortisol levels, behavioral performance and regional volumes were analyzed in the IBM SPSS Statistics software, v.19 (IBM, New York, USA). The detailed description of the statistical analysis is provided in Supplementary Materials and methods.

Results

Stress insensitivity to outcome devaluation is reversible

To investigate whether our previous findings in rodents translate to humans, we designed two experiments. In the first, two cohorts of medical students were recruited: one was under their normal academic activities (controls, n=12), whereas the other included subjects who had just finished their long period of preparation for the medical residence selection exam (stress group, n=12). Stressed individuals displayed increased scores in the perceived stress questionnaire (Figure 1a; t22=3.429, P=0.002) and in the Hamilton anxiety (t22=2.202, P=0.042) and depression scores (t22=3.698, P=0.001) when compared with controls; a trend was also found for increased salivary cortisol levels in stressed subjects (t22=2.077, P=0.05). In the second experiment, we performed a longitudinal study in the same stressed individuals (stress recovered, n=12) by reassessing their psychological and behavioral performance 6–7 weeks after the end of the exposure to stress, which allowed us to assess the (ir)reversibility of the stress-induced changes. The results showed that a stress-free period of a few weeks is sufficient to normalize all the psychological changes (Figure 1a; stress perception: t11=3.663, P=0.004; anxiety score: t11=2.766, P=0.018; depression score: t11=4.551, P=0.001); salivary cortisol levels were partially restored (t11=1.835, P=0.094). To control for the influence of test–re-test, a smaller cohort of stress-recovered individuals (n=4) naïve to the first experiment was also included; no differences were obtained in any parameter under study between stress-recovered non-naïve and naïve groups (data not shown). In addition, a smaller cohort of control subjects (n=4) was also reassessed 6 weeks after the first assessment; no significant differences were found between the two assessment moments in any of the parameters analyzed (data not shown).
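The degrees of freedom reported above are consistent with an independent-samples comparison between groups (12 + 12 − 2 = 22) and a paired comparison within the stressed cohort (12 − 1 = 11). The sketch below shows how such statistics could be computed; it assumes a pooled-variance Student's t-test for the between-group contrast, which is an inference from the degrees of freedom rather than something stated here (the actual analysis was run in SPSS and is described in the Supplementary material). All scores in the example are hypothetical.

```python
import math
import statistics


def independent_t(a, b):
    """Student's t for two independent groups (pooled variance); returns (t, df)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(a) - statistics.mean(b)) / se, df


def paired_t(first, second):
    """Paired t on within-subject differences (first - second); returns (t, df)."""
    diffs = [x - y for x, y in zip(first, second)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


# Hypothetical questionnaire scores, for illustration only (not data from this study).
control = [0.30, 0.25, 0.35, 0.28, 0.32, 0.27, 0.31, 0.29, 0.33, 0.26, 0.34, 0.30]
stress = [0.45, 0.50, 0.40, 0.55, 0.48, 0.52, 0.47, 0.44, 0.51, 0.46, 0.53, 0.49]
recovered = [0.32, 0.30, 0.31, 0.36, 0.29, 0.35, 0.33, 0.28, 0.34, 0.30, 0.37, 0.31]

_, df_between = independent_t(stress, control)
_, df_within = paired_t(stress, recovered)
print(df_between, df_within)  # 22 11
```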

Figure 1

Exposure to chronic stress does not influence the acquisition of instrumental tasks and activates the associative fronto-striatal network. (a) Mean score of the stress perceived questionnaire (control vs stress t22=3.429, P=0.002; stress vs stress recovered t11=3.663, P=0.004). (b) Response rate during the acquisition of the task, for the valued rewards ((b1) chocolate, (b2) tomato) in both the high (chocolate: control t11=2.568, P=0.026; stress t11=3.806, P=0.003; stress recovered t11=2.615, P=0.024; tomato: control t11=3.144, P=0.009; stress t11=2.556, P=0.027; stress recovered t11=2.828, P=0.016) and the low probability options (chocolate: control t11=1.321, P=0.213; stress t11=2.152, P=0.054; stress recovered t11=3.120, P=0.010; tomato: control t11=2.677, P=0.022; stress t11=2.335, P=0.039; stress recovered t11=2.187, P=0.051). No significant differences were found between groups. (c1), Pattern of activation when deciding between high- vs low-value choices during the learning phase of the task (that is, contrast between the last and first block of the first session). The activation in the medial prefrontal cortex (left medial superior gyrus; x=−10, y=44, z=32; Z score=2.81; P<0.002, uncorrected) demonstrates the engagement of this brain region during the acquisition of the decision task. No other brain region showed effects at this significance in this contrast. (c2), Pattern of brain activation in controls throughout the learning phase of the task. There is activation of components of the associative network, namely the medial prefrontal cortex (anterior cingulate: x=0, y=10, z=42; Z score=4.13; P<0.05, corrected for small volume for family wise error (FWE)) and the caudate nucleus (left: x=−12, y=6, z=10; Z score=4.49; P<0.05, corrected for small volume for FWE and right x=18, y=10, z=18; Z score=3.67; P<0.05, corrected for small volume for FWE).

Following the psychological and analytical determinations, we tested whether chronic stress affected the ability of the individuals from all groups to perform actions based on the consequences of their behavior, using an operant instrumental task adapted from Valentin et al.19 All subjects had fasted for 12 h. After receiving instructions on the task, subjects learned how to associate an action with an outcome (valued: chocolate and tomato juice; neutral: water). All groups increased their choices of options with a high probability of reward (chocolate and tomato juice) (tomato: control t11=3.144, P=0.009; stress t11=2.556, P=0.027; stress recovered t11=2.828, P=0.016; chocolate: control t11=2.568, P=0.026; stress t11=3.806, P=0.003; stress recovered t11=2.615, P=0.024) to the detriment of low-probability options, indicating that both control and stressed subjects had no difficulty in associating the particular action they were performing with the specific outcome obtained (Figure 1b). Of note, there was no preference for high- or low-probability options when the reward was neutral (all comparisons nonsignificant, data not shown). This operant task was performed with the subject inside a scanner, during the acquisition of fMRI, which allowed us to determine the patterns of activity of fronto-striatal networks during the decision-making processes.25 The response rate during the acquisition phase (Figure 1b) was similar and goal-directed in all experimental groups. Whereas the distinction between high- and low-value outcomes triggered a specific activation of the medial frontal gyrus (Figure 1c1), a region of the brain associated with high-level executive function and decision-making,26 the learning of the instrumental task in goal-directed actions strongly activated the cingulate gyrus and the caudate nuclei (Figure 1c2), both key components of the associative network.

In a second session, we tested whether subjects were sensitive to outcome devaluation by providing free access to one of the rewards (for example, chocolate) until they reported satiety before starting the task. In accordance with our rodent studies, controls adapted their choices in response to sensory-specific satiety, whereas stressed subjects were insensitive to the expected value of the outcome, as indicated by the lack of a devaluation effect (Figure 2a; control: t11=3.767, P=0.003; stress: t11=1.464, P=0.171); importantly, group comparisons showed that the stress group differs significantly from controls in the number of devalued choices (Figure 2a; t22=−2.143, P=0.043), but not in valued options (t22=0.410, P=0.686). These data suggest that individuals without stress exposure perform actions because of the consequences of their behavior, whereas stressed subjects rapidly develop habitual behaviors and do not adjust their actions to their current needs. Importantly, during the devaluation phase of the task, stressed subjects activate the left putamen significantly more than controls (Figure 2b2), whereas controls display a greater activation of the right caudate than stressed subjects (Figure 2b1); these findings parallel the impairment in devaluation observed in stressed subjects and support the view that habit-based decisions are linked to an overactivation of components of the sensorimotor corticostriatal network. Interestingly, after a period of recovery from stress, these subjects regain the ability to orient their actions by goal-directed decisions (Figure 2a; stress recovered: t11=3.336, P=0.007; stress vs stress recovered: valued t11=0.338, P=0.742; devalued t11=2.918, P=0.014) and display a greater activation of the right caudate during devaluation than immediately after stress exposure (Figure 2b), in a pattern of activation very similar to the one found in controls.
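The devaluation effect referred to above is the drop in preference for the high-probability option of the devalued reward between the last block of the first session and the first block of the second. A minimal sketch of that computation follows; the function names and the choice sequences are hypothetical and serve only to make the measure concrete.

```python
def high_option_rate(choices):
    """Fraction of trials in a block on which the high-probability option was chosen.

    `choices` holds one entry per trial: 1 (or True) for the high-probability
    option, 0 (or False) otherwise.
    """
    return sum(choices) / len(choices)


def devaluation_effect(last_block_session1, first_block_session2):
    """Drop in preference for the high-probability option of the devalued reward."""
    return high_option_rate(last_block_session1) - high_option_rate(first_block_session2)


# Hypothetical choice sequences for one subject (illustration only).
before = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]  # last block, session 1
after = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # first block, session 2

print(round(devaluation_effect(before, after), 3))
```

A clearly positive value indicates that the subject reduced responding for the devalued outcome (goal-directed control); a value near zero indicates insensitivity to devaluation, the pattern reported here for the stress group.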

Figure 2

The stress insensitivity to outcome devaluation is reversible and associated with variations of the activation of the corticostriatal networks. (a) Response rate for the high probability option of the devalued reward before (last block of the first scanning session) and after (first block of the second scanning session) devaluation. Controls significantly reduced their preference (control: t11=3.767, P=0.003), whereas stressed subjects were insensitive to the decrease in the value of the outcome (stress: t11=1.464, P=0.171), but regained a goal-directed behavior after a stress-free period (stress recovered: t11=3.336, P=0.007). Group comparisons showed that the stress group significantly differs in the number of devalued choices from both controls (t22=−2.143, P=0.043) and stress-recovered subjects (t11=2.918, P=0.014). (b) Pattern of activation during devaluation phase of the task. Controls display a higher activation in the right caudate nuclei (x=8, y=6, z=12; Z score=3.45; P<0.05 corrected for small volume for FWE) than stressed subjects (b1), whereas stressed subjects display a greater activation of the left putamen (x=−26, y=0, z=16; Z score=3.35; P<0.05 corrected for small volume for FWE) than controls (b2); after a period of recovery from stress, a higher activation of the right caudate (x=20, y=−4, z=22; Z=3.39; P<0.005, uncorrected) is observed when compared with activation immediately after stress (b3). *P<0.05; line: within group comparisons; dashed line: between groups comparisons.

Structural plasticity in the stressed brain

The impairment in decision-making and shift in behavioral strategies observed in stressed individuals evoke the effects observed after manipulations of the associative or sensorimotor corticostriatal circuits.9, 27, 28 Therefore, using neuroimaging morphological techniques, we first investigated the effects of chronic exposure to stress on the structure of striatal and cortical circuits known to be required for goal-directed actions and habits; in addition, we also assessed the recovery from stress in the same parameters. The present data reveal opposing effects of chronic stress in the caudate and in the putamen: whereas we found an atrophy of the caudate (relative volumes) that was only significant on the right (Figure 3a; left: t22=2.067, P=0.051; right: t22=2.676, P=0.014), the putamen revealed a significantly increased relative volume in both hemispheres (Figure 3a; left: t22=2.617, P=0.016; right: t22=3.132, P=0.005). As a consequence, the caudate-to-putamen ratio was increased in controls relative to stressed individuals (left: 0.72 vs 0.61, t22=3.565, P=0.002; right: 0.76 vs 0.64, t22=4.190, P<0.001, respectively), which suggests a bidirectional modulation of neuronal connectivity in the dorsal striatum expressed by a global hypertrophy of the sensorimotor striatum, and a shrinkage of the associative striatum. In addition, the orbitofrontal cortex, which is also a target of stress5, 9 and has been implicated in decision-making,29, 30 showed a different pattern of change, with the most medial portions of the orbitofrontal cortex displaying a structural atrophy that reached statistical significance in the left hemisphere (Figure 3a; left: t22=3.764, P=0.001; right: t22=1.494, P=0.149), whereas nonsignificant increases were found in the lateral components of this cortical region (left: t22=30.319, P=0.075; right: t22=1.355, P=0.189).
No differences were found in the motor or somatosensory cortices (Figure 3a; motor: left: t22=1.450, P=0.161; right: t22=0.459, P=0.651; sensory: left: t22=1.272, P=0.217; right: t22=0.543, P=0.593) or in total intracranial volumes (t22=0.033, P=0.974). Notably, most structural changes found in stressed subjects were transient. Indeed, data from the second neuroimaging assessment revealed a recovery of the caudate (Figure 3b; left: t11=2.590, P=0.025; right: t11=2.494, P=0.030), right putamen (Figure 3b; t11=2.246, P=0.046) and left medial portions of the orbitofrontal cortex volumes (Figure 3b; t11=2.914, P=0.014) and a nonsignificant tendency toward restoration of the volume of the left putamen (Figure 3b; t11=1.495, P=0.163). Importantly, these results demonstrate that stress-induced changes are not permanent and that, after a short period of recovery from stress (6 weeks), young adults display an impressive plasticity in fronto-striatal networks.
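The two volumetric measures used above, relative volume and the caudate-to-putamen ratio, are simple quotients. The sketch below makes them explicit; it assumes that relative volumes are regional volumes normalized to total intracranial volume (the normalization is detailed in the Supplementary material, not here), and all numerical values are hypothetical.

```python
def relative_volume(region_mm3, intracranial_mm3):
    """Regional volume as a fraction of total intracranial volume (assumed normalization)."""
    return region_mm3 / intracranial_mm3


def caudate_to_putamen_ratio(caudate_mm3, putamen_mm3):
    """Ratio of caudate to putamen volume within one hemisphere."""
    return caudate_mm3 / putamen_mm3


# Hypothetical volumes in mm^3 for one hemisphere of one subject (illustration only).
caudate, putamen, intracranial = 3600.0, 5000.0, 1_500_000.0

ratio_absolute = caudate_to_putamen_ratio(caudate, putamen)
ratio_relative = caudate_to_putamen_ratio(
    relative_volume(caudate, intracranial),
    relative_volume(putamen, intracranial),
)
print(round(ratio_absolute, 2), round(ratio_relative, 2))
```

Because both regions are divided by the same intracranial volume, the ratio is the same whether it is computed from absolute or relative volumes, so it isolates the balance between the associative and sensorimotor striatum from overall head size.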

Figure 3

Volumetric changes in the brain after stress exposure (a) and after recovery from stress (b). Upper panels represent changes in subcortical regions, whereas the lower panels represent volumetric variations in cortical regions. (a) The impact of stress on the structure of the corticostriatal loop. The color changes illustrate variations in volumes of stressed subjects in contrast to controls. (b) The extent of recovery from the impact of stress on the structure of the cortico-basal ganglia loop. The color changes illustrate variations in volumes in stressed subjects after recovery from stress.

Discussion

The burden of chronic stress exposure is increasing in our modern society. Although stress response is vital for the survival of every living organism, maladaptive responses to stress can produce changes in the brain and affect cognitive processes, attention and executive functions,1, 2, 3 such as decision-making. The selection of the appropriate actions in particular situations is an extremely dynamic process. Actions can be selected based on their consequences (for example, when we first select the best route to drive from home to work). This goal-directed behavior is crucial to face the ever-changing environment but demands an effortful control and monitoring of the response. To increase the efficiency, one can automatize recurring decision processes as habits (or rules). Habitual responses no longer need the evaluation of their consequences and can be elicited by particular situations or stimuli (for example, after driving to work for some time in the established route, we automatically, when entering the car, go that way). The ability to shift back and forth between these two types of strategies is necessary for appropriate decision-making in everyday life. For example, in a novel situation, it may be crucial to be able to inhibit a habit and use a goal-directed strategy (for example, if we need to go to another place first, besides work, it is most likely inappropriate to use our habitual route to work).

In this study, we show that humans exposed to chronic stress rapidly shift toward habitual strategies (in other words, following the above example, stressed individuals are more likely to choose the habitual route, even when the right choice would be to go a different way). More specifically, our findings demonstrate that prolonged exposure to stress triggers a reorganization of corticostriatal circuits that determine decisions under instrumental tasks. Instrumental behavior is determined by the association between an action and an outcome, as tested in this study; in this form of responses, actions can be either ‘goal-directed’ or ‘habitual’. By testing the subjects from the two distinct conditions (control vs stressed) in a paradigm in which they work for a reward, the pattern of their instrumental responses can be discriminated. An atrophy of the associative corticostriatal circuit that rules goal-directed actions, in parallel with a hypertrophy of the sensorimotor corticostriatal network, was found in young subjects displaying signs and symptoms of stress; these structural changes were associated with a decreased activation of the associative circuit in instrumental tasks. Most importantly, stressed individuals had a bias to habits in their decision-making processes. Interestingly, we also demonstrate the remarkable plasticity of these neuronal circuits, by showing that after a stress-free period, both the structural and the functional changes were reverted and the pattern of decision in previously stressed subjects became goal-directed again. Of note, other studies, in distinct experimental conditions, have shown volumetric variations in a similar (or even shorter) time frame.31, 32 In the stress field, studies have also demonstrated rapid structural changes triggered by stress exposure.6, 33 These changes typically occur at the dendritic level and are likely to represent alterations in synaptic connectivity between different brain regions.
Alterations in several molecules, including trophic factors and adhesion molecules, are assumed to underlie such structural changes, which occur in opposite directions in distinct brain regions. Importantly, these changes are associated with functional impairments at specific neural circuit level.

Our results confirm a divergent structural reorganization of corticostriatal circuits in humans exposed to prolonged stress, with hypertrophy/overactivation of the sensorimotor and atrophy/deactivation of the associative corticostriatal circuits. This frontostriatal reorganization is accompanied by a shift toward habitual strategies, affecting the ability of stressed individuals to perform actions based on their consequences. These results expand previous studies showing that acute stress can modulate decision-making processes in humans.34 For the first time, it is now demonstrated that such behavioral changes are linked to alterations in the frontostriatal networks in humans, thus providing insights into the neural circuits that underlie the shift between goal-directed actions and habitual behavior and into how this shift can lead to dysfunctional decision-making upon exposure to stress.

Notably, this stress-induced decision bias was found to be reversible after the end of the stress exposure, with signs of plasticity both at the structural and at the activational levels. This is in accordance with previous data in rodents and primates showing that stress-induced changes in the structure of the prefrontal cortex are reversible, at least in young subjects.7 As a consequence of the structural/activational reorganization, we found a behavioral restoration of decision-making strategies in subjects that have been exposed to stress. This novel finding is important inasmuch as optimization of decision-making processes confers an advantage in response to a constantly changing environment. Indeed, under conditions of maladaptive stress, there is a reduced ability to shift from habitual strategies to goal-directed behaviors, even when conditions would recommend that shift. However, it is also true that the fronto-striatal networks, even after prolonged stress, preserve the plastic properties that allow for a functional recovery once the stressful stimuli are gone. Therefore, these results are not only of relevance to understand the mechanisms through which stress modulates decision-making in both physiological and pathological conditions, but may also pave the way for interventional therapies that empower stress-coping mechanisms.