Common Sense in Choice: The Effect of Sensory Modality on Neural Value Representations

Abstract

Although it is well established that the ventromedial prefrontal cortex (vmPFC) represents value using a common currency across categories of rewards, it is unknown whether the vmPFC represents value irrespective of the sensory modality in which alternatives are presented. In the current study, male and female human subjects completed a decision-making task while their neural activity was recorded using functional magnetic resonance imaging. On each trial, subjects chose between a safe alternative and a lottery, which was presented visually or aurally. A univariate conjunction analysis revealed that the anterior portion of the vmPFC tracks subjective value (SV) irrespective of the sensory modality. Using a novel cross-modality multivariate classifier, we were able to decode auditory value based on visual trials and vice versa. In addition, we found that the visual and auditory sensory cortices, which were identified using functional localizers, are also sensitive to the value of stimuli, albeit in a modality-specific manner. Whereas both primary and higher-order auditory cortices represented auditory SV (aSV), only a higher-order visual area represented visual SV (vSV). These findings expand our understanding of the common currency network of the brain and shed new light on the interplay between sensory and value information processing.


Introduction
Throughout our daily life, the brain computes and assigns values to items or concepts it comes across. These value labels allow us to compare alternatives and decide which is preferable, for example, staying in bed for ten more minutes or getting up to go to work. Previous studies have identified a neural network which represents value in an abstract way (Levy and Glimcher, 2012; Clithero and Rangel, 2014). This network, termed the common currency network, is composed mainly of the ventromedial prefrontal cortex (vmPFC) and the ventral striatum (vStr; Levy and Glimcher, 2012; Bartra et al., 2013). Both regions track and represent reward values irrespective of their identity or type, ranging from tangibles such as DVDs (Levy et al., 2011) and snack foods (Plassmann et al., 2007; Levy and Glimcher, 2011; Pogoda et al., 2015), to pastime activities (Gross et al., 2014), attractive faces (O'Doherty et al., 2003), and social interactions (Lin et al., 2012; Ruff and Fehr, 2014). However, such representations should allow us to compare not only across categories, but also between the value of heard information, like the footsteps of a loved one, and the value of things we see, such as their smile.
Only a handful of experiments examining the properties of this network have employed sensory modalities other than vision (Kühn and Gallinat, 2012). Studies of the neural value representations of odors (O'Doherty et al., 2000; Anderson et al., 2003; Grabenhorst et al., 2007; Howard et al., 2015), tastes (O'Doherty et al., 2002; McCabe and Rolls, 2007), and their combination (De Araujo et al., 2003) showed a correlation between the activity of the medial orbitofrontal cortex and ratings of subjective pleasantness. One study examined the representation of beauty in the brain by comparing neural activity in response to listening to music and looking at paintings. The authors found that both conditions activate the vmPFC as a function of how beautiful a stimulus is (Ishizu and Zeki, 2011). However, subjects did not make actual choices between options, and it is not clear whether beauty ratings are comparable to value-based choices. Therefore, it remains unknown whether the common currency network represents subjective value (SV) irrespective of sensory modality. In other words, just how common is the common currency network?
A complementary question concerns the nature of value representation in sensory cortices. In the visual domain, value modulation was observed in both early (Serences, 2008; Zeki and Stutters, 2012) and higher-order areas (Chatterjee et al., 2009), such that neural activity correlated with the rewarding properties of visual stimuli. There is evidence for value modulation in the auditory domain as well (Thiel et al., 2002; Weinberger, 2007; Brosch et al., 2011; Puschmann et al., 2013), suggesting that associating a specific tone with appetitive or aversive consequences can alter the neural activity of the auditory cortex. Furthermore, neural correlates of pleasure derived from music appear in both the vStr and the superior temporal gyrus, where the auditory cortex lies (Mueller et al., 2015), and the functional coupling between them increases as music's reward value increases (Salimpoor et al., 2013). Taken together, these findings suggest that sensory cortices are influenced by value. However, whether this influence is modality-specific or cross-modal is unclear. Furthermore, to our knowledge, no study has examined the auditory cortex during visual decision making or vice versa. Finally, it is unknown where within the sensory cortex such a representation resides.
To address these questions, we conducted a neuroimaging study using a standard risk task (Holt and Laury, 2002, 2005; Levy and Glimcher, 2011), in which subjects performed a series of choices between a risky and a safe alternative. Importantly, the alternatives were presented either visually or aurally. Using this task, we identified regions of the brain representing SV as a function of the modality of presentation, as well as brain areas tracking SV irrespective of the sensory modality. We then used cross-modality classifiers to examine whether this representation is truly generic, such that a classifier trained on one modality can distinguish values in the other. Two potential outcomes exist. One, in line with evidence that the vmPFC represents stimuli of various modalities, the common currency network could represent visually presented rewards similarly or identically to the same rewards presented aurally. Alternatively, since visual and auditory information are not equivalent in terms of their neural representation, and the brain is biased toward visual information both anatomically (Zeki, 1993) and functionally (e.g., Bigelow and Poremba, 2014), visually presented rewards may yield a different representation within the value system compared with aurally presented rewards.

Participants
Forty-three healthy subjects participated in the study (21 females; mean age 25, range 20-43). Subjects gave informed written consent before participating in the study, which was approved by the local ethics committee at Tel Aviv University. All subjects completed at least one behavioral session. Of them, 40 completed the second behavioral session (one dropped out, and two were excluded due to random choice behavior). Of them, 26 participated in the neuroimaging session: six subjects opted out, two did not complete a full scanning session due to technical problems (one was accidentally scanned using the wrong protocol, and one had trouble using the MR-compatible glasses), five subjects were not called back due to inconsistent risk preferences, and one due to random choice behavior in the second session.

Risk-evaluating task
On each trial, subjects chose between a presented lottery and a certain amount of money (a reference option). The lottery consisted of an amount of money [10, 35, 45, 50, or 75 New Israeli Shekels (NIS); 1 NIS is approximately 0.25 USD] and a chance of winning it (15%, 30%, 45%, 62%, or 80%), presented consecutively. The presentation order was counterbalanced across trials. The reference option was always a certain amount of 10 NIS; it was not presented during the trials, only in the instructions stage, and subjects were reminded of it at the beginning of each block. Each lottery was presented either visually or aurally (Fig. 1). On visual trials, the amount and probability appeared as white text on a black background for 2 s each. Next, a green fixation-cross appeared for 300 ms (2 s in the fMRI session), after which subjects indicated their choice (lottery or reference) by clicking the right or left button of a computer mouse. The button mapping remained constant throughout trials, blocks, and sessions for each subject, but was counterbalanced between subjects. The time window to indicate a choice was 1.5 s long. Next, feedback appeared on screen: a check mark when the subject made a choice, and the text "no choice was made" otherwise. On auditory trials, subjects heard male voice recordings of the phrases "## shekalim" and "at ## percentage" via headphones. As in visual trials, amount and probability were presented for 2 s each, followed by a beeping sound (analogous to the green fixation-cross in the visual trials), signaling subjects to choose. Feedback was either another beep if the subject made a choice, or a buzzer sound if the subject failed to respond within the allotted time.

Stimuli
In the behavioral sessions, we used all 25 lottery options that can be composed from the five amounts and five probabilities. In the fMRI session, we used a subset of 13 options, sampling the center of the payoff matrix and some of the corners: 10 NIS at 15%, 10 NIS at 80%, 35 NIS at 30%, 35 NIS at 45%, 35 NIS at 62%, 45 NIS at 35%, 45 NIS at 45%, 45 NIS at 62%, 50 NIS at 35%, 50 NIS at 45%, 50 NIS at 62%, 75 NIS at 45%, and 75 NIS at 80%. For the auditory lotteries, we used Hebrew text-to-speech software (Alma Reader, Kolpics) to create audio files of the amounts and probabilities read out loud. Each stimulus lasted approximately 2 s. In the fMRI session, the stimuli were delivered using S14 in-ear headphones by Sensimetrics, and the volume was adjusted manually per subject to ensure that the auditory task was delivered clearly and overcame the MR background noise (approximately 75 dB).

Behavioral sessions
Before starting the first behavioral session, subjects gave written consent, read the instructions, and filled out a short demographics questionnaire. Next, they underwent a short training session, consisting of five auditory and five visual trials. On successful completion of the training, subjects performed the risk-evaluating task. The task was composed of 12 blocks, six visual and six auditory, presented in a random order. Each block consisted of 25 trials presented in a random order. At the end of the task, one trial was randomly selected, implemented, and paid out to the subject in addition to the participation fee. After completing the first session, we assessed each subject's risk preference (for details, see below, Risk-preference estimation). Subjects whose behavior we were unable to fit with a utility function (due to random choices) were considered outliers and were not asked to return. The other subjects were called in for a second behavioral session, in which they performed the task again. The average time elapsed between the first and second sessions was 13.025 d (range 5-61). We then calculated subjects' risk preference based on the second session, and subjects with stable scores (a difference smaller than 0.25 in their estimated risk-preference parameter) were called back for the final fMRI session. We chose the 0.25 threshold based on the behavior of the first ten participants in our study; their average difference across sessions was 0.1, with a standard deviation of 0.135. Therefore, any subject with a difference of more than one SD above the mean was discontinued from the study. The average elapsed time between the second behavioral session and the fMRI session was 44.42 d (range 5-311). Note that the first two subjects in the experiment had an extremely long duration between behavioral sessions and fMRI scans. When disregarding them, the average time between the second session and the fMRI session dropped to 24.45 d (range 5-94).
Notwithstanding, both subjects' risk-preference scores remained stable across the three time points, with changes in their fitted risk-preference parameter (α) < 0.12.

fMRI session
In the fMRI session, subjects performed the risk-evaluating task while being scanned. (Figure 1 caption: On each trial, subjects saw or heard a lottery, a winning probability followed by an amount of money; the order of presentation of the amount and probability was counterbalanced across trials. After a short go-signal, subjects chose between the lottery and a sure amount of money, which was always 10 NIS and was not presented on each trial. Numbers on top represent durations in seconds in the behavioral sessions; durations in the fMRI experiment are in brackets. s, seconds; ITI, intertrial interval.) The task was identical to the task used in the behavioral sessions, except for the addition of an intertrial interval to account for the hemodynamic delay (mean duration 8 s, jittered between 6, 8, and 10 s). Each run consisted of three repetitions of the 13 possible trials (a total of 39 trials per run) for a given sensory modality, and each subject completed a total of four functional runs. The runs were randomly ordered across the session. After completing the main task, we obtained an anatomic scan for each subject. Lastly, each subject completed two functional localizers, one visual and one auditory, to identify individual loci of sensory activations (for details, see below, Functional localizers). At the end of the fMRI session, one trial was selected at random and paid out to the subject, in addition to the participation fee.

Functional localizers
For the visual localizer, we used two visual categories: objects (black-and-white images) and scrambled objects (the same objects broken into pixels and scrambled into nonrecognizable images). The localizer consisted of 21 blocks. The duration of each block was 16 s, and blocks were presented in a pseudo-random order. Of the 21 blocks, eight were object blocks, eight were scrambled-object blocks, and the remaining five, interleaved between the other blocks, consisted of a blank screen which served as a baseline. Within each block, 20 images were presented for 800 ms each. To make sure that subjects paid attention to the localizer stimuli presented on the screen, at the beginning of each run we presented two images (one object and one scrambled) from the pool of stimuli and instructed subjects to memorize both images and press a key whenever they appeared on screen during the run.
To locate auditory-sensitive regions, we used a localizer with auditory stimuli of three categories: silence (baseline), non-vocal sounds taken from The Voice Neurocognition Laboratory at the University of Glasgow (e.g., birds chirping, cars honking, etc.; Pernet et al., 2015), and sequences of beeps. The auditory localizer consisted of 21 blocks. The duration of each block was 8 s, and blocks were presented in a pseudo-random order. Eight blocks were non-vocal, eight were beeps, and the remaining five were silence, serving as baseline. As in the visual localizer, we asked subjects to memorize a particular sound beforehand and to press a key whenever it appeared in the run.

Image acquisition
Scanning was performed at the Strauss Neuroimaging Center at Tel Aviv University, using a 3T Siemens Prisma scanner with a 64-channel Siemens head coil. Anatomic images were acquired using MPRAGE, which comprised 208 1-mm-thick axial slices at an orientation of −30° to the AC-PC plane. To measure blood oxygen level-dependent (BOLD) changes in brain activity during the risk-evaluating task, a T2*-weighted functional multi-echo EPI pulse sequence was used [TR = 2 s; TE = 30 ms; flip angle = 90°; matrix = 74 × 74; field of view (FOV) = 222 mm; slice thickness = 3 mm]; 33 axial (−30° tilt) 3-mm slices with no interslice gap were acquired in ascending interleaved order. To measure neural activity during the functional localizers, a multi-band EPI sequence was used (TR = 2 s; TE = 30 ms; flip angle = 90°; matrix = 112 × 112; FOV = 224 mm; slice thickness = 2 mm); 58 axial (−30° tilt) slices without gaps were acquired in ascending interleaved order.

Image analysis
BrainVoyager QX (Brain Innovation, RRID:SCR_006660) was used for image analysis, with additional analyses performed in Matlab (MathWorks, RRID:SCR_001622). Functional images were sinc-interpolated in time to adjust for staggered slice acquisition, corrected for head movement by realigning all volumes to the first volume of the scanning session using six-parameter rigid body transformations, and de-trended and high-pass filtered to remove low-frequency drift in the fMRI signal. Data were also spatially smoothed with a Gaussian kernel of 4 mm (full-width at half-maximum). Note that for the multivariate analysis, we used the nonsmoothed data. Runs in which a subject moved >3 mm were removed from any further analyses (a total of three runs were removed). Images were then coregistered with each subject's high-resolution anatomic scan and normalized using the Montreal Neurologic Institute (MNI) template. All spatial transformations of the functional data used trilinear interpolation.

Risk-preference estimation
We used random utility theory to derive the subject-specific estimated SV for each modality. We pooled the choice data from all three sessions (two behavioral and one fMRI) and separated it into visual and auditory trials. For each subject, we modeled the utility functions for each sensory modality separately as power functions of the form

EU = p × X^(α_sm),

where p is the probability of an option yielding a reward (equal to 1 for the reference alternative, and varying between trials for the lottery alternative), X is the amount (in NIS) of the offered reward, and α_sm is the free parameter representing the subject-specific (s), modality-specific (m) attitude toward risk. With a power utility function, a value of α = 1 denotes risk neutrality, a value of α > 1 represents a risk-seeking individual with a convex utility function, and α < 1 represents a risk-averse individual with a concave utility function. We selected this particular equation to fit utility for its simplicity, minimal assumptions, single free parameter, and ability to predict choice behavior.
Using maximum likelihood estimation (MLE), we fitted the choice data of each modality to a single logistic function of the form

P_L = 1 / (1 + e^(−β_sm × (EU_L − EU_R))),

where P_L is the probability that the subject chose the lottery option, EU_L and EU_R are the expected utilities of the lottery and reference options, respectively, and β_sm is the slope of the logistic function, which is the second subject-specific, modality-specific free parameter. This analysis produced a fitted risk-preference parameter (α_sm) and a slope parameter (β_sm) for each sensory modality. It thus specified a utility function (or equivalently, a SV function) for each modality for each subject that could account for the trade-offs between risk and reward that we observed in our subjects.
To use the SV as a parametric regressor in our analysis, we calculated for each subject in each sensory modality the SV of a trial t (SV_t), defined as the probability of winning multiplied by the amount raised to the power of the subject's risk preference:

SV_t = p_t × X_t^(α_sm).

To examine the behavioral data for differences in risk preferences between sensory modalities, we averaged the risk parameters across subjects for a given sensory modality and used the nonparametric rank test.
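The estimation described above can be sketched as follows (a minimal Python illustration, not the authors' actual Matlab pipeline; the function names and the Nelder-Mead optimizer are our own assumptions):

```python
import numpy as np
from scipy.optimize import minimize

def fit_risk_preference(p, X, chose_lottery, ref_amount=10.0):
    """Fit alpha (risk attitude) and beta (logistic slope) by MLE
    for one subject and one sensory modality."""
    def neg_log_lik(params):
        alpha, beta = params
        eu_lottery = p * X ** alpha          # EU_L = p * X^alpha
        eu_ref = ref_amount ** alpha         # certain 10 NIS, p = 1
        # logistic choice rule: P_L = 1 / (1 + exp(-beta * (EU_L - EU_R)))
        p_l = 1.0 / (1.0 + np.exp(-beta * (eu_lottery - eu_ref)))
        p_l = np.clip(p_l, 1e-9, 1 - 1e-9)   # numerical safety
        return -np.where(chose_lottery, np.log(p_l), np.log(1 - p_l)).sum()

    res = minimize(neg_log_lik, x0=[1.0, 0.1], method="Nelder-Mead")
    return res.x  # [alpha_sm, beta_sm]

def subjective_value(p, X, alpha):
    """Trial-by-trial SV regressor: SV_t = p_t * X_t^alpha."""
    return p * X ** alpha
```

On simulated choices from a known risk-averse chooser, the recovered α lands near the generating value, which is the usual sanity check for this kind of fit.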
To ensure that differences in reaction times (RTs) cannot explain the neural response to different levels of value, we conducted two tests. One, we created a linear regression of RTs and SVs for each trial of the 26 participants in the fMRI session and clustered the errors by subject. A significant coefficient would indicate that any neural results might be due to an effect of elapsed time in the trial. Second, we correlated RTs to visual lotteries with RTs to auditory lotteries. To do so, we first arranged the data to have the same number of samples, such that if a subject missed a trial in one modality, we omitted a trial of the same lottery from the other modality. Then we sorted the RTs according to the lotteries, to make them comparable across modalities. Since the neural classification analysis is subject specific, we correlated each subject's auditory RTs with their visual RTs.
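The second, per-subject RT check could look roughly like the sketch below. This is a simplification of the trial matching described in the text (it aligns modalities via per-lottery mean RTs rather than per-trial sorting), and the function and argument names are hypothetical:

```python
import numpy as np
from scipy.stats import pearsonr

def rt_modality_correlation(rt_vis, ids_vis, rt_aud, ids_aud):
    """Correlate one subject's visual and auditory RTs after aligning
    trials by lottery identity (per-lottery mean RTs, a simplification
    of the trial-by-trial matching described in the text)."""
    common = np.intersect1d(ids_vis, ids_aud)   # lotteries answered in both
    vis = np.array([rt_vis[ids_vis == l].mean() for l in common])
    aud = np.array([rt_aud[ids_aud == l].mean() for l in common])
    return pearsonr(vis, aud)                   # (r, p)
```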

Statistical analysis

Whole-brain analysis of SV
To identify the neural correlates of auditory SV (aSV) and visual SV (vSV), we created a general linear model (GLM) with 11 predictors. The first two predictors contained the trial-by-trial SVs, separated by modality. Note that these values relate to the SV of the lottery presented on screen, irrespective of the subject's choice. These values were entered at the first two TRs of each trial, normalized, and convolved with the canonical hemodynamic response function (HRF). Another two dummy predictors represented trial identity, also separated into modalities and convolved with the HRF (aStick and vStick). The additional seven predictors consisted of six nuisance predictors obtained from the motion-correction stage, and a constant. Results were corrected for multiple comparisons using FDR.
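The construction of such a parametric SV predictor can be sketched as below. This is a generic illustration using a standard double-gamma HRF approximation, not necessarily the exact canonical kernel used by BrainVoyager; all names are hypothetical:

```python
import numpy as np
from scipy.stats import gamma

def canonical_hrf(tr=2.0, duration=32.0):
    """Double-gamma HRF sampled at the TR (a common approximation of
    the canonical HRF; not necessarily BrainVoyager's exact kernel)."""
    t = np.arange(0.0, duration, tr)
    h = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0
    return h / h.sum()

def sv_regressor(n_trs, onset_trs, svs, tr=2.0):
    """Parametric SV predictor: normalized SVs entered at the first
    two TRs of each trial, then convolved with the HRF."""
    svs = (np.asarray(svs, dtype=float) - np.mean(svs)) / np.std(svs)
    stick = np.zeros(n_trs)
    for onset, sv in zip(onset_trs, svs):
        stick[onset:onset + 2] = sv     # SV entered at the first two TRs
    return np.convolve(stick, canonical_hrf(tr))[:n_trs]
```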
Since the representation of value might be linked to choice, such that it encodes the value of the chosen alternative and not the offer (Raghuraman and Padoa-Schioppa, 2014), we conducted an additional GLM, modeling the chosen SV instead of the presented SV. In this GLM, the trial-by-trial SV was equal to the lottery SV when the subject chose the lottery, and to the reference SV when the subject chose the reference option. All reference trials were assigned SV_reference = 1 × 10^(α_sm). We modeled missed trials as 0. Then, we computed a Pearson correlation between the chosen-SV and presented-SV regressors to examine possible differences between the two models, applied the chosen-SV model to the neural data, and compared it to the presented-SV results.
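The chosen-SV regressor described above reduces to a simple case split per trial; a minimal sketch (hypothetical names):

```python
import numpy as np

def chosen_sv(p, X, chose_lottery, missed, alpha, ref_amount=10.0):
    """Chosen-SV regressor: lottery SV when the lottery was chosen,
    reference SV (1 * 10^alpha) otherwise, and 0 for missed trials."""
    sv = np.where(chose_lottery, p * X ** alpha, 1.0 * ref_amount ** alpha)
    return np.where(missed, 0.0, sv)
```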

Region of interest (ROI) analysis in sensory cortices
To identify value modulation in sensory areas, we first pinpointed eight ROIs for each subject (when possible), based on the functional localizers' data. Primary visual cortices (both left and right) were defined as the peak activity for the contrast scrambled objects > objects, at a significance level of z = 4. Similarly, we identified higher-order visual cortices using the opposite contrast (objects > scrambled objects) at the same significance threshold. Primary auditory cortices were defined as the peak activity for the contrast beeps > silence at a significance level of z = 4, and higher-order auditory cortices were defined using the contrast non-vocal sounds > beeps, at a slightly lower statistical threshold of z = 3, due to overall reduced activations. We then conducted the same GLM mentioned above and correlated SV with BOLD activity extracted from each of the eight subject-specific sensory ROIs. To examine whether a sensory ROI represents SV, we conducted two tests: one, we compared the β-values of aSV and vSV in each ROI to zero, using one-sample two-tailed t tests. Second, to test the specificity of the value representation, we directly compared aSV to vSV using one-tailed paired-samples t tests. Additionally, to be certain that the ROIs are indeed sensory, we compared the β-values of the trial-identity dummy variables (aStick and vStick) to zero.
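For one ROI, the two tests could be implemented roughly as follows (an illustrative sketch; the correction factor and names are our assumptions, and the one-tailed conversion of the paired test is one common convention):

```python
import numpy as np
from scipy.stats import ttest_1samp, ttest_rel

def roi_value_tests(beta_asv, beta_vsv, n_tests=8):
    """Across-subject tests on one ROI's beta-values: two-tailed
    one-sample tests of aSV and vSV against zero, and a one-tailed
    paired test of aSV > vSV, all Bonferroni-corrected (the factor
    n_tests is illustrative)."""
    _, p_a = ttest_1samp(beta_asv, 0.0)
    _, p_v = ttest_1samp(beta_vsv, 0.0)
    t_pair, p_pair = ttest_rel(beta_asv, beta_vsv)
    # convert the two-tailed paired p-value to one-tailed (aSV > vSV)
    p_pair = p_pair / 2 if t_pair > 0 else 1 - p_pair / 2
    bonferroni = lambda p: min(p * n_tests, 1.0)
    return {name: bonferroni(p) for name, p in
            [("aSV", p_a), ("vSV", p_v), ("aSV>vSV", p_pair)]}
```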
As the auditory localizer is less commonly used than the visual localizer, we wished to replicate any finding related to it by implementing an alternative method of defining it. We achieved this by using the web-based tool NeuroSynth (Yarkoni, 2014), which allows the creation of neural maps based on meta-analyses of the literature. We downloaded two maps, corresponding to "Heschl gyrus" and "planum temporale." Both maps were set to z = 10.5 to identify the most central region. Then, we applied the same four-predictor GLM, extracted β-values from the right and left ROIs, and repeated the one-sample (against zero) and paired-samples (aSV > vSV) t tests. All reported results were corrected for multiple comparisons using Bonferroni correction.

Multivoxel pattern analysis (MVPA)
Our main objective was to further strengthen our GLM findings that voxels in the vmPFC represent value on a common scale. To do so, we used a cross-modality classification algorithm, which allowed us to test the similarity of neural representation between the visual and auditory conditions. We used an MVPA ROI searchlight approach (Kriegeskorte et al., 2006) to determine voxels that exhibit a significant difference in activation between low- and high-value trials, thereby evaluating their value-representation properties. Furthermore, the cross-modality approach allowed us to determine voxels that are not only sensitive to value, but also insensitive to sensory modality. Thus, the searchlight analysis enabled us to define subregions of the vmPFC that represent modality-free value. To do so, we first determined low- and high-SV trials (lSV and hSV) for each subject using a median split. We restricted our analysis to the vmPFC, which was defined using an ROI from a meta-analysis of value representation (Bartra et al., 2013). The unsmoothed BOLD signal of each voxel in the ROI was z-scored to account for signal-intensity variations across runs. We then extracted the signal at the fourth TR (6 s) after stimulus onset. For each center voxel, the data of the 24 closest voxels (in Euclidean distance) during lSV and hSV trials were used as input to the classifier. We used a Matlab implementation of a support vector machine (SVM) classifier (Chang and Lin, 2011; RRID:SCR_010243) to classify lSV from hSV trials. To obtain a prediction score for the center voxel, a repetitive leave-2-out cross-validation analysis was performed, in which we trained the SVM to classify lSV and hSV trials of one sensory modality but tested it on trials from the other sensory modality. One trial of each condition (lSV and hSV) from the opposite modality was used as the test set, and all remaining trials of the trained modality were used for training.
The model's prediction score could be 0% (unsuccessful classification of both test trials), 50% (one successful), or 100% (both successful). The overall classification score for each center voxel is the average of the prediction-accuracy scores over the total number of iterations (100). This process was repeated for each voxel within the ROI, for each subject and each modality separately. To test for significance, we used a nonparametric permutation test, in which the labels of the high- and low-value trials were randomly shuffled on each leave-2-out iteration, and a prediction score was computed. On each iteration, we used 100 permutations of the labels and averaged over the iterations. We ran the process in two modes of classification: train-on-auditory/test-on-visual, and train-on-visual/test-on-auditory. Thus, each voxel was assigned two "real" (i.e., unshuffled) scores and 200 shuffled scores. We then averaged the classification results across the two modes of classification, yielding one real and 100 shuffled scores. Voxels were considered significant if their real performance exceeded 95% of the shuffled accuracy scores. To combine the results of all 26 subjects, we created a probability map of the significant voxels across subjects from the classification step. For presentation purposes, we present only voxels with significant classification in at least 75% of subjects.
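The core of this procedure for a single searchlight can be sketched as below. This illustration uses scikit-learn's linear SVM in place of the Matlab LIBSVM implementation cited in the text, and simplifies the permutation scheme (fewer test pairs per shuffle); function and variable names are hypothetical:

```python
import numpy as np
from sklearn.svm import SVC

def cross_modality_score(X_train, y_train, X_test, y_test,
                         n_iter=100, rng=None):
    """Train a linear SVM on one modality's searchlight patterns
    (trials x voxels) and test on leave-2-out pairs (one high-SV,
    one low-SV trial) from the other modality. Chance is 0.5."""
    if rng is None:
        rng = np.random.default_rng(0)
    clf = SVC(kernel="linear").fit(X_train, y_train)
    hi, lo = np.flatnonzero(y_test == 1), np.flatnonzero(y_test == 0)
    scores = [np.mean(clf.predict(X_test[[rng.choice(hi), rng.choice(lo)]])
                      == np.array([1, 0]))
              for _ in range(n_iter)]
    return float(np.mean(scores))

def permutation_p(X_train, y_train, X_test, y_test, n_perm=100, rng=None):
    """Non-parametric significance: rebuild the score with shuffled
    training labels and report the fraction of null scores >= real."""
    if rng is None:
        rng = np.random.default_rng(1)
    real = cross_modality_score(X_train, y_train, X_test, y_test)
    null = [cross_modality_score(X_train, rng.permutation(y_train),
                                 X_test, y_test, n_iter=10, rng=rng)
            for _ in range(n_perm)]
    return real, float(np.mean(np.array(null) >= real))
```

On synthetic data in which both modalities share the same multivoxel value signal, the cross-modality score rises well above chance while the shuffled null stays near 0.5, which is the logic of the test.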
To test the robustness of the vmPFC-only result, we performed an additional whole-brain searchlight cross-modality classification analysis. To do so, we defined a gray-matter mask and extracted the BOLD data from each voxel, as in the vmPFC-only analysis. The searchlight size was adapted for the bigger mask, to consist of 125 voxels. All other parameters of the classifier remained identical. We then computed a probability map of significant voxels across subjects, thresholded at 75% of subjects, and restricted cluster size to at least 15 contiguous voxels.
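The group-level thresholding steps (across-subject probability map, then a minimum cluster size) could be sketched as follows (an illustrative implementation, not the authors' code; the 6-connectivity default of `scipy.ndimage.label` is our assumption about contiguity):

```python
import numpy as np
from scipy import ndimage

def group_probability_map(subject_masks, prob_threshold=0.75):
    """Fraction of subjects in which each voxel classified
    significantly, thresholded across subjects (here at 75%)."""
    prob = np.mean(np.stack(subject_masks).astype(float), axis=0)
    return prob >= prob_threshold

def threshold_clusters(mask, min_size=15):
    """Keep only clusters of at least min_size contiguous voxels."""
    labels, n_clusters = ndimage.label(mask)
    keep = np.zeros_like(mask, dtype=bool)
    for i in range(1, n_clusters + 1):
        cluster = labels == i
        if cluster.sum() >= min_size:
            keep |= cluster
    return keep
```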

Behavior
Subjects performed a risk-evaluating task inside an fMRI scanner (Fig. 1). On each trial, subjects chose between a safe and a risky alternative. The safe alternative was a certain amount of money (the reference option, 10 NIS). The risky alternative was a lottery: some probability of winning some amount of money. The probabilities and amounts varied across trials. Importantly, we presented the amount and probability information either visually on a computer screen (the visual condition) or aurally via headphones (the auditory condition). We first calculated each subject's risk preference in each modality separately by fitting a logistic function to their choice behavior, using an MLE process with two free parameters, α and β (see Materials and Methods). The α parameter represents subjects' attitude toward risk, with scores under 1 representing risk aversion and scores above 1 representing risk seeking. The β parameter is the inverse temperature, or the slope of the logistic function, and it represents the level of noise in choices. We next constructed each subject's utility function separately for each modality, based on their own estimated risk preferences. Figure 2 depicts results from an example subject. Note that we report data and analyses conducted on the 26 subjects who completed both behavioral sessions and the fMRI session.
We first examined whether there is a difference between risk preferences across sensory modalities. As can be seen in Figure 3A, in the auditory condition the average risk preference was 0.62, with scores ranging from 0.26 for the most risk-averse subject to 1.31 for the most risk-seeking subject. In the visual condition, the average risk preference was 0.61, with scores ranging from 0.31 to 1.17. We did not find a significant difference between α_auditory and α_visual (n = 26, Z = −0.045, p = 0.96, Wilcoxon rank-sum test). This suggests that, on average, subjects display similar levels of risk preference irrespective of whether the choice is presented visually or aurally. The slopes of the logistic functions (the β parameter) did not differ between modalities either (n = 26, Z = 0, p = 1).
We next examined whether risk preferences across sensory modalities are correlated within subjects. As can be seen in Figure 3B, risk preferences across sensory modalities are highly correlated within subjects (Spearman R = 0.97, p < 0.001). Subjects show a high correlation of the slopes of the logistic function as well (Spearman R = 0.95, p < 0.001). This suggests that subjects' attitudes toward risk are preserved across sensory modalities. If a subject is highly averse to lotteries when they are presented visually, she would be equally averse to lotteries presented aurally. Likewise, the consistency of one's choices is not affected by the modality in which they are presented (Levy and Glimcher, 2011).
Because in our task probabilities and amounts were presented serially, we wanted to make sure that the presentation order did not influence subjects' risk preferences. Therefore, we split the data into trials in which the probability appeared first and trials in which the amount appeared first, and compared the average estimated risk parameters. We did not find an effect of the order of amount and probability presentation, comparing α_probability-first and α_amount-first (auditory: Z = −0.027, p = 0.97; visual: Z = −0.3, p = 0.76, Wilcoxon rank-sum test).
Finally, we wanted to ensure that any differences in value did not generate differences in RTs. The task was designed to prevent RT differences by introducing a wait period between the presentation of the lottery information and the implementation of the choice. However, any lingering differences in RT could create major confounds in the neural representations and must therefore be addressed. A linear regression of RTs on SVs revealed no relationship between the two (auditory coefficient = −0.004, p = 0.49; visual coefficient = −0.008, p = 0.5). We therefore conclude that any neural representation of value is not related to RTs. Another concern was that similarity in RTs between modalities would generate a similarity in neural representation that is not directly related to value. To address this issue, we correlated each subject's RTs to visual lotteries with their RTs to auditory lotteries. The correlation coefficients ranged from −0.24 to 0.28 across subjects, with only three of the 26 subjects exhibiting a correlation of p < 0.05, and none of p < 0.01. We conclude that similarity in RTs alone cannot explain the similarity in value representation.

Common value-representation: GLM
To identify areas of the brain that represent value across sensory modalities, we looked for voxels sensitive to the SV of options presented aurally (aSV) or visually (vSV). To this end, we used the estimated modality-specific individual risk preferences to calculate each trial's SV, by raising the lottery's amount to the power of α and multiplying it by the winning probability (probability × amount^α; see Materials and Methods). We entered the trial-by-trial SV variation into a GLM, with a separate predictor for each modality (aSV and vSV). As can be seen in Figure 4A, contrasting aSV with baseline revealed significant activations (q(FDR) < 0.05) in the vmPFC (MNI coordinate: 1, 46, −16). Contrasting vSV against baseline also revealed significant positive activations (q(FDR) < 0.05) in the vmPFC (MNI coordinate: 1, 47, −13).
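The power-utility SV formula given above (probability × amount^α) is simple enough to state directly in code. The example values below are hypothetical, not taken from the study:

```python
def subjective_value(probability, amount, alpha):
    """Power-utility subjective value: probability * amount ** alpha."""
    return probability * amount ** alpha

# Hypothetical lottery: 50% chance of winning $20.
# A risk-averse subject (alpha < 1) values it below its expected value;
# a risk-neutral subject (alpha = 1) values it at exactly 0.5 * 20 = 10.
print(subjective_value(0.5, 20.0, 0.7))   # < 10: risk-averse valuation
print(subjective_value(0.5, 20.0, 1.0))   # 10.0: risk-neutral valuation
```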
Next, to identify voxels that represent value in both modalities concurrently, we constructed a conjunction analysis of the two conditions (aSV ∩ vSV). As can be seen in Figure 4A, activation in an anterior part of the vmPFC (MNI coordinate: 0, 46, −14) positively tracks SV for both sensory modalities.
Note that in our analysis we modeled the presented-SV, that is, the SV of the lotteries irrespective of the subject's choice. Although the presented-SV and the chosen-SV are highly correlated (mean R across subjects and runs = 0.81, SD = 0.18), for completeness we also looked for value representation of the chosen alternative and not only the offer value (Raghuraman and Padoa-Schioppa, 2014). We constructed another GLM with chosen-SV as the main predictor instead of presented-SV and found a highly similar neural response. A conjunction analysis of chosen aSV ∩ chosen vSV revealed a cluster in the vmPFC (MNI coordinate: 0, 45, −14), overlapping with the cluster identified for presented-SV (Fig. 4C).

Common value-representation: MVPA
To further examine the neural substrate of SV and how modality influences it, we turned to multivariate pattern analysis (MVPA) using a cross-modality algorithm. This analysis can show that the value representations not only coincide in a similar anatomic region but are in fact functionally interchangeable. For each subject, we split the trials of each sensory modality into high- and low-value conditions based on the median SV. This resulted in four conditions: auditory-high, auditory-low, visual-high, and visual-low. We extracted the BOLD signal from the vmPFC and conducted a cross-modality classification analysis. We trained an SVM to classify high- versus low-value trials based on the pattern of vmPFC activity in one sensory modality, tested the model on the trials of the other sensory modality, and used a permutation test for significance (see Materials and Methods). We considered voxels that significantly classified the trials to represent value in a modality-free manner, since they are sensitive to the difference between high and low values but not to the sensory modality in which the lotteries were presented.
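The logic of the cross-modality classifier can be sketched with simulated "vmPFC" data. This is not the study's pipeline; the trial counts, voxel counts, signal strength, and number of permutations are all assumptions, chosen only to make the train-on-one-modality, test-on-the-other idea concrete:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)

# Hypothetical ROI data: trials x voxels, with a value pattern ("signal")
# shared by both modalities, so cross-modality transfer should succeed.
n_trials, n_voxels = 60, 50
signal = rng.normal(0, 1, n_voxels)

def simulate(labels):
    return rng.normal(0, 1, (len(labels), n_voxels)) + np.outer(labels, signal)

y_aud = np.repeat([0, 1], n_trials // 2)  # low / high median-split labels
y_vis = np.repeat([0, 1], n_trials // 2)
X_aud, X_vis = simulate(y_aud), simulate(y_vis)

# Train on one modality, test on the other, and average both directions.
clf = SVC(kernel="linear")
acc_av = clf.fit(X_aud, y_aud).score(X_vis, y_vis)
acc_va = clf.fit(X_vis, y_vis).score(X_aud, y_aud)
observed = (acc_av + acc_va) / 2

# Permutation test: shuffle training labels to build a null distribution.
null = []
for _ in range(200):
    y_perm = rng.permutation(y_aud)
    null.append(clf.fit(X_aud, y_perm).score(X_vis, y_vis))
p_value = (np.sum(np.array(null) >= observed) + 1) / (len(null) + 1)
print(f"cross-modality accuracy = {observed:.2f}, p = {p_value:.3f}")
```

Because the simulated value pattern is shared across modalities, the classifier transfers well; a modality-specific pattern would leave cross-modality accuracy at chance.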
In significant voxels, we found an average classification accuracy across subjects of 59% (SD: 1.17%, range: 56.94–62.24%). To identify where in the vmPFC this common representation of value is located, we created a probability map across subjects, counting the number of subjects for whom a given voxel was significant. Figure 5A shows the resulting probability map of the vmPFC, thresholded at a minimum of 75% of subjects. A voxel with a high probability level is one with a shared representation of aSV and vSV in many subjects. Importantly, the cluster common to most subjects (85%) was located in the anterior part of the vmPFC. To ensure the robustness of this result, we also searched at the whole-brain level for voxels that successfully classify value in a cross-modality manner. This analysis revealed a common value representation in the medial PFC, at a slightly more dorsal and anterior location (MNI coordinates: 7, 55, −1; for a full list of findings, see Fig. 5B; Table 1). Both the vmPFC-only and whole-brain findings are in accordance with previous work (Smith et al., 2010), which identified the anterior vmPFC (aVMPFC; MNI coordinates: 0, 46, −8) as a region sensitive to experienced value. More specifically, those authors report a conjunction of monetary value and social value located at a strikingly similar area of the vmPFC to the result we report in the GLM analysis (MNI coordinate: 0, 46, −14) as well as in the vmPFC-only MVPA analysis (MNI coordinate: −3, 48, −10).
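Building the across-subject probability map amounts to averaging per-subject significance masks voxelwise and thresholding. A minimal sketch with simulated masks (the ROI size and per-voxel significance rate are assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical per-subject masks over a small ROI:
# True where the cross-modality classifier was significant for that subject.
n_subjects, n_voxels = 26, 200
masks = rng.random((n_subjects, n_voxels)) < 0.6  # illustrative 60% rate

# Probability map: fraction of subjects in which each voxel is significant,
# thresholded at 75% of subjects for display.
prob_map = masks.mean(axis=0)
display_voxels = prob_map >= 0.75
print(f"{display_voxels.sum()} voxels pass the 75% threshold")
```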

Value modulation of sensory areas
Figure 4. Common value representations across sensory modalities. Whole-brain random-effects maps; n = 26. A, Significant voxels tracking aSV, vSV, and a conjunction between the two modality-specific SV predictors (aSV ∩ vSV). B, All three maps superimposed on each other. C, Conjunction between the two modality-specific chosen-SV predictors (aCSV ∩ vCSV), superimposed on the presented subjective-value conjunction map. All maps are shown at MNI coordinate x = 0. All maps are at p < 0.05 (FDR corrected) but shown at different thresholds for presentation purposes (aSV at z = 6.5, vSV at z = 4, conjunction at z = 4.3).

Next, we focused on the sensory cortices themselves and examined whether they too convey value information and whether the neural representation is specific to a sensory modality. By presenting a series of sounds and images while scanning subjects' neural activity (see Materials and Methods, Functional localizers), we identified the primary and higher-order visual and auditory cortices for each subject. We defined eight functional ROIs per subject: right and left primary auditory cortex, right and left higher-order auditory cortex (found at the superior temporal gyrus), right and left primary visual cortex, and right and left higher-order visual cortex [the lateral occipital cortex (LOC); Fig. 6A,B]. We then used the same GLM as in the common value-representation analysis. This univariate model holds four main predictors: two dummy variables for modality (aStick, vStick) and two subjective-value predictors, separated by modality (aSV, vSV), which represent the subject-specific trial-by-trial variation of SV. For each subject, we extracted the β-values of the aSV and vSV predictors from each of the eight ROIs and tested whether each tracks SV (i.e., differs significantly from zero).
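The structure of this four-predictor GLM and the subsequent group test can be sketched with simulated data. Everything here is illustrative: the trial counts, effect sizes, and noise levels are assumptions, and HRF convolution is omitted for brevity:

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(5)

# Hypothetical design matrix for one run: two modality dummies plus two
# modality-specific trial-by-trial SV regressors.
n_trials = 80
is_aud = rng.random(n_trials) < 0.5
sv = rng.uniform(5, 25, n_trials)
X = np.column_stack([
    is_aud,                    # aStick dummy
    ~is_aud,                   # vStick dummy
    np.where(is_aud, sv, 0),   # aSV regressor
    np.where(~is_aud, sv, 0),  # vSV regressor
])

# Simulated mean ROI signal that scales with auditory SV only,
# mimicking an auditory-cortex ROI.
y = X @ np.array([0.5, 0.5, 0.03, 0.0]) + rng.normal(0, 0.5, n_trials)
betas = np.linalg.lstsq(X, y, rcond=None)[0]
print(f"aSV beta = {betas[2]:.3f}, vSV beta = {betas[3]:.3f}")

# Group-level test: compare per-subject aSV betas against zero.
subject_betas = rng.normal(0.03, 0.01, size=26)  # illustrative group values
t, p = ttest_1samp(subject_betas, 0)
print(f"t = {t:.2f}, p = {p:.2g}")
```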

As can be seen in Figure 6C, we found that the visual and auditory cortices indeed represent SV, but they do so in a modality-specific way. That is, unlike the vmPFC, which represents value irrespective of the modality in which information is presented, the sensory cortices represent value only for their corresponding sensory modality. Interestingly, whereas all four subject-specific auditory ROIs are significantly influenced by SV (maximal p < 0.0001, Bonferroni corrected, two-tailed t test), only the higher-order visual cortices are sensitive to changes in SV (maximal p < 0.0001, Bonferroni corrected, two-tailed t test). To ensure the specificity of the effect, we directly compared the regression coefficients (β-values) between modalities using a paired t test. This confirmed our original analysis: aSV is significantly greater than vSV in all four auditory ROIs (all p < 0.0001), while vSV is greater than aSV only in the left and right higher-order visual ROIs (p = 0.028 and p = 0.0003, respectively, FDR corrected).
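The paired comparison of modality-specific coefficients can be sketched as follows, with simulated per-subject betas for one hypothetical auditory ROI (the magnitudes are assumptions, not the reported values):

```python
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(6)

# Hypothetical per-subject betas in an auditory ROI, where aSV is
# reliably larger than vSV (illustrative values only).
beta_asv = rng.normal(0.04, 0.01, size=26)
beta_vsv = rng.normal(0.00, 0.01, size=26)

# Paired t test: same subjects contribute both coefficients.
t, p = ttest_rel(beta_asv, beta_vsv)
print(f"paired t = {t:.2f}, p = {p:.2g}")
```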
To ensure that the auditory cortices were properly defined, we repeated this analysis using an alternative method of defining the ROIs. We downloaded two maps from the web-based tool NeuroSynth (Yarkoni, 2014), corresponding to the search terms Heschl gyrus and planum temporale (primary and higher-order auditory cortices, respectively). We set both maps to a threshold of z = 10.5 to maximize the spatial separation between them, applied the same GLM from the original analyses, and extracted the β-values of aSV and vSV. We repeated the two approaches to test for value representation in these ROIs: first, we compared the β-values to zero using a one-sample t test, and second, we directly compared aSV to vSV. This meta-analytical approach to defining the ROIs replicated the results we obtained with the functional localizer; namely, all four ROIs showed a significant aSV representation. Finally, to assert that the ROIs were defined properly and hold sensory information, we statistically compared the β-values of aStick and vStick to zero. We found that each sensory ROI is active in a modality-specific manner.

Figure 5. Cross-modality classification results; n = 26. To establish a common representation across sensory modalities, we conducted a cross-modality classification analysis. An SVM was trained to classify high- from low-value trials of one sensory modality and tested on the other; hence, significant voxels are sensitive to value but not to sensory modality. A, Results from a vmPFC-only analysis. A probability map for a voxel to be significant across subjects, at a threshold of 75% and up (in purple). In yellow, the area of the vmPFC mask. The map is shown at MNI coordinate x = −3. B, Results from a whole-brain analysis (in orange), superimposed on the vmPFC-only result (in purple).
That is, primary and higher-order visual regions responded to visual trials but not to auditory trials, whereas primary and higher-order auditory regions responded to auditory trials but not to visual ones (visual ROIs: vStick mean activity = 0.62, mean SD = 0.46, all p < 0.001; aStick mean activity = 0.007, mean SD = 0.27, all p values nonsignificant; auditory ROIs: aStick mean activity = 0.56, mean SD = 0.4, all p < 0.005; vStick mean activity = 0.04, mean SD = 0.27, all p values nonsignificant).

Discussion
The results of our study provide novel insights into the nature of the human common currency network. Using a within-subject design, we directly compared the behavioral and neural representations of systematically and rigorously measured SVs across two sensory modalities. On the behavioral level, there were no differences, on average, in subjects' attitude toward risk when comparing visual and auditory lotteries. In fact, individuals' risk preferences were highly correlated between modalities. On the neural level, we found that the anterior portion of the vmPFC tracks SV irrespective of the sensory modality in which choice alternatives are presented. We demonstrated this using both a univariate approach, a conjunction analysis of trial-by-trial variations in SV, and a cross-modality classification algorithm, which showed that some voxels of the vmPFC are sensitive to value but not to modality. Finally, we showed that the visual and auditory cortices, defined functionally for each subject, are also sensitive to value. Unlike the vmPFC value region, the sensory cortices hold value information in a modality-specific manner; that is, the visual cortex is sensitive to the value of lotteries presented visually, while the auditory cortex is sensitive to the value of lotteries presented aurally.

Figure 6. Value modulation of sensory cortices. Individual subjects' sensory ROIs were defined using functional localizers, and subjective-value β-values were extracted and tested against zero. A, B, For each subject, four contrasts were defined to identify primary and higher-order areas of each modality. Each color represents an individual subject. Upper panels, primary auditory (A) and visual (B) cortices. Lower panels, higher-order auditory (A) and visual (B) cortices. C, SV representation in these areas. The y-axis represents the extracted β-values from a GLM, which included four predictors: aSV (denoted A), vSV (denoted V), and two dummy variables for trial (results not shown here). Each colored marker represents an individual subject's β-value. Black horizontal lines represent the means; **p < 0.001, Bonferroni corrected.
The temporal imprecision of fMRI renders it unsuitable for definitively answering this question, and further studies with higher temporal and spatial resolution are needed to examine it.
Because the TR in our design was an integer multiple of the ITI, it could create over- or underestimation of differences in neural activity across slices of the brain, posing a limitation for our findings. However, this limitation does not affect the main finding we report here, which involves comparisons of neural activity within the same region (vmPFC) and not across slices. The only finding that might be affected by the biased sampling of the HRF is the discrepancy we found between the primary and higher-order visual cortices in their representation of value. In this case, both areas are located in slices close to each other with respect to the sampling sequence (mean z MNI coordinate: primary visual cortex = −3, higher-order visual cortex = −4), rendering any potential bias minimal.
In summary, we show that the common currency network of the human brain represents the value of stimuli irrespective of the sensory domain in which they are presented. Additionally, we show that the sensory cortices hold information regarding the value of stimuli in a modality-specific manner. These findings bring our understanding of the neural valuation system a step closer to real-world environments, where individuals choose between multidimensional alternatives composed of information from different domains and sensory modalities.