Functional Connectivity Basis and Underlying Cognitive Mechanisms for Gender Differences in Guilt Aversion

Abstract
Prosocial behavior is pivotal to our society. Guilt aversion, which describes the tendency to reduce the discrepancy between a partner’s expectation and his/her actual outcome, drives human prosocial behavior, as does the well-known inequity aversion. Although women are reported to be more inequity averse than men, gender differences in guilt aversion remain unexplored. Here, we conducted a functional magnetic resonance imaging (fMRI) study (n = 52) and a large-scale online behavioral study (n = 4723) of a trust game designed to investigate guilt and inequity aversions. The fMRI study demonstrated that men exhibited stronger guilt aversion and recruited right dorsolateral prefrontal cortex (DLPFC)-ventromedial PFC (VMPFC) connectivity more for guilt aversion than women, while VMPFC-dorsal medial PFC (DMPFC) connectivity was commonly used in both genders. Furthermore, our regression analysis of the online behavioral data collected with Big Five and demographic factors replicated the gender differences and revealed that Big Five Conscientiousness (rule-based decision) correlated with guilt aversion only in men, but Agreeableness (empathetic consideration) correlated with guilt aversion in both genders. Thus, this study suggests that gender differences in prosocial behavior are heterogeneous depending on underlying motives in the brain and that the consideration of social norms plays a key role in the stronger guilt aversion in men.


Introduction
Prosocial behaviors are fundamental to human society. The most widely recognized motivation behind prosocial behaviors is inequity aversion (Fehr and Schmidt, 1999), which is defined as the propensity to avoid an imbalance between outcomes for the self and the other person. A large body of behavioral research (Bolton and Katok, 1995; Eckel and Grossman, 1998; Andreoni and Vesterlund, 2001; Dickinson and Tiefenthaler, 2002; Croson and Gneezy, 2009; Kamas and Preston, 2015; Grosch and Rau, 2017) has accumulated evidence that women are more prosocial than men because women are more inequity-averse. However, economic research has shown that human prosocial behavior depends not only on preferred behavioral outcomes (e.g., fairness) but also on the beliefs of others (for review, see Fehr and Schmidt, 2006). People tend to live up to the expectations of others, since they suffer from guilt if they disappoint others (Baumeister et al., 1994). In behavioral game theory, this psychological process is named "guilt aversion" (Charness and Dufwenberg, 2006; Dufwenberg, 2007, 2009), in which an individual dislikes disappointing another person relative to what the other person believes they should receive (see Materials and Methods for a more detailed definition). However, gender differences in guilt aversion have been far less explored.
Previous functional magnetic resonance imaging (fMRI) studies of guilt aversion have revealed involvement of the dorsolateral prefrontal cortex (DLPFC), dorsal medial PFC (DMPFC), ventromedial PFC (VMPFC), insula, supplementary motor area, and temporal parietal junction (Chang et al., 2011;Nihonsugi et al., 2015;van Baar et al., 2019). For instance, it was demonstrated that the DLPFC is causally involved in the implementation of guilt aversion by integrating fMRI and transcranial direct current stimulation (tDCS; Nihonsugi et al., 2015). Considering these contributions of prefrontal cortices in guilt aversion, we assumed that prefrontal network interactions among the DLPFC, DMPFC, and VMPFC play a key role in producing the gender difference in guilt aversion, if any. In particular, the VMPFC may well be central to the gender difference in guilt aversion because several lesion studies (Tranel et al., 2005;Sutterer et al., 2015) demonstrated that the VMPFC is involved in the gender differences in social cognition.
It is also possible that the gender difference in guilt aversion reflects different cognitive strategies used by men and women. Guilt aversion requires the ability to assess another individual's expectations and is directly related to that individual's disappointment (i.e., empathy or theory of mind; Hoffman, 1982). At the same time, guilt aversion is a normative behavior elicited by experience (i.e., rule-based decisions; Haidt, 2003). Therefore, we also hypothesized that if there is a gender difference in guilt aversion, these two potential cognitive strategies, empathetic consideration and rule-based decision-making, may contribute to the difference.
Regarding inequity aversion, previous fMRI studies (Haruno and Frith, 2010;Tricomi et al., 2010;Gospic et al., 2011;Crockett et al., 2013;Haruno et al., 2014;Tanaka et al., 2017) revealed involvement of the ventral striatum and amygdala. An integration of pharmacological intervention and fMRI also demonstrated that activity in the ventral striatum is critical for gender differences in this aversion (Soutschek et al., 2017). Therefore, we hypothesized that women show stronger inequity aversion than men, with the striatum and amygdala playing a critical role.
To test these hypotheses from a neuro-cognitive point of view, we conducted a model-based fMRI study and a large-scale online behavioral study of the trust game task, which was designed to measure guilt aversion and inequity aversion. The fMRI study investigated the neural and network mechanisms for the guilt and inequity aversions, with particular focus on gender differences. For the online behavioral data, a regression analysis of guilt aversion was conducted based on Big Five and social factors, such as age and socioeconomic status, by which we expected cognitive and societal aspects of guilt aversion would be revealed.

Intersection of fMRI and online studies
Trust game
Participants performed a trust game adapted from the task originally used by Charness and Dufwenberg (2006). In this task, two subjects are paired as players A and B (Fig. 1A). First, player A must choose between the In and Out options and simultaneously reveal their belief $t_A$ (from 0% to 100%), the probability that player B will choose Cooperate. In other words, $t_A$ is player A's level of trust in player B. If player A chooses Out, players A and B receive payments $z_A$ and $z_B$, respectively. If player A chooses In, then, knowing player A's belief probability, player B must choose Cooperate or Defect. If player B chooses Defect, player A receives $y_A$ and player B receives $y_B$; if player B chooses Cooperate, then the two players receive $x_A$ and $x_B$, respectively. In the example shown in Figure 1B, the belief probability of player A was 80%. If player B defected, player A and player B would receive 220 and 910 yen, respectively; if player B cooperated, they would receive 780 and 650 yen, respectively.
There are two important conditions regarding the payments in Figure 1A (see also the definitions of guilt and inequity aversion below): (1) if $y_A < z_A < x_A$, then player A signals trust (cooperation) to player B when player A chooses In; (2) if $z_B < x_B < y_B$, then player B feels guilt on disappointing player A relative to player A's belief in what player A will receive. This trust game was originally designed and used in Nihonsugi et al. (2015).

Guilt aversion and inequity aversion
Guilt aversion (Charness and Dufwenberg, 2006; Dufwenberg, 2007, 2009) describes a player's dislike of disappointing another person relative to what that person believes they will receive. The guilt sensitivity elicited in the trust game by guilt aversion theory is fundamentally related to the Test of Self-Conscious Affect-3 (TOSCA-3) and the Guilt and Shame Proneness Scale (GASP), which are common measures of guilt sensitivity in psychology, but is unrelated to shame (Bracht and Regner, 2013; Bellemare et al., 2019).
This model includes social pressure on player B if the profile (In, Defect) is played (Fig. 1A). Player B is assumed to believe that if player A chooses In, then player A believes that he/she will get a return of $t_A x_A + (1 - t_A) y_A$. The difference between this expected return and player A's payoff under Defect, $t_A x_A + (1 - t_A) y_A - y_A = t_A (x_A - y_A)$, which is non-negative in our settings, can measure how much player B believes that he/she has disappointed player A relative to player A's belief had player B chosen Defect. In other words, the difference $t_A (x_A - y_A)$ is the amount of guilt that player B experiences. Let us assume that $g_B$ is the parameter that measures player B's sensitivity to guilt. A player is guilt-averse and will Cooperate if $y_B - g_B \cdot t_A (x_A - y_A) < x_B$. In the example trial in Figure 1B, if $910 - g_B \cdot 0.8 \cdot (780 - 220) < 650$, player B will choose Cooperate. Since $g_B$ does not directly measure guilt experiences or emotional traits, we can only infer that "$g_B$ expresses sensitivity of guilt." As mentioned in Results, however, our interpretation that $g_B$ expresses a guilty experience is consistent with the results of the postexperiment questionnaire.
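As a worked illustration of the cooperation condition above, the following minimal sketch computes the guilt term and the guilt sensitivity above which player B would cooperate, using the payoffs of the example trial in Figure 1B. The function names are hypothetical and introduced only for illustration, and inequity is ignored here, as in the condition itself.

```python
def guilt_amount(t_a, x_a, y_a):
    """Guilt player B incurs by defecting: t_A * (x_A - y_A)."""
    return t_a * (x_a - y_a)


def cooperation_threshold(t_a, x_a, y_a, x_b, y_b):
    """Value of g_B at which y_B - g_B * t_A * (x_A - y_A) equals x_B.

    Player B cooperates for any g_B strictly above this value.
    """
    return (y_b - x_b) / guilt_amount(t_a, x_a, y_a)


def cooperates(g_b, t_a, x_a, y_a, x_b, y_b):
    """Cooperation condition from the text (guilt only, no inequity term)."""
    return y_b - g_b * guilt_amount(t_a, x_a, y_a) < x_b


# Example trial from Figure 1B: t_A = 80%, Defect -> (220, 910), Cooperate -> (780, 650).
guilt = guilt_amount(0.8, 780, 220)                          # 0.8 * 560
threshold = cooperation_threshold(0.8, 780, 220, 650, 910)   # 260 / 448

print(f"guilt = {guilt:.1f} yen, cooperate if g_B > {threshold:.3f}")
```

In this example trial the condition reduces to $g_B > 260/448 \approx 0.58$, so a player with a guilt sensitivity of 0.6 would cooperate, whereas one with a sensitivity of 0.5 would defect.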
By contrast, inequity aversion assumes a social preference for equitable payoffs (Fehr and Schmidt, 1999). An individual is inequity-averse if, in addition to their monetary self-interest, their utility decreases when the allocation of monetary payoffs is unequal. If an inequity-averse player suffers from inequity, they will choose an option that results in a smaller difference between their own and the other's monetary payoffs. Notably, the advantageous inequity (receiving a larger reward than others) in Fehr and Schmidt's inequity-aversion model is also referred to as "guilt." However, it is important to note that this outcome-based "guilt" and the intention-based "guilt" we treat in guilt aversion are completely different.
As mentioned below, based on the results of the model selection using both the cross-validation analysis (predictive likelihood) and the Bayesian information criterion (BIC; Fig. 2B,C; see also below, Model validation and comparison), the model using the absolute difference for inequity was found to be superior in the present study to the standard inequity aversion model, which splits the inequity into positive and negative terms.
Figure 1. Task design. A, Design of the trust game. First, player A chooses In or Out and reveals a belief probability that player B will choose Cooperate. If player A chooses Out (i.e., does not trust player B), players A and B receive $z_A$ and $z_B$, respectively. If player A chooses In (i.e., trusts player B), then with the knowledge of player A's belief probability, player B decides whether to Cooperate or Defect. If player B chooses Defect, players A and B receive $y_A$ and $y_B$, respectively; if Cooperate, players A and B receive $x_A$ and $x_B$, respectively. The actual assignment of x, y, z, and $t_A$ for the 45 trials is shown in Extended Data Figure 1-1. B, An outline and example of experimental trials. After the green fixation period (2-5 s; cue phase), a task condition is presented for 5 s (choice phase), and participants are asked to press the Cooperate or Defect button (blue and red, respectively). Then, a yellow fixation cross is shown for 6-15 s (rest phase). C, An illustration of the complete experimental paradigm. For both the fMRI and online studies, in the first experiment, participants (as player A) chose In or Out and revealed their belief probability that player B would choose Cooperate. In the second experiment, participants (as player B) chose to Cooperate or Defect. Participants made their decisions while being scanned in the fMRI experiment.
Instructions for the first and second experiments are shown in Extended Data Figure 1-2.
We integrated guilt aversion and inequity aversion into a utility function ($u_B$) for player B as follows:

$$u_B(m_A, m_B) = m_B - g_B \cdot \max\{0,\; t_A x_A + (1 - t_A) y_A - m_A\} - a_B \cdot |m_A - m_B|,$$

where $(m_A, m_B)$ is the payoff allocation resulting from player B's choice and $a_B$ is a constant that measures player B's sensitivity to inequity. A narrowly self-interested agent is given by the special case $g_B = a_B = 0$. In our game, players choose between binary actions that yield two different monetary payoff allocations, $X = (x_A, x_B)$ and $Y = (y_A, y_B)$. The utilities of these allocations are given by the formula above, yielding $u_B(X)$ and $u_B(Y)$.

Statistical analysis of behavioral data
We estimated three separate components, monetary self-interest, guilt, and inequity, for each participant based on the logistic model of stochastic choice. The probability that player B chooses Cooperate can be expressed as

$$P_{B,\mathrm{Cooperate}} = \frac{1}{1 + e^{-\{u_B(X) - u_B(Y)\}}}.$$

Although our model does not include an inverse temperature parameter explicitly, this does not imply that the model ignores decision noise. In fact, our model implicitly assumes the inverse temperature parameter to be 1. Such an implementation of the softmax function with the inverse temperature parameter set to 1 is often seen in behavioral analyses of economic decision-making (Boorman et al., 2009; Cai and Padoa-Schioppa, 2014; Suzuki et al., 2015) because the inverse temperature is relatively difficult to estimate. Based on this logistic model, we used a logistic regression as follows:

$$\log \frac{P_{B,\mathrm{Cooperate},t}}{1 - P_{B,\mathrm{Cooperate},t}} = b_1 \mathrm{Reward}_t + b_2 \mathrm{Guilt}_t + b_3 \mathrm{Inequity}_t,$$

where $\mathrm{Reward}_t$ is the size of the reward, calculated as $x_B - y_B$ at time t; $\mathrm{Guilt}_t$ is the size of guilt, calculated as $-\{0 - t_A \cdot (x_A - y_A)\}$; and $\mathrm{Inequity}_t$ is the size of inequity, calculated as $-(|x_A - x_B| - |y_A - y_B|)$. For convenience, $b_1$, $b_2$, and $b_3$ are denoted as b(Reward), b(Guilt), and b(Inequity), respectively. In order to orthogonalize the three explanatory variables, the actual $t_A$ used in the experiments was also set by the experimenter. Player B was asked to make decisions assuming that player A chose the In option. We therefore set $t_A$ to 60% or higher (player A is expected to choose the Out option when $t_A$ is small). More specifically, $t_A$ was 60% 7 times, 70% 5 times, 80% 13 times, 90% 11 times, and 100% 9 times. We display the actual values of x, y, z, and $t_A$ in Extended Data Figure 1-1. The correlation coefficients among the three explanatory variables were less than 0.30 and nonsignificant (p > 0.05); the values of guilt and inequity were designed to be orthogonal [the correlation coefficient of these two variables was −0.138 and nonsignificant (p = 0.367)] to dissociate the computational processes for guilt aversion and inequity aversion. This logistic regression was computed using the R statistical package (R Core Team, 2021). We used the brglm package to conduct our maximum likelihood estimation with the bias-reduction method (Kosmidis, 2019).
Figure 2. Behavioral results. A, In the fMRI study (n = 26 men, 26 women), the b value for guilt was higher in men than in women (p = 0.046, t test), whereas the b value for inequity was higher in women than in men (p = 0.039, t test). B, We validated and compared the performance of 10 models using the repeated 3-fold cross-validations and found that the model containing three predictors (Reward, Guilt, and Inequity) was best for both fMRI and online studies.
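The trial-wise regressors and the logistic choice rule described in this subsection can be sketched as follows. This is only an illustration in Python, not the R/brglm analysis used in the study; the function names are hypothetical, and the coefficient values are arbitrary placeholders rather than estimates from the data.

```python
import math


def regressors(t_a, x_a, x_b, y_a, y_b):
    """Reward, Guilt and Inequity for one trial, as defined in the text."""
    reward = x_b - y_b
    guilt = -(0 - t_a * (x_a - y_a))
    inequity = -(abs(x_a - x_b) - abs(y_a - y_b))
    return reward, guilt, inequity


def p_cooperate(b_reward, b_guilt, b_inequity, trial):
    """Logistic choice probability with the inverse temperature fixed at 1."""
    reward, guilt, inequity = regressors(**trial)
    utility_diff = b_reward * reward + b_guilt * guilt + b_inequity * inequity
    return 1.0 / (1.0 + math.exp(-utility_diff))


# Example trial from Figure 1B.
trial = {"t_a": 0.8, "x_a": 780, "x_b": 650, "y_a": 220, "y_b": 910}
reward, guilt, inequity = regressors(**trial)

# Placeholder coefficients (assumed for illustration only).
p = p_cooperate(0.01, 0.005, 0.002, trial)
print(reward, guilt, inequity, round(p, 3))
```

For the example trial, cooperating costs player B 260 yen (Reward = −260), avoids a guilt of 448, and reduces the payoff gap from 690 to 130 yen (Inequity = 560), so positive guilt and inequity coefficients push the choice toward Cooperate.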

Model validation and comparison
Our utility model comprises three separate components: Reward, Guilt, and Inequity, as defined above. With regard to Inequity, we adopted the absolute difference for Inequity. However, participants may alternatively use Fehr and Schmidt's model, which splits the inequity into positive (called Inequity-positive hereafter) and negative (called Inequity-negative hereafter) terms. Therefore, we need to verify which model (component) better explains the data for the current experiments.
To address this issue, we first compared 10 possible models (for details of the 10 models, see Fig. 2B) based on the predictive negative log-likelihoods obtained by cross-validation. This cross-validation approach for value-based decision-making allows us to avoid overfitting the data and to compare models with different numbers of parameters robustly. It has also been adopted in many recent studies (Daw, 2011; Smith et al., 2014; Linderman and Gershman, 2017; Park et al., 2019; Fig. 2B). We also compared the more familiar BIC values for the models and reported the models with the lowest and second-lowest BIC to confirm the results (Fig. 2C).
More specifically, to compute the minimum predictive negative log-likelihood, we used repeated (500 iterations) 3-fold cross-validation for the model validation and comparison. For each model, we randomly divided the 45 trials of each participant into three groups of equal size (i.e., 15). We fitted the model to 30 trials, predicted the behavior in the held-out 15 trials, and repeated this process three times so that each group served once as the held-out set. We repeated this 3-fold cross-validation procedure 500 times and selected the model with the minimum predictive negative log-likelihood for held-out trials.
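The repeated 3-fold procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the data are synthetic, the logistic model is fitted by plain gradient ascent rather than the bias-reduced estimation used in the study, the helper names are hypothetical, and only 5 repetitions are run to keep the example fast (the study used 500).

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, steps=500, lr=0.1):
    """Maximum likelihood by gradient ascent (stand-in for the bias-reduced fit)."""
    w = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        w = [wj + lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def neg_log_lik(w, X, y, eps=1e-12):
    """Negative log-likelihood of the observed choices under weights w."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
        p = min(max(p, eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def three_fold_splits(n_trials, rng):
    """Shuffle trial indices and cut them into three equal groups.

    Any remainder (n_trials not divisible by 3) is dropped; with 45 trials
    each group has exactly 15 trials.
    """
    idx = list(range(n_trials))
    rng.shuffle(idx)
    size = n_trials // 3
    return [idx[k * size:(k + 1) * size] for k in range(3)]


def repeated_cv_nll(X, y, n_repeats, rng):
    """Mean held-out negative log-likelihood over repeated 3-fold CV."""
    scores = []
    for _ in range(n_repeats):
        folds = three_fold_splits(len(X), rng)
        nll = 0.0
        for k in range(3):
            test = folds[k]
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            w = fit_logistic([X[i] for i in train], [y[i] for i in train])
            nll += neg_log_lik(w, [X[i] for i in test], [y[i] for i in test])
        scores.append(nll)
    return sum(scores) / len(scores)


# Synthetic participant: 45 trials, three scaled predictors, simulated choices.
rng = random.Random(0)
true_w = [1.0, 2.0, 1.5]
X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(45)]
y = [1 if rng.random() < sigmoid(sum(w * x for w, x in zip(true_w, xi))) else 0
     for xi in X]

folds = three_fold_splits(45, rng)
nll_full = repeated_cv_nll(X, y, n_repeats=5, rng=rng)
nll_first_only = repeated_cv_nll([[xi[0]] for xi in X], y, n_repeats=5, rng=rng)
print(f"full model: {nll_full:.2f}, first predictor only: {nll_first_only:.2f}")
```

The model with the smaller held-out negative log-likelihood would be preferred; because the score is computed on trials not used for fitting, extra parameters are not rewarded unless they improve prediction.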

fMRI study
Participants
A total of 52 participants (mean age 21.2 years; SD = 1.4 years; 26 females) participated in the fMRI experiments. They were scanned on a Siemens 3T Trio scanner at the Center for Information and Neural Networks (CiNet) of the National Institute of Information and Communications Technology (NICT). The ethical committees of the NICT approved this study, and all participants gave informed consent. Participants received money proportional to the payoffs earned during the experiment (equivalent to 45-60 United States dollars). Although our task was the same as the one in Nihonsugi et al. (2015), we recruited an entirely new set of participants in this study for two main reasons. First, we had access to a 64-channel MRI coil to analyze the DMPFC and VMPFC. The 64-channel brain coil provides a 1.3-fold higher signal-to-noise ratio in the brain cortex than the 32-channel array (Keil et al., 2013). Second, the number of participants (n = 42) in Nihonsugi et al. (2015) was not enough for re-analysis; Yarkoni (2009) suggested that a sample size of >50 is necessary for identifying a moderate correlation at relatively conservative thresholds. Additionally, we wished to test whether we could replicate our previous results.

Experimental design and procedure
We conducted two experiments in which participants played a trust game in different roles (Fig. 1C; see also the instructions in Extended Data Fig. 1-2). In the first (behavioral) experiment, >10 participants per experiment were invited into a room and read instructions on the rules and procedure of the trust game. Every participant played the trust game as player A (i.e., chose In or Out and revealed belief probability $t_A$) and experienced one trial. The participants were informed that these choices would be used when player B made their choice in the second (fMRI) experiment. However, player A was not informed of player B's identity. Participants were told that earnings for player A would be determined according to the actual outcome of both players' choices if A's choice was used in the second experiment.
The second experiment was conducted on average 6 d (range = 1-10 d) after the first experiment. All participants played the game as player B (i.e., choose Cooperate or Defect with knowledge of player A's belief probability) for 45 trials. Participants were instructed to assume that player A chose In in this experiment (the Out option is illustrated as a dashed line in Fig. 1B). The sequence of the trials was randomized across subjects. Participants were told that the other participant (player A) differed for each trial and that the pairings were anonymous. We did not provide any feedback to the participants during the experiment. Participants were also informed that earnings for player B will be the sum of the show-up fee and the actual outcome obtained from both players' choices in the 45 trials.
Because there was the risk that player B felt that the other player was hypothetical rather than real, we invited >10 participants at a time into a room in the first experiment to make them realize the other's presence and impress on them that they would have a real partner in the second experiment. In addition, when giving instructions for the second experiment, we repeatedly explained that we had conducted similar first experiments many times and that there were many player As and the partner in the second experiment was one of them. In other words, on the day of the second experiment, the participants were likely to think about other participants in the first experiment. Thus, although the experiment was hypothetical, we assume that the participants were engaged in the tasks as if they were in a real interaction. Indeed, no participant reported or even referred to the absence of their partner in a postexperiment interview.
After reading the instructions for the task and procedure, the participants were briefed about the rules of the game by the experimenter and tested to confirm that they understood the rules. They were then individually invited into the scanning room and practiced the game using the response buttons in the scanner.
Functional images were acquired as participants played the game. The timeline of a trial is shown in Figure 1B. Each trial began with a 2- to 5-s preparation interval, during which a green fixation cross was presented for the first 1 s and then a yellow fixation cross (cue phase) was presented for the remainder of the time. The participants were then presented with the trust game, including the allocation of monetary payoffs for each choice and player A's belief, and selected Cooperate or Defect by pressing the corresponding button within 5 s (choice phase). In each trial, participants made their choice on the assumption that player A chose In. This was followed by the presentation of a fixation cross for a variable time period of 6-15 s (rest phase).
After scanning, all participants answered the questionnaire. For guilt aversion behavior, participants were asked to answer the following three questions on a five-point scale (1: I don't think so, ..., 5: I think so):
a. Did you think that the reason why player A chose In was because they expected (and aimed) to gain $x_A$ yen (i.e., the result of player B choosing Cooperate)?
b. Did you think that choosing Defect would reduce the payoff ($x_A$ yen) expected by player A?
c. Did you feel guilt that your choice of Defect would reduce the payoff ($x_A$ yen) expected by player A?
Question a examined whether the respondent understood the partner's intention of choosing In (the meaning behind the expectation); question b examined whether the respondent was aware that their choice of Defect reduces their partner's expected payoff; and question c asked whether the respondent felt guilty when he/she reduced their partner's expected payoff.
fMRI data preprocessing
SPM12 (http://www.fil.ion.ucl.ac.uk/spm) was used for the MRI data preprocessing and analysis. Preprocessing included motion correction, coregistration to the participant's anatomic image, and spatial normalization to the standard Montreal Neurological Institute (MNI) T2 template with a resampled voxel size of 2 mm. Coregistered EPI data were normalized using an anatomic normalization parameter. Spatial smoothing was performed using an 8-mm Gaussian kernel.

General analysis methods
To explore the neural basis of guilt, inequity and value difference, we performed a general linear model (GLM) analysis of the functional data. We constructed two GLM models.
GLM 1. To model the blood oxygen level-dependent (BOLD) signal driven by Guilt and Inequity, the two variables were convolved with a hemodynamic response function (HRF; spm_hrf function with TR equal to 3.0 s). For the first-level GLM analysis, the onset and duration were the onset timing of the Choice phase and 0 s, respectively. In addition to a response-period constant regressor, we introduced (1) an HRF for Guilt and (2) an HRF for Inequity. Additional regressors modeling head motion, as derived from the realignment procedure, were included in the model. Serial autocorrelation was modeled as a first-order regressor, and data were high-pass filtered at a cutoff of 128 s.
We calculated second-level group contrasts using one-sample t tests to reveal the main effect of each parametric regressor within participants using the individual contrast images. To correct for multiple comparisons, for the Guilt contrast we used the familywise error (FWE) correction across the whole brain at p < 0.05 based on Gaussian random field theory as implemented in SPM12 [minimum cluster extent (k) > 20 voxels; see also Extended Data Fig. 3-1 for the actual cluster size]. Since the analysis of Inequity targets small regions, such as the striatum and amygdala, we set the minimum cluster extent to 20 voxels to keep the extent size the same throughout the analysis of Guilt and Inequity. When analyzing Inequity, we used a threshold of p < 0.001 uncorrected for the whole-brain analysis.
GLM 1.1. After calculating GLM1, a two-sample t test was used to compare the Guilt contrast between men and women. For the whole-brain analysis, a threshold of p < 0.001 uncorrected with an extent threshold of k = 20 was adopted.
GLM 1.2. After calculating GLM1, a two-sample t test was used to compare the Inequity contrast between men and women. For the whole-brain analysis, a threshold of p < 0.001 uncorrected with an extent threshold of k = 20 was adopted.
GLM 2. We modeled brain activity related to utility. For the first-level analysis, we entered the value difference between choice options (larger utility minus smaller utility) as a parametric modulator of a regressor. The onset and duration were the onset timing of the Choice phase and 0 s, respectively. Additional regressors modeling head motion, as derived from the realignment procedure, were included in the model. Serial autocorrelation was modeled as a first-order regressor, and data were high-pass filtered at a cutoff of 128 s.
We calculated second-level group contrasts using one-sample t tests to reveal the main effect of each parametric regressor within participants using the individual contrast images. An additional regressor modeling a gender-indicating variable was included in the model. For the whole-brain analysis, we used a threshold of p < 0.001 uncorrected.

Region of interest (ROI) analysis
For the Guilt contrast in GLM1, because of the lack of adequate previous neuroimaging studies and consistent imaging results for guilt aversion, we had no specific a priori hypothesis and performed no ROI analysis. However, for GLM1.1 (gender difference in guilt), we did have an a priori hypothesis from previous lesion studies that showed the VMPFC is involved in gender differences in social cognition (Tranel et al., 2005; Sutterer et al., 2015). Therefore, we performed an ROI analysis on whether this region survived a small volume correction at p < 0.05 with an FWE correction. For the Inequity contrast in GLM1 and GLM1.2, because we had an a priori hypothesis from previous research that found the amygdala and striatum are involved in inequity (Haruno and Frith, 2010; Tricomi et al., 2010; Gospic et al., 2011; Crockett et al., 2013; Haruno et al., 2014; Tanaka et al., 2017) and there exists a gender difference in inequity (Soutschek et al., 2017), we employed an ROI analysis with a small volume correction (p < 0.05; small volume FWE corrected). With regard to value difference in GLM2, we again had an a priori hypothesis from previous research that found the VMPFC is involved in value difference (Nicolle et al., 2012). Therefore, we employed an ROI analysis with a small volume correction (p < 0.05; small volume FWE corrected).
The small volumes of the VMPFC and DMPFC were based on 15-mm spheres around the coordinates (x = 2, y = 41, z = −6) and (x = −3, y = 48, z = 30), respectively, because these coordinates were used in a neuroimaging study (Baumgartner et al., 2011) of social preferences similar to ours. In that study, the VMPFC coordinates were determined by averaging the peak coordinates across five neuroimaging studies (value and economic decision-making), and the DMPFC coordinates were based on a meta-analysis study on social cognition (van Overwalle, 2009). Furthermore, the VMPFC coordinates (subjective value: x = 2, y = 46, z = −8; decision stage: x = 2, y = 40, z = −8) in a previous meta-analysis (Bartra et al., 2013) are very close to the coordinates we used. The small volumes for the amygdala and striatum were defined using the WFU PickAtlas toolbox (Maldjian et al., 2003).
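The proximity of the meta-analytic VMPFC coordinates to the sphere center used here can be checked with a short calculation. The sketch below is only an illustration: it takes the coordinates reported above, treats the 15 mm as a radius (an assumption, since the text does not state radius versus diameter), and uses hypothetical variable and function names.

```python
import math

# VMPFC sphere center and assumed radius (mm), from the text above.
VMPFC_CENTER = (2, 41, -6)
RADIUS_MM = 15.0

# VMPFC peaks reported for the meta-analysis of Bartra et al. (2013).
META_ANALYSIS_PEAKS = {
    "subjective value": (2, 46, -8),
    "decision stage": (2, 40, -8),
}


def inside_sphere(point, center, radius_mm):
    """True if the Euclidean distance from center to point is within the radius."""
    return math.dist(point, center) <= radius_mm


distances = {name: math.dist(peak, VMPFC_CENTER)
             for name, peak in META_ANALYSIS_PEAKS.items()}
all_inside = all(inside_sphere(peak, VMPFC_CENTER, RADIUS_MM)
                 for peak in META_ANALYSIS_PEAKS.values())

for name, d in distances.items():
    print(f"{name}: {d:.2f} mm from the sphere center")
```

Both peaks lie roughly 5.4 mm and 2.2 mm from the center, respectively, and therefore fall well inside the small volume under this assumption.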

Psycho-physiological interaction (PPI) analysis
We performed two PPI analyses using the PPI functionality of SPM12.
PPI1. Having confirmed that the VMPFC was involved in value difference by the GLM2 analysis, we next conducted a hypothesis-based PPI analysis to examine whether this VMPFC activity truly integrates the value components of Guilt and Inequity. More specifically, we used the VMPFC (shown in Fig. 3C) as a seed region and examined whether brain areas associated with VMPFC × Guilt overlapped with the Guilt-correlated areas (i.e., DLPFC and DMPFC in Fig. 3A) and whether brain areas associated with VMPFC × Inequity overlapped with the Inequity-correlated area (i.e., striatum in Fig. 3B). For each subject, we extracted the time course of activity from a 5-mm-radius volume of interest (VOI) around the peak voxel in the VMPFC (shown in Fig. 3C). Based on the procedure by Gitelman et al. (2003), the time series of the VOI was extracted and then deconvolved, multiplied with the psychological variable (size of Guilt or Inequity), and reconvolved with the HRF to form the PPI regressor. The three regressors (i.e., PPI regressor, VOI time series, and psychological variable) were then convolved with the canonical HRF and entered into the regression model along with six head motion parameters. The individual parameter estimate image for the PPI regressor was subsequently subjected to a one-sample t test. Finally, we also included a gender-indicating variable and performed a group analysis to identify brain regions showing increased functional connectivity with the seed VOI during the Choice phase. For the whole-brain analysis, we used a threshold of p < 0.001 uncorrected.
PPI2. The goal of this analysis was to examine whether different brain networks are involved in the computation of guilt and inequity between men and women. More specifically, this analysis aimed to find differences between men and women in brain regions that correlate more strongly with VMPFC or striatum activity as guilt or inequity increases. For each subject, we extracted the time course of activity from VOIs with a 5-mm radius around the peak voxel in the VMPFC, as shown in Figure 4A, and the ventral striatum, as shown in Figure 5A. For this analysis, the PPI terms were defined as VMPFC × guilt and ventral striatum × inequity. We entered six variables (i.e., PPI regressor, VOI time series, and psychological variable for guilt and inequity, respectively) and movement regressors into a GLM. The individual parameter estimate image for the PPI regressor was subsequently subjected to a one-sample t test. Finally, group analysis was performed to identify brain regions showing increased functional connectivity with the seed VOIs. A two-sample t test was performed to further assess different connectivity patterns between men and women. For the whole-brain analysis, a threshold of p < 0.001 uncorrected at the peak voxel level with an extent threshold of k = 20 was adopted.

Mediation analysis
We performed a mediation analysis to test whether the interaction between gender and guilt-based prosocial behavior was mediated by a brain function using a mediation toolbox (https://github.com/canlab/MediationToolbox; Wager et al., 2008). Briefly, this analysis was based on a standard three-variable path model, as shown in Figure 4D. This analysis quantifies the degree to which a relationship between two variables, X and Y, can be explained by another variable, M.
For the guilt-aversion behavioral analysis, we defined X as the gender-indicating variable (1 = men), Y as the behavioral variable, b(Guilt), and M as the brain variable, the functional connectivity between the right DLPFC and VMPFC (Fig. 4A). Following convention, we required that three tests reach statistical significance in the mediation analysis. First, path a measured the association between the gender-indicating variable and the functional connectivity. Second, path b measured the association between the functional connectivity and b(Guilt) after controlling for the gender-indicating variable. Third, the mediation effect, defined as the product of the indirect paths (a × b), must be significant. We refer to the overall predictor-outcome relationship as effect c and the direct effect controlling for the mediator as c′. Thus, the a × b effect tests the significance of c − c′. We conducted bootstrap tests (10,000 iterations) for statistical significance of the mediators.
For inequity-aversion behavioral analysis, we defined X as the gender-indicating variable (1 = women), Y as the behavioral variable b ðInequityÞ, and M as the brain variables (striatum shown in Fig. 5A).
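The three-variable path model can be sketched in a few lines. The code below is not the Mediation Toolbox (which is MATLAB software) and does not reproduce its interface; it is a minimal stand-in, with hypothetical function names and toy data, that estimates paths a, b, c, and c′ by ordinary least squares and forms a percentile bootstrap interval for a × b.

```python
import random

def _sscp(u, v):
    """Sum of centered cross-products of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def paths(x, m, y):
    """Return (a, b, c, c_prime) from OLS fits with intercepts:
    M ~ X gives a; Y ~ M + X gives b and c_prime; Y ~ X gives c."""
    sxx, smm, sxm = _sscp(x, x), _sscp(m, m), _sscp(x, m)
    sxy, smy = _sscp(x, y), _sscp(m, y)
    det = sxx * smm - sxm ** 2
    if sxx < 1e-12 or det < 1e-12:
        raise ValueError("degenerate sample: X constant or M collinear with X")
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / det
    c_prime = (sxy * smm - smy * sxm) / det
    c = sxy / sxx
    return a, b, c, c_prime

def bootstrap_ab(x, m, y, n_boot=10000, seed=0):
    """Percentile 95% interval for the indirect effect a*b."""
    rng = random.Random(seed)
    n, draws = len(x), []
    while len(draws) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            a, b, _, _ = paths([x[i] for i in idx], [m[i] for i in idx],
                               [y[i] for i in idx])
        except ValueError:
            continue  # skip resamples in which the model cannot be fitted
        draws.append(a * b)
    draws.sort()
    return draws[int(0.025 * n_boot)], draws[int(0.975 * n_boot) - 1]

# Hypothetical toy data: X = gender indicator, M = connectivity, Y = beta(Guilt).
x = [1] * 6 + [0] * 6
m = [0.9, 1.1, 0.8, 1.3, 1.0, 1.2, 0.2, 0.4, 0.1, 0.5, 0.3, 0.6]
y = [2.1, 2.4, 1.9, 2.8, 2.2, 2.5, 1.0, 1.3, 0.9, 1.4, 1.1, 1.6]
a, b, c, c_prime = paths(x, m, y)
ci_low, ci_high = bootstrap_ab(x, m, y, n_boot=2000)
```

For least-squares fits of this kind, a × b equals c − c′ exactly, which is why testing the product of the indirect paths is equivalent to testing the drop from the total to the direct effect.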

Online study
Participants
We analyzed data from 4723 participants (mean age 37.9 years, SD = 15.4 years, 2737 females; for more detailed descriptive statistics, see Table 1) who followed the task instructions correctly and spent longer than 1 h completing seven different personality trait tests, including the Big Five Inventory and measures of anxiety (STAI) and depression (BDI), together with the trust game task. These data were collected using our in-house online experiment system. The study protocol was approved by the ethical committees of the NICT, and all participants gave informed consent. For their participation, participants were paid in cashable points proportional to the number of payoffs earned during the experiment (equivalent to 3-5 United States dollars).

Experimental design and procedure
Participants performed a trust game on our in-house online experiment system in a similar way to the fMRI study (Fig. 1C). We conducted two consecutive experiments in which participants played a trust game in a different role. Before the first experiment, online participants read the rules of the trust game and the procedure. In the first experiment, every participant played the trust game as player A (i.e., choose In or Out and reveal belief probability τ_A) and experienced one trial. Participants knew that these choices would be used when player B made their choice in the second behavioral experiment and that the pairings were anonymous.
In the second experiment, all participants played the game as player B (i.e., choose Cooperate or Defect with knowledge of player A's belief probability). Participants (player B) were instructed to assume that player A chose In. Every participant experienced 45 trials. The sequence of the trials was randomized across subjects. Participants were told that the other participant (player A) differed for each trial and that the pairings were anonymous. We did not provide any feedback to the participants during the experiment. All participants answered seven different personality trait tests including the Big Five Inventory. The final earnings were calculated following the same pattern as the fMRI study.

Evaluation of cognitive mechanisms using Big Five Inventory
Figure 4. Results of gender differences for guilt in neural activity. A, Men showed greater VMPFC activity than women (p = 0.029). As displayed in the box plot, the extracted contrast estimates in the VMPFC demonstrate that men showed increased VMPFC activity in response to guilt (p < 0.001, t test). Importantly, the VMPFC seed exhibited positive correlation with activity in the right DLPFC as guilt increases for men but not for women (p < 0.001, uncorrected). Differences of activities related to guilt between men and women are listed in Extended Data Figure 4-1. B, Overlay of the VMPFC, which is related to gender difference in Guilt (blue), and the Guilt-correlated region (red). For display purposes, the threshold for the Guilt areas is p < 0.001 uncorrected and the VMPFC threshold is p < 0.005 uncorrected. The activation of the VMPFC involved in gender difference in Guilt largely overlaps with the clusters of activation correlated with guilt (overlap area; brown). C, Overlay of the VMPFC cluster shown in Figure 3C, which was positively correlated with the value difference (green), and the VMPFC cluster shown in A, which showed differential activation in the guilt contrast (men > women; blue). These two areas are close but do not overlap. D, Using a PPI analysis, a comparison of men and women showed enhanced functional connectivity of the VMPFC with the right DLPFC during the processing of guilt only in men (orange areas). This activation area (DLPFC) largely overlaps with the clusters of activation correlated with guilt shown in Figure 3A (shown in this figure as red areas). Results of the PPI analysis for guilt when testing for gender differences are shown in Extended Data Figure 4-2. E, Mediation analysis of the relationship of gender, DLPFC-VMPFC connectivity and β(Guilt) shows that DLPFC-VMPFC connectivity is a complete mediator of the interaction between gender and guilt-aversion behavior. Path coefficients are shown next to arrows with SEs in parentheses; *p < 0.05, ***p < 0.001. F, Diagram summarizing the results of our analyses. Activities in the DLPFC and DMPFC were correlated with guilt in both genders. The blue line represents a stronger connectivity between the VMPFC and right DLPFC in men than in women depending on VMPFC × Guilt, and the green line represents stronger positive coupling between the VMPFC and DMPFC depending on VMPFC × value difference.

We first examined the relationship between guilt aversion [β(Guilt)] and gender. Specifically, we estimated β(Guilt) for participants by the same logistic regression as the fMRI study and compared β(Guilt) values between men and women. To investigate two different cognitive processes (i.e., agreeableness and conscientiousness) potentially underlying the gender difference in guilt aversion and to control for the confounding effects of the participant's socioeconomic status, we conducted a multiple linear regression analysis based on the following equation:
where Neuroticism_i, Extraversion_i, Openness_i, Agreeableness_i, and Conscientiousness_i are the individual's Big Five scores (Murakami and Murakami, 1999), Age_i is the individual's age, SelfEduHistory_i and ParentsEduHistory_i are the individual's score of educational history and his/her parents' score of educational history, respectively (Okada et al., 2014), Income_i and Occupation_i are the individual's income and occupation, respectively (Ganzeboom et al., 1992), and SubjectiveSES_i is the individual's subjective socioeconomic status (Adler et al., 2000). Sex_i is the binary variable representing the sex of individual i (men = 1) and is used to represent interactive effects with Big Five scores and socioeconomic status variables. The multiple linear regressions were conducted using the glm function in R (R Core Team, 2021).
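The regression equation that these definitions refer to does not appear in this text. As a hedged reconstruction from the variable definitions alone, a model of the following general form would be consistent with them. The coefficient symbols b_0, b_1, c_k, and d_k are assumptions introduced here, as is the choice of which covariates enter the interaction with Sex_i:

```latex
\beta(\mathrm{Guilt})_i
  = b_0 + b_1\,\mathrm{Sex}_i
  + \sum_{k=1}^{K} c_k\, Z_{k,i}
  + \sum_{k=1}^{K} d_k\,\bigl(\mathrm{Sex}_i \times Z_{k,i}\bigr)
  + \varepsilon_i ,
\qquad
Z_{k,i} \in \{\mathrm{Neuroticism}_i,\ \mathrm{Extraversion}_i,\ \mathrm{Openness}_i,\ \mathrm{Agreeableness}_i,\ \mathrm{Conscientiousness}_i,\
\mathrm{Age}_i,\ \mathrm{SelfEduHistory}_i,\ \mathrm{ParentsEduHistory}_i,\ \mathrm{Income}_i,\ \mathrm{Occupation}_i,\ \mathrm{SubjectiveSES}_i\}
```

Under this form, a model without gender-specific effects corresponds to setting every d_k to zero, so that only the main effect of Sex_i and the covariates remain.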

Results
fMRI study
Behavioral results of the fMRI study
We first performed a logistic regression analysis to determine whether reward, guilt, and inequity had an effect on participant behavior (Cooperate or Defect). Behavioral data from the fMRI experiment (n = 52) were analyzed using the utility function, which comprises a linearly weighted sum of reward, guilt, and (absolute) inequity (for details, see Materials and Methods). The β values of the three predictors, Reward, Guilt, and Inequity, were positive and significant (p < 0.001; Table 2), indicating that they all played critical roles in the current task.

Figure 5. Results of gender differences in neural activity for inequity. A, Women showed greater ventral striatum activity than men (p = 0.008). The box plot illustrates the contrast estimates in the right ventral striatum and shows that only women showed increased activity in response to inequity (p < 0.001, t test). Differences of activities related to inequity between men and women are summarized in Extended Data Figure 5-1. B, A mediation analysis shows that the mediation effect of the striatum is significant (a × b, p < 0.001). Path coefficients are shown next to the arrows with SEs in parentheses; *p < 0.05, ***p < 0.001.

Having confirmed that these three factors play crucial roles in the current task, we then compared β values between men and women. This analysis showed that the β value of Guilt [called β(Guilt) hereafter] of men was significantly higher (t(41.6) = 2.05, p = 0.046; Fig. 2A) than that of women, whereas the β value of Inequity [called β(Inequity) hereafter] of women was significantly higher (t(48.7) = 2.11, p = 0.039; Fig. 2A) than that of men. These findings show that gender differences in prosocial behavior are heterogeneous depending on the underlying motives.
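The fractional degrees of freedom reported above (41.6 and 48.7) are what an unequal-variance (Welch) two-sample t test produces. That the comparison used this variant is an inference from those values and is not stated in the text. Under that assumption, the statistic and its degrees of freedom can be sketched as follows; the function name is hypothetical.

```python
from statistics import mean, variance

def welch_t(group1, group2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two independent samples with possibly unequal variances."""
    v1 = variance(group1) / len(group1)   # squared standard error, group 1
    v2 = variance(group2) / len(group2)   # squared standard error, group 2
    t = (mean(group1) - mean(group2)) / (v1 + v2) ** 0.5
    df = (v1 + v2) ** 2 / (v1 ** 2 / (len(group1) - 1) + v2 ** 2 / (len(group2) - 1))
    return t, df

# Hypothetical per-participant beta(Guilt) estimates, not data from the study.
men = [0.8, 1.4, 1.1, 2.0, 0.6, 1.7]
women = [0.5, 0.9, 0.7, 1.0, 0.4, 0.8]
t_stat, dof = welch_t(men, women)
```

The degrees of freedom fall between the size of the smaller group minus one and the pooled value n1 + n2 − 2, which is why a non-integer value below 50 arises with 52 participants.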
For the model validation and selection, 10 possible models were compared based on the predictive negative log likelihoods by a cross-validation. This cross-validation approach for value-based decision-making allows us to avoid overfitting the data and to compare models with different numbers of parameters robustly; it has also been adopted in many recent studies (Daw, 2011; Smith et al., 2014; Linderman and Gershman, 2017; Park et al., 2019; Fig. 2B; see also Model validation and comparison in Materials and Methods). More specifically, we introduced a bootstrap sampling (500 iterations) and compared the model predictions to the held-out data across all folds based on the negative log-likelihood of the estimated model for each participant. We then selected the model with the minimum negative log-likelihood and found that the best-fit model contained three predictors: Reward, Guilt, and Inequity. In addition, we compared the BIC values and found that the same model was again the best, with a smaller mean BIC than the second-best model, that of Fehr and Schmidt (1999; 39.44 vs 40.61). Moreover, for 40 of the 52 participants (76.9%), this model also had the smallest BIC at the individual level (Fig. 2C).
Finally, we examined whether the guilt aversion parameter β(Guilt) reflects the guilt experience of the participants in the current experiment. Note that β(Guilt) captures a decision strategy to avoid future guilt but does not directly measure guilt. To address this issue, we analyzed the relationship between β(Guilt) and the score of the postexperiment questionnaire (see Materials and Methods for the questionnaire). Question a asked whether participants understood the intentions behind player A's action, question b asked whether participants understood that they reduced player A's payoff if they chose Defect, and question c asked whether participants felt guilty when they reduced player A's expected payoff. We found significant or marginal positive correlation between β(Guilt) and scores for the questions (Fig. 2D; question a, p = 0.0557; question b, p = 0.0491; question c, p = 0.0451). These results indicate that the guilt aversion parameter reflects the guilt experience in the current study.
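The logic of comparing models on held-out negative log-likelihood and on BIC can be illustrated with a small stand-alone sketch. It is not the authors' procedure: it swaps the bootstrap sampling described above for plain k-fold cross-validation, fits the logistic choice model by simple gradient ascent, and uses simulated trials. All function names, the simulated data, and the sign convention (positive weighted sum favors Cooperate) are assumptions.

```python
import math
import random

def p_cooperate(betas, diffs):
    """Logistic choice probability from a linearly weighted sum of predictors."""
    u = sum(b * d for b, d in zip(betas, diffs))
    u = max(-30.0, min(30.0, u))          # guard against overflow in exp
    return 1.0 / (1.0 + math.exp(-u))

def neg_log_lik(betas, trials):
    """trials: list of (diffs, choice), choice 1 = Cooperate, 0 = Defect."""
    return -sum(math.log(p_cooperate(betas, d) if ch == 1
                         else 1.0 - p_cooperate(betas, d)) for d, ch in trials)

def fit(trials, n_params, lr=0.5, steps=500):
    """Maximum-likelihood weights by batch gradient ascent."""
    betas = [0.0] * n_params
    for _ in range(steps):
        grad = [0.0] * n_params
        for diffs, choice in trials:
            err = choice - p_cooperate(betas, diffs)
            for j, d in enumerate(diffs):
                grad[j] += err * d
        betas = [b + lr * g / len(trials) for b, g in zip(betas, grad)]
    return betas

def heldout_nll(trials, columns, k=5):
    """Summed held-out negative log-likelihood over k folds, using only
    the predictor columns listed in `columns`."""
    sub = [([d[c] for c in columns], ch) for d, ch in trials]
    total = 0.0
    for f in range(k):
        train = [t for i, t in enumerate(sub) if i % k != f]
        total += neg_log_lik(fit(train, len(columns)), sub[f::k])
    return total

def bic(nll, n_params, n_obs):
    """Bayesian information criterion from a fitted model's NLL."""
    return n_params * math.log(n_obs) + 2.0 * nll

# Simulated trials: choices depend on predictor 0 only (hypothetical data).
rng = random.Random(0)
trials = []
for _ in range(60):
    d = (rng.gauss(0, 1), rng.gauss(0, 1))
    trials.append((d, 1 if rng.random() < p_cooperate([2.0, 0.0], d) else 0))
scores = {cols: heldout_nll(trials, list(cols)) for cols in [(0,), (1,), (0, 1)]}
best = min(scores, key=scores.get)
```

Because the score is computed on trials the model was not fitted to, an extra predictor that only captures noise tends to raise rather than lower it, which is what allows models with different numbers of parameters to be compared directly.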

Imaging results of guilt, inequity, and utility
For the imaging, we first examined the brain regions activated commonly in both genders. Similar to the logistic regression, a GLM analysis was conducted (SPM 12) to identify brain regions whose activity was correlated with the difference in guilt and inequity between the two choice options (hereafter, we call these differences guilt and inequity, respectively, for simplicity; see GLM1 in Materials and Methods). We included guilt and inequity as additional regressors attached to the task presentation event. We found a significant correlation between the amount of guilt and activity in the bilateral DLPFC and DMPFC [right DLPFC, p < 0.001; left DLPFC, p < 0.001; DMPFC, p < 0.001; family-wise error (FWE) corrected; Fig. 3A; Extended Data Fig. 3-1]. By contrast, the amount of inequity was correlated with activity in the bilateral ventral striatum (right ventral striatum, p = 0.035; left ventral striatum, p = 0.042; small volume FWE corrected; Fig. 3B; Extended Data Fig. 3-2). Additionally, we confirmed that the same results were obtained even when the two parameters (Guilt and Inequity) of GLM1 were analyzed as separate GLMs.
To identify the neural substrates that integrate different types of values, such as guilt and inequity, we searched for the neural correlates of the value difference between the choice options (larger utility − smaller utility; see GLM2 in Materials and Methods). We found a significant correlation between the value difference and activity in the VMPFC (p = 0.040; small volume FWE corrected; Fig. 3C; Extended Data Fig. 3-3), which is consistent with previous neuroimaging studies of value-based decision-making (Nicolle et al., 2012).
We next performed a PPI analysis (Friston et al., 1997) to confirm that the value signals in the VMPFC reflect the value components of both Guilt and Inequity. In our behavioral hypothesis, because participants make decisions depending on both the guilt and inequity components, the VMPFC should link with both the guilt-correlated area (DLPFC and DMPFC shown in Fig. 3A) and the inequity-correlated area (striatum shown in Fig. 3B). To validate this hypothesis, we estimated a PPI in which signals in the VMPFC were modulated by the Guilt or Inequity values separately for each condition (see PPI1 in Materials and Methods). More specifically, we used the VMPFC (shown in Fig. 3C) as the seed region to determine which other brain regions correlated with VMPFC × Guilt and VMPFC × Inequity, respectively. For the PPI of VMPFC × Guilt, this analysis revealed positive coupling between the VMPFC and the DMPFC (p < 0.001, uncorrected; Fig. 3C; Extended Data Fig. 3-4). Notably, the VMPFC × Guilt contrast overlapped the guilt-correlated region in Figure 3A (Fig. 3C). On the other hand, for VMPFC × Inequity, we found positive coupling between the VMPFC and the striatum (p < 0.001, uncorrected; Fig. 3C; Extended Data Fig. 3-5). The VMPFC × Inequity contrast overlaps the inequity-correlated region in Figure 3B at the relaxed threshold (Fig. 3C; VMPFC × Inequity, uncorrected p < 0.05).
These results suggest that the guilt difference and inequity difference between the two options computed in the DMPFC and striatum contribute to the value difference in the VMPFC for both men and women.
Imaging results of gender differences for guilt
Next, we explored the different neural substrates for guilt aversion between men and women (see GLM1.1 in Materials and Methods). Men showed higher correlation with guilt in the VMPFC (p = 0.029; small volume FWE corrected; Fig. 4A; Extended Data Fig. 4-1) compared with women, whereas there was no significant brain activity in the opposite contrast even at a moderate threshold (uncorrected p < 0.005). Figure 4A illustrates a box plot of the contrast estimate from the VMPFC, confirming that men showed increased VMPFC activity (t(49.9) = 3.68, p < 0.001) when responding to guilt. Furthermore, this activation of the VMPFC overlapped with the activity correlated with guilt (Fig. 4B), indicating that the VMPFC is sensitive to guilt aversion overall and more so in men than in women. Importantly, the VMPFC activity correlating with the value difference was spatially close but did not overlap with the VMPFC activity correlating with the gender difference (Fig. 4C). This observation suggests that the two VMPFC areas are involved in related but distinct computations.
Having revealed gender differences in brain activity for guilt, we next performed a PPI analysis to examine whether different neural links work for guilt aversion in men and women. More specifically, we used the VMPFC (shown in Fig. 4A) as a seed region to search for other cortical regions that correlated with VMPFC × Guilt and then conducted two-sample t tests to compare this contrast between men and women (see PPI2 in Materials and Methods). In other words, the aim of this analysis was to find differences between men and women in brain regions whose activity correlates more strongly with VMPFC activity as guilt increases. This analysis revealed that connectivity between the VMPFC and the right DLPFC is significantly stronger in men than in women (p < 0.001, uncorrected; Fig. 4A; Extended Data Fig. 4-2). The active right DLPFC area overlapped with the common activity correlated with guilt for men and women (Fig. 4D), suggesting that men recruit DLPFC-VMPFC connectivity more for guilt aversion, although the DLPFC works with the DMPFC to compute guilt in both genders.
The results so far suggest the possibility that the relationship of gender and guilt aversion is mediated by DLPFC-VMPFC connectivity. We therefore performed a mediation analysis to examine this hypothesis (Fig. 4E; see Materials and Methods). Figure 4E shows the results of this analysis and suggests that DLPFC-VMPFC connectivity is a complete mediator of the interaction between gender and guilt-aversion behavior.
In summary, according to our PPI and mediation analyses, the DMPFC works with the DLPFC to compute guilt for both genders, and the VMPFC encodes not only the value difference in collaboration with the DMPFC in both genders but also the amount of guilt (difference) in collaboration with the DLPFC predominantly in men (Fig. 4F).

Imaging results of gender difference for inequity
We also searched for gender-related neural substrates for inequity aversion (see Materials and Methods, GLM1.2). We found that the ventral striatum was significantly more active in women than in men (p = 0.008; small volume FWE corrected; Fig. 5A; Extended Data Fig. 5-1), but there was no significant brain activity in the opposite contrast even at a moderate threshold (uncorrected p < 0.005). The box plot of the contrast estimates in the ventral striatum (Fig. 5A) demonstrates that activity in this region was correlated with the increased inequity in women (t(50.0) = 4.26, p < 0.001). When we computed PPI for functional connectivity between the ventral striatum (Fig. 5A as the seed region) and other brain areas in correlation with ventral striatum × inequity (see Materials and Methods, PPI2), no differential link was identified between men and women (at uncorrected p < 0.001), indicating the important role of the ventral striatum in inequity aversion. Indeed, we performed a mediation analysis for our hypothesis that the relationship of gender and inequity-aversion behavior is mediated by the ventral striatum (Fig. 5B; see also Materials and Methods, Mediation analysis) and found that the mediation effect of the striatum is significant (a × b, p < 0.001).

Online study
The behavioral data of our fMRI study (n = 52) showed that men display greater guilt aversion than women. However, this analysis provided only weak evidence because it was based on a relatively small dataset. In addition, our fMRI results did not specify the cognitive processes underlying the gender differences in guilt aversion, although the DLPFC-VMPFC connectivity result suggested a possibility that social norms play a key role, as discussed below. To clarify these issues and make the results more robust, we conducted a large-scale online behavioral study that also considered Big Five Inventory scores (Costa and McCrae, 1992) and socioeconomic status.
The differential use of prefrontal networks during guilt aversion may reflect different cognitive strategies used by men and women. Guilt aversion requires the ability to assess another individual's expectations and directly relates to his or her disappointment (i.e., empathy or theory of mind; Hoffman, 1982). On the other hand, guilt aversion is also a normative behavior elicited by experience (Haidt, 2003) and therefore may be executed by self-discipline without requiring empathy or inference about another's mind (i.e., rule-based decisions or systemizing). Thus, we can think of two potential cognitive underpinnings of guilt-based prosocial behavior: empathy with the disappointment of others and rule-based decisions by self-discipline. Related to this, previous studies have reported that the link between the DMPFC and VMPFC and the one between the DLPFC and VMPFC are involved in the theory of mind (De Martino et al., 2013) and in social norms (Baumgartner et al., 2011; Pornpattananangkul et al., 2018; Hackel et al., 2020) and self-control (Hare et al., 2009; Steinbeis et al., 2016), respectively. However, it is also important to be careful of reverse inference.
Because evidence connecting these prefrontal networks and gender differences in guilt aversion remains elusive, we further investigated this issue using the Big Five Inventory (Costa and McCrae, 1992), which defines five fundamental dimensions of personality (i.e., neuroticism, extraversion, openness, agreeableness, and conscientiousness). Because agreeableness is characterized by the understanding of others' emotions, intentions and mental states, and conscientiousness is characterized by rule-based regulation and self-discipline (DeYoung et al., 2010), we hypothesized that guilt aversion correlates with agreeableness and conscientiousness and that these traits may also explain gender differences.

Behavioral results of online study
We first conducted a model selection using the same cross-validation analysis as in the fMRI study and found that, as in the fMRI study, the model containing three predictors (Reward, Guilt, and Inequity) was selected as the best model (Fig. 2B; see also Materials and Methods). The same result was also found by the BIC analysis (Fig. 2C). We then performed the logistic regression comprising reward, guilt, and inequity and found that the β values of Reward, Guilt, and Inequity were positive and significant (p < 0.001; Table 2), indicating that they all played a critical role in the online experiment.
To identify the relationship between guilt aversion [β(Guilt)] and gender, we first performed a GLM analysis based on the explanatory variables including the gender term (Sex; men = 1), Big Five and socioeconomic status scores [target variable: β(Guilt)] for all participants. The coefficient of Sex was positive and significant (p < 0.001; for other significant coefficients, Agreeableness and Income, p < 0.001), demonstrating that men displayed greater guilt aversion than women and validating the behavioral result of the fMRI study with a much larger dataset.
Next, to identify the cognitive mechanisms specific to either gender, we performed the second GLM analysis that included interaction terms between the gender variable Sex and Big Five and socioeconomic status scores (for more details, see Materials and Methods). We found that the coefficients of Agreeableness and Sex × Conscientiousness were positive and significant (Agreeableness, p = 0.00802; Sex × Conscientiousness, p = 0.00712; see Table 3). These findings support our hypothesis that for guilt aversion, both men and women use the empathic strategy, while men also recruit the rule-based strategy (i.e., social norms).
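The step from these two coefficients to the gender-specific interpretation can be made explicit. Writing c_A and c_C for the main-effect coefficients of Agreeableness and Conscientiousness and d_A and d_C for their interactions with Sex (symbols introduced here for illustration, with Sex = 1 for men as in the text), the implied slopes are:

```latex
\frac{\partial\,\beta(\mathrm{Guilt})}{\partial\,\mathrm{Agreeableness}} = c_A + d_A\,\mathrm{Sex},
\qquad
\frac{\partial\,\beta(\mathrm{Guilt})}{\partial\,\mathrm{Conscientiousness}} = c_C + d_C\,\mathrm{Sex}
```

For women (Sex = 0) the slopes are c_A and c_C, and for men (Sex = 1) they are c_A + d_A and c_C + d_C. A significant positive c_A with no significant d_A therefore corresponds to an agreeableness effect shared by both genders, whereas a significant positive d_C corresponds to a conscientiousness effect that is stronger in men.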

Discussion
In this study, in correspondence with stronger guilt aversion in men than women, we demonstrated that men recruit DLPFC-VMPFC connectivity more in the processing of guilt than women do. We also found that the DMPFC is involved in the processing of guilt and the value difference between the choice options for both men and women. The analysis of the online behavioral data of 4723 participants not only replicated the gender difference in guilt aversion, but also suggested that the stronger guilt aversion in men than women is attributable to greater use of rule-based (social norm-based) strategies, while both genders commonly use empathetic consideration. Previous behavioral economics studies have closely examined guilt aversion in social interactions (Charness and Dufwenberg, 2006; Khalmetski, 2016; Bellemare et al., 2017, 2018), but to our knowledge, this is the first study reporting evidence of gender differences in guilt aversion. We also replicated a previously reported result (Soutschek et al., 2017) that women show greater activity of the ventral striatum than men for stronger inequity aversion. For inequity-based prosocial behaviors, previous behavioral studies have reported that men choose efficient allocations, while women are more inequality-averse (Croson and Gneezy, 2009; Kamas and Preston, 2015). In the ultimatum game, women are significantly more likely to propose an equal split than men (Güth et al., 2007) and more likely to reject lower offers than men (Solnick, 2001). Furthermore, in the dictator game and social value orientation tasks, women are more inequality-averse in their dictator-giving (Bolton and Katok, 1995; Eckel and Grossman, 1998; Andreoni and Vesterlund, 2001; Dickinson and Tiefenthaler, 2002; Grosch and Rau, 2017). With regard to brain function, previous studies have reported a key role of the ventral striatum in resource allocation and inequity aversion.
Not only is ventral striatum activity positively correlated with the ratio of the payoff (i.e., the self's payoff vs the other's payoff; Fliessbach et al., 2007), it is also activated when inequity between the self and the other is reduced (Tricomi et al., 2010) and when making a decision to punish someone for acting unfairly (Crockett et al., 2013). A recent study suggested that activation patterns of the ventral striatum are gender-specific, being more sensitive to sharing money with others in women (Soutschek et al., 2017). The present study is consistent with these previous studies in the sense that women show stronger inequity aversion than men, with the ventral striatum playing a critical role.
At the neural level, previous studies have reported that DLPFC and DMPFC activity varies with guilt (Chang et al., 2011; Nihonsugi et al., 2015; van Baar et al., 2019). With regard to gender differences, the current study showed that the VMPFC plays a critical role in computing guilt in men. The VMPFC has been implicated in social cognition (Blakemore, 2008). For instance, the VMPFC was implicated in affective regulation and depression (Ressler and Mayberg, 2007), the evaluation of moral dilemmas (Crockett et al., 2017), and social value decision-making (Hare et al., 2010; Baumgartner et al., 2011). In line with these studies, some studies showed that men with right VMPFC lesions have deficits in social emotion and decision-making compared with men with left VMPFC lesions, but no such difference was seen in women (Tranel et al., 2005; Sutterer et al., 2015). In addition, men with right VMPFC lesions tended to show a significant elevation in paranoia and introversion according to the Minnesota Multiphasic Personality Inventory-2 Scale, a widely used measure of personality and psychopathology (Tranel et al., 2005). These results suggest that the right VMPFC plays an important role in social decision-making in men, consistent with the present study reporting that men use the right VMPFC (coordinates 10, 42, −16) more than women to implement guilt aversion.
A previous study showed that connectivity between the VMPFC and DLPFC in men is associated with normative decisions in the ultimatum game (Baumgartner et al., 2011). The study recruited male subjects (n = 32) and demonstrated that repetitive transcranial magnetic stimulation applied to the right DLPFC of responders in the ultimatum game subsequently reduced their rejection rate (i.e., normative decision) and also diminished activity in the DLPFC and VMPFC. This result is consistent with our view that connectivity between the VMPFC and DLPFC plays a key role in guilt-based prosocial behavior in men.
For the cognitive mechanisms underlying gender differences in guilt aversion, our online study showed that guilt aversion in men correlates with conscientiousness. The Empathizing-Systemizing theory is widely known as a measure of individual differences in cognition (Baron-Cohen et al., 2003). Empathizing is the drive to identify another's mental state and to respond with an appropriate emotion and has a positive correlation with agreeableness (Nettle, 2007; Wakabayashi and Kawashima, 2015). On the other hand, systemizing is defined as the drive to analyze, understand, predict, control, and construct rule-based systems (e.g., map-reading, physics, and mathematics) and has a positive correlation with conscientiousness, which has a desire for order as one of its components (Nettle, 2007; Wakabayashi and Kawashima, 2015). Interestingly, several previous studies showed that men are more interested in systemizing than women (Baron-Cohen, 2004; Greenberg et al., 2018). These behavioral backgrounds are consistent with our functional connectivity result of the DLPFC-VMPFC, as this link has been associated with social norms (Baumgartner et al., 2011; Pornpattananangkul et al., 2018; Hackel et al., 2020). It may also be worth noting that our results suggest that guilt aversion contains both empathizing (empathy or theory of mind) and systemizing (social norms) components. By conducting a large-scale online behavioral study, we strengthened the neuroscientific hypothesis that the DLPFC-VMPFC connectivity predominantly seen in men contributes to their stronger guilt aversion through the influence of social norms. Because the recruitment of an equally large sample for fMRI experiments is very difficult, we believe that integrating fMRI and large-scale online experiments provides a powerful tool to obtain broader and more reliable insights into human cognition.
There is the possibility that a small τ may elicit an emotion other than guilt, such as distrust, because we did not directly measure emotional responses to belief (τ_A). Distrustful behavior can be perceived as a hostile act and reduce cooperation (Fehr and Rockenbach, 2003). Related to this, a previous behavioral study (Balafoutas and Fornwagner, 2017) showed that there is an inverted-U shape relationship between belief and guilt aversion using a simple dictator game. This relationship suggests that there is a threshold beyond which guilt aversion no longer applies and higher perceived expectations lead to less kind behavior on the part of the decision makers. However, this phenomenon may only occur in the dictator game, because the dictator is less likely to feel guilt because of the lack of a rational reason to live up to the recipient's expectations. In any case, the fact that some previous research findings did not show a linear relationship between belief (τ) and cooperation is likely to reflect psychological differences in response to the size of belief. The present study did not allow us to address these issues, because it only considered reasonably high belief (τ_A). Future research should assess emotional responses to beliefs more precisely.
Finally, our findings do not preclude the possibility that social environments, rather than biological factors, largely contribute to the gender differences in guilt aversion. At the same time, our behavioral data (n = 4723) suggested that gender differences in guilt aversion are independent of age (see Table 3), indicating that gender differences are only weakly dependent on contemporaneous social environmental factors and more affected by long-lasting determinants such as social systems and biological factors. Therefore, further investigation is necessary to address what causes the gender differences in guilt aversion. For instance, we need to examine the behavioral and neural gender differences in guilt aversion in different cultures (e.g., South-East Asian and European countries). Such studies would provide more biological and societal insights into our understanding of the diversity of human prosocial behaviors.