Introduction

In late 2019 a novel coronavirus, SARS-CoV-2, emerged in China. In humans, this virus can lead to the respiratory disease COVID-19, which can be fatal1,2. Since then, SARS-CoV-2 has spread around the world, causing widespread mortality, and with major impacts on societies and economies. While the virus and its resulting disease represent a major humanitarian disaster, they also represent a potentially existential risk to our closest living relatives, the nonhuman primates. Transmission incidences of bacteria and viruses—including another coronavirus (H-CoV-OC43)—from humans to wild populations of nonhuman primates have previously been linked to outbreaks of Ebola, yellow fever, and fatal respiratory diseases, leading in some cases to mass mortality3,4,5,6,7,8,9. Such past events raise considerable concerns among the global conservation community with respect to the impact of the current pandemic10.

Infection studies of rhesus monkeys, long-tailed macaques, and vervets as biomedical models have made it clear that at least some nonhuman primate species are permissive to SARS-CoV-2 infection and develop symptoms in response to infection that resemble those of humans following the development of COVID-19, including similar age-related effects11,12,13,14,15,16. Recognizing the potential danger of COVID-19 to nonhuman primates, the International Union for the Conservation of Nature (IUCN), together with the Great Apes section of the Primate Specialist Group, released a joint statement on precautions that researchers and caretakers should take when interacting with great apes17. However, the risk for many primate taxa remains unknown. Here we begin to assess the potential likelihood that our closest living relatives are susceptible to SARS-CoV-2 infection.

While the biology underlying susceptibility to SARS-CoV-2 infection remains to be fully elucidated, the viral target is well established. The SARS-CoV-2 virus binds to the cellular receptor protein angiotensin‐converting enzyme‐2 (ACE2), which is expressed on the extracellular surface of endothelial cells of diverse bodily tissues, including the lungs, kidneys, small intestine, and renal tubes18. ACE2 is a carboxypeptidase whose activities include regulation of blood pressure and inflammatory response through its role in cleaving the vasoconstrictor angiotensin II to produce angiotensin 1–7 and triggering varied downstream responses19,20,21,22. ACE2 is made up of a signal sequence at the N terminus (residues 1–17), a transmembrane sequence at the C terminus (residues 741–762), and an extracellular region, which contains a zinc metallopeptidase domain (residues 19–611) and a collectrin homolog (residues 612–740)23,24.

Characterizations of the infection dynamics of SARS-CoV-2 have demonstrated that the binding affinity for the human ACE2 receptor is high, which is a key factor in determining the susceptibility and transmission dynamics. When compared to SARS-CoV, which caused a serious global outbreak of the disease in 2002–200325,26, the binding affinity between SARS-CoV-2 and ACE2 is estimated to be between fourfold27,28,29,30 and 10- to 20-fold greater31. Recent reports describing structural characterization of ACE2 in complex with the SARS-CoV-2 spike protein receptor-binding domain (RBD)27,28,29,30 allow identification of the key binding residues that enable the host–pathogen protein–protein recognition. Following the initial binding of the virus to the ACE2 receptor, humans experience a great deal of variation in response to infection, with some individuals experiencing relatively mild symptoms, while others experience major breathing problems and organ failures, which can lead to death. Some of this response is known to be linked to variation in how the immune system responds to infection, with some individuals experiencing a hyperinflammatory ‘cytokine storm’, which in turn aggravates respiratory failures and increases mortality risk32,33. There may also be some variation among humans in initial susceptibility to infection, such that approaches examining variation in ACE2 tissue expression and gene sequences can offer insight into variation in human susceptibility to COVID-1934,35,36,37. Similarly, we can use such an approach to compare sequence variation across species, and hence try to predict the likely interspecific variation in susceptibility to initial infection. Previous analysis of comparative variation at these sites enabled estimates of the affinity of the ACE2 receptor for SARS-CoV in nonhuman species (bats)38.

Here, we undertake such an analysis for SARS-CoV-2 across the primate radiation. Our aim is to investigate the likelihood of initial susceptibility to infection for different major radiations and species while recognizing that downstream processes such as immune responses are likely to determine the extent to which species and individuals develop symptoms and pathologies in response to infection. We compiled ACE2 gene sequence data from 29 primate species for which genomes are publicly available, covering primate taxonomic breadth. For comparison, we assessed 4 species of other mammals that have been tested directly for SARS-CoV-2 susceptibility in laboratory infection studies39. We also included in our analysis the amino acid sequence variation at these sites for horseshoe bats, thought to be the original vector of the virus, and pangolins, a potential intermediate host, where viral recombination may have led to the novel viral form SARS-CoV-240. We assessed the variation at amino acid residues identified as critical for ACE2 recognition by the SARS-CoV-2 RBD and undertook an analysis of positive selection and protein modeling to gauge the potential for adaptive differences and the likely effects of protein variation. Our aim was to develop predictions about the susceptibility of our closest living relatives to SARS-CoV-2 as a resource for stakeholders, including researchers, caretakers, practitioners, conservationists, and governmental and non-governmental agencies.

Results

Variation in ACE2 sequences

The ACE2 gene (2418 bp) and translated protein (805 amino acids) sequences are strongly conserved across primates. The average pairwise identity across 29 primate species is 93.6% for the ACE2 nucleotide sequence and 90.8% for the protein sequence, with a pairwise similarity (BLOSUM62 ≥ 1) of 95.3% (Supplementary Data 1–3). Out of 2418 bp, 1631 bp (67.5%) are identical, while 401 bp (16.58%) are phylogenetically-informative sites for primates, and gene trees we generated (Supplementary Fig. S1a, b) closely recapitulate the currently accepted phylogeny of primates (Fig. 1). In particular, the twelve sites in the ACE2 protein that are critical for binding of the SARS-CoV-2 virus are invariant across the Catarrhini, which includes great apes, gibbons, and monkeys of Africa and Asia (Fig. 1). Furthermore, catarrhines do not vary at any of the 21 sites identified by alanine scanning (Supplementary Table S1 and Supplementary Fig. S2). The other major radiation of monkeys, those found in the Americas (Platyrrhini), have ACE2 sequences that are less similar to humans across the length of the protein (91.68–92.55% identical to H. sapiens, Supplementary Data 2) but conserved within their clade (average pairwise identity 97.2%, Supplementary Data 2). They share nine of twelve critical amino acid residues with catarrhine primates; the three sites that vary from catarrhines, H41, E42, and T82, are conserved within the platyrrhines. Strepsirrhine primates and tarsiers were more variable in the binding sites and less similar to the human protein across the length of the sequence (81.86–86.93% pairwise identity, Supplementary Data 2). Like platyrrhines, the tarsier (Carlito syrichta), mouse lemur (Microcebus murinus), and galago (Otolemur garnettii) have an H41 residue, while the sifaka (Propithecus coquereli), aye-aye (Daubentonia madagascariensis), and the blue-eyed black lemur (Eulemur flavifrons) have the same allele as humans and other catarrhines, Y41.
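As an illustration of how pairwise identity figures of this kind can be derived from an alignment, the following is a minimal sketch rather than the procedure used to produce the values above. It assumes sequences are already aligned to equal length and that columns containing a gap in either sequence are skipped (one common convention; other tools count gaps differently). The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def pairwise_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable (ungapped) columns")
    return 100.0 * matches / compared


def mean_pairwise_identity(alignment: dict) -> float:
    """Average percent identity over all unordered pairs of sequences."""
    pairs = list(combinations(alignment.values(), 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Toy alignment (hypothetical sequences, not ACE2 data).
toy_alignment = {
    "species_1": "ATGTCAAGCT",
    "species_2": "ATGTCAAGCT",
    "species_3": "ATGTCTAGCA",
}
toy_mean = mean_pairwise_identity(toy_alignment)
```

In the toy alignment, species_3 differs from the other two at two of ten columns, so each of those comparisons yields 80% identity while the identical pair yields 100%.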

Fig. 1: ACE2 protein sequence alignment and evolutionary relationships of study species.

Branch lengths represent the evolutionary distance (time, in millions of years) estimated from TimeTree63. We outline amino acid residues at critical binding sites for the SARS-CoV-2 spike receptor-binding domain. Solid outlines highlight sites predicted to have the most substantial impact on viral binding affinity. Notably, protein sequences of catarrhine primates are highly conserved, including uniformity among amino acids at all binding sites. Primate species that can be successfully infected with SARS-CoV-2 are indicated in red. Predicted susceptibility to COVID-19 for other primates is additionally coded by terminal branch colors. We use the nomenclature Cebus capucinus to be consistent with the species name used in the genome annotation but note the recent adoption of Cebus imitator for this species. Silhouettes are from PhyloPic.org and available under the Public Domain Dedication 1.0 license, with the exception of Cebus (Sarah Werning; Creative Commons Attribution 3.0 Unported).

In non-primate mammals, a higher number of amino acid substitutions are evident (77.37–85.22% pairwise identity to H. sapiens, Supplementary Data 2), including at critical binding sites. All non-primate species possess a different residue from primates at site 24. Bats are exceptionally variable within the binding sites, with the genus Rhinolophus alone encompassing all of the variation seen in the rest of the non-primate mammals. Where primates have glutamine (Q24), bats have glutamate (E24), lysine (K24), leucine (L24), or arginine (R24) (Fig. 1). All fasta alignments of ACE2 gene and protein sequences are available in Supplementary Data 4–7; a full-length protein alignment is also shown in Supplementary Fig. S2, and distance matrices are provided in Supplementary Data 1–3.

Analysis of species-specific residues on ACE2–RBD interactions

The ACE2 receptors of all catarrhines have identical residues to humans at the RBD/ACE2 binding interface across all 12 critical sites, and are predicted to have a similar binding affinity for SARS-CoV-2. Platyrrhines diverge from catarrhines at three of the twelve critical amino acid residues. Compared to catarrhine ACE2, the platyrrhines’ ACE2 is predicted to bind SARS-CoV-2 RBD with a roughly 400-fold reduced affinity (ΔΔGbind = 3.5 kcal/mol) (Table 1). In particular, the change at site 41 from Y to H found in monkeys in the Americas has the largest impact of any residue change examined (Table 2), which alone is predicted to lead to a 25-fold decrease in the binding affinity to SARS-CoV-2 (Fig. 2). This single mutation combined with additional substitutions, especially Q42E, found in platyrrhines is predicted to substantially reduce the likelihood of successful viral binding (Table 2). Of the other primates modeled, two of the five strepsirrhines and the tarsier also have the H41 residue and furthermore have additional protein sequence differences leading to further decreases in predicted binding affinity. The predicted binding affinity of tarsier ACE2 is the most dissimilar to humans and this primate might be the least susceptible of the species we examine. In contrast, Coquerel’s sifaka (Propithecus coquereli), the aye-aye (Daubentonia madagascariensis), and the blue-eyed black lemur (Eulemur flavifrons) share the same residue as humans and other catarrhines at site 41 and have projected affinities that are near to humans (Table 2). Other mammals included in our study (ferrets, cats, dogs, pigs, pangolin, and two of the seven bat species, R. pusillus and R. macrotis) show the same residue as humans (Y) at site 41, with accompanying strong affinities for SARS-CoV-2. The remaining five sister species of bats possess H41 and lower binding affinities (Table 2).
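To make the relationship between a predicted ΔΔGbind and a fold change in affinity concrete, the sketch below applies the standard thermodynamic relation fold change = exp(ΔΔG / RT). This is an illustration under stated assumptions, not necessarily the conversion used for the values above: the temperature (298.15 K) and the function names are assumptions introduced here.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def fold_change_from_ddg(ddg_kcal_per_mol: float,
                         temperature_k: float = 298.15) -> float:
    """Fold change in dissociation constant implied by a ddG of binding.

    Positive ddG (weaker binding) gives a fold change greater than 1.
    Assumption: temperature defaults to 298.15 K.
    """
    return math.exp(ddg_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))


def ddg_from_fold_change(fold_change: float,
                         temperature_k: float = 298.15) -> float:
    """Inverse relation: ddG (kcal/mol) implied by a fold change in affinity."""
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(fold_change)


# Value reported in the text for platyrrhine ACE2 (ddG_bind = 3.5 kcal/mol).
platyrrhine_fold = fold_change_from_ddg(3.5)
# ddG implied by the 25-fold decrease reported for the Y41H change alone.
y41h_ddg = ddg_from_fold_change(25.0)
```

Under these assumptions, a ΔΔGbind of 3.5 kcal/mol corresponds to a reduction of a few hundred fold, the same order of magnitude as the roughly 400-fold figure quoted above, and a 25-fold decrease corresponds to a ΔΔGbind of just under 2 kcal/mol.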

Table 1 Results of computational protein–protein interaction experiments predicting the impact of amino acid changes, relative to human ACE2 residues, across the full complement of critical binding sites with SARS-CoV-2 receptor-binding domain.
Table 2 Results of computational protein–protein interaction experiments predicting the impact of single residue replacements, relative to human ACE2 residues, at critical binding sites with SARS-CoV-2 receptor-binding domain.
Fig. 2: Model of human ACE2 in complex with SARS-CoV-2 RBD.

Key ACE2 interfacial residues are highlighted (a). Interactions at critical binding sites 41 and 42 are shown for the residues found in all catarrhines (apes and monkeys in Africa and Asia) (b), and for the residues found in all platyrrhines (monkeys in the Americas) (c). The dashed lines indicate predicted hydrogen bonding interactions. Y41 participates in extensive van der Waals and hydrogen bonding interactions with RBD; these interactions are abrogated with histidine. The Q42 side-chain amide serves as a hydrogen-bond acceptor and donor to contact RBD; a change to glutamic acid diminishes the hydrogen bonding interactions.

Adaptive evolution of ACE2 sequences

We find evidence that the selective pressures acting on ACE2 are not equivalent across the major clades in our analysis. The codeml clade model C provided a better fit than the null model (LRT = 26.726, p < 0.001; Table 3, Supplementary Table S3). Branch-site models indicate that the catarrhine primate clade (LRT = 14.546, p < 0.001) and bat clade (LRT = 42.649, p < 0.001) are both under positive selection, while platyrrhines (LRT = 0.633, p = 0.427) and strepsirrhines (LRT = 0.833, p = 0.361) are not. The six positively selected sites in the bat clade include the binding site 24 and two others adjacent to known binding sites (Table 3). In catarrhines, the three positively selected sites identified by BEB calculations are not near the binding sites for SARS-CoV-2 (residues 249, 653, and 658; Table 3).

Table 3 Results of codeml analyses of adaptive evolution across ACE2 gene sequences.

Discussion

Our results strongly suggest that catarrhines (all apes, and all monkeys of Africa and Asia) are likely to be susceptible to infection by SARS-CoV-2. There is high conservation in the protein sequence of the target receptor, ACE2, including uniformity at all identified and tested major binding sites. Indeed, even among the 21 residues identified in our full list of potential binding points, catarrhines are invariant (Supplementary Table S1 and Supplementary Fig. S2). Consistent with our results, infection studies show that rhesus monkeys (Macaca mulatta), long-tailed macaques (M. fascicularis), and vervets (Chlorocebus sabaeus) are permissive to infection by SARS-CoV-2, and go on to develop COVID-19-like symptoms11,12,14,15,16. Our results based on protein modeling offer potentially better news for monkeys in the Americas (platyrrhines). There are three differences in amino acid residues between platyrrhines and catarrhines, and two of these, H41Y and E42Q, show strong evidence of being impactful changes. These amino acid changes are modeled to reduce the binding affinity between SARS-CoV-2 and ACE2 by ca. 400-fold. Recent clinical analysis of viral shedding, viremia, and histopathology in catarrhine (macaque) versus platyrrhine (marmoset, Callithrix jacchus) responses to inoculation with SARS-CoV-2 shows much more severe presentation of disease symptoms in the former, strongly supporting our results16. Similar reduced susceptibility is predicted for tarsiers, and two of the five lemurs and lorisoids (strepsirrhines). What is concerning is that three of the analyzed lemurs spanning divergent lineages (Coquerel’s sifaka, the aye-aye, and the blue-eyed black lemur) are more similar to catarrhines at important binding sites, including possessing the high-risk residue variant at site 41, and as such are also predicted to be susceptible. Nonetheless, these are only predicted results based on amino acid residues and protein–protein interaction models. We urge extreme caution in using our analyses as the basis for relaxing policies regarding the protection of platyrrhines, tarsiers or any strepsirrhines. Experimental assessment of synthetic protein interactions can now occur in the laboratory, e.g.41, and confirmation of our model predictions should be sought before any firm conclusions are reached.

Emerging evidence in experimental mammalian models appears to support our results; dogs, ferrets, pigs, and cats have all shown some susceptibility to SARS-CoV-2 but have demonstrated variation in disease severity and presentation, including across studies39,42. Substitutions at binding sites might be at least partially protective against COVID-19 in these mammals. For example, the limited experimental evidence to date suggests that while cats, which have the same residue as humans at site 34, are not strongly symptomatic, they present lung lesions, whereas dogs, which have a substitution at this site, do not39. The amino acid residue at site 24 differs from primates in all other mammalian species examined. However, our models suggest that the variant residues may confer relatively minor reductions in binding affinity. Other sources of variation may affect ACE2 protein stability34. Our results are also consistent with previous reports that ACE2 genetic diversity is greater among bats than that observed among mammals susceptible to SARS-CoV-type viruses. This variation has been suggested to indicate that bat species may act as a reservoir of SARS-CoV viruses or their progenitors38. Intriguingly, all but two of the bat species we examined have the putatively protective variant, H41. Additionally, results of our codeml branch-site analysis support previous findings of ACE2 in bats being under positive selection, including sites within the binding domain of SARS-CoV and SARS-CoV-243, which may be evidence of host-virus coevolution. Sites showing evidence of positive selection within catarrhine ACE2 sequences were not in or near known CoV binding sites (Table 3 and Fig. 1). Two (residues 653, 658) fall within the cleavage site (residues 652–659) utilized by the sheddase ADAM17, known to interact with ACE244. However, neither of the residues under selection is among the amino acids targeted by ADAM1745, leaving the functional significance of evolution at these sites uncertain. Further clinical and laboratory study is needed to fully understand infection dynamics.

There are a number of important caveats to our study. Firstly, all of our predictions are based on interpretations of gene and resultant amino acid sequences, rather than based on direct assessment of individual responses to induced infection. Nonetheless, the overall pattern of our results is being borne out by infection studies on a few species that are used as biomedical models. So far, all catarrhine species tested by infection studies, including rhesus macaques, long-tailed macaques, and vervet monkeys12,16,46, have exhibited COVID-19-like symptoms in response to infection, including large lung and other organ lesions16 and cytokine storms12. In contrast, marmosets did not exhibit major symptoms in response to infection16. While these results support and validate our findings based on ACE2 sequence interpretation, the number of primate species that can and will be tested directly by infection studies will be restricted to just a handful. Our study enhances this picture, by allowing inferences to be made across the primate radiation, backed up by the published infection studies on a few target model species. Some of our results, such as the uniform conservation of ACE2 binding sites among catarrhines, backed up by the demonstrated high susceptibility of humans and other catarrhines to SARS-CoV-2, should give a good degree of confidence of high levels of risk. Given the identical residues of humans to other apes and monkeys in Asia and Africa at the target sites, it seems unlikely that the ACE2 receptor and the SARS-CoV-2 proteins would not readily bind. Our results for other taxa are dependent on modeling, hence should be treated more cautiously. This includes all interpretations of the susceptibility of platyrrhines and strepsirrhines, where the effects of residue differences on binding affinities have been estimated based on protein–protein interaction modeling. Another caveat is that we have modeled only interactions at binding sites, and not predictions based on full residue sequence variation. Residues that are not in direct contact may still affect binding allosterically. Other factors, including proteases necessary for viral entry, and other viral targets, may also impact disease susceptibility and responses34. More generally, if adhering to the precautionary principle, then our results highlighting higher risks to some species should be taken with greater gravity than our results that predict potential lower risks to others. Another limitation of our study is that we have looked at only 29 primate species, albeit with broad taxonomic scope. Analysis of additional species is important, especially among strepsirrhine species, where our coverage is relatively scant. In particular, the residue overlap at important binding sites in the sequences of Coquerel’s sifaka, the aye-aye, and blue-eyed black lemur with those of catarrhines suggests many lemurs may be highly vulnerable and we underscore the need to assess a wider diversity of lemur species. Furthermore, we examine only one individual per species, and intraspecific variation across populations should be considered; however, studies on intraspecific ACE2 variation within humans and vervet monkeys suggest ACE2 variants are low in frequency47,48,49. Finally, it is also important to remember that our study assesses only the potential for the initial binding of the virus to the target site. Downstream consequences of infection may differ drastically based on species-specific proteases, genomic variants, metabolism, and immune system responses50,51. In humans, the development of COVID-19 can lead to a pro-inflammatory cytokine storm of hyperinflammation, which may lead to some of the more severe impacts of infection32,52. Nonetheless, it is evident from the hundreds of thousands of deaths and global lockdown that humans are highly susceptible to SARS-CoV-2 infection, and our results suggest that all apes and monkeys in Africa and Asia are similarly susceptible.

Many endangered primate species are now only found in very small population sizes53. For example, there are believed to be only around 1000 mountain gorillas left in their entire range54. With such small populations, the introduction of a new highly infectious disease is of serious concern. Re-opening access to habituated great ape groups for tourism purposes, which may be critical to local economies55, may be fraught with issues. IUCN best practices recommend that tourists stay at least 7 meters away from great apes56, but in practice, almost all tourists get far closer than this; for example, the average distance that tourists get from mountain gorillas at the Bwindi Impenetrable National Park in Uganda is just 2.76 m57. A concerted effort may be required by all stakeholders to try to avoid the introduction of SARS-CoV-2 into wild primate populations10. Recent measures suggested by the IUCN for researchers and caretakers of great ape populations include: ensuring that all individuals wear clean clothing and disinfected footwear; providing hand-washing facilities; requiring that a surgical face mask be worn by anyone coming within 10 m of great apes; ensuring that individuals needing to cough or sneeze ideally leave the area, or at least cough/sneeze into the crook of their elbows; and imposing a 14-day quarantine for all people arriving into great ape areas who will come into frequent close proximity with them17. The IUCN’s ‘Best Practice Guidelines for Health Monitoring and Disease Control in Great Ape Populations’ should also be followed58.

Our results suggest that dozens of nonhuman primate species, including all of our closest relatives, are likely to be highly susceptible to SARS-CoV-2 infection, and vulnerable to its effects. Major actions may be needed to limit the exposure of many wild primate populations to humans. This is likely to require coordinated input from all stakeholders, including local communities, international and national governmental agencies, non-governmental conservation and development organizations, and academics and researchers. While the focus of many at this time is rightly on mitigating the humanitarian devastation of COVID-19, we also have a duty to ensure that our closest living relatives do not suffer from devastating infections and further population declines in response to yet another human-induced catastrophe.

Methods

Variation in ACE2 sequences

We compiled ACE2 gene sequences for 16 catarrhine primates: 4 species from all 3 genera of great ape (Gorilla, Pan, Pongo), 2 genera of gibbons (Hylobates, Nomascus), and 10 species of African and Asian monkeys in 8 genera (Cercocebus, Chlorocebus, Macaca, Mandrillus, Papio, Rhinopithecus, Piliocolobus, Theropithecus); 6 genera of platyrrhines (monkeys from the Americas: Alouatta, Aotus, Callithrix, Cebus, Saimiri, Sapajus); 1 species of tarsier (Carlito syrichta); and 5 genera of strepsirrhines (lemurs and lorisoids: Eulemur, Daubentonia, Microcebus, Propithecus, Otolemur) (Supplementary Table S2). We also included four species of mammals that have been tested clinically for susceptibility to SARS-CoV-2 infection39: the domestic cat (Felis catus), dog (Canis lupus familiaris), pig (Sus scrofa), and ferret (Mustela putorius furo). Finally, we included the pangolin (Manis javanica) and several bat species: horseshoe bats (Rhinolophus spp.), Hipposideros pratti, and Myotis daubentonii. Sequences were retrieved from NCBI, either from annotations of published genomes or from GenBank entries38. We manually checked annotations by performing tblastn searches of the human ACE2 protein sequence against each genome. We identified one misannotation for exon 15 in Microcebus murinus, which we manually corrected. The ACE2 nucleotide sequence for Alouatta palliata was obtained from an unpublished draft genome, via tblastn searches using the Cebus ACE2 protein sequence as a query and default search settings. Accession numbers for sequences retrieved from NCBI and GenBank are provided in Supplementary Table S2 and the Alouatta palliata sequence is available in Supplementary Data 4.

Coding sequences were translated using Geneious Version 9.1.8 and we aligned both nucleotide and amino acid sequences with MAFFT59. Amino acids were aligned with the BLOSUM62 scoring matrix, while the 200 PAM scoring matrix was used for nucleotides. A 1.53 gap open penalty and an offset value of 0.123 were used for both. We manually inspected and corrected any misalignments, and verified the absence of indels and premature stop codons.

To visualize patterns of gene conservation across taxa and identify the congruence of the ACE2 gene tree with currently accepted phylogenetic relationships among species, we reconstructed trees using both Bayesian (MrBayes 3.2.660) and Maximum Likelihood (RAxML 8.2.1161) methods with 200,000 MCMC cycles and 1000 bootstrap replicates, respectively (code available on GitHub62). Gene trees were compared to a current species phylogeny assembled using TimeTree63, which is also used to illustrate the evolutionary relationships between study species in Fig. 1. Phylogenetically-informative sites along the ACE2 sequence were identified with the pis function in the R package ips v. 0.0.1164,65.
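For readers unfamiliar with the notion of a phylogenetically informative site, the following minimal sketch counts such sites in an alignment. It assumes the usual parsimony definition (a column is informative when at least two character states each occur in at least two sequences) and ignores gaps and ambiguity codes; it is an illustration with toy nucleotide data, not a reimplementation of the pis function in the ips package, and the function name is hypothetical.

```python
from collections import Counter


def informative_sites(alignment, ignore="-N?"):
    """Return 0-based indices of parsimony-informative columns.

    Assumptions: nucleotide input; characters listed in `ignore`
    (gaps, ambiguity codes) are not counted as states.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    sites = []
    for col in range(length):
        counts = Counter(
            seq[col].upper() for seq in alignment
            if seq[col].upper() not in ignore
        )
        # Informative: two or more states, each seen in two or more sequences.
        shared_states = sum(1 for n in counts.values() if n >= 2)
        if shared_states >= 2:
            sites.append(col)
    return sites


# Toy alignment (hypothetical sequences, not ACE2 data).
toy = ["ATGCA", "ATGTA", "ACGTA", "ACGCG"]
toy_sites = informative_sites(toy)
```

In the toy alignment only the second and fourth columns qualify: the first and third are invariant, and in the last column the variant state occurs in a single sequence, so it carries no grouping information.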

Identification of critical binding residues and species-specific ACE2–RBD interactions

Critical ACE2 protein contact sites for the viral spike protein receptor-binding domain (RBD) have been identified using cryo-EM and X-ray crystallography structural analysis methods27,28,29,30. The ACE2–RBD complex is characteristic of protein–protein interactions (PPIs) that feature extended interfaces spanning a multitude of binding residues. Experimental and computational analyses of PPIs have shown that a handful of contact residues can dominate the binding energy landscape66. Alanine scanning mutagenesis provides an assessment of the contribution of each residue to complex formation67,68,69. Critical binding residues can be computationally identified by assessing the change in binding free energy of complex formation upon mutation of the particular residue to alanine, which is the smallest residue that may be incorporated without significantly impacting the protein backbone conformation70. Our computational modeling utilizes the human SARS RBD/ACE2 high-resolution structures, and we make the implicit assumption that the overall conformation of ACE2 is conserved among different species. This assumption, which is rooted in the high sequence similarity between ACE2 sequences, allows us to use the structure of the complex to predict the impact of mutations at the protein–protein interface.

We defined critical residues as those that upon mutation to alanine decrease the binding energy by a threshold value ΔΔGbind ≥ 1.0 kcal/mol. Nine of the 21 residues identified by alanine scanning as involved in the ACE2–RBD complex met this criterion (Supplementary Table S1). There was a large congruence in the sites identified with those highlighted by other methods. All eight sites implicated by cryo-EM27 were also detected by alanine scanning; five met the ≥1.0 kcal/mol threshold and three fell below it. To be cautious, in addition to the 9 critical ACE2 sites we identified through alanine scanning, we also examined residue variation at the 3 sites that fell below the 1.0 kcal/mol threshold but that were identified as important by structural analyses27,28,29,30, for a total of 12 critical sites. All computational alanine scanning mutagenesis analyses were performed using Rosetta software70. The alanine mutagenesis approach has been extensively evaluated and used to analyze PPIs and design their inhibitors, including by members of the present authorship71,72.
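The site-selection logic described above can be sketched as a simple filter followed by a union. The example below is a hedged illustration only: the site labels and ΔΔG values are hypothetical placeholders (not values from Supplementary Table S1), and the function name is introduced here for illustration.

```python
def select_critical_sites(ddg_by_site, structural_sites, threshold=1.0):
    """Combine alanine-scanning hits with structurally implicated sites.

    ddg_by_site: mapping of site label -> ddG_bind (kcal/mol) on mutation
                 to alanine.
    structural_sites: labels flagged as important by structural analyses.
    Returns (all_critical, above_threshold, rescued), each sorted.
    """
    above = {site for site, ddg in ddg_by_site.items() if ddg >= threshold}
    # Sites below the energy threshold that structural work still implicates.
    rescued = {
        site for site in structural_sites
        if site in ddg_by_site and site not in above
    }
    return sorted(above | rescued), sorted(above), sorted(rescued)


# Hypothetical placeholder data, for illustration only.
toy_ddg = {"site_a": 2.1, "site_b": 1.0, "site_c": 0.4, "site_d": 0.2}
toy_structural = {"site_a", "site_c"}
critical, above, rescued = select_critical_sites(toy_ddg, toy_structural)
```

In the toy data, two sites pass the energy threshold and one further site is retained because it is structurally implicated, mirroring the way nine scanning hits plus three structurally supported sites give the twelve critical sites examined in the study.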

We utilized the SSIPe program73 to predict how ACE2 amino acid differences in each species would affect the relative binding energy of the ACE2/SARS-CoV-2 interaction. Using human ACE2 bound to the SARS-CoV-2 RBD as a benchmark (PDB 6M0J), the program mutates selected residues and compares the binding energy to that of the original. Using this algorithm, we studied interactions of all primates across the full suite of amino acid changes occurring at critical binding sites for each species. To more thoroughly assess the impact of each amino acid substitution, we also examined the predicted effect of individual amino acid changes (in isolation) on protein-binding affinity.
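As a small illustration of how species-specific differences at critical sites can be expressed as substitutions relative to the human benchmark, the sketch below builds labels of the form Q42E from two residue maps. The helper name is hypothetical, only sites 41 and 42 are shown (using the residues stated in the Results), and the output is a generic label list; no claim is made about the input format that SSIPe itself expects.

```python
def mutation_labels(reference, variant):
    """List substitutions (e.g. 'Y41H') of `variant` relative to `reference`.

    Both arguments map residue position -> one-letter amino acid code.
    Positions missing from `variant` are treated as unchanged.
    """
    labels = []
    for pos in sorted(reference):
        ref_aa = reference[pos]
        var_aa = variant.get(pos, ref_aa)
        if var_aa != ref_aa:
            labels.append(f"{ref_aa}{pos}{var_aa}")
    return labels


# Residues at sites 41 and 42 as described in the Results.
human = {41: "Y", 42: "Q"}
platyrrhine = {41: "H", 42: "E"}
catarrhine = {41: "Y", 42: "Q"}

platyrrhine_changes = mutation_labels(human, platyrrhine)
catarrhine_changes = mutation_labels(human, catarrhine)
```

Each label can then be evaluated on its own (a single-residue replacement) or together with the others (the full complement of changes for a species), which corresponds to the two kinds of comparison reported in Tables 2 and 1 respectively.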

Adaptive evolution of ACE2 sequences

We further investigated ACE2 and how selective pressures in different clades might be shaping variation at the binding sites, using codeml clade C and branch-site models in PAML74. We first tested if selection acting on ACE2 is divergent between the major clades in our sample (platyrrhine, catarrhine, and strepsirrhine primates, non-primate mammals) with the codeml clade model C, which was compared to the null model (M2a_rel) with a likelihood ratio test75. This test shows whether there is divergent selection (dN/dS ratio = ω) across all clades, but not which clades are experiencing positive selection. We therefore followed the clade model with a series of branch-site models, which allow one clade at a time to be designated as a set of “foreground” branches and test whether this clade has experienced episodes of positive selection compared to the remaining sets of “background” branches (ωforeground > ωbackground). Branch-site models are compared to a null model that fixes ω at 1 with a likelihood ratio test. In the case of the alternative model having a significantly better fit than the null model, indicating positive selection, potential sites under positive selection are identified with a Bayes Empirical Bayes (BEB) approach76. We completed branch-site models for each primate clade (platyrrhine, strepsirrhine, and catarrhine), as well as bats because previous research has identified ACE2 to be under positive selection in this clade, potentially in response to coronaviruses43. We had to exclude Hipposideros pratti and Myotis daubentonii from PAML analyses, because only a partial ACE2 sequence was available for these two species. Input files and control files for PAML codeml analyses are available in the GitHub repository62.

Statistics and reproducibility

Models in PAML were compared with likelihood ratio tests and evaluated for significance with a right-tailed chi-squared distribution. As this was a comparative study of gene sequences across species, we had one representative individual for each species (n = 41) and no replicates.
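The likelihood ratio test can be illustrated with a short sketch. It assumes one degree of freedom, which is a common choice for the branch-site comparison (one additional free ω parameter); the degrees of freedom for the clade model comparison differ and are not covered here. The function names are hypothetical, and the chi-squared tail probability is computed with the closed form for one degree of freedom so that only the standard library is needed.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Likelihood ratio statistic: twice the log-likelihood difference."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_right_tail_df1(stat: float) -> float:
    """Right-tailed p-value of a chi-squared distribution with 1 df.

    For one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    if stat <= 0:
        return 1.0
    return math.erfc(math.sqrt(stat / 2.0))


# LRT statistics reported in the Results for the branch-site models.
p_platyrrhine = chi2_right_tail_df1(0.633)
p_strepsirrhine = chi2_right_tail_df1(0.833)
p_catarrhine = chi2_right_tail_df1(14.546)
```

Under the one-degree-of-freedom assumption, these statistics give p-values in line with those reported for the platyrrhine (p = 0.427), strepsirrhine (p = 0.361), and catarrhine (p < 0.001) branch-site tests.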

Reporting summary

Further information on research design is available in the Nature Research Life Sciences Reporting Summary linked to this article.