Heterogeneous Expression of Nuclear Encoded Mitochondrial Genes Distinguishes Inhibitory and Excitatory Neurons

Abstract Mitochondrial composition varies by organ and their constituent cell types. This mitochondrial diversity likely determines variations in mitochondrial function. However, the heterogeneity of mitochondria in the brain remains underexplored despite the large diversity of cell types in neuronal tissue. Here, we used molecular systems biology tools to address whether mitochondrial composition varies by brain region and neuronal cell type in mice. We reasoned that proteomics and transcriptomics of microdissected brain regions combined with analysis of single-cell mRNA sequencing (scRNAseq) could reveal the extent of mitochondrial compositional diversity. We selected nuclear encoded gene products forming complexes of fixed stoichiometry, such as the respiratory chain complexes and the mitochondrial ribosome, as well as molecules likely to perform their function as monomers, such as the family of SLC25 transporters. We found that the proteome encompassing these nuclear-encoded mitochondrial genes and obtained from microdissected brain tissue segregated the hippocampus, striatum, and cortex from each other. Nuclear-encoded mitochondrial transcripts could only segregate cell types and brain regions when the analysis was performed at the single-cell level. In fact, single-cell mitochondrial transcriptomes were able to distinguish glutamatergic and distinct types of GABAergic neurons from one another. Within these cell categories, unique SLC25A transporters were able to identify distinct cell subpopulations. Our results demonstrate heterogeneous mitochondrial composition across brain regions and cell types. We postulate that mitochondrial heterogeneity influences regional and cell type-specific mechanisms in health and disease.


Introduction
The mitochondrion is classically depicted as the powerhouse of the cell despite performing a variety of functions outside of ATP production (Spinelli and Haigis, 2018). From a purely bioenergetic perspective, some of these functions are necessary for energy requirements to maintain plasma membrane potential, synaptic activity, and actin cytoskeleton dynamics (Attwell and Laughlin, 2001;Bernstein and Bamburg, 2003;Harris et al., 2012). However, additional roles for mitochondria have been identified in behavior, synaptic plasticity, neuronal migration, neurodevelopment, calcium buffering, lipid metabolism, and cell death (Kann and Kovács, 2007;Mattson et al., 2008;Mann et al., 2021). The requirement of functional mitochondria for neuronal tissue is perhaps best demonstrated by the family of mitochondrial diseases, which share a high prevalence of neurologic symptoms despite being otherwise clinically heterogeneous (Chinnery, 1993;Vafai and Mootha, 2012;Gorman et al., 2016).
Mitochondria are dynamic organelles and vary phenotypically by organ, cell type, and even within the cell (Pagliarini et al., 2008;Aryaman et al., 2018;Fecher et al., 2019;Rath et al., 2021). These differences in phenotypes may emerge because of variation in mitochondrial composition across cell types and/or within a single cell. This concept has been poorly considered and explored to date, as most studies of mitochondrial biology involve bulk purification of mitochondria from diverse organs (Pagliarini et al., 2008;Fecher et al., 2019;Rath et al., 2021). Cell type-specific differences in mitochondrial composition could determine differential cellular susceptibility to neurodevelopmental disorders and neurodegenerative diseases. Here, we address whether mitochondrial composition varies across cell types and brain regions. We take advantage of systems biology gene expression analyses in microdissected brain tissue and single-cell mRNA sequencing (mRNAseq) datasets. We analyzed the transcriptome and proteome in microdissected mouse cortex, hippocampus, and striatum. We focused on the five respiratory chain complexes and the mitochondrial ribosome, as necessary components of mitochondria that have a fixed stoichiometry (Vafai and Mootha, 2012), as well as the SLC25A transporter family, as molecules of variable expression among tissues (Cunningham and Rutter, 2020;Palmieri et al., 2020;Rath et al., 2021). Collectively, this set of genes encompasses 18% of the mitochondriaannotated proteome (Rath et al., 2021). Notably, while the expression of this selected set of nuclear encoded mitochondrial genes produced distinct regional clusters differentiating the cortex, hippocampus, and striatum at the proteome level, analysis of the transcript expression of these nuclear encoded mitochondrial genes could not distinguish between these three different brain regions. However, at the single-cell level, distinct cortical and hippocampal regions could be distinguished by differential expression of mitochondrial ribosome, SLC25A (inner mitochondrial membrane transporters), or individual respiratory chain complex transcripts. Expression of mitochondrial genes could prominently distinguish excitatory and inhibitory neurons, as well as different classes of GABAergic interneurons.
The present study demonstrates that nuclear encoded mitochondrial transcripts and proteins are differentially expressed across brain regions and cell types, informing our understanding of the molecular diversity and heterogeneity within the brain. Our work expands recent findings demonstrating that mitochondria differ in composition among cell populations in the cerebellum (Fecher et al., 2019) and between fast-spiking and regular spiking neurons (Cserép et al., 2018). We postulate that cell lineagespecific mitochondrial composition and metabolism are poised to contribute to the susceptibility of certain cell types to damage and/or cell death in diseases of the nervous system.

Animals and tissue dissection
Animal husbandry and euthanasia was conducted as approved by our Institutional Animal Care and Use Committees. C57BL/6J male mice (The Jackson Laboratory #000664), six weeks of age, were euthanized with CO 2 asphyxiation and decapitated. Whole brain was removed, rinsed in ice-cold phosphate buffered saline and placed in a prechilled adult mouse coronal slicing matrix with 1.0-mm slice interval (Zivic catalog #BSMAS001-1). Chilled blades were placed in the matrix channels according to manufactures recommendations and slices laid out on an ice-cold aluminum block for punch microdissection. Hippocampal regions were identified in sections #2 and #3 of the slices corresponding to sections 21-22 of the C57BL/6J Atlas (http://www.mbl.org/atlas170/atlas170_ frame.html). Cortex punches were taken adjacent to the hippocampal regions. Striatum was dissected from slice #6 or #7 corresponding to sections 15-16 of the C57BL/ 6J Atlas. Punches of the brain tissue were taken using a chilled punch set with 1.00-mm diameter punches (Stoelting catalog #57401): six punches were taken from each of the hippocampus, cortex, and striatum brain regions (three from the left hemisphere and three from the right hemisphere; Barr et al., 2004). Punches were ejected, transferred to a microcentrifuge tube using forceps, and flash frozen in liquid nitrogen until processing for RNAseq or mass spectrometry (MS).

RNAseq
RNA extraction, library construction, and sequencing were done by BGI and are briefly described below. Total RNA was extracted with TRIzol and quality control was performed with the Agilent 2100 Bio analyzer (Agilent RNA 6000 Nano kit) to do the total RNA sample QC: RNA concentration, RIN value, 28S/18S, and the fragment length distribution.
For library construction, poly-A containing mRNA molecules were isolated using poly-T oligo-attached magnetic beads. Following purification, the mRNA was fragmented into small pieces using divalent cations under elevated temperature. The cleaved RNA fragments were copied into first strand cDNA using reverse transcriptase and random primers. This was followed by second strand cDNA synthesis using DNA Polymerase I and RNase H. cDNA fragments underwent addition of a single "A" base and subsequent ligation of the adapter. The products were then purified and enriched with PCR amplification. We quantified the PCR yield by Qubit and pooled samples together to make a single strand DNA circle (ssDNA circle), which gave the final library. DNA nanoballs (DNBs) were generated with the ssDNA circle by rolling circle replication (RCR) to enlarge the fluorescent signals at the sequencing process. The DNBs were loaded into the patterned nanoarrays and pair-end reads of 100 bp were read through on the BGISEQ-500 platform for data analysis. For this step, the BGISEQ-500 platform combined the DNB-based nanoarrays and stepwise sequencing using combinational probe-anchor synthesis sequencing method. On average, we generated ;5.64 Gb bases per sample. The average mapping ratio with reference genome was 93.47%, the average mapping ratio with gene was 67.04%; 19,972 genes were identified in which 19,972 of them are known genes and 2659 of them are novel genes; 29,781 novel transcripts were identified.

Analysis of sequencing reads
The sequencing reads were uploaded to the Galaxy web platform, and we used the public server at https:// usegalaxy.org/ to analyze the data (Afgan et al., 2018). FastQC was performed to remove samples of poor quality (Andrews, 2010). All mapping was performed using Galaxy server (v. 21.01) running Hisat2 (Galaxy version 2.1.01galaxy7), FeatureCounts (Galaxy version 2.0.1), and Deseq2 (Galaxy version 2.11.40.61galaxy1; Liao et al., 2014;Love et al., 2014;Kim et al., 2015). The Genome Reference Consortium build of the reference sequence (GRCm38) and the GTF files (Ensembl) were used and can be acquired from iGenome (Illumina). Hisat2 was run with the following parameters: paired-end, unstranded, default settings were used except for a GTF file was used for transcript assembly. The aligned SAM/ BAM files were processed using Featurecounts (Default settings except used Ensembl GRCm38 GTF file and output for DESeq2 and gene length file). FeatureCounts output files and raw read files are publicly available (GEO with accession GSE140054). The FeatureCounts compiled file is GSE140054_AllTissueFeatureCounts. txt.gz. Gene counts were normalized using DESeq2 (Love et al., 2014) followed by a regularized log transformation. Differential Expression was determined using DESeq2 with the following settings: factors were tissue type, pairwise comparisons across tissues was done, output all normalized tables, size estimation was the standard median ratio, fit type was parametric, outliers were filtered using a Cook's distance cutoff.
We compared the top 100 genes whose expression was different between cortex and hippocampus data from our RNAseq study to the quantitative in situ hybridization data from the mouse brain atlas at the Allen Institute. Correlation analysis was performed with Prism 9 for macOS version 9.1.1 (223), see Fig. 1.

Sample processing
Each tissue piece was individually homogenized in 500 ml of urea lysis buffer (8 M urea and 100 mM NaHPO 4 , pH 8.5), including 5 ml (100Â stock) HALT protease and phosphatase inhibitor cocktail (Pierce). All homogenization was performed using a Bullet Blender (Next Advance) according to manufacturer protocols. Briefly, each tissue piece was added to urea lysis buffer in a 1.5 ml Rino tube (Next Advance) harboring 750-mg stainless steel beads (0.9-2 mm in diameter) and blended twice for 5-min intervals at 4°C. Protein supernatants were transferred to 1.5ml Eppendorf tubes and sonicated (Sonic Dismembrator, Fisher Scientific) three times for 5 s with 15-s intervals of rest at 30% amplitude to disrupt nucleic acids and subsequently vortexed. Protein concentration was determined by the bicinchoninic acid (BCA) method, and samples were frozen in aliquots at À80°C. Protein homogenates (100 mg) were diluted with 50 mM NH 4 HCO 3 to a final concentration of less than 2 M urea and then treated with 1 mM dithiothreitol (DTT) at 25°C for 30 min, followed by 5 mM iodoacetimide (IAA) at 25°C for 30 min in the dark. Protein was digested with 1:100 (w/w) lysyl endopeptidase (Wako) at 25°C for 2 h and further digested overnight with 1:50 (w/w) trypsin (Promega) at 25°C. Resulting peptides were desalted with a Sep-Pak C18 column (Waters) and dried under vacuum.

Tandem mass tag (TMT) labeling
For each tissue type, 10 individual samples and one composite sample were labeled using the TMT 11-plex kit (ThermoFisher 90406). Labeling was performed as previously described (Ping et al., 2018;Higginbotham et al., 2020). Briefly, each sample containing 100 mg of peptides was re-suspended in 100 mM TEAB buffer (100 ml). The TMT labeling reagents were equilibrated to room temperature, and anhydrous ACN (256 ml) was added to each reagent channel. Each channel was gently vortexed for 5 min, and then 41 ml from each TMT channel was transferred to the peptide solutions and allowed to incubate for 1 h at room temperature. The reaction was quenched with 5% (v/v) hydroxylamine (8 ml; Pierce). All 10 channels were then combined and dried by SpeedVac (LabConco) to ;150 ml and diluted with 1 ml of 0.1% (v/v) TFA, then acidified to a final concentration of 1% (v/v) FA and 0.1% (v/v) TFA. Peptides were desalted with a 200 mg C18 Sep-Pak column (Waters). Each Sep-Pak column was activated with 3 ml of methanol, washed with 3 ml of 50% (v/v) ACN, and equilibrated with 2 Â 3 ml of 0.1% TFA. The samples were then loaded and each column was washed with 2 Â 3 ml 0.1% (v/v) TFA, followed by 2 ml of 1% (v/v) FA. Elution was performed with 2 volumes of 1.5 ml 50% (v/v) ACN. The eluates were then dried to completeness.

High pH fractionation
High pH fractionation was performed essentially as described with slight modification (Ping et al., 2020). Dried samples were re-suspended in high pH loading buffer (0.07% v/v NH 4 OH, 0.045% v/v FA, 2% v/v ACN) and loaded onto an Agilent ZORBAX 300 Extend-C18 column (2.1 Â 150 mm with 3.5-mm beads). An Agilent 1100 HPLC system was used to carry out the fractionation. Solvent A consisted of 0.0175% (v/v) NH 4 OH, 0.0125% (v/v) FA, and 2% (v/v) ACN; solvent B consisted of 0.0175% (v/v) NH 4 OH, 0.0125% (v/v) FA, and 90% (v/v) ACN. The sample elution was performed over a 58.6-min gradient with a flow rate of 0.4 ml/min. The gradient consisted of 100% solvent A for 2 min, then 0-12% solvent B over 6 min, then 12-40% over 28 min, then 40-44% over 4 min, then 44-60% over 5 min, and then held constant at 60% solvent B for 13.6 min. A total of 96 individual equal volume fractions were collected across the gradient and subsequently pooled by concatenation into 24 fractions and dried to completeness using a vacuum centrifugation.

Liquid chromatography tandem MS
Each of the 24 high-pH peptide fractions was resuspended in loading buffer (0.1% FA, 0.03% TFA, 1% ACN). Peptide eluents were separated on a self-packed C18 (1.9 mm Maisch) fused silica column [25 cm Â 75 mM internal diameter (ID), New Objective] by an Easy nLC 1200 (Thermo Scientific) and monitored on an Q-Exactive HFX MS (Thermo Scientific). Elution was performed over a 120 min gradient at a rate of 300 nl/min with buffer B ranging from 3% to 40% (buffer A: 0.1% FA in water; buffer B: 0.1% FA in 80% ACN). The mass spectrometer was set to acquire data in positive ion mode using data-dependent acquisition with top 10 cycles. Each cycle consisted of one full MS scan followed by a maximum of 10 MS/MS. Full MS scans were collected at a resolution of 120,000 (400-1600 m/z range, 3 Â 10^6 AGC, 100 ms maximum ion injection time). All higher energy collision-induced dissociation (HCD) MS/MS spectra were acquired at a resolution of 45,000 (1.6 m/z isolation width, 30% collision energy, 1 Â 10 -5 AGC target, 86-ms maximum ion time). Dynamic exclusion was set to exclude previously sequenced peaks for 20 s within a 10-ppm isolation window.

Data processing protocol
All raw files were searched using Thermo's Proteome Discoverer suite (version 2.1.1.21) with Sequest HT. The spectra were searched against a mouse Uniprot database downloaded July, 2018 (98,225 target sequences). Search parameters included 20-ppm precursor mass window, 0.05-Da product mass window, dynamic modifications methione (115.995 Da), deamidated asparagine and glutamine (10.984 Da), phosphorylated serine, threonine and tyrosine (179.966 Da), and static modifications for carbamidomethyl cysteines (157.021 Da) and N-terminal and lysine-tagged TMT (1229.26340 Da). Percolator was used filter PSMs to 0.1%. Peptides were grouped using strict parsimony and only razor and unique peptides were used for protein level quantitation. Reporter ions were quantified from MS2 scans using an integration tolerance of 20 ppm with the most confident centroid setting. Only unique and razor (i.e., parsimonious) peptides were considered for quantification.
The MS proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (Perez-Riverol et al., 2019) partner repository with the dataset identifier PXD026104.
Correlations and statistical analysis between the fold of change expression slopes of the selected 210 nuclear encoded mitochondrial transcripts presented in Figure 2G was performed with Prism 9 for macOS version 9.1.1 (223).

Single-cell RNAseq
Single-cell RNAseq data are described in (Yao et al., 2020). Gene expression data matrix (matrix.csv) and cell metadata (metadata.csv). Whole cortex and hippocampus-smart-seq (2019) with 10Â-smart-seq taxonomy (2020) data were downloaded from the Allen Institute Portal. This dataset contains RNAseq data of single cells isolated from .20 areas of mouse cortex and hippocampus. Abbreviations used in figures match the Allen Mouse Brain Atlas. The data set includes 76,307 single cells. The sequencing results were aligned to exons and introns in the GRCm38.p3 reference genome using the STAR algorithm, and aggregated intron and exon counts at the gene level were calculated.
Matrix files were processed with Delimit Pro for Windows 10/8.1/7. We selected the 210 nuclear encoded transcripts from the matrix.csv file with Delimit Pro and data were assembled in Excel together with the metadata.csv data. Data were exported as tab delimited text file and analyzed with the Qlucore Omics Explorer version 3.6(33). Data were log2 converted and normalized to a mean of 0 and a variance of 1. 2D t-distributed stochastic neighbor embedding (t-SNE) plots were generated using a perplexity of 10 and default settings. Callouts were made by cell metadata or gene expression levels. Respiratory complexes and mitochondrial ribosome subunits were defined using the CORUM database, see text for complex entries (Giurgiu et al., 2019).

Results
Brain expression of mitochondrial proteins reveals regional heterogeneity Expression levels of proteins and their transcripts have been used to explore tissue heterogeneities in organelle Research Article: New Research abundance and/or composition (Andersen and Mann, 2006;Geiger et al., 2013;Wilhelm et al., 2014;Cardoso-Moreira et al., 2019;He et al., 2020). We applied this paradigm to mitochondria from adult mouse brain regions. We performed simultaneous quantifications of the transcriptome and proteome from punch-microdissected mouse coronal sections of cortex, hippocampus, and the striatum. We chose punch-microdissected tissue to minimize noise introduced by tissue heterogeneity. Microdissection resulted in tissue samples of ;160 mm 3 for microanalytical omics analyses. We focused on components of the five electron transport chain complexes, the mitochondrial ribosome, and the SLC25A family of inner mitochondrial membrane transporters. We selected the five respiratory chain complexes and the mitochondrial ribosome, as these complexes are necessary components of mitochondria and have defined subunit stoichiometries necessary for their function (Vafai and Mootha, 2012). In contrast, the expression of SLC25A transporter family members is variable among tissues, as only the phosphate carrier (SLC25A3) and ADP/ATP carriers (SLC25A4-6) are essential for ATP synthesis (Cunningham and Rutter, 2020;Palmieri et al., 2020;Rath et al., 2021). Collectively, this set of genes constituted 210 proteins, or 18% of the mitochondria-annotated proteome (Rath et al., 2021). We reasoned that respiratory chain complex subunits and the mitochondrial ribosome should be refractory to anatomic expression differences because of their fixed stoichiometries, while the SLC25A family of transporters would be likely to reveal heterogeneous expression across brain regions.
We quantified mRNA expression across three distinct mouse brain regions encompassing diverse cell types: cortex, hippocampus, and striatum ( Fig. 1). We focused first on all mRNAs encoded in the mouse genome (Fig.  1A1), and then on a subset of 210 of these messages encoding proteins localized to mitochondria (Fig. 1A2). We considered an expression change significant if gene expression between two regions differed by at least 2-fold with p , 0.001. These same thresholds were applied to RNAseq and proteome datasets from mouse tissues (Figs. 1-3).
Brain regions were discriminated by their whole-genome transcript expression (Fig. 1A1). For example, cortex and hippocampus differed by 353 genes whose relative expression was higher in cortex and 316 genes whose relative expression was higher in hippocampus (Fig. 1A1). We validated these gene expression differences with the Allen Mouse Expression Atlas and observed a strong correlation between both datasets (r = 0.69, p , 0.0001; Fig. 1B). In contrast with the transcriptomes of whole brain regions, the 210 mRNAs mapping to the selected subset of nuclear encoded mitochondrial proteins have minimal expression differences among the three brain regions (Fig. 1A2). We could only distinguish the striatum from other regions because of its low expression of the transporters SLC25A13, SLC25A34, and SLC25A37 (Fig. 1A2). These transporters encode an aspartate-glutamate exchanger, an orphan transporter, and an iron uptake transporter, respectively (Palmieri and Monné, 2016). While global gene expression patterns segregated cortex and other brain regions into defined clusters by t-SNE ( Fig. 1C; Kobak and Berens, 2019), the 210 selected mitochondrial transcripts poorly distinguished brain regions using t-SNE analysis (Fig.  1D). This indicated minimal regional differences in the bulk expression of messages encoding proteins localized to mitochondria.
The poor discrimination between brain regions by the 210 mRNAs encoding our selected subset of mitochondrial proteins could be interpreted in the following ways. Anatomical differences in cellular composition could skew regional differences dictated by these mRNAs. For instance, increased numbers of mitochondria in certain cell types may mask any differences that would otherwise be detectable when analyzing single cells. Additionally, regional differences could be manifested at the protein rather than at the transcript level. This last problem is a common occurrence in diverse tissues and cell types, including the brain, with correlations between mRNA and protein expression below 0.5 (de Sousa Abreu et al., 2009;Ghazalpour et al., 2011;Schwanhäusser et al., 2011;Carlyle et al., 2017;Wang et al., 2019). We addressed these questions by quantifying regional proteomes in mouse brain (Fig. 2) and by analyzing the expression of these 210 mitochondrial transcripts at a single-cell level . We used quantitative isobaric labeling by TMT of adult mouse brain proteomes to measure regional proteome differences ( Fig. 2; Werner et al., 2012;Gokhale et al., 2019Gokhale et al., , 2021. We selected TMT MS quantification of the proteome because TMT offers improved capacity to detect changes reaching statistical significance. This is because of TMT's superior precision and reduced number of missing values as compared with label-free quantifications (O'Connell et al., 2018).
We first determined the quality of our 15-plex TMT brain proteome across the three brain regions selected, analyzing correlation coefficients between biological replicates within and in between brain regions (Fig. 2). Multiscatter plots and correlation matrices showed Pearson correlations .0.9 among biological replicates within a brain region and above 0.5 in comparisons between regions ( Fig.  2A). These strong correlations manifested as a reduced variance by principal component analysis (PCA) where 97% of the total variance was accounted by principal component 1 (Fig. 2A, inset). Biological replicates within regions clustered closely and were segregated from other brain regions by PCA, thus validating our TMT proteome dataset ( Fig. 2A, inset).
The global proteome unveiled vast differences between mouse brain regions (Fig. 2B). These proteome differences were more pronounced than those found at the transcript level. As an example, the cortex and hippocampus differed in the expression levels of 4698 proteins among a total of 5285 proteins quantified by TMT, while these two regions differed significantly in only 670 mRNAs (compare Figs. 1A1 and 2B). We further validated our findings and datasets by comparing our proteome hits against the label-free quantified proteomes by Sharma et al. (2015). Our mouse cortex-enriched proteome captured 69.2% of the mouse motor cortex-enriched proteome described by Sharma et al. (2015). This overlap is 30.8 times above what it is expected by chance (p = 7.9E-195; Fig. 2B, Venn diagram insets). We also found significant, yet less pronounced, overlaps between the hippocampal-enriched and striatum-enriched hits and those previously reported (Sharma et al., 2015). Thus, our results capture previously reported differences in the regional brain proteomes and significantly expand them by deploying TMT MS as a way to quantify the proteome.
Among proteins whose expression differed across brain regions, we found multiple mitochondrial proteins (Fig. 2C,D). The most significant changes in mitochondrial protein expression included proteins belonging to respiratory chain complexes, the mitochondrial ribosome, and the SLC25A family of transporters (Fig. 2C,D). The highest expression levels of some of these mitochondrial proteins were observed in the cortex (Fig. 2C). We used a nonlinear tool of data dimensionality reduction, t-SNE, to uncover similarities in the local and global structure of the protein expression data (Fig. 2E,F; Kobak and Berens, 2019). t-SNE analysis of the whole proteome showed that the three brain regions studied did not group into clearly defined clusters (Fig. 2E). However, when t-SNE analysis was performed with the selected mitochondrial proteins, cortex, hippocampus, and striatum were group into clearly distinct cluster (Fig. 2F). Thus, t-SNE analysis indicates that expression differences in mitochondrial proteins alone can anatomically discriminate these datasets.
To further explore the regional differences we observed in mitochondrial protein expression, we performed correlation Figure 1. RNAseq analysis of microdissected mouse brain regions. A 1 , A 2 , Volcano plots of cortex compared with hippocampus, cortex compared with striatum, and hippocampus compared with striatum from adult male mice (n = 5). Threshold for significance was set at p , 10 À3 and log 2 fold change at 1. Color code symbols depict the fold of change below or above the thresholds. A 1 , All transcripts quantified using DSeq2 annotated to the mouse genome GRCm38. A 2 , All nuclear transcripts encoding subunits of the respiratory chain complexes, the mitochondrial ribosome, and the SLC25A family of transporters. Note that scarce numbers of these nuclear encoded mitochondrial transcripts show modest expression differences among brain regions. B, Validation of the RNAseq results using as a comparison the in situ hybridization data from the Allen Mouse Brain Atlas. The 100 most upregulated and downregulated genes when comparing cortex and hippocampus by RNAseq were correlated with the differences reported by the Allen data. C, t-SNE analysis of the RNAseq data presented in A 1 . D, t-SNE analysis of the data presented in A 2 . analysis of these differences across paired brain regions (Fig.  2G), focusing on the selected mitochondrial proteins of interest. We reasoned that anatomically universal mitochondrial expression patterns would be represented by similar expression differences across multiple mitochondrial proteins between two regions. Similar slopes among regional pairwise comparisons would indicate homogenous expression differences, while differences in slope would suggest regional composition differences. These compositional distinctions could originate either from differences in mitochondria shared by all cells in a defined anatomic location or differences in mitochondrial composition among diverse cell types residing in a defined anatomic region (Fig. 2G). We found that pairwise expression difference correlations showed different ordinate intersects (Fig. 2G). The slope of these correlations was significantly distinct among regional pairwise comparisons (Fig.   2G). Moreover, subunits of respiratory chain complexes and the mitochondrial ribosome were similarly weighted to the parameters of these correlations (Fig. 2G, orange and blue symbols, respectively). These data argue for regional heterogeneity in the expression of mitochondrial constituents, even among respiratory chain complexes and the mitochondrial ribosome.
Single-cell transcriptomes identify anatomic and cell type-specific differences in nuclear encoded mitochondrial genes The proteomics data suggested that regional heterogeneities in mitochondrial protein expression in adult mouse brain could originate from intrinsic differences in the cellular expression of nuclear encoded mitochondrial genes. Threshold for significance were set at p , 10 À3 and log 2 fold change at 1. Color code symbols depict the fold of change below or above the thresholds. B, All proteins quantified in brain samples with inset Venn diagrams depicting the overlap between our TMT data (blue) and label-free quantifications by Sharma et al. (2015;pink). Representation factor and p values were estimated with an exact hypergeometric probability test. C, All nuclear encoded subunits of the respiratory chain complexes, the mitochondrial ribosome, and the SLC25A family of transporters. Note the abundant nuclear encoded mitochondrial proteins differentiating brain regions. D, Heat maps of the proteins that show the most pronounced changes based on the q value and magnitude of the difference. E, t-SNE analysis of the proteome data presented in B. F, t-SNE analysis of the data presented in C. Note that the best clustering of brain regions is obtained with the nuclear encoded mitochondrial proteins described in C. G, Simple linear correlation analysis of expression differences across brain regions. Proteins belonging to respiratory chain complexes, the mitochondrial ribosome, and the SLC25A family of transporters are color coded. Note the differences in slopes. p values describe the differences between adjacent correlation plots slopes obtained with Prism. Shaded area represents the 95% confidence interval. See Extended Data Figure 2-1 for list of protein hits with p , 10 À3 and log 2 fold change of least 1.
To test this hypothesis, we analyzed the expression of the 210 nuclear encoded mitochondrial transcripts at the single-cell level. We reasoned the ineffectiveness of bulk tissue RNAseq discriminating brain regions solely on nuclear encoded mitochondrial transcripts (Fig. 1) could be bypassed by the richness of finegrained categorizational information from single-cell RNAseq datasets. We selected the Allen single-cell Figure 3. Nuclear encoded mitochondrial transcripts differentiate neurons by neurotransmitter identity and anatomical location. A, Volcano plots were assembled using the Allen single-cell RNAseq dataset. A total of 50,002 pyramidal glutamatergic neurons were compared with 22,745 GABAergic interneurons. Volcano plots are organized by subunits belonging to the mitochondrial ribosome, electron chain complexes I to V, and the SLC25A family of solute transporters. The mitochondrial ribosome and the SLC25A family of transporters are the most dissimilarly expressed transcripts when comparing GABAergic with glutamatergic neurons. B 1 , t-SNE cell atlas generated with the expression levels of all transcripts encoding mitochondrial ribosome subunits. The t-SNE atlas encompasses .20 areas of mouse cortex and hippocampus, totaling 76,307 cells. Color codes denote brain regions annotated by the Allen Brain Atlas. B 2 shows B 1 data after 100 consecutive permutations. Anatomical segregation is lost. C, Diagram explaining strategy for cell type and anatomic callout in t-SNE atlases. GABAergic neurons were color-coded green and glutamatergic neurons were color coded gray. Cell type and anatomic region were marked by a triangle. D, t-SNE atlas shown in B 1 that was layered with the neurotransmitter identity of cells and anatomic location of cells (triangles). GABA, parvalbumin (Pval), somatostatin (Sst), g -synuclein (Sncg), vasointestinal peptide (Vip), and lysosomal-associated membrane protein family member 5 (Lamp5) denote markers defining specific interneuron subpopulations. E, t-SNE atlas shown in B 1 but layered with the subtype of interneuron (triangles). transcript expression dataset as the most comprehensive single-cell transcript expression study to date (Yao et al., 2020). The Allen brain dataset encompasses .20 areas of mouse cortex and hippocampus, totaling 76,307 cells. Of these cells, 50,002 correspond to pyramidal glutamatergic neurons and 22,745 correspond to GABAergic interneurons, which include 4363 parvalbumin (PV)-positive cells (Yao et al., 2020).
We asked whether the mRNA expression of any one of the five electron transport chain complexes, the mitochondrial ribosome, or the SLC25A family of inner mitochondrial membrane transporters was able to discriminate anatomic regions and brain cell types in t-SNE-generated atlases (Figs. 3-5). We first sought to determine whether the expression of transcripts could be different between glutamatergic and GABAergic neurons when single-cell expression was bulk averaged across each cell category. Volcano plots revealed that the most pronounced changes in the number of transcripts and the magnitude of expression differences occurred among subunits of the mitochondrial ribosome and the SLC25A family of mitochondrial transporters (Fig. 3A). Some transcripts were enriched in glutamatergic neurons, such as the mitochondrial ribosome subunit Lactb (Mrpl56) or the mitochondrial glutamate/proton symporter SLC25A22 (Fig. 3A). Conversely, the mitochondrial ribosome subunit Mrpl52 and the mitochondrial aspartate-glutamate carrier SLC25A13 were enriched in GABAergic neurons (Fig. 3A). The bulk expression of at least one respiratory complex subunit or its assembly factor was substantially different between these two neuron types ( Fig.  3A; see Ndufaf5, Sdhc, and Atp5J).
We used t-SNE to compress multidimensional mRNA expression data into single point cell representations. t-SNE atlases capture and represent similarities in single-cell gene expression by clustering cells along the coordinates of a bidimensional space ( Fig. 3B; Kobak and Berens, 2019). These atlases were then annotated based on their anatomic location or cell type, using triangles to label cells belonging to a particular region (Fig. 3C-E). We focused on GABAergic (Fig. 3C-E, green dots) and glutamatergic neurons (Fig. 3C-E, gray dots), as these two cell types were the most numerous cells whose gene expression was scored in the Allen dataset (Yao et al., 2020).
We next focused our analysis on the mitochondrial ribosome, as this organelle is the biggest protein complex in mitochondria. The ribosome is encoded by ;80 core nuclear expressed proteins necessary for organelle function (CORUM complex #320; Giurgiu et al., 2019). We built a mitochondrial ribosome subunit expression t-SNE atlas (Fig. 3B1). This atlas revealed that mitochondrial ribosome mRNA expression profiles grouped cells into distinct areas of the cortex and hippocampus (the hippocampus is annotated by color in Fig. 3B1 and by a triangle callout in Fig. 3D), as well as neuronal cell types within these regions (Fig. 3D, anatomic annotation by triangle callout over a color-coded glutamate vs GABA-annotated atlas). These anatomic distinctions were removed after data permutation, supporting specific anatomic patterns of mitochondrial ribosome gene expression across the brain (Fig. 3B2). t-SNE analysis of mitochondrial ribosome gene expression segregated cells into two major clusters; one was enriched in cells from the temporal cortex, visual cortex, and hippocampus, while the other cluster preferentially enriched cells from motor cortex areas (Fig. 3D, anatomic annotation by triangle callout over a color-coded glutamate vs GABA-annotated atlas). We mapped the neurotransmitter identity of different cell types into this mitochondrial ribosome gene expression atlas to assess cell type-specific variations in gene expression (Fig. 3E, GABAergic neuronal subtype annotation by triangle callout over a color-coded glutamate vs GABA-annotated atlas). t-SNE analysis revealed clear distinctions between GABAergic and glutamatergic cells (Fig.  3E, green and gray symbols, respectively). In particular, PVpositive and g -synuclein (Sncg)-positive interneurons were the most clearly segregated cell types, regardless of their anatomic location (compare triangle callouts in Fig. 3D,E). Cell clustering was less pronounced for somatostatin-positive or vasointestinal peptide-positive interneurons (Fig. 3E). We overlapped the t-SNE transcriptional clusters with heat maps depicting expression levels of representative mitochondrial ribosome transcripts (Fig. 5). These heat maps indicated that the expression of several mitochondrial ribosome transcripts was higher in GABAergic interneurons, in particular PV-positive interneurons localized to motor areas of the cortex (compare Figs. 3D,E and 5A).
We wanted to evaluate the robustness of nuclear encoded mitochondrial gene expression sets to segregate cell populations into anatomic and cell type categories (Fig. 4). To this end, we built additional gene expression atlases with transcript datasets made up of either the 48 SLC25A transporters, 45 subunits of Complex I, four subunits of Complex II (CORUM complex #440), 10 subunits of Complex III (CORUM complex #403), 14 subunits of Complex IV (CORUM complex #6442), or 16 subunits of Complex V (CORUM complex #563; Giurgiu et al., 2019). Each one of these atlases segregated cells into distinct anatomic and cell type-specific cell populations (compare Fig. 4A, where anatomic annotation is done by color, and B, where the GABAergic cell type annotation is done with triangle callouts). Of note, the distance to nearest neighbors for the SLC25A family was more variable, containing small distinct clusters, compared with the ribosome and electron transport chain complexes. Transcriptionally defined cell populations were identified regardless of the complexity of the dataset fed into the t-SNE algorithm. For example, t-SNE analysis of Complex II, a complex represented just by four transcripts, segregated cells into clusters categorized by anatomic location and cell type (Fig. 4A,B). The expression of Complex II subunits was sufficient to distinguish PV-positive GABAergic neurons among all cell types, regardless of their anatomic location (Fig. 4A,B). Similar findings were obtained with cell atlases generated with each one of the respiratory chain complexes, as well as the SLC25A transporter family (Fig. 4A,B).
We also determined whether the expression levels of different transcripts could further distinguish transcriptionally defined cell populations. We superimposed expression heat maps of mitochondrial ribosome subunits, transporters of the SLC25A family, and subunits of Complex I into their corresponding atlases. Subunits of the mitochondrial ribosome expressed similarly and preponderantly in GABAergic neurons (compare Figs. 5A and 3E); however, members of the SLC25A transporter family displayed variable transcript expression patterns (Fig.  5B). On one hand, we found SLC25A transporters whose expression was higher in classes of GABAergic neurons (compare Figs. 5B1 and 4A). These include the phosphate transporter SLC25A3 and the ADP-ATP mitochondrial translocators SLC25A4 and SLC25A5. These three SLC25A transporters are indispensable in mitochondria for ATP generation (Palmieri and Monné, 2016;Cunningham and Rutter, 2020). On the other hand, we identified transporters whose expression was 2-fold to 4-fold higher in discrete cell populations. For example, SLC25A13 was expressed at high levels in a unique subgroup of cells among PV-positive cells (compare Fig. 5B1 and 4A). This cell population is missed in the expression analysis of this transporter when glutamatergic and GABAergic neuron SLC25A13 mRNA levels were averaged in bulk (Fig. 3A, volcano  plots). Similarly, the most expressed transporters in glutamatergic cells by bulk averaging were SLC25A22, SLC25A37, and SLC25A42 (Fig. 3A). Yet, when we mapped the expression levels of these transporters to a t-SNE atlas, we found again discrete cell populations with uniquely high expression levels of these three transporters (Fig. 5B2).
We expanded our studies to the expression of a subset of SLC25A transporters annotated in the SFARI database that are associated with autism spectrum disorder (SLC25A12, SLC25A27, and SLC25A39; Abrahams et al., 2013), or whose expression is altered in postmortem autism brain samples (SLC25A12, SLC25A14, and SLC25A27; Fig. 5B3; Segurado et al., 2005;Lepagnol-Bestel et al., 2008;Anitha et al., 2012). We found that these transporters had non-overlapping increases in expression levels in discrete brain areas and cell types (compare Figs. 5B3 and 4A). Interestingly, while SLC25A12 and SLC25A13 are both aspartate-glutamate carrier isoforms, only SLC25A12 is linked to autism (Figs. 4, 5B3;Segurado et al., 2005;Lepagnol-Bestel et al., 2008;Palmieri and Monné, 2016;Cunningham and Rutter, 2020). The expression properties uncovered with t-SNE atlases built with mitochondrial ribosome or SLC25A transporter datasets were also evident with a Complex I subunit dataset (Fig. 5C). We found discrete cell clusters that differed markedly in the expression levels of some of the Complex I subunits (Fig. 5C, compare Ndufa1 and Ndufa6). Our findings demonstrate that PV-positive neurons and glutamatergic neurons can be differentiated based on the abundance and expression patterns of nuclear encoded mitochondrial genes. We conclude that the expression of nuclear encoded mitochondrial genes varies across anatomic locations and cell types in the brain. These findings set the stage for the possibility of diversified mitochondrial composition and function across cell types and regions in neural tissue.

Discussion
We used proteomic and single-cell transcriptomic datasets to discern whether expression of nuclear mitochondrial genes can differentiate both anatomic regions and neuronal cell types in the adult mouse brain. We focused on a subset of nuclear mitochondrial genes encompassing the electron transport chain, mitochondrial ribosome, and SLC25A family of mitochondrial transporters. Using whole tissue datasets from the cortex, hippocampus, and striatum, we found that regional mitochondrial differences were apparent at the protein level but not the transcript level (Figs. 1, 2). We reasoned that this proteomic variation could stem from intrinsic regional differences in mitochondrial composition and/or cell typespecific mitochondrial composition, given the variable makeup of cell types in the different regions (Wheeler et al., 2015;Zeisel et al., 2015;Gokce et al., 2016;Tasic et al., 2016Tasic et al., , 2018Erö et al., 2018). We used t-SNE analysis of a comprehensive, neuronally-enriched single-cell RNAseq dataset to gain more insight into these possibilities (Yao et al., 2020). We found that differences in nuclear encoded mitochondrial transcript expression at the single-cell level distinguished cortical areas and regions of the hippocampal formation from one another , and the expression of some nuclear encoded mitochondrial genes was differentially enriched in distinct cell populations in single-cell RNAseq analysis (Fig. 5). In particular, our results showed that excitatory and GABAergic neurons can be differentiated based solely on their expression of nuclear encoded mitochondrial transcripts (Fig. 3).
Our findings expand recent evidence that there is heterogeneity in mitochondrial composition among different brain cell types (Fecher et al., 2019). Fecher and colleagues used an elegant genetic approach to tag and isolate brain mitochondria in a cell type-specific manner, demonstrating that GABAergic Purkinje cells, glutamatergic granule cells, and astrocytes in the cerebellum have distinct proteomes that help carry out specialized functions in these cell types (Voogd and Glickstein, 1998;Ioannou et al., 2019). Here, we extend this evidence of heterogeneity by analyzing single-cell transcript data from a large number of neurons across diverse cortical and hippocampal areas. The breadth and granularity of our analysis extends the principle that brain mitochondria are heterogeneous organelles across diverse brain regions. We focused on electron transport chain genes, mitochondrial ribosome genes, and the SLC25A transporter family, reasoning that the defined stoichiometry of electron transport chain and ribosomal complexes would preclude  Fig. 3A). B 3 , SLC25A transporters annotated in the SFARI database associated with autism spectrum disorder (SLC25A12, SLC25A27, and SLC25A39) or whose expression is altered in postmortem autism brain samples (SLC25A12, SLC25A14, and SLC25A27). them from having much heterogeneity while the SLC25A family, most of which are dispensable for ATP generation, would have more variable expression. Our proteomics data showed that, surprisingly, regional differences in mitochondrial composition extend to electron transport chain subunits and mitochondrial ribosome subunits (Fig. 2). Several mitochondrial ribosome proteins and respiratory complex proteins are enriched in the cortex compared with the hippocampus or striatum (Fig. 2). Moreover, t-SNE analysis of the single-cell RNAseq profile of mitochondrial ribosome transcript expression, SLC25A transcript expression, or expression of genes of the individual respiratory complexes segregates different cortical and hippocampal regions from one another (Figs. 3, 4). Generally, mitochondrial ribosome gene expression was enriched in GABAergic cell types, particularly fast-spiking PV-positive interneurons (compare Figs. 3E and 5A). This is perhaps not surprising given the role of the mitochondrial ribosome in translating mitochondrially encoded subunits of the electron transport chain. However, while it has been established that the mitochondrial ribosome is required for neuronal development and function (Gokhale et al., 2021), there is no evidence for GABAergic-specific requirements of mitochondrial ribosomes.
The single-cell transcriptomes of glutamatergic and GABAergic neurons, as well as different classes of GABAergic neurons, distinguish these neuronal types from one another (Fig. 3). These differences in presumed mitochondrial composition may underlie unique mitochondrial demands imposed by specialized cell types, as has been suggested by previous work (Murgia et al., 2015;Cserép et al., 2018;Fecher et al., 2019;Thomas et al., 2019). For instance, the faster spiking characteristic of PV-positive GABAergic interneurons imposes greater energy demands for these cells, and their mitochondria are ultrastructurally adapted to generate ATP very efficiently (Cserép et al., 2018). Moreover, the integrity of electron transport chain subunits is crucial for PV interneuron function (Inan et al., 2016;Sanz-Morello et al., 2020). Our data suggest that the mitochondrial ribosome and members of the SLC25A family also play key roles in PV interneuron function, as the expression of mitochondrial ribosome subunits and certain SLC25A transporters (SLC25A3-5, SLC25A13) is enriched in PV interneurons (compare Figs. 3E,4,5). GABAergic signaling by PV interneurons is key in establishing the ratio of excitatory to inhibitory (E-I) neurotransmission in the cortex (Ferguson and Gao, 2018). Disruptions to the E-I ratio have been widely hypothesized to contribute to pathogenesis of neurodevelopmental and psychiatric disorders (Nelson and Valakh, 2015;Sohal and Rubenstein, 2019). Perturbations of the E-I ratio in rodents impair circuit function and information processing capabilities of cortical neurons, producing behavioral defects common in neurodevelopmental and psychiatric disease (Yizhar et al., 2011;Nelson and Valakh, 2015;Antoine et al., 2019;Sohal and Rubenstein, 2019). Such disturbances to the E-I ratio in neurodevelopment can be caused by compensatory homeostatic plasticity in response to genetic defects, such as in mouse models of fragile X syndrome (Antoine et al., 2019). Recent work implicates mitochondria as mediators of homeostatic plasticity, with more pronounced changes in the mitochondrial proteome in response to activity deprivation in mice modeling fragile X syndrome (Bülow et al., 2021). Given these findings and our results here reporting enriched expression of nuclear-encoded mitochondrial transcripts in PV interneurons, it is tempting to speculate that disturbances in mitochondrial composition contribute to altered E-I ratios common in disease.
The heterogeneity we observed in ribosomal and respiratory chain proteins, as measured in our proteomic data or predicted from the single-cell transcript datasets, can be interpreted in the following ways. First, differences in expression do not necessarily mean that the stoichiometries differ from what is expected of the respiratory chain complexes embedded in the inner mitochondrial membrane or the mitochondrial ribosome in the matrix of the organelle. These expression differences may reflect anatomic and cell type-specific regulation of free subunits in the cytoplasm before they are targeted to their corresponding mitochondrial compartments. This hypothesis would suggest that the biogenesis or destruction of respiratory complexes or the mitochondrial ribosome subunits is different among anatomic regions or cell types. A second model considers heterogeneity in the composition of these complexes, a possibility bolstered by recent findings of variable neuronal cytoplasmic ribosome composition (Fusco et al., 2021). The contributions of these models to the stochiometric assembly of these mitochondrial complexes awaits further experimentation.
Our analysis of expression of SLC25A transporters showed that, as predicted, this family of proteins has variable expression across regions and cell types (Figs. 4, 5). Interestingly, we found that several SLC25A transporters were expressed at higher levels in small populations of cells (Fig. 5). The increased expression of SLC25A family members in distinct cell populations included multiple orphan transporters whose function is unclear, such as SLC25A30, SLC25A34, and SLC25A39 (Palmieri, 2013;Palmieri and Monné, 2016;Palmieri et al., 2020), and transporters linked to neurodevelopmental disorders, such as the citrate transporter SLC25A1 and those annotated in the SFARI database of autism spectrum-linked genes (Fig. 5B). Mutations in SLC25A1 cause a rare, often fatal metabolic disorder characterized by neonatal epileptic encephalopathy (Nota et al., 2013). Moreover, SLC25A1 is part of the chromosomal interval deleted in 22q11.2 deletion syndrome, which is associated with increased risk for myriad neurodevelopmental disorders, most prominently schizophrenia (McDonald-McGinn et al., 2015). Recent work suggests that SLC25A1 and SLC25A4 are hub genes in the network of the perturbed brain proteome associated with 22q11.2 deletion syndrome (Gokhale et al., 2019). The distinctive expression patterns of SLC25A transporters that we observed may be an intrinsic cell autonomous characteristic defining a distinct cell population. Alternatively, such discrete cell populations may represent a transient metabolic state triggered by an acute stimulus. While we cannot resolve between these hypotheses until single-cell metabolomics is possible, genetically encoded biosensors for metabolites are an alternative tool that could be used to discriminate between these hypotheses. Subcellularly targeted and genetically encoded biosensors for lactate, glucose, ATP, NADH, and pyruvate have been successfully used in neurons, while biosensors for the TCA cycle metabolites citrate and a-ketoglutarate have also recently become available (Lüddecke et al., 2017;Koveal et al., 2020;Zhao et al., 2020). These tools can be used to support further investigation into our results here. Together, our results suggest that further investigation into the roles of the SLC25A transporter family in the brain will produce important insights into how mitochondria influence brain function and neurodevelopment. Our work provides a resource of several mitochondrial genes enriched in certain cell types or regions that will serve as a novel tool to inspire hypothesis generation and functional studies of mitochondrial regional and cellular heterogeneity in the brain.