Sequencing of multiple mammalian genomes, together with the development of whole-transcriptome profiling technologies, has opened the door to an unprecedented ability to study gene expression in the brain. Transcriptomics refers to a class of high-throughput methods, such as microarray (gene chip), serial analysis of gene expression, and, more recently, whole-transcriptome sequencing (RNA-Seq), which enable measurement of the abundance of tens of thousands of transcribed RNAs in a given sample. Before the development of these technologies, any given study of gene expression and function in the brain was restricted to targeted assay of a relatively small number of genes. Now it is possible to obtain a more panoramic view of gene expression, and potentially to understand the molecular underpinnings of brain function from the viewpoint of gene networks rather than from a viewpoint dominated by the effects of single genes.
Complicating this enterprise is the fact that the brain is composed of a famously diverse menagerie of cell types, limiting the utility of data obtained from tissue homogenates. For a given experimental condition, gene expression changes occurring in rare cell types may go undetected, as these cells contribute only a small fraction of the total tissue RNA. Additionally, genes may be regulated in opposing directions in different cell types, thereby appearing static in composite data. Attempts at deconvolving tissue expression data into independent contributions from constituent cell types are computationally challenging and the results are uncertain. Hence, directly measuring the transcriptomes of specific cell types is crucial to understanding the intracellular gene networks that underlie cellular phenotypes.
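The two masking effects described above can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the signal measured in a homogenate is a fraction-weighted average of each cell type's expression level; the cell type names, RNA fractions, and expression values are invented and are not taken from any dataset.

```python
# Minimal sketch: a homogenate modeled as a weighted average of cell types.
# All names and numbers below are hypothetical.

def bulk_expression(fractions, expression):
    """Fraction-weighted average expression across the cell types in a tissue."""
    return sum(fractions[cell] * expression[cell] for cell in fractions)

# Case 1: a gene doubles, but only in a cell type contributing 1% of the RNA.
rare_fractions = {"common_a": 0.60, "common_b": 0.39, "rare": 0.01}
rare_control = {"common_a": 100.0, "common_b": 100.0, "rare": 100.0}
rare_treated = {"common_a": 100.0, "common_b": 100.0, "rare": 200.0}
rare_fold = (bulk_expression(rare_fractions, rare_treated)
             / bulk_expression(rare_fractions, rare_control))

# Case 2: a gene goes up in one cell type and down in another by the same amount.
opp_fractions = {"type_x": 0.5, "type_y": 0.5}
opp_control = {"type_x": 100.0, "type_y": 100.0}
opp_treated = {"type_x": 150.0, "type_y": 50.0}
opp_fold = (bulk_expression(opp_fractions, opp_treated)
            / bulk_expression(opp_fractions, opp_control))

print(f"rare cell type, twofold change -> bulk fold change {rare_fold:.2f}")
print(f"opposing changes               -> bulk fold change {opp_fold:.2f}")
```

Under these assumed numbers, a twofold change confined to the rare population shifts the bulk signal by about 1%, and the opposing changes cancel entirely, which is why both would be difficult to detect in homogenate data.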
Cell type-specific transcriptomics requires completion of four separable tasks. First, the cell type of interest must be identified and (typically) labeled; second, RNA from the targeted cells must be extricated from that in surrounding cell types; third, because the resulting RNA is typically low in abundance, it must be amplified; and fourth, the isolated sequences must be identified through sequencing or hybridization (i.e., microarray) methods. Recent reviews have focused on the problems of cell identification (Miyoshi and Fishell, 2006; Kuhlman and Huang, 2008; Madisen et al., 2010), and we direct the readers' attention to a number of online resources that provide detailed information about available cell type-specific reporter and cre-driver mouse lines (Table 1). Briefly, cell types are commonly identified by electrophysiological properties using whole-cell patch-clamp recordings, by projection target using retrograde or anterograde tracers, by cell type-specific markers using immunostaining, or through transgenic labeling approaches. The choice of amplification strategies and final readout are interrelated and subject to a number of technical concerns that are outside the scope of this short review. Although the current majority of cell type-specific transcriptomic data comes from microarray studies, RNA-Seq is clearly the next frontier. In particular, RNA-Seq affords a more straightforward ability to study genetic variation, uncover previously uncharacterized transcriptional start and stop sites and splice variants, and discover novel noncoding RNAs (Guttman et al., 2010). However, before performing either microarray or RNA-Seq experiments, the considerable obstacle of tissue heterogeneity must be overcome. Thus, we focus here on methods for extracting cell type-specific RNAs from mammalian brain tissue (although similar approaches have also been applied to nonmammalian species). 
The typical workflow for each of the reviewed methods is depicted in Figure 1, though this representation is by no means exhaustive of all conceivable experimental designs, and in some cases multiple purification methods may be sequentially applied (Cahoy et al., 2008).
Often the preference (if not necessity) for using a given cell type-specific mRNA isolation technique is closely wedded to the means of cellular identification being used. For example, a neuron's electrophysiological properties, characterized in whole-cell patch-clamp recordings from acute slice preparations, are frequently used to classify neural cell types. Historically, there has been considerable interest in reducing a neuron's electrophysiological properties to the specific complement of ion channels it expresses, and in finding genetic signatures that correlate with firing phenotype in general. This led to the development of postrecording, single-cell profiling techniques, in which the cytosol of recorded cells is aspirated through the patch pipette, collected in a buffer, and the mRNA is subsequently isolated (patch/aspirate) (Surmeier et al., 1996; Martina et al., 1998). Given the limited amount of RNA contained in a single cell (5–10 pg), early attempts were only able to profile a small number of candidate genes using reverse-transcription followed by PCR. Over the years, improvements in mRNA amplification and PCR techniques have expanded the scope of the patch/aspirate technique to profiling tens to hundreds of genes using multiplex PCR (Toledo-Rodriguez et al., 2004) and tens of thousands of genes using microarrays and sequencing (Kurimoto and Saitou, 2010; Ozsolak et al., 2010a,b; Subkhankulova et al., 2010). However, due to the small amounts of collected RNA, single-cell methods are generally more prone to producing false negatives (described in more detail below), particularly for low-abundance transcripts, and the profiling results are often less reproducible than transcriptional profiles obtained using pools of cells. The extent to which reduced reproducibility also reflects biological and not just technical variability remains unclear (Raj and van Oudenaarden, 2009; Janes et al., 2010). 
Given these difficulties, much of our current understanding of the genetic diversity of neural cell types has come from studies predominantly using mRNA collected from pooled cells (described below). Moreover, advances in transgenic engineering have made it possible to consistently target electrophysiologically defined cell types (Feng et al., 2000; Oliva et al., 2000; Chattopadhyaya et al., 2004), which offers an alternative strategy for discovering genes underlying electrophysiological phenotypes (Okaty et al., 2009).
Laser-capture microdissection (LCM) and closely related microdissection techniques such as laser microbeam microdissection or laser-directed microdissection (LDM) (Rossner et al., 2006) use a laser to excise cells of interest (identified under a microscope) from mounted thin-tissue sections that have been either fixed or frozen. The methods differ in the type of laser being used and the specific manner in which targeted cells are extracted. In LCM, a low-power infrared laser beam melts selected small (∼7.5 μm) regions of a plastic membrane on the surface of a tissue section that then adhere to the target cell(s) upon cooling. The sheet is then peeled away from the surface of the tissue section, taking with it the adherent cells. An obvious drawback of this method is that it does not allow the user to tailor the laser beam to the particular morphology of any given cell, and thus some cellular domains may be excluded. Additionally, closely apposed off-target cells, such as glia, may become attached to the plastic film, resulting in contamination.
Other laser-based microdissection systems, such as the AS LDM system (Leica), overcome the first limitation by using a much narrower ultraviolet laser (∼0.5 μm), allowing the user to directly trace and cut the outline of the target cell; however, closely apposed glia are often difficult to detect visually, particularly if they are beneath the target cell. Despite these drawbacks, LCM and LDM have greatly facilitated cell type-specific assays of mRNA from human postmortem tissue, as they are ideally suited for obtaining cells from fixed or frozen tissue preparations.
Given that tissue fixation can degrade nucleic acids, and the heightened risk of contamination when extracting cells from intact tissue, cellular dissociation-based methods may be better suited for studies in which live tissue is available. As a first step to performing fluorescence-activated cell sorting (FACS), immunopanning (PAN), and manual cell sorting (Manual), acutely dissected brain tissue is digested in a protease solution containing artificial CSF (ACSF) and in some cases ion channel and receptor blockers that promote the health of dissociated cells (Hempel et al., 2007). In the Manual technique, dissociated cells are deposited in a small Petri dish and viewed under a fluorescence microscope. Fluorescent cells are aspirated using a small glass pipette, washed in ACSF in a series of clean dishes to ensure purity, and finally collected in a cell lysis buffer. RNA collected from even a single manually sorted cell can be amplified using a two-step in vitro transcription amplification method and yield sufficient quantities of labeled RNA for hybridizing to microarrays (≥10 μg). However, comparing the microarray present call (the percentage of probes on the Affymetrix chip that register a signal level above an algorithmically determined threshold for being expressed in a given experiment) with the number of manually sorted cells used as input demonstrates that a minimum of 30 cells is generally required to achieve the highest reproducible sensitivity, indicating that this level of input minimizes false negatives (Fig. 2; note the saturating present call between 30 and 60 cells, though the precise number of cells and the resulting amount of collected RNA and present call depend on the particular cell type). Furthermore, detecting a single nonfluorescent cell out of a population of 30–100 dissociated cells under a microscope is relatively straightforward, ensuring that false positives resulting from contaminating cell types are kept to a minimum. 
Thus, Manual sorting is particularly well suited to profiling rare and low-abundance cell types, as well as more plentiful cell populations. However, a high level of proficiency is generally required on the part of the sorting technician.
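The present call and the notion of a saturating input can be sketched as follows. This is a simplified illustration, not the Affymetrix detection algorithm: it assumes a single fixed signal threshold, whereas the text describes an algorithmically determined one. The function names, the tolerance parameter, and every number are hypothetical placeholders chosen only to show a plateau; none are read from Figure 2.

```python
# Minimal sketch of a present-call calculation and a plateau check.
# Assumption: a probe is "present" if its signal exceeds one fixed threshold.

def present_call(signals, threshold):
    """Percentage of probes whose signal exceeds `threshold`."""
    if not signals:
        raise ValueError("signals must be non-empty")
    present = sum(1 for signal in signals if signal > threshold)
    return 100.0 * present / len(signals)

def smallest_saturating_input(curve, tolerance):
    """Smallest number of input cells whose present call lies within
    `tolerance` percentage points of the highest present call observed.

    `curve` maps number of input cells -> present call (%).
    """
    best = max(curve.values())
    return min(n for n, call in curve.items() if best - call <= tolerance)

# Hypothetical probe signals from one array (arbitrary units).
toy_signals = [12.0, 250.0, 48.0, 900.0, 5.0, 130.0, 75.0, 33.0]
toy_call = present_call(toy_signals, threshold=50.0)

# Hypothetical present calls for increasing numbers of input cells.
toy_curve = {2: 18.0, 8: 31.0, 32: 42.0, 64: 43.0}
toy_plateau = smallest_saturating_input(toy_curve, tolerance=2.0)

print(toy_call, toy_plateau)
```

In practice the plateau would be judged from replicate measurements for the cell type at hand, since the text notes that the required input depends on the particular cell type.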
FACS requires a flow cytometry platform and can typically sort thousands of cells in a relatively short amount of time, provided the proportion of labeled cells is reasonably high (∼1 h with ∼0.1% labeled cells) (Lobo et al., 2006; Cahoy et al., 2008; Marsh et al., 2008). Streams of dissociated cells are ushered into a fluidic channel where a fluorescence detector sorts cells based on user-specified criteria. This is often achieved by forming a droplet that only contains a single cell just after it has passed through the beam of the fluorometer. A charge is then applied to the droplet, depending on whether or not the target spectrum was detected, and the cell-containing droplet is then diverted into the appropriate receptacle. Multiple excitation lasers and fluorescence detectors can be incorporated into a single flow cytometry platform, allowing for simultaneous detection of multiple spectra, and therefore multiple labeled cell markers. This is especially useful given that many cell types can only be identified by combinatorial expression of multiple genes, rather than expression of a single, unique marker gene. Additionally, properties of the light scatter may be used to infer cell size, morphology, and intracellular complexity (granularity); thus, FACS allows for multiparametric analysis across an array of cellular properties. A drawback of FACS is that it is often too stressful for mature neurons (Arlotta et al., 2005; Lobo et al., 2006; Heiman et al., 2008) and special care must be taken to ensure that neurons remain healthy (Lobo et al., 2006). Given that the Manual method has been successfully implemented for acquiring microarray data from numerous cell types from mature mice (Sugino et al., 2006), the high-throughput nature of FACS may be more of an advantage in applications that require a greater number of starting cells than required for microarrays, such as assays of genomic DNA, like chromatin immunoprecipitation.
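The multiparametric sorting decision described above can be sketched as a simple gate applied to each detected cell. This is a conceptual illustration only and does not model any particular cytometer or its software: the `Event` and `Gate` types, the channel names, and all thresholds are hypothetical.

```python
# Minimal sketch of combinatorial gating: keep a cell only if it is positive
# in two fluorescence channels and falls within a light-scatter window.
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    """One detected cell; all fields are hypothetical detector readings."""
    green: float            # signal from a first fluorescent label
    red: float              # signal from a second fluorescent label
    forward_scatter: float  # light scatter, used here as a rough size proxy

@dataclass(frozen=True)
class Gate:
    """User-specified sorting criteria (illustrative thresholds)."""
    green_min: float
    red_min: float
    scatter_min: float
    scatter_max: float

    def accepts(self, event):
        """True only when all three criteria are met at once."""
        return (event.green >= self.green_min
                and event.red >= self.red_min
                and self.scatter_min <= event.forward_scatter <= self.scatter_max)

def sort_events(events, gate):
    """Split events into (collected, discarded) according to the gate."""
    collected, discarded = [], []
    for event in events:
        (collected if gate.accepts(event) else discarded).append(event)
    return collected, discarded

gate = Gate(green_min=500.0, red_min=300.0, scatter_min=100.0, scatter_max=400.0)
events = [
    Event(green=900.0, red=450.0, forward_scatter=250.0),  # double positive, in window
    Event(green=900.0, red=20.0, forward_scatter=250.0),   # green only
    Event(green=30.0, red=450.0, forward_scatter=250.0),   # red only
    Event(green=900.0, red=450.0, forward_scatter=900.0),  # double positive, out of window
]
collected, discarded = sort_events(events, gate)
print(len(collected), len(discarded))
```

Requiring both labels at once mirrors the point that many cell types are defined by combinatorial expression rather than by a single marker; the scatter window stands in for the additional size and granularity criteria.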
Unlike FACS and Manual, PAN does not rely on a fluorescent signal to detect specific cell populations, and therefore may be used to purify unlabeled cell types (Barres et al., 1988, 1992). PAN uses antibodies raised against cell type-specific surface proteins to select various subsets of cells from a heterogeneous cell suspension. Panning plates are first coated with antibodies and the dissociated cells are deposited on the plate for 30 min to 1 h to allow sufficient time for antibodies to bind. The plate is then washed and either the adherent or nonadherent cells (depending on the target population) can then be used for downstream assay. Often, multiple plates with multiple antibodies are necessary to sequentially enrich the target population of cells (Barres et al., 1988, 1992; Cahoy et al., 2008). Thus, PAN can be more time consuming than the other techniques and potentially exposes dissociated cells to more reagents, both of which could in theory induce aberrant transcriptional responses. Another important consideration is that a given cell type is only amenable to immunopanning purification if it can be identified by a unique cell surface antigen, and if a suitable antibody exists for that antigen. Ultimately, the purity of the resulting cell population depends on the specificity of the antibody. Alternatively, an exogenously derived cell surface antigen for which a highly specific antibody exists may be introduced to a particular population of cells through retrograde transport of the injected antigen adsorbed to fluorescent beads (Dugas et al., 2008), or conceivably by introduction of an exogenous epitope through viral transfection or transgenic methods.
Whereas LCM, FACS, Manual, and PAN harvest total RNA from pooled sorted cells, translating ribosome affinity purification (TRAP) immunoprecipitates labeled polysomes directly from tissue homogenates obtained from special transgenic mice. These mice are engineered using bacterial artificial chromosomes (Shizuya et al., 1992; Gong et al., 2003) to target an EGFP-L10a ribosomal transgene to restricted cell populations in the CNS (Doyle et al., 2008; Heiman et al., 2008). Thus, TRAP detects only ribosome-associated mRNA, and therefore only transcripts that are actively being translated, rather than the full population of transcribed RNA. Data obtained from some of these lines appear to be somewhat prone to background contamination and post hoc cleanup of the data must be applied (Dougherty et al., 2010). However, TRAP and another closely related method called RiboTag (Sanz et al., 2009) significantly improve on pre-existing translational profiling techniques, such as polysome fractionation (Darnell et al., 2009), by adding cell type-specificity and offering considerably greater throughput. Detecting the presence of polysomal mRNA for a given gene is a stronger indicator of protein expression than the detection of the mRNA itself, and thus TRAP offers an advantage over other methods in this regard. Conversely, the inability to detect noncoding RNAs using the TRAP method may also be construed as a limitation insofar as noncoding RNAs are critically important for gene regulation. Lastly, TRAP and RiboTag require the use of special mouse lines, whereas the other methods are currently applicable to a wider array of cell types; however, several new TRAP- and RiboTag-compatible mouse lines are in development. 
Importantly, RiboTag mice and the newer generation of TRAP mice incorporate cre-responsive ribosomal transgenes, which may be crossed with cell type-specific cre-driver mice, providing a modular experimental design strategy that will greatly expand the applicability of ribosomal pull-down methods.
Recently, we reanalyzed all of the publicly available mouse brain, cell type-specific microarray data (Affymetrix platforms only) obtained by each of the described methods (with the exception of patch/aspirate) to quantify potential differences in repeatability, contamination, and stress effects (Okaty et al., 2011). We found that all methods demonstrated a comparably high degree of repeatability as measured by the correlation between biological replicate samples (>0.94), but we detected significant differences in the levels of contamination. Using the expression levels of well-established cell type-specific marker genes for GABAergic cells, astrocytes, and oligodendrocytes to calculate contamination indices for non-GABAergic, nonastrocyte, and nonoligodendrocyte cell types profiled by each method, we found that LCM and TRAP samples showed significantly higher levels of contamination than FACS, PAN, and Manual. On average, the effects were greater in LCM samples, whereas contamination of TRAP samples was highly variable, indicating that the level of contamination may vary depending on the cell type and/or transgenic mouse line being used. A summary of these results can be found in Figure 3 (note that using an expanded list of genes selected by unsupervised clustering to calculate contamination indices did not alter the key finding, namely that LCM and TRAP samples showed significantly higher levels of contamination). Although comparisons were largely made between different cell types, in a handful of cases common cell types were profiled by different methods, and in each case LCM or TRAP data showed evidence of higher contamination than data acquired by other methods, suggesting that differential contamination stems from the purification method, rather than the cell type. Also, differences in the translational activities of different transcripts may be reflected in TRAP versus non-TRAP data. 
Expression of noncoding RNAs, for example, was detected in LCM-, Manual-, FACS-, and PAN-purified samples but not in TRAP samples, as expected. We also detected heightened expression of immediate early genes, stress-related genes, and apoptosis genes in some PAN samples; however, the effect was modest. Additional sources of differences observed in the data may be differential sensitivities between methods or more idiosyncratic differences deriving from other experimental conditions rather than the purification methods per se.
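The two summary measures used in this reanalysis, replicate correlation and a marker-based contamination index, can be sketched as below. The exact index definition is given in Okaty et al. (2011) and is not reproduced here. The version shown is an assumption made for illustration: the mean expression of off-target marker genes in the sample, divided by their mean expression in a reference sample of the cell type they mark. The use of Pearson correlation is likewise an assumption, and the gene names and values are placeholders.

```python
# Minimal sketch: replicate correlation and a hypothetical contamination index.
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)  # undefined for constant profiles

def contamination_index(sample, reference, markers):
    """Mean marker expression in `sample` relative to a `reference` sample
    of the cell type that the markers identify (assumed definition)."""
    sample_mean = sum(sample[gene] for gene in markers) / len(markers)
    reference_mean = sum(reference[gene] for gene in markers) / len(markers)
    return sample_mean / reference_mean

# Placeholder astrocyte markers measured in a nonastrocyte sample and in
# a purified astrocyte reference (arbitrary units, invented values).
markers = ["astro_marker_1", "astro_marker_2"]
neuron_sample = {"astro_marker_1": 50.0, "astro_marker_2": 30.0}
astrocyte_reference = {"astro_marker_1": 1000.0, "astro_marker_2": 600.0}
toy_index = contamination_index(neuron_sample, astrocyte_reference, markers)

# Two invented replicate profiles that are perfectly proportional.
toy_correlation = pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])

print(round(toy_index, 3), round(toy_correlation, 3))
```

An index near zero would indicate little off-target marker signal, and values approaching one would indicate marker levels comparable to those of the contaminating cell type itself.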
Ultimately, the suitability of each method for use in a given study is a function of the particular goals of the study, the cell type(s) of interest, and the availability of resources, such as equipment and/or mouse lines (Tables 1, 2). For instance, all of the discussed methods can be used to detect enrichment of strong marker genes for a given cell type. However, comparison between different conditions for the same cell type (such as a disease state, activity deprivation, or other perturbations), in which expression differences are generally smaller than between-cell type differences, requires a method achieving high purity to ensure that contamination will not skew the results. Additionally, not all methods are equally compatible with all cell populations. Cells identified by retrograde or anterograde tracer injection or by viral transfection of a fluorescent reporter construct are more readily collected by LCM, FACS, PAN, or Manual, and only LCM and Manual are well suited to small, sparsely labeled populations. However, for assays requiring greater amounts of genetic material, FACS, PAN, and TRAP may be more appropriate. Regardless of the sorting method used, cell type-specific transcriptomic data are a valuable resource to the neuroscience community.
Footnotes
Editor's Note: Toolboxes are intended to describe and evaluate methods that are becoming widely relevant to the neuroscience community or to provide a critical analysis of established techniques. For more information, see http://www.jneurosci.org/misc/ifa_minireviews.dtl.
This work was supported by funding from the National Institute of Mental Health.
The authors declare no competing financial interests.
Correspondence should be addressed to Sacha B. Nelson, MS008, Brandeis University, 415 South Street, Waltham, MA 02454. nelson{at}brandeis.edu