Abstract
CNS axons differ in diameter (d) by nearly 100-fold (∼0.1–10 μm); therefore, they differ in cross-sectional area (d²) and volume by nearly 10,000-fold. If, as found for optic nerve, mitochondrial volume fraction is constant with axon diameter, energy capacity would rise with axon volume, also as d². We asked, given constraints on space and energy, what functional requirements set an axon's diameter? Surveying 16 fiber groups spanning nearly the full range of diameters in five species (guinea pig, rat, monkey, locust, octopus), we found the following: (1) thin axons are most numerous; (2) mean firing frequencies, estimated for nine of the identified axon classes, are low for thin fibers and high for thick ones, ranging from ∼1 to >100 Hz; (3) a tract's distribution of fiber diameters, whether narrow or broad, and whether symmetric or skewed, reflects heterogeneity of information rates conveyed by its individual fibers; and (4) mitochondrial volume/axon length rises ≥d². To explain the pressure toward thin diameters, we note an established law of diminishing returns: an axon, to double its information rate, must more than double its firing rate. Since diameter is apparently linear with firing rate, doubling information rate would more than quadruple an axon's volume and energy use. Thicker axons may be needed to encode features that cannot be efficiently decoded if their information is spread over several low-rate channels. Thus, information rate may be the main variable that sets axon caliber, with axons constrained to deliver information at the lowest acceptable rate.
Introduction
Axon calibers in central tracts are diverse: some are thin, others thick, but often mixed (Peters et al., 1991). The distributions of caliber are also diverse: some are symmetrical about the mean, whereas others show a striking skew. Calibers vary across tracts nearly 100-fold, implying a 10,000-fold range of cross-sectional area and volume.
Mitochondrial distributions in tracts are also diverse (Kageyama and Wong-Riley, 1984, 1986). Some tracts appear nearly devoid of mitochondria, whereas others are well endowed. Optic axons express a mitochondrial volume fraction that, above a threshold, is constant with axon diameter (d), so energy capacity and volume rise as d² (Perge et al., 2009). These quadratic dependencies of space and energy capacity emphasize that whatever variable requires a larger caliber, that variable is expensive and should deliver something valuable in exchange.
The standard idea is that thicker axons reduce conduction time. This is certainly true where the distance is great and/or the time needs to be short, such as in feedback loops for skeletal muscle, in “escape” neurons of fish and invertebrates (Wiersma, 1947; Roeder, 1948; Furshpan and Furukawa, 1962; Young and Keynes, 2005), and in interhemispheric axons in very large animals (Wang et al., 2008). But where distance is short, the importance of higher velocity diminishes, especially compared with other variables. For example, across the entire range of fiber diameters in the optic nerve, conduction times differ by just a few milliseconds, less than the typical spike jitter; whereas across light levels, response time of the retinal circuit varies by 50 ms (B. Borghuis, personal communication). Moreover, although this tract differs by 10-fold in length across species, the axon caliber distributions are nearly identical (Perge et al., 2009). Thus, small differences in conduction time cannot generally explain diversity of axon caliber.
What then would explain this diversity? It has been shown as a corollary to Shannon's formulas for information that any information channel using a discrete alphabet (e.g., spikes) to double its information rate must more than double its signaling rate (Balasubramanian et al., 2001). This law of diminishing returns operates in optic axons where the distribution of mean firing rates is matched by the distribution of axon diameters (Perge et al., 2009). Thus, to double its information rate, an optic axon must more than quadruple its space and energy costs. This apparently constrains optic axons to fire at the lowest rates acceptable to their downstream targets (Niven and Laughlin, 2008).
In view of the established law of diminishing returns and our specific findings in optic nerve, we hypothesize that the main determinant of axon caliber in central tracts is selective pressure to lower costs by minimizing information rates. To explore this, we compared fiber groupings of disparate function, length, and energy capacity. Where natural firing rates or relative information rates were known, they were compared with fiber caliber and mitochondrial content. These examples are consistent with the idea that axons allocate space and energy capacity according to their mean information rates. Where data are incomplete, the hypothesis makes specific, testable predictions.
Materials and Methods
Electron microscopy.
Mammalian brain tissue had been prepared for previous studies from five adult male guinea pigs (Cavia porcellus) (400–500 g) and an adult rhesus monkey (Macaca fascicularis). Each animal was anesthetized with ketamine (100 mg/kg), xylazine (20 mg/kg), and pentobarbital (50 mg/kg), and then perfused with 2% paraformaldehyde plus 2% glutaraldehyde in 0.1 M phosphate buffer. Tissue was stored overnight at 4°C. The tracts in guinea pig were then identified using a stereotaxic atlas, dissected, osmicated, soaked in uranyl acetate, and prepared for electron microscopy as previously described (Perge et al., 2009). Electron micrographs were taken at 5000 times magnification and digitized.
Rat (Rattus norvegicus) cerebellum and auditory nerve were prepared similarly. In short, four adult Sprague Dawley rats were anesthetized with sodium pentobarbital (60 mg/kg) and perfused through the ascending aorta with 200 ml of Ringer's variant at 37°C, followed without interruption by 2% freshly prepared formaldehyde and 1% glutaraldehyde in 0.12 M phosphate buffer at 37°C. One hour after perfusion/fixation, the cerebellum was removed and sectioned at 200 μm on a vibrating blade microtome, whereas the eighth cranial nerve was microdissected under an operating microscope and processed further in one piece, without separating the vestibular and cochlear nerve roots. Tissues were osmicated and prepared for electron microscopy as above.
Insect tissue was prepared from locust (Schistocerca gregaria). The nerve cord was exposed and dissected free with <2.5% glutaraldehyde in a 0.13 M phosphate buffer, pH 7.3. Samples were removed and further fixed for several hours, then washed in buffer, stained in 1% osmium tetroxide for an hour, washed in buffer, and then in distilled water. Tissue was dehydrated through ethanol to propylene oxide and embedded in Araldite. Cross sections were taken through the cord at ∼90 nm thickness and stained with uranyl acetate and lead citrate. Electron micrographs were taken at ∼3300 times magnification and calibrated with a grooved calibration plate.
Analysis.
Axons were identified and measured in cross section by custom software written in MATLAB. Corrections of misidentified axons (∼5% of the profiles) were done manually using Adobe Photoshop 8 or Canvas X. Because axon profiles are not perfectly circular, we measured the area of each profile and then calculated the diameter of the circle with equivalent area. Skewness was calculated as the third central moment of the measured diameters divided by the cube of the SD.
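The equivalent-circle diameter and the skewness defined above can be illustrated with a minimal sketch. This is not the custom MATLAB software used for the measurements; the function names and the sample areas are hypothetical, and the divide-by-n normalization of the moments is an assumption, since the text does not state which normalization was applied.

```python
import math

def equivalent_diameter(area):
    """Diameter of the circle whose area equals the measured profile area."""
    return 2.0 * math.sqrt(area / math.pi)

def summary_stats(diameters):
    """Mean, SD, coefficient of variation, and skewness of a diameter sample."""
    n = len(diameters)
    mean = sum(diameters) / n
    # Population (divide-by-n) moments: an assumption, not stated in the text.
    variance = sum((d - mean) ** 2 for d in diameters) / n
    sd = math.sqrt(variance)
    third_moment = sum((d - mean) ** 3 for d in diameters) / n
    # Skewness: third central moment divided by the cube of the SD.
    return {"mean": mean, "sd": sd, "cv": sd / mean, "skew": third_moment / sd ** 3}

# Hypothetical profile areas in square micrometres (illustrative, not measured data).
areas_um2 = [0.10, 0.12, 0.15, 0.20, 0.50]
diameters_um = [equivalent_diameter(a) for a in areas_um2]
stats = summary_stats(diameters_um)
print(stats)
```

A sample with a few large profiles, as in this toy list, yields a positive skew, whereas a sample symmetric about its mean yields a skew of zero.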
Mitochondrial profiles within axonal cross sections were identified manually. The mitochondrial volume fraction was used as a measure of energy capacity, and was found to correlate strongly with capillary density (Borowsky and Collins, 1989; Weibel, 2000).
Information.
“Information,” as used here, is a quantity that broadly limits the quality of decisions based on observations of a signal. The formula for information provides a summary statistic that distills communication power from three key aspects of a signal: its signaling range (mean firing rate for spike trains), how well the signaling range is exploited (presence or absence of correlations in spike trains), and noise in the signal (variability in spike timing and number). For neurons firing action potentials at a rate R, the maximum amount of information that could be conveyed per unit time in the absence of noise and spike correlations (i.e., the “capacity” in bits/second) is C(R, δt) = −[R δt log₂(R δt) + (1 − R δt) log₂(1 − R δt)]/δt, where R is the firing rate and δt is the time bin used to compute information (typically, the refractory period) (Koch et al., 2006). In fact, noise (measured from the variability of responses to repeated stimulus presentations) and correlations (measured from responses to long movies) reduce the information rate from this theoretical maximum. It was shown that in guinea pig retina, there is an orderly relation between firing rate and information rate I(R) = αC(R, δt), where α is a fixed constant for all retinal ganglion cell types and all stimulus classes, and C is the capacity at firing rate R (Koch et al., 2006). We will assume that some orderly relation of this kind persists between information and firing rate in other species and tracts as well, although the constant α or precise functional form might be different in each case. Broadly, although most of our results are phrased in terms of firing rates, we will understand that higher firing rates R imply higher information rates I, but that I increases sublinearly with R. This is expected to be true on general mathematical grounds (Balasubramanian et al., 2001).
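The sublinear growth of capacity with firing rate can be checked numerically. The sketch below evaluates the capacity formula given above; the function name and the 5 ms time bin are illustrative assumptions, not values taken from the measurements.

```python
import math

def capacity_bits_per_s(rate_hz, dt_s):
    """Noise-free capacity C(R, dt) of a spike/no-spike channel, in bits/s."""
    p = rate_hz * dt_s  # probability of a spike in one time bin
    if not 0.0 < p < 1.0:
        raise ValueError("R * dt must lie strictly between 0 and 1")
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p)) / dt_s

DT = 0.005  # assumed 5 ms time bin (illustrative only)

for rate in (1, 2, 4, 8, 16, 32, 64):
    c1 = capacity_bits_per_s(rate, DT)
    c2 = capacity_bits_per_s(2 * rate, DT)
    # Doubling the firing rate always yields less than twice the capacity.
    print(f"R = {rate:3d} Hz  C = {c1:7.2f} bits/s  C(2R)/C(R) = {c2 / c1:.2f}")
```

Because the binary entropy function is concave and vanishes at zero, the ratio C(2R)/C(R) stays below 2 at every rate, which is the law of diminishing returns invoked in the text.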
Results
We considered several categories of tract, as follows: (1) unmyelinated axons with passive conduction; (2) unmyelinated axons with action potentials; (3) myelinated tracts with action potentials; and (4) unmyelinated invertebrate (insect and mollusk) axons with action potentials. For each tract, we measured axon diameters (d), obtaining their distribution, the mean (μ), the coefficient of variation (cv), and the skewness [cv = σ/μ, where σ = SD, and skewness = E[((d − μ)/σ)³], where E[x] denotes the expected value of x].
We also measured the distribution of mitochondria (volume fraction). This measure represents energy capacity, but it probably also reflects energy use because what is not used tends to be pared away (Diamond, 2002; Sterling and Freed, 2007). Moreover, as we shall note, this measure correlated with capillary density, further supporting the equivalence between capacity and use. Most observations are based on our own material, but some are drawn from published work. We also collected from published work the available data on natural spike rates, information rates, and numbers of output synapses, which correlate with axon diameter and relative information rates.
Several patterns emerged relating axon diameter to firing rate, information rate, and energy capacity, supporting a broad hypothesis that information rate sets axon caliber, and that information, being expensive, drives the architecture of nerve tracts toward small axon diameters. Here we discuss these patterns tract by tract.
Unmyelinated axons with graded potentials
Rod
Rod axons are quite thin (mean, 0.41 μm; Fig. 1). Their distribution of diameters is narrow (SD, 0.037 μm) with a small coefficient of variation (0.09). The distribution is largely symmetrical about the mean, with a small positive skew (0.75). The axon cross section accommodates ∼35 microtubules, which supply a single ribbon-type active zone. These structural features are strongly conserved across mammalian species (Sterling, 2004) and serve the rod's irreducibly simple signal, as we now explain.
The rod outer segment over the lowest 3 log units of environmental intensity (“starlight”) spends long intervals—minutes—waiting for a photon. During this period, the active zone releases transmitter tonically. When finally a photon is captured, the event is amplified under tight biochemical regulation to produce a small, stereotyped hyperpolarization (Sampath and Rieke, 2004) that spreads passively down the axon, causing a brief suppression of tonic release, which is detected as an event. Thus, over a sequence of 200 ms integration times, a rod essentially transmits a binary stream (Rao et al., 1994).
This binary message stream is irreducibly simple and involves a low rate of inexpensive events (vesicles are far cheaper than spikes) that conveys small amounts of information. This message, according to our hypothesis, should be transmitted with minimal resources. Indeed, it is conveyed by a single active zone served by a very thin axon (Fig. 1A). Likewise, the dedicated relay for this binary signal, the rod bipolar cell, also has the thinnest axon and the fewest ribbon outputs among the retina's 10 bipolar types (Sterling and Freed, 2007). The rod's message, always “no” or “yes,” and the information rate are both invariant between rods. Thus, the hypothesis predicts invariant rod axon calibers, and indeed the distribution of rod axon diameters is narrowly tuned around the mean (Fig. 1B).
At higher light intensities (twilight and daylight), the rod outer segment transduces many photons (up to 10⁵ photons/integration time) (Yin et al., 2006). Under these conditions, the rod axon conducts a graded voltage of greater amplitude and higher temporal frequency. However, this information-rich signal is not conveyed primarily by the rod's single, high-gain synapse but instead is routed via gap junctions to the terminals of cone photoreceptors for transmission through their synapses (Smith et al., 1986). Thus, although the rod produces a richer voltage signal at higher light intensities, its axon needs to support its single active zone mainly for binary signaling at low light intensities.
Cone
The cone axon is thicker (mean, 1.32 μm; Fig. 1A). Again, the distribution of diameters is narrow (SD, 0.11 μm; coefficient of variation, 0.08) and fairly symmetrical about the mean, but with a small skew toward thinner axons (−0.54; Fig. 1B,C). The primate cone axon cross section accommodates ∼440 microtubules that supply 20 ribbon-type active zones (Hsu et al., 1998). These features are also conserved across mammals, with modest quantitative differences across the retina and across species (Sterling, 2004).
Serving the daylight regime (>10⁴ photons/integration time), the cone transduces sufficient photons to generate a finely graded voltage signal. The latter spreads passively down the axon to the terminal to sum with the rod signals. Thus, the cone terminal, integrating the signal from its own outer segment and signals from ≥20 rods, operates at higher signal-to-noise ratios (SNRs) and higher temporal frequencies. The higher SNR and bandwidth contribute to a higher information rate (Shannon and Weaver, 1949). Thus, compared with the rod terminal, the cone terminal needs to transfer information at a higher rate.
Correspondingly, the cone terminal has ∼20-fold more active zones that support a much higher rate of tonic vesicle release (DeVries et al., 2006; Borghuis et al., 2009). Although direct measurements of information rates under natural conditions are not available, an estimated 10- to 20-fold greater release rate and higher frequency response will lead to a greater information rate because the SNR and bandwidth of cone vesicle release are higher than those of rods (Sterling and Freed, 2007). This predicts a thicker cone axon, and indeed the ∼10- to 20-fold higher release rate in cone versus rod is supported by a ∼10-fold greater cross-sectional area (calculated from the ratio of cone vs rod mean axon diameters). Because the rod and cone have similar densities of microtubules (Hsu et al., 1998), the cone probably harbors a ∼10-fold larger array of microtubules for transport. This supports the hypothesis that photoreceptor axons match diameter to information rate.
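The area comparison can be reproduced from the mean diameters and microtubule counts quoted for rod and cone axons. The following is a minimal arithmetic sketch; the variable names are illustrative, and treating each axon as a circle of the mean diameter is an assumption.

```python
import math

# Mean axon diameters (micrometres) and approximate microtubule counts
# per cross section, as quoted in the text for rod and cone axons.
rod_d_um, cone_d_um = 0.41, 1.32
rod_microtubules, cone_microtubules = 35, 440

def cross_section_area(diameter):
    """Area of a circular cross section of the given diameter."""
    return math.pi * diameter ** 2 / 4.0

area_ratio = cross_section_area(cone_d_um) / cross_section_area(rod_d_um)
microtubule_ratio = cone_microtubules / rod_microtubules

print(f"cone/rod cross-sectional area ratio: {area_ratio:.1f}")
print(f"cone/rod microtubule count ratio:    {microtubule_ratio:.1f}")
```

Both ratios come out on the order of 10, in line with similar microtubule densities in the two axon types.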
The observed negative skew of cone axons (toward thinner diameters) is small, but significant when compared with the positive skew of rod axon diameters. Why this difference in direction of skew? Note that cones sensitive to middle (M) and long (L) wavelengths capture information at similar rates and constitute 95% of the population (Garrigan et al., 2010). According to the hypothesis, their axon diameters should distribute narrowly and with little skew. But cones sensitive to short (S) wavelengths (5% of the population) capture information at lower rates, and should therefore transmit with fewer quanta (Garrigan et al., 2010). Possibly reflecting this, the S terminal is smaller than the M/L terminals (Esfahani et al., 1993). If the S-cone axon is correspondingly finer than the M/L cone axons, this could explain the extended tail of the distribution toward finer axons.
Last, we noticed that rod and cone axons are nearly devoid of mitochondria (Fig. 1A). Photoreceptor axons conduct passively; therefore, the ionic gradient for the membrane potential, which is established at the inner segment with energy from its dense aggregation of mitochondria, is used to charge/discharge the axonal capacitance. However, there is no further discharging of ionic batteries and therefore no further energy cost along the axon. Thus, while rod and cone axons incur different costs in space, they seem to have the same energetic cost: zero. The cost of synaptic transmission is nevertheless higher for a cone because of its greater rate of information transfer via a greater rate of vesicle release.
Retinal bipolar cells
The axons of bipolar cells also conduct graded signals passively. Earlier studies on five types of bipolar neuron showed that axon cross-sectional area is proportional to the number of ribbon-type active zones (cone bipolar types b1 > b2 > b3 > b4 > rod bipolar) (Sterling and Freed, 2007; Freed and Liang, 2010), and the number of active zones correlates with information rate at the bipolar cell output (Freed and Liang, 2010). Thus, here too axon diameter increases with information rate.
In overall summary, where members of a given cell type carry similar information rates, their axon diameters cluster narrowly about the mean (rod, cone). Where one type carries a higher rate, its axon is thicker (cone ≫ rod; M/L > S predicted) and its synaptic terminal is larger (cone ≫ rod; M/L > S). The absence of mitochondria in rod and cone axons also suggests that when conduction along an axon is passive and does not require voltage-gated sodium currents, energy expenditure is lower and energy capacity (mitochondrial volume fraction) is smaller.
Spiking axons: unmyelinated
Granule cells
The cerebellar granule cell produces the brain's finest axon, the parallel fiber. Its mean diameter, measured near the pial surface in the molecular layer, was 0.16 μm, comparable to the findings of Wyatt et al. (2005) (Fig. 2). This caliber approaches the lower limit set by noise due to spontaneous opening of voltage-gated sodium channels (Faisal et al., 2005; Faisal and Laughlin, 2007). That this axon is irreducibly fine is fortunate for vertebrate brain design because it is the most numerous of all axon types. The parallel fiber, being both unmyelinated and extremely fine, is the slowest conducting axon (∼0.25 m/s) (Vranesic et al., 1994). However, upon reaching its assigned depth in the molecular layer, the axon bifurcates as a T, so the conduction distance to its final termination is relatively short, ∼2.5 mm in rat, with a conduction time of ∼10 ms (Napper and Harvey, 1988).
Parallel fibers decrease in thickness from the base of the molecular layer to the pial surface (Wyatt et al., 2005), but locally the diameters distribute narrowly: the SD, 0.04 μm, was the smallest among all the tracts that we studied. The coefficient of variation was also small (0.27), but this was threefold greater than for photoreceptors. The broad hypothesis predicts that parallel fibers, being thin, should fire at low rates, and having a small coefficient of variation, should have a rather stereotyped input, though less so than photoreceptors.
Indeed, granule cells do express structural stereotypy at the input: four short dendrites that each receive multiple contacts from a “mossy” fiber. In rat, a mossy fiber fires at a mean rate of ∼40 Hz (Maex and De Schutter, 1998), so the granule cell is excited by glutamatergic input at ∼160 spikes per second. Integrating these inputs, the granule cell output is controlled by strong inhibitory circuits, so it is mostly silent except for brief, high-frequency bursts that give a low mean rate, <0.5 Hz (Ruigrok et al., 2011). Thus, in accord with the hypothesis, the firing rate is low and the circuit motif seems to be strongly conserved, suggesting that granule cells are functionally stereotyped to process information at similar rates.
Within the narrow range of parallel fiber diameters, we also found a significant positive skew (1.4; Fig. 2D). This probably originates from two factors unrelated to the intrinsic distribution of axon diameters. First, at such a small axon caliber (0.18 μm), the presence of a single mitochondrion (∼0.2 μm) could—like a pig in a boa constrictor—double the parallel fiber's diameter. Excluding such profiles decreased the skewness by 35% (skew, 0.9). Second, a fiber sectioned near a varicosity could exhibit a spurious increase in diameter. Consequently, we excluded axon profiles containing synaptic vesicles (indicators of a nearby varicosity).
Olfactory receptors
Olfactory receptor axons in guinea pig are thin (mean, 0.28 μm; Fig. 2B). Some of these axons are regenerating and so are thinner than mature axons (Whitman and Greer, 2009). But even this admixture is nearly twice the thickness of parallel fibers (Fig. 2D,E). Olfactory axon diameters distribute quite narrowly (SD, 0.07 μm, the second narrowest of the tracts studied; the coefficient of variation is 0.26, similar to parallel fibers). In contrast to rod axons of similar diameter, which distribute symmetrically, the olfactory axons distribute with a marked tail toward larger calibers (Fig. 2D). This is reflected quantitatively as a strong, positive skew (1.92; Fig. 2E).
We note first that, assuming mature olfactory fibers are ∼0.35 μm, our hypothesis predicts low mean firing rates. Indeed recordings from various olfactory receptors in rat show very low spontaneous rates, ∼1 Hz or less, and odorant responses up to tens of spikes per second for brief periods followed by adaptation (Duchamp-Viret et al., 1999, 2000; Rospars et al., 2008; Savigner et al., 2009); recordings from mouse olfactory receptors yield mean rates of ∼3 Hz (M. Ma, personal communication). Thus averaged over minutes, the mean rates are probably as predicted from the diameters. Consistent with the prediction of low firing rates, olfactory axons predominantly express the NaV1.7 isoform of the voltage-sensitive sodium channel (Ahn et al., 2011), which recovers slowly and limits firing rate in fine axons of peripheral nerve that serve nociception (Rush et al., 2007).
But how can the hypothesis that axon caliber matches information rate explain both the symmetric distribution of rod axons and also the asymmetric distribution of olfactory axons? Olfactory axons approach the lower limit of diameter set by channel noise, so they cannot be much thinner; however, they can become thicker and thus skewed toward larger diameters. But the skew for olfactory axons is the second largest among the surveyed mammalian tracts, and we considered possible reasons.
Each olfactory receptor neuron expresses one particular G-protein-coupled receptor (GPCR), which defines its activation by natural odorants. The population of olfactory receptor neurons expresses ∼1000 different GPCRs, and in nature these likely differ in their frequency and intensity of activation (Nara et al., 2011). Even olfactory neurons expressing the same GPCR and projecting to the same glomerulus have different sensitivities to the same odorant. Thus, we expect olfactory axons to fire at a range of mean rates, and they do (Duchamp-Viret et al., 1999; Rospars et al., 2008; Savigner et al., 2009). Correspondingly, they should use a range of axon calibers as we find. But what generates the skew? For various senses, the distribution of stimulus intensities is highly skewed with a low peak and a long tail. For example, the distributions of light intensities and sound intensities in natural stimuli are skewed (Richards, 1982; Lewicki, 2002), and the huge diversity in afferent sensory axons in the somatosensory system suggests a similar skew (Kandel et al., 2000). Our broad hypothesis applied to the distribution of olfactory axon diameters predicts that, in natural conditions, the distribution of odorant intensities and thus the distribution of mean receptor response rates will be skewed.
Fornix: unmyelinated component
The fornix, a compact tract connecting the hippocampus to hypothalamus and other brainstem structures, comprises mostly myelinated fibers (Fig. 3A). However, it also contains a number of unmyelinated axons, which we measured separately. Their diameter distribution matched remarkably well that of the olfactory receptors (Fig. 2D) with similar mean (0.3 μm), SD (0.078 μm), coefficient of variation (0.26), and skew (1.35). We found no data on firing rates of these fibers, but predict that when such measurements are made the firing rate distribution will match that of olfactory receptors.
Ganglion cells
Ganglion cell axons are unmyelinated within the retina and myelinated within the optic nerve, but in both locations axon caliber distributes nearly identically (Perge et al., 2009). The axons are medium caliber (mean, 0.64 μm). The distribution spans a fairly broad range (SD, 0.29 μm; coefficient of variation, 0.46) and is strongly skewed (2.7; Fig. 3D,E). Thus, the broad hypothesis predicts a middle range of mean firing rates. Indeed, ganglion cells are known to comprise many types that carry different information rates, with an overall mean of ∼10 Hz (Koch et al., 2006). Moreover, the distribution of firing rates is known to match the distribution of fiber diameters (Perge et al., 2009), and it is this correspondence that led to the present broad hypothesis.
Energy capacity of unmyelinated axons
Parallel fibers, which are thinnest and have the lowest firing rates, devote ∼2.6% of their volume to mitochondria. The somewhat thicker olfactory receptor axons and unmyelinated fornix axons devote, respectively, ∼6.4% and ∼7.8% of their volumes to mitochondria (Fig. 2B; see Fig. 8B). The ratio of fiber calibers (olfactory and fornix vs parallel) is 1.75, and the ratio of their mitochondrial volume fractions is 2.7. So, unmyelinated fibers may increase their mitochondrial volume fractions supralinearly with diameter. This disproportionate cost of increasing d adds to the increase in energy capacity that goes as d². Thus, unmyelinated axons require disproportionately more energy capacity than their myelinated versions at the same diameter. This is certainly one good reason to myelinate.
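The scaling implied by these two ratios can be estimated with a rough two-point calculation. The sketch below assumes a power-law relation between mitochondrial volume fraction and diameter and averages the olfactory and fornix fractions; both are assumptions made for illustration, and the resulting exponent is not a fitted value from the data.

```python
import math

# Mitochondrial volume fractions quoted in the text.
parallel_fraction = 0.026
olfactory_fraction, fornix_fraction = 0.064, 0.078
diameter_ratio = 1.75  # olfactory and fornix vs parallel fibers, as quoted

# Assumption: average the two thicker fiber groups before taking the ratio.
thicker_fraction = (olfactory_fraction + fornix_fraction) / 2.0
fraction_ratio = thicker_fraction / parallel_fraction

# Assumption: volume fraction scales as d**k; solve for k from the two ratios.
k = math.log(fraction_ratio) / math.log(diameter_ratio)

# Mitochondrial volume per unit length = fraction * cross-sectional area,
# so under this assumption it would scale roughly as d**(2 + k).
print(f"fraction ratio = {fraction_ratio:.2f}, k = {k:.2f}, total exponent = {2 + k:.2f}")
```

An exponent k greater than 1 is what "supralinearly" means here; with only two points, the estimate indicates a trend rather than a measured scaling law.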
The unmyelinated axon segments of retinal ganglion cells are thicker than those in the olfactory nerve or fornix and still have a lower mitochondrial volume fraction (3.5%; Fig. 2C, see Fig. 8B). This is puzzling. However, we note that parallel fibers and olfactory axons pack closely, often in direct contact and with sparse glial wrapping (Fig. 2A,B). In contrast, the ganglion cell axons are surrounded by Muller glia (Fig. 2C). The glial cytoplasm is dark due to a high concentration of glycogen, suggesting that the metabolic needs of retinal axons may be partially satisfied by Muller cells (Perge et al., 2009). Possibly then, olfactory and fornix axons, lacking an auxiliary glycolytic energy source, require higher internal concentrations of mitochondria.
Spiking axons: myelinated
We compared several other distinct myelinated tracts in guinea pig, selecting arbitrarily (1) the fornix, a short tract connecting hippocampus to hypothalamus and brainstem (∼5 mm); (2) the pyramidal tract, a long corticospinal pathway (≥23 mm); and (3) the optic nerve, intermediate in length (17 mm).
To our surprise, the distributions of fiber diameter for these three tracts essentially superimpose (Fig. 3). Thus, they all peak near 0.7 μm and all have similar means (0.81 μm, 0.9 μm, and 0.88 μm), similar SDs (0.28 μm, 0.36 μm, and 0.32 μm), similar coefficients of variation (0.35, 0.4, and 0.36), and similar skew (1.8, 1.6, and 1.48). The quantitative similarities seem remarkable given the tracts' vastly different functions and nearly fivefold range of conduction distances.
Regarding firing rates, nothing (to our knowledge) is known for the fornix. However, the pyramidal tract is known to be functionally diverse, with thicker axons firing at higher rates (Evarts, 1965; Armstrong and Drew, 1984). The thicker axons tend to fire phasically (i.e., in brief bursts), whereas finer axons fire tonically. Spike rates during active movement in monkey are reported to be ∼12 Hz for thick fibers and ∼3 Hz for finer fibers with a range of spike frequencies <18 Hz (Evarts, 1965). Pyramidal tract axon diameters across species do not scale linearly with tract length. For instance, pyramidal axons in the cow appear smaller on average than those in the human despite the much longer conduction distance (Lassek and Rasmussen, 1940). However, the distributions are always skewed, the most common axons being small and with a long tail toward thicker fibers (Häggqvist, 1937; Lassek, 1942).
The pyramidal tract responses approximately resemble optic nerve responses in that cell types with thick axons fire phasically and types with medium axons fire tonically. The pyramidal tract rates during natural movements thus resemble the ganglion cell rates to natural movies (Koch et al., 2006). Finally, the distribution of spike rates for optic fibers is known to be broad and skewed (Koch et al., 2006), matching the distribution of diameters (Perge et al., 2009). The estimated mean firing rate from the diameter distribution is 10 Hz.
Thicker fibers and higher rates: the foliar tract in cerebellum
The white matter beneath each folium in cerebellar cortex comprises a mixed tract containing three types of myelinated axon whose proportions were quantified (Palkovits et al., 1972). Four-sixths are afferent “mossy fibers” that deliver diverse inputs to the granule cells. Their firing rates are not known in detail, but modeling studies suggest a mean rate of ∼40 Hz (Maex and De Schutter, 1998). One-sixth are afferent “climbing fibers” that originate in the inferior olive, each providing multiple contacts to a single Purkinje cell dendritic arbor. These axons fire at ∼1 Hz (Lang and Rosenbluth, 2003). One-sixth are efferent Purkinje cell axons, which fire at ∼40 Hz (LeDoux and Lorden, 2002).
Our hypothesis predicts that, if axon diameter is linear with spike rate, then most fibers in this foliar tract should be thick. Purkinje axons, with a uniform origin and stereotyped firing, should be uniformly thick. Mossy fibers, being diverse in origin (including several spinocerebellar tracts and various corticopontine projections), should distribute firing rates and axon diameters broadly and with skew, whereas only the climbing fibers should be thin. Indeed, a section through the foliar white matter shows a high proportion of thick axons (Fig. 4B,C). The mean diameter is large (1.3 μm), the spread is broad (SD, 0.6 μm; coefficient of variation, 0.45), and the skew is strong (1.11).
Purkinje cell axons could be identified separately within the foliar tract because of their rich complement of endoplasmic reticulum and extra-thick, compact myelin (Palay and Chan-Palay, 1974). Thus, we could directly compare their spike rates, known to be high and stereotyped, to their fiber distribution (Fig. 4C). The Purkinje axon mean (1.63 μm) is greater than the tract mean (1.34 μm), and the distribution is narrow (SD, 0.2 μm; coefficient of variation, 0.18) and symmetrical (skewness, 0.39). Thus, the Purkinje axon certainly supports the hypothesis. This seems especially noteworthy because these axons terminate mostly in the deep cerebellar nuclei, so their conduction distances are quite short (∼5 mm in rat) (Sugihara et al., 2009). Consequently, their conduction times are ∼0.5 ms, comparable to the duration of the action potential (Martina et al., 2007). Given this short distance and negligible delay, it seems unlikely that the reason for a thick axon would be to increase conduction velocity.
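The conduction figures quoted for Purkinje axons, and earlier for parallel fibers, follow from time = distance/velocity. The sketch below is a simple consistency check with illustrative function and variable names; it uses only the distances, times, and velocity stated in the text.

```python
def conduction_time_ms(distance_mm, velocity_m_per_s):
    """Conduction time in ms; mm divided by m/s gives ms directly."""
    return distance_mm / velocity_m_per_s

def implied_velocity_m_per_s(distance_mm, time_ms):
    """Velocity implied by a distance and a conduction time; mm/ms equals m/s."""
    return distance_mm / time_ms

# Parallel fiber: ~2.5 mm at ~0.25 m/s.
parallel_time_ms = conduction_time_ms(2.5, 0.25)

# Purkinje axon: ~5 mm covered in ~0.5 ms.
purkinje_velocity = implied_velocity_m_per_s(5.0, 0.5)

print(f"parallel fiber conduction time: {parallel_time_ms:.1f} ms")
print(f"Purkinje axon implied velocity: {purkinje_velocity:.1f} m/s")
```

The Purkinje axon thus conducts roughly 40 times faster than the parallel fiber yet saves only a fraction of a millisecond over its short path, which is the basis for the argument that velocity alone is unlikely to explain its caliber.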
Climbing fibers should have fine axons, and indeed, conduction velocity measured from foliar white matter to the terminals is 0.6 m/s, implying an axonal (exclusive of myelin) diameter of ∼0.35 μm (Baker and Edgley, 2006).
Thickest axons and highest mean rates
Cochlear and vestibular axons are among the thickest, but for different reasons. So to forestall confusion, we note that the stimulus frequencies encoded by auditory axons are orders of magnitude higher than the stimulus frequencies encoded by vestibular axons. Thus, even though both afferents are driven by hair cell synapses, we should expect the coding strategies to differ.
Cochlear axons are thick and relatively uniform (Fig. 5A). The diameter distribution peaks at ∼1.9 μm diameter (Fig. 5B), predicting a mean firing rate of ∼45 Hz. Indeed, these thick auditory fibers fire at high frequencies, up to 140 Hz, either continually or intermittently (Liberman, 1980, 1982; Merchan-Perez and Liberman, 1996). Moreover, their central projections maintain the morphological specializations associated with high information rates: thick axons have large synaptic endings on cell bodies with fast glutamate receptors and other molecular specializations, such as fast potassium channels (Carr and Soares, 2002; Hasenstaub et al., 2010).
The cochlear distribution has among the narrowest spreads, relative to the mean, of tracts that we measured (SD, 0.43 μm; coefficient of variation, 0.23). Furthermore, despite a slight skew (−0.38), it is noticeably more symmetrical than most tracts (Fig. 5B). The broad hypothesis thus predicts that auditory fibers will have a fairly uniform distribution of average firing rates and information rates while responding to natural stimuli. Because the power spectra of natural sounds typically fall with frequency, approximately flattening the distribution of firing rates would require cochlear fibers to respond over frequency bands that increase in breadth with the central frequency of the band, which is indeed observed (Rhode and Smith, 1985; Lewicki, 2002; Rodríguez et al., 2010). Filters that maximize coding efficiency for representing a variety of natural sounds should also show such bandwidth growth as a function of frequency (Lewicki, 2002). Thus, the shape of the cochlear nerve distribution seems to be adapted for efficient coding.
Although the cochlear fiber diameter distribution is narrow, at a finer level there is a topographic gradient of axon caliber: the apex, which codes low-frequency sounds, uses axons thinner by half than axons for the base, which codes high frequencies (Friede, 1984). The thinnest axons at the apex and the thickest axons at the base thus constitute the tails at either end of the distribution for the whole cochlea, whose middle region uses axons of intermediate caliber. This pattern intriguingly matches the observed increase in auditory fiber coding bandwidth with frequency (Rhode and Smith, 1985; Lewicki, 2002; Rodríguez et al., 2010). Correspondingly, the inner hair cells that innervate the primary auditory axons show a gradient, with those at the base having more ribbons than those at the apex (Martinez-Dunst et al., 1997; Meyer et al., 2009).
When the diameters are plotted for a particular cochlear region, such as the basal portion (Fig. 5B), the distribution is very symmetrical (skew, 0.34) and even more narrowly distributed about the mean (cv, 0.16). This fits the hypothesis because all the fibers from this cochlear region serve similar sound frequencies. However, the mean diameter is very large, so even a small coefficient of variation leads to a substantial SD (0.41 μm). What accounts for this broad spread in diameters of axons serving similar sound frequencies?
Each cochlear inner hair cell innervates several afferent axons and serves each with a single ribbon-type active zone. An afferent contacted by a large active zone exhibits low spontaneous firing and relatively high threshold to sound, whereas an afferent contacted by a small active zone exhibits high spontaneous firing and a lower threshold to sound (Liberman, 1982; Merchan-Perez and Liberman, 1996; Tsuji and Liberman, 1997). The low-spontaneous-rate axons (large active zones) are thinnest, and the high-spontaneous-rate axons (small active zones) are thickest, with intermediate-rate afferents having intermediate caliber. Thus, even locally along the basilar membrane, the predicted association between higher firing rates and larger axon diameters is confirmed. The spread in axon diameters seems to be related to a mechanism for coding different sound intensities, which require different activation thresholds and hence lead to different firing rates.
Vestibular axons are thickest of all (mean, 2.88 μm). The distribution peaks at 3 μm and extends out to 9 μm (Fig. 6) (Gacek and Rasmussen, 1961). Thus, the distribution is broad (SD, 1.2 μm; coefficient of variation, 0.41) with an extended tail (skew, 1.43). While the coefficient of variation and skew resemble that for tracts of medium caliber, the peak diameter is nearly fourfold greater than for medium caliber axons (Fig. 3D). So, in the following comments, “thinner” vestibular fibers are actually quite thick.
Given that these are the thickest axons in our survey, they should have the highest firing rates. Indeed, vestibular axons fire tonically at high mean rates, on the order of 60–80 Hz, with some >100 Hz (Goldberg, 2000; Eatock et al., 2008). The high tonic rate allows each fiber to encode two directions of hair cell motion, increasing for one direction and decreasing for the opposite direction. The thinner axons fire “regularly,” that is, with evenly spaced interspike intervals. They best encode low-frequency stimuli and are driven by relatively few, small, ribbon-type synapses. The thicker axons fire irregularly in high-frequency bursts and best encode higher frequency stimuli (Sadeghi et al., 2007). The thicker, irregular-firing axon is driven by a large “calyceal” synapse with many ribbon-type active zones. The two different coding mechanisms are also supported by different channel properties (Goldberg, 2000; Eatock et al., 2008).
In short, the thick vestibular axons serve high firing rates. Moreover, the thickest axons encode higher frequency stimuli by firing bursts that produce the highest peak firing rates. So, again, larger synapses with multiple active zones and thicker axons correlate with high spike rates and thus high information rates.
Fiber distributions across phyla
The most common distribution of axon diameters observed for mammalian tracts (positive skew with a tail toward thicker axons) is observed in other phyla. For example, the connective between ganglia in the locust ventral nerve cord, where all axons are unmyelinated and considered to be spiking, shows this pattern (Fig. 7A,B). The fibers are fine (mean, 0.62 μm) but with fairly broad distribution (SD, 0.70 μm; coefficient of variation, 1.1) and marked skew (3.3). This makes particular sense because the ventral nerve cord contains axons from heterogeneous sources.
The largest axons belong to interneurons that trigger escape responses (Bräunig and Burrows, 2004). Some axons convey information to neighboring ganglia, while others ascend to or descend from the thoracic ganglia, subesophageal ganglion, and brain. Based upon measured conduction velocities, at least some of the very smallest axons (<0.15 μm) belong to descending neuromodulatory neurons (Bräunig and Burrows, 2004). The distribution of axon diameters contains a high proportion of very small axons <0.2 μm. Thus, the design strategy that reduces the space and energy costs of axons transmitting information over short distances, by having them fire at very low rates, is also present in arthropods.
The connective for the chromatophore lobe in octopus is also highly skewed, as plotted in Figure 6C (Camm, 1986). The fibers are fine (mean, 0.58 μm), and the distribution is broad (SD, 0.28 μm; coefficient of variation, 0.47) and considerably skewed (1.4). It seems remarkable that the distribution of axon diameters in the brain of octopus (a mollusk) and the brain of a mammal (optic nerve, pyramidal tract, fornix) follow a similar pattern, reflecting a common design principle (Fig. 6C).
Axonal energy capacity
Earlier we noted and explained the virtual absence of mitochondria in nonspiking photoreceptor axons. Here we focus on how mitochondrial volume fraction varies with axon diameter. We had observed for unmyelinated and myelinated segments of retinal axons above a threshold diameter that mitochondrial volume fraction is constant (Perge et al., 2009). Therefore, we expected that the mitochondrial volume fraction for the parallel, olfactory, and unmyelinated fornix axons would be constant and the same as for unmyelinated retinal axons. To the contrary, we found a fourfold range in volume fractions (Fig. 8A). Thus, some tracts cost more than others independently of mean diameter.
For myelinated tracts of medium caliber (optic, pyramid, fornix), mitochondrial volume fractions rise sharply from the same threshold diameter (∼0.7 μm). The optic and pyramidal distributions then plateau, holding their volume fractions thereafter fairly constant with diameter (Fig. 8B). The plateau levels differ between the two tracts by ∼1.6-fold (2.8 vs 1.8%; Fig. 8B), suggesting that mitochondrial volume fractions can differ across tracts even when the diameter distributions are similar. The fornix volume fraction did not plateau but peaked (Fig. 8B), possibly because this tract is heterogeneous and contains different axon types with different intrinsic energy capacities.
We found that capillary area correlates with mitochondrial volume fraction (r = 0.97, p = 0.03; Fig. 8C), adding to evidence that mitochondrial volume fraction estimates energy capacity (Borowsky and Collins, 1989; Weibel, 2000). Finally, because for myelinated axons of a given class, mitochondrial volume fraction above a threshold is approximately constant with fiber caliber (Fig. 8B), energy capacity in myelinated axons generally rises as d². This confirms more generally our previous finding for the optic nerve: that large axon diameters, which are associated with higher firing rates, are disproportionately expensive, probably leading to selective pressure favoring thin, low-rate axons (Perge et al., 2009).
Finally, there is one striking observation: Purkinje cell axons, although slightly finer than the cochlear axons and with a somewhat lower mean firing rate, express a mitochondrial volume fraction that is more than threefold greater, the largest of any fiber type. Compared with the parallel fiber, the Purkinje axon takes up ∼74 times more space and has a ∼6 times greater mitochondrial density, making its energy consumption ∼440 times higher. What function of the Purkinje axon requires this exceptionally high energy capacity is at present a mystery.
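The ∼440-fold figure follows from multiplying the two ratios, on the assumption that energy use scales with total mitochondrial volume (axon volume times mitochondrial volume fraction). A minimal sketch of that arithmetic, with variable names of our own choosing:

```python
# Ratios quoted in the text, Purkinje axon relative to parallel fiber.
VOLUME_RATIO = 74.0        # space occupied per unit length
MITO_DENSITY_RATIO = 6.0   # mitochondrial volume fraction

# Assumption: energy use is proportional to total mitochondrial volume,
# that is, axon volume multiplied by mitochondrial volume fraction.
energy_ratio = VOLUME_RATIO * MITO_DENSITY_RATIO

# Rounded to two significant figures, as in the text.
energy_ratio_rounded = round(energy_ratio, -1)
```

The product is 444, which rounds to the ∼440 quoted above.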
Costs of redundancy: the motor neuron “size principle”
A mammalian skeletal muscle comprises hundreds to tens of thousands of individual cells (“fibers”). Recruitment from this pool of effectors should maximize the resolution of force production—a motor equivalent of Weber's law (Mendell, 2005). The design solution: generate the slowest, finest contraction by firing a motor neuron that contacts a small number of muscle fibers; and generate greater force and speed by recruiting additional motor neurons that innervate progressively larger numbers of fibers. Motor unit size varies from ∼10 to 1000 fibers. The motor neuron's contact to each fiber employs >100 active zones, so the motor neuron for a small force unit would operate ∼1000 active zones, whereas the motor neuron for a large force unit would operate ∼100,000 active zones. Motor neurons supplying the fewest active zones have the smallest cell bodies and the finest axons.
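The active zone counts above are simple products of motor unit size and active zones per fiber. The sketch below reproduces them, taking 100 active zones per fiber as a lower bound (the text gives >100); the function name is illustrative only.

```python
# Lower bound from the text: each neuromuscular contact employs >100 active zones.
ACTIVE_ZONES_PER_FIBER = 100


def active_zones_per_motor_neuron(fibers_in_unit: int) -> int:
    """Minimum number of active zones one motor neuron must operate."""
    return fibers_in_unit * ACTIVE_ZONES_PER_FIBER


# Motor unit sizes quoted in the text: about 10 to 1000 fibers.
small_unit = active_zones_per_motor_neuron(10)
large_unit = active_zones_per_motor_neuron(1000)
fold_difference = large_unit // small_unit
```

The 100-fold range in motor unit size thus translates directly into a 100-fold range in the number of active zones that the axon must supply.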
Under natural conditions, the first units to be recruited and the last to drop out are the smallest ones. Thus, the smallest motor neurons, coupled to the smallest force units, fire most frequently. Fiber biochemistry reflects this in that fibers serving the smallest units are adapted for aerobic metabolism and are located near capillaries. Fibers in the larger units are adapted for glycolysis, and because they store glycogen, they can be located farther from the oxygen supply (Henneman and Olson, 1965; Burke, 1979). The question of what mechanism accounts for this orderly recruitment by size/force increment remains unsettled (Mendell, 2005).
In terms of information, a small motor neuron sends many messages to a few fibers, so its information is high and its redundancy is low. The large motor neuron sends few messages, but identical copies to many fibers, so its information is low and redundancy is high. In this case the thicker axon is not a cost of sending a higher information rate but rather of adding redundancy. This supports our suggestion at the end of the Discussion that axon diameters expand to supply more active zones.
This might also explain why brisk-transient (Y) axons in cat are substantially thicker than brisk-sustained (X) axons, even though their information rates are comparable (Passaglia and Troy, 2004; Koch et al., 2006). Y axons branch upon reaching the brain and provide generous arbors to several lateral geniculate nuclei and also to the superior colliculus with more total boutons than X axons (Tamamaki et al., 1995). Thus, although the Y axon's information rate is similar to the X axon's, there is more redundancy, for which it pays with a thicker axon.
Discussion
Technical advances during the 1920s and 1930s allowed sensitive electrical measurements on peripheral nerves, leading to the conclusion that axons differ widely in conduction velocity and in direct proportion with their diameters. These studies earned a Nobel Prize for Joseph Erlanger and Herbert Gasser, and it seems worth recalling the closing comment in Gasser's 1944 Nobel lecture:
“What then is the significance of the wide velocity range? Is it timing? One need but consider the speed with which posture is controlled in preparation for the reception of oncoming detailed information and the adjustment of fine movement. The more one sees of the exquisite precision with which events take place in the CNS, the more the idea of timing grows in meaning. Differential axonal velocities must play their part in the mechanism.”
The key experiments were conducted primarily on leg nerves of cat and rabbit, which extend over tens of centimeters and serve spinal reflexes where high conduction velocities are critical to reducing conduction times. Given that conduction distance is set by limb length, increasing diameter to increase conduction velocity is the only way to shorten conduction time. But, more than 70 years since this proposal, no one seems to have asked whether it also explains the range and distributions of fiber diameter.
Certainly, tracts should minimize delays, and this appears to have driven many features that reduce conduction distance. For example, partitioning neural tissue into gray and white, placing neurons optimally, and arranging neurons into topographic maps all save time by reducing distance (Wen and Chklovskii, 2005; Chen et al., 2006), which has the additional benefits of saving space and energy. In contrast, saving time by increasing conduction velocity is disadvantageous because it increases space and energy costs as the diameter squared. Therefore, to save time, brain designs should emphasize reducing distance.
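To illustrate the quadratic penalty, the following sketch computes the relative cost in cross-sectional area for a thicker axon, assuming a circular profile and, for energy, a mitochondrial volume fraction that is constant with diameter. The function names are ours; the 0.1 and 10 μm values are the approximate extremes of CNS axon diameter given in the Abstract.

```python
import math


def cross_sectional_area_um2(diameter_um: float) -> float:
    """Area of a circular axon profile, pi * d^2 / 4, in square micrometers."""
    return math.pi * diameter_um ** 2 / 4.0


def relative_cost(diameter_um: float, reference_um: float) -> float:
    """Space cost per unit length relative to a reference axon.

    With a constant mitochondrial volume fraction, the same ratio
    applies to energy capacity.
    """
    return cross_sectional_area_um2(diameter_um) / cross_sectional_area_um2(reference_um)


# Doubling diameter (for example, to double conduction velocity in a
# myelinated axon) roughly quadruples the cost.
doubling_cost = relative_cost(2.0, 1.0)

# The full range of CNS calibers, about 0.1 to 10 micrometers.
full_range_cost = relative_cost(10.0, 0.1)
```

A twofold gain in diameter therefore costs about fourfold in space and energy, and the 100-fold range of calibers corresponds to a roughly 10,000-fold range of cost per unit length.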
An alternative interpretation of axon diameter
We had found for optic nerve that the mean firing rate is linear with fiber diameter. Because low firing rates are intrinsically more efficient (Balasubramanian et al., 2001), finer fibers carry more bits per spike and therefore more bits per ATP and more bits per axon volume (Koch et al., 2006; Perge et al., 2009). We hypothesized that other tracts would follow the same principle and thus use the lowest firing rates and finest fibers consistent with their other needs. Exploring this idea, we here compared diameter with mean firing rate for nine central tracts. The scatter plot suggests a linear relationship (Fig. 9).
Although firing rates for optic axons were obtained for naturalistic conditions and spanned the full range of diameters, data for other types were less complete, and none was drawn from a full distribution of rates during natural stimulation/behavior. However, the general point, that rates are low for fine fibers and high for thick fibers, seems not to be in doubt. Olfactory fibers, based on current evidence, are unlikely to fire at high mean rates, like vestibular and auditory fibers; nor are vestibular and auditory fibers likely to fire at low mean rates. Also it appears that fine parallel fibers fire at low rates while thick mossy and Purkinje fibers fire at high rates. Thus, while a linear relationship between axon diameter and firing rate is unproven, it certainly seems to have preliminary support.
What does this empirical relationship suggest? First, because space and energy together constrain brain design and rise as ≥d², axons are under selective pressure to be as fine as possible. Second, smaller diameters are associated with lower mean firing rates, and this improves their signaling efficiency: more bits/spike (Koch et al., 2006). More bits per spike also yield more bits per cubic micrometer and more bits per ATP. These efficiencies explain why various tracts should approach the lower limit set by channel noise and use thicker fibers sparingly.
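The fall in bits per spike with firing rate can be illustrated with a toy model. This is not the measurement used in the studies cited above; it is a sketch that assumes a noise-free binary code with independent time bins, so that the entropy per bin is an upper bound on information. The 1 ms bin width and all function names are hypothetical choices made for illustration.

```python
import math

BIN_S = 0.001  # hypothetical time bin (1 ms), at most one spike per bin


def entropy_bits(p: float) -> float:
    """Binary entropy of a bin that contains a spike with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def info_rate_bits_per_s(rate_hz: float, bin_s: float = BIN_S) -> float:
    """Upper bound on information rate for independent bins."""
    return entropy_bits(rate_hz * bin_s) / bin_s


def bits_per_spike(rate_hz: float, bin_s: float = BIN_S) -> float:
    """Information rate divided by spike rate."""
    return info_rate_bits_per_s(rate_hz, bin_s) / rate_hz


# Mean rates spanning the range reported in the text (about 1 to 100 Hz).
rates_hz = [1.0, 10.0, 100.0]
efficiency = [bits_per_spike(r) for r in rates_hz]

# Diminishing returns: doubling the spike rate less than doubles the information rate.
gain_from_doubling = info_rate_bits_per_s(20.0) / info_rate_bits_per_s(10.0)
```

In this toy model, bits per spike decline steadily as the mean rate rises, and doubling the rate yields less than twice the information, which is the qualitative behavior that favors low rates and fine fibers.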
Third, as a corollary, the reason for a thicker fiber is to achieve a higher information rate in that axon. Because this requires a disproportionately higher spike rate, it is expensive and inefficient, but for some purposes it is essential (Hasenstaub et al., 2010). For example, higher rates are used by certain retinal ganglion cells to signal rapid motion and by certain pyramidal tract neurons to trigger fast movements. Possibly, coding a rapid process over several low-rate channels renders its decoding too slow. This explains better why tracts of disparate length and function that process diverse kinds of information (fornix, optic, and pyramidal) all skew their diameter distributions (Fig. 3D). It also explains why certain short axons, such as parallel fibers (Brand et al., 1976; Mugnaini, 1983) and Purkinje fibers, differ by >10-fold in diameter.
The need for higher information rates also explains why the shortest axons (auditory ∼3–5 mm and vestibular ∼5–7 mm) are the thickest. Auditory neurons encode high stimulus frequencies, which generate high information rates. To capture these high rates requires high temporal precision. Axon thickness contributes to this precision (Carr and Soares, 2002). Higher information rates in turn require disproportionately high mean spike rates, which require disproportionately high vesicle release rates at the output (Koch et al., 2006). On the other hand, vestibular axons process very low stimulus frequencies. Yet they are thickest of all (Goldberg, 2000; Eatock et al., 2008). This seems initially surprising and requires some explanation.
Vestibular axons are thick due to a design decision to send an unrectified signal
Vestibular hair cells depolarize to one direction of hair bending and hyperpolarize to the opposite direction. The full range of motion is captured in the spike train because the axon fires tonically, increasing its frequency to one direction and decreasing its frequency to the opposite. This coding scheme requires the highest mean firing rate of any central neuron (50 to >100 Hz) and consequently requires the thickest axons (Fig. 6). In contrast, retinal ganglion cells, which also encode signal increments and decrements, use strongly rectified signals, wherein one type increases firing to a signal increment and another type increases firing to a signal decrement (Liang and Freed, 2010). This coding scheme allows the tonic rate to approach zero and thus allows each type to encode only half of the stimulus range. Consequently, ganglion cell axons can be much thinner (Fig. 3).
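The difference between the two coding schemes can be sketched numerically. The example below is a toy illustration under stated assumptions, not a model fitted to data: the stimulus is a hypothetical sinusoid scaled to the range −1 to 1, the gain is arbitrary, and the tonic baseline of 80 Hz is simply chosen from within the range of vestibular rates quoted earlier.

```python
import math

BASELINE_HZ = 80.0  # assumed tonic rate, within the vestibular range quoted in the text
GAIN_HZ = 40.0      # hypothetical modulation depth


def unrectified_rate(s: float) -> float:
    """One fiber codes both directions by modulating a tonic rate."""
    return max(0.0, BASELINE_HZ + GAIN_HZ * s)


def on_rate(s: float) -> float:
    """Rectified ON channel: fires only for stimulus increments."""
    return max(0.0, GAIN_HZ * s)


def off_rate(s: float) -> float:
    """Rectified OFF channel: fires only for stimulus decrements."""
    return max(0.0, -GAIN_HZ * s)


# One full cycle of a hypothetical sinusoidal stimulus.
stimulus = [math.sin(2.0 * math.pi * k / 100.0) for k in range(100)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


mean_unrectified = mean(unrectified_rate(s) for s in stimulus)
mean_on = mean(on_rate(s) for s in stimulus)
mean_off = mean(off_rate(s) for s in stimulus)

# The ON/OFF pair jointly preserves the signed signal.
recovered = [(on_rate(s) - off_rate(s)) / GAIN_HZ for s in stimulus]
```

With these assumed parameters, the unrectified fiber averages the full tonic rate, whereas each rectified channel averages a small fraction of it while the pair together still recovers the signed stimulus. The price of rectification is a second channel and a stage that separates increments from decrements.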
Why does the vestibular system use a nonrectified coding scheme? Probably because it drives very fast reflex movements of the eyes that cannot wait for a neural circuit to rectify the signal (Angelaki and Cullen, 2008). That this unrectified scheme is affordable partly depends on the axons being relatively few (∼10,000). Retinal ganglion cell axons are 10- to 100-fold more numerous, and thus, if they were as thick as vestibular axons, their cost in space and energy would rise by 10- to 100-fold, with consequences for key resources. The cone axon also uses a nonrectified coding scheme, and indeed its axon is also fairly thick (Fig. 1). However, the energetic cost of one of its synaptic vesicles is only ∼0.1% of the cost of a spike (Attwell and Laughlin, 2001). Therefore, this analog system can postpone rectification until the next stage, at the bipolar cell synaptic output, just before converting to spikes (Sterling and Freed, 2007).
Why do higher information rates need a thicker axon?
We have noted that (1) thicker axons improve timing precision, which contributes to higher information rates; and (2) doubling information rate requires that a pulse code more than double the pulse rate. But there is another demand at the output, where the signal modulates transmitter release. Possibly, thicker axons are needed to supply more synaptic terminals, which are estimated to use 64% of the brain's total energy budget (Sengupta et al., 2010).
Consistent with this idea, the cone's need to supply 20-fold more ribbons than a rod correlates with its nearly 20-fold greater cross-sectional area that houses nearly 20-fold more microtubules (Hsu et al., 1998). Retinal bipolar neurons exhibit a similar correspondence between relative information rates, number of active zones, and axon cross-sectional areas (Sterling and Freed, 2007; Freed and Liang, 2010).
Beyond the retina, thicker axons with higher firing rates generally employ more active zones at their outputs. This pattern holds for central arbors of retinal ganglion cells, such as brisk-sustained versus brisk-transient axons and for their relays from thalamus to visual cortex (Perge et al., 2009). This pattern also holds for vestibular axons with regular versus irregular firing (Goldberg, 2000; Eatock et al., 2008) and for auditory axons, whose thinner fibers fire at low spontaneous rates and whose thicker fibers fire at higher spontaneous rates, in this case reflecting differences in sound threshold (Liberman, 1982; Merchan-Perez and Liberman, 1996; Tsuji and Liberman, 1997). Likewise, central auditory axons projecting forward from the ventral cochlear nucleus preserve a correlation between axon caliber and number of active zones (Carr and Soares, 2002). Thus, greater energy cost of thicker axons with higher firing rates may be partly attributable to their greater need for restoring ionic gradients, but also partly to their higher need to transport materials to the synaptic terminals with more active zones.
Footnotes
This research was supported by National Eye Institute Grant EY08124, National Science Foundation Grant EF-0928048, and National Institutes of Health Grant RO1 NS 09904. We are extremely grateful to several colleagues for information and discussion. Regarding auditory and vestibular systems, we thank Tobias Moser, Paul Fuchs, Ruth-Anne Eatock, Catherine Carr, Charles Liberman, Elizabeth Glowatzki, and Maria Geffen. Regarding cerebellum, we thank John Simpson, Peter Strick, and David Attwell. Regarding pyramidal tract, we thank Roger Lemon and Peter Strick. Regarding olfactory receptors, we thank Minmin Luo and Minghong Ma. We also thank Christopher Graham for his contributions to the locust work, Simon Laughlin for pointing us to the octopus data, Sally Shrom for her generous and skillful contribution in preparing electron microscopic images, and Sharron Fina for helping to prepare the manuscript.
The authors declare no competing financial interests.
Correspondence should be addressed to Peter Sterling, Department of Neuroscience, University of Pennsylvania, Philadelphia, PA 19104. peter@retina.anatomy.upenn.edu