Abstract
Auditory neurons in the superior colliculus (SC) respond preferentially to sounds from restricted directions to form a map of auditory space. The development of this representation is shaped by sensory experience, but little is known about the relative contribution of peripheral and central factors to the emergence of adult responses. By recording from the SC of anesthetized ferrets at different age points, we show that the map matures gradually after birth; the spatial receptive fields (SRFs) become more sharply tuned and topographic order emerges by the end of the second postnatal month. Principal components analysis of the head-related transfer function revealed that the time course of map development is mirrored by the maturation of the spatial cues generated by the growing head and external ears. However, using virtual acoustic space stimuli, we show that these acoustical changes are not by themselves responsible for the emergence of SC map topography. Presenting stimuli to infant ferrets through virtual adult ears did not improve the order in the representation of sound azimuth in the SC. But by using linear discriminant analysis to compare different response properties across age, we found that the SRFs of infant neurons nevertheless became more adult-like when stimuli were delivered through virtual adult ears. Hence, although the emergence of auditory topography is likely to depend on refinements in neural circuitry, maturation of the structure of the SRFs (particularly their spatial extent) can be largely accounted for by changes in the acoustics associated with growth of the head and ears.
- virtual acoustic space
- receptive field
- ferret
- linear discriminant analysis
- head-related transfer function
- sound localization
Introduction
Understanding the processes that guide sensory system development and give rise to adult perceptual abilities represents a major goal in neuroscience. Changes at the periphery and in the CNS both contribute to the development of sensory functions, but isolating their relative contributions has proved to be difficult. This is particularly the case for a function like auditory localization, which relies on the measurement of physical cues generated by the auditory periphery: interaural time differences (ITDs), interaural level differences (ILDs) and spectral cues produced by the acoustic properties of the head and external ears (King et al., 2001). Because the values of these cues depend on the size, shape and separation of the ears, neural representations of sound-source location cannot become fully mature until the auditory periphery has stopped growing. However, this maturational process also includes refinements in the neural circuits themselves (King, 1999; Knudsen, 2002; Grothe, 2003; Kubke and Carr, 2005), which enable the auditory localization system to be calibrated by experience of the cues available to individual listeners.
A role for experience in shaping the spatial selectivity of auditory neurons has been most clearly demonstrated in the superior colliculus (SC), which contains topographically aligned maps of different sensory modalities. The auditory spatial receptive fields (SRFs) of SC neurons in infant animals are larger than those recorded in adults (Withington-Wray et al., 1990a; Wallace and Stein, 1997, 2001). The emergence of topographic order and registration with other sensory maps relies on both auditory and visual inputs (King et al., 1988; Knudsen and Brainard, 1991; Gold and Knudsen, 2000; Wallace et al., 2004; Withington-Wray et al., 1990b,c). Although these studies highlight the importance of experience in the development of the auditory SRFs, they have not considered the possible contribution of changes in the acoustic localization cues that occur throughout postnatal development as the head and ears grow. Indeed, using virtual acoustic space (VAS) stimuli, we have shown that sharpening of spatial tuning with age in the ferret primary auditory cortex (A1) is attributable to growth-related changes in the localization cues (Mrsic-Flogel et al., 2003).
Here, we examine the relative contributions of peripheral and central factors to the development of the auditory space map in the ferret SC. We first charted the time course of development of the localization cues available to the animal [the “head-related transfer function” (HRTF)] after the onset of hearing. We then quantified how the SRFs of SC neurons change during development, and used linear discriminant analysis (LDA) to measure the degree of similarity between adult and infant response properties. Finally, by presenting infant SC neurons with virtual adult ears, we were able to bypass the developmental changes in the auditory periphery. As in A1 (Mrsic-Flogel et al., 2003), this immediately caused the SRFs to become more sharply tuned. However, topographic order in the representation was not improved, suggesting that the construction of a map of auditory space, which is a feature of the SC but not the cortex, relies on postnatal refinements in neural circuitry.
Materials and Methods
General.
All experiments were conducted on pigmented ferrets (Mustela putorius furo) of different ages, were approved by the local ethical review committee, and were licensed by the UK Home Office in accordance with the Animals (Scientific Procedures) Act 1986. Twenty-three ferrets aged from postnatal day (P) 33 to adulthood (≥1 year of age) were used in the free-field experiments, and a further eight infant animals, aged P32–P39, and seven adults were used for mapping the SC with virtual acoustic space stimuli derived from acoustical measurements from the animals' own ears. Additional acoustical data, collected as part of a previous study by Mrsic-Flogel et al. (2003), were also reanalyzed for this study.
Surgery and anesthesia.
An otoscopic examination and tympanometry were performed before surgery to ensure that the ear canals were clear and disease free. Animals were anesthetized with alphaxalone/alphadolone acetate (3 ml/kg, i.p.) (Saffan; Schering-Plough Animal Health) and given initial doses of atropine sulfate (30 μg, i.p.; Animal Care) and doxapram hydrochloride (3 mg, i.m.) (Dopram-V; Fort Dodge Animal Health). The left radial vein was cannulated to provide a continuous infusion of 5% glucose Hartmann's solution (typical rate 5 ml/h) and for use as a route of administration for supplementary doses of Saffan and Dopram-V. A tracheotomy was performed and the body temperature was maintained at ∼39°C. The animal was then placed in a stereotaxic frame, fitted with blunt ear bars, and the skull was exposed. A small steel bar was attached to the skull with screws and dental cement (Simplex Rapid; Austenal Dental), so that the head could be supported from behind without using the stereotaxic frame. A craniotomy was performed over the right SC and two additional holes were drilled so that electrodes could be inserted for recording of the electroencephalogram (EEG). The scalp was carefully reattached with tissue adhesive (Vetbond; 3M Animal Care Products), so that the external ears assumed their preoperative positions according to a series of measurements made before the first scalp incision. For the VAS experiments, we performed acoustical recordings (see below) before the craniotomy.
In most cases (all adult recordings and the VAS experiments in infant ferrets), the animals were paralyzed with pancuronium bromide (0.2 ml/kg; Pavulon; NV Organon) to prevent eye movements. If this was not done (free-field experiments in infant ferrets), eye movements were minimized by running a series of sutures through the conjunctiva and fixing them to the skin surrounding the eye with tissue adhesive. Where possible, eye position was measured by back-projecting the location of the optic disc using a reversible ophthalmoscope fitted with a corner-cube prism. The mean ± SD azimuth values were 28 ± 11° in infant ferrets in which the eyes were stabilized mechanically and 24 ± 5° under paralysis. We have previously shown that the ranges of visual best azimuths recorded along the rostrocaudal extent of the superficial layers of the SC in these two conditions overlap completely (King et al., 1996). In all cases, the eyelids were trimmed, the pupils were dilated with atropine sulfate, and the eyes were protected with zero refractive power contact lenses to allow mapping of visual receptive fields. Anesthesia and paralysis were maintained with a continuous intravenous mixture of ketamine (Ketaset, 5 mg · kg−1 · h−1; Fort Dodge Animal Health Ltd) and medetomidine hydrochloride (Domitor, 10 μg · kg−1 · h−1; Pfizer) plus pancuronium bromide (0.2 ml · kg−1 · h−1) in Hartmann's solution. The paralyzed animals were ventilated artificially (7025 respirator; Ugo Basile) with oxygen-enriched air and the heart rate, end-tidal CO2, ECG, and EEG were monitored continuously to ensure a stable level of anesthesia.
Acoustical measurements.
All acoustical and electrophysiological measurements were performed in a sound-attenuated anechoic chamber. We recorded the HRTF of each animal so that VAS stimuli could be presented through the animal's own virtual ears. A damped polythene probe tube (length, 30 mm; inner diameter, 0.86 mm; outer diameter, 1.52 mm) was passed caudally through each ear canal wall and secured internally with a small flange, which abutted against the canal wall, and externally with a rubber O-ring that was pushed against the skin behind the pinna. The acoustic signals were recorded through condenser microphones (miniature KE-4–211-2 microphone capsules; Sennheiser) attached to the probe tubes. Using a speaker (KEF T27) mounted on a vertical motorized hoop (radius, 65 cm), broadband signals (512-point Golay codes) (Zhou et al., 1992) were presented from 66 different directions at 16° intervals in azimuth from −160° to +160° and at 6 vertical angles from +80° to −60° elevation. The sampled positions were arranged so that their diagonal separation was 34°. The generation of the Golay codes and the recording of the microphone signals were performed digitally using TDT system 2 A/D and D/A converters (sample rate of 80 kHz; Tucker-Davis Technologies) and 30 kHz anti-aliasing filters. The microphone signals were analyzed for each stimulus direction to calculate a spectral transfer function containing both the animal's HRTF and the transfer characteristics of the loudspeaker and probe microphones. The ITDs were extracted from the microphone signals by cross-correlation of the impulse responses after low-pass filtering (0–4 kHz). An in situ calibration to remove the transfer functions of the probe microphones and earphone drivers used for presenting the VAS stimuli was then carried out. Minimum phase filters were calculated from the equalized amplitude spectra using the Hilbert transform.
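The cross-correlation step used to extract ITDs can be illustrated with a minimal sketch. It assumes the two impulse responses have already been low-pass filtered, and it uses made-up impulse responses rather than measured data. The function name `cross_correlation_lag` and the 8-sample delay are hypothetical; only the 80 kHz sample rate comes from the text.

```python
import math

def cross_correlation_lag(left, right, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive lag means `right` is a delayed copy of `left`.
    """
    best_lag, best_value = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i, left_sample in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                total += left_sample * right[j]
        if total > best_value:
            best_lag, best_value = lag, total
    return best_lag

SAMPLE_RATE_HZ = 80_000  # sample rate quoted in the text

# Hypothetical impulse responses (illustrative values only): the right-ear
# response is the left-ear response delayed by 8 samples.
left_ir = [0.0] * 64
left_ir[10], left_ir[11], left_ir[12] = 1.0, 0.5, -0.25
right_ir = [0.0] * 8 + left_ir[:-8]

lag_samples = cross_correlation_lag(left_ir, right_ir, max_lag=40)
itd_us = lag_samples / SAMPLE_RATE_HZ * 1e6  # ITD in microseconds
print(lag_samples, itd_us)
```

At an 80 kHz sample rate, one sample corresponds to 12.5 μs, which sets the resolution of an ITD estimated this way unless the correlation peak is interpolated.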
The VAS stimuli used during recording consisted of short (100 ms) Gaussian noise bursts, which were convolved with the appropriate minimum phase filter for each direction, and delayed to generate the appropriate ITD. Psychophysical studies in humans have shown that this “minimum-phase-plus-delay method” (i.e., with frequency-dependent ITDs excluded) adequately approximates the HRTF phase spectrum as long as the low-frequency ITD is appropriate (Kulkarni et al., 1999).
Mapping visual responses.
In both free-field and VAS experiments neural activity was recorded extracellularly using a tungsten-in-glass microelectrode that was lowered vertically through the intact cortex into the midbrain via a remotely controlled motorized microdrive. In the free-field experiments, the electrodes had an uninsulated tip length of up to 100 μm, to sample multiunit activity at different depths within the SC. Electrodes with a more conventional tip length of 10–15 μm were used in the VAS experiments, so that single units could be isolated. As the electrode was lowered, we searched for visual activity using a flashing yellow LED (wavelength, 590 nm; interstimulus interval, 1000 ms) located close to the contralateral eye. The entry of the electrode into the superficial layers of the SC was indicated clearly by robust visually driven activity. We then determined the best direction of the visual activity by presenting discrete light flashes from a 1-cm-diameter LED mounted on the motorized hoop.
Free-field auditory experiments.
Having mapped the visual responses in the superficial layers of the SC, the hoop loudspeaker was placed at the center of the visual receptive field and the electrode advanced into the deeper layers while 100 ms white noise bursts (ramped with 5 ms rise/fall times) were presented at a rate of 0.5–0.7 Hz. Stimuli were produced by a Brüel and Kjær type 1405 noise generator. The threshold of each auditory multiunit cluster encountered was determined either at this loudspeaker position or, if different, at the auditory best azimuth of other units previously recorded in the same electrode penetration. At each recording site, responses were measured to five noise bursts presented at two sound levels (near-threshold and suprathreshold, see Results) and at 20° intervals in the horizontal plane from directions 160° contralateral to 160° ipsilateral to the side of the brain from which the recordings were made. By convention, contralateral sound directions were denoted by negative numbers and ipsilateral directions by positive numbers. The extracellular recordings were amplified, filtered (using a CED 1701 programmable filter) and stored using a CED 1401 laboratory interface and personal computer. Recordings were displayed on-line and time windows chosen to encompass both the stimulus-evoked response and an equivalent period of spontaneous activity before the stimulus onset. Using methods based on Chung et al. (1987), a fast Fourier transform was applied to the neural activity in these windows, and the increase in total power spectral density (PSD) over control noise segments calculated from 200 to 3000 Hz. This frequency range was chosen on the basis of the power bandwidth of the individual action potentials. 
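The band-limited power measure described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of Chung et al. (1987): the 10 kHz sample rate, the 100-sample windows, and the synthetic sinusoidal "activity" are all hypothetical, and a plain discrete Fourier transform stands in for the fast Fourier transform. Only the 200–3000 Hz analysis band comes from the text.

```python
import cmath
import math

def band_power(samples, sample_rate_hz, f_lo_hz=200.0, f_hi_hz=3000.0):
    """Sum the squared DFT magnitudes (divided by N) for bins inside the band."""
    n = len(samples)
    total = 0.0
    for k in range(n // 2 + 1):
        freq_hz = k * sample_rate_hz / n
        if f_lo_hz <= freq_hz <= f_hi_hz:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(samples))
            total += abs(coeff) ** 2 / n
    return total

SAMPLE_RATE_HZ = 10_000  # assumed for this sketch
N = 100                  # samples per analysis window (assumed)

def tone(amplitude, freq_hz):
    """Synthetic sinusoid standing in for recorded activity."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE_HZ)
            for i in range(N)]

# A 4 kHz component lies outside the analysis band and should be ignored.
out_of_band = tone(0.5, 4000.0)
control = [a + b for a, b in zip(tone(0.1, 1000.0), out_of_band)]
response = [a + b for a, b in zip(tone(1.0, 1000.0), out_of_band)]

control_power = band_power(control, SAMPLE_RATE_HZ)
response_power = band_power(response, SAMPLE_RATE_HZ)
psd_increase = response_power - control_power
print(control_power, response_power, psd_increase)
```

Because the measure sums power over a band rather than counting discriminated spikes, it can be applied to multiunit activity without isolating single units.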
We adopted this signal processing technique, which has been used previously to quantify the tuning of sensory neurons in a range of species (Chung et al., 1987; Withington-Wray et al., 1990a–c; King and Carlile, 1994), in the free-field experiments, because it did not require single units to be isolated and therefore allowed preferred sound directions to be rapidly determined. This is particularly advantageous for charting the maturation of auditory topography across a range of postnatal ages. The activity in the response and control windows was monitored carefully during data collection to ensure that no bursting or dying units were included in the power spectral density measurements. At the end of each electrode penetration through the SC, one or more electrolytic lesions (−5 μA for 5 s) were made at variable depths below the surface of the nucleus.
Virtual acoustic space experiments.
After mapping of the visual best position (see above) we searched for auditory responses by presenting closed-field contralateral broadband noise bursts. In these experiments, stimulus generation and data acquisition were controlled using Brainware software (Tucker-Davis Technologies) and TDT System 2 or System 3 hardware. Stimuli were delivered through Panasonic earphone drivers (RP-HV297), coupled to otoscope specula which were inserted into the ear canals. When an auditory response was identified the threshold of the unit was determined by presenting freshly generated, unfiltered noise bursts to the contralateral ear (100 ms duration with an interstimulus interval of 1000 ms) at a range of sound levels. Thresholds were determined as the lowest sound level to elicit an increase in firing rate significantly greater (p ≤ 0.05) than the unit's resting level. The SRFs of most units were measured at two sound levels, one near (typically 5–15 dB above) unit threshold and a second at a level well (typically 25–35 dB) above unit threshold. SRFs were measured by presenting VAS stimuli in a random order with an interstimulus interval of 1000 ms from the same 66 sound directions used for measuring the HRTF. This process was repeated until 10–20 responses for each virtual stimulus direction had been collected. On each presentation freshly generated Gaussian noise bursts were convolved with a pair (one for each ear) of minimum-phase filters corresponding to a particular virtual space direction.
The implanted probe microphones enabled us to confirm that the VAS stimuli accurately replicated each animal's own HRTF measured with free-field stimuli. This is illustrated in Figure 1, which shows how the gain measured in one ear of one animal varies with stimulus azimuth and frequency (Fig. 1B), together with examples of the amplitude spectrum measured for both VAS and free-field stimuli for three different directions in space (Fig. 1D). A detailed comparison of the VAS and free-field amplitude spectra for numerous sound-source directions revealed a very small and consistent difference in amplitude across frequency of <1 dB in 80% of cases and <3 dB in 90%. Larger differences were found only at high frequencies on the side contralateral to the microphone, which were due to the attenuating effect of the head causing the signals to approach the noise floor of our recording equipment. By comparing the SRFs measured for single units in the inferior colliculus (IC) with each form of stimulation, we also previously showed that the VAS stimuli are capable of evoking very similar neural responses to free-field stimulation (Campbell et al., 2006). In addition to using individualized (own-ear) VAS stimuli, we presented virtual sound directions to infant animals that were based on adult HRTF measurements. To achieve this we simply selected a prerecorded adult HRTF from an animal of the same sex, applied the in situ calibration, and constructed new minimum-phase filters from these acoustical data. The order in which the own-ear and adult-ear stimuli were presented was varied from unit to unit.
Electrode signals were bandpass filtered (500 Hz–5 kHz), amplified (up to 15,000×) and digitized at 25 kHz. Individual spike shapes of multiunit clusters were sorted off-line using a k-means clustering algorithm incorporated into Brainware. Recordings in the VAS experiments were either single units or small multiunit clusters as judged from the spike shapes and the presence of a refractory period in the autocorrelation histogram. The response period for each unit was individually determined from the peri-stimulus time histogram. In all cases, firing rates had returned to spontaneous background levels by 400 ms after stimulus onset. Response magnitude was measured relative to the spontaneous activity of the neuron, which was obtained from a second window drawn between 500 and 1000 ms after stimulus onset.
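The response measure described in this paragraph amounts to subtracting a spontaneous rate from the rate in the response window. A minimal sketch follows. The function `evoked_rate`, the 10–60 ms response window, and the spike times are hypothetical; only the 500–1000 ms spontaneous window comes from the text.

```python
def evoked_rate(spike_times_ms, n_trials, response_window_ms,
                baseline_window_ms=(500.0, 1000.0)):
    """Mean evoked rate (spikes/s): response-window rate minus spontaneous rate.

    `spike_times_ms` pools spike times (relative to stimulus onset) over trials.
    """
    def rate(window):
        start, stop = window
        count = sum(1 for t in spike_times_ms if start <= t < stop)
        return count / n_trials / ((stop - start) / 1000.0)

    return rate(response_window_ms) - rate(baseline_window_ms)

# Hypothetical data pooled over 10 trials: 25 spikes shortly after stimulus
# onset and 10 spikes in the 500-1000 ms spontaneous window.
spikes = [20.0 + i for i in range(25)] + [510.0 + 40.0 * i for i in range(10)]
rate = evoked_rate(spikes, n_trials=10, response_window_ms=(10.0, 60.0))
print(rate)
```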
Matlab (Mathworks) was used for constructing SRF plots, analyzing acoustical data, and preprocessing raw data for further statistical analyses. SRFs recorded with VAS stimuli were visualized by producing a smoothed map projection showing the mean evoked spike rate for each sound direction. Smoothing was done by interpolation of the averaged responses over a uniform grid of 7.5° resolution. To avoid discontinuities because of extrapolation over positions above and behind the animal, we extended the matrix maps to cover a −200° to +200° azimuth range by copying values across from the opposite edge, i.e., from −160° to +200° and from +160° to −200°. This ensured that the algorithm could interpolate smoothly and without discontinuities across the full ±180° azimuthal range. Statistical modeling (linear discriminant analysis and mixed-effects linear regression) and the generation of associated plots were performed using R, an open-source statistical package (www.r-project.org, also see Venables and Ripley, 1999). Further details are provided in the supplemental material, available at www.jneurosci.org.
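The edge-copying step can be sketched for a single row of the map. This is written in Python for illustration and is not the Matlab code used in the study. The 20° azimuth grid, the response values, and the helper names `pad_azimuth_wrap` and `interpolate` are assumptions of this sketch, and simple linear interpolation stands in for whatever smoothing algorithm was used.

```python
def pad_azimuth_wrap(azimuths_deg, values, limit_deg=200.0):
    """Extend one row of a map beyond its edges using 360-degree periodicity.

    Each sample is also placed at azimuth +/- 360 degrees when that position
    falls within +/- limit_deg, so -160 is copied to +200 and +160 to -200.
    """
    padded = list(zip(azimuths_deg, values))
    for az, value in zip(azimuths_deg, values):
        for shifted in (az + 360.0, az - 360.0):
            if -limit_deg <= shifted <= limit_deg:
                padded.append((shifted, value))
    padded.sort()
    return padded

def interpolate(padded, azimuth_deg):
    """Linear interpolation between neighbouring (azimuth, value) samples."""
    for (az0, v0), (az1, v1) in zip(padded, padded[1:]):
        if az0 <= azimuth_deg <= az1:
            frac = (azimuth_deg - az0) / (az1 - az0)
            return v0 + frac * (v1 - v0)
    raise ValueError("azimuth outside padded range")

# Hypothetical row sampled every 20 degrees from -160 to +160.
azimuths = [float(a) for a in range(-160, 161, 20)]
rates = [float(i) for i in range(len(azimuths))]

padded = pad_azimuth_wrap(azimuths, rates)
behind_pos = interpolate(padded, 180.0)
behind_neg = interpolate(padded, -180.0)
print(padded[0], padded[-1], behind_pos, behind_neg)
```

With the padding in place, the interpolated values at +180° and −180° agree, which is the continuity property the edge copying is meant to guarantee.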
Results
Maturation of auditory map topography
We quantified the maturation of the auditory space map in the deeper layers of the ferret SC by measuring its correlation with the visual map in the overlying superficial layers at a range of age points. Auditory brainstem response (ABR) measurements have shown that hearing onset occurs at P26–P30 in this species (Moore and Hine, 1992). This is followed by a rapid decrease in ABR threshold and latency, most likely because of improved conduction through the external and middle ear, with adult-like values reached by P34–P36. We therefore charted the development of auditory topography from P33, after these peripheral changes have taken place, to P62, around the age at which the animals are weaned, and also recorded from adult animals (>1 year of age). The visual map is a suitable reference for revealing the maturation of the auditory representation, because the retinocollicular projection is topographically ordered at birth (King et al., 1998a), and an adult-like visual map is already present in the superficial layers at eye opening (King et al., 1996), which occurs at ∼1 month of age.
Figure 2 plots the auditory best azimuth as a function of the visual best azimuth obtained in the superficial layers of that electrode penetration. Data are broken down by age point and sound level, with near-threshold levels (mean ± SD, 10.5 ± 2.7 dB above threshold) shown by the red symbols and suprathreshold responses (25.5 ± 2.7 dB above threshold) indicated by the blue symbols. These sound levels were chosen because previous studies in adult mammals have shown that near-threshold responses are derived primarily from monaural spectral cues provided by the contralateral external ear (Palmer and King, 1985; King et al., 1994), whereas suprathreshold spatial tuning is based on neuronal sensitivity to a combination of ILDs (Wise and Irvine, 1985) and spectral cues (Carlile and King, 1994). In the adult SC (Fig. 2E), there is an excellent correlation between the visual and auditory best azimuths at both sound levels (see inset panel for R values), indicating the presence of a level-invariant map of auditory space. This confirms our previous finding made by counting spikes from isolated single units (King et al., 1994).
In contrast to the highly ordered auditory representation found in adult animals, the best azimuths in the youngest ferrets recorded (the so-called “infant” animals) were much more scattered throughout the contralateral hemifield, and a few recordings even showed a preference for ipsilateral sound directions (Fig. 2A). The correlation between the visual and auditory best azimuths gradually improved with age at both sound levels (Fig. 2A–E), as indicated by the steady increase in the correlation coefficient (Fig. 2E). By P48–P52 (Fig. 2C), the slope of the regression line for the suprathreshold data (blue) was significantly different from zero (p < 0.05), whereas a few days later, at P53–P62 (Fig. 2D), the regression slopes obtained at both sound levels were significant. Thus, spatial selectivity based on different acoustical cues appears to emerge at a similar age.
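The two summary statistics referred to here, the correlation coefficient and the regression slope relating auditory to visual best azimuth, can be computed as in the sketch below. The best-azimuth values are invented for illustration and are not taken from Figure 2. The significance test on the slope is omitted.

```python
import math

def pearson_r_and_slope(x, y):
    """Pearson correlation and least-squares slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy), sxy / sxx

# Hypothetical best azimuths in degrees (negative = contralateral).
visual = [-10.0, -30.0, -50.0, -70.0, -90.0]
auditory_adult = [-12.0, -28.0, -55.0, -66.0, -92.0]   # ordered map
auditory_infant = [-60.0, -20.0, -75.0, -30.0, -50.0]  # scattered

r_adult, slope_adult = pearson_r_and_slope(visual, auditory_adult)
r_infant, slope_infant = pearson_r_and_slope(visual, auditory_infant)
print(r_adult, slope_adult, r_infant, slope_infant)
```

In this toy data set the ordered map gives a slope close to 1 and a high correlation, whereas the scattered one gives values near zero.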
Maturation of acoustical cues
The data shown in Figure 2 indicate that whereas some auditory neurons are tuned to directions that lie within the adult range at the earliest age examined, there is no systematic relationship between auditory and visual best positions until later age points. During development, the head and external ears are growing in size, which alters the localization cue values associated with different directions in space (Carlile, 1991; Mrsic-Flogel et al., 2003; Schnupp et al., 2003). This therefore raises the question of whether a correlation exists between the maturation of the auditory periphery and the time course of map development in the SC.
Figure 3 shows example horizon directionality transfer functions (DTFs) from two different adult (Fig. 3A,B) and two different infant (P33) ferrets (Fig. 3C,D). These plots indicate the spectrum level gains as a function of sound azimuth at 0° elevation. DTFs are the direction-dependent components of the HRTF, and were calculated by subtracting the mean spectral transfer function for all measured sound source directions from the spectral function for each individual direction. DTFs therefore remove aspects of the HRTF that are independent of sound-source direction, such as ear canal resonances, and eliminate interanimal variation due to differences in microphone position within the ear canal.
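The DTF calculation described above is a subtraction of the direction-averaged spectrum. The sketch below assumes gains expressed in dB and uses a toy HRTF with three directions and four frequency bins. The function name and all numerical values are hypothetical.

```python
def directional_transfer_functions(hrtf_db):
    """Subtract the mean spectrum across directions from each direction.

    `hrtf_db` maps a direction label to a list of gains (dB) per frequency bin.
    """
    n_dirs = len(hrtf_db)
    n_bins = len(next(iter(hrtf_db.values())))
    mean_spectrum = [sum(spec[k] for spec in hrtf_db.values()) / n_dirs
                     for k in range(n_bins)]
    return {direction: [g - m for g, m in zip(spec, mean_spectrum)]
            for direction, spec in hrtf_db.items()}

# Toy HRTF (dB): three azimuths, four frequency bins.
hrtf = {
    -40: [10.0, 12.0, 6.0, 2.0],
    0: [8.0, 9.0, 7.0, 5.0],
    40: [3.0, 3.0, 2.0, -1.0],
}
dtf = directional_transfer_functions(hrtf)

# A direction-independent term (e.g., an ear canal resonance or a change in
# microphone position) adds the same spectrum to every direction.
resonance = [0.0, 5.0, 9.0, 4.0]
hrtf_shifted = {d: [g + r for g, r in zip(spec, resonance)]
                for d, spec in hrtf.items()}
dtf_shifted = directional_transfer_functions(hrtf_shifted)
print(dtf[-40], dtf_shifted[-40])
```

The second computation shows why DTFs are insensitive to direction-independent factors: adding the same spectrum to every direction leaves the DTFs unchanged.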
There are marked similarities between individual DTFs within an age group and clear differences between age groups. The adult DTFs exhibit more spectral “features” (i.e., azimuth-frequency combinations in which the gain increases or decreases) than the infant DTFs, particularly for sound-source directions ipsilateral to the ear from which the recordings were made. The azimuthal extent of the high gain region is also narrower for the adults than for the infants (i.e., the infant DTFs are less directional). Moreover, where corresponding features could be identified, they are shifted to higher frequencies in the infants. For example, a notch is present in the adult DTFs in which transmission gain is reduced by ∼10 dB over a narrow range of high frequencies. The frequency of this notch moves from ∼23 to ∼28 kHz as the stimulus direction shifts from the midline to the anterior ipsilateral quadrant (Fig. 3A,B, indicated by the narrow green region). A corresponding spectral notch could be identified in some of the infant DTFs, but was found at higher frequencies than in the adults (Fig. 3C), whereas in other infant DTFs, including the one illustrated in Figure 3D, these notch frequencies appeared to be so high as to fall mostly beyond the 30 kHz cutoff of our recording equipment.
We used principal components analysis (PCA) to quantify the individual and age-related differences across our population of horizon DTF recordings. Although we obtained acoustical data from a range of elevations, we restricted our PCA to 0° elevation to facilitate comparison with the free-field SC data described in the previous section. We converted the DTF of each ear (in total there were n = 40 DTFs: 18 from adults, 14 from infants, and 8 from juvenile ferrets aged ∼P50) from a two-dimensional matrix to a vector; each DTF was based on 33 azimuth positions and 190 frequency bins. Every other frequency bin was removed to yield a 33 by 95 matrix, which was converted to a vector of length Q = 33 × 95. We conducted a PCA on the population of DTFs by concatenating the vectors to form an n by Q matrix.
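The vectorization and PCA steps can be sketched with synthetic data. This is a pure-Python illustration, not the analysis code used in the study. It uses only six made-up DTFs in two groups, and it obtains the leading components by power iteration on the n by n Gram matrix of the centered data, which is an implementation choice of this sketch. The names `dtf_to_vector`, `top_components`, and `synthetic_dtf` are hypothetical.

```python
import math
import random

N_AZIMUTHS, N_BINS = 33, 190  # DTF dimensions given in the text

def dtf_to_vector(dtf):
    """Flatten a 33 x 190 DTF, keeping every other frequency bin (33 x 95)."""
    return [gain for row in dtf for gain in row[::2]]

def top_components(vectors, n_components=2, n_iter=200):
    """Return (explained-variance fraction, per-DTF scores) for each component."""
    n, q = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(q)]
    centered = [[v[j] - mean[j] for j in range(q)] for v in vectors]
    gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in centered]
            for ci in centered]
    total_variance = sum(gram[i][i] for i in range(n))
    results = []
    for _component in range(n_components):
        u = [1.0 / (i + 1) for i in range(n)]  # arbitrary start vector
        for _step in range(n_iter):
            w = [sum(gram[i][j] * u[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            u = [x / norm for x in w]
        eigval = sum(u[i] * sum(gram[i][j] * u[j] for j in range(n))
                     for i in range(n))
        scores = [math.sqrt(eigval) * x for x in u]  # projections onto the PC
        results.append((eigval / total_variance, scores))
        # Deflate so that the next pass finds the following component.
        gram = [[gram[i][j] - eigval * u[i] * u[j] for j in range(n)]
                for i in range(n)]
    return results

rng = random.Random(0)

def synthetic_dtf(sign):
    """Made-up DTF: a smooth pattern of either sign plus Gaussian noise."""
    return [[sign * math.sin(a / 5.0) * math.cos(f / 20.0) + rng.gauss(0.0, 0.1)
             for f in range(N_BINS)] for a in range(N_AZIMUTHS)]

dtfs = [synthetic_dtf(+1.0) for _ in range(3)] + [synthetic_dtf(-1.0) for _ in range(3)]
vectors = [dtf_to_vector(d) for d in dtfs]
(frac1, scores1), (frac2, scores2) = top_components(vectors)
print(len(vectors[0]), frac1, frac2)
```

Working with the n by n Gram matrix is convenient here because n (the number of DTFs) is much smaller than Q = 33 × 95 = 3135.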
Figure 3E shows the position of each of the individual DTFs in the space spanned by the first two resultant principal components. Cumulatively, these first 2 components explain 56% of the variance in the DTFs. Each point in this figure represents the data from one ear of one animal, with different color symbols used according to the age of the animals. The large crosses indicate the mean and its 95% confidence interval. There is a very clear distinction between the adult (green) and infant (blue, ∼P33) animals because their distributions do not overlap. However, the distribution of the juvenile (∼P50) data, represented by the red symbols, does overlap with that of the adults, indicating that juvenile DTFs are substantially more similar to adults than they are to infants. Because a significant relationship between the visual and auditory SRFs first emerges at ∼P50 (Fig. 2), these data reveal that the development of the auditory map follows a similar time course to the maturation of the acoustic cues for sound location. This correlation suggests the possibility that the immaturity of the auditory periphery could be preventing the emergence of adult-like auditory topography in the developing SC.
How important is maturation of the periphery to the maturation of the auditory space map?
Assessing the relative importance of peripheral and central factors in the development of the SC auditory map requires more than just looking for parallels between the acoustical and physiological changes that take place during postnatal development. By presenting VAS stimuli over earphones, we can recreate the acoustic filtering of an individual's head and external ears (Wightman and Kistler, 1989a,b; King et al., 2001). The resulting stimuli are “externalized” and their apparent direction in space can be controlled by the experimenter. Stimuli can be presented through “virtual ears”, which either reproduce the animal's own ears or mimic those of another individual by convolving the acoustic stimulus with the appropriate filters derived from prerecorded HRTFs. This enabled us to test the importance of peripheral maturity directly by recording SRFs from the SC of infant animals through the virtual ears of an adult ferret.
Using individualized VAS stimuli, we obtained detailed SRFs from 48 units from infant animals (P32–P39), plus 44 units recorded in the SC of adult ferrets (data originally gathered for Campbell et al., 2006), by measuring the mean evoked spike rate from virtual sound-source directions covering a large range of azimuths and elevations. The overall preferred sound direction of each SRF was determined by calculating the direction of the centroid vector (see supplemental material, available at www.jneurosci.org). Figure 4 plots visual best azimuths of multiunit activity recorded in the superficial layers of each electrode penetration against the azimuth of the auditory centroids for both infants (Fig. 4A) and adults (Fig. 4B). Within each age group we fitted a single linear model to the data from both sound levels. Pooling across levels introduces correlation to the regression because each unit was generally recorded at more than one sound level. We corrected for correlation (which leads to an overestimate of the number of degrees of freedom) by applying mixed-effects linear models to these data (Pinheiro and Bates, 2000).
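One common way to compute a centroid direction is to sum unit vectors pointing to each sampled direction, weighted by the response, and take the direction of the resultant. The sketch below assumes that definition. The exact definition used here is given in the supplemental material and may differ, for example in how responses are weighted or normalized. The sampled directions and spike rates are invented, and clipping negative evoked rates to zero is a choice made only for this sketch.

```python
import math

def centroid_direction(directions_deg, responses):
    """Azimuth and elevation (degrees) of the response-weighted vector sum.

    `directions_deg` holds (azimuth, elevation) pairs; negative responses are
    clipped to zero (an assumption of this sketch).
    """
    x = y = z = 0.0
    for (az_deg, el_deg), response in zip(directions_deg, responses):
        az, el = math.radians(az_deg), math.radians(el_deg)
        weight = max(response, 0.0)
        x += weight * math.cos(el) * math.cos(az)
        y += weight * math.cos(el) * math.sin(az)
        z += weight * math.sin(el)
    azimuth = math.degrees(math.atan2(y, x))
    elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
    return azimuth, elevation

# Hypothetical SRF sampled on the horizon, peaking at -40 degrees azimuth.
directions = [(-60.0, 0.0), (-40.0, 0.0), (-20.0, 0.0), (40.0, 0.0)]
rates = [5.0, 10.0, 5.0, 0.0]

centroid_az, centroid_el = centroid_direction(directions, rates)
print(centroid_az, centroid_el)
```

Working with vectors rather than averaging azimuth angles directly avoids artifacts at the ±180° wraparound.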
Infant animals showed no clear relationship between the visual and auditory data (Fig. 4A; the slope of the linear model was not significantly different from zero, p = 0.11, df = 44, r = 0.17), indicating a lack of association between auditory centroid directions and visual best azimuths. This therefore confirms the free-field data obtained from other ferrets of this age (Fig. 2A). Again like the free-field responses (Fig. 2E), the adult data from the VAS experiments showed a significant relationship between the visual and auditory representations (Fig. 4B), although, in this case, a second order model provided a better fit (both first and second order terms were significantly different from zero; p[x1] = 0.001; p[x2] = 0.02, df = 32, r = 0.69). Because we showed that the VAS stimuli faithfully replicated the spectral cues produced by real free-field sound sources (Fig. 1), this difference in the order of the model that best fits the auditory topography data is most likely to arise from differences in the range of stimulus locations sampled and in the electrophysiological recording and analytical techniques used in each case. The free-field data were obtained by varying stimulus azimuth at a fixed elevation and spatial tuning was assessed at different ages using power spectral density measurements that sample neural activity, regardless of signal-to-noise ratio, over a relatively large region of the SC. In contrast, in the VAS experiments, stimuli were presented from 66 virtual sound directions, varying in azimuth and elevation, with spike counts from single units or small multiunit clusters, comprising two or more nearby units, used to construct the SRFs. Nevertheless, both approaches clearly show that sound azimuth is mapped within the SC and that this topographic order, and therefore visual-auditory map alignment, emerges during the course of postnatal development.
Our acoustical analysis revealed a correlation in the time courses of the development of the localization cues and auditory map topography, suggesting that the observed order in the auditory representation at different ages might simply reflect the values of the localization cues available then. To investigate this possibility, we recorded infant SRFs through the virtual ears of adult animals. If the infant auditory system is “prewired” for adult acoustic cues, then presenting closed-field sounds that faithfully mimic those cues should lead to more adult-like responses, including improved topographic order, in the auditory representation in the SC. However, Figure 4C shows that conducting this manipulation did not lead to immediate improvements in map topography. The black points in this panel are taken from Figure 4A, showing the data obtained by mapping SRFs in infant ferrets with individualized VAS stimuli, whereas the red points in Figure 4C show the SRF centroids from the same infant SC units, but recorded with VAS stimuli derived from an adult of the same sex. The black and red lines are from a linear model fitted to each data set. Replacing the animals' own ears with mature acoustical cues did not produce an adult-like map of space. This is confirmed by the linear model, which shows that there is no significant difference in the slopes of the two data sets (p = 0.56, df = 73). It therefore appears that the poor correlation between auditory and visual responses in early life is probably related more to the immaturity of the central auditory system than the spatial information provided by the head and external ears.
Quantifying the development of the spatial receptive field properties
So far, we have focused solely on the preferred sound direction, defined as the SRF centroid direction. However, SRFs can vary substantially in size and shape even if the measure of preferred sound direction remains the same. Switching from infant to adult ears might therefore make the infant SRFs more similar to those found in the adult SC in ways that are not captured by considering the centroid directions alone. This possibility is supported by the representative examples shown in Figure 5. The SRFs recorded with their own ears from infant ferrets were often larger at both near-threshold and suprathreshold sound levels than those recorded in adult animals. But when adult spatial cues were provided, the infant SRFs tended to become smaller (Fig. 5, compare middle two columns), a change that was usually associated with an alteration in the mean evoked spike rate of the units (Fig. 5, last column). To compare the SRFs across age, we measured the maximum mean evoked spike rate (spikes/s), the area (rad²) within which the response was ≥50% of the maximum value, and the length of the centroid direction vector as a means of assessing the “patchiness” of the response. Centroid vector length can vary in value from 0 to 1, with longer values indicating highly focused (less patchy) SRFs. These parameters are described in greater detail in the supplemental text and supplemental Fig. 1, available at www.jneurosci.org as supplemental material. The distribution of values obtained for each parameter is plotted separately for single units and small multiunit clusters in supplemental Fig. 2, available at www.jneurosci.org as supplemental material. Because we observed no clear differences between them, data were combined from the single units and multiunits in subsequent analyses.
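The three SRF parameters can be illustrated with a short sketch. The exact definitions are in the supplemental text, so the formulas here are assumptions: the centroid is taken as the rate-weighted vector sum of unit direction vectors divided by the summed rate, and the 50% area as the summed solid angle of all directions whose response is at least half the maximum. The function name and the `solid_angles` argument are hypothetical.

```python
import math

def srf_stats(az_deg, el_deg, rates, solid_angles):
    """Return (max_rate, half_max_area, centroid_length) for one SRF.

    az_deg, el_deg : direction of each virtual sound source (degrees)
    rates          : mean evoked spike rate at each direction (spikes/s, >= 0)
    solid_angles   : area of the sphere represented by each direction (rad^2)
    """
    peak = max(rates)
    # Area over which the response is >= 50% of the maximum
    area = sum(w for r, w in zip(rates, solid_angles) if r >= 0.5 * peak)
    total = sum(rates)
    cx = cy = cz = 0.0
    for az, el, r in zip(az_deg, el_deg, rates):
        a, e = math.radians(az), math.radians(el)
        cx += r * math.cos(e) * math.cos(a)
        cy += r * math.cos(e) * math.sin(a)
        cz += r * math.sin(e)
    length = math.sqrt(cx ** 2 + cy ** 2 + cz ** 2) / total if total > 0 else 0.0
    return peak, area, length

# Four directions around the horizon, each standing for a quarter of the sphere
az, el, w = [0, 90, 180, 270], [0, 0, 0, 0], [math.pi] * 4
focused = srf_stats(az, el, [10, 0, 0, 0], w)  # response at a single direction
diffuse = srf_stats(az, el, [5, 5, 5, 5], w)   # equal response everywhere
```

With this normalization the centroid length is 1 when the whole response comes from one direction and 0 when the responses cancel, so a patchy SRF with separated lobes yields a short vector.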
Figure 6 compares own-ear SRF statistics from adult and infant animals with the data broken down by sound level. The differences between the two age groups are summarized using 6 t tests for significant differences between the means within each sound level. The results of these are shown by the array below the box-plots: each of the parameters used for characterizing the response differed significantly across age for at least one sound level, with significant differences observed in the 50% SRF area between infants and adults at both sound levels. The results of equivalent comparisons between the responses of infant SC units recorded with their own virtual ears and with adult virtual ears are depicted in Figure 7. Paired t tests revealed significant changes in the mean maximum spike rate when adult cues were provided. Although the 50% areas were not significantly different between the two conditions at either sound level, the box plots nevertheless show a trend for these values to decrease with adult ears.
Similar analyses have been used in previous studies with data of this sort (Mrsic-Flogel et al., 2003; Campbell et al., 2006). However, a p value does not address the more pertinent question of how separable or distinct one group (i.e., one distribution) is from the other. Furthermore, conducting a series of separate tests is not optimal for data in which each observation (in this case the SRF) is characterized by multiple parameters that are unlikely to be independent. To quantify the degree of similarity between the SRFs recorded in adult and infant ferrets, we therefore used a technique known as linear discriminant analysis (LDA), which finds the optimal linear separation between groups of data points. The principle behind LDA is straightforward. Given two groups in an n-dimensional space, LDA seeks to find the single direction, the so-called linear discriminant (LD), which maximally separates the two groups. LDA scales to problems involving multiple groups so that G groups are captured by an LD space comprised of G − 1 LDs, with each LD separating two groups. The direction of an LD is defined by n coefficients, corresponding to the n variables, or dimensions, with which the groups were originally defined (i.e., the original n-dimensional space). If these original axes are in the same units, the coefficients of the LD indicate the degree to which each variable contributes to separating the groups. More details on LDA, along with a graphical explanation, are provided in the supplemental text and supplemental Figs. 3–6, available at www.jneurosci.org as supplemental material.
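For two groups, the linear discriminant has a closed form: the inverse of the pooled within-group scatter matrix applied to the difference between the group means. The following is a minimal sketch of that textbook construction on synthetic data, not the code used for the analyses reported here; the function name and the simulated values are hypothetical.

```python
import numpy as np

def fisher_ld(group_a, group_b):
    """Two-group Fisher discriminant, w proportional to Sw^-1 (mean_b - mean_a).

    Returned with unit length; group_b lies on the positive side.
    """
    a, b = np.asarray(group_a, float), np.asarray(group_b, float)
    da, db = a - a.mean(0), b - b.mean(0)
    sw = da.T @ da + db.T @ db  # pooled within-group scatter matrix
    w = np.linalg.solve(sw, b.mean(0) - a.mean(0))
    return w / np.linalg.norm(w)

# Synthetic example: three variables, groups differ only on the second one
rng = np.random.default_rng(0)
adult = rng.normal(0.0, 1.0, (200, 3))
infant = rng.normal(0.0, 1.0, (200, 3))
infant[:, 1] += 1.5
w = fisher_ld(adult, infant)

# Position of every observation along the LD, centred on the grand mean
grand_mean = np.vstack([adult, infant]).mean(0)
ld_adult = (adult - grand_mean) @ w
ld_infant = (infant - grand_mean) @ w
```

In this example the second coefficient of `w` dominates, which is how the coefficients reveal the variable that contributes most to the separation when all variables share the same scale.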
Discriminating adults from infants
We used the distribution of the SRFs in LD space to determine, for any given SRF, the degree of certainty that it could be attributed to an adult or an infant. We then used the LDA coefficients to assess which features were best able to distinguish between the age groups. We first performed an LDA by pooling data across sound level into two groups (adult and infant SRFs), resulting in a single LD along which the two distributions can be represented as histograms (Fig. 8).
Figure 8A,B shows the LDA performed on the SRFs recorded in adult and infant ferrets using VAS stimuli derived from each animal's own ears. The optimal linear separation of the adult and infant data is shown in Figure 8A, which plots the distributions of the data along the LD direction. The scale of the LD is unit-less with the grand mean of the two groups having a value of zero. Infant SRFs had more positive values along the LD compared with adult SRFs. Classifying SRFs as adult or infant based solely on their position along the LD gives correct results more often than would be expected by chance. On the basis of the observed data distributions shown in Figure 8A, SRFs could be assigned to their correct age-point 67% of the time. We assessed the significance of this value using confidence intervals derived from a bootstrapping and cross-validation approach. Chance classification performance was estimated by running the analysis 1000 times using randomly assigned group identities. The classification accuracies of these replicates are shown by the histogram in the inset plot of Figure 8A. To avoid over-fitting we did not test the observed value of 67% against this distribution. Instead we ran the LDA again using a cross-validation approach, in which the LD was calculated on a randomly chosen subsample of 90% of the data and the classification performed on the remaining 10%. This was repeated 1000 times and resulted in a mean classification accuracy of 61%, which is plotted as the dashed red line in the inset figure. Only 0.1% of the chance (bootstrapped) values exceeded this value. Hence we derive a p value of 0.001 to describe the probability of obtaining the observed classification accuracy by chance.
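The significance procedure can be sketched as follows. This is an illustration under stated assumptions rather than the original analysis code: observations are assigned to the group with the nearer projected mean (equivalent to equal priors), the chance distribution is computed in-sample with shuffled identities, and all names and data are hypothetical.

```python
import numpy as np

def ld_classify(train_x, train_y, test_x):
    """Fit a two-group Fisher LD on the training data, then assign each test
    point to the group (0 or 1) whose projected mean is nearer."""
    a, b = train_x[train_y == 0], train_x[train_y == 1]
    da, db = a - a.mean(0), b - b.mean(0)
    w = np.linalg.solve(da.T @ da + db.T @ db, b.mean(0) - a.mean(0))
    proj = test_x @ w
    return (np.abs(proj - b.mean(0) @ w) < np.abs(proj - a.mean(0) @ w)).astype(int)

def cv_accuracy(x, y, rng, n_rep=1000, test_frac=0.1):
    """Mean held-out accuracy over repeated random 90%/10% splits."""
    n_test = max(1, int(round(test_frac * len(y))))
    acc = []
    for _ in range(n_rep):
        idx = rng.permutation(len(y))
        te, tr = idx[:n_test], idx[n_test:]
        acc.append(np.mean(ld_classify(x[tr], y[tr], x[te]) == y[te]))
    return float(np.mean(acc))

def chance_accuracies(x, y, rng, n_rep=1000):
    """Classification accuracy when group identities are assigned randomly."""
    out = []
    for _ in range(n_rep):
        ys = rng.permutation(y)
        out.append(np.mean(ld_classify(x, ys, x) == ys))
    return np.array(out)

# Synthetic data: 60 observations per group, 3 z-scored parameters
rng = np.random.default_rng(0)
x = rng.normal(size=(120, 3))
y = np.repeat([0, 1], 60)
x[y == 1, 1] += 2.0
observed = cv_accuracy(x, y, rng, n_rep=200)
chance = chance_accuracies(x, y, rng, n_rep=200)
p_value = float(np.mean(chance >= observed))  # share of chance values >= observed
```

Comparing the cross-validated accuracy, rather than the in-sample accuracy, against the chance distribution is the conservative choice described above, because held-out accuracy is not inflated by over-fitting.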
The coefficients of the LD indicate on what basis the groups are separated. The LDA in Figure 8A, along with the others presented in the following sections, was conducted on the z-scored parameter values. In other words, for each of the 3 parameters (maximum firing rate, SRF 50% area and centroid vector length) we subtracted the mean and divided by the standard deviation, rendering them unit-less. Hence the LD coefficients shown in Figure 8B indicate the relative importance of each variable in separating the adult and infant groups. The largest coefficient was obtained for the 50% response area, which had a large positive value. This means that SRFs with more positive values along the LD are those with larger response areas. Because the infant data are skewed to the positive side of the LD (Fig. 8A), it follows that infant animals tended to have larger response areas than adults. This is further illustrated in Figure 9A, which plots for each unit in the infant and adult groups the z-score of the 50% area against that of the centroid vector length. These values clearly covary, with smaller areas associated with longer centroid lengths. Although there is substantial overlap between them, the data from the adult animals are distributed below the infant data, indicating that adult SRFs are smaller. This interpretation is confirmed by the box-plots in Figure 6 and also by examination of the example SRFs in Figure 5. Thus, in addition to the development of topographic order in the auditory representation, shown in Figures 2 and 4, there are also systematic changes in the structure of the SRFs, and particularly in their spatial extent, as the ferrets get older.
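As a brief illustration of the z-scoring step, the conventional z-score subtracts the mean and divides by the standard deviation, which gives every parameter zero mean and unit spread. The sketch assumes the sample standard deviation (n − 1 denominator); the function name and example values are hypothetical.

```python
import statistics

def zscore(values):
    """Subtract the mean and divide by the standard deviation (unit-less)."""
    m = statistics.mean(values)
    s = statistics.stdev(values)  # sample SD, n - 1 denominator (an assumption)
    return [(v - m) / s for v in values]

areas = [2.0, 4.0, 6.0]   # hypothetical 50% areas in rad^2
z_areas = zscore(areas)   # mean 4, SD 2, so the z-scores are -1, 0 and 1
```

Because every parameter ends up on the same scale, the size of an LD coefficient can be compared directly across parameters.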
Recording infant SRFs with virtual adult ears
We next asked whether virtual adult ears made infant SRFs more adult-like. We addressed this by comparing the infant SRFs recorded with virtual adult ears first to those recorded with the infants' own ears and, second, to the SRFs recorded from adult animals. Figure 8, C and D, shows the results of an LDA which separates infant SRFs recorded with the animals' own ears (red) from those recorded through the virtual ears of an adult animal of the same sex (grey). Note that the distributions of infant own-ear data are different in Figure 8, A and C, because the direction of the LD is different in each case. The data from the virtual adult-ear condition tend to have more negative values along the LD compared with the own-ear condition. The SRFs could be correctly classified into one or other of these conditions with an accuracy of 62%, which is significantly better than chance (p = 0.023, see inset Fig. 8C; analysis is the same as that in Fig. 8A). Thus, in contrast to the measure of map topography (Fig. 4C), recording infant SRFs through adult ears did lead to a significant change in response properties. Furthermore, together, the two LDAs presented so far indicate that infant SRFs became more adult-like when recorded through adult ears. This is illustrated by the distributions of the data along the LD (the infant own-ear data always have the more positive values), the similarity in the classification rates, and, crucially, by the similarity in the direction of the LD as shown by the values of the coefficients in Figure 8, B and D. For instance, the first LDA showed that infant own-ear SRFs tended to have larger SRF 50% areas than adult own-ear SRFs (Fig. 8A,B). Similarly, the second LDA showed that infant own-ear SRFs became smaller (they have more negative values along the LD) when the same units were recorded using virtual adult ears (Fig. 8C,D). 
This is again demonstrated by plotting the z-score of the 50% area against that of the centroid vector length for each of these units (Fig. 9B). The higher incidence of grey symbols in the lower half of this plot indicates that adult-ear SRFs tended to be smaller than the own-ear SRFs. This is also consistent with the trends observed in the box-plots in Figure 7 and in the example SRFs in Figure 5. The systematic nature of the changes observed in the infant SRFs when switching from own-ear to adult-ear VAS stimuli, regardless of the order in which they were presented, also confirms that these effects cannot be attributed to drift over time in the responsiveness of the units.
If the infant SRFs recorded with adult ears do indeed become more adult-like, then we would predict that these responses should be similar to the SRFs recorded from the SC of adult ferrets. We tested this by conducting a third LDA to separate infant SRFs recorded with virtual adult ears from the SRFs recorded in adult animals with their own ears (Fig. 8E,F). In contrast to the analyses in Figure 8, A and C, the distributions tested in this comparison overlap completely and the classification accuracy is at chance levels (p = 0.64) (Fig. 8E, inset). The magnitudes of the coefficients are non-zero (Fig. 8F) because the analysis seeks to best separate the two groups. However, despite taking all three variables into consideration the LDA found no significant difference between these two groups (Fig. 8E). Indeed, we observed much more overlap when their 50% area and centroid length z-scores were plotted against each other (Fig. 9C) than in the other comparisons of age group (Fig. 9A) or ear condition (Fig. 9B). Thus, by taking into account the centroid length, maximum response strength and response area, we found that the SRFs recorded from adult animals are not distinguishable from infant SRFs recorded with adult acoustic cues, illustrating the important role of the auditory periphery in shaping response properties at the level of the SC.
Discriminating adults from infants across different sound levels
To analyze the data in more detail and confirm that the above results hold when sound level is taken into account, we conducted LDAs using the same 3 pairs of conditions as those in Figure 8, but this time also subdivided into near-threshold and suprathreshold sound levels. For each comparison, there were therefore 4 groups (two age groups or HRTF conditions plus two sound levels). The ensuing LDA is 3-dimensional and each of the 3 linear discriminants separates two of the groups. The results of these analyses are shown in Figure 10. In each case we plotted the data along the 2 LDs that describe the largest proportion of the between-groups variance [(Fig. 10A) the first two LDs describe 85% of the between-groups variance], with the third LD omitted in the interests of clarity. We report the results of these analyses in terms of the degree of separability in each case.
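With more than two groups, the LDs are commonly obtained as eigenvectors of the within-group scatter inverse multiplied by the between-group scatter, and each eigenvalue gives that LD's share of the between-groups variance. The sketch below shows this standard construction on synthetic data; it is an assumption that the analysis here followed the same formulation, and the names are hypothetical.

```python
import numpy as np

def multigroup_lda(x, labels):
    """Return (directions, proportions): LDs as columns, ordered by the
    proportion of between-groups variance that each one describes."""
    x, labels = np.asarray(x, float), np.asarray(labels)
    groups = np.unique(labels)
    grand = x.mean(0)
    n_feat = x.shape[1]
    sw = np.zeros((n_feat, n_feat))  # within-group scatter
    sb = np.zeros((n_feat, n_feat))  # between-group scatter
    for g in groups:
        xg = x[labels == g]
        d = xg - xg.mean(0)
        sw += d.T @ d
        m = (xg.mean(0) - grand)[:, None]
        sb += len(xg) * (m @ m.T)
    evals, evecs = np.linalg.eig(np.linalg.solve(sw, sb))
    evals, evecs = evals.real, evecs.real
    n_ld = min(n_feat, len(groups) - 1)  # at most G - 1 discriminants
    keep = np.argsort(evals)[::-1][:n_ld]
    return evecs[:, keep], evals[keep] / evals[keep].sum()

# Synthetic example: 4 groups (2 conditions x 2 sound levels), 3 parameters
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2, 3], 50)
data = rng.normal(size=(200, 3))
data[:, 0] += 1.0 * (labels % 2)   # condition effect on parameter 1
data[:, 1] += 0.5 * (labels // 2)  # sound-level effect on parameter 2
lds, proportions = multigroup_lda(data, labels)
scores = (data - data.mean(0)) @ lds[:, :2]  # coordinates on the first two LDs
```

Plotting `scores` reproduces the kind of two-dimensional view used when the third LD is omitted for clarity.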
Figure 10, A and B, shows the results of the LDA on the own-ear SRFs from adult (black) and infant (red) animals. Different symbol shapes indicate near-threshold and suprathreshold sound levels. Although there is substantial overlap and the variance is relatively large (not unlike Fig. 8A), the positive side of LD1 clearly contains more infant (red) data points than adult (black). This is highlighted by the two sets of contour lines obtained by pooling data from both sound levels. The crosses indicate the medians of the adult and infant distributions. To quantify the separability and more clearly show the effects of sound level, we used a classification algorithm. For each unit, we calculated the posterior probability that it belonged to each of the 4 cluster centers (not plotted) and assigned it to the cluster for which the posterior probability was largest. Because the actual group to which the units belong is known, the results of the classification algorithm can be displayed as a confusion matrix (Fig. 10B), from which a value for the mutual information (MI) can be calculated. MI is a measure that has a positive bias because it cannot have negative values. Bias was determined by calculating the MI for 1000 data sets in which the group identities were assigned randomly. The mean of these 1000 MI values was taken as the bias and was subtracted from the observed value obtained from the raw data. The resulting de-biased MI estimate can, therefore, be negative in cases in which the true MI value is zero or too small to reach statistical significance with the available data. The de-biased MI for the adult versus infant comparison was 0.12 bits (the maximum possible for four groups being 2.0 bits). The confusion matrices shown were all obtained by tenfold cross-validation (see above).
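The MI calculation and its bias correction can be sketched as follows. This is a simplified illustration: the shuffle is applied to the actual group labels against fixed assignments, whereas the procedure described above reran the analysis on data sets with randomly assigned identities. Function names are hypothetical.

```python
import math
import random

def mutual_information(confusion):
    """MI in bits between actual group (rows) and assigned group (columns)."""
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(col) for col in zip(*confusion)]
    mi = 0.0
    for i, row in enumerate(confusion):
        for j, n in enumerate(row):
            if n > 0:
                mi += (n / total) * math.log2(n * total / (row_sums[i] * col_sums[j]))
    return mi

def debiased_mi(actual, assigned, n_groups, n_shuffles=1000, seed=0):
    """Observed MI minus the mean MI obtained with shuffled group identities."""
    def confusion(a, b):
        m = [[0] * n_groups for _ in range(n_groups)]
        for i, j in zip(a, b):
            m[i][j] += 1
        return m
    rng = random.Random(seed)
    observed = mutual_information(confusion(actual, assigned))
    shuffled = list(actual)
    bias = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        bias += mutual_information(confusion(shuffled, assigned))
    return observed - bias / n_shuffles

perfect = [[25, 0, 0, 0], [0, 25, 0, 0], [0, 0, 25, 0], [0, 0, 0, 25]]
uniform = [[5, 5, 5, 5] for _ in range(4)]
mi_perfect = mutual_information(perfect)  # log2(4) = 2 bits for four equal groups
mi_uniform = mutual_information(uniform)  # 0 bits: assignments carry no information
```

The two limiting cases show where the 2.0-bit ceiling for four equally sized groups comes from, and why subtracting a positive bias term can push an estimate below zero when the groups are not separable.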
As before, we then performed the analysis on the SRFs recorded from the infant ferrets with either their own or adult ears (Fig. 10C,D). As in Figure 8, the distributions of infant own-ear data are not the same in Figure 10, A and C, because the direction of the LD is different in each case. The resulting confusion matrix resembles that in Figure 10B, which is based on true adult and infant data, and had a similar MI value of 0.085 bits. Finally, we attempted to segregate true adult SRFs from infant SRFs recorded with virtual adult ears (Fig. 10E,F). Based on the two-group analysis in Figure 8E, we would predict the MI of this comparison to be close to zero. The distributions in the scatter plot showed greater overlap (Fig. 10E) than in the other two plots, and the resulting confusion matrix (Fig. 10F) yielded a negative bias-corrected MI estimate (−0.04 bits). In other words, the classification algorithm performed at chance and so there was no significant difference between the groups. Together, these analyses support the notion that the provision of adult acoustical cue values significantly changed the structure of the SRFs recorded in the SC of infant ferrets at both near-threshold and suprathreshold sound levels and made them more adult-like.
Do all spatial receptive fields become more adult like?
In an attempt to explain why the SRFs of infant SC units become smaller when their SRFs are recorded through adult ears, we explored whether it was possible to predict the degree of shift along the LD for each unit. As a predictor, we used the centroid azimuth of the own-ear SRF, which describes the spatial preference of each unit afforded by the acoustical cues available to the infant ferrets. These data are shown in Figure 11, A and B, where they are broken down by sound level. In each panel, the y-axis shows the degree of shift (Δ) along the LD after switching from own ears to adult ears. Because the own-ear condition tended to have more positive values along the LD than the adult-ear condition (Fig. 8C), positive Δ values indicate that an SRF is more adult-like when recorded through the adult HRTF.
Figure 11A shows that for near-threshold recordings there is a significant relationship between the degree of shift and the centroid azimuth of the own-ear SRFs (p = 0.019), indicating that the spatial selectivity exhibited by infant SC neurons influences how they respond when stimulated through virtual adult ears. Units with lateral centroid azimuths were more likely to become adult-like when recorded through an adult HRTF, whereas those with frontal azimuths often became less adult-like, either broadening or displaying more patchy SRFs compared with those recorded with the animals' own ears. Such a relationship was not evident at suprathreshold sound levels (Fig. 11B). Nevertheless, at both sound levels the mean of the Δ values was positive and significantly different from zero, indicating that SRFs became more adult-like at both sound levels tested (two-tailed t tests; near-threshold: μ = 0.46, t(38) = 2.99, p = 0.005; suprathreshold: μ = 0.31, t(40) = 2.69, p = 0.01).
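The test on the mean shift can be illustrated with a one-sample t statistic. In this sketch Δ is taken as the own-ear LD value minus the adult-ear LD value for each unit, which is an assumption consistent with positive Δ meaning "more adult-like"; the example values and names are hypothetical.

```python
import math

def one_sample_t(deltas):
    """t statistic and degrees of freedom for the null hypothesis mean == 0."""
    n = len(deltas)
    mean = sum(deltas) / n
    var = sum((d - mean) ** 2 for d in deltas) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n), n - 1

# Hypothetical LD values for the same units under the two ear conditions
ld_own = [1.2, 0.4, 0.9, -0.1, 0.7]
ld_adult = [0.3, 0.1, 0.5, -0.4, 0.6]
delta = [o - a for o, a in zip(ld_own, ld_adult)]  # positive = shift toward adult
t_stat, df = one_sample_t(delta)
```

For a one-sample test the degrees of freedom equal the number of units minus one, so the reported t(38) and t(40) would correspond to 39 and 41 units, respectively.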
Discussion
We have shown that the auditory space map in the ferret SC matures gradually after hearing onset, becoming adult-like by ∼P60. Although the time course of map development matches that of the HRTF, recording SRFs in infant animals through virtual adult ears did not result in the immediate appearance of a topographically ordered representation. This implies that emergence of adult-like topography cannot be attributed solely to the maturation of the acoustical cues. However, an LDA showed that SRF features other than centroid direction did change systematically, causing infant response properties to become more adult-like when recorded through adult ears. These findings highlight the importance of examining multiple response properties when investigating the development of sensory representations.
Several studies have shown that auditory SRFs in the SC become smaller during postnatal development (Withington-Wray et al., 1990a; Wallace and Stein, 1997, 2001), although the extent to which the early auditory representation is topographically organized seems to vary between species. In keeping with a previous study in guinea pigs (Withington-Wray et al., 1990a), we found that whereas the centroid vectors of some infant SRFs fell within the adult range of values, they were not significantly correlated with the mature visual responses recorded in the superficial layers, indicating a lack of topography in the auditory representation in young ferrets. Although auditory SRFs are mapped within the SC of newborn monkeys, their response properties are immature in many ways (Wallace and Stein, 2001). This supports one of the principal findings of our study, namely that the sharpening of auditory SRFs and emergence of map topography with age involve largely different processes.
Factors contributing to the maturation of auditory spatial receptive fields
In mammals, SC neurons derive their spatial selectivity from monaural spectral cues and ILDs (Hirsch et al., 1985; Palmer and King, 1985; Wise and Irvine, 1985; Middlebrooks, 1987; Middlebrooks and Knudsen, 1987; Carlile and King, 1994; King et al., 1994; Campbell et al., 2006). As shown here and in previous studies (Clifton et al., 1988; Moore and Irvine, 1979; Carlile, 1991; Schnupp et al., 2003), the cue values corresponding to particular directions in space undergo substantial modifications during postnatal development. Thus, even in the absence of any central changes in neural circuitry, we would expect the size and shape of the SRFs to change as the acoustic information on which they are based matures with growth of the head and ears.
It is extremely unlikely, however, that the sensitivity of SC neurons to these cues is mature in infancy. The major source of auditory input to the ferret SC is provided by a topographically ordered projection from the nucleus of the brachium of the IC (King et al., 1998b). Although the overall organization of this pathway remains unchanged after birth (Nodal et al., 2005), developmental alterations occur in the morphology, connectivity, and synaptic properties of neurons in the lateral superior olive (Henkel and Brunso-Bechtold, 1991, 1995; Sanes, 1993; Kotak et al., 1998; Rietzel and Friauf, 1998; Sanes and Friauf, 2000; Kotak and Sanes, 2003), the first site at which ILDs are encoded. Moreover, ILD sensitivity changes with age both in this nucleus (Sanes and Rubel, 1988) and in the IC (Brown et al., 1978; Moore and Irvine, 1981).
The maturation of auditory SRFs in the SC therefore potentially depends on both peripheral and central factors, but, until now, it has not been possible to separate these. By using VAS to provide ferrets with mature localization cue values soon after hearing onset, we were able to examine directly whether the infant auditory system is capable of supporting an adult-like map of auditory space. A limitation of this approach is that we were obviously unable to provide each infant ferret with the actual HRTF that it would have had in adulthood. However, there are substantial similarities between the HRTFs within the adult and infant age groups and clear differences between these groups (Fig. 3). Moreover, in a study of adult ferret A1, we found that HRTFs from other adults alter SRF shape and location only if there are marked differences between the dimensions of the two sets of ears (Mrsic-Flogel et al., 2001). Because of the difference in head size between male and female ferrets, we ensured that the adult HRTF used here was sex-matched for each infant animal. The VAS stimuli should therefore have provided a good approximation of the cue values that the infant ferrets would have experienced in adulthood.
We used LDA (Fisher, 1936) to compare the SRFs across age and in different ear conditions. Although this technique has been adopted in other recent studies (Briggman et al., 2005; Bhandawat et al., 2007), it has been relatively little used in systems neuroscience. The advantage of LDA is that it allowed us to use correlations between response variables to obtain more sensitive statistical tests than would be possible based on the individual parameters. The analysis revealed clear differences between adult and infant SRFs, which disappeared when infant SRFs were recorded with virtual adult ears. This effect was observed across sound level, therefore affecting responses derived from both monaural and binaural cues, and was due primarily to changes in the spatial extent of the SRFs. The capacity of the LDA to identify the most important variable, in this case SRF area, for separating the two groups illustrates a further advantage of this technique, which would be hard to derive from visual inspection of the individual parameter distributions.
These findings suggest that changes in the spatial extent of the receptive fields during development can be attributed to maturation of the auditory periphery. A similar result has been found in A1, where mapping infant ferret SRFs with near-threshold stimuli presented through virtual adult ears resulted in significantly sharper tuning than when the animals' own ears were used (Mrsic-Flogel et al., 2003). Because the spatial selectivity of A1 units at these sound levels is determined principally by the gain provided by the contralateral external ear (Middlebrooks and Pettigrew, 1981; Rajan et al., 1990; Schnupp et al., 2001), this sharpening can be explained by the fact that the HRTF becomes more directional with age. However, SC neurons show level-independent azimuth selectivity that spans a much larger spatial region, indicating that their SRFs cannot arise simply by integrating acoustic power, which reaches a maximum just in front of the interaural axis. Indeed, the presence of a near-threshold space map in monaurally deafened animals indicates that SC neurons are differentially sensitive to direction-dependent spectral cues (Palmer and King, 1985; King et al., 1994).
In view of this, the immediate sharpening of the infant SRFs produced by recording them with virtual adult ears is perhaps surprising. However, the changes seen depended on the units' centroid direction and therefore on the cue values to which they were most sensitive. The near-threshold SRFs that became adult-like were those with lateral direction vectors (Fig. 11). There is relatively little structure in the DTF in this region of space (Fig. 3, broad red regions in the vicinity of the interaural axis, +90°), suggesting that, as in A1, pinna directionality is predominantly responsible for shaping these SRFs. In contrast, a preference for other directions depends on sensitivity to specific spectral features, such as the high-frequency notch found near the anterior midline, which, in cats, contributes to accurate sound localization in the frontal sound field (Huang and May, 1996). As expected, the relatively few anterior SRFs found at near-threshold levels tended not to sharpen when recorded with adult ears and instead behaved unpredictably.
Emergence of auditory topography
Although providing adult ears made many of the infant SRFs more adult like, this did not improve the topographic order in the auditory representation. This suggests that the neural circuitry underlying the map of auditory space is immature just after hearing onset and that the gradual emergence of topography during the second postnatal month cannot be attributed solely to growth-related changes in the localization cue values. The developing auditory representation is shaped by experience so that topographically aligned maps of auditory and visual space emerge even when highly abnormal information is provided by one or other of these modalities (King et al., 1988; Knudsen and Brainard, 1991; Gold and Knudsen, 2000). Visual activity in the superficial SC layers appears to guide the developing auditory responses (King et al., 1998a), although experience-dependent refinement of auditory brainstem circuits is also likely to be involved.
Previous studies suggest that experience is also involved in the developmental sharpening of auditory SRFs in the SC. Rearing animals in omni-directional noise (Withington-Wray et al., 1990c) or in the dark (Withington-Wray et al., 1990b; Wallace et al., 2004) results in large SRFs that resemble those seen in infancy. But our results indicate that the reduction in SRF size over development primarily reflects the growth of the head and ears. Consequently, the abnormally large SRFs observed in adult animals that have been deprived of experience during infancy are more likely to result from a reorganization of the neural circuits rather than maintenance of the immature state.
Footnotes
- This work was supported by the Wellcome Trust through a four-year studentship (R.A.A.C.) and Senior and Principal Research Fellowships (A.J.K.), by the Lister Institute for Preventive Medicine (A.J.K.), and by Biotechnology and Biological Sciences Research Council Grant BB/D009758/1 (J.W.H.S; A.J.K.). We thank Andreas Schulz, Kerry Walker, Ben Willmore, and Nicolas Heess for useful discussion.
- Correspondence should be addressed to Dr. Andrew J. King, Department of Physiology, Anatomy and Genetics, Sherrington Building, University of Oxford, Parks Road, Oxford OX1 3PT, UK. andrew.king{at}dpag.ox.ac.uk